STR #1162

Application:FLTK Library
Status:1 - Closed w/Resolution
Priority:5 - Critical, e.g. nothing working at all
Scope:2 - Specific to an operating system
Subsystem:Core Library
Summary:FLTK crash on Windows (somehow related to recent changes in Fl_Menu_Button)
Version:1.1.7
Created By:geuzaine.acm.caltech
Assigned To:matt
Fix Version:1.1-current (SVN: v5037)
Trouble Report Files:

No files


Trouble Report Comments:


Name/Time/Date Text  
 
#1 geuzaine.acm.caltech
18:10 Jan 30, 2006
Some of my users encounter crashes on Windows (with Cygwin) when my
application is linked against FLTK 1.1.7.

After investigating for a while, I found that the culprit seems to be
the redraw() call recently added at the end of Fl_Menu_Button::popup().

Here is what I think might be happening:

- In my application the callback associated with the Fl_Menu_Button is
  designed to delete the menu button. It does this by calling
  Fl::delete_widget(butt) (prior to 1.1.6 I used to just do "delete
  butt").

- the redraw() added after picked() in Fl_Menu_Button::popup() tries
  to access the widget, but somehow the widget has already been
  deleted. (I don't understand this, as it was my understanding that
  Fl::delete_widget(butt) was designed to avoid just that kind of
  issue.)

This is of course all hypothetical: I don't know the internals of FLTK
well enough to determine if this actually makes sense.

To make things worse I cannot get a stack trace on my Windows machine
(gdb consistently crashes on startup with any FLTK program, even the
test programs in fltk-1.1/test). And (of course!) I cannot reproduce
the problem on Mac OS X or on Linux.

Also, in order to reproduce the crash on Windows, it seems that a
certain series of events must happen in a certain order. Could there
be some race condition somewhere (with an Fl::wait() called too early,
forcing an early deletion of the widget)? I tried to build a small
program that reproduces the behavior, but I failed miserably...

Anyway, reverting Fl_Menu_Button.cxx to the svn revision
{"2006-01-10"} fixes the problem, as does simply commenting out
the second call to redraw() at the end of
Fl_Menu_Button::popup().

Not sure what a real fix for this problem is, though.

--Christophe
 
 
#2 geuzaine.acm.caltech
19:59 Jan 31, 2006
Update: the crash actually also happens on Linux. (So it's not a Windows-only thing.)  
 
#3 geuzaine.acm.caltech
18:41 Feb 27, 2006
Here is the stack trace on Windows after the crash:

Program received signal SIGSEGV, Segmentation fault.
Fl_Widget::damage (this=0x34558c0, fl=128, X=-274, Y=-274, W=-274, H=-274)
    at ../FL/Fl_Widget.H:110
110       uchar type() const {return type_;}
(gdb)
(gdb) backtrace
#0  Fl_Widget::damage (this=0x34558c0, fl=128, X=-274, Y=-274, W=-274, H=-274)
    at ../FL/Fl_Widget.H:110
#1  0x00580b95 in Fl_Widget::damage (this=0x34558c0, fl=128 '\200')
    at ../FL/Fl_Widget.H:113
#2  0x00580bda in Fl_Widget::redraw (this=0x34558c0) at Fl.cxx:1035
#3  0x005912f8 in Fl_Menu_Button::popup (this=0x34558c0)
    at Fl_Menu_Button.cxx:62
#4  0x00591396 in Fl_Menu_Button::handle (this=0x34558c0, e=1)
    at Fl_Menu_Button.cxx:89
#5  0x0058ec4e in send (o=0x34558c0, event=54876352) at Fl_Group.cxx:67
#6  0x0058eefa in Fl_Group::handle (this=0x3d9978, event=1)
    at Fl_Group.cxx:195
#7  0x0058ec4e in send (o=0x3d9978, event=1) at Fl_Group.cxx:67
#8  0x0058eefa in Fl_Group::handle (this=0x3d96d0, event=1)
    at Fl_Group.cxx:195
#9  0x005802dc in send (event=1, to=0x3d96d0, window=0xfffffeee) at Fl.cxx:662
#10 0x005806fb in Fl::handle (e=1, window=0x3d96d0) at Fl.cxx:700
#11 0x00581cfb in mouse_event (window=0x3e, what=143, button=1, wParam=1,
    lParam=4063375) at Fl_win32.cxx:546
#12 0x00583b37 in WndProc (hWnd=0x2900dc, uMsg=513, wParam=1, lParam=4063375)
    at Fl_win32.cxx:739
#13 0x77d43a5f in USER32!CreateWindowExA ()
   from /cygdrive/c/WINDOWS/system32/user32.dll
#14 0x77d43b2e in USER32!CreateWindowExA ()
   from /cygdrive/c/WINDOWS/system32/user32.dll
#15 0x77d43d6a in USER32!CreateWindowExA ()
   from /cygdrive/c/WINDOWS/system32/user32.dll
#16 0x77d441fd in USER32!DispatchMessageA ()
   from /cygdrive/c/WINDOWS/system32/user32.dll
#17 0x009f32a0 in fl_selection_buffer ()
#18 0x00000001 in ?? ()
#19 0x0058140c in fl_wait (time_to_wait=1e+20) at Fl_win32.cxx:291
#20 0x00581696 in Fl::wait (time_to_wait=1e+20) at Fl.cxx:289
#21 0x005831c5 in Fl::run () at Fl.cxx:357
#22 0x004018b3 in main (argc=3, argv=0x3d26e0) at Main.cpp:247
(gdb)
 
 
#4 matt
05:01 Mar 28, 2006
Christophe, did you remember to remove the Menu_Button from its parent before calling Fl::delete_widget()?

I tried some code to duplicate your crash, but could not. Would it be possible to create a sample source code that crashes in the way you explain? Thanks.
 
 
#5 geuzaine.acm.caltech
05:43 Mar 28, 2006
> Christophe, did you remember to remove the Menu_Button from its parent
> before calling Fl::delete_widget()?

Matthias - I think so: when the menu button is created it is added to a scroll area, so I just call scroll->remove(menu_button) before Fl::delete_widget(menu_button). Is this the right thing to do?

>
> I tried some code to duplicate your crash, but could not. Would it be
> possible to create a sample source code that crashes in the way you
> explain? Thanks.

I tried pretty hard (but I guess not hard enough...) to create a simple stand-alone example that would reproduce the problem, but did not succeed. As I tried to explain in the bug report, the crash in my "big" application only occurs in certain cases, when the callback has a lot of heavy lifting to do. My guess is that somehow the code runs Fl::wait() before it actually gets to redraw() the menu_button, which causes the crash.

I could give you detailed instructions on how to reproduce the crash with my actual code if you think it could help?

--Christophe
 
 
#6 mooses
01:10 Apr 05, 2006
It looks like I am fighting a similar symptom.
I am using fltk-1.1.7 on RH-7.3 and RH-8.0. It is configured with:
--enable-shared --enable-xdbe --enable-threads

It is not a symptom specific to fltk-1.1.7, since I also see it with
fltk-1.1.6 and fltk-1.1.4.

What leads to the problems I encounter with my FLTK application is
similar to what Christophe does in his app:
In a button-press callback (load configuration) I have to do a
calculation (the faked calculation takes two seconds, the real one takes
up to 120 seconds). After this I remove() Fl_Group widgets from an
Fl_Scroll and then delete them directly (without Fl::delete_widget());
then some new Fl_Group widgets are created and add()'ed to the
Fl_Scroll. Then the callback is finished.

The symptom I see is that, after the callback has finished, the first
button press on an Fl_Counter produces 2 (!) consecutive callbacks
although the Fl_Counter button was pressed only once. After this the
Fl_Counter itself works fine again. The two consecutive callbacks occur
even when, prior to pressing the Fl_Counter button, there were callbacks
to some other widget (not of type Fl_Counter).

Even though the Fl_Counter itself works correctly again after the
multiple-callback symptom, the application crashes rather often after
the multiple callback has happened; not immediately, but after some
further FLTK interaction. The crash itself seems to be unrelated to the
multiple-callback symptom, but it only happens when some action led to
the multiple callback. In all other cases, where heavy FLTK interaction
is involved but no multiple-callback symptom occurs, the application
does not crash.
Doing some testing, I found that when the calculation time is reduced
to one second instead of two, the multiple-callback symptom no longer
occurs. Unfortunately, as described above, in the deployed application I
have to do a calculation of 120 seconds, so this finding is no help to
me.

On further testing I commented out the calls to Fl::redraw() after
remove()ing/delete'ing and creating/add()ing the Fl_Group widgets, to no
avail. So I cannot confirm Christophe's finding that removing redraw()
cures the problem.
I also commented out the delete after the remove() of the
Fl_Group widget, again to no avail.
What struck me further, and was the reason I switched from 1.1.6 to
1.1.7, is the following entry in the CHANGES log of fltk-1.1.7:
"Fl_Browser_ was calling the callback multiple times for a single
selection change with FL_WHEN_CHANGED (STR #834)"
I was hoping that this was somehow related to the multiple-callback
symptom I encounter; unfortunately it is not. Nevertheless there is a
similarity between what leads to the multiple callback in STR #834
and what leads to the multiple callback in my application:
the STR reporter says that he is using fl_ask(), and one of the first
actions I perform when the initial button-press callback (load
configuration) runs is fl_file_chooser().
Maybe FLTK gets into an inconsistent state when windows are opened or
closed in callbacks.

One more difference from Christophe: I am able to provoke and reproduce
the multiple-callback symptom, though the crash (probably) related to
it is a matter of time and luck and cannot be provoked at will.

Hope this helps.

Darius.
 
 
#7 AlbrechtS
16:22 Apr 19, 2006
When I read the following:

- the redraw() added after picked() in Fl_Menu_Button::popup() tries
  to access the widget, but somehow the widget has already been
  deleted. (I don't understand this, as it was my understanding that
  Fl::delete_widget(butt) was designed to avoid just that kind of
  issue.)

My first thought was: Isn't the popup implemented like a modal window, as is the case with fl_ask() and similar functions? If this is true, then there would be a chance that the widget would be deleted earlier than intended, because the full FLTK message loop is run while waiting for input.

Albrecht
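
The mechanism suspected in this comment can be modelled without FLTK at all. The following is a minimal sketch, assuming a toolkit that records deletion requests and only carries them out on the next pass of its event loop. `Toolkit`, `delete_later`, `wait` and the two callbacks are hypothetical names invented for this illustration; they are not FLTK's API or implementation, and widgets are represented by plain integer ids so that nothing is ever dereferenced after deletion.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for a toolkit with deferred deletion (hypothetical names).
struct Toolkit {
    std::set<int> alive;       // ids of widgets that currently exist
    std::vector<int> doomed;   // ids scheduled for deletion
    int next_id = 0;

    int create() { alive.insert(next_id); return next_id++; }

    // Deferred deletion: only records the request.
    void delete_later(int id) { doomed.push_back(id); }

    // One event-loop pass: scheduled widgets are really removed here.
    void wait() {
        for (int id : doomed) alive.erase(id);
        doomed.clear();
    }

    bool is_alive(int id) const { return alive.count(id) != 0; }
};

// Models popup(): run the callback, then report whether touching the
// button afterwards (as the added redraw() does) would still be safe.
template <typename Callback>
bool button_survives_callback(Callback cb) {
    Toolkit tk;
    int button = tk.create();
    cb(tk, button);
    bool safe = tk.is_alive(button);
    tk.wait();   // the outer loop eventually performs the cleanup
    return safe;
}

// Callback that only schedules the deletion and returns.
inline void quick_callback(Toolkit& tk, int button) {
    tk.delete_later(button);
}

// Callback that schedules the deletion and then lets the event loop run,
// for example through a modal dialog or a wait during a long computation.
inline void nested_loop_callback(Toolkit& tk, int button) {
    tk.delete_later(button);
    tk.wait();   // nested loop pass removes the button early
}
```

In this model `button_survives_callback(quick_callback)` reports that the button is still there when the callback returns, while `button_survives_callback(nested_loop_callback)` reports that it is already gone, which is the situation in which a later redraw() would touch freed memory.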
 
 
#8 matt
19:25 Apr 26, 2006
Christophe, I assume you solved the problem by now.

Anyway, the reason your callback is failing is that you mark the widget for deletion, but then allow Fl::flush() to be called before the callback returns. You must move the code for deleting the widget to the very end of the callback.

But reading your description, it seems to be a good idea to move your 200 second calculation entirely out of the callback and use timers or threads to keep your application interactive.
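
The timer variant of this advice can be sketched roughly as follows. This is a toy model under stated assumptions, not FLTK code: `Loop`, `add_timeout`, `step`, `ChunkedSum` and `sum_in_slices` are hypothetical names, the "calculation" is just a sum, and the timers fire with no delay. The point it illustrates is that the callback only schedules the first slice of work and returns, and every later slice runs from the event loop, so the loop keeps getting a chance to handle events in between.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Toy event loop with zero-delay timers (hypothetical; not an FLTK API).
struct Loop {
    std::queue<std::function<void()>> timers;
    int events_handled = 0;

    void add_timeout(std::function<void()> fn) { timers.push(std::move(fn)); }

    // One loop pass: account for one round of event handling, then run
    // at most one pending timer. Returns false once no timer was pending.
    bool step() {
        ++events_handled;
        if (timers.empty()) return false;
        std::function<void()> fn = std::move(timers.front());
        timers.pop();
        fn();
        return true;
    }
};

// Sums 1..n in slices; each slice reschedules itself until the job is done.
struct ChunkedSum {
    long n;
    long next = 1;
    long total = 0;
    bool done = false;

    explicit ChunkedSum(long limit) : n(limit) {}

    void run_slice(Loop& loop, long slice) {
        for (long i = 0; i < slice && next <= n; ++i) total += next++;
        if (next > n) done = true;
        else loop.add_timeout([this, &loop, slice] { run_slice(loop, slice); });
    }
};

// Models the whole flow: the "callback" schedules the first slice and
// returns at once; the loop then alternates event handling and slices.
inline long sum_in_slices(long n, long slice, int& loop_passes) {
    Loop loop;
    ChunkedSum job(n);
    loop.add_timeout([&job, &loop, slice] { job.run_slice(loop, slice); });
    while (loop.step()) {}
    loop_passes = loop.events_handled;
    return job.total;
}
```

With `n = 10` and `slice = 3` the work is spread over four timer invocations, and the loop makes a fifth pass that finds nothing left to run. A thread-based variant would hand the calculation to a worker instead, at the cost of having to synchronise any access to widgets.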
 
 
#9 matt
20:28 Apr 26, 2006
Reopened. Will take another look.  
 
#10 matt
14:43 Apr 27, 2006
Fixed using a smart pointer mechanism.

RFC: It would be possible to use these smart pointers for all cross-callback pointer needs, making the Fl::delete_widget function obsolete.
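
The thread does not show the actual fix, so the following is only a sketch of one way such a "smart pointer mechanism" could work, assuming a registry of watched pointer variables that are reset to null when the widget they refer to is destroyed. `Widget`, `watched`, `Watch` and `popup_then_redraw` are hypothetical names for this illustration and should not be read as the code committed to FLTK.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Widget;

// Registry of watched pointer variables (hypothetical, single-threaded).
inline std::vector<Widget**>& watched() {
    static std::vector<Widget**> slots;
    return slots;
}

struct Widget {
    bool damaged = false;
    void redraw() { damaged = true; }

    // On destruction, null every watched pointer that refers to this widget.
    ~Widget() {
        for (Widget** slot : watched())
            if (*slot == this) *slot = nullptr;
    }
};

// RAII guard: keeps one pointer variable registered for its own lifetime.
class Watch {
    Widget** slot_;
public:
    explicit Watch(Widget*& p) : slot_(&p) { watched().push_back(slot_); }
    ~Watch() {
        std::vector<Widget**>& v = watched();
        v.erase(std::remove(v.begin(), v.end(), slot_), v.end());
    }
    Watch(const Watch&) = delete;
    Watch& operator=(const Watch&) = delete;
};

// Models popup(): invoke the callback, then redraw only if the widget
// is still alive. Returns whether the redraw actually happened.
template <typename Callback>
bool popup_then_redraw(Widget* button, Callback cb) {
    Widget* tracked = button;
    Watch guard(tracked);
    cb(button);
    if (!tracked) return false;   // widget was deleted inside the callback
    tracked->redraw();
    return true;
}
```

If the callback deletes the button, `tracked` is null by the time the callback returns and the redraw is skipped; if the callback leaves the button alone, the redraw proceeds as before.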
 
     


Comments are owned by the poster. All other content is copyright 1998-2024 by Bill Spitzak and others. This project is hosted by The FLTK Team. Please report site problems to 'erco@seriss.com'.