STR #1162
Application: | FLTK Library |
Status: | 1 - Closed w/Resolution |
Priority: | 5 - Critical, e.g. nothing working at all |
Scope: | 2 - Specific to an operating system |
Subsystem: | Core Library |
Summary: | FLTK crash on Windows (somehow related to recent changes in Fl_Menu_Button) |
Version: | 1.1.7 |
Created By: | geuzaine.acm.caltech |
Assigned To: | matt |
Fix Version: | 1.1-current (SVN: v5037) |
Trouble Report Files:
No files
Trouble Report Comments:
|
#1 | geuzaine.acm.caltech 18:10 Jan 30, 2006 |
| Some of my users encounter crashes on Windows (with Cygwin) when my application is linked against FLTK 1.1.7.
After investigating for a while, I found that the culprit seems to be the redraw() call recently added at the end of Fl_Menu_Button::popup().
Here is what I think might be happening:
- In my application the callback associated with the Fl_Menu_Button is designed to delete the menu button. It does this by calling Fl::delete_widget(butt) (prior to 1.1.6 I used to just do "delete butt").
- the redraw() added after picked() in Fl_Menu_Button::popup() tries to access the widget, but somehow the widget has already been deleted. (I don't understand this, as it was my understanding that Fl::delete_widget(butt) was designed to avoid just that kind of issue.)
This is of course all hypothetical: I don't know the internals of FLTK well enough to determine if this actually makes sense.
To make things worse I cannot get a stack trace on my Windows machine (gdb consistently crashes on startup with any FLTK program, even the test programs in fltk-1.1/test). And (of course!) I cannot reproduce the problem on Mac OS X or on Linux.
Also, in order to reproduce the crash on Windows, it seems that a certain series of events must happen in a certain order. Could there be some race condition somewhere (with an Fl::wait() called too early, forcing an early deletion of the widget)? I tried to build a small program that reproduces the behavior, but I failed miserably...
Anyway, reverting Fl_Menu_Button.cxx to the svn revision {"2006-01-10"} fixes the problem, as does simply commenting out the second call to redraw() at the end of Fl_Menu_Button::popup().
Not sure what a real fix for this problem is, though.
--Christophe | |
|
#2 | geuzaine.acm.caltech 19:59 Jan 31, 2006 |
| Update: the crash actually also happens on Linux. (So it's not a Windows-only thing.) | |
|
#3 | geuzaine.acm.caltech 18:41 Feb 27, 2006 |
| Here is the stack trace on Windows after the crash:
Program received signal SIGSEGV, Segmentation fault.
Fl_Widget::damage (this=0x34558c0, fl=128, X=-274, Y=-274, W=-274, H=-274) at ../FL/Fl_Widget.H:110
110 uchar type() const {return type_;}
(gdb)
(gdb) backtrace
#0 Fl_Widget::damage (this=0x34558c0, fl=128, X=-274, Y=-274, W=-274, H=-274) at ../FL/Fl_Widget.H:110
#1 0x00580b95 in Fl_Widget::damage (this=0x34558c0, fl=128 '\200') at ../FL/Fl_Widget.H:113
#2 0x00580bda in Fl_Widget::redraw (this=0x34558c0) at Fl.cxx:1035
#3 0x005912f8 in Fl_Menu_Button::popup (this=0x34558c0) at Fl_Menu_Button.cxx:62
#4 0x00591396 in Fl_Menu_Button::handle (this=0x34558c0, e=1) at Fl_Menu_Button.cxx:89
#5 0x0058ec4e in send (o=0x34558c0, event=54876352) at Fl_Group.cxx:67
#6 0x0058eefa in Fl_Group::handle (this=0x3d9978, event=1) at Fl_Group.cxx:195
#7 0x0058ec4e in send (o=0x3d9978, event=1) at Fl_Group.cxx:67
#8 0x0058eefa in Fl_Group::handle (this=0x3d96d0, event=1) at Fl_Group.cxx:195
#9 0x005802dc in send (event=1, to=0x3d96d0, window=0xfffffeee) at Fl.cxx:662
#10 0x005806fb in Fl::handle (e=1, window=0x3d96d0) at Fl.cxx:700
#11 0x00581cfb in mouse_event (window=0x3e, what=143, button=1, wParam=1, lParam=4063375) at Fl_win32.cxx:546
#12 0x00583b37 in WndProc (hWnd=0x2900dc, uMsg=513, wParam=1, lParam=4063375) at Fl_win32.cxx:739
#13 0x77d43a5f in USER32!CreateWindowExA () from /cygdrive/c/WINDOWS/system32/user32.dll
#14 0x77d43b2e in USER32!CreateWindowExA () from /cygdrive/c/WINDOWS/system32/user32.dll
#15 0x77d43d6a in USER32!CreateWindowExA () from /cygdrive/c/WINDOWS/system32/user32.dll
#16 0x77d441fd in USER32!DispatchMessageA () from /cygdrive/c/WINDOWS/system32/user32.dll
#17 0x009f32a0 in fl_selection_buffer ()
#18 0x00000001 in ?? ()
#19 0x0058140c in fl_wait (time_to_wait=1e+20) at Fl_win32.cxx:291
#20 0x00581696 in Fl::wait (time_to_wait=1e+20) at Fl.cxx:289
#21 0x005831c5 in Fl::run () at Fl.cxx:357
#22 0x004018b3 in main (argc=3, argv=0x3d26e0) at Main.cpp:247
(gdb) | |
|
#4 | matt 05:01 Mar 28, 2006 |
| Christophe, did you remember to remove the Menu_Button from its parent before calling Fl::delete_widget()?
I tried some code to duplicate your crash, but could not. Would it be possible to create a sample source code that crashes in the way you explain? Thanks. | |
|
#5 | geuzaine.acm.caltech 05:43 Mar 28, 2006 |
| > Christophe, did you remember to remove the Menu_Button from its parent
> before calling Fl::delete_widget()?
Matthias - I think so: when the menu button is created it is added to a scroll area, so I just call scroll->remove(menu_button) before Fl::delete_widget(menu_button). Is this the right thing to do?
> I tried some code to duplicate your crash, but could not. Would it be
> possible to create a sample source code that crashes in the way you
> explain? Thanks.
I tried pretty hard (but I guess not hard enough...) to create a simple stand-alone example that would reproduce the problem, but did not succeed. As I tried to explain in the bug report, the crash in my "big" application only occurs in certain cases, when the callback has a lot of heavy lifting to do. My guess is that somehow the code runs Fl::wait() before it actually gets to redraw() the menu_button, which causes the crash.
I could give you detailed instructions on how to reproduce the crash with my actual code if you think it could help?
--Christophe | |
|
#6 | mooses 01:10 Apr 05, 2006 |
| It looks like I am fighting a similar symptom. I am using fltk-1.1.7 on RH-7.3 and RH-8.0. It is configured with: --enable-shared --enable-xdbe --enable-threads
It is not a symptom specific to fltk-1.1.7, since I see it with fltk-1.1.6 and also with fltk-1.1.4.
What leads to the problems I encounter with my FLTK application is similar to what Christophe does in his app: in a button-press callback (load configuration) I have to do a calculation (the faked calculation takes two seconds, the real calculation takes up to 120 seconds). After this I remove() Fl_Group widgets from an Fl_Scroll and then delete them (no Fl::delete_widget()); then some new Fl_Group widgets are created and add()ed to the Fl_Scroll. Then the callback is finished.
The symptom I see is that, after the callback has finished, the first button press on an Fl_Counter produces 2 (!) consecutive callbacks although the Fl_Counter button was pressed only once. After this the Fl_Counter itself works fine again. The two consecutive callbacks occur even when, prior to pressing the Fl_Counter button, there were callbacks to some other widget (not of Fl_Counter type).
Even though the Fl_Counter itself works correctly again after the multiple-callback symptom, the application crashes rather often after the multiple callback has happened; not immediately, but after some further FLTK interaction. The crash itself seems to be unrelated to the multiple-callback symptom, but it only happens when there was some action leading to the multiple callback. In all other cases, where heavy FLTK interaction is involved but no multiple-callback symptom, the application does not crash.
Doing some testing, I found that when the calculation time is reduced to one second instead of two, the multiple-callback symptom no longer occurs. Unfortunately, as described above, in the deployed application I have to do a calculation of 120 seconds, so this finding is no help to me.
On further testing I commented out the calls to Fl::redraw() after remove()ing/deleting and creating/add()ing the Fl_Group widgets, to no avail. So I cannot confirm Christophe's finding that removing redraw() cures the problem.
I also commented out the delete after the remove() of the Fl_Group widget, again to no avail.
What struck me further, and was the reason I switched from 1.1.6 to 1.1.7, is the following entry in the CHANGES log of fltk-1.1.7: "Fl_Browser_ was calling the callback multiple times for a single selection change with FL_WHEN_CHANGED (STR #834)". I was hoping that this was somehow related to the multiple-callback symptom I encounter; unfortunately not. Nevertheless there is a similarity between what leads to the multiple callback in STR #834 and what leads to the multiple callback in my application: the STR reporter says that he is using fl_ask(); one of the first actions I do when the initial button-press callback (load configuration) is done is fl_file_chooser(). Maybe FLTK gets into an inconsistent state when opening/closing windows in callbacks.
One more difference to Christophe: I am able to provoke and reproduce the multiple-callback symptom, though the crash (probably) related to it is a matter of time and luck and cannot be provoked at will.
Hope this helps.
Darius. | |
|
#7 | AlbrechtS 16:22 Apr 19, 2006 |
| When I read the following:
- the redraw() added after picked() in Fl_Menu_Button::popup() tries to access the widget, but somehow the widget has already been deleted. (I don't understand this, as it was my understanding that Fl::delete_widget(butt) was designed to avoid just that kind of issue.)
My first thought was: Isn't the popup implemented like a modal window, as is the case with fl_ask() and similar functions? If this is true, then there would be a chance that the widget would be deleted earlier than intended, because the full FLTK message loop is run while waiting for input.
Albrecht | |
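Albrecht's hypothesis can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `Toolkit`, `delete_later()`, `wait()` and `widget_survives_callback()` are hypothetical names, not FLTK API, and the model assumes that scheduled deletions are carried out whenever the event loop runs. It shows that a widget scheduled for deletion is still valid when the callback returns directly, but is already gone if a nested event loop runs inside the callback.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Simplified stand-in for a toolkit with deferred widget deletion.
// None of these names are FLTK API; they only mimic the behaviour discussed.
struct Widget { int id; };

struct Toolkit {
    std::set<Widget*> live;        // widgets that currently exist
    std::vector<Widget*> pending;  // widgets scheduled for deletion

    Widget* create(int id) {
        Widget* w = new Widget{id};
        live.insert(w);
        return w;
    }

    // Deferred delete: only records the request.
    void delete_later(Widget* w) { pending.push_back(w); }

    // One pass of the event loop: pending deletions are carried out here.
    void wait() {
        for (Widget* w : pending) {
            if (live.erase(w)) delete w;
        }
        pending.clear();
    }

    bool alive(Widget* w) const { return live.count(w) != 0; }
};

// Models a callback that schedules the deletion of its own widget and then,
// optionally, runs a nested event loop before returning (as a modal dialog
// or a long task that pumps events might). Returns whether the widget still
// exists at the point where the caller would touch it again.
bool widget_survives_callback(bool nested_loop) {
    Toolkit tk;
    Widget* button = tk.create(1);
    tk.delete_later(button);
    if (nested_loop) tk.wait();    // deletion happens earlier than intended
    bool ok = tk.alive(button);    // the caller's redraw() would happen here
    tk.wait();                     // normal return to the main loop
    assert(!tk.alive(button));     // either way the widget is gone afterwards
    return ok;
}
```

In this model `widget_survives_callback(false)` reports a live widget and `widget_survives_callback(true)` does not, which matches the crash in redraw() seen in the backtrace of comment #3.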
|
#8 | matt 19:25 Apr 26, 2006 |
| Christophe, I assume you solved the problem by now.
Anyway, the reason your callback is failing is that you mark the widget for deletion, but then allow Fl::flush() to be called before the callback returns. You must move the code for deleting the widget to the very end of the callback.
But reading your description, it seems to be a good idea to move your 200-second calculation entirely out of the callback and use timers or threads to keep your application interactive. | |
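The timer suggestion amounts to splitting the long calculation into bounded slices and running one slice per timer tick, so that the event loop gets control back in between. The following is a minimal stand-alone sketch of that idea. `ChunkedSum` and `run_in_ticks()` are hypothetical names, the summation is a placeholder for the real calculation, and the `while` loop stands in for repeated timer callbacks; no FLTK calls are used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical long-running job, split so that each call to step() does a
// bounded amount of work and then returns control to the caller.
struct ChunkedSum {
    std::vector<int> data;
    std::size_t next = 0;  // index of the first unprocessed item
    long total = 0;        // result accumulated so far

    explicit ChunkedSum(std::vector<int> d) : data(std::move(d)) {}

    // Process at most `slice` items; return true while work remains.
    bool step(std::size_t slice) {
        if (slice == 0) slice = 1;  // always make progress
        std::size_t end = std::min(next + slice, data.size());
        for (; next < end; ++next) total += data[next];
        return next < data.size();
    }
};

// Stand-in for the event loop: each iteration corresponds to one timer tick.
// Between ticks a real toolkit would handle events and redraw, so the
// application stays interactive. Returns the number of ticks used.
int run_in_ticks(ChunkedSum& job, std::size_t slice) {
    int ticks = 1;
    while (job.step(slice)) ++ticks;
    assert(job.next == job.data.size());  // all items were processed
    return ticks;
}
```

With this structure, any widget deletion can be done in the final tick, after the last slice, so that nothing in the callback runs after the widget has been scheduled for deletion.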
|
#9 | matt 20:28 Apr 26, 2006 |
| Reopened. Will take another look. | |
|
#10 | matt 14:43 Apr 27, 2006 |
| Fixed using a smart pointer mechanism.
RFC: It would be possible to use these smart pointers for all cross-callback pointer needs, making the Fl::delete_widget function obsolete. | |
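The ticket does not show the actual patch, so the following is only a sketch of how a "smart pointer" of this kind could work, under the assumption that the toolkit keeps a list of watched pointer variables and clears every one that refers to a widget being destroyed. `PointerWatch`, `watch()`, `release()`, `destroy()` and `popup_then_redraw()` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimal widget stand-in that counts redraw requests.
struct Widget {
    int redraws = 0;
    void redraw() { ++redraws; }
};

// Hypothetical registry of watched pointer variables. When a widget is
// destroyed, every watched pointer that refers to it is set to null.
struct PointerWatch {
    std::vector<Widget**> watched;

    void watch(Widget*& p) { watched.push_back(&p); }

    void release(Widget*& p) {
        watched.erase(std::remove(watched.begin(), watched.end(), &p),
                      watched.end());
    }

    void destroy(Widget* w) {
        for (Widget** slot : watched) {
            if (*slot == w) *slot = nullptr;
        }
        delete w;
    }
};

// Models the tail of a popup() function: run the callback, then redraw the
// button only if it survived. Returns true if redraw() was called.
bool popup_then_redraw(PointerWatch& reg, Widget* button, bool callback_deletes) {
    assert(button != nullptr);
    Widget* self = button;
    reg.watch(self);                            // track our pointer during the callback
    if (callback_deletes) reg.destroy(button);  // callback deletes the widget
    bool redrawn = false;
    if (self) {                                 // null means the widget is gone
        self->redraw();
        redrawn = true;
    }
    reg.release(self);                          // stop watching before `self` goes out of scope
    return redrawn;
}
```

The design choice here is that the code holding the pointer does not need to know how or when the widget was destroyed; it only checks its own pointer for null before using it.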