STR #3436
Application: | FLTK Library |
Status: | 5 - New |
Priority: | 4 - High, e.g. key functionality not working |
Scope: | 3 - Applies to all machines and operating systems |
Subsystem: | Core Library |
Summary: | use of isspace(), ispunct(), and others must correctly test unicode characters |
Version: | 1.4-current |
Created By: | AlbrechtS |
Assigned To: | AlbrechtS |
Fix Version: | Unassigned |
Trouble Report Files:
Trouble Report Comments:
|
#1 | AlbrechtS 08:14 Nov 17, 2017 |
| This STR is a placeholder for the use of all functions like isspace() and ispunct(). These functions are not Unicode-aware, and most of them, if not all, are defined only for ints in the range -1, 0, ..., 255, where -1 stands for EOF (end of file). Calling them with an int outside this range yields undefined behavior.
This causes two problems:
(1) moderate: the result is wrong (all platforms).
(2) severe: "Debug" builds of Visual Studio run into an 'assert' failure and the program is terminated. The outcome on other platforms is at least "undefined" (i.e. it may crash as well).
Note: Visual Studio "Release" builds don't fail, but they return wrong results.
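The two problems can be illustrated with a minimal sketch. The helper names in_ctype_domain() and masked_isspace() are hypothetical and not part of FLTK. The first helper checks whether a value may legally be passed to the <cctype> functions; the second applies the '& 255' mask mentioned in the citation below, which avoids the out-of-range argument but can still give a wrong answer for a Unicode code point.

```cpp
#include <cctype>   // std::isspace
#include <climits>  // UCHAR_MAX
#include <cstdio>   // EOF

// Hypothetical helper (not an FLTK function): returns true if 'c' is a
// value that may legally be passed to isspace(), ispunct(), etc.
static bool in_ctype_domain(int c) {
  return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}

// Hypothetical helper: masks a code point down to its low byte before
// calling isspace(). This keeps the argument in range (no assert, no
// undefined behavior), but the low byte of a code point is unrelated
// to the character, so the result can be wrong.
static bool masked_isspace(unsigned int ucs) {
  return std::isspace(static_cast<int>(ucs & 255u)) != 0;
}
```

For example, U+0439 (CYRILLIC SMALL LETTER SHORT I, 'й') is outside the valid domain, and U+0420 (CYRILLIC CAPITAL LETTER ER) has the low byte 0x20, so masked_isspace(0x0420) reports a letter as a space.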
See fltk.coredev, thread "editor fails on cyrillic symbols": https://groups.google.com/d/msg/fltkcoredev/Yo3LN8jPe0A/TJBj-NzzDAAJ
"start test/editord.exe, copy 'oй' and paste into editor, press Ctrl+Left (previous word, etc) and the application fails!"
The above test scenario seems to be fixed now in FLTK 1.4 (but not in FLTK 1.3), but see Nikita's enumeration of other crashes in: https://groups.google.com/d/msg/fltkcoredev/Yo3LN8jPe0A/tn8HE9cKDQAJ
Nikita's comment cited here in case the link above doesn't work:
1) Double-clicking any Cyrillic word crashes Fl_Text_Editor; you can test it with my previous example, even after Albrecht's patch.
2) The same action in Fl_Input gives the same result (inputd.exe helps you).
3) Putting a Cyrillic symbol after the first @ in a label crashes too (I used Fluid).
4) Trying to generate code in Fluid when the .fl file name is in Russian, e.g. 'Тест.fl'.
In other words, I found places where isspace() (isalpha(), etc.) is used without a mask (& 255) and checked them. They are very suspicious.
- End of Citation - | |
|
#2 | AlbrechtS 08:23 Nov 17, 2017 |
| More detailed information: POSIX defines the following functions; some of them may be used in FLTK (or not). Reference: man7.org, POSIX man pages: http://man7.org/linux/man-pages/man3/isalpha.3p.html
isalpha, isalnum, isblank, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, isxdigit
From the man page referenced above (isalpha):
"The c argument is an int, the value of which the application shall ensure is representable as an unsigned char or equal to the value of the macro EOF. If the argument has any other value, the behavior is undefined."
Note: the macro EOF usually has the value -1, but its actual value is implementation-defined. | |
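A common way to satisfy the requirement quoted above when the input is a plain char (which may be signed) is to convert it to unsigned char before the call. The following is only a sketch; byte_isspace() and byte_isalpha() are hypothetical wrapper names, not existing FLTK functions.

```cpp
#include <cctype>  // std::isspace, std::isalpha

// Hypothetical wrappers (not FLTK functions). Converting to unsigned char
// guarantees the argument is in the range 0..UCHAR_MAX, so the call is
// well-defined even if plain char is signed and the byte is >= 0x80.
static bool byte_isspace(char c) {
  return std::isspace(static_cast<unsigned char>(c)) != 0;
}

static bool byte_isalpha(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) != 0;
}
```

This only makes the call well-defined for single bytes. It does not make the functions Unicode-aware: a byte that belongs to a multi-byte UTF-8 sequence is still classified according to the current locale's single-byte rules.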
|
#3 | AlbrechtS 08:38 Nov 17, 2017 |
| FWIW, I posted a shell script 'check_isalpha' that can be used to find all occurrences of the mentioned functions that are used in current FLTK (Git master, commit 44bb080c0ff81b16d48dccd8d15809f058cc68ea).
This needs more investigation:
1. Check that every use of these functions passes a correct parameter type and that the return value is properly tested.
2. Code that calls functions like 'isupper()' must make sure it never tests parts (single bytes) of multi-byte UTF-8 sequences. | |
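One possible approach to point 2, shown here only as a sketch: in UTF-8, every byte of a multi-byte sequence has its high bit set, so a check can refuse to classify any byte >= 0x80. The names is_ascii_byte(), ascii_isupper() and count_ascii_upper() are hypothetical and not part of FLTK.

```cpp
#include <cctype>   // std::isupper
#include <cstddef>  // std::size_t
#include <string>

// Hypothetical helper: true if the byte is a complete ASCII character,
// i.e. not a lead or continuation byte of a multi-byte UTF-8 sequence.
static bool is_ascii_byte(char c) {
  return (static_cast<unsigned char>(c) & 0x80) == 0;
}

// Hypothetical helper: classifies ASCII bytes only. Bytes that are part
// of a multi-byte UTF-8 sequence are never passed to isupper().
static bool ascii_isupper(char c) {
  return is_ascii_byte(c) &&
         std::isupper(static_cast<unsigned char>(c)) != 0;
}

// Counts uppercase ASCII letters in a UTF-8 encoded string.
static std::size_t count_ascii_upper(const std::string &utf8) {
  std::size_t n = 0;
  for (char c : utf8) {
    if (ascii_isupper(c)) ++n;
  }
  return n;
}
```

The limitation of this sketch is that non-ASCII characters are always reported as "not uppercase". Classifying them correctly would require decoding whole code points first (for example with FLTK's fl_utf8decode()) and then using Unicode-aware tests, which is beyond this example.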