
STR #3436


Application:FLTK Library
Status:5 - New
Priority:4 - High, e.g. key functionality not working
Scope:3 - Applies to all machines and operating systems
Subsystem:Core Library
Summary:use of isspace(), ispunct(), and others must correctly test unicode characters
Created By:AlbrechtS
Assigned To:AlbrechtS
Fix Version:Unassigned

Trouble Report Files:

Name/Time/Date Filename/Size  
#1 AlbrechtS
08:31 Nov 17, 2023

Trouble Report Comments:

Name/Time/Date Text  
#1 AlbrechtS
08:14 Nov 17, 2017
This STR is a placeholder for the use of all functions like isspace() and ispunct(). These functions are not Unicode-aware, and most of them, if not all, are defined only for int values in the range -1, 0, ..., 255, where -1 stands for EOF (end of file). Calling these functions with int values outside this range yields "undefined" results.

This has two issues:

(1) moderate: the result is wrong (all platforms).

(2) severe: "Debug" builds of Visual Studio run into an 'assert' failure and the program is terminated. The outcome on other platforms is at least "undefined" (i.e. it may crash as well).

Note: Visual Studio "Release" builds don't fail but return wrong results.

See fltk.coredev, thread "editor fails on cyrillic symbols":

"start test/editord.exe, copy 'oй' and paste into editor, press Ctrl+Left (previous word, etc) and the application fails!"

The above test scenario seems to be fixed now in FLTK 1.4 (but not in FLTK 1.3), but see Nikita's enumeration of other crashes in:

Nikita's comment cited here in case the link above doesn't work:

1) Double-clicking any Cyrillic word crashes Fl_Text_Editor; you
can test it with my previous example, even after Albrecht's patch.

2) The same action in Fl_Input gives the same result (inputd.exe helps you).

3) Putting a Cyrillic symbol after the first @ in a label crashes too
(I used Fluid).

4) Trying to generate code in Fluid when the fl file is in Russian. E.g.

In other words, I found places where isspace() (isalpha(), etc) is used
without mask ( & 255) and checked them. They are very suspicious ones.

- End of Citation -
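
For illustration, the "& 255" mask mentioned in the citation can be sketched as below. This is a minimal sketch, not FLTK code: the helper names byte_isspace(), byte_isalpha(), count_leading_spaces() and self_check() are hypothetical. Casting to unsigned char has the same effect as the mask and keeps the argument inside the range the functions are defined for, even where plain char is signed and a UTF-8 byte such as 0xD0 would otherwise be passed as a negative int.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helpers (not FLTK API): classify one byte taken from a char buffer.
// The cast to unsigned char maps the byte to 0..255, the same effect as "& 255".
static bool byte_isspace(char c) {
  return std::isspace(static_cast<unsigned char>(c)) != 0;
}

static bool byte_isalpha(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

// Count the leading whitespace bytes of a NUL-terminated string.
static int count_leading_spaces(const char *s) {
  int n = 0;
  while (s[n] != '\0' && byte_isspace(s[n])) ++n;
  return n;
}

// Small usage check, assuming the default "C" locale is active.
static void self_check() {
  assert(byte_isspace(' '));
  assert(byte_isalpha('a'));
  // Two spaces followed by the UTF-8 bytes of a Cyrillic letter (0xD0 0xB9).
  assert(count_leading_spaces("  \xD0\xB9") == 2);
}
```

Note that this only avoids the undefined behavior (issue 2 above). The bytes of a multi-byte UTF-8 character are still classified one by one, so the result for non-ASCII text (issue 1) is not made correct by the mask alone.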
#2 AlbrechtS
08:23 Nov 17, 2017
More detailed information: POSIX defines the following functions, some of them may be used in FLTK (or not). Reference man7.org, POSIX man pages:

isalpha, isalnum, isblank, iscntrl, isdigit, isgraph,
islower, isprint, ispunct, isspace, isupper, isxdigit

From the man page referenced above (isalpha):

"The c argument is an int, the value of which the application shall ensure is representable as an unsigned char or equal to the value of the macro EOF. If the argument has any other value, the behavior is undefined."

Note: the macro EOF usually has the value (-1), but its actual value is implementation-defined.
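
The requirement quoted above can be expressed as a range check. The following is a minimal sketch under the assumption that callers hold an int that may be a full Unicode code point; in_ctype_domain(), checked_ispunct() and self_check() are hypothetical names, not existing FLTK or POSIX functions.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <cstdio>

// Hypothetical helper: true if c may legally be passed to the <cctype>
// functions, i.e. it is representable as unsigned char or equal to EOF.
static bool in_ctype_domain(int c) {
  return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}

// Hypothetical wrapper: reports "not punctuation" for out-of-domain values
// instead of invoking undefined behavior.
static bool checked_ispunct(int c) {
  return in_ctype_domain(c) && std::ispunct(c) != 0;
}

// Small usage check, assuming the default "C" locale is active.
static void self_check() {
  assert(in_ctype_domain('a'));
  assert(in_ctype_domain(EOF));
  assert(!in_ctype_domain(0x439));  // U+0439, a Cyrillic code point
  assert(checked_ispunct('!'));
  assert(!checked_ispunct(0x439));
}
```

Returning false for out-of-domain values is a design choice made for this sketch only; it prevents the crash but does not classify Unicode punctuation correctly.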
#3 AlbrechtS
08:38 Nov 17, 2023
FWIW, I posted a shell script 'check_isalpha' that can be used to find all occurrences of the mentioned functions that are used in current FLTK (Git master, commit 44bb080c0ff81b16d48dccd8d15809f058cc68ea).

This needs more investigation:

1. Check that every use of these functions passes the correct parameter type and that the return value is properly tested.

2. Calls to functions like 'isupper()' must not test parts (single bytes) of multi-byte UTF-8 sequences.
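
Point 2 could look roughly like the sketch below: a byte is handed to isupper() only if it is a complete ASCII character, and every byte with the high bit set (a lead or continuation byte of a UTF-8 sequence) is skipped. The names is_ascii_byte(), count_ascii_upper() and self_check() are hypothetical and only serve this illustration.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper: true if the byte is a complete ASCII character.
// All bytes of multi-byte UTF-8 sequences have the high bit set (>= 0x80).
static bool is_ascii_byte(unsigned char b) {
  return b < 0x80;
}

// Hypothetical helper: count uppercase ASCII letters in a UTF-8 string
// without ever testing a byte that belongs to a multi-byte sequence.
static int count_ascii_upper(const char *s) {
  int n = 0;
  for (; *s != '\0'; ++s) {
    unsigned char b = static_cast<unsigned char>(*s);
    if (is_ascii_byte(b) && std::isupper(b)) ++n;
  }
  return n;
}

// Small usage check: 'A', the UTF-8 bytes 0xD0 0x99 (U+0419), 'b', 'C'.
static void self_check() {
  assert(count_ascii_upper("A\xD0\x99" "bC") == 2);
}
```

With this guard, non-ASCII uppercase letters are simply not counted. Classifying them correctly would require decoding the whole sequence into a code point first and then using a Unicode-aware test, which is beyond this sketch.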


Comments are owned by the poster. All other content is copyright 1998-2024 by Bill Spitzak and others. This project is hosted by The FLTK Team. Please report site problems to 'erco@seriss.com'.