STR #3355


Application:FLTK Library
Status:1 - Closed w/Resolution
Priority:1 - Request for Enhancement, e.g. asking for a feature
Scope:3 - Applies to all machines and operating systems
Subsystem:FLUID
Summary:Support generation of UTF-8 file from FLUID
Version:1.4-feature
Created By:JYG
Assigned To:matt
Fix Version:1.4.0
Fix Commit:b490ce3463e9008d03224feb44c8b365a8e21954
Trouble Report Files:


#1 AlbrechtS, 05:24 Nov 22, 2016: test.fl (0k)
#2 AlbrechtS, 05:27 Nov 22, 2016: main.cxx (0k)
#3 AlbrechtS, 05:27 Nov 22, 2016: fluid_write_code_utf8.patch (1k)

Trouble Report Comments:


Name/Time/Date Text  
 
#1 JYG
05:42 Nov 21, 2016
FLUID generates .cxx files in which UTF-8 strings are written as ASCII with octal escapes. It's annoying to see "\303\251" instead of "é", and it makes searching for a string in the code impossible.
I think FLUID should have an option for more modern file generation that writes UTF-8 files, with or without a BOM.
 
 
#2 AlbrechtS
05:24 Nov 22, 2016
For more information and the full discussion of this topic please see this thread in fltk.general:
https://groups.google.com/forum/#!topic/fltkgeneral/gf0Z3BW-zuc

This is an edited excerpt of one of my replies:

A string can always contain characters that must be quoted: control characters (decimal 0-31, e.g. 10 = 0x0a = <LF> = '\n') and DEL (decimal 127). The current fluid code also quotes all values in the range 128 to 255.
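
To illustrate the quoting rule described above, here is a minimal sketch. It is not fluid's actual code: the function name quote_octal is hypothetical, and the handling of '"' and '\' is an assumption added so that the output stays a valid C string literal.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not taken from fluid): quote control characters
// (0-31), DEL (127), and all bytes in the range 128-255 as octal escapes.
std::string quote_octal(const std::string &s) {
  std::string out;
  for (unsigned char c : s) {            // treat every byte as 0-255
    if (c == '"' || c == '\\') {         // assumption: keep the literal valid
      out += '\\';
      out += static_cast<char>(c);
    } else if (c < 32 || c >= 127) {
      char buf[5];                       // backslash + 3 octal digits + NUL
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += static_cast<char>(c);       // printable ASCII stays as-is
    }
  }
  return out;
}
```

With this rule the two UTF-8 bytes of "é" (0xC3 0xA9) come out as \303\251, which is the form reported in comment #1. Always writing three octal digits keeps a following digit character from being absorbed into the escape.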

I did not write the code, but I can only assume that this [was done because it] is always safe for all compilers...

The patch I append should work for all Unicode characters if the compiler interprets strings as UTF-8.

Now to the patch: I attach three files to this post for later reference:

(1) test.fl: a fluid file with all ISO-8859-1 characters encoded as UTF-8 (only the extended range, not the ASCII part), i.e. Unicode range U+00a0 to U+00ff. This is also a subset of Microsoft's Windows Codepage 1252 ("Western").

(2) main.cxx: a main program to compile test.cxx. This #include's test.cxx and indirectly test.h generated by fluid from test.fl.

(3) fluid_write_code_utf8.patch: the patch against FLTK 1.3.4 (stable release).

This patch basically does three things:

 - Read character string bytes as "unsigned" values, i.e. in the range 0-255.
 - Don't limit the line length, so that lines are never broken inside UTF-8 characters.
 - Write all ASCII and UTF-8 characters literally, i.e. without quoting.
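
The first and third points can be sketched roughly as follows. This is an illustration only, not the contents of the attached patch: the name quote_literal_utf8 is hypothetical, and it is assumed that control characters, DEL, '"' and '\' still need quoting.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not the attached patch): write printable ASCII and
// all bytes >= 128 literally; quote only what a C string literal requires.
std::string quote_literal_utf8(const std::string &s) {
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i) {
    // Read the byte as unsigned so that values 128-255 do not turn
    // negative on platforms where plain char is signed.
    unsigned char c = static_cast<unsigned char>(s[i]);
    if (c == '"' || c == '\\') {
      out += '\\';
      out += s[i];
    } else if (c < 32 || c == 127) {     // control characters and DEL
      char buf[5];                       // backslash + 3 octal digits + NUL
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += s[i];                       // ASCII and UTF-8 bytes, unchanged
    }
  }
  return out;
}
```

The result is only correct if the compiler interprets the source file as UTF-8, which is the condition stated for the patch.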

You may use this patch if it works for you. Note that this is tested with the posted test cases, but I'm not sure if this will be okay for all users and compilers.

A "complete" solution would split strings (limit line length) w/o breaking inside UTF-8 characters and would presumably have an option to switch literal UTF-8 output on and off (on: literal/new vs. off: octal-quoted/old behavior).
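
One way to limit the line length without breaking inside a UTF-8 character is to back up from the intended cut position while the byte there is a continuation byte (bit pattern 10xxxxxx). The sketch below assumes valid UTF-8 input; the names split_utf8 and utf8_is_continuation are hypothetical and do not come from fluid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
static bool utf8_is_continuation(unsigned char c) {
  return (c & 0xC0) == 0x80;
}

// Hypothetical helper: split s into chunks of at most max_bytes bytes,
// never cutting inside a multi-byte UTF-8 sequence.
std::vector<std::string> split_utf8(const std::string &s,
                                    std::size_t max_bytes) {
  assert(max_bytes >= 4);                // longest UTF-8 sequence is 4 bytes
  std::vector<std::string> chunks;
  std::size_t start = 0;
  while (start < s.size()) {
    std::size_t end = start + max_bytes;
    if (end >= s.size()) {
      end = s.size();                    // last chunk takes the rest
    } else {
      // Back up until s[end] is the first byte of a character.
      while (end > start &&
             utf8_is_continuation(static_cast<unsigned char>(s[end])))
        --end;
      if (end == start)                  // malformed input: cut hard
        end = start + max_bytes;
    }
    chunks.push_back(s.substr(start, end - start));
    start = end;
  }
  return chunks;
}
```

Each chunk could then be written as its own quoted string on a separate line, since adjacent string literals are concatenated by the compiler.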

Note: the posted patch is for FLTK 1.3 and contains only the minimal changes. The complete solution should be in FLTK 1.4 with an option to switch formats as described above.
 
 
#3 matt
12:29 Dec 17, 2021
Fixed in Git repository.  
 
     

Comments are owned by the poster. All other content is copyright 1998-2024 by Bill Spitzak and others. This project is hosted by The FLTK Team. Please report site problems to 'erco@seriss.com'.