STR #3197

Application:FLTK Library
Status:1 - Closed w/Resolution
Priority:3 - Moderate, e.g. unable to compile the software
Scope:2 - Specific to an operating system
Subsystem:Unicode support
Summary:Odd behavior with fl_input and locale
Version:1.3.3
Created By:AlainBandon
Assigned To:AlbrechtS
Fix Version:Will Not Fix
Trouble Report Files:

No files


Trouble Report Comments:


#1 AlainBandon
08:28 Feb 23, 2015
When typing an accented letter such as é, è or à (from a French keyboard) in a fl_input box, the letter is displayed correctly, but when I read the string back from the code (with a buffer->text() ), it contains Ã© (0xc3 0xa9) instead (probably the ascii representation of the utf8 encoding), while all other letters of the string are correct.

Conversely, when I insert from the code a string containing é (0xe9) (encoded in ascii with the locale), the display is still correct, but the cursor behaves strangely when I try to place it just before the é: it is impossible to do so by clicking (only the left arrow key works). Even stranger, any string ending with "é\n" is displayed in a multiline text_display as if the \n were not there.


Long story short, this problem makes my text search input a real disaster when dealing with those letters. Is there any quick fix or workaround for at least getting the "real string" with ascii-only chars?
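
A possible explanation for the swallowed "\n" and the unreachable cursor position, offered as an assumption and not confirmed anywhere in this report: in UTF-8, the byte 0xE9 has the bit pattern of a lead byte that announces a three-byte sequence, so a decoder that advances by the announced length may step over the bytes that follow it. The sketch below is standalone C++ (the name utf8_expected_length is hypothetical, not an FLTK function) and shows only the lead-byte rule.

```cpp
#include <cstddef>

// Number of bytes a UTF-8 decoder expects for a sequence that starts with
// `lead`. Returns 0 for a byte that cannot start a sequence (a continuation
// byte or an invalid lead byte).
// Hypothetical helper for illustration; not part of FLTK.
std::size_t utf8_expected_length(unsigned char lead) {
    if (lead < 0x80) return 1;                   // 0xxxxxxx: plain ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  // 110xxxxx
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  // 1110xxxx (0xE9 falls here)
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  // 11110xxx
    return 0;                                    // 0x80..0xC1, 0xF5..0xFF
}
```

With a locale-encoded "é\n" (bytes 0xE9 0x0A), the lead-byte rule claims three bytes for the é, which would cover the newline if the decoder does not check continuation bytes.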
 
 
#2 AlainBandon
08:41 Feb 23, 2015
Erratum:
myInput->value() instead of ->buffer()->text()


I found the magic function that deciphers the buffer for the display:
const char* Fl_Input_::expand(const char* p, char* buf) const

Is there a way to use this function or get the result of it ?
 
 
#3 AlbrechtS
09:54 Feb 23, 2015
FLTK 1.3 uses exclusively UTF-8 text encoding. There are options though that make FLTK accept _some_ text in ISO-8859-1 encoding and (maybe) display it accordingly instead of displaying an error. This is probably what you are seeing if you enter text in your locale into any widget by using value(char *). All user input, however, is definitely encoded in UTF-8. See this link:

http://www.fltk.org/doc-1.3/migration_1_3.html

Citation: "It is important that, although your software uses only ASCII characters for input to FLTK widgets, the user may enter non-ASCII characters, and FLTK will return these characters with UTF-8 encoding to your application, e.g. via Fl_Input::value(). You will need to re-encode them to your (non-UTF-8) encoding, otherwise you might see or print garbage in your data."

And this means really ASCII (range 32-126 for printable characters).

If you need more info about Unicode and UTF-8 please consult this link:
http://www.fltk.org/doc-1.3/unicode.html

That said, what you are seeing might seem to work, but _ALL_ user input will be encoded in UTF-8, so if you save any user input data to a file, take care of this. If you use a FLTK input widget (your search input) to search text that is encoded in another encoding, this is not going to work.

You may be able to convert your input text to your locale encoding, but this is OT here (this behavior is not a bug). If you have further questions about how to solve your problem, please ask in fltk.general.
https://groups.google.com/forum/#!forum/fltkgeneral
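
For readers who need the re-encoding step mentioned in the citation, here is a minimal sketch of converting UTF-8 text (such as the result of Fl_Input::value()) to ISO-8859-1. It is standalone C++ with a hypothetical function name, not the FLTK implementation; FLTK 1.3 also declares fl_utf8toa() in FL/fl_utf8.h for this kind of conversion, and its documentation should be checked for the exact behavior. The sketch assumes plain ISO-8859-1; Windows CP1252 differs in the range 0x80-0x9F.

```cpp
#include <cstddef>
#include <string>

// Convert UTF-8 text to ISO-8859-1 (Latin-1).
// Code points above U+00FF and malformed bytes become `fallback`.
// Hypothetical helper for illustration; not part of FLTK.
std::string utf8_to_latin1(const std::string& in, char fallback = '?') {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            // ASCII is identical in both encodings.
            out += static_cast<char>(c);
            ++i;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            // Two-byte sequence that encodes U+0080..U+00FF.
            unsigned char c2 = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char>(((c & 0x03) << 6) | (c2 & 0x3F));
            i += 2;
        } else {
            // Not representable in Latin-1 (or malformed): emit the fallback
            // once and skip any continuation bytes that follow.
            out += fallback;
            ++i;
            while (i < in.size() &&
                   (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
                ++i;
            }
        }
    }
    return out;
}
```

A search routine working on locale-encoded data could then compare against utf8_to_latin1(myInput->value()) instead of the raw widget value.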
 
 
#4 AlainBandon
10:48 Feb 23, 2015
I understand the logic. I am supposed to read the value as utf8, and either re-encode it as utf-16 in wchar or convert it back to chars using the locale.

But this also means that there is a problem with the copy-paste function inside the input: I can copy an 'é' encoded in ascii (0xe9) from the data or from any external source and paste it into the input. In this case, instead of converting the 'é' into utf-8, the input keeps it as is. Correct me if I'm wrong, but this 'é' should be encoded to utf-8 if I follow your logic.
 
 
#5 AlainBandon
11:07 Feb 23, 2015
It's really strange actually... The exact repro of the bug is the following:
I have a display_text filled with ascii locale chars (coming from user data). As I explained before, the display is bugged when dealing with the 'é' char (\n suppressed and cursor bugged), and I understand that I am at fault for not encoding the string I want to display in utf-8.

But here comes the tricky part:
I copy the word containing my 'é' and paste it into my search input, and the "bad" ascii 'é' is preserved.
But if I paste it into any other software, like notepad or firefox's search bar, then copy it again from there and finally paste it into my fltk search box, the bad ascii 'é' is correctly converted into utf-8.

So maybe the copy-to-clipboard function is simply not used correctly in fltk and assumes that everything copied from fltk is utf8 encoded (which should be the case, but is not in my case). I know that nearly all apps using the clipboard for text (in windows) encode the text as wide chars.
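
A minimal sketch of the missing step described in this comment, namely encoding the locale string as UTF-8 before handing it to the widget. It is standalone C++ with a hypothetical function name and assumes the user data is ISO-8859-1 (CP1252 differs in the range 0x80-0x9F). FLTK 1.3 also declares fl_utf8froma() in FL/fl_utf8.h for this direction; its documentation should be checked for the exact behavior.

```cpp
#include <string>

// Convert ISO-8859-1 (Latin-1) text to UTF-8.
// Every Latin-1 byte maps to the Unicode code point of the same value,
// so bytes >= 0x80 always become a two-byte UTF-8 sequence.
// Hypothetical helper for illustration; not part of FLTK.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (char ch : in) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            out += ch;                                   // ASCII stays as is
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));   // lead byte 0xC2/0xC3
            out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
        }
    }
    return out;
}
```

The converted string, for example latin1_to_utf8(userData).c_str() with userData standing in for the application's own data, would be what gets passed to the text display's buffer or to the input's value().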
 
 
#6 AlbrechtS
11:19 Feb 23, 2015
If I was not clear: your input is wrong, thus any behavior dependent on this wrong input is undefined. Unfortunately FLTK's default behavior is defined to tolerate the wrong input and make it _look_ right, but it isn't. This is a compromise to be as kind as possible and not to modify your data.

That said, copy&paste and drag&drop work correctly if you use a correct source and destination. The data is converted accordingly during the d&d or c&p operation. I suggest testing this with a browser and FLTK's text editor (test/editor) or another working editor. I recommend notepad++ where you can set/show/convert the file encoding and do any drag&drop operations with FLTK widgets. The input and output will be converted.

Example: open a file encoded in your locale (I assume Windows CP1252 or ISO-8859-1 or similar) with notepad++. Use the editor's menu "encoding" to display the encoding and/or convert it. Even if the editor displays "Ansi" (aka CP1252) you can drag'n'drop text from that source into FLTK's editor or your search input. It _will_ be converted to UTF-8 on the fly.

Even if you open the file with FLTK's test/editor it will be converted to UTF-8 and you will see a warning popup.

HTH
 
 
#7 AlbrechtS
11:29 Feb 23, 2015
Okay, my last reply (#6) was after reading your post #4.

Regarding #5: My #6 may explain some parts of your observations. Wrong input leads to undefined (faulty) behavior. FLTK tries to display your data correctly, but OTOH it assumes that the data is in UTF-8 encoding when you copy it. This may sound weird, but that's the case.

Clipboard encoding is generally flexible, but I don't know the exact details. You can rely on the correct reading and writing however, if the application has the correct data. It will be converted on the fly. Try it yourself, but please don't do it with your wrong data in FLTK. This doesn't work.

The same is true for drag&drop.
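
To detect the "wrong data" case before it reaches a widget, an application can check whether a byte string is well-formed UTF-8. The sketch below is standalone C++ with a hypothetical function name; it is not how FLTK decides what to display. FLTK 1.3 declares fl_utf8test() in FL/fl_utf8.h for a similar purpose, and its documentation should be checked for the exact return values.

```cpp
#include <cstddef>
#include <string>

// Return true if `s` is well-formed UTF-8: correct sequence lengths,
// proper continuation bytes, no overlong forms, no surrogates, and no
// code points above U+10FFFF.
// Hypothetical helper for illustration; not part of FLTK.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }        // ASCII

        std::size_t len = 0;
        unsigned cp = 0;      // decoded code point
        unsigned min_cp = 0;  // smallest code point allowed for this length
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; min_cp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min_cp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min_cp = 0x10000; }
        else return false;                      // stray continuation or bad lead

        if (i + len > n) return false;          // sequence is truncated
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;
        }
        i += len;
    }
    return true;
}
```

A locale-encoded "traité" (ending in byte 0xE9) fails this check, while the UTF-8 form (ending in 0xC3 0xA9) passes, so the check can guard a conversion step before calling value() or text().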
 
 
#8 AlbrechtS
11:35 Feb 23, 2015
Well, here is another experiment you can try (but I didn't test it myself).

Use notepad++ as I suggested before. Open a text file in your locale encoding. notepad++ will show the correct encoding ("Ansi").

In the "encoding" menu you have two choices:

 (1) check another encoding, e.g. UTF-8.
 (2) use "convert to UTF-8".

Try both. If you use (1) and do copy and paste, you will probably see similar results as in FLTK with your text, because notepad++ is in error about the encoding.

If you use (2) everything should work flawlessly.
 
 
#9 AlainBandon
12:03 Feb 23, 2015
I ran all your tests and everything works correctly: generally speaking, it is impossible to reproduce my bug using any external text, whatever encoding I use.

So I did the opposite, to understand what exactly is broken in the clipboard when I copy a bad (broken) string from the display_text and paste it into the input.

Here is what I tried:
- copy the broken text from the display_text and paste it into the input -> stays broken
- copy the broken text from the display_text and paste it into HxD as text -> é as 0xE9 (ANSI)
- copy the pasted text from HxD and paste it into the input -> no longer broken

And now I'm completely lost...
 
 
#10 AlainBandon
12:43 Feb 23, 2015
I installed a program called insideclipboard that lets you inspect all parts of the clipboard.

I tested copying the string "traité" from the ascii-encoded display_text or from the input into the clipboard, then pasting it anywhere else and copying it to the clipboard again, and there is absolutely no difference in the clipboard content.

So I have absolutely no idea why it gets fixed when pasted again... this is just magic X_X
 
 
#11 AlbrechtS
12:56 Feb 23, 2015
General support is not available via the STR form. Please post to the FLTK forums and/or mailing lists for general support.

Short form: garbage in, garbage out.

Long form: sometimes software is "guessing". Essentially that's what FLTK does as well when it is displaying your "Ansi" encoded text as you expect, which is _technically_ wrong (because your text is not UTF-8 encoded).

I'm sorry, I'd like to help you more, but this is really not the place to do so. Please post further questions, as I said before, in our user forum (fltk.general).

https://groups.google.com/forum/#!forum/fltkgeneral
 
     


 
 

Comments are owned by the poster. All other content is copyright 1998-2024 by Bill Spitzak and others. This project is hosted by The FLTK Team. Please report site problems to 'erco@seriss.com'.