STR #2735
Application: | FLTK Library |
Status: | 2 - Closed w/o Resolution |
Priority: | 3 - Moderate, e.g. unable to compile the software |
Scope: | 3 - Applies to all machines and operating systems |
Subsystem: | Unicode support |
Summary: | fl_utf_toupper() and Eszett |
Version: | 1.3-current |
Created By: | corvid |
Assigned To: | AlbrechtS |
Fix Version: | Will Not Fix |
Trouble Report Files:
Trouble Report Comments:
|
#1 | corvid 05:40 Oct 16, 2011 |
| Here's a patch just gluing it in as a special case... | |
|
#2 | AlbrechtS 02:33 Oct 17, 2011 |
| Well, as a German, I can say that "Eszett" (German sharp s = 'ß') is usually capitalized as 'SS', but that's not always done (often it is left as-is, since there is no upper-case 'ß').
That said, is there a standard (ISO, POSIX, or anything else) that documents that this is the *correct* behavior?
Notes: (a) at least this doesn't seem to change the number of bytes of the (UTF-8) string, but (b) it is not invertible. | |
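The two notes in comment #2 can be checked with a small sketch. The helpers below are hypothetical: they are not FLTK functions, they handle only ASCII letters plus U+00DF, and they assume the patch's special case rewrites the UTF-8 bytes C3 9F as "SS".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not FLTK API): upper-case a UTF-8 string in place.
   Only ASCII letters and U+00DF (bytes C3 9F) are handled. */
static void demo_toupper_inplace(char *s) {
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == 0xC3 && (unsigned char)s[i + 1] == 0x9F) {
            /* sharp s: two bytes in, two bytes ("SS") out */
            s[i] = 'S';
            s[i + 1] = 'S';
            i++;
        } else if (c >= 'a' && c <= 'z') {
            s[i] = (char)(c - 'a' + 'A');
        }
    }
}

/* Hypothetical helper: ASCII-only lower-casing, to show the round trip. */
static void demo_tolower_inplace(char *s) {
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c >= 'A' && c <= 'Z') {
            s[i] = (char)(c - 'A' + 'a');
        }
    }
}
```

Applied to "straße" (7 bytes), the first helper yields "STRASSE" (still 7 bytes), which covers note (a). Lower-casing that result gives "strasse", not "straße", which is note (b): the "SS" carries no record that it came from a sharp s.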
|
#3 | corvid 04:25 Oct 17, 2011 |
| I've been looking into text-transform in CSS, and Firefox shows "SS" when using text-transform values of "uppercase" and "capitalize". For the Microsoft browser, it's mentioned, evidently as a bug (http://msdn.microsoft.com/en-us/library/ff405742(v=vs.85).aspx).
Not being a German speaker, I've only been going by what I've read, which gives me the impression that it generally becomes "SS". And an upper case character that is used occasionally on signs got a Unicode code point not too long ago.
*digs that up* http://unicode.org/versions/Unicode5.1.0/#Tailored_Casing_Operations
'In particular, capital sharp s is intended for typographical representations of signage and uppercase titles, and other environments where users require the sharp s to be preserved in uppercase. Overall, such usage is rare. In contrast, standard German orthography uses the string "SS" as uppercase mapping for small sharp s. Thus, with the default Unicode casing operations, capital sharp s will lowercase to small sharp s, but not the reverse: small sharp s uppercases to "SS". In those instances where the reverse casing operation is needed, a tailored operation would be required.' | |
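The asymmetry described in that quote can be sketched on code points. The function names below are made up for illustration and cover only ASCII letters and the two sharp s code points; this is a reading of the quoted default behaviour, not a full Unicode casing implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: full upper-case mapping of one code point.
   Writes 1 or 2 code points to out[] and returns how many. */
static size_t demo_full_upper(unsigned int cp, unsigned int out[2]) {
    assert(out != NULL);
    if (cp == 0x00DF) {            /* small sharp s upper-cases to "SS" */
        out[0] = 'S';
        out[1] = 'S';
        return 2;
    }
    if (cp >= 'a' && cp <= 'z') {
        out[0] = cp - 'a' + 'A';
        return 1;
    }
    out[0] = cp;                   /* everything else is left unchanged */
    return 1;
}

/* Hypothetical: lower-case mapping of one code point. */
static unsigned int demo_simple_lower(unsigned int cp) {
    if (cp == 0x1E9E) {            /* capital sharp s lower-cases to U+00DF */
        return 0x00DF;
    }
    if (cp >= 'A' && cp <= 'Z') {
        return cp - 'A' + 'a';
    }
    return cp;
}
```

Lower-casing U+1E9E gives U+00DF, but upper-casing U+00DF gives two code points and never U+1E9E. Getting U+1E9E back would need the tailored operation the quote mentions.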
|
#4 | corvid 05:17 Oct 17, 2011 |
| I discovered Unicode's page with the, well, special cases:
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt | |
|
#5 | torsten.giebl 16:03 Oct 17, 2011 |
| As another German coder, I found it very interesting that a ToUpper function changes an Eszett to SS; I would never have guessed it would do that.
For me the Eszett symbol was always like the euro sign: the same symbol for upper and lower case.
Personally I would not change it; it WILL confuse Germans. Also, is there any bad impact for other nations if a big Eszett = small Eszett? | |
|
#6 | ianmacarthur 03:15 Dec 21, 2011 |
| Responding to Torsten's question, I don't think other languages are likely to be affected? The sharp-s thing is more of a Germanic languages thing, and I don't think it occurs at all in (for example) the Romance languages? It doesn't even occur in English, which is (in large part) like a Germanic language...
I do find it odd that native German speakers who responded seem to think that the Unicode recommendations are out of sync with real world usage, though! (Or maybe I am not so surprised...) | |
|
#7 | chris 21:43 May 09, 2012 |
| From (German) Wikipedia: http://de.wikipedia.org/wiki/Gro%C3%9Fes_%C3%9F
The uppercase sharp s is defined as: U+1E9E LATIN CAPITAL LETTER SHARP S
Most newer Linux systems support it with the standard German keyboard:
It appears when caps lock is activated and you type 'ß' or with Shift+AltGr+ß.
Tried it in Ubuntu 11.04 (in console and office program) and it works. | |
|
#8 | michaelbaeuerle 05:01 Aug 28, 2013 |
| In Germany there is an orthography reform every few years. Many people no longer know what is currently "correct". But it looks like they still haven't wasted enough resources (like printing new school books) and some people still don't know what to do with their time ... and therefore they invented the capital sharp s.
As a German I can say: This is really annoying. Regardless of what you implement today, it will likely be wrong again tomorrow :-( | |
|
#9 | ianmacarthur 16:02 Mar 11, 2014 |
| It's been a while since this came up.
Is there a view on the Best Way Forward?
I'm inclined towards the Do Nothing option...
We can close? Or...? | |
|
#10 | ianmacarthur 16:36 Sep 04, 2014 |
| I'm inclined towards closing this one?
Anybody have a view?
Neither corvid nor I know what is right, and the native German speakers who responded all seem to think the Unicode recommendations are "unhelpful"...
We leave it as is and close as "won't fix"? | |
|
#11 | AlbrechtS 09:16 Sep 02, 2016 |
| I agree with Ian that we should close this STR (will not fix).
The given patch would lead to inconsistencies since we also have, for instance:
/** Returns the Unicode upper case value of \p ucs. */
int fl_toupper(unsigned int ucs) {
  return Toupper(ucs);
}
Note that this returns exactly _one_ upper case Unicode character value for the input character code (ucs). There is also fl_tolower() with the opposite conversion.
If we changed the buffer (string) conversion function fl_utf_toupper() as proposed, then we'd have different results if we use the string conversion as opposed to converting character by character with fl_toupper() or fl_tolower().
The only way to solve this inconsistency would be to convert "lowercase sharp s" to "uppercase sharp s" as discussed, but this is IMHO even more non-standard (at least at this time, since in standard German typography there is no "uppercase sharp s").
Conclusion: leave as is, close STR ("Will Not Fix").
Final note: I'll close this STR as said above if we don't get serious objections in a week or so. Anybody? | |
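The inconsistency argued in comment #11 can be sketched as follows. The `demo_*` functions are hypothetical stand-ins: `demo_toupper_cp()` only mimics the one-in, one-out shape of `fl_toupper()` and assumes U+00DF is returned unchanged, while `demo_upper_string()` models a string conversion with the proposed "SS" special case. Neither reproduces FLTK's actual tables.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a one-code-point API: exactly one value in, one value out.
   Assumption: U+00DF has no single-code-point upper case and is returned as-is. */
static unsigned int demo_toupper_cp(unsigned int ucs) {
    if (ucs >= 'a' && ucs <= 'z') {
        return ucs - 'a' + 'A';
    }
    return ucs;
}

/* Converting character by character: output length equals input length. */
static size_t demo_upper_each(const unsigned int *in, size_t n,
                              unsigned int *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = demo_toupper_cp(in[i]);
    }
    return n;
}

/* String conversion with the proposed special case.
   The caller must provide room for up to 2 * n code points in out. */
static size_t demo_upper_string(const unsigned int *in, size_t n,
                                unsigned int *out) {
    assert(in != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x00DF) {
            out[k++] = 'S';
            out[k++] = 'S';
        } else {
            out[k++] = demo_toupper_cp(in[i]);
        }
    }
    return k;
}
```

For the six code points of "straße", the per-character loop returns six code points with U+00DF preserved, while the string version returns seven with "SS" in its place. Callers mixing the two paths would see different results for the same text, which is the objection raised above.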
|
#12 | AlbrechtS 01:18 Sep 11, 2016 |
| Closed w/o Resolution. | |