Subject: I18N, L10N, msgcat, bindings and bindings (long) - DN [1]
lduperval@sprint.ca - 16 Jul 2000 - comp.lang.tcl
Hi all,
As promised, here is a summary of various exchanges I've had over the past few
weeks while trying to localise the Tk dialogs. The localisation went fairly
well. Translating the strings was the easy part. Once the translations were
complete, the real challenges started showing up. These
challenges revolve around key bindings and single-character underlining in
button-type widgets. These underlined characters usually have a binding
associated with them.
When the dialogs are translated, the underlined characters and the key
bindings no longer make any sense. Underlines start appearing anywhere, even
under characters that aren't letters. The key bindings no longer work either,
for much the same reasons. The problem gets exacerbated when the target
language doesn't use a Western-like alphabet.
Many discussions have taken place privately or on comp.lang.tcl. They have
yielded a number of options and suggestions. For everyone's benefit, I'll try
to gather all the information in this message. I invite anyone who wants to
participate to do so, but please try to keep the discussions on comp.lang.tcl
so that everyone can benefit.
I'll make a list of the suggestions presented so far and list pros and cons
as I see them. From all this, I'm hoping that a concerted effort will allow us
to solve the problem by sacrificing as little functionality and flexibility as
possible, while maintaining backward compatibility so that scripts don't break
because of the new scheme used. If the best solution requires breaking a
lot of scripts, it'll have to be postponed until 9.0 (TclMillenium or whatever
the code name will be) comes out.
So here are the various possibilities I've collected. Bear in mind that most
of these solutions share a similar defect: they don't say how to associate a
key binding with the letter that is underlined.
1) Use a special character (& has been mentioned most often) to mark the
underlined character. Modify the behaviour of -underline so that it will
accept '&' as a correct value. It will then underline the character that
appears after the '&'. If the value provided to -underline is a number, the
behaviour will be the same as it currently is in Tk.
Pro: If one doesn't use the '&' value, this solution is 100% backward
compatible.
Pro: It is an approach which is familiar to programmers on Windows (except
that you don't need to use "-underline &").
Pro: In future releases of Tk, the default value of -underline can be set to
'&' instead of -1, which will mean less typing.
Pro: The same logic should be useable across various languages since & isn't a
real letter.
Pro: Behind the scenes, the Tk display routine doesn't need to be modified. It
currently uses an integer to determine which character to underline.
Converting the position of the & character to an integer will allow the same
logic to be used.
Con: The message catalog may become more complex to maintain. Among other
things, situations may come up where the same text is used in many locations
but with a different (or no) mnemonic. You could also have a situation where
the same text appears in two different locations with the same mnemonic in
English but a different one in another language (something like "E&xit" which
must be "&Quitter" in one dialog but "Qui&tter" in another when translated to
French).
Con: Will require more memory because there will most likely be a need to keep
dual representation of the string in order to keep track of the position of
the &.
Con: The core needs to be changed to allow escaping the & character to make
sure that it can still be used in a string along with "-underline &".
Con: Doesn't address the key bindings issue.
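
To make option (1) concrete, here is a minimal sketch in plain Tcl of how a
label containing '&' could be turned into the text to display plus the integer
that -underline already understands. The proc name parseMnemonic and the
choice of "&&" as the escape for a literal ampersand are assumptions made for
illustration; nothing like this is part of Tk. It assumes Tcl 8.4 or later
for the eq/ne operators.

```tcl
# Hypothetical helper, not part of Tk: split a label such as "E&xit"
# into the plain text and the index to hand to -underline.
# Assumption: "&&" stands for one literal ampersand.
proc parseMnemonic {label} {
    set text ""
    set underline -1
    set len [string length $label]
    for {set i 0} {$i < $len} {incr i} {
        set ch [string index $label $i]
        if {$ch eq "&"} {
            set next [string index $label [expr {$i + 1}]]
            if {$next eq "&"} {
                # Escaped ampersand: keep one and skip the second.
                append text "&"
                incr i
            } elseif {$next ne "" && $underline < 0} {
                # First unescaped marker: the next character is the mnemonic.
                set underline [string length $text]
            }
            # A trailing "&" or a second marker is dropped silently.
            continue
        }
        append text $ch
    }
    return [list $text $underline]
}
```

With this, parseMnemonic "E&xit" yields the text Exit and the index 1, and
parseMnemonic "Qui&tter" yields Quitter and 3. The two results could then be
passed to a widget's -text and -underline options, which is why the display
routine would not need to change.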
2) Same as (1) but allowing any non-numeric character to specify underlining
Most of the pros and cons of (1) can be applied to this solution.
Pro: Programmers who don't like to use '&' can use any other character. This
becomes especially useful when strings contain lots of &'s.
Con: Can become confusing.
Con: Supporting escaped characters becomes much more difficult since escape
sequences will depend on the value of -underline.
Con: Message catalogs become more difficult to maintain.
Con: Doesn't encourage standardised coding. In other words, it becomes too
perlish for my taste: pick a character, any character! You can do it more ways
than one, or whatever the line is.
Con: Doesn't address the key bindings issue.
3) Same as (1) but use a different character
Same pros and cons as (1) except that it isn't as familiar to Windows
programmers.
4) Use a special Unicode character (\u0332) which is the underline character.
In Unicode, this character is placed after the character to underline and the
Unicode support is supposed to take care of it automagically.
Pro: Support for this is built into Unicode. Using this will not affect the
way Tk behaves in any way.
Con: You must use a font that supports this character. Otherwise the display
code must be modified in the Tk core (I think).
Con: Doesn't address the key bindings issue.
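
As a rough illustration of option (4), the sketch below builds a string in
which the combining character \u0332 follows the character to underline. The
proc name underlineWithCombining is made up for this example. Whether the
result is actually drawn with an underline depends on the font and on the
rendering code, as noted above, so this only shows the string manipulation.

```tcl
# Hypothetical helper: mark the character at $index by placing
# U+0332 (COMBINING LOW LINE) immediately after it.
proc underlineWithCombining {text index} {
    if {$index < 0 || $index >= [string length $text]} {
        # Nothing to underline: return the text unchanged.
        return $text
    }
    set ch [string index $text $index]
    return [string replace $text $index $index "$ch\u0332"]
}
```

Note that the resulting string is one character longer than the original, so
any code that computes widths or indices from the label has to be aware of
the extra combining character.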
5) Don't modify the core, do it all in Tcl
Pro: The core isn't modified. That should allow for greater flexibility if the
functionality needs to be modified.
Pro: You can implement almost any of the solutions outlined above using Tcl,
provided you don't use a solution that requires modifying low-level display
code.
Pro: Some code already exists (BWidget does it, Christopher Nelson has posted
some code, Jan Nijtmans has done it) so it should be possible to leverage that
code and add it to the Tk library.
Con: Well, you can code the working portion in Tcl but depending on the
approach you take, you will probably face the same types of problems as
mentioned for that approach. However, it should be easier to fix the problem
at the Tcl level than at the C level.
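
Doing it in Tcl also gives a place to tackle the key bindings issue that the
other options leave open. The sketch below derives a bind sequence from a
label and its underline index. The proc name mnemonicSequence is hypothetical,
and the restriction to ASCII letters and digits is a deliberate simplification:
it sidesteps, rather than solves, the problem of non-Western alphabets. The
Tk part is left as a comment so the snippet runs under plain tclsh.

```tcl
# Hypothetical helper: return a Tk event sequence such as <Alt-Key-x>
# for the underlined character, or "" when no usable mnemonic exists.
proc mnemonicSequence {text underline} {
    if {$underline < 0} {
        return ""
    }
    set ch [string tolower [string index $text $underline]]
    # Assumption: only ASCII letters and digits map directly to keysyms.
    if {![string match {[a-z0-9]} $ch]} {
        return ""
    }
    return "<Alt-Key-$ch>"
}

# Possible use under Tk (assumes a button named .exit exists):
#   set seq [mnemonicSequence "Exit" 1]
#   if {$seq ne ""} { bind . $seq [list .exit invoke] }
```

For "Exit" with index 1 this gives <Alt-Key-x>. For a label whose underlined
character is not an ASCII letter or digit it gives an empty string, which is
exactly the case where a translated dialog ends up with an underline but no
working binding.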
6) Put all the information in the message catalog
Pro: All done at the Tcl level. Can be augmented more easily to support more
functionality.
Pro: All the information for mnemonics and their bindings is concentrated in
a single location.
Con: Makes the message catalog more complex.
Con: Tends to give a Tk bias to the message catalog, while it can be used in
Tcl as well as Tk. What's more, the message catalogs are distributed with Tcl,
not Tk. It seems a bit illogical to me for the message catalog to support
functionality that only works if Tk is also present.
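
One way option (6) could look is to store the mnemonic marker in the catalog
entries themselves, with a separate key per dialog so that the same English
text can carry different mnemonics once translated. The sketch below uses the
standard msgcat commands mcset, mclocale and mc; the key names fileMenu.exit
and confirmDialog.exit are invented for this example, and the '&' convention
is the one from option (1), not something msgcat knows about.

```tcl
package require msgcat

# Hypothetical keys: one per dialog, so each can have its own mnemonic.
::msgcat::mcset en "fileMenu.exit"      "E&xit"
::msgcat::mcset en "confirmDialog.exit" "E&xit"
::msgcat::mcset fr "fileMenu.exit"      "&Quitter"
::msgcat::mcset fr "confirmDialog.exit" "Qui&tter"

# Select French and look both labels up.
::msgcat::mclocale fr
set menuLabel   [::msgcat::mc "fileMenu.exit"]       ;# &Quitter
set dialogLabel [::msgcat::mc "confirmDialog.exit"]  ;# Qui&tter
```

The catalog stays usable without Tk, since it only holds strings, but the
'&' is meaningless to a Tcl-only program, which is the Tk bias mentioned
above.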
7) At the core level add a new option which accepts the character to underline
Pro: Maintains backward compatibility
Pro: Doesn't require dual representation of text
Con: Another option to add to the button widget
Con: Doesn't address the key bindings issue.
-------------------------
Whew! And I haven't even touched on everything. I've concentrated more on the
mnemonics than on the message catalogs and the associated bindings.
Seems to me that coding this in Tcl will be the most beneficial to everyone.
It also looks like the Tcl implementations have some sort of consensus
(unconcerted I'm sure but most likely having a similar origin around
north-western Washington state somewhere) where the & character is used for
marking mnemonics. The implementations do cause a backward compatibility
problem because any use of the & character in a string will need to be
escaped.
It also looks like the best policy should be to include as much information as
possible in the message catalogs. This will require either enhancing the
msgcat package to support new functionality or embedding the new functionality
somewhere in the Tk library and having it work only with Tk's message catalog.
If there are some glaring omissions, let me know. I've tried to take almost
everything into account but I could have missed some stuff. Note also that I
expect to have a hectic week at work this week and I'm on vacation the two
weeks following that. I will probably be less available to work on this in the
immediate future so if someone wants to jump in and do so, feel free. Just
make sure you let other people know so you don't do any duplicate work.
L
Last modified
2000-07-20