Subject: I18N, L10N, msgcat, bindings and bindings (long) - DN [1]


lduperval@sprint.ca - 16 Jul 2000 - comp.lang.tcl

 Hi all,

 As promised, here is a summary of various exchanges I've had over the past few
 weeks while trying to localise the Tk dialogs. The localisation went fairly
 well. Translating the strings was the easy part. After the translations were
 complete, that's when the real challenges started showing up.  These
 challenges revolve around key bindings and single-character underlining in
 button-type widgets. These underlined characters usually have a binding
 associated with them.

 When the dialogs are translated, the underlined characters and the key
 bindings no longer make any sense. Underlines start appearing anywhere, even
 under characters that aren't letters. The key bindings no longer work either,
 for much the same reasons. The problem gets exacerbated when the target
 language doesn't use a Western-like alphabet.

 Many discussions have taken place privately or on comp.lang.tcl. These
 discussions have yielded a number of options and suggestions. For everyone's
 benefit, I'll try to concentrate all the information in this message. I invite
 anyone who wants to participate to do so but to try to keep the discussions on
 comp.lang.tcl so that everyone can benefit.

 I'll make a list of the suggestions presented so far and list pros and cons
 as I see them. From all this, I'm hoping that a concerted effort will allow us
 to solve the problem by sacrificing as little functionality and flexibility as
 possible, while maintaining backward compatibility so that scripts don't break
 because of the new scheme used. If the best solution requires breaking a lot
 of scripts, it'll have to be postponed until 9.0 (TclMillenium or whatever
 the code name will be) comes out.

 So here are the various possibilities I've collected. Bear in mind that most
 of these solutions share a defect: they don't say how to associate a key
 binding with the letter that is underlined.

 1) Use a special character (& has been mentioned most often) to mark the
    underlined character. Modify the behaviour of -underline so that it will
    accept '&' as a correct value. It will then underline the character that
    appears after the '&'. If the value provided to -underline is a number, the
    behaviour will be the same as it currently is in Tk.

 Pro: If one doesn't use the '&' value, this solution is 100% backward
 compatible.

 Pro: It is an approach which is familiar to programmers on Windows (except
 that you don't need to use "-underline &").

 Pro: In future releases of Tk, the default value of -underline can be set to
 '&' instead of -1, which will mean less typing.

 Pro: The same logic should be useable across various languages since & isn't a
 real letter.

 Pro: Behind the scenes, the Tk display routine doesn't need to be modified. It
 currently uses an integer to determine which character to underline.
 Converting the position of the & character to an integer will allow the same
 logic to be used.

 Con: The message catalog may become more complex to maintain. Among other
 things, situations may come up where the same text is used in many locations
 but with a different (or no) mnemonic. You could also have a situation where
 the same text appears in two different locations with the same mnemonic in
 English but a different one in another language (something like "E&xit" which
 must be "&Quitter" in one dialog but "Qui&tter" in another when translated to
 French).

 Con: Will require more memory because there will most likely be a need to keep
 dual representation of the string in order to keep track of the position of
 the &.

 Con: The core needs to be changed to allow escaping the & character to make
 sure that it can still be used in a string along with "-underline &".

 Con: Doesn't address the key bindings issue.
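
 To make option (1) concrete, here is a minimal sketch of how the position of
 the & could be converted into the integer that -underline already accepts. The
 proc name parseMnemonic is hypothetical and not part of Tk, and treating "&&"
 as an escaped ampersand is an assumption borrowed from the Windows convention
 rather than anything decided in these discussions.

```tcl
# Hypothetical helper (not part of Tk): split a label such as "E&xit" into
# the text to display and the index of the character to underline.
# Assumption: "&&" stands for one literal ampersand.
proc parseMnemonic {label} {
    set text ""
    set underline -1
    set len [string length $label]
    for {set i 0} {$i < $len} {incr i} {
        set ch [string index $label $i]
        if {$ch ne "&"} {
            append text $ch
            continue
        }
        set next [string index $label [expr {$i + 1}]]
        if {$next eq "&"} {
            # Escaped ampersand: keep one "&" and skip the second.
            append text "&"
            incr i
        } elseif {$next ne "" && $underline < 0} {
            # First unescaped "&": the character that follows is the mnemonic.
            set underline [string length $text]
        }
        # Any other "&" (trailing, or a second marker) is silently dropped.
    }
    return [list $text $underline]
}

# With Tk loaded, the result could feed the existing options unchanged:
#   lassign [parseMnemonic "E&xit"] text index
#   button .exit -text $text -underline $index
```

 With this sketch, "E&xit" gives the text "Exit" with underline index 1, and a
 label without any & gives index -1, which is the current Tk default.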

 2) Same as (1) but allowing any non-numeric character to specify underlining

 Most of the pros and cons of (1) can be applied to this solution.

 Pro: Programmers who don't like to use '&' can use any other character. This
 becomes especially useful when strings contain lots of &'s.

 Con: Can become confusing.

 Con: Supporting escaped characters becomes much more difficult since escape
 sequences will depend on the value of -underline.

 Con: Message catalogs become more difficult to maintain.

 Con: Doesn't encourage standardised coding. In other words, it becomes too
 perlish for my taste: pick a character, any character! You can do it more ways
 than one, or whatever the line is.

 Con: Doesn't address the key bindings issue.

 3) Same as (1) but use a different character

 Same pros and cons as (1) except that it isn't as familiar to Windows
 programmers.

 4) Use a special Unicode character (\u0332), which is the combining underline
 character. In Unicode, this character is placed after the character to
 underline and the Unicode support is supposed to take care of it automagically.

 Pro: Support for this is built into Unicode. Using this will not affect the
 way Tk behaves in any way.

 Con: You must use a font that supports this character. Otherwise the display
 code must be modified in the Tk core (I think).

 Con: Doesn't address the key bindings issue.
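
 As a rough illustration of option (4), the sketch below inserts \u0332 right
 after the character to be underlined. The proc name underlineWithCombining is
 hypothetical, and whether the result is actually drawn as an underline still
 depends on the font, as noted above.

```tcl
# Hypothetical helper: mark the character at $index by inserting
# U+0332 (COMBINING LOW LINE) immediately after it.
proc underlineWithCombining {text index} {
    if {$index < 0 || $index >= [string length $text]} {
        # Nothing to underline: return the text untouched.
        return $text
    }
    set ch [string index $text $index]
    return [string replace $text $index $index "$ch\u0332"]
}

# Example: underline the "Q" of "Quitter". The widget needs no -underline.
set label [underlineWithCombining "Quitter" 0]
```

 Note that the string grows by one character, so any code that measures or
 indexes the label has to take the extra combining character into account.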

 5) Don't modify the core, do it all in Tcl

 Pro: The core isn't modified. That should allow for greater flexibility if the
 functionality needs to be modified.

 Pro: You can implement most of the solutions outlined above using Tcl,
 provided you don't use a solution that requires modifying low-level display
 code.

 Pro: Some code already exists (BWidget does it, Christopher Nelson has posted
 some code, Jan Nijtmans has done it) so it should be possible to leverage that
 code and add it to the Tk library.

 Con: Well, you can code the working portion in Tcl but depending on the
 approach you take, you will probably face the same types of problems as
 mentioned for that approach. However, it should be easier to fix the problem
 at the Tcl level than at the C level.

 6) Put all the information in the message catalog

 Pro: All done at the Tcl level. Can be augmented more easily to support more
 functionality.

 Pro: All the information for mnemonics and their bindings are concentrated in
 a single location.

 Con: Makes the message catalog more complex.

 Con: Tends to give a Tk bias to the message catalog, while it can be used in
 Tcl as well as Tk. What's more, the message catalogs are distributed with Tcl,
 not Tk. It seems a bit illogical to me that the message catalog should support
 functionality that only works if Tk is also present.

 7) At the core level add a new option which accepts the character to underline

 Pro: Maintains backward compatibility

 Pro: Doesn't require dual representation of text

 Con: Another option to add to the button widget

 Con: Doesn't address the key bindings issue.

 -------------------------

 Whew! And I haven't even touched on everything. I've concentrated more on the
 mnemonics than on the message catalogs and the associated bindings.

 Seems to me that coding this in Tcl will be the most beneficial to everyone.
 It also looks like the Tcl implementations have some sort of consensus
 (unconcerted I'm sure but most likely having a similar origin around
 north-western Washington state somewhere) where the & character is used for
 marking mnemonics. The implementations do cause a backward compatibility
 problem because any use of the & character in a string will need to be
 escaped.

 It also looks like the best policy should be to include as much information as
 possible in the message catalogs. This will require either enhancing the
 msgcat package to support new functionality or embedding the new functionality
 somewhere in the Tk library and having it work only with Tk's message catalog.
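
 As a rough sketch of what keeping the mnemonic in the catalog could look like
 with today's msgcat, the example below stores the & inside each translation
 and derives both the underline index and the key to bind from it. The proc
 mnemonicFor and the per-dialog keys "file.exit" and "confirm.exit" are
 hypothetical conventions chosen for illustration; msgcat itself prescribes
 neither. For brevity, this version does not handle an escaped "&&".

```tcl
package require msgcat

# Illustrative catalog entries. Using one key per dialog lets the same
# English label carry different mnemonics once translated.
msgcat::mcset en "file.exit"    "E&xit"
msgcat::mcset en "confirm.exit" "E&xit"
msgcat::mcset fr "file.exit"    "&Quitter"
msgcat::mcset fr "confirm.exit" "Qui&tter"

# Hypothetical helper: look up $key in the catalog and return a list of
# {display text, underline index, key to bind}.
proc mnemonicFor {key} {
    set label [msgcat::mc $key]
    set pos [string first "&" $label]
    if {$pos < 0 || $pos == [string length $label] - 1} {
        # No marker, or a trailing "&": no mnemonic.
        return [list $label -1 ""]
    }
    set char [string index $label [expr {$pos + 1}]]
    set text [string replace $label $pos $pos]
    return [list $text $pos [string tolower $char]]
}

msgcat::mclocale fr

# With Tk loaded, the three values could be used like this:
#   lassign [mnemonicFor file.exit] text index key
#   button .exit -text $text -underline $index -command exit
#   bind . <Alt-$key> {.exit invoke}
```

 Deriving the binding from the translated string keeps the underline and the
 key in step, but it only helps when the mnemonic is a character that can be
 typed on the user's keyboard.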

 If there are some glaring omissions, let me know. I've tried to take most
 things into account but I could have missed some stuff. Note also that I
 expect to have a hectic week at work this week and I'm on vacation the two
 weeks following that. I will probably be less available to work on this in the
 immediate future so if someone wants to jump in and do so, feel free. Just
 make sure you let other people know so you don't do any duplicate work.

 L

Last modified
2000-07-20