C · R · C · L 
      Center for Research in Computational Linguistics, Bangkok
 

Keyster Examples

  About Keyster

  From Assistive Technology to Assisting Technology

Keyster is implemented in a combination of Perl and JavaScript, and runs on a vanilla Netscape 7.1 / Apache system (but, oddly, not on IE or IIS). All utilities are based on open-source tools, except for the (freely available) Alternatiff TIFF-display plug-in.

    Part of a typical entry screen is shown below.
 -  black: prior image lines. This sample is from a Cambodian etymological dictionary (Jenner & Pou 1980, Lexicon of Khmer Morphology).
 -  white: current image line. This text is far too complex for correct optical character recognition.
 -  beige: 'live' formatted reflection of what has just been typed.
 -  white: keyboard input, including tags - <csx>, <ipa> and so on - that allow precise font matching.
 -  orange: buttons (labeled with CTRL-key shortcuts) custom-built on the fly from the job's stylesheet or DTD.

   

Buttons     CSS definitions (like ipa {font-size:0.9em;_lang:ipa}) are part of every Keyster job's stylesheet. The definition is parsed whenever a page is opened, so in effect every job gets its own custom version of Keyster.

    We define a few local attributes as well. For example, the _lang attribute (commonly associated with HTML tags) helps drive real-time dictionary lookup, based on aspell. 'ipa' is not a language, of course, so we can either build a custom aspell dictionary, or skip spell-checks on these items.

Finding characters     Note the non-Roman characters, such as those used in the CSX system of Indic transliteration, and for IPA phonetic transcription. These constructed characters often appear haphazardly in older texts, but because they may be scattered widely across the Unicode character set they can be a bitch to find.

    Keyster has mechanisms for predefining groups of shortcuts, like these subsets of IPA and CSX used in [Jenner80], and invoked simply as \a, \b, etc. (or ALT-\a, ALT-\b).

   

    We also write utilities that scan the Unicode definition looking for intuitive 'letter-plus' combinations. Keyster builds hotkey panels on the fly; letters can be visually matched by operators without special training (and without having to grind out SGML entities or &#hex; codes). Below, bars separate letters found on different Unicode code pages; note that this reference font is missing some glyphs:
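
    The scan described above can be approximated in a few lines. This is a sketch in Python using the standard unicodedata module, not the Keyster utility itself. It assumes that a 'letter-plus' combination is any character whose Unicode name begins with "LATIN SMALL LETTER X WITH", and that a "code page" means a run of 256 code points; both are simplifications chosen for the example, and the function names are hypothetical.

```python
import unicodedata
from itertools import groupby

def letter_plus(base: str, limit: int = 0x3000) -> list[str]:
    """Characters named 'LATIN SMALL LETTER <BASE> WITH ...', in code point order."""
    prefix = f"LATIN SMALL LETTER {base.upper()} WITH "
    found = []
    for cp in range(limit):
        # Unnamed code points (controls, unassigned) yield the default "".
        name = unicodedata.name(chr(cp), "")
        if name.startswith(prefix):
            found.append(chr(cp))
    return found

def hotkey_panel(base: str) -> str:
    """Join the variants of one letter, with a bar between 256-character pages."""
    pages = groupby(letter_plus(base), key=lambda ch: ord(ch) >> 8)
    return " | ".join("".join(chars) for _, chars in pages)

for letter in "an":
    print(letter, ":", hotkey_panel(letter))
```

    Matching on the character name keeps the scan intuitive for operators: every result looks like the base letter plus a mark or hook. Characters such as æ or ɐ, whose names do not follow the "WITH" pattern, are deliberately left out of this simple version.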

   

    Keyster also supports multiple entry, including both key-verify (a popup alerts the typist to every variation from an existing text) and n-key compare, in which multiple, completely independent passes are compared to each other.
    Below, we have a typical example of compare in action. The top line represents a 'current' input line, arbitrarily chosen from text samples that should be identical.  Notice that the final word on this line is incorrect.  The yellow lines hold the reference text.  Both samples mark spaces as blue-backed backslashes for easy reading of multiple blanks.