User Defaults Miscellaneous

Setup: Language Info

  1. User Defaults:

    How to set startup defaults.

  2. Language Info:

    How to add a language.

  3. Miscellaneous:

    Other information on setup.


Contents

Language-Specific Files
The Language Menu


Language-Specific Files

There are two kinds of language-specific files: (1) source files supplied by the grammar writer, and (2) those that are machine-generated:

Language source files
  lexiconLang.pl       Lexical entries and morphology
  parametersLang.pl    Parameters for X-bar theory and other principles
  peripheryLang.pl     Language-particular rules

GLR machine-generated files
  transitionLang.pl    State transition table
  actionLang.pl        Shift/reduce action table
  igoalsLang.pl        Goals for principle-interleaving and head movement
  commentsLang.pl      Comments for GLR debugger
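
As a rough illustration of the naming scheme listed above, the following Python sketch builds the seven file names for one language. It assumes that Lang in the listing is a placeholder for the language name (for example Eng). The function name language_files and the two constant names are hypothetical and are not part of PAPPI.

```python
# Illustrative sketch only; not part of PAPPI.
# The file-name stems are taken from the listing above. The helper name
# language_files is an assumption made for this example.
SOURCE_STEMS = ("lexicon", "parameters", "periphery")
GLR_STEMS = ("transition", "action", "igoals", "comments")


def language_files(language):
    """Return the source and GLR file names for a language suffix."""
    return {
        "source": [f"{stem}{language}.pl" for stem in SOURCE_STEMS],
        "glr": [f"{stem}{language}.pl" for stem in GLR_STEMS],
    }


files = language_files("Eng")
print(files["source"])  # grammar-writer files
print(files["glr"])     # machine-generated files
```

Keeping the two groups separate mirrors the distinction drawn above: only the first group is written by hand, while the second is regenerated from it.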

In general, loading a language means loading the language-particular elements of the theory, namely the parameters, lexicon and periphery files. The parser also uses a set of GLR machine files, which specify a bottom-up, stack-machine-based parser built specifically for the language. When the user selects an entry from the Language menu, PAPPI loads both sets of files. (The details of this menu are covered in the next section.)

The grammar associated with the GLR machine is determined by the language-independent X-bar rule system in conjunction with the word-order specifications given in the parameters file. The lexicon also plays a part in shaping the GLR machine: the range of possible subcategorization frames licensed for a given language is determined by its lexical entries. For more details on the organization of the grammar rule system, see the Phrase Structure Rules chapter. The GLR machine files are generated by selecting the GLR Machine Build command under the Theory menu.


The Language Menu

Once the files for a given language have been constructed, the language should be added to the Language menu (shown earlier). The contents of this menu are stored in the file init/Nlanguage_info, which contains one line per menu entry, that is, one line per language. The format of each line is as follows:
<language> <flag icon path> <large flag icon path> <16pt 2 byte font> \
   <24pt 2 byte font> <encoding>
The elements are explained below:

<language> The name of the language, e.g. Eng.
The characters of the name will be used as a suffix to construct the relevant file names during language loading, e.g. lexiconEng.pl and transitionEng.pl.

<flag icon path> The pathname of the flag icon that is displayed for the language.

By convention, the flag is stored in the bitmap/ subdirectory. For example, the English implementation uses the pathname bitmap/flag_usa.xpm.

The file is encoded using the popular XPM format:

/* XPM */
static char * usa [] = {
"60 39 4 1",
" 	m black	c #2D2D5B5BE5E5",
".	m black	c #E5E52D2D2D2D",
"Y	m white	c #E5E52D2D2D2D",
"X	m white	c #FFFFFFFFFFFF",
"                          .Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y",
"                          Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.",
"  XX  XX  XX  XX  XX  XX  .Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y.Y",
"                          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"    XX  XX  XX  XX  XX    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"                          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
Note: the dimensions of the flag should be 60 x 39 pixels.

<large flag icon path> The pathname of an alternate flag icon, used when PAPPI is started with the option -scale extra_large.
For example, bitmap/flag_xl_usa.xpm.

Note: the dimensions of the flag should be 120 x 78 pixels.
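
The two size requirements can be checked mechanically. The Python sketch below reads the width and height from the values string of an XPM file (the first quoted string, as in "60 39 4 1" above) and compares them with the expected size. The function names and the scale label "normal" are assumptions made for this example; only extra_large is an option named in this manual.

```python
import re

# Illustrative sketch only; not part of PAPPI.
# Expected flag sizes in pixels. "normal" is a label invented for this
# example; "extra_large" matches the -scale option described above.
EXPECTED_SIZE = {"normal": (60, 39), "extra_large": (120, 78)}


def xpm_dimensions(xpm_text):
    """Return (width, height) from the XPM values string.

    The values string is the first quoted string in the file and holds
    width, height, number of colors and characters per pixel.
    """
    match = re.search(r'"\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)', xpm_text)
    if match is None:
        raise ValueError("no XPM values string found")
    return int(match.group(1)), int(match.group(2))


def flag_size_ok(xpm_text, scale="normal"):
    """Check a flag's dimensions against the size expected for a scale."""
    return xpm_dimensions(xpm_text) == EXPECTED_SIZE[scale]


header = '/* XPM */\nstatic char * usa [] = {\n"60 39 4 1",\n'
print(xpm_dimensions(header))               # (60, 39)
print(flag_size_ok(header))                 # True
print(flag_size_ok(header, "extra_large"))  # False
```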

<16pt 2 byte font> The name of the two-byte gloss font or nil.
This font is used for languages where the input is given in romanized form, but lexical tree nodes can still be labelled with native-language (double-byte) characters from the font. For example, the Korean implementation uses the following font:

-daewoo-mincho-medium-r-normal--16-120-100-100-c-160-ksc5601.1987-0
The k(_) feature is used in lexical entries to encode the gloss. For more details, see the description in the Word and Lexicon chapter.

For languages not using the double-byte gloss feature, the value of this field should be nil.

<24pt 2 byte font> nil or the 24 point two-byte font used when the tree font size is at least 14 points.
(See the section on Tree Display for information on tree font attributes.)

For example, here is the difference between the 16 and 24 point glosses given the following font definitions taken from the Japanese implementation:

Jap bitmap/flag_nippon.xpm bitmap/flag_xl_nippon.xpm kanji16 kanji24 euc

[Figure: the same gloss displayed with (a) kanji24 and (b) kanji16]

For languages not using the double-byte gloss feature, the value of this field should be nil.

<encoding> The encoding of the double-byte font: euc or big5, or nop if no double-byte font is used.
This specifies the mapping between the value of the k(_) feature and the font table. See the description in the Word and Lexicon chapter for more information.

For languages not using the double-byte gloss feature, the value of this field should be nop.
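
To make the six-field format concrete, here is a minimal Python sketch that splits one entry into named fields. It assumes that the fields are separated by whitespace and contain no embedded spaces, and that the backslash in the format description above only marks a line wrapped for display. The names LanguageEntry and parse_entry are hypothetical. The English entry is assembled from the examples in this section rather than copied from an actual init file.

```python
from typing import NamedTuple, Optional

# Illustrative sketch only; not part of PAPPI.
VALID_ENCODINGS = {"euc", "big5", "nop"}


class LanguageEntry(NamedTuple):
    language: str          # suffix used in file names, e.g. Eng
    flag_icon: str         # 60 x 39 flag
    large_flag_icon: str   # 120 x 78 flag
    font16: Optional[str]  # None when the field is nil
    font24: Optional[str]  # None when the field is nil
    encoding: str          # euc, big5 or nop


def parse_entry(line):
    """Parse one whitespace-separated language entry (assumed layout)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    language, flag, large_flag, font16, font24, encoding = fields
    if encoding not in VALID_ENCODINGS:
        raise ValueError(f"unknown encoding: {encoding}")
    return LanguageEntry(
        language,
        flag,
        large_flag,
        None if font16 == "nil" else font16,
        None if font24 == "nil" else font24,
        encoding,
    )


jap = parse_entry(
    "Jap bitmap/flag_nippon.xpm bitmap/flag_xl_nippon.xpm kanji16 kanji24 euc"
)
eng = parse_entry("Eng bitmap/flag_usa.xpm bitmap/flag_xl_usa.xpm nil nil nop")
print(jap.font24, jap.encoding)
print(eng.font16, eng.encoding)
```

Mapping nil to None keeps the "no gloss font" case distinct from a real font name, which matches the rule that languages without double-byte glosses use nil, nil and nop.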
