New PAPPI: Loading principles and language files

By default, PAPPI loads in enough code to analyze core English sentences.

The blue tab menu item "load defaults" takes care of loading all relevant language-independent and language-particular files.

(On the Mac, all relevant files are stored in the /Applications/newpappi.app/Contents/Resources folder.)



The names of the loaded principles file (principles13), language (Eng) and parsing configuration (j5parser) appear just above the list of principles, as shown below:


(Contextually-appropriate menus will pop up when the names at the top are clicked. Currently, only the Eng, calls/successes and trees menus are operational.)

More specifically, "load defaults" loads the following pre-packaged set of files into PAPPI:

  1. It loads in the definitions from the universal core, represented by files principles13.pl and xbar.pl.
  2. The definitions in principles13.pl appear as operations on the left side of the parser window.
    The definitions in xbar.pl contribute to the "Parse S-structure" operation (to be described in detail below).

  3. It also loads in various English language files. Generally, each language L will have its parameterization, lexicon and other peripheral definitions stored in files parametersL.pl, lexiconL.pl, peripheryL.pl, respectively.

    For English, the files parametersEng.pl, lexiconEng.pl, peripheryEng.pl are loaded by "load defaults". Language-particular files can be loaded individually by clicking on the current language name tag, e.g. Eng or Jp (partially obscured below), and selecting one of the menu items shown below.

  4. If you switch languages, you should reload the parser control (see below).

    [The language-specific LR machine has special hooks into the loaded interleaved principles so that these principles can apply while structure-building is in progress. Loading a new language (and its LR machine) may present a different set of language-specific hooks. Reloading the parser control guarantees that any interleaved principles are properly re-attached to the newly-loaded language.]

  5. The tree-building operation shown as "Parse S-structure" is a language-specific sentence parser. It generates S-structure trees in accordance with the Xbar-theoretic definitions from the file xbar.pl (plus head and phrasal movement).
    These S-structures are passed to other parser operations to rule upon. By default, the English-language version of the S-structure generator is loaded from the files actionEng.pl, transitionEng.pl, igoalsEng.pl and commentsEng.pl.

    For reasons of computational efficiency, the implementation uses an LR(1)-based parsing algorithm that generates trees in a bottom-up fashion (instead of using the Xbar-based rules directly). Both a stack and a finite-state control over dotted-rule configurations are defined.

    The English-specific LR machine is stored in, and loaded from, the files actionEng.pl (structure-building shift/reduce operations), transitionEng.pl (finite-state control), igoalsEng.pl (stub points for principle-interleaving) and commentsEng.pl (not used in this version of the parser). These files are automatically loaded when the language is loaded. The LR machine can also be loaded manually through the language menu shown above.

    If the phrase structure rules have been changed, e.g. if xbar.pl, parametersEng.pl, peripheryEng.pl or lexiconEng.pl have been changed significantly, the LR machine may require updating. Selecting "Build LR Machine" in the language menu will bring up the following pop-up window:

    It is highly recommended that you rename or save a copy of the current machine before building a new one because "Build LR Machine" will automatically overwrite existing files.

    Here, an LR machine with 137 states has been constructed from 77 phrase structure rules. These phrase structure rules are derived from the core (non-language-particular) rules in xbar.pl, parameterized for English word order through settings in parametersEng.pl. Language-particular modifications to the core rules are stored in peripheryEng.pl. Subcategorization possibilities are defined by the theta-grid and other lexical features of heads in lexiconEng.pl.

  6. Finally, a file is loaded that specifies the order of execution of the principles. This is called the parser control file.
    (The order of execution is given by the top-down order of the principles panel.)

    By default, the configuration that is loaded comes from the file j5parser.pl.

    [The 5 in the filename refers to the five greyed-out parser operations that are automatically integrated into the preceding "Parse S-structure" operation at load time. The j5parser.pl file also specifies that the parser operation called "Subjacency" should be merged into the preceding "Trace Theory" operation, which recovers the possible movement chains in a given S-structure. This also happens at load time. The net effect is that "Trace Theory" assigns only those possible movement chains that do not violate the Subjacency condition.]

  7. To reload the parser control file or select a new one, click on the current parser control filename:

    By convention, all parser control filenames must end in 'parser.pl'.
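
The file-naming conventions in the list above can be summarized in a short sketch. This is not part of PAPPI; it is a hypothetical Python helper that only computes the filenames a "load defaults"-style set would contain. The function names are invented for illustration, and extending the actionEng.pl / transitionEng.pl / igoalsEng.pl / commentsEng.pl pattern to languages other than Eng is an assumption, since the text names those files for English only.

```python
from pathlib import PurePosixPath

# Folder named in the text for the Mac distribution.
RESOURCES = PurePosixPath("/Applications/newpappi.app/Contents/Resources")

# Per-language files named in the text: parametersL.pl, lexiconL.pl, peripheryL.pl.
LANGUAGE_STEMS = ("parameters", "lexicon", "periphery")

# LR machine files. The text lists these for Eng only; applying the same
# pattern to other languages is an assumption made for this sketch.
LR_MACHINE_STEMS = ("action", "transition", "igoals", "comments")


def language_files(lang):
    """Filenames holding the parameters, lexicon and periphery of a language."""
    return [f"{stem}{lang}.pl" for stem in LANGUAGE_STEMS]


def lr_machine_files(lang):
    """Filenames holding the language-specific LR machine (assumed pattern)."""
    return [f"{stem}{lang}.pl" for stem in LR_MACHINE_STEMS]


def is_parser_control(filename):
    """Parser control filenames end in 'parser.pl' by convention."""
    return filename.endswith("parser.pl")


def default_load_set(lang="Eng", principles="principles13.pl",
                     control="j5parser.pl"):
    """List the files in the order the text describes them:
    universal core, language files, LR machine, parser control."""
    if not is_parser_control(control):
        raise ValueError(f"{control!r} does not end in 'parser.pl'")
    names = [principles, "xbar.pl",
             *language_files(lang), *lr_machine_files(lang), control]
    return [RESOURCES / name for name in names]


if __name__ == "__main__":
    for path in default_load_set():
        print(path)
```

Calling default_load_set("Jp") would list the corresponding Japanese filenames under the same assumptions; whether every such file actually exists for a given language is not something this sketch can tell you.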


Last modified: Mon Apr 28 22:29:51 MST 2014