New PAPPI: Instrumentation

PAPPI provides several instrumentation tools for analyzing the parsing process.

Consider the sentence John thought nobody liked Mary:



Notice there are also two numbers (initially zero) in the calls/successes column associated with each (non-greyed-out) operation in the snapshot above. For each parser operation, the left number represents the number of times the operation has been called, i.e. the number of times the operation has been supplied with a tree to judge. The right number indicates the number of trees that have been passed by the operation.
(During parsing, these numbers will be updated dynamically.)
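
The bookkeeping behind the two numbers can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not PAPPI's own code: the class `CountedOperation`, its fields, and the stand-in operation are hypothetical names invented here, and candidate trees are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class CountedOperation:
    """Hypothetical wrapper that mimics one row of the calls/successes column."""
    name: str
    apply: Callable[[str], Iterable[str]]  # one input tree -> zero or more output trees
    calls: int = 0       # left number: trees supplied to the operation
    successes: int = 0   # right number: trees passed by the operation

    def run(self, tree: str) -> List[str]:
        self.calls += 1                  # one more tree to judge
        outputs = list(self.apply(tree))
        self.successes += len(outputs)   # every tree passed on is counted
        return outputs


# Stand-in operation (an assumption): returns four labelled copies of its input.
toy_operation = CountedOperation(
    "toy operation",
    lambda tree: [f"{tree} #{i}" for i in range(1, 5)],
)
toy_operation.run("John thought nobody liked Mary")
print(toy_operation.calls, toy_operation.successes)  # prints "1 4"
```

Both counters start at zero and only ever grow during a parse, which matches the dynamic updating noted above.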

For example, "Parse PF", the first-listed parser operation, which tokenizes and performs lexical lookup, receives a single input, namely the sentence typed in the dialog box. In the case of John thought nobody liked Mary, Parse PF produces as output 4 different sequences of heads to be sent to the next operation, i.e. "Parse S-structure".

Normally, to keep the output brief, PAPPI only prints the final tree (or trees) produced. However, you can turn on input (or output) printing for any individual (non-greyed-out) operation by clicking on the corresponding left (or right) number and selecting the print before (or after) option, respectively.

For example, suppose you would like to see the actual output generated by Parse PF.

Click on the right zero and select "Activate Print After".
(When printing is activated, the background of the number will change to a light blue.)


(To turn off printing, simply click the number again and select the "Deactivate Print After" option.)

Now, let's re-run the sentence:

The above snapshot shows the (chronological) sequence of outputs obtained during the re-run.
(Note: Line 5 above, indicated by "5. LF(1)", contains the final parse.)

For example, "3. <= Parse PF(3)" indicates that line 3 holds the third output generated by Parse PF.

By clicking on a line, the associated structural output will be displayed in the main window.

The 4 head sequences generated by Parse PF are given below:

The above output makes it clear that the multiple outputs stem from the lexical ambiguity of that and thought.

These 4 candidate head sequences are passed as input to the next parser operation, i.e. "Parse S-structure". But, as indicated by the output display above, only the 4th sequence (with thought as a past tense verb and that as a complementizer) results in a valid parse.

By switching on the output of Parse S-structure, we can see that out of the four initial candidates for S-structures, only the first resulted in a valid parse:

The complete set of four initial S-structure candidates is given below:

1. 2. 3. 4.


The next operation, "Trace Theory", which recovers the possible chains of movement in a candidate S-structure, expands the number of distinct trees to 7. Parser operations such as "Parse S-structure", "Trace Theory" and "Parse PF" usually act as "generators", i.e. they produce more outputs than inputs.

The following graph summarizes the input/output behavior of the system on the example sentence:


(In the graph above, the generators for this parse are coded in green and the filters in red.)

Not all operations act as generators; some act as "filters" in the sense that they rule out illicit candidate trees. For example, the operation "Case Filter", which states that all lexical noun phrases (NPs) must receive Case, has left and right numbers of 7 and 2, respectively. This means the Case Filter rejects 5 of the 7 candidate trees put forward.

The 2 trees that survive the Case Filter are passed on to the next operation, "Case Condition on ECs", as input. Eventually, the 2 survivors are passed to the generator "Free Indexation", which freely assigns indices to NPs not already assigned an index through movement. Free Indexation produces 6 different candidate index assignments. The 6 output trees then become the input to the next operation, "Functional Determination", and so on.
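
The hand-off described above, where the trees passed by one operation become the calls of the next, can be mimicked with a toy pipeline. The following is a minimal Python sketch, not PAPPI's implementation: `run_pipeline`, `kind`, the two stand-in operations, and the integer "trees" are all hypothetical, chosen only so that the counts reproduce the 7, 2 and 6 quoted in the text. The sketch also runs stage by stage for simplicity, which need not be the order in which PAPPI itself evaluates candidates.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Tree = int  # stand-in: real candidate trees are structures, not integers
Operation = Callable[[Tree], Iterable[Tree]]


def run_pipeline(stages: List[Tuple[str, Operation]],
                 inputs: List[Tree]) -> Dict[str, Tuple[int, int]]:
    """Feed every output of one stage to the next; return (calls, successes) per stage."""
    counts: Dict[str, Tuple[int, int]] = {}
    current = list(inputs)
    for name, operation in stages:
        outputs: List[Tree] = []
        for tree in current:
            outputs.extend(operation(tree))
        counts[name] = (len(current), len(outputs))
        current = outputs  # survivors become the next stage's input
    return counts


def toy_filter(tree: Tree) -> Iterable[Tree]:
    # Hypothetical filter: passes only trees 0 and 1, rejects the rest.
    return [tree] if tree < 2 else []


def toy_generator(tree: Tree) -> Iterable[Tree]:
    # Hypothetical generator: three candidate outputs per input tree.
    return [tree * 10 + i for i in range(3)]


def kind(calls: int, successes: int) -> str:
    """Label an operation by its counts for one parse."""
    if successes > calls:
        return "generator"
    if successes < calls:
        return "filter"
    return "neutral"


counts = run_pipeline(
    [("toy filter", toy_filter), ("toy generator", toy_generator)],
    inputs=list(range(7)),
)
for name, (calls, successes) in counts.items():
    print(f"{name}: {calls}/{successes} -> {kind(calls, successes)}")
# prints:
# toy filter: 7/2 -> filter
# toy generator: 2/6 -> generator
```

Because the label depends only on the counts from a single run, the same operation could come out as a generator for one sentence and as neutral for another.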

Of the maximum of 7 trees tested, only one manages to survive all the way through the last operation.

(In situations where multiple parses are possible, the system provides a complete list of the surviving trees under the "trees" column. It displays the last one produced, but the others can be retrieved for display simply by clicking on the individual entries under trees.)

Appendix

To clear all the print before or print after options in one go, click on calls/successes and select the "clear all print before/after(s)" shortcut:

To delete all of the output history and trees, press the "clear trees" button on the right side of the application:

Note that, as a side effect, the calls/successes counters will also be zeroed at the same time.


Last modified: Mon Apr 28 22:33:30 MST 2014