Lexicon PF

Lexical Features

  1. Lexicon:

    Predicates that must be supplied in a lexicon.

  2. Lexical Features:

    Obligatory and optional lexical features.

  3. Parse PF:

    The ParsePF system and predicates for taking apart and combining words.


Contents

Obligatory Features:
  grid
  agr
Optional Features:
  k
  suffix
  agr_feature(*)
Unique Feature System:
  unique
  unique_category_in
  unique_feature_domain
  unique_feature_in

(*) Not described here. The entry refers to the constituent feature predicates.


Obligatory Features


Optional Features

k(+Code)

The k feature is used for native-language glossing. For multiple-byte fonts, the atom Code gives the sequence of characters to be displayed, written in hexadecimal and usually in Extended Unix Code (EUC) format.

Example:

From the Korean lexicon:

lex(sunhee,n,[a(-),p(-),agr([3,sg,f]),class(person),k('3c314871')]).
The k('3c314871') feature causes sunhee to be rendered using the Korean script in parse tree displays as follows:

(a) The k(_) feature

The font used here must be declared in the file init/Nlanguage_info as follows:

# language flag xl-flag 16pt-font 24pt-font encoding:{NOP,EUC,BIG5}

Korean       bitmap/flag_korea.xpm        bitmap/flag_xl_korea.xpm \
    -daewoo-mincho-medium-r-normal--16-120-100-100-c-160-ksc5601.1987-0 \
    -daewoo-mincho-medium-r-normal--24-170-100-100-c-240-ksc5601.1987-0   nop
The code 3c314871 is used to look up sunhee in the font table two bytes (four hexadecimal digits) at a time. For example, the first two bytes, 3c31, correspond to sun in sunhee. Using xfd, we can see that the correct character is in position 31 on page 3c:

(b) Korean font lookup

The nop encoding flag indicates that, for the Korean implementation, characters are looked up directly in the font table. For Japanese and (traditional) Chinese, the corresponding flags are euc (Extended Unix Code) and big5, respectively.
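As a rough illustration of the two-bytes-at-a-time lookup described above, the following Prolog sketch splits a k code into page and position pairs. The predicates code_cells/2 and cells/2 are hypothetical helpers, not part of PAPPI, and the sketch assumes that the code consists of a whole number of two-byte characters (four hex digits each).

```prolog
% Hypothetical helper, not part of PAPPI.
% code_cells(+Code, -Cells): split the atom Code into Page-Position
% pairs, one pair per two-byte character.
code_cells(Code, Cells) :-
    atom_chars(Code, Digits),
    cells(Digits, Cells).

% Consume four hex digits at a time: two for the page, two for the
% position on that page. Fails if the digits do not divide evenly.
cells([], []).
cells([P1,P2,C1,C2|Rest], [Page-Pos|Cells]) :-
    atom_chars(Page, [P1,P2]),
    atom_chars(Pos, [C1,C2]),
    cells(Rest, Cells).

% ?- code_cells('3c314871', Cells).
% Cells = ['3c'-'31', '48'-'71'].
```

The first pair, '3c'-'31', is the page and position given for sun in the xfd lookup above.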

See also the definition of the optional feature suffix.

References: suffix


suffix(+Suffix,+Code)

The suffix feature is used for suffixing markers. The atom Suffix is appended, after a dash (-), to the lexical item being marked, and the atom Code is appended to the code in that item's k feature.

Example:

Consider the lexical entries for the nominative Case marker ga and the noun taroo from the Japanese lexicon:

lex(taroo,n,[a(-),p(-),agr([3,sg,m]),class(person),k(c2c0cfba)]). 
...
lex(ga,mrkr,[left(n,[],[morphC(nom),suffix(ga,a4ac)])]).
When the marker ga is applied to a noun like taroo, the suffix feature changes the display of the noun to be taroo-ga and appends the hiragana representation of ga to the k feature of taroo:

(a) Unmodified (b) After suffixation
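The effect of suffixation on the displayed word and on the k feature can be sketched in plain Prolog as follows. The predicate apply_suffix/5 is a hypothetical model written for this illustration; it is not PAPPI's own code, and it shows only the two string operations described above.

```prolog
% Hypothetical model of suffixation, not PAPPI's implementation.
% apply_suffix(+Word, +K, +SuffixFeature, -NewWord, -NewK)
apply_suffix(Word, k(Code0), suffix(Suffix, Code1), NewWord, k(Code)) :-
    atom_concat(Word, '-', Stem),        % taroo  -> 'taroo-'
    atom_concat(Stem, Suffix, NewWord),  % 'taroo-' + ga -> 'taroo-ga'
    atom_concat(Code0, Code1, Code).     % append the suffix code to k

% ?- apply_suffix(taroo, k(c2c0cfba), suffix(ga, a4ac), W, K).
% W = 'taroo-ga',
% K = k(c2c0cfbaa4ac).
```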

References: k


Unique Feature System

unique_feature_domain +C

Declares the category label C to be a local domain for unique features, that is, features that may occur at most once within a given domain. The notion of a unique feature is also extended to category labels.

The unique occurrence of a category label is handled by the unique_category_in construct. All other features must be declared using unique_feature_in.

Example:

In the English DP implementation, the category label maxdp is declared to be a unique feature domain:

unique_feature_domain maxdp.
This defines a determiner phrase to be the local domain within which unique features can be used to control the lexical insertion and nesting of determiners. Examples are given in the definition of unique_feature_in and unique_category_in.

References: unique_feature_in / unique_category_in


+F unique_feature_in +C

Declares that the lexical feature F is unique in the local domain C.

That is, given a phrase C, where C has been previously declared as a local domain, an item with feature F can occur at most once.

Note 1: Multiple instances that result from phrasal projection do not count against uniqueness. By default, PAPPI allows the lexical features of heads to be inherited by, or made visible in, intermediate and maximal projections.

Note 2: the category label C must be previously declared using the unique_feature_domain declaration.

Example:

In the English DP implementation, DP phrase structure follows the X-bar system:

head(d).        bar(d1).        max(dp).
proj(d,d1).     proj(d1,dp).
head(d,d).      head(d1,d).     head(dp,d).
spec(d1,maxdp). spec(d1,[]).
compl(d,np).    compl(d,dp).
In particular, this definition allows for the cascading or nesting of determiners. Examples of DP phrase structure:

Define the dummy category label maxdp to be an instance of a DP:

rule maxdp -> [dp] st true.
maxdp defines the extent of the local domain for unique features. In the four tree examples, all the top DP nodes are maxdps. The DP specifier of d1 is also defined to be a maxdp. This ensures that in expressions like John's few friends, the genitive-marked possessor is in a separate local domain from the main DP.

Now we are in a position to define some constraints controlling the insertion and nesting of determiners within a local DP domain via the unique_feature_in construct.

English determiners are labelled as being either strong or weak. Examples of strong determiners are the and every. Examples of weak determiners are several and few. There is a constraint that states that no two strong determiners can occur within a maxdp domain. This blocks phrases like *the every man, but correctly allows for the several men and every few meters. This prohibition can be simply stated via the following declaration:

strong unique_feature_in maxdp.
In addition, the unique feature strong is added to the lexical entries for the determiners the and every, but not to those for few or several:
lex(every,d,[count(+),agr([3,sg,[]]),op(+),strong]).
lex(few,d,[count(+),agr([3,pl,[]]),op(+)]).
lex(several,d,[count(+),agr([3,pl,[]])]).
lex(the,d,[count(_),agr([3,[],[]]),strong]).
Note that this is also an efficient implementation. The strong feature is checked immediately at lexical insertion time, possibly obviating the need for additional phrase structure construction.
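The check at lexical insertion time can be modelled in plain Prolog roughly as follows. This is a sketch under stated assumptions, not PAPPI's implementation: the predicates unique_in/2, insert/4 and insert_all/2 are hypothetical, the lexical entries are abridged from those above, and a domain is represented simply as the list of unique features registered so far.

```prolog
:- use_module(library(lists)).

% Hypothetical stand-in for the declaration: strong unique_feature_in maxdp.
unique_in(strong, maxdp).

% Lexical entries, abridged from the entries above.
lex(every,   d, [agr([3,sg,[]]), strong]).
lex(few,     d, [agr([3,pl,[]])]).
lex(several, d, [agr([3,pl,[]])]).
lex(the,     d, [agr([3,[],[]]), strong]).

% insert(+Word, +Domain, +Seen0, -Seen): Seen0 holds the unique
% features already registered in this instance of Domain.
insert(Word, Domain, Seen0, Seen) :-
    lex(Word, _Cat, Fs),
    findall(F, (member(F, Fs), unique_in(F, Domain)), Us),
    \+ (member(U, Us), memberchk(U, Seen0)),  % fail at once on a repeat
    append(Us, Seen0, Seen).

% insert_all(+Words, +Domain): insert each word into one fresh domain.
insert_all(Words, Domain) :-
    insert_all(Words, Domain, []).

insert_all([], _, _).
insert_all([W|Ws], Domain, Seen0) :-
    insert(W, Domain, Seen0, Seen),
    insert_all(Ws, Domain, Seen).

% ?- insert_all([the, several], maxdp).   % succeeds
% ?- insert_all([every, few], maxdp).     % succeeds
% ?- insert_all([the, every], maxdp).     % fails: two strong determiners
```

Because insert/4 fails as soon as the second strong determiner is looked up, no further structure is built for the offending sequence, which mirrors the efficiency point made above.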

References: unique_feature_domain


+C unique_category_in +D

Declares that a category with label C that carries the feature unique(C) must be the only category with label C in the local domain D.

Note: the category label D must be previously declared using the unique_feature_domain declaration.

Example:

In the English DP implementation, cascading or nesting of determiners within a determiner phrase is generally allowed, e.g. all the many/several/seven men who came to the party. (For additional background on the English DP, see the description of the unique_feature_in construct.) However, certain determiners cannot co-occur with any other determiner: the indefinite article a (an), except with many and few in many a and a few; the quantifier some; and the empty determiner D.

This can be captured using the following declaration:

d unique_category_in maxdp.
In addition, the lexical entries of a(n) and some, but not the, are marked with the unique(d) feature:
lex(a,d,[count(+),agr([3,sg,[]]),vow(-),def(-),unique(d)]).
lex(an,d,[count(+),agr([3,sg,[]]),vow(+),def(-),unique(d)]).
lex(some,d,[count(_),agr([3,[],[]]),op(+),unique(d)]).
lex(the,d,[count(_),agr([3,[],[]])]).
In the case of D, the rule that introduces the empty category should provide the unique(d) feature:
rule d with Fs -> [] st ecDFs(Fs).

% Empty D features
ecDFs(Fs) :- mkFs([agr(_),count(_),strong(_),unique(d)],Fs).
Every time a determiner is inserted into phrase structure, whether it be from the lexicon or an empty category, it will be registered in the local domain. If no unique(d) feature has been found, multiple occurrences will be permitted. However, inserting a determiner marked with unique(d) will constrain the system to check that no determiner has been (or can be) registered for the particular local domain.
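The registration behaviour just described can be modelled in plain Prolog along the following lines. This is a sketch under stated assumptions, not PAPPI's implementation: unique_category/2, register/4 and register_all/2 are hypothetical names, the lexical entries are abridged, the registry is a plain list of Cat-Mark pairs, and the many a and a few exceptions are not modelled.

```prolog
:- use_module(library(lists)).

% Hypothetical stand-in for the declaration: d unique_category_in maxdp.
unique_category(d, maxdp).

% Lexical entries, abridged from the entries above.
lex(a,       d, [unique(d)]).
lex(some,    d, [unique(d)]).
lex(the,     d, []).
lex(several, d, []).

% register(+Word, +Domain, +Reg0, -Reg): Reg0 lists the Cat-Mark pairs
% already registered in this instance of Domain (Mark: unique or plain).
register(Word, Domain, Reg0, [Cat-Mark|Reg0]) :-
    lex(Word, Cat, Fs),
    (   unique_category(Cat, Domain)
    ->  \+ memberchk(Cat-unique, Reg0),     % no earlier unique(Cat) item
        (   memberchk(unique(Cat), Fs)
        ->  \+ memberchk(Cat-_, Reg0),      % must be the only Cat so far
            Mark = unique
        ;   Mark = plain
        )
    ;   Mark = plain
    ).

% register_all(+Words, +Domain): register each word in one fresh domain.
register_all(Words, Domain) :-
    register_all(Words, Domain, []).

register_all([], _, _).
register_all([W|Ws], Domain, Reg0) :-
    register(W, Domain, Reg0, Reg),
    register_all(Ws, Domain, Reg).

% ?- register_all([the, several], maxdp).  % succeeds: no unique(d) item
% ?- register_all([a], maxdp).             % succeeds: a is alone
% ?- register_all([the, a], maxdp).        % fails: a must be the only d
% ?- register_all([some, the], maxdp).     % fails: some is already there
```

Checking both directions (no earlier determiner when a unique(d) item arrives, and no earlier unique(d) item when any determiner arrives) is one simple way to approximate the "has been (or can be) registered" condition.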

References: unique_feature_domain / unique_feature_in
