This is a introductory course in computational linguistics at an
advanced level.
Reference Textbook We will make use of selected chapters from Speech and Language Processing 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall 2008. Software We will use Perl, Python, and SWI-Prolog (freely available) in the computer laboratory classes. Students will implement finite state automata, transducers, parsers and translation programs based on grammar rules in a series of computer laboratory exercises. In the case of numerical calculations, we will may make use of Microsoft Excel for worked examples and homework questions. |
Instructor: Sandiway Fong sandiway@email.arizona.edu
Office: Douglass 311
Location | McClelland Park 102 (Lecture) |
Time | Tuesdays/Thursday 3:30-4:45 pm |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
8/22 | lecture1.pdf | lecture1.pptx | 43 | Viewer | Administrivia and Introduction. Homework 1: read chapter 1 of textbook. Homework 2: Install Perl and Python. |
8/24 | lecture2.pdf | lecture2.pptx | 24 | Viewer | App of the Day: Text Summarizer. Homework 3:
OTS. Programming. Perl.
Slides updated: 5:10pm 8/24 |
8/29 | lecture3.pdf | lecture3.pptx | 16 | Viewer | Homework 3 review. Homework 4. Perl contd.
IMPORTANT! Homework 4 will be reviewed in class next Tuesday, so it is due by Tuesday (Monday midnight), not Tuesday midnight as specified in the slides! |
8/31 | lecture4.pdf | lecture4.pptx | 14 | Viewer | Perl Arrays: @ARGV, sort, cmp, push, pop, shift, unshift,
splice. Perl hash.
Slides updated: 5:30pm 8/31 |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
9/5 | lecture5.pdf | lecture5.pptx | 20 | Viewer | Sorting revisited. Homework 4 Review. More Perl: implicit coercion, conditionals and loops. |
9/7 | lecture6.pdf | lecture6.pptx | 15 | Viewer | Clickbait revisited.
More Perl: string operations, and file input/output. Homework 5.
Data files for homework: hw5data1.txt / hw5data2.txt / |
9/12 | lecture7.pdf | lecture7.pptx | 31 | Viewer | RPerl. A worked example. Perl references. Perl modules. cpanm. localtime. |
9/14 | lecture8.pdf | lecture8.pptx | 21 | Viewer | Homework 5 Brief Review. Homework 6: Reading. Perl Regex. Homework 7: Exercises. |
9/19 | No class this week. | ||||
9/21 | No class this week. | ||||
9/26 | lecture9.pdf | lecture9.pptx | 18 | Viewer | Backreferences. Greedy and non-greedy matching. Regex variables and regexs as variables. Split and regexps. Evaluating Perl code inside a regex. Homework 7 Review. |
9/28 | lecture10.pdf | lecture10.pptx | 26 | Viewer | Recursive regexs: pallindrome parsing. Prime numbers: factorizing using regexs. Homework 8 data: wsj.txt / irregular_verbs.txt |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
10/3 | lecture11.pdf | lecture11.pptx | 17 | Viewer | Homework 8 clarifications. Finite State Automata (FSA). Perl and Python implementations of the pseudo-code recognizer. |
10/5 | lecture12.pdf | lecture12.pptx | 14 | Viewer | Homework 8 review. FSA e-transitions. Set of states construction:
NDFSA -> FSA. Homework 9 (ungraded).
Modified slides: 5pm 10/5 |
10/10 | lecture13.pdf | lecture13.pptx | 22 | Viewer | Formal equivalence: FSA and regex. Closure properties: intersection, difference, complementation, and reversal. FSA -> regex construction: state bypass method. |
10/12 | lecture14.pdf | lecture14.pptx | 21 | Viewer | Homework 10. Efficient string matching: FSA technology. Beyond regular languages: the case of {a^{n}b^{n}| n ≥ 1}. Pumping Lemma for regular languages. Proofs: a^{n}b^{n} is not regular; set of primes is not regular. |
10/17 | lecture15.pdf | lecture15.pptx | 10 | Viewer | Homework 11: install SWI-Prolog. Quick introduction to Prolog. |
10/19 | lecture16.pdf | lecture16.pptx | 21 | Viewer | Recursive definition and modes of access.
Homework 10 review.
Slides edited: 5:15pm |
10/24 | lecture17.pdf | lecture17.pptx | 26 | Viewer | Regular grammars in Prolog. Example: convert FSA to regex and
regular grammar. Homework 12.
Slides updated: 8:15pm |
10/26 | lecture18.pdf | lecture18.pptx | 22 | Viewer | Left recursion and language enumeration. A look at a grammar for a^{n}b^{n}, n>=1. Using extra arguments for computing a parse tree. |
10/31 | lecture19.pdf | lecture19.pptx | Viewer | Homework 12 review. Using extra arguments to constrain a grammar: the case of context-sensitive a^{n}b^{n}c^{n}, n>=1. Embedding Prolog code inside grammar rules using { ... }. Prolog-style context-sensitive rules. |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
11/2 | lecture20.pdf | lecture20.pptx | 21 | Viewer | Natural language grammars and parsing. Stanford Parser. |
11/7 | lecture21.pdf | lecture21.pptx | 23 | Viewer | Homework 13: the Cognate Object Construction (COC). Practice
writing natural language grammars. Agreement:
determiner-noun. Agreement: subject-verb. Left recursion and Prolog
grammars.
Update (5pm): slides updated and psg.pl |
11/9 | lecture22.pdf | lecture22.pptx | 9 | Viewer | A transformation for dealing with left recursion: a right recursive
grammar that produces left recursive parses.
Slides updated, also psg2.pl |
11/14 | lecture23.pdf | lecture23.pptx | 23 | Viewer | 538 Presentations. Homework 13 Review. Finishing up: a right recursive
grammar that produces left recursive parses.
Slides updated: 5:10pm |
11/16 | lecture24.pdf | lecture24.pptx | 40 | Viewer | 538 Presentations. Chomsky Normal Form (CNF). CKY Bottom-up
Parsing. Spelling Errors: Microsoft Word, Google.
Slides corrected: 5:30pm CNF example grammar: cnf.pl |
11/21 | lecture25.pdf | lecture25.pptx | 58 | Viewer | Optional Homework 14. 538 Presentations. Chomsky Normal Form (CNF)
conversion revisited. Minimum Edit Distance.
eds.xls |
11/23 | Thanksgiving | ||||
11/28 | lecture26.pdf | lecture26.pptx | 33 | Viewer | Morphology: Inflectional and Derivational. Stemming: the Porter stemmer in Prolog/Python/Perl. |
11/30 | 538 Presentations (Part I) |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
11/30 | 538 Presentations (Part II) |