To my linguistics homepage

LING/C SC/PSYC 438/538
Computational Linguistics
Fall 2018

This is a introductory course in computational linguistics at an advanced level.

Reference Textbook

We will make use of selected chapters from Speech and Language Processing 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall 2008.

Software

We will use Perl, Python, and SWI-Prolog (freely available) in the computer laboratory classes. Students will implement finite state automata, transducers, parsers and translation programs based on grammar rules in a series of computer laboratory exercises.

In the case of numerical calculations, we will may make use of Microsoft Excel for worked examples and homework questions.

Instructor: Sandiway Fong sandiway@email.arizona.edu
Office: Douglass 311

Administrivia

Location ECE 228 (Lecture)
Time Tuesdays/Thursday 2:00-3:15 pm

Syllabus

See lecture1 slides.

Lecture Notes

Available in Adobe PDF formats.

August

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
8/21 lecture1.pdf lecture1.pptx 43 Viewer Administrivia and Introduction. Homework 1: read chapter 1 of textbook for in-class quiz next time. Homework 2: Install Perl and Python.
8/23 lecture2.pdf lecture2.pptx 24 Viewer (part 1)Viewer (part 2) Quick quiz. Computers vs. humans. Text summarization service. Perl one-liners. CMUdict on Perl and Python. Perlintro with Python.
Slides updated: 8/23 5:30pm
8/28 lecture3.pdf lecture3.pptx 14 Viewer Quiz 1 review. Quick Homework 1. Perlintro continued. Arrays and Python lists.
8/30 lecture4.pdf lecture4.pptx 25 Viewer Homework 1 review. Perlintro continued. print and $".

September

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
9/4 lecture5.pdf lecture5.pptx 30 Viewer A note on Perl and Python and the PowerShell in Windows. Perlintro continued. Built-in functions for arrays and Python (no push, no shift/unshift). splice. Hashes in Perl and Python dictionaries.
9/6 lecture6.pdf lecture6.pptx 23 Viewer Homework 2. Perl coercion, conditionals and looping. @ARGV. Hash. Python dict. List comprehension. zip. ranges.
9/11 lecture7.pdf lecture7.pptx 19 Viewer Homework 2 review. Windows 10 PowerShell and Unicode. Reading Homework. Useful Perl and Python string operations. Trimming whitespace. More on list comprehensions.
9/13 lecture8.pdf lecture8.pptx 37 Viewer Quick Homework 3 on clickbait articles. File I/O in Perl and Python. Thinking algorithmically: a worked example - detected repeated words.
9/18 lecture9.pdf lecture9.pptx 29 Viewer Homework 3 review. References in Perl. cpanm.
Slides corrected: 3:15pm
9/20 lecture10.pdf lecture10.pptx 25 Viewer Homework 4. Regular expressions. Perl regex. JM chapter 2 section 1.
File: population.txt
9/25 lecture11.pdf lecture11.pptx 27 Viewer Homework 4 Review. Perl regex contd. grouping. backreferences.
9/27 lecture12.pdf lecture12.pptx 18 Viewer Backreferences revisited. greedy vs. shortest match behavior. Regex exponential time behavior. Global =~ matching. Perl code inside regexs.

October

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
10/2 lecture13.pdf lecture13.pptx 14 Viewer Review of upgraded homework exercises. More Perl regex. Homework 5.
Note: unfortunately Panopto crashed, so the last part of the lecture was not recorded.
Homework files: hw5.pos
Other files: repeated.txt / ab.txt / integerline.txt / grotto.txt
10/4 lecture14.pdf lecture14.pptx 19 Viewer Perl regex to the max: palindromes and prime number testing! Recursive regex. Lookahead and lookbehind. A note on Python's re.
File: prime.perl
10/9 lecture15.pdf lecture15.pptx 27 Viewer Erratum. Homework 5 Review. Finite State Automata (FSA). Perl and Python implementations of the pseudo-code recognizer.
Slides corrected: 5pm
10/11 lecture16.pdf lecture16.pptx 13 No panopto available! Ungraded Homework 6. Empty transitions. Multiple start and end states. NDFSA to (D)FSA conversion. Backreferences revisited.
10/16 lecture17.pdf lecture17.pptx 21 Viewer Ungraded Homework 6 review. Set-theoretic formal definition of a regular language. Closure properties and FSA. Convert FSA to regex.
10/18 lecture18.pdf lecture18.pptx 20 Viewer Homework 7. Efficient string matching: FSA technology. Beyond regular languages: the case of {anbn | n >= 1}. Pumping Lemma for regular languages. Proofs: anbn is not regular; set of primes is not regular.
Slides updated: 7:45pm
10/23 No class.
10/25 lecture19.pdf lecture19.pptx 23 Viewer Review of Homework 7. Quick Homework 8. SWI-Prolog cheatsheet. Factorial and Σ* examples in Prolog.
File: factorial.pl
File: sigma.pl
10/30 lecture20.pdf lecture20.pptx 30 Viewer Definite clause grammar notation. Writing regular grammars in Prolog. Quick Homework 8.
Grammar file: ab.pl

November

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
11/1 lecture21.pdf lecture21.pptx 29 Viewer Homework 8 Review. Left recursion and set enumeration. anbn revisited. Adding a parse tree as an extra argument to grammar rules. Homework 10.
File: anbn.pl
11/6 lecture22.pdf lecture22.pptx 28 Viewer Homework 10 review. Extra arguments for constraints. Inserting Prolog code into grammar rules. Difference lists explained. Difference lists warped for context-sensitivity.
A context-sensitive grammar for anbncn.
File: abc_parse.pl / abc_count.pl / abc_cs.pl
11/8 lecture23.pdf lecture23.pptx 28 Viewer Natural language parsing: syntactic analysis. Stanford / Berkeley parsers. Google Cloud Natural Language. UDPipe. Training data: Penn Treebank. Part of speech tags. Word sense disambiguation. Syntax. Homework 11. Homework 12.
11/13 lecture24.pdf lecture24.pptx 11 Viewer Homework 11 review. Agreement example.
g.pl (from class)
11/15 lecture25.pdf lecture25.pptx 21 Viewer Homework 12 review. Stanford and Berkeley PCFG parsers, Google and UDPipe neural net parsers.
538 Presentations
Grammar transformation: building left recursive structures using right right recursive rules.
11/20 lecture26.pdf lecture26.pptx 9 Two parts: Viewer / Viewer 538 Presentations
Grammar transformation: building left recursive structures using right right recursive rules.
psg.pl (partially modified) / psg2.pl (modification complete)
11/22 Thanksgiving Recess: no class.
11/27 lecture27.pdf lecture27.pptx 2 538 Presentations (Part I)
11/29 lecture28.pdf lecture28.pptx 2 538 Presentations (Part II)

December

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
12/4 lecture29.pdf lecture29.pptx 2 538 Presentations (Part III)


To my linguistics homepage
Last modified: Mon Dec 3 14:30:26 MST 2018