Ling-367 Computational Tools for Linguists

Semester: Fall 2006
Time:     Tuesday 4:15-6:15PM
Room:     CBN 301

Taught by:
George Wilson
   office:         ICC477
   office hours:   Tue 3-4, or by appointment
Markus Dickinson
   office:         ICC452
   office hours:   Mon 12-1,  Thu 11-12, or by appointment



5 Sept   - First Class
TBA      - Midterm


  • BlackBoard - Course Documents including Lecture Notes
  • Tools that will be used in class.
  • Data that will be used in class.
  • Usage Demo from GVW. NOTE: If you use the demo from off campus, you will have access to all datasets only if you have a username and password.
  • WordNet
  • LingPipe

Perl:

  • Perl Programming - example programs
  • Activeware - You can download Perl for Windows or Mac from this site.
  • Perl - Reference Guide
  • Introduction to Perl from Perl Training Australia
  • Perl RegEx Tutorial from University of Georgia
  • Perl Language Home Page
  • CPAN - Comprehensive Perl Archive Network
  • Perl Resources and Reviews
  • Perl Quick Reference Guide
  • PerlMonks - tutorials and other useful information
  • cgi-lib - Perl tools for CGI programming

Corpora:

  • Guidelines for formatting your corpus data
  • Legal aspects of compiling corpora - Thread on the Linguist List
    Note particularly this message containing a Statement on Use
  • LDC - The Linguistic Data Consortium - Georgetown is now a member.
  • Project Gutenberg - many free electronic texts, mostly older texts, but very useful
  • TRAINS - free corpus of speech transcriptions
  • MICASE - Michigan Corpus of Academic Spoken English
  • Ulrich Germann's Aligned Parallel French-English Corpus
  • EuroParl Corpus - 8-language parallel corpus
  • WHO Media Center - press releases in English and French
  • BNC - British National Corpus
  • ANC - American National Corpus
  • Leipzig Corpus
  • /usr/dict/words has a good list of words (lemmas)

Other:

  • Merriam-Webster Dictionary
  • Meta-dictionary
  • OneLook Meta-dictionary
  • Unicode Code Charts
  • Unicode Character Names
  • Easily Confused Words - at