Natural Language Processing (CM3060)
This module provides a grounding in both rule-based and statistical approaches to NLP, combining theoretical study with hands-on work in widely used software packages. It focuses on text processing: you will learn how to work with text-based natural language in your programs, how grammars can be used to analyse text, and how statistical analysis can be used to extract information from text and to classify it. You will work in a programming environment suited to NLP, using libraries to implement NLP workflows.
Professor(s)
- Dr. Tony Russell-Rose
Topics covered
- History of NLP.
- Information retrieval and curation in NLP.
- Curated corpora and raw data sources.
- Formal grammars.
- Rule-based NLP.
- Statistical NLP.
- NER (Named Entity Recognition).
- Readers, stemmers, taggers and parsers.
- Software packages for NLP.
- Applications of NLP.
Assessment
One two-hour unseen written examination and coursework (Type I)
Syllabus
Primary programming language
Python
Resources
Notes
Mock exams
Textbooks used in the module
- Bird, Steven, Ewan Klein, and Edward Loper. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc., 2009. https://www.nltk.org/book/
- Jurafsky, Dan, and James H. Martin. Speech and Language Processing (3rd ed. draft). 2019. https://web.stanford.edu/~jurafsky/slp3/
- Perkins, Jacob. Python 3 text processing with NLTK 3 cookbook. Packt Publishing Ltd, 2014. https://www.packtpub.com/product/python-3-text-processing-with-nltk-3-cookbook/9781782167853
- Antić, Zhenya. Python Natural Language Processing Cookbook: Over 50 recipes to understand, analyze, and generate text for implementing language processing tasks. Packt Publishing Ltd, 2021. ISBN 1838987789, 9781838987787. https://www.packtpub.com/product/python-natural-language-processing-cookbook/9781838987312
- Provost, Foster, and Tom Fawcett. Data Science for Business: What you need to know about data mining and data-analytic thinking. O’Reilly Media, Inc., 2013. https://www.oreilly.com/library/view/data-science-for/9781449374273/
- Schütze, Hinrich, Christopher D. Manning, and Prabhakar Raghavan. Introduction to information retrieval. Vol. 39. Cambridge: Cambridge University Press, 2008. https://nlp.stanford.edu/IR-book/information-retrieval-book.html
- Hovy, Dirk. Text Analysis in Python for Social Scientists: Discovery and Exploration. Cambridge University Press, 2020. https://www.cambridge.org/core/elements/abs/text-analysis-in-python-for-social-scientists/BFAB0A3604C7E29F6198EA2F7941DFF3
YouTube
- Natural Language Processing with Dan Jurafsky and Chris Manning, 2012 - “This 2012 lecture series from Stanford professors Dan Jurafsky and Chris Manning covers fundamental algorithms and mathematical models for processing natural language, and how these can be used to solve practical problems.”