Prerequisites
- Data Structures and Compiler Design: A foundational understanding of data structures and parsing techniques is essential for grasping the algorithms and methodologies used in NLP.
Course Objectives
The objectives of this course are:
- Introduction to NLP Problems and Solutions:
- Explore key challenges in NLP and their relation to linguistics and statistics.
- Model Linguistic Phenomena:
- Develop sensitivity to linguistic phenomena and model them using formal grammars.
- Experimental Methodology:
- Understand and apply proper experimental methodology for training and evaluating empirical NLP systems.
- Statistical Models:
- Manipulate probabilities, construct statistical models over strings and trees, and estimate parameters using supervised and unsupervised methods.
- Design and Analysis:
- Design, implement, and analyze NLP algorithms and language modeling techniques.
Course Outcomes
Upon completion of this course, students will be able to:
- Linguistic Sensitivity:
- Recognize linguistic phenomena and model them with formal grammars.
- Experimental Skills:
- Conduct experiments for training and evaluating NLP systems.
- Statistical Modeling:
- Construct and estimate statistical models for strings and trees using supervised/unsupervised methods.
- Algorithm Design:
- Design, implement, and analyze NLP algorithms.
- Language Modeling:
- Apply various language modeling techniques such as N-gram models, class-based models, and cross-lingual models.
Syllabus
UNIT - I: Finding the Structure of Words
- Words and Their Components:
- Morphology: Study of word structure and components.
- Issues and Challenges:
- Ambiguity, complexity, and variability in word formation.
- Morphological Models:
- Techniques for analyzing and processing morphological structures.
- Finding the Structure of Documents:
- Introduction to document structure analysis.
- Methods for identifying structural features in documents.
- Complexity and performance evaluation of approaches.
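As a rough illustration of the morphological analysis and ambiguity topics in this unit, the sketch below strips suffixes using a handful of hand-written rules. The rule table `SUFFIX_RULES` and the function `analyze` are hypothetical names invented for this example; this is a toy under stated assumptions, not one of the morphological models covered in the textbook.

```python
# Hypothetical suffix rules for illustration only: (suffix, replacement, feature).
# A real morphological model would use a lexicon and far richer rules.
SUFFIX_RULES = [
    ("ies", "y", "PLURAL"),
    ("ing", "", "PROGRESSIVE"),
    ("ed", "", "PAST"),
    ("s", "", "PLURAL"),
]


def analyze(word):
    """Return every (stem, feature) analysis licensed by the toy rules."""
    analyses = [(word, "BASE")]  # the word may already be a bare stem
    for suffix, replacement, feature in SUFFIX_RULES:
        # Require at least two characters of stem to avoid analyses like "is" -> "i".
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            stem = word[: -len(suffix)] + replacement
            analyses.append((stem, feature))
    return analyses


if __name__ == "__main__":
    for w in ["walked", "parties"]:
        print(w, analyze(w))
```

Because the rules are applied without a lexicon, a word such as "parties" receives more than one candidate analysis, which is a small instance of the ambiguity listed under Issues and Challenges.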
UNIT - II: Syntax I
- Parsing Natural Language:
- Techniques for analyzing syntactic structure in sentences.
- Treebanks:
- Data-driven approaches to syntax using annotated corpora.
- Representation of Syntactic Structure:
- Tree-based representations of sentence structure.
- Parsing Algorithms:
- Algorithms for syntactic parsing (e.g., CKY, Earley).
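To make the CKY algorithm named above concrete, here is a minimal recognizer sketch. It assumes a grammar already in Chomsky normal form; the toy grammar (`BINARY_RULES`, `LEXICON`) and the function name `cky_recognize` are invented for this illustration and do not come from the course materials.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}


def cky_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[(i, j)] holds the nonterminals that derive words[i:j].
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    # Fill longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start in chart[(0, n)]


if __name__ == "__main__":
    print(cky_recognize("the dog chased the cat".split()))
    print(cky_recognize("chased the dog".split()))
```

The three nested loops over span length, start position, and split point give the cubic running time in sentence length that CKY is known for. This sketch only answers yes or no; recovering parse trees would additionally require storing backpointers in each chart cell.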
UNIT - III: Syntax II
- Models for Ambiguity Resolution in Parsing:
- Techniques for resolving syntactic ambiguity.
- Multilingual Issues:
- Challenges in processing multiple languages and cross-lingual parsing.
- Semantic Parsing I:
- Introduction to semantic interpretation.
- System paradigms for semantic parsing.
- Word sense disambiguation.
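One classic baseline for word sense disambiguation is gloss overlap in the style of the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the context. The sketch below assumes a tiny hand-made sense inventory (`SENSES`) and stopword list (`STOPWORDS`), both hypothetical; it is not presented as the method the course prescribes.

```python
# Hypothetical mini sense inventory; a real system would use a lexical resource.
SENSES = {
    "bank": {
        "bank_financial": "an institution that accepts deposits and lends money",
        "bank_river": "the sloping land beside a river or stream",
    }
}
# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "the", "that", "or", "of", "to", "in", "on", "by"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context):
    """Return the sense whose gloss overlaps most with the context, or None."""
    context_tokens = tokens(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES.get(word, {}).items():
        overlap = len(context_tokens & tokens(gloss))
        if overlap > best_overlap:  # ties keep the first sense listed
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "she sat on the bank of the river"))
    print(simplified_lesk("bank", "he went to the bank to deposit money"))
```

Exact word matching is a deliberate simplification: "deposit" in the context does not match "deposits" in the gloss, which shows why practical systems normally add stemming or lemmatization before counting overlap.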
UNIT - IV: Semantic Parsing II
- Predicate-Argument Structure:
- Representation of relationships between predicates and arguments.
- Meaning Representation Systems:
- Formal systems for representing meaning in natural language.
UNIT - V: Language Modeling
- Introduction to Language Modeling:
- Overview of language models and their applications.
- N-Gram Models:
- Statistical models based on sequences of words.
- Language Model Evaluation:
- Metrics for evaluating language model performance.
- Bayesian Parameter Estimation:
- Bayesian methods for estimating model parameters.
- Language Model Adaptation:
- Techniques for adapting models to specific domains or languages.
- Advanced Models:
- Class-based, variable-length, Bayesian topic-based, multilingual, and cross-lingual models.
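The N-gram and evaluation topics in this unit can be sketched with a bigram model that uses add-one (Laplace) smoothing and is scored by perplexity. The function names (`train_bigram`, `prob`, `perplexity`) and the two-sentence corpus are assumptions made for this example, and the sketch assumes every evaluated word appears in the training vocabulary.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenized sentences."""
    history_counts, bigram_counts = Counter(), Counter()
    vocab = {"</s>"}  # tokens the model can predict
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        toks = ["<s>"] + words + ["</s>"]
        history_counts.update(toks[:-1])  # every token that has a successor
        bigram_counts.update(zip(toks, toks[1:]))
    return history_counts, bigram_counts, vocab


def prob(word, prev, model):
    """Add-one smoothed estimate of P(word | prev)."""
    history_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, word)] + 1) / (history_counts[prev] + len(vocab))


def perplexity(sentences, model):
    """Exponentiated average negative log-probability per predicted token."""
    log_prob, n = 0.0, 0
    for sentence in sentences:
        toks = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(toks, toks[1:]):
            log_prob += math.log(prob(word, prev, model))
            n += 1
    return math.exp(-log_prob / n)


if __name__ == "__main__":
    model = train_bigram(["the dog barks", "the cat sleeps"])
    print(prob("dog", "the", model))
    print(perplexity(["the dog barks"], model))
    print(perplexity(["barks dog the"], model))
```

Lower perplexity means the model found the text less surprising, so a sentence seen in training scores lower than the same words in an unseen order. Add-one smoothing is used here only because it is the simplest scheme; it corresponds to a uniform Dirichlet prior in the Bayesian view of parameter estimation.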
Textbooks
- “Multilingual Natural Language Processing Applications: From Theory to Practice”
- Authors: Daniel M. Bikel, Imed Zitouni
- Publisher: Pearson
Reference Books
- “Speech and Language Processing”
- Authors: Daniel Jurafsky, James H. Martin
- Publisher: Pearson
- “Natural Language Processing and Information Retrieval”
- Authors: Tanveer Siddiqui, U.S. Tiwary