Natural Language Processing Magdalena Wolska
Contents I. Introduction
Objectives
Related Fields 1. Linguistics
Literature Natural Language Processing:
Literature Top-tier natural language processing conferences:
Literature Other relevant natural language processing conferences:
Software Annotation software:
Software Algorithm collections:
Chapter NLP:I I. Introduction
Goals of Language Technology 1. Aid humans in writing.
Remarks:
Chapter NLP:I I. Introduction
Examples of NLP Systems Writing Aid: Spelling and Grammar Checking
Remarks:
Examples of NLP Systems Question Answering: IBM Watson at Jeopardy
Remarks:
Examples of NLP Systems Question Answering: IBM Watson at Jeopardy
Examples of NLP Systems Question Answering: Jeopardy Revisited
Chapter NLP:I I. Introduction
NLP Problems State of Affairs: Mostly Solved
NLP Problems State of Affairs: Making Good Progress
NLP Problems State of Affairs: Still Challenging
Remarks:
Chapter NLP:I I. Introduction
Challenges for NLP Systems Why is NLP hard?
Chapter NLP:II II. Corpus Linguistics
Empirical Research 1. Quantitative research based on numbers and statistics.
Empirical Research Research Questions
Empirical Research Empirical Research in NLP
Empirical Research Evaluation Measures
Empirical Research Effectiveness
Empirical Research Classification Effectiveness: Instance Types
Empirical Research Classification Effectiveness: Evaluation based on the Instance Types
Empirical Research Classification Effectiveness: Accuracy
Empirical Research Classification Effectiveness: Limitations of Accuracy
Empirical Research Classification Effectiveness: Precision and Recall
Empirical Research Classification Effectiveness: Precision and Recall Implications
Empirical Research Classification Effectiveness: Interplay between Precision and Recall
Empirical Research Classification Effectiveness: F1-Score
Empirical Research Classification Effectiveness: F1-Score Generalization
Empirical Research Classification Effectiveness: F1-Score Issue in Tasks with Boundary Detection
Empirical Research Classification Effectiveness: Other F1-Score Issues
Empirical Research Classification Effectiveness: Micro- and Macro-Averaging
Empirical Research Classification Effectiveness: Confusion Matrix for Micro- and Macro-Averaging
Empirical Research Classification Effectiveness: Computing Micro- and Macro-Averages
Empirical Research Regression Effectiveness
Empirical Research Regression Effectiveness: Types of Regression Errors
Empirical Research Regression Effectiveness: Computation
Empirical Research Other Measures
Empirical Research Experiments
Empirical Research Datasets
Empirical Research Types of Evaluation: Training, Validation, and Test Set
Empirical Research Types of Evaluation: Cross-Validation
Empirical Research Types of Evaluation: Variations
Empirical Research Training Data
Empirical Research Comparison
Empirical Research Comparison: Upper and Lower Bounds
Empirical Research Comparison: Types of Baselines
Empirical Research Comparison: Exemplary Baselines
Empirical Research Comparison: Implications
Chapter NLP:II II. Corpus Linguistics
Hypothesis Testing Statistics
Hypothesis Testing Statistics: Variables and Scales
Hypothesis Testing Descriptive Statistics
Hypothesis Testing Descriptive Statistics: Central Tendency and its Dispersion
Hypothesis Testing Descriptive Statistics: Normal Distribution
Hypothesis Testing Descriptive Statistics: Standard Scores
Hypothesis Testing Inferential Statistics
Hypothesis Testing Inferential Statistics: Hypotheses
Hypothesis Testing Four Steps of Hypothesis Testing
Hypothesis Testing Effect Size
Hypothesis Testing What Test to Choose
Hypothesis Testing Assumptions
Hypothesis Testing The Student’s t-Test
Hypothesis Testing One-Sample t-Test
Hypothesis Testing Dependent t-Test (aka paired-sample test)
Hypothesis Testing Independent t-Test
Hypothesis Testing The Student’s t-Test: What to do with the t-Score?
Hypothesis Testing Example: One-Tailed One-Sample t-Test
Chapter NLP:II II. Corpus Linguistics
Text Corpora Corpus Linguistics
Text Corpora Definition 1 (Text Corpus [Butler 2004])
Text Corpora Text as Data
Text Corpora Metadata
Text Corpora Research in Language Use
Text Corpora Vocabulary Growth: Heaps’ Law
Text Corpora Term Frequency: Zipf’s Law
Remarks:
Text Corpora Term Frequency: Zipf’s Law
Remarks:
Text Corpora Term Frequency: Zipf’s Law
Text Corpora n-grams
Text Corpora n-gram Corpora
Chapter NLP:II II. Corpus Linguistics
Data Acquisition Data Sources
Data Acquisition Newspapers
Data Acquisition Blogs and Forums
Data Acquisition Social network
Data Acquisition Other Sources
Data Acquisition On Representativeness
Data Acquisition Representative Data versus Balanced Data
Chapter NLP:II II. Corpus Linguistics
Data Annotation Definition 1 (Annotation)
Data Annotation Sources of Annotations
Data Annotation Automatic Annotation: Sources
Data Annotation Manual Annotation: Sources
Data Annotation Manual Annotations: Software
Data Annotation Crowdsourcing
Data Annotation Crowdsourcing: Platforms
Data Annotation Crowdsourcing: Issues
Data Annotation Crowdsourcing: Gamification
Data Annotation Annotation Tasks
Data Annotation Annotation Schemes
Data Annotation Annotation Schemes: Guidelines
Data Annotation Annotation Schemes: Disagreement
Data Annotation Annotator Agreement: Observed Agreement
Data Annotation Annotator Agreement: Cohen’s κ
Data Annotation Annotator Agreement: Fleiss’s κ
Remarks:
Data Annotation Non-technical Aspects
Chapter NLP:III III. Text Models
Text Preprocessing Overview
Text Preprocessing Preprocessing Pipeline
Remarks: Annotation is skipped when the annotations are not needed for further processing.
Text Preprocessing Token Normalization
Text Preprocessing Token Normalization: Regular Expressions
Text Preprocessing Token Normalization: Regular Expressions Summary
Text Preprocessing Tokenization
Text Preprocessing Tokenization: Special Cases
Remarks:
Text Preprocessing Tokenization: Approaches
Text Preprocessing Tokenization: Rule-based [Jurafsky and Martin, 2007] [Grefenstette, 1999]
Remarks:
Text Preprocessing Problems of Rule-based Tokenization
Text Preprocessing Tokenization: Byte-Pair Encoding
Text Preprocessing Tokenization: Byte-Pair Encoding Rule Finding
Remarks:
Text Preprocessing Tokenization: Token Removal
Chapter NLP:III III. Text Models
Text Representation Models of Representation
Text Representation Token Representations
Text Representation Document Representation
Text Representation Document Representation: Bag of Words Metaphor
Text Representation Document Representation: Vector Space Model [Salton et al. 1975]
Remarks: DTMs can become very large and very sparse (approx. 95% of elements are zero).
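The sparsity remark can be illustrated with a minimal sketch. It assumes whitespace tokenization and raw term counts; the helper names `build_dtm` and `sparsity` and the three toy documents are hypothetical and not taken from the lecture. On realistic corpora the share of zero entries is typically far higher than in this toy case.

```python
from collections import Counter


def build_dtm(documents):
    """Build a dense document-term matrix of raw counts.

    Returns (vocabulary, matrix), where matrix[i][j] is the count of
    vocabulary[j] in documents[i]. Tokenization is a plain lowercase
    whitespace split (an assumption made for this sketch).
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def sparsity(matrix):
    """Return the fraction of matrix cells that are zero."""
    cells = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / cells if cells else 0.0


# Hypothetical toy corpus with no shared vocabulary between documents.
docs = [
    "the cat sat on the mat",
    "dogs chase cats",
    "stock markets fell sharply",
]
vocabulary, dtm = build_dtm(docs)
print(len(docs), "documents x", len(vocabulary), "terms")
print("share of zero entries:", round(sparsity(dtm), 3))
```

Because most terms occur in only a few documents, the zero share grows with vocabulary size, which is why DTMs are usually stored in a sparse format in practice.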
Text Representation Term Weighting: tf·idf
Text Representation Term Weighting: tf·idf Example
Remarks:
Text Representation Vocabulary Pruning
Text Representation Distributional Representations of Words
Remarks: There are two other relevant hypotheses for distributional semantics.
Text Representation Distributional Representations of Words
Text Representation Co-occurrence Vectors
Remarks: Principal components are linearly orthogonal vectors and differ by direction and the amount of
Text Representation Word2Vec
Remarks:
Text Representation Properties of Word Vectors
Text Representation Sentence Embeddings
Chapter NLP:III III. Text Models
Text Similarity Text can be similar in different ways:
Text Similarity Similarity Measures
Text Similarity String-based Similarity: Hamming Distance
Text Similarity String-based Similarity: Levenshtein Distance
Text Similarity String-based Similarity
Remarks:
Text Similarity Resource-based Similarity: Thesaurus relations
Text Similarity Resource-based Similarity
Text Similarity Vector Distance
Text Similarity Vector Similarity: Cosine Similarity
Text Similarity Vector Similarity: Jaccard Similarity
Text Similarity Vector Similarity: Divergence
Text Similarity Vector Similarity
Remarks: Count vectors can be transformed into probability distributions (cf. Probability Mass
Similarity Measures Word Vector Similarity: Sentence Embeddings
Similarity Measures Word Vector Similarity: Word Mover Distance
Text Similarity Word Vector Similarity
Chapter NLP:III III. Text Models
Text Classification Text Classification Problems
Text Classification Classification Tasks
Remarks: Classification and Regression in NLP
Text Classification Classification Tasks: Classes C
Text Classification Classification Tasks: Objects O
Remarks: Many (non-neural) classification algorithms work for |C| = 2 classes only. Multi-class and
Text Classification Feature space X
Text Classification Feature Engineering
Text Classification Content Features
Text Classification Linguistic Structure Features
Text Classification Task-specific features (a selection)
Text Classification Feature Engineering
Text Classification Representation Learning
Text Classification Feature Space Size
Text Classification Common Classification Algorithms
Text Classification Evaluation
Text Classification Dataset Preparation
Text Classification Dataset Preparation: Negative Instances
Text Classification Dataset Preparation: Mapping of Target Variable Values
Text Classification Dataset Preparation: Balancing Datasets
Dataset Preparations Undersampling vs. Oversampling
Chapter NLP:IV IV. Text Models
Language Modeling Definition 1 (Language Model)
Remarks:
Language Modeling Applications
Language Modeling Language Model Estimation
Remarks:
Language Modeling Bi-gram Model
Language Modeling The N-gram Model
Remarks:
Language Modeling Improving the N-gram Model: Smoothing
Language Modeling Denominator Smoothing: Stupid Backoff
Language Modeling Denominator Smoothing: Linear Interpolation
Language Modeling Numerator Smoothing: Add-one (Laplace) smoothing
Language Modeling Numerator Smoothing: Good-Turing smoothing
Language Modeling Numerator Smoothing: Kneser-Ney Smoothing
Remarks:
Language Modeling Conditional Language Modeling
Remarks: Conditional language models are the basis for large language models (LLMs).
Chapter NLP:IV IV. Text Models
Large Language Models Neural Language Models
Remarks: RNN Notation: t indicates the timestep. The weight matrices wh for encoding and wo are the
Large Language Models Transformer Architecture Overview
Large Language Models Transformer Architecture Overview: Transformer Block
Large Language Models Transformer Architecture Overview: Self-attention
Large Language Models Transformer Architecture Overview: Multi-headed Self-attention
Remarks:
Large Language Models Transformer Language Models: Types
Large Language Models Pre-training and Fine-tuning
Large Language Models Autoregressive Large Language Models
Large Language Models LLM Fine-tuning: Instruction Tuning
Large Language Models Bidirectional Language Models
Large Language Models BERT Fine-tuning
Large Language Models Encoder-Decoder Language Models
Chapter NLP:IV IV. Text Models
Text Generation Autoregressive language models generate text by iteratively
Text Generation Decoding
Text Generation Scoring: Temperature
Text Generation Scoring: Top-k Sampling
Text Generation Scoring: Nucleus (Top-p) Sampling
Text Generation Strategy: Contrastive Search
Remarks:
Text Generation Strategy: Beam Search
Remarks:
Chapter NLP:V V. Words
Morphology Overview [Hancox 1996]
Morphology Stemming
Morphology Stemming: Principles [Frakes 1992]
Morphology Stemming: Affix Elimination
Morphology Stemming: Porter Stemmer
Morphology Remarks:
Morphology Stemming: Porter Stemmer
Morphology Stemming: Krovetz Stemmer
Morphology Stemming: Stemmer Comparison
Morphology Stemming: Character n-grams [McNamee et al. 2004] [McNamee et al. 2008]
Morphology Lemmatization
Chapter NLP:V V. Words
Word Classes Definition
Word Classes Traditional grammar
Word Classes Traditional grammar: Example
Remarks:
Word Classes Tagsets
Word Classes Penn Treebank tagset [upenn]
Word Classes Universal Dependencies tagset [UD]
Word Classes Ambiguities
Remarks:
Word Classes Part-of-Speech Tagging
Word Classes Part-of-Speech Tagging: Maximum Likelihood Estimate
Word Classes Part-of-Speech Tagging: Brill Tagger [Brill 1992]
Word Classes Part-of-Speech Tagging: Brill Tagger [Brill 1994]
Word Classes Part-of-Speech Tagging: Token Classification
Word Classes Token Classification
Remarks:
Word Classes Part-of-Speech Tagging
Remarks:
Chapter NLP:V V. Words
Named Entities Entities
Named Entities Named Entities
Remarks: Named entity tagsets vary by corpus and use case:
Named Entities Named Entity Recognition
Named Entities BIO Tagging
Remarks: Two popular variations of BIO are IO and BIOES.
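The relation between BIO and the two variants can be sketched as a pair of tag conversions. This is an illustrative sketch, not the lecture's own code: it assumes tags are written as `B-LABEL`, `I-LABEL`, and `O`, that the input sequence is well-formed BIO, and that BIOES adds `E-` (end of a multi-token entity) and `S-` (single-token entity). The function names and the example sentence are hypothetical.

```python
def bio_to_io(tags):
    """Collapse BIO to IO: every entity token becomes I-LABEL.

    IO cannot separate two adjacent entities of the same type,
    because the B- boundary marker is dropped.
    """
    return [tag if tag == "O" else "I-" + tag[2:] for tag in tags]


def bio_to_bioes(tags):
    """Expand BIO to BIOES by marking entity ends and single-token entities."""
    result = []
    for i, tag in enumerate(tags):
        if tag == "O":
            result.append(tag)
            continue
        prefix, label = tag[0], tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            result.append(("B-" if continues else "S-") + label)
        else:
            result.append(("I-" if continues else "E-") + label)
    return result


# Hypothetical example: "Angela Merkel visited Paris"
bio = ["B-PER", "I-PER", "O", "B-LOC"]
print(bio_to_io(bio))
print(bio_to_bioes(bio))
```

IO is the most compact scheme but loses entity boundaries, while BIOES gives a classifier explicit end and singleton signals at the cost of a larger tag set.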
Chapter NLP:VI VI. Syntax
Grammar Formalisms Problem: Given a set of symbols, how do they incur meaning?
Grammar Formalisms Grammars
Remarks:
Grammar Formalisms Syntax Structures
Grammar Formalisms Syntax Parsing
Grammar Formalisms Ambiguity
Chapter NLP:VI VI. Syntax
Phrase Structure Grammars Formal Grammars
Phrase Structure Grammars Chomsky Hierarchy
Remarks: Context-sensitive grammars allow multiple symbols on the left side (but at least one
Phrase Structure Grammars Context-free grammars (CFG)
Phrase Structure Grammars CFG: Example Grammar
Phrase Structure Grammars CFG Construction: Treebanks
Phrase Structure Grammars Constituency Parsing
Phrase Structure Grammars CFG Modifications for Parsing
Phrase Structure Grammars Probabilistic CFG
Phrase Structure Grammars Chomsky Normal Form
Phrase Structure Grammars CNF Transformation
Phrase Structure Grammars CNF Transformation: Replace Empty Rules
Phrase Structure Grammars CNF Transformation: Replace Unary Rules (1)
Phrase Structure Grammars CNF Transformation: Replace Unary Rules (2)
Phrase Structure Grammars CNF Transformation: Replace Unary Rules (3-7)
Phrase Structure Grammars CNF Transformation: Split n-ary rules with n ≥ 3
Phrase Structure Grammars Chomsky Normal Form Transformation: Pseudocode
Remarks: The original algorithm presented by Chomsky has 5 steps: START, TERM, BIN, DEL, and
Phrase Structure Grammars Cocke-Kasami-Younger (CKY) Parsing
Parsing based on a PCFG Cocke-Kasami-Younger (CKY) Parsing
Remarks: The binarization from the CNF is crucial for cubic time.
Phrase Structure Grammars CKY Parsing: Pseudo Code 1/2
Parsing based on a PCFG CKY Parsing: Pseudo Code 1/2
Phrase Structure Grammars CKY Parsing: Pseudo Code 2/2
Phrase Structure Grammars CKY Parsing: Example
Remarks:
Phrase Structure Grammars Lexicalization
Phrase Structure Grammars Lexicalized PCFG parsing [Collins, 1999]
Phrase Structure Grammars Unlexicalization [Klein and Manning, 2003]
Phrase Structure Grammars Linearized parsing [Vinyals, Kaiser, et al., 2015]
Remarks: Vinyals, Kaiser, et al. present linearization as “Grammar as a Foreign Language”.
Phrase Structure Grammars Evaluation [Sekine and Collins, evalb]
Remarks: Those evaluation measures were developed at the PARSEVAL Workshop in 1998 and are
Phrase Structure Grammars Evaluation: Comparison of Methods
Chapter NLP:VI VI. Syntax
Dependency Grammars Definition
Dependency Grammars Properties of Dependencies
Remarks: Dependencies often approximate semantic relationships. Knowing the head-dependent
Dependency Grammars Dependency Treebanks: Universal Dependencies [UD, 2021]
Dependency Grammars Universal Dependency Relations [de Marneffe et al., 2014]
Dependency Grammars Transition-based parsing [Nivre, 2008]
Dependency Grammars Arc-Standard Parsing
Dependency Grammars Arc-Standard Parsing: Oracles
Dependency Grammars Remarks:
Dependency Grammars Projectivity [McDonald et al., 2005]
Dependency Grammars Graph-based Parsing
Dependency Grammars Evaluation
Dependency Grammars Evaluation: Comparison of Methods
Chapter NLP:VII VII. Semantics
Semantic Structures Semantics
Remarks: Semantics stems from the ancient Greek semantikos (relating to signs as in symptoms of a
Semantic Structures Lexical Semantics [OxfordRE Linguistics]
Semantic Structures Lexical Semantics: Word Senses
Semantic Structures Lexical Semantics: Lexical Relations (selection)
Semantic Structures Lexical Semantics: WordNet
Remarks:
Semantic Structures Word Sense Disambiguation
Semantic Structures Word Sense Disambiguation: Lesk
Semantic Structures Word Sense Disambiguation: Classification
Remarks:
Semantic Structures Lexical Substitution
Semantic Structures Multi-Word Expressions
Semantic Structures Limitations of Lexical Semantics
Semantic Structures Compositional Semantics
Semantic Structures Compositional Semantics: Semantic Relations
Semantic Structures Compositional Semantics: Operators
Semantic Structures Compositional Semantics: Collocation
Remarks: A statistical approach to extract collocations from a corpus is co-occurrence significance on
Semantic Structures Compositional Semantics: Componential Analysis
Semantic Structures Frame Semantics: Semantic Roles