Ch 10: Natural Language Processing Basics - Intermediate

Track: Practitioner

Read online or run locally

To run the code interactively, clone the repo and open chapters/chapter-10-natural-language-processing-basics/notebooks/02_nlp_classification.ipynb in Jupyter.

Chapter 10: NLP Basics — Notebook 02 (Classification & NER)

This notebook covers deep learning for text (CNN/LSTM), multi-class text classification, named entity recognition with spaCy, text similarity and clustering, and common NLP pitfalls.

What you'll learn

Topic	Section
Deep learning for NLP (RNNs, LSTMs)	§1–2
Multi-class text classification	§3
Named Entity Recognition (NER) with spaCy	§4–5
Text similarity and clustering	§6
Full pipeline and common pitfalls	§7–8

Time estimate: 2.5–3 hours

Key concepts

LSTMs — Process sequences step-by-step and capture context for classification or tagging.
NER — Extract entities (person, organization, location, date) from text using spaCy or custom taggers.
Text similarity — TF-IDF or embedding-based cosine similarity; cluster documents with K-Means.
Pipeline — Combine sentiment, NER, and classification in one end-to-end flow.
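
To make the text similarity concept above concrete, here is a minimal, dependency-free sketch of TF-IDF weighting followed by cosine similarity. It is an illustration under stated assumptions, not the notebook's own code: the function names (`tfidf_vectors`, `cosine`), the whitespace tokenizer, the smoothed IDF formula, and the three sample sentences are all hypothetical choices made for this example. The notebook itself may rely on a library vectorizer instead.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (assumed scheme: tf * smoothed idf)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption for brevity).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Made-up sample documents: two about a cat, one about finance.
docs = [
    "the cat sat on the mat",
    "a cat lay on the mat",
    "stock markets fell sharply today",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # documents sharing several terms
sim_unrelated = cosine(vecs[0], vecs[2])  # documents sharing no terms
print(f"related: {sim_related:.3f}, unrelated: {sim_unrelated:.3f}")
```

The two cat sentences share several terms and so score well above the finance sentence, which shares none with them. Clustering with K-Means operates on the same kind of document vectors, grouping documents whose vectors lie close together.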

Run the full notebook for code and outputs.

Generated by Berta AI