Ch 10: Natural Language Processing Basics - Intermediate
Track: Practitioner
Read online or run locally
To run the code interactively, clone the repo and open chapters/chapter-10-natural-language-processing-basics/notebooks/02_nlp_classification.ipynb in Jupyter.
Chapter 10: NLP Basics - Notebook 02 (Classification & NER)
This notebook covers deep learning for text (CNN/LSTM), multi-class text classification, named entity recognition with spaCy, text similarity and clustering, and common NLP pitfalls.
What you'll learn
| Topic | Section |
|---|---|
| Deep learning for NLP (RNNs, LSTMs) | §1–2 |
| Multi-class text classification | §3 |
| Named Entity Recognition (NER) with spaCy | §4–5 |
| Text similarity and clustering | §6 |
| Full pipeline and common pitfalls | §7–8 |
Time estimate: 2.5–3 hours
Key concepts
- LSTMs — Process sequences step-by-step and capture context for classification or tagging.
- NER — Extract entities (person, organization, location, date) from text using spaCy or custom taggers.
- Text similarity — TF-IDF or embedding-based cosine similarity; cluster documents with K-Means.
- Pipeline — Combine sentiment, NER, and classification in one end-to-end flow.
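To make the text similarity concept above concrete, here is a minimal sketch of TF-IDF weighting followed by cosine similarity, written with the standard library only. It is not the notebook's own code: the helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the whitespace tokenizer, the smoothed IDF formula, and the three sample sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document.

    Assumed weighting: relative term frequency times a smoothed IDF,
    idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, not taken from the notebook.
docs = [
    "the cat sat on the mat",
    "a cat lay on the rug",
    "stock prices fell sharply today",
]
vectors = tfidf_vectors(docs)
sim_cats = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"cat sentences: {sim_cats:.3f}")
print(f"cat vs. stocks: {sim_unrelated:.3f}")
```

The two cat sentences share terms and score above zero, while the first and third sentences share none and score exactly zero. The same vectors, converted to dense arrays, are the kind of input a K-Means clusterer would group by topic.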
Run the full notebook for code and outputs.
Generated by Berta AI