
Ch 10: Natural Language Processing Basics - Intermediate

Track: Practitioner

Read online or run locally

To run the code interactively, clone the repo and open chapters/chapter-10-natural-language-processing-basics/notebooks/02_nlp_classification.ipynb in Jupyter.


Chapter 10: NLP Basics — Notebook 02 (Classification & NER)

This notebook covers deep learning for text (RNNs and LSTMs), multi-class text classification, named entity recognition with spaCy, text similarity and clustering, and common NLP pitfalls.

What you'll learn

| Topic | Section |
| --- | --- |
| Deep learning for NLP (RNNs, LSTMs) | §1–2 |
| Multi-class text classification | §3 |
| Named Entity Recognition (NER) with spaCy | §4–5 |
| Text similarity and clustering | §6 |
| Full pipeline and common pitfalls | §7–8 |

Time estimate: 2.5–3 hours


Key concepts

  • LSTMs — Process sequences step-by-step and capture context for classification or tagging.
  • NER — Extract entities (person, organization, location, date) from text using spaCy or custom taggers.
  • Text similarity — TF-IDF or embedding-based cosine similarity; cluster documents with K-Means.
  • Pipeline — Combine sentiment, NER, and classification in one end-to-end flow.

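To make the "step-by-step" idea concrete before opening the notebook, here is a minimal sketch of a single LSTM cell in plain NumPy. The weight shapes, gate ordering, and toy dimensions are illustrative assumptions, not the notebook's code; in practice you would use a framework layer rather than write this by hand.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,).
    Assumed gate order: input, forget, output, candidate.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])        # input gate: how much new info to write
    f = sigmoid(z[H:2*H])      # forget gate: how much old state to keep
    o = sigmoid(z[2*H:3*H])    # output gate: how much state to expose
    g = np.tanh(z[3*H:4*H])    # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state (fed to the classifier head)
    return h, c

# Toy run: a 5-step sequence with D=3 input features and H=4 hidden units
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4*H, D))
U = rng.normal(scale=0.1, size=(4*H, H))
b = np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

The final hidden state `h` summarizes the whole sequence; for classification you would feed it into a dense softmax layer.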
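The NER sections use spaCy; a minimal sketch looks like the following. It assumes spaCy and its small English model `en_core_web_sm` are installed, and the example sentence is made up for illustration.

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple opened a new office in Berlin on 4 July 2023.")

# Each entity span carries its text and a label such as ORG, GPE, or DATE
for ent in doc.ents:
    print(ent.text, ent.label_)
```

Exact labels depend on the model version, so treat the output as indicative rather than fixed.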
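For the similarity and clustering bullet, a short scikit-learn sketch shows both ideas together: TF-IDF vectors compared with cosine similarity, then grouped with K-Means. The four sample documents and the choice of two clusters are assumptions for the demo.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans

docs = [
    "the cat sat on the mat",
    "a cat and a dog played",
    "stock prices fell sharply today",
    "the market closed lower on earnings",
]

# TF-IDF turns each document into a sparse weighted term vector
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

# Pairwise cosine similarity: 1.0 on the diagonal, higher for shared vocabulary
sim = cosine_similarity(X)
print(sim.round(2))

# Cluster the documents into 2 groups on the same vectors
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(labels)
```

Documents that share terms (the two pet sentences) score higher similarity than unrelated pairs, which is what K-Means exploits when forming clusters.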
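The pipeline bullet can be sketched as a function that runs several components over one text and bundles the results. The three components below are deliberately crude stand-ins (a tiny lexicon, a capitalization heuristic, a keyword check) so the structure is visible; in the notebook each slot would hold a trained sentiment model, a spaCy NER pipeline, and a real classifier.

```python
from dataclasses import dataclass, field

def sentiment(text):
    # Hypothetical lexicon-based stand-in for a trained sentiment model
    positive = {"great", "good", "love"}
    negative = {"bad", "poor", "hate"}
    tokens = text.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def extract_entities(text):
    # Naive stand-in for NER: capitalized tokens (a real pipeline uses spaCy)
    return [t.strip(".,") for t in text.split() if t[:1].isupper()]

def classify_topic(text):
    # Stand-in classifier keyed on a single keyword
    return "finance" if "market" in text.lower() else "general"

@dataclass
class NLPResult:
    sentiment: str
    entities: list = field(default_factory=list)
    topic: str = "general"

def analyze(text):
    """Run sentiment, NER, and classification in one end-to-end flow."""
    return NLPResult(sentiment(text), extract_entities(text), classify_topic(text))

result = analyze("The market had a great day, says Alice.")
print(result)
```

The point is the shape of the flow, not the components: each stage can be swapped for a stronger model without changing the `analyze` interface.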
Run the full notebook for code and outputs.


Generated by Berta AI