Glossary

Applied Natural Language Processing (NLP)

Applied Natural Language Processing (NLP) uses computational techniques to analyze and work with human language. Common tasks include:

  • extracting information from text
  • classifying documents
  • identifying topics or sentiment
  • summarizing or transforming text
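
As a minimal sketch of the first task, the snippet below pulls the most frequent words out of a short string using only the Python standard library. The function name `top_words`, the sample sentence, and the simple regular-expression tokenizer are assumptions made for illustration; real projects usually rely on a dedicated NLP library.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent lowercase word tokens in text."""
    # Very simple tokenizer (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(n)


sample = "The cat sat on the mat. The mat was warm."
print(top_words(sample, 2))  # [('the', 3), ('mat', 2)]
```

Counting tokens is about the simplest form of information extraction; classification, topic detection, and summarization build on the same tokenize-then-analyze pattern.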

Web Mining

Web mining is the practice of collecting and analyzing data from the web.

Typical activities include:

  • retrieving web pages or querying web APIs
  • extracting structured information from HTML
  • analyzing links, text, or metadata
  • building datasets for further analysis
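
The extraction step can be sketched with the standard library's `html.parser` module. The class name `LinkExtractor` and the inline HTML string are hypothetical; in practice the HTML would come from a retrieved page, and many projects use a third-party parser instead.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content, hard-coded so the example needs no network access.
html = '<p><a href="/about">About</a> <a href="https://example.com">Example</a></p>'

parser = LinkExtractor()
parser.feed(html)
print(parser.links)  # ['/about', 'https://example.com']
```

The resulting list of links is a small dataset in its own right and could be saved or analyzed further.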

Project

A structured set of files and folders that work together to run code, store data, and produce outputs.

Repository (repo)

A project folder under version control (often hosted on GitHub), so that changes to its files are tracked over time.

Working Directory

The folder your terminal (or running program) is currently operating in. Relative paths are resolved against it.

Path

The address of a file or folder (example: data/input.csv).
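
To see how a path relates to the working directory, here is a short sketch using Python's `pathlib`. It reuses the example path `data/input.csv`; the file does not need to exist, because the code only builds and inspects the path.

```python
from pathlib import Path

# The working directory of the current process.
cwd = Path.cwd()

# A relative path: it has no meaning until combined with a starting folder.
relative = Path("data/input.csv")

# Joining the two gives the absolute location the program would look in.
absolute = cwd / relative

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True
print(absolute.name)           # input.csv
```

Running the same script from a different folder changes `cwd`, and therefore changes where `data/input.csv` points.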

Dataset

A collection of data records used by a program.

Artifact

A file produced by the program (for example, results or reports).
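
As an illustration, the sketch below turns a tiny in-memory dataset into a CSV artifact. The records, the file name `word_counts.csv`, and the use of a temporary output folder are all assumptions chosen to keep the example self-contained.

```python
import csv
import tempfile
from pathlib import Path

# A tiny dataset: each record is one row (values are made up for illustration).
records = [
    {"word": "the", "count": 3},
    {"word": "mat", "count": 2},
]

# Write the artifact into a temporary folder so nothing in the project is touched.
out_dir = Path(tempfile.mkdtemp())
artifact = out_dir / "word_counts.csv"

with artifact.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["word", "count"])
    writer.writeheader()
    writer.writerows(records)

print(artifact.exists())  # True
```

In a real project the artifact would more likely be written to a dedicated output folder inside the project.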

Logging

Messages written by the program that record what it is doing.
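
A minimal sketch with Python's built-in `logging` module shows what such messages can look like. The logger name `pipeline`, the message text, and the format string are assumptions for illustration.

```python
import logging

# Configure the root logger once, near the start of the program.
logging.basicConfig(
    level=logging.INFO,
    format="%(levelname)s %(name)s: %(message)s",
)

# A named logger makes it clear which part of the program wrote a message.
logger = logging.getLogger("pipeline")
logger.setLevel(logging.INFO)

record_count = 2  # hypothetical value
logger.info("Loaded %d records", record_count)
logger.warning("No output folder given, using the default")
```

Unlike `print`, log messages carry a severity level, so routine progress messages can be filtered out while warnings and errors stay visible.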

README.md

The front-page document that explains what a project is and how to run it.