AI Glossary: Every Term You Need to Know
AI has its own language. If you've ever felt lost during a conversation about "fine-tuning LLMs" or "multimodal transformers", this guide is for you.
I'll explain each term in plain English with practical context.
Bookmark this. You'll come back to it.
Core AI Terms
Artificial Intelligence (AI)
Software that can perform tasks that typically require human intelligence. Not sentient, not conscious—just very good at specific tasks.
Machine Learning (ML)
A subset of AI where systems learn from data instead of being explicitly programmed. Feed it examples, it finds patterns.
Deep Learning
Machine learning using neural networks with many layers. "Deep" refers to the number of layers, not profundity.
Neural Network
A computing system inspired by the human brain. Nodes (like neurons) connected in layers, processing information by passing signals between them.
Algorithm
A set of rules or instructions that a computer follows to solve a problem. In AI, algorithms determine how models learn and make predictions.
Language Model Terms
Large Language Model (LLM)
An AI trained on massive amounts of text to understand and generate human language. ChatGPT, Claude, and Gemini are all LLMs.
GPT (Generative Pre-trained Transformer)
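OpenAI's family of language models, such as GPT-4 and GPT-5. "Generative" means it produces text, "pre-trained" means it first learned from a large body of text, and "transformer" is the underlying architecture.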
Transformer
The neural network architecture behind modern LLMs. Introduced in 2017, it revolutionized language AI by handling long-range dependencies in text.
Token
A chunk of text that AI processes. Could be a word, part of a word, or punctuation. "ChatGPT" might be one or two tokens. Models have token limits.
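To make the idea concrete, here is a minimal sketch of splitting text into tokens and counting them. The `toy_tokenize` function is a made-up stand-in: real LLM tokenizers split text into subword pieces using a learned vocabulary, so their counts will differ.

```python
import re

def toy_tokenize(text: str) -> list[str]:
    # Hypothetical toy rule: runs of letters/digits are one token,
    # and each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Models have token limits.")
print(tokens)
print(len(tokens), "tokens")
```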
Context Window
How much text an AI can "remember" in a single conversation. Measured in tokens. A larger context window lets the model take in longer documents and conversations at once.
Examples (approximate, and they vary by model version):
- GPT-4 Turbo: ~128K tokens
- Claude: ~200K tokens
- Gemini 1.5 Pro: up to ~2M tokens
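One practical consequence of a fixed window: when a conversation gets too long, something has to be dropped. Below is a rough sketch of one common approach, keeping only the most recent messages that fit. The function name is hypothetical, and counting words is a crude assumption standing in for a real token count.

```python
def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = len(message.split())     # assumption: 1 word ~ 1 token
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = ["hello there", "how are you today", "fine thanks"]
print(fit_to_window(history, max_tokens=6))
```

Here the oldest message is dropped because adding it would push the total past the budget.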
Prompt
The input you give to an AI. Your question, instruction, or context. Better prompts = better outputs.
Completion
The output an AI generates in response to your prompt. Also called a "response" or "generation."
Training Terms
Training Data
The information used to teach an AI model. For LLMs, this is typically billions of pages of text from the internet, books, articles, etc.
Parameters
The values an AI adjusts during training to improve its performance. More parameters generally means more capable (but also more expensive). OpenAI has not published GPT-4's parameter count; unconfirmed reports put it above a trillion.
Pre-training
The first phase of training where a model learns general language understanding from massive text datasets.
Fine-tuning
Taking a pre-trained model and training it further on specific data for a particular task. Like teaching a general assistant to specialize in legal documents.
RLHF (Reinforcement Learning from Human Feedback)
A training method where humans rate AI outputs, and the model learns to prefer higher-rated responses. Makes AI more helpful and less harmful.
Instruction Tuning
Training a model to follow instructions rather than just predict the next word. This is why ChatGPT can follow commands.
Common AI Behaviors
Hallucination
When an AI generates false information stated as fact. It's not lying—it's pattern-matching gone wrong. Always verify important claims.
Emergent Abilities
Capabilities that show up in larger models but were absent in smaller ones, such as solving multi-step arithmetic problems without task-specific training.
Zero-Shot Learning
AI performing a task from the prompt description alone, with no examples provided. "Translate this to French" works without showing the model any sample translations, because it already saw French text during pre-training.
Few-Shot Learning
Giving an AI a few examples of what you want before asking it to do the task. Often improves results dramatically.
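A few-shot prompt is just text with worked examples placed before the real question. Here is a minimal sketch; the `Review:` / `Sentiment:` layout is one arbitrary choice for illustration, not a required format.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Put labeled examples first, then the new item with its label left blank."""
    blocks = [f"Review: {text}\nSentiment: {label}\n" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n".join(blocks)

examples = [
    ("Loved it, would buy again.", "positive"),
    ("Broke after two days.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Does the job, nothing special.")
print(prompt)
```

The model sees the pattern in the first two entries and is nudged to complete the third the same way.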
Chain of Thought (CoT)
Asking an AI to "think step by step" before answering. Often produces more accurate results, especially for complex problems.
Technical Architecture
Encoder
Part of a transformer that processes input and creates internal representations. Good for understanding text.
Decoder
Part of a transformer that generates output based on representations. Good for creating text.
Attention Mechanism
How transformers weigh the relevant parts of the input when generating output. The famous 2017 paper "Attention Is All You Need" introduced the transformer, an architecture built almost entirely on attention.
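For the curious, the core calculation is small enough to sketch in plain Python. This is scaled dot-product attention for a single query, with hand-picked toy vectors; real models run it over large matrices with learned weights and many attention "heads".

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Score the query against each key, then blend the values by those scores."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

keys = [[1.0, 0.0], [0.0, 1.0]]      # toy vectors, chosen by hand
values = [[10.0, 0.0], [0.0, 10.0]]
weights, output = attend([1.0, 0.0], keys, values)
print(weights)  # the first key matches the query better, so it gets more weight
print(output)
```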
Embedding
Converting text into lists of numbers (vectors) that AI can process. Text with similar meanings ends up with vectors that are close together.
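"Close together" is usually measured with cosine similarity. The sketch below uses made-up three-number vectors purely for illustration; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written vectors, not output from a real model.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

print(cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"]))
print(cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"]))
```

The first score comes out much higher than the second, which is the whole point: related words sit near each other.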
Vector Database
A database optimized for storing and searching embeddings. Essential for many AI applications.
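At its simplest, a vector database stores (id, vector) pairs and returns the ids closest to a query vector. The `ToyVectorStore` class below is a hypothetical in-memory sketch that compares against every stored item; production systems use approximate indexes to stay fast at scale.

```python
import math

class ToyVectorStore:
    """Hypothetical brute-force store, for illustration only."""

    def __init__(self):
        self.items = []  # list of (item_id, vector) pairs

    def add(self, item_id: str, vector: list[float]) -> None:
        self.items.append((item_id, vector))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    def search(self, query: list[float], top_k: int = 1) -> list[str]:
        # Rank every stored vector by similarity to the query.
        ranked = sorted(self.items, key=lambda item: self._cosine(query, item[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:top_k]]

store = ToyVectorStore()
store.add("doc-pets", [0.9, 0.1])
store.add("doc-billing", [0.1, 0.9])
print(store.search([1.0, 0.0]))
```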
AI Application Terms
Chatbot
An AI that converses with users through a chat interface. ChatGPT is a chatbot powered by an LLM.
AI Agent
An AI that can take actions autonomously, not just respond. Can use tools, make decisions, complete multi-step tasks.
RAG (Retrieval-Augmented Generation)
Combining an LLM with external information retrieval. The AI searches a database, then uses what it finds to generate responses. How AI assistants "know" about your documents.
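The retrieve-then-generate flow can be sketched in a few lines. This toy version picks documents by shared words, which is an assumption made to keep the example self-contained; real RAG systems usually retrieve with embeddings, and the final prompt would be sent to an LLM rather than printed.

```python
def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    """Paste the retrieved text into the prompt ahead of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

documents = [
    "Refunds are processed within 14 days.",
    "Our office is closed on public holidays.",
]
print(build_rag_prompt("how fast are refunds processed", documents))
```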
Multimodal
AI that can process multiple types of input: text, images, audio, video. GPT-4V and Gemini are multimodal.
API (Application Programming Interface)
A way for programs to talk to each other. The "ChatGPT API" lets developers build apps that use ChatGPT's capabilities.
Output & Generation
Temperature
A setting that controls randomness in AI output. Low temperature = more predictable, focused. High temperature = more creative, varied.
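Under the hood, temperature divides the model's raw scores (logits) before they are turned into probabilities. A minimal sketch with made-up logits for three candidate tokens:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    scaled = [value / temperature for value in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 0.2))  # low: top token dominates
print(softmax_with_temperature(logits, 1.5))  # high: probabilities even out
```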
Top-P (Nucleus Sampling)
Another randomness control. Limits the AI to the smallest set of most likely next tokens whose combined probability reaches P (for example, 0.9), and ignores everything else. Often used with temperature.
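A sketch of the filtering step, using invented probabilities for four candidate tokens. The function name is hypothetical; the idea is to keep the most likely tokens until their combined probability reaches the threshold, then rescale what is left.

```python
def nucleus_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their combined probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda pair: pair[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}  # rescale so they sum to 1

probs = {"the": 0.5, "a": 0.3, "zebra": 0.15, "qux": 0.05}  # made-up values
print(nucleus_filter(probs, top_p=0.75))
```

With a threshold of 0.75, only "the" and "a" survive; the unlikely tail is never sampled.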
Max Tokens
A limit on how long an AI response can be. Prevents runaway generation and controls costs.
Stop Sequences
Characters or phrases that tell the AI to stop generating. Useful for controlling output format.
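The effect of a stop sequence can be imitated by cutting text at its first occurrence. This is only a sketch of the visible result: with a real API the model stops generating at that point, rather than having its output trimmed afterwards.

```python
def apply_stop_sequences(text: str, stop_sequences: list[str]) -> str:
    """Cut the text at the earliest stop sequence found, if any."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

raw = "Answer: 42\nQuestion: what comes next?"
print(apply_stop_sequences(raw, ["\nQuestion:"]))
```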
Streaming
Receiving AI output piece by piece (token by token) as it's generated, rather than waiting for the complete response. How ChatGPT shows words appearing in real-time.
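In code, streaming usually looks like looping over chunks as they arrive. The `fake_stream` generator below is a hypothetical stand-in for a model; it yields whole words, where a real API would yield tokens over a network connection.

```python
from typing import Iterator

def fake_stream(text: str) -> Iterator[str]:
    """Pretend model: hands back one word at a time instead of the full text."""
    for word in text.split(" "):
        yield word + " "

# Print each chunk the moment it arrives, without waiting for the rest.
for chunk in fake_stream("Hello from the model"):
    print(chunk, end="", flush=True)
print()
```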
Evaluation Terms
Benchmark
A standardized test for measuring AI performance. Common ones: MMLU (general knowledge), HumanEval (coding), MT-Bench (conversation).
Accuracy
How often an AI gives the correct answer. Simple but imperfect metric.
Perplexity
A measure of how "surprised" a model is by the text it is asked to predict. Lower perplexity = the model assigned higher probability to the words that actually came next.
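Perplexity is the exponential of the average negative log-probability the model gave to each actual token. A minimal sketch, using made-up probabilities in place of real model output:

```python
import math

def perplexity(token_probs: list[float]) -> float:
    """token_probs: the probability the model gave to each token that actually occurred."""
    average_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(average_nll)

confident = [0.9, 0.8, 0.95]  # hypothetical: model predicted the text well
unsure = [0.2, 0.1, 0.25]     # hypothetical: model was often surprised
print(perplexity(confident))
print(perplexity(unsure))
```

A handy way to read the number: a perplexity of 4 means the model was, on average, as uncertain as if it were choosing evenly among 4 options.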
BLEU/ROUGE
Metrics that score AI-generated text by how much its wording overlaps with human-written reference text. BLEU is mostly used for translations, ROUGE for summaries.
Safety & Ethics Terms
Alignment
Making AI behave according to human values and intentions. The challenge of getting AI to do what we actually want.
Jailbreaking
Tricking an AI into ignoring its safety guidelines. Usually involves clever prompting.
Constitutional AI
Anthropic's approach to training AI with a set of principles (a "constitution") it tries to follow.
Red Teaming
Testing AI by actively trying to make it fail or behave badly. Helps identify vulnerabilities before deployment.
Bias
When AI reflects or amplifies prejudices from its training data. A major concern for fairness.
Deployment Terms
Inference
Running a trained model to generate outputs. This is what happens when you chat with ChatGPT.
Latency
How long it takes for AI to respond. Lower latency = faster responses.
Edge AI
Running AI on local devices (phones, computers) rather than in the cloud. No network round trip and more privacy, but limited by device capabilities.
On-Premise
Running AI on your own servers rather than using a cloud service. More control, more responsibility.
SaaS (Software as a Service)
Software accessed over the internet, usually by subscription, rather than installed and run yourself. ChatGPT Plus is an example of AI delivered as SaaS.
Business & Pricing
Per-Token Pricing
Paying based on how much text you process. $X per million tokens. Standard for API access.
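The arithmetic is simple enough to sketch. The prices below are placeholders, not any provider's real rates, and the split into separate input and output prices is an assumption (common, but not universal).

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimated cost in dollars for one batch of usage."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Placeholder prices: $2 per million input tokens, $8 per million output tokens.
cost = estimate_cost(200_000, 50_000, 2.0, 8.0)
print(f"${cost:.2f}")
```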
Rate Limiting
Restrictions on how many requests you can make in a given time period.
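One way a service might enforce such a limit is a sliding window: remember when recent requests arrived and refuse new ones once the window is full. The class below is a hypothetical sketch of that idea, not how any particular provider implements it.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # arrival times of recently allowed requests

    def allow(self, now: float) -> bool:
        # Forget requests that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
for second in [0, 1, 2, 61]:
    print(second, limiter.allow(second))
```

The third request is refused because two were already made in the last minute; by second 61 the first one has expired, so requests are accepted again.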
Freemium
Free basic access with paid upgrades. How most consumer AI tools operate.
Acronyms Quick Reference
| Acronym | Meaning |
|---|---|
| AI | Artificial Intelligence |
| ML | Machine Learning |
| DL | Deep Learning |
| LLM | Large Language Model |
| NLP | Natural Language Processing |
| NLU | Natural Language Understanding |
| GPT | Generative Pre-trained Transformer |
| RAG | Retrieval-Augmented Generation |
| RLHF | Reinforcement Learning from Human Feedback |
| CoT | Chain of Thought |
| API | Application Programming Interface |
Still Confused?
If you encounter a term not on this list, drop it in any AI chatbot:
"Explain [term] in simple language. I have no technical background."
That's what they're for.
I'll keep updating this glossary as new terms emerge. Bookmark and come back.
