📖 Interview Prep

Quick Reference

Core AI concepts explained. How RAG works. What agents do. Why context matters. Read this in 5 minutes before an interview or meeting.

How to use this: Skim the concept names. If one comes up in conversation, you've got the explanation. Print it, bookmark it, or read it right before the call. This is what you actually need to know.

1. Fundamentals
How AI Works (The Pipeline)
Input → Processing → Output. The basic flow.
The Flow
You send text (a prompt) → the AI processes it (predicts what comes next based on patterns it learned) → returns text. It's not thinking in the human sense. It's predicting a statistically likely next token (roughly a word), then the next, then the next. Do that well enough, at scale, and the output looks like thinking.
Example: You type "The capital of France is" → the model has seen this pattern millions of times in training → predicts "Paris" → that's the output. Not because it "knows" facts. Because "Paris" is statistically the most likely word to follow that phrase.
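If you want to see the idea in miniature, here is a toy sketch in Python. It is not how a real LLM is built (those are neural networks trained on billions of words); it just counts which word follows which in a tiny made-up corpus and returns the most frequent continuation. The names CORPUS, train and predict_next are invented for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up "training set"; a real model learns from billions of words.
CORPUS = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is lyon",   # a rarer, wrong pattern
    "the capital of italy is rome",
]

def train(sentences):
    """Training: count which word follows each two-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - 2):
            context = (words[i], words[i + 1])
            counts[context][words[i + 2]] += 1
    return counts

def predict_next(counts, prompt):
    """Inference: return the most frequent continuation of the last two words."""
    context = tuple(prompt.lower().split()[-2:])
    followers = counts.get(context)
    if not followers:
        return None  # this context never appeared in training
    return followers.most_common(1)[0][0]

model = train(CORPUS)
print(predict_next(model, "The capital of France is"))  # prints: paris
```

"paris" wins because it followed "france is" twice and "lyon" only once. The toy model has no idea what a capital is; it only has counts.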
Training vs. Inference
Training: Learning the patterns (done up front by the model provider; for large models it can cost millions). Inference: Using those patterns to answer your question (happens every time you ask, and you pay per query, typically by the number of tokens).
In an interview: "AI isn't magic. It's pattern matching on a massive scale. The model learned statistical relationships from billions of words and can predict plausible continuations. That's why it's good at writing but can hallucinate facts."
2. Constraints
Tokens & Context Windows
AI reads in chunks. There's a limit to how much it can see at once.
What's a Token?
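AI doesn't read words. It reads tokens: chunks of text that are often a whole word but sometimes only part of one. Short common words are usually 1 token each, so "I cannot" is typically 2 tokens, while longer or rarer words such as "ChatGPT" are often split into several. Exact counts depend on the model's tokenizer. Rule of thumb for English: 100 tokens ≈ 75 words. This matters because context limits and API pricing are both measured in tokens, not words.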
Context Window
Every model has a limit to how much it can process at once, and the limit varies by model and version. Claude reads up to 200,000 tokens. GPT-4o reads 128,000. That sounds huge: 200,000 tokens is about 150,000 words. But the documents you paste, the whole conversation so far, and your request all count toward it, so a few long documents plus a long chat can hit the limit. When you do, the model can't see the earlier parts.
Example: You have a 100-page customer manual. You paste it + ask a question. The model can read all of it (within the window). Now add another 100-page document + more questions. Now you're over the limit. The model can't see everything anymore.
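The arithmetic behind that example can be sketched in a few lines of Python. Two assumptions that are not in the text above: roughly 500 words per page, and a 128,000-token window for the check. The function names are invented, and real tokenizers will give somewhat different counts.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb: 100 tokens is about 75 words

def estimate_tokens(word_count):
    """Rough token estimate from a word count (real tokenizers vary)."""
    return round(word_count / WORDS_PER_TOKEN)

def fits(word_counts, window_tokens):
    """Return (estimated total tokens, whether they fit in the window)."""
    total = sum(estimate_tokens(w) for w in word_counts)
    return total, total <= window_tokens

# Assumption: about 500 words per page, so a 100-page manual is ~50,000 words.
manual = 100 * 500
question = 30

print(fits([manual, question], 128_000))          # prints: (66707, True)
print(fits([manual, manual, question], 128_000))  # prints: (133374, False)
```

Under these assumptions one manual fits comfortably and two do not. With a 200,000-token window both would still fit, which is why the limit that matters is always the one for the specific model you are using.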
In an interview: "Context windows are the limit to how much the model can process at once. Larger is better for document analysis or long conversations. Once you hit the limit, earlier context gets dropped. This is why sometimes models seem to 'forget' earlier parts of conversations."
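One simple way an application might handle an over-long conversation is to keep only the most recent messages that fit. The sketch below assumes that strategy; real products may instead summarise old messages or return an error. rough_count is a crude stand-in for a tokenizer (one token per word), and all names here are invented.

```python
def trim_history(messages, window_tokens, count_tokens):
    """Keep the most recent messages that fit; drop the oldest first."""
    kept = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > window_tokens:
            break                        # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))          # restore chronological order

def rough_count(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

chat = [
    "my name is Priya",
    "I work on billing",
    "what is my name",
]
print(trim_history(chat, window_tokens=8, count_tokens=rough_count))
# prints: ['I work on billing', 'what is my name']
```

The message containing the name no longer fits, so the model never sees it. From the user's side, that looks like forgetting.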
3. Architecture
RAG (Retrieval-Augmented Generation)
Give AI your documents. It finds relevant ones, reads them, and answers based on what it found.
The Problem It Solves
Vanilla LLMs only know what they were trained on. They can't read your company's internal documents, recent news, or proprietary data. RAG fixes this by retrieving the relevant passages from your documents at query time and handing them to the model along with the question.
How It Works
Step 1 (Offline): Split your documents into chunks and convert each chunk into an embedding (a list of numbers that represents its meaning). Store the embeddings in a database.
Step 2 (At query time): Convert the user's question into an embedding the same way. Find the document chunks whose embeddings are most similar to it (similar in meaning, not just matching keywords).
Step 3: Pass those chunks to the LLM along with the question. The model answers based on those chunks, not just training data.
Example: You ask ChatGPT "What was our Q3 revenue?" It has never seen your financials, so it either says it doesn't know or hallucinates a number. With RAG: the system finds your Q3 earnings report, passes it to the LLM, which reads it and answers accurately from that source.
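The three steps can be sketched end to end in Python. Big caveat: embed here is a toy stand-in that just counts words, so it matches on shared words rather than real meaning; a production system would call an embedding model and a vector database instead. The chunks are made-up data, every function name is invented, and the final LLM call is left out: the sketch stops at the prompt you would send.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: word counts, not real meaning."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Step 1 (offline): embed each chunk and store it. Made-up example data.
CHUNKS = [
    "Q3 revenue was 12 million dollars, up from 10 million in Q2.",
    "The office closes at 6pm on Fridays.",
    "New hires get 25 days of annual leave.",
]
index = [(chunk, embed(chunk)) for chunk in CHUNKS]

def retrieve(question, index, k=1):
    """Step 2 (query time): return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, chunks):
    """Step 3: put the retrieved chunks in front of the question."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "What was our Q3 revenue?"
prompt = build_prompt(question, retrieve(question, index))
print(prompt)
```

The printed prompt contains the revenue chunk and the question, and nothing about office hours or annual leave. That prompt is what goes to the LLM, which is why the answer can come from your document rather than from training data.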
In an interview: "RAG lets AI work with your data without being retrained. It's retrieval-augmented — it retrieves relevant documents, then generates answers based on what it found. This is how you build AI that knows about proprietary information, recent events, or domain-specific data."
4. Representation
Embeddings (Semantic Search)
Turn words into numbers so the model can compare meaning.
What Is It?
An embedding is a list of numbers that represents the meaning of a piece of text. "The cat sat on the mat" becomes something like [0.2, -0.5, 0.8, ...]. Texts with similar meaning get similar numbers. This lets you do semantic search: find things by meaning, not keywords.
Why It Matters
Keyword search: you search for "dog" and get pages with "dog" but not "puppy" or "canine". Semantic search: you search for "dog" and get pages about canines, puppies, and animal companions because they mean similar things.
Example: You have 10,000 support tickets. Convert them to embeddings. A customer asks "How do I refund an order?" You convert that to embeddings, find tickets with similar embeddings, and you've found the 20 most relevant past tickets — even if none say "refund" exactly.
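"Similar embeddings" usually means a high cosine similarity between the two lists of numbers. Here is a minimal Python sketch using hand-made 3-number vectors, which are purely illustrative: real embedding models produce hundreds or thousands of numbers per text, and you would get them from a model rather than type them in. The names VECTORS, cosine_similarity and nearest are invented.

```python
import math

# Hand-made vectors for illustration only; not output from a real model.
VECTORS = {
    "dog":     [0.9, 0.8, 0.1],
    "puppy":   [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.0, 0.9],
}

def cosine_similarity(a, b):
    """1.0 means pointing the same way (similar meaning); near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word):
    """Find the other word whose vector is most similar to this one."""
    others = [w for w in VECTORS if w != word]
    return max(others, key=lambda w: cosine_similarity(VECTORS[word], VECTORS[w]))

print(nearest("dog"))  # prints: puppy
```

"dog" and "puppy" share no letters, but their vectors point in almost the same direction, so the search finds the match that a keyword search would miss.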
In an interview: "Embeddings are how we make semantic search work. They convert meaning into geometry. Documents with similar meaning end up close together in embedding space. This is foundational to RAG, vector databases, and any system that needs to find semantically similar content at scale."
5. Autonomy
Agents (Agentic AI)
AI that takes actions, not just answers. It thinks, acts, observes results, and adjusts.
Basic AI vs. Agents
Chat: You ask a question → it answers. Done.
Agent: You ask something complex → it breaks it into steps → executes each step (might search, run code, call APIs, read files) → observes the results → adjusts → eventually returns an answer.
The Loop
Think → Act (use a tool) → Observe (see the result) → Reflect (does this help?) → Repeat until done. The agent can do things beyond text generation: search the web, execute code, send emails, fetch data from APIs.
Example: You ask an agent "What's the cheapest flight from London to New York next Tuesday?" The agent thinks: "I need to search for flights." It calls a flight search API → sees results → thinks "I need to filter these" → does so → thinks "Now I need to find the cheapest" → does so → returns an answer with links to book.
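The loop can be sketched in Python with the flight example. Everything here is a stand-in: decide plays the role of the model's "think" step with hard-coded rules (a real agent would ask an LLM what to do next), the flight data and airline names are made up, and the two tools are plain functions rather than real APIs.

```python
# Made-up flight data; a real agent would call a live flight search API.
FAKE_FLIGHTS = [
    {"airline": "AirOne", "price": 420},
    {"airline": "BudgetJet", "price": 310},
    {"airline": "SkyHigh", "price": 550},
]

def search_flights(state):
    """Tool: pretend to search for flights."""
    return {"flights": FAKE_FLIGHTS}

def pick_cheapest(state):
    """Tool: choose the lowest-priced flight from the search results."""
    return {"answer": min(state["flights"], key=lambda f: f["price"])}

TOOLS = {"search_flights": search_flights, "pick_cheapest": pick_cheapest}

def decide(state):
    """Stand-in for the model's 'think' step: choose the next tool, or stop."""
    if "flights" not in state:
        return "search_flights"
    if "answer" not in state:
        return "pick_cheapest"
    return None  # nothing left to do

def run_agent(goal, max_steps=5):
    state = {"goal": goal}
    log = []
    for _ in range(max_steps):              # cap the steps so the loop always ends
        action = decide(state)              # think
        if action is None:
            break
        observation = TOOLS[action](state)  # act
        state.update(observation)           # observe
        log.append(action)
    return state.get("answer"), log

answer, steps = run_agent("cheapest flight from London to New York next Tuesday")
print(steps)   # prints: ['search_flights', 'pick_cheapest']
print(answer)  # prints: {'airline': 'BudgetJet', 'price': 310}
```

The step cap is a deliberate design choice: an agent that decides for itself when it is done also needs a hard stop in case it never does.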
In an interview: "Agents shift from chat to action. They can plan multi-step tasks, use tools, and iterate based on results. This is why AI is moving beyond 'answer my question' to 'actually do something' — book a meeting, run an analysis, fetch and synthesise data. The agent decides what to do, what tools to use, and when it's done."
6. Technique
Prompt Engineering (Getting Better Answers)
The better your question, the better the answer. There are patterns that consistently work.
Key Patterns
Specificity: "Write a poem" → vague. "Write a poem in the style of Keats about autumn, 14 lines, structured as a sonnet" → specific.
Context: Explain your situation before asking. "I'm a product manager for a fitness app" helps the model give domain-appropriate advice.
Format: "Give me a bullet list of 5 ideas" → structured output.
Examples: Show 2-3 examples of what you want before asking for the real thing.
Example: Bad: "Analyse this customer feedback." Good: "I'm a SaaS founder with a product management problem. Here's feedback from 10 customers. Group it by theme, tell me what's most urgent, and suggest a fix for the top issue. Use a bullet list."
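If you write prompts in code, the four patterns map naturally onto a small template function. This is one possible sketch, not a standard: compose_prompt and its labels ("Context:", "Task:", "Format:") are invented for illustration, and models don't require any particular layout.

```python
def compose_prompt(task, context="", output_format="", examples=()):
    """Assemble a prompt from context, examples, task and format, in that order."""
    parts = []
    if context:
        parts.append(f"Context: {context}")
    for i, example in enumerate(examples, start=1):
        parts.append(f"Example {i}: {example}")
    parts.append(f"Task: {task}")
    if output_format:
        parts.append(f"Format: {output_format}")
    return "\n".join(parts)

vague = compose_prompt("Analyse this customer feedback.")

specific = compose_prompt(
    task="Group the feedback by theme, say what is most urgent, "
         "and suggest a fix for the top issue.",
    context="I'm a SaaS founder. Below is feedback from 10 customers.",
    output_format="Bullet list.",
)

print(vague)
print("---")
print(specific)
```

The vague version comes out as a single "Task:" line. The specific one adds who is asking and how the answer should look, which is the same difference as the Bad and Good prompts above.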
In an interview: "Prompt engineering isn't mystical. It's just being clear and specific. The model responds to the quality of the question. A vague prompt gets a vague answer. A well-structured, specific prompt — where you explain context, give examples, and specify format — gets dramatically better results. It's like briefing a contractor: the better the brief, the better the work."
7. Reality Check
What AI Can't Do (Yet)
The important limitations to remember.
Hallucinations
The model predicts plausible text. Sometimes it produces "facts" that sound right but aren't: fake citations, wrong dates, stats that don't exist. It's not lying. It's pattern matching that produces fluent, plausible-sounding text that happens to be false.
Knowledge Cutoff
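Training data ends at a specific date, which varies by model version (for example, April 2024 for some Claude models and April 2023 for some GPT-4 versions). The model doesn't know what happened after that unless you supply the information, for example through RAG.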
No Real Reasoning
It predicts plausible text. It doesn't understand causality, logic, or consequence the way humans do. Reasoning models (o1, o3) work through intermediate steps before answering, which helps on multi-step problems, but underneath it's still statistical prediction.
Context Limit
As mentioned: there's a maximum amount of text it can process. And the longer the context, the less reliable it becomes on specific details.
In an interview: "AI is a tool, not a brain. It's remarkably capable at text generation, pattern matching, and synthesis. But it hallucinates, has a knowledge cutoff, and doesn't reason the way humans do. The key is knowing where to use it (draft writing, summarisation, brainstorming) and where you need human judgment (decisions, facts that matter, anything with real consequences)."