跳转至

RAG Fundamentals: The Open-Book Exam | RAG 基础:开卷考试

The Problem with LLMs 1. Hallucination: They make things up when they don't know the answer. 2. Outdated Knowledge: ChatGPT doesn't know what happened today. 3. Private Data: It doesn't know your company's emails. LLM 的问题 1. 幻觉:当它们不知道答案时会瞎编。 2. 知识过时:ChatGPT 不知道今天发生了什么。 3. 私有数据:它不知道你公司的邮件。

1. What is RAG? | 什么是 RAG?

RAG (Retrieval-Augmented Generation) is like giving the AI a textbook during an exam. RAG(检索增强生成)就像在考试时给 AI 一本教科书。

  • Without RAG (Closed Book): "I think the capital of Mars is Elon City..." (Guessing).
  • With RAG (Open Book): Opens encyclopedia -> "According to this book, Mars has no capital." (Fact-based).

2. How It Works: The 3-Step Dance | 工作原理:三步舞

Step 1: Retrieval (The Librarian) | 检索(图书管理员)

User asks: "How do I fix Error 505?" 用户问:“我怎么修 Error 505?”

The system searches your Vector Database (your private library) and finds the relevant manual pages. 系统搜索你的向量数据库(你的私人图书馆)并找到相关的手册页。

Step 2: Augmentation (The Cheat Sheet) | 增强(小抄)

The system glues the user's question and the manual page together. 系统把用户的问题和手册页粘在一起。

Prompt to AI: "Context: [Manual Page Content...] Question: How do I fix Error 505? Answer using ONLY the context above."

Step 3: Generation (The Writer) | 生成(作家)

The AI reads the context and writes the answer. AI 阅读上下文并写出答案。

Why not just use Ctrl+F (Keyword Search)? 为什么不直接用 Ctrl+F(关键词搜索)?

Because users don't use the exact keywords. 因为用户不会使用精确的关键词。 - User asks: "My screen is frozen." - Manual says: "Display unresponsive."

Vector Search understands that "frozen" and "unresponsive" are similar concepts (High Dot Product). 向量搜索理解“frozen”和“unresponsive”是相似的概念(高点积)。

4. RAG vs. Fine-Tuning | RAG vs 微调

Feature RAG (Open Book) Fine-Tuning (Study Hard)
Analogy Looking up the answer in a book. Memorizing the book before the exam.
Knowledge Update Instant (Just add a new PDF). Slow (Re-train the model).
Accuracy High (Cites sources). Medium (Can still hallucinate).
Cost Low. High.

5. Scientist's Corner | 科学家角落

The "Lost in the Middle" Phenomenon If you give the AI too much text (e.g., 100 pages), it tends to forget the information in the middle. It pays most attention to the Beginning and the End. Smart RAG strategies (like Re-ranking) ensure the most important info is placed at the top or bottom of the prompt. “迷失在中间”现象 如果你给 AI 太多文本(例如 100 页),它倾向于忘记中间的信息。 它最关注开头结尾智能 RAG 策略(如重排序)确保最重要的信息被放在提示词的顶部或底部。

6. Practice Mission: The Manual RAG | 练习任务:手动 RAG

Objective (目标): Simulate the RAG process manually to understand how it works. 目标:手动模拟 RAG 流程以理解其工作原理。

Scenario (场景): You are the "Retrieval System". Your friend (The User) asks a question. You have a "Database" (a text file). 你是“检索系统”。你的朋友(用户)问了一个问题。你有一个“数据库”(一个文本文件)。

Data (Database):

File: secret_rules.txt 1. The password for the WiFi is "Pizza123". 2. The meeting is at 3 PM, not 2 PM. 3. The CEO likes coffee, not tea.

Task (任务): 1. User Question: "What is the wifi password?" 2. Retrieval (You): Look at the file. Which line is relevant? (Line 1). 3. Augmentation (You): Construct the prompt for the AI. * Template:

Context: [Insert the relevant line here]
Question: What is the wifi password?
Answer:
4. Generation (AI): What will the AI say?

Challenge (挑战): User asks: "Can I give the CEO a cup of Earl Grey?" 1. Retrieve: Find the line about the CEO. 2. Augment: Create the prompt. 3. Generate: What is the answer? (Yes/No?)

Answer Key (答案): - Challenge Answer: No. - Reasoning: The context says "The CEO likes coffee, not tea". Earl Grey is tea. Therefore, the AI should say "No, he prefers coffee."