跳转至

Machine Learning & Deep Learning Fundamentals | 机器学习与深度学习基础

The Big Picture Traditional programming is writing rules: if x then y. Machine Learning is learning rules from data: Input + Output -> Rules. 宏观图景 传统编程是写规则:if x then y。 机器学习是从数据中学习规则:输入 + 输出 -> 规则

1. Three Types of Learning | 三种学习类型

Imagine you are teaching a child to identify fruits: 想象你在教一个孩子识别水果:

1.1 Supervised Learning (监督学习)

"The Teacher Approach" | “老师教学法” - Method: You show an apple and say "This is an apple". You show a banana and say "This is a banana". - Data: Labeled (Input + Correct Answer). - Use Case: Spam Filter, Face Recognition.

1.2 Unsupervised Learning (无监督学习)

"The Self-Discovery Approach" | “自我探索法” - Method: You give the child a basket of mixed fruits. They sort them by shape and color without knowing the names. - Data: Unlabeled (Raw Data). - Use Case: Customer Segmentation (Grouping similar users).

1.3 Reinforcement Learning (强化学习)

"The Dog Training Approach" | “训狗法” - Method: The child eats a fruit. If it's sweet, they smile (Reward). If it's sour, they cry (Penalty). They learn to pick sweet fruits. - Data: Action + Reward/Penalty. - Use Case: Game AI (AlphaGo), Robot Control, ChatGPT (RLHF).

2. Neural Networks: Mimicking the Brain | 神经网络:模仿大脑

Deep Learning uses Neural Networks, which are layers of math equations loosely inspired by neurons. 深度学习使用神经网络,这是受神经元启发的层层数学公式。

2.1 Structure Visualization | 结构可视化

graph LR
    A[Input Layer<br>(Image Pixels)] --> B[Hidden Layer 1<br>(Edges)]
    B --> C[Hidden Layer 2<br>(Shapes)]
    C --> D[Output Layer<br>(Prediction: 'Cat')]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style D fill:#9f9,stroke:#333,stroke-width:2px
  • Input Layer: Receives raw data (e.g., pixels of an image).
  • Hidden Layers: The "Black Box" where magic happens. They extract features (edges -> shapes -> objects).
  • Output Layer: Gives the final answer (Probability).

2.2 The "Shooting Range" Analogy (Training Process)

“射击场”类比(训练过程)

How does the network learn? It's like learning to shoot a target. 网络是如何学习的?就像练习射击。

  1. Forward Pass (Shoot): The model takes a guess. "I think this is a Cat."
    前向传播(射击):模型进行猜测。“我觉得这是一只猫。”
  2. Loss Function (Check Target): Compare guess with reality. "Wrong. It was a Dog." (Error is high).
    损失函数(查看靶心):比较猜测与现实。“错了。那是只狗。”(误差很大)。
  3. Backpropagation (Blame Assignment): Who is responsible? "Layer 3, you focused too much on the ears. Layer 2, you missed the nose."
    反向传播(归责):谁该负责?“第3层,你太关注耳朵了。第2层,你漏掉了鼻子。”
  4. Optimizer (Adjust Stance): Nudge the weights to fix the error. "Aim a bit to the left."
    优化器(调整姿势):微调权重以修正错误。“向左瞄一点。”

3. Key Architectures | 关键架构

  • CNN (Convolutional Neural Network): The "Eye". Scans images with a sliding window. Best for Vision.
    CNN“眼睛”。用滑动窗口扫描图像。最适合视觉。
  • RNN (Recurrent Neural Network): The "Short-term Memory". Reads word by word. Good for text but forgets long context.
    RNN“短期记忆”。逐词阅读。适合文本但容易忘记长上下文。
  • Transformer: The "Parallel Brain". Reads the whole sentence at once and pays attention to everything. The engine of GPT.
    Transformer“并行大脑”。一次性阅读整句并关注所有内容。GPT 的引擎。