@KuRRe8 (last active June 6, 2025)
Tutorials on using Python, organized into separate files by category

Python Tutorials

Python is a beginner-friendly language, and the machine-learning community now depends heavily on Python (alongside C++, CUDA C, R, and others), which keeps Python firmly in first place in language popularity. This Gist provides Python tutorials that can be run directly in Jupyter Notebook.

  1. Language-level tutorials, generally not covering beginner topics;
  2. Standard-library tutorials: basic usage of the most common standard-library modules;
  3. Third-party library tutorials, mainly common libraries such as numpy and pytorch; only basic usage is covered, without new features

Other content will not go into this Gist. Note that a Gist is still version-controlled by git, so you can git clone it locally, or open the corresponding ipynb files directly in Google Colab or Kaggle.

When browsing directly on the web there is no file list, so press Ctrl + F to search for the section you need, or click the hyperlinks below.

If you would like to contribute, just leave a message in the comments; questions go in the comments too ^.^

Contents - Language

Contents - Libraries

Contents - Domain-Specific Libraries (this tutorial focuses on machine learning and deep learning)

Contents - Appendix

  • sigh.md: personal views on Python as a dynamic language

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PyTorch - 灵活的深度学习框架教程\n",
"\n",
"欢迎来到 PyTorch 教程!PyTorch 是一个由 Facebook AI Research (FAIR) 开发和维护的开源机器学习库,广泛用于计算机视觉和自然语言处理等应用。\n",
"\n",
"**为什么 PyTorch 在 ML/DL/数据科学领域如此受欢迎?**\n",
"\n",
"1. **Pythonic 和易用性**:PyTorch 的 API 设计简洁直观,与 Python 和 NumPy 的风格非常接近,学习曲线相对平缓。\n",
"2. **动态计算图 (Define-by-Run)**:计算图在运行时动态构建,使得调试更加容易,并且非常适合处理动态结构的模型(如 RNN)。\n",
"3. **强大的 GPU 加速**:可以轻松地将计算移到 NVIDIA GPU 上执行,显著加速训练过程。\n",
"4. **丰富的生态系统**:拥有庞大的社区和丰富的扩展库(如 TorchVision, TorchText, TorchAudio)。\n",
"5. **研究友好**:由于其灵活性,PyTorch 在学术研究界非常受欢迎。\n",
"\n",
"**本教程将涵盖 PyTorch 的核心概念和基本用法:**\n",
"\n",
"1. **张量 (Tensors)**:PyTorch 的基本数据结构。\n",
"2. **张量操作**:类似于 NumPy 的操作。\n",
"3. **自动微分 (`torch.autograd`)**:计算梯度。\n",
"4. **神经网络模块 (`torch.nn`)**:构建网络的基础。\n",
"5. **构建一个简单的神经网络**。\n",
"6. **损失函数 (`torch.nn`)**。\n",
"7. **优化器 (`torch.optim`)**。\n",
"8. **模型训练流程**。\n",
"9. **数据加载 (`Dataset` & `DataLoader`)**。\n",
"10. **GPU 使用基础**。\n",
"11. **模型保存与加载**。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 准备工作:导入 PyTorch\n",
"\n",
"确保你已经安装了 PyTorch (可以访问 [pytorch.org](https://pytorch.org/) 获取适合你系统的安装命令)。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"import torch.utils.data as data\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"print(f\"PyTorch version: {torch.__version__}\")\n",
"\n",
"# 检查是否有可用的 GPU\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"print(f\"Using device: {device}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 张量 (Tensors)\n",
"\n",
"张量是 PyTorch 中的核心数据结构,类似于 NumPy 的 `ndarray`,但增加了在 GPU 上计算和自动微分的功能。\n",
"\n",
"可以从 Python 列表、NumPy 数组等创建张量,也可以使用 PyTorch 提供的函数创建。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Creating Tensors ---\")\n",
"# 从列表创建\n",
"data_list = [[1, 2], [3, 4]]\n",
"tensor_from_list = torch.tensor(data_list)\n",
"print(f\"Tensor from list:\\n{tensor_from_list}\")\n",
"print(f\"Tensor dtype: {tensor_from_list.dtype}\") # 默认通常是 torch.int64\n",
"\n",
"# 从 NumPy 数组创建 (共享内存,除非显式复制)\n",
"np_array = np.array([[5, 6], [7, 8]], dtype=np.float32)\n",
"tensor_from_np = torch.from_numpy(np_array)\n",
"print(f\"\\nTensor from NumPy array:\\n{tensor_from_np}\")\n",
"print(f\"Tensor dtype: {tensor_from_np.dtype}\")\n",
"\n",
"# 修改 NumPy 数组会影响 Tensor (反之亦然)\n",
"# np_array[0, 0] = 99\n",
"# print(f\"Tensor after modifying NumPy array:\\n{tensor_from_np}\")\n",
"\n",
"# 创建特定形状和类型的张量\n",
"tensor_zeros = torch.zeros(2, 3) # 2x3 全零张量 (默认 float32)\n",
"print(f\"\\nZeros tensor (2x3):\\n{tensor_zeros}\")\n",
"\n",
"tensor_ones = torch.ones(3, 2, dtype=torch.int)\n",
"print(f\"\\nOnes tensor (3x2, int):\\n{tensor_ones}\")\n",
"\n",
"tensor_rand = torch.rand(2, 2) # [0, 1) 均匀分布随机数\n",
"print(f\"\\nRandom tensor (2x2, uniform [0,1)):\\n{tensor_rand}\")\n",
"\n",
"tensor_randn = torch.randn(3, 1) # 标准正态分布随机数\n",
"print(f\"\\nRandom tensor (3x1, standard normal):\\n{tensor_randn}\")\n",
"\n",
"# 获取张量属性\n",
"print(f\"\\nProperties of tensor_from_list:\")\n",
"print(f\" Shape: {tensor_from_list.shape}\")\n",
"print(f\" Size: {tensor_from_list.size()}\") # size() 和 shape 类似\n",
"print(f\" Data type: {tensor_from_list.dtype}\")\n",
"print(f\" Device: {tensor_from_list.device}\") # 存储设备 (默认是 'cpu')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 张量操作\n",
"\n",
"PyTorch 张量支持丰富的操作,语法与 NumPy 非常相似。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"t1 = torch.tensor([[1., 2.], [3., 4.]])\n",
"t2 = torch.tensor([[5., 6.], [7., 8.]])\n",
"scalar = 10\n",
"\n",
"print(f\"Tensor t1:\\n{t1}\")\n",
"print(f\"Tensor t2:\\n{t2}\")\n",
"\n",
"# 算术运算 (Element-wise)\n",
"print(\"\\n--- Arithmetic Operations ---\")\n",
"print(f\"t1 + t2:\\n{t1 + t2}\")\n",
"print(f\"torch.add(t1, t2):\\n{torch.add(t1, t2)}\")\n",
"print(f\"t1 * t2:\\n{t1 * t2}\") # 逐元素乘法\n",
"print(f\"t1 + scalar:\\n{t1 + scalar}\")\n",
"\n",
"# In-place 操作 (通常方法名后带下划线 '_')\n",
"t1_copy = t1.clone() # 创建副本\n",
"t1_copy.add_(t2) # t1_copy = t1_copy + t2\n",
"print(f\"\\nt1_copy after add_:\\n{t1_copy}\")\n",
"\n",
"# 矩阵乘法\n",
"print(\"\\n--- Matrix Multiplication ---\")\n",
"# 使用 @ 运算符 或 torch.matmul()\n",
"mat_mul = t1 @ t2\n",
"print(f\"t1 @ t2:\\n{mat_mul}\")\n",
"mat_mul_func = torch.matmul(t1, t2)\n",
"print(f\"torch.matmul(t1, t2):\\n{mat_mul_func}\")\n",
"\n",
"# 索引和切片 (类似 NumPy)\n",
"print(\"\\n--- Indexing and Slicing ---\")\n",
"print(f\"t1[0, 1]: {t1[0, 1]}\") # 第一行,第二列\n",
"print(f\"t1[:, 0]: {t1[:, 0]}\") # 第一列\n",
"print(f\"t1[t1 > 2]: {t1[t1 > 2]}\") # 布尔索引\n",
"\n",
"# 形状操作\n",
"print(\"\\n--- Reshaping ---\")\n",
"t_flat = torch.arange(6) # [0, 1, 2, 3, 4, 5]\n",
"t_reshaped = t_flat.view(2, 3) # 改变视图,共享数据 (类似 reshape 但不保证)\n",
"print(f\"Original flat tensor: {t_flat}\")\n",
"print(f\"View as (2, 3):\\n{t_reshaped}\")\n",
"t_reshape = t_flat.reshape(3, 2) # 更通用的 reshape,可能返回副本\n",
"print(f\"Reshape as (3, 2):\\n{t_reshape}\")\n",
"\n",
"# NumPy <-> PyTorch 转换\n",
"print(\"\\n--- NumPy Bridge ---\")\n",
"np_arr = np.ones((2,2))\n",
"torch_tensor = torch.from_numpy(np_arr)\n",
"print(f\"Tensor from NumPy:\\n{torch_tensor}\")\n",
"np_back = torch_tensor.numpy() # 转换回 NumPy 数组 (共享内存)\n",
"print(f\"NumPy from Tensor:\\n{np_back}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. 自动微分 (`torch.autograd`)\n",
"\n",
"`autograd` 是 PyTorch 的核心功能之一,它为张量上的所有操作提供自动微分。这对于神经网络的反向传播至关重要。\n",
"\n",
"* 设置 `requires_grad=True` 的张量会跟踪其上的所有操作。\n",
"* 当计算完成后,可以调用 `.backward()`,PyTorch 会自动计算所有梯度。\n",
"* 梯度会累积到张量的 `.grad` 属性中。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Autograd Example ---\")\n",
"# 创建需要计算梯度的张量\n",
"x = torch.tensor(2.0, requires_grad=True)\n",
"w = torch.tensor(3.0, requires_grad=True)\n",
"b = torch.tensor(1.0, requires_grad=True)\n",
"\n",
"# 定义一个简单的计算图 (e.g., y = w * x + b)\n",
"y = w * x + b\n",
"print(f\"y = w*x + b = {y}\")\n",
"\n",
"# 定义一个损失函数 (e.g., L = y^2)\n",
"L = y.pow(2)\n",
"print(f\"L = y^2 = {L}\")\n",
"\n",
"# 执行反向传播计算梯度\n",
"print(\"\\nCalling L.backward()...\")\n",
"L.backward()\n",
"\n",
"# 查看梯度 (dL/dx, dL/dw, dL/db)\n",
"# dL/dy = 2*y = 2*(w*x+b) = 2*(3*2+1) = 14\n",
"# dL/dw = dL/dy * dy/dw = (2*y) * x = 14 * 2 = 28\n",
"# dL/dx = dL/dy * dy/dx = (2*y) * w = 14 * 3 = 42\n",
"# dL/db = dL/dy * dy/db = (2*y) * 1 = 14 * 1 = 14\n",
"print(f\"Gradient dL/dx (x.grad): {x.grad}\")\n",
"print(f\"Gradient dL/dw (w.grad): {w.grad}\")\n",
"print(f\"Gradient dL/db (b.grad): {b.grad}\")\n",
"\n",
"# --- 停止梯度跟踪 ---\n",
"# 使用 torch.no_grad() 上下文管理器\n",
"print(\"\\nInside torch.no_grad():\")\n",
"with torch.no_grad():\n",
" y_no_grad = w * x + b\n",
" print(f\" y_no_grad = {y_no_grad}\")\n",
" print(f\" y_no_grad.requires_grad: {y_no_grad.requires_grad}\") # False\n",
"\n",
"# 或者使用 .detach() 创建一个不带梯度历史的新张量\n",
"x_detached = x.detach()\n",
"print(f\"\\nx_detached requires_grad: {x_detached.requires_grad}\") # False\n",
"\n",
"# 梯度会累积,在优化循环中需要清零\n",
"print(\"\\nGradient accumulation:\")\n",
"y2 = w * x + b\n",
"L2 = y2.pow(2)\n",
"L2.backward() # 再次计算梯度\n",
"print(f\"w.grad after second backward(): {w.grad}\") # 梯度变成了 28 + 28 = 56\n",
"\n",
"# 清零梯度 (在优化循环开始时)\n",
"w.grad.zero_()\n",
"x.grad.zero_()\n",
"b.grad.zero_()\n",
"print(f\"w.grad after zero_(): {w.grad}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. 神经网络模块 (`torch.nn`)\n",
"\n",
"`torch.nn` 模块是构建神经网络的核心。它提供了各种网络层、损失函数和容器。\n",
"\n",
"* **`nn.Module`**: 所有神经网络模块的基类。自定义网络层需要继承它,并实现 `forward()` 方法。\n",
"* **常用层**: `nn.Linear`, `nn.Conv2d`, `nn.RNN`, `nn.LSTM`, `nn.Embedding`, `nn.Dropout`, `nn.BatchNorm2d` 等。\n",
"* **激活函数**: `nn.ReLU`, `nn.Sigmoid`, `nn.Tanh`, `nn.Softmax` 等 (通常在 `nn.functional` 中也有函数形式)。\n",
"* **损失函数**: `nn.MSELoss`, `nn.CrossEntropyLoss`, `nn.BCELoss` 等。\n",
"* **容器**: `nn.Sequential` (按顺序连接模块), `nn.ModuleList`, `nn.ModuleDict`。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- torch.nn Examples ---\")\n",
"\n",
"# 全连接层 (Linear Layer)\n",
"input_features = 4\n",
"output_features = 2\n",
"linear_layer = nn.Linear(input_features, output_features)\n",
"print(f\"Linear layer: {linear_layer}\")\n",
"print(f\"Linear layer weight shape: {linear_layer.weight.shape}\") # [output_features, input_features]\n",
"print(f\"Linear layer bias shape: {linear_layer.bias.shape}\") # [output_features]\n",
"\n",
"# 示例输入 (需要 Batch 维度, e.g., [batch_size, num_features])\n",
"sample_input = torch.randn(3, input_features) # Batch of 3 samples\n",
"output = linear_layer(sample_input)\n",
"print(f\"\\nSample input shape: {sample_input.shape}\")\n",
"print(f\"Output shape from linear layer: {output.shape}\") # [batch_size, output_features]\n",
"\n",
"# ReLU 激活函数\n",
"relu = nn.ReLU()\n",
"output_relu = relu(output)\n",
"print(f\"\\nOutput after ReLU:\\n{output_relu}\")\n",
"\n",
"# 使用 Sequential 容器构建简单网络\n",
"model_sequential = nn.Sequential(\n",
" nn.Linear(10, 20), # 输入 10, 输出 20\n",
" nn.ReLU(),\n",
" nn.Linear(20, 5) # 输入 20, 输出 5\n",
")\n",
"print(f\"\\nSequential model:\\n{model_sequential}\")\n",
"\n",
"# 示例输入\n",
"seq_input = torch.randn(4, 10) # Batch of 4, 10 features\n",
"seq_output = model_sequential(seq_input)\n",
"print(f\"Output shape from sequential model: {seq_output.shape}\") # [4, 5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. 构建一个简单的神经网络\n",
"\n",
"通常通过继承 `nn.Module` 并定义 `__init__` 和 `forward` 方法来构建自定义网络。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Building a Simple Network (Custom Class) ---\")\n",
"\n",
"class SimpleNet(nn.Module):\n",
" def __init__(self, input_size, hidden_size, output_size):\n",
" super(SimpleNet, self).__init__() # 调用父类的 __init__\n",
" self.layer1 = nn.Linear(input_size, hidden_size)\n",
" self.relu = nn.ReLU()\n",
" self.layer2 = nn.Linear(hidden_size, output_size)\n",
" print(f\"SimpleNet initialized with input={input_size}, hidden={hidden_size}, output={output_size}\")\n",
"\n",
" def forward(self, x):\n",
" # 定义数据如何通过网络层流动\n",
" print(\" SimpleNet forward pass starting...\")\n",
" out = self.layer1(x)\n",
" print(f\" Shape after layer1: {out.shape}\")\n",
" out = self.relu(out)\n",
" print(f\" Shape after relu: {out.shape}\")\n",
" out = self.layer2(out)\n",
" print(f\" Shape after layer2 (final output): {out.shape}\")\n",
" return out\n",
"\n",
"# 实例化网络\n",
"input_dim = 5\n",
"hidden_dim = 8\n",
"output_dim = 2\n",
"net = SimpleNet(input_dim, hidden_dim, output_dim)\n",
"print(f\"\\nNetwork structure:\\n{net}\")\n",
"\n",
"# 查看网络参数\n",
"print(\"\\nNetwork parameters:\")\n",
"for name, param in net.named_parameters():\n",
" if param.requires_grad:\n",
" print(f\" {name}: shape={param.shape}, requires_grad={param.requires_grad}\")\n",
"\n",
"# 示例前向传播\n",
"print(\"\\nPerforming a forward pass:\")\n",
"dummy_input = torch.randn(4, input_dim) # Batch of 4\n",
"output = net(dummy_input)\n",
"print(f\"Output tensor:\\n{output}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. 损失函数 (`torch.nn`)\n",
"\n",
"损失函数(或成本函数)衡量模型输出与真实目标之间的差距。训练的目标是最小化损失。\n",
"\n",
"* 回归常用:`nn.MSELoss` (均方误差), `nn.L1Loss` (平均绝对误差)。\n",
"* 分类常用:`nn.CrossEntropyLoss` (通常用于多分类,内部结合了 `LogSoftmax` 和 `NLLLoss`),`nn.BCELoss` (二元交叉熵,需要输入概率和 Sigmoid 激活)。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Loss Function Examples ---\")\n",
"\n",
"# --- MSELoss for Regression --- \n",
"loss_fn_mse = nn.MSELoss()\n",
"predictions_reg = torch.tensor([2.5, 4.8, 1.9])\n",
"targets_reg = torch.tensor([3.0, 5.0, 1.5])\n",
"mse_loss = loss_fn_mse(predictions_reg, targets_reg)\n",
"print(f\"MSE Loss: {mse_loss.item()}\") # .item() 获取标量张量的值\n",
"\n",
"# --- CrossEntropyLoss for Multi-class Classification --- \n",
"# 输入: 模型输出的原始分数 (logits),形状 [BatchSize, NumClasses]\n",
"# 目标: 类别索引 (整数),形状 [BatchSize]\n",
"loss_fn_ce = nn.CrossEntropyLoss()\n",
"predictions_clf = torch.tensor([[2.0, -1.0, 0.5], # Sample 1 scores for 3 classes\n",
" [0.1, 1.5, -0.5]]) # Sample 2 scores\n",
"targets_clf = torch.tensor([0, 1]) # Sample 1 true class is 0, Sample 2 true class is 1\n",
"ce_loss = loss_fn_ce(predictions_clf, targets_clf)\n",
"print(f\"\\nCross Entropy Loss: {ce_loss.item()}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. 优化器 (`torch.optim`)\n",
"\n",
"优化器实现了更新模型参数(权重和偏置)以最小化损失函数的算法。\n",
"\n",
"* 常用优化器:`optim.SGD` (随机梯度下降), `optim.Adam`, `optim.RMSprop`, `optim.Adagrad`。\n",
"* 需要将模型的参数 (`model.parameters()`) 和学习率 (`lr`) 传递给优化器构造函数。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Optimizer Example (using SimpleNet) ---\")\n",
"\n",
"learning_rate = 0.01\n",
"\n",
"# 使用 SGD\n",
"optimizer_sgd = optim.SGD(net.parameters(), lr=learning_rate, momentum=0.9)\n",
"print(f\"SGD Optimizer: {optimizer_sgd}\")\n",
"\n",
"# 使用 Adam (常用的自适应学习率优化器)\n",
"optimizer_adam = optim.Adam(net.parameters(), lr=learning_rate)\n",
"print(f\"Adam Optimizer: {optimizer_adam}\")\n",
"\n",
"# 优化器的关键方法:\n",
"# - optimizer.zero_grad(): 清除所有参数的梯度 (在计算新梯度前必须调用)\n",
"# - optimizer.step(): 根据计算出的梯度更新参数"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. 模型训练流程\n",
"\n",
"一个典型的 PyTorch 训练循环包含以下步骤:\n",
"\n",
"1. **前向传播 (Forward Pass)**:将输入数据传递给模型,得到输出。\n",
"2. **计算损失 (Compute Loss)**:使用损失函数计算模型输出和真实目标之间的差距。\n",
"3. **反向传播 (Backward Pass)**:调用 `loss.backward()` 计算损失相对于模型参数的梯度。\n",
"4. **更新参数 (Update Parameters)**:调用 `optimizer.step()` 更新模型参数。\n",
"5. **清零梯度 (Zero Gradients)**:调用 `optimizer.zero_grad()` 清除梯度,为下一次迭代做准备。\n",
"\n",
"这个过程通常在一个循环中重复多个周期 (epochs)。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Simple Training Loop Example (Linear Regression) ---\")\n",
"\n",
"# 1. 准备数据 (简单线性关系 y = 2*x + 1 + noise)\n",
"X_numpy = np.linspace(0, 10, 100).reshape(-1, 1) # 100个样本, 1个特征\n",
"y_numpy = 2 * X_numpy + 1 + np.random.randn(100, 1) * 2 # 添加噪声\n",
"\n",
"X_train = torch.from_numpy(X_numpy.astype(np.float32))\n",
"y_train = torch.from_numpy(y_numpy.astype(np.float32))\n",
"\n",
"# 2. 定义模型、损失和优化器\n",
"input_size = 1\n",
"output_size = 1\n",
"model = nn.Linear(input_size, output_size)\n",
"criterion = nn.MSELoss()\n",
"optimizer = optim.SGD(model.parameters(), lr=0.01)\n",
"\n",
"print(f\"Initial model weights: {model.weight.item():.3f}, bias: {model.bias.item():.3f}\")\n",
"\n",
"# 3. 训练循环\n",
"num_epochs = 100\n",
"losses = []\n",
"\n",
"for epoch in range(num_epochs):\n",
" # 前向传播\n",
" outputs = model(X_train)\n",
" loss = criterion(outputs, y_train)\n",
" \n",
" # 反向传播和优化\n",
" optimizer.zero_grad() # 清零梯度\n",
" loss.backward() # 计算梯度\n",
" optimizer.step() # 更新权重\n",
" \n",
" losses.append(loss.item())\n",
" \n",
" if (epoch + 1) % 20 == 0:\n",
" print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')\n",
"\n",
"print(f\"\\nFinal model weights: {model.weight.item():.3f}, bias: {model.bias.item():.3f}\") # 应该接近 2 和 1\n",
"\n",
"# 绘制损失曲线\n",
"plt.figure(figsize=(6, 4))\n",
"plt.plot(losses)\n",
"plt.xlabel(\"Epoch\")\n",
"plt.ylabel(\"MSE Loss\")\n",
"plt.title(\"Training Loss Over Epochs\")\n",
"plt.show()\n",
"\n",
"# 绘制拟合结果\n",
"predicted = model(X_train).detach().numpy() # detach() 停止梯度跟踪,.numpy() 转回 NumPy\n",
"plt.figure(figsize=(8, 5))\n",
"plt.scatter(X_numpy, y_numpy, label='Original data', s=10, alpha=0.7)\n",
"plt.plot(X_numpy, predicted, color='red', label='Fitted line')\n",
"plt.xlabel(\"X\")\n",
"plt.ylabel(\"y\")\n",
"plt.title(\"Linear Regression Fit\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. 数据加载 (`Dataset` & `DataLoader`)\n",
"\n",
"`torch.utils.data` 提供了两个核心类来帮助加载和处理数据:\n",
"* **`Dataset`**: 一个抽象类,代表数据集。你需要继承它并实现 `__len__` (返回数据集大小) 和 `__getitem__` (根据索引获取一个样本)。\n",
"* **`DataLoader`**: 将 `Dataset` 包装起来,提供对数据的迭代访问,支持批处理 (batching)、打乱 (shuffling) 和并行加载数据。\n",
"\n",
"对于常见的图像 (TorchVision)、文本 (TorchText)、音频 (TorchAudio) 数据集,通常有预置的 `Dataset` 类。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Dataset and DataLoader Example ---\")\n",
"\n",
"# 1. 创建一个简单的自定义 Dataset\n",
"class SimpleDataset(data.Dataset):\n",
" def __init__(self, features, labels):\n",
" self.features = features\n",
" self.labels = labels\n",
" print(f\"SimpleDataset created with {len(features)} samples.\")\n",
"\n",
" def __len__(self):\n",
" return len(self.features)\n",
"\n",
" def __getitem__(self, idx):\n",
" # 根据索引获取一个样本 (特征和标签)\n",
" sample_feature = self.features[idx]\n",
" sample_label = self.labels[idx]\n",
" # print(f\" Dataset __getitem__ called for index {idx}\")\n",
" return sample_feature, sample_label\n",
"\n",
"# 使用之前的线性回归数据\n",
"dataset = SimpleDataset(X_train, y_train)\n",
"\n",
"# 获取第一个样本\n",
"first_sample_feature, first_sample_label = dataset[0]\n",
"print(f\"\\nFirst sample feature: {first_sample_feature}, label: {first_sample_label}\")\n",
"\n",
"# 2. 创建 DataLoader\n",
"batch_size = 16\n",
"shuffle_data = True\n",
"num_workers_loader = 0 # Windows often requires 0, Linux/Mac can use > 0 for parallelism\n",
"\n",
"dataloader = data.DataLoader(dataset=dataset, \n",
" batch_size=batch_size, \n",
" shuffle=shuffle_data,\n",
" num_workers=num_workers_loader)\n",
"\n",
"print(f\"\\nDataLoader created with batch_size={batch_size}, shuffle={shuffle_data}\")\n",
"\n",
"# 迭代 DataLoader 获取数据批次\n",
"print(\"Iterating through the first batch from DataLoader:\")\n",
"num_batches_to_show = 1\n",
"for i, (batch_features, batch_labels) in enumerate(dataloader):\n",
" print(f\"\\nBatch {i+1}:\")\n",
" print(f\" Features shape: {batch_features.shape}\") # [batch_size, num_features]\n",
" print(f\" Labels shape: {batch_labels.shape}\") # [batch_size, num_labels]\n",
" # 在训练循环中,你会在这里进行模型的前向/反向传播\n",
" if i+1 >= num_batches_to_show:\n",
" break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10. GPU 使用基础\n",
"\n",
"如果你的机器上有兼容的 NVIDIA GPU 和 CUDA 环境,可以将张量和模型移动到 GPU 上进行计算。\n",
"\n",
"1. **确定设备**:`device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")`\n",
"2. **移动模型到设备**:`model.to(device)`\n",
"3. **移动数据到设备**:在训练循环中,将每个数据批次移动到设备:`inputs, labels = inputs.to(device), labels.to(device)`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- GPU Usage Example ---\")\n",
"\n",
"# 1. 确定设备 (在教程开始时已完成)\n",
"print(f\"Using device: {device}\")\n",
"\n",
"# 2. 创建模型并移动到设备\n",
"model_gpu = nn.Linear(10, 5) # 示例模型\n",
"model_gpu.to(device)\n",
"print(f\"Model moved to device: {next(model_gpu.parameters()).device}\")\n",
"\n",
"# 3. 创建数据并移动到设备\n",
"input_cpu = torch.randn(4, 10)\n",
"input_gpu = input_cpu.to(device)\n",
"print(f\"Input tensor on device: {input_gpu.device}\")\n",
"\n",
"# 在 GPU 上执行前向传播\n",
"# (确保模型和输入都在同一设备上)\n",
"try:\n",
" with torch.no_grad(): # 推理时通常不需要梯度\n",
" output_gpu = model_gpu(input_gpu)\n",
" print(f\"Output tensor on device: {output_gpu.device}\")\n",
"\n",
" # 如果需要将结果移回 CPU (例如,用于绘图或 NumPy 操作)\n",
" output_cpu = output_gpu.cpu()\n",
" print(f\"Output tensor moved back to CPU: {output_cpu.device}\")\n",
"except Exception as e:\n",
" print(f\"An error occurred during GPU operation (maybe no GPU?): {e}\")\n",
"\n",
"# 在训练循环中移动数据 (伪代码)\n",
"print(\"\\nTraining loop data movement (pseudo-code):\")\n",
"print(\"for inputs, labels in dataloader:\")\n",
"print(\" inputs = inputs.to(device)\")\n",
"print(\" labels = labels.to(device)\")\n",
"print(\" # ... rest of the training step ...\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 11. 模型保存与加载\n",
"\n",
"训练好的模型需要保存以便后续使用或继续训练。\n",
"\n",
"* **推荐方式**:只保存模型的**状态字典 (`state_dict`)**,它包含了模型的所有可学习参数(权重和偏置)。\n",
"* 加载时,需要先创建同一模型类的实例,然后加载状态字典。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"print(\"--- Saving and Loading Models ---\")\n",
"\n",
"# 假设我们已经训练好了之前的线性回归模型 'model'\n",
"model_path = \"linear_regression_model.pth\"\n",
"\n",
"# 1. 保存模型的状态字典\n",
"torch.save(model.state_dict(), model_path)\n",
"print(f\"Model state_dict saved to {model_path}\")\n",
"\n",
"# 2. 加载模型\n",
"# a) 首先,创建同一模型结构的新实例\n",
"loaded_model = nn.Linear(input_size, output_size)\n",
"print(\"New model instance created.\")\n",
"print(f\"Weights before loading: {loaded_model.weight.item():.3f}\")\n",
"\n",
"# b) 加载状态字典\n",
"loaded_model.load_state_dict(torch.load(model_path))\n",
"print(\"State_dict loaded into the new model instance.\")\n",
"print(f\"Weights after loading: {loaded_model.weight.item():.3f}\") # 应该和之前保存的一样\n",
"\n",
"# c) 设置为评估模式 (重要!影响 Dropout, BatchNorm 等层的行为)\n",
"loaded_model.eval()\n",
"print(\"Model set to evaluation mode (.eval())\")\n",
"\n",
"# 使用加载的模型进行预测\n",
"# with torch.no_grad():\n",
"# predictions = loaded_model(some_new_data)\n",
"\n",
"# 清理保存的文件\n",
"if os.path.exists(model_path):\n",
" os.remove(model_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 总结\n",
"\n",
"PyTorch 是一个强大而灵活的深度学习框架,特别适合研究和需要精细控制的场景。\n",
"\n",
"**关键要点:**\n",
"* **张量 (Tensor)** 是核心数据结构,支持 GPU 和自动微分。\n",
"* **`autograd`** 自动计算梯度,是反向传播的基础。\n",
"* **`torch.nn`** 提供了构建神经网络的模块、层和损失函数。\n",
"* **`torch.optim`** 提供了优化算法来更新模型参数。\n",
"* 训练循环通常包括:**前向传播 -> 计算损失 -> 反向传播 -> 更新参数 -> 清零梯度**。\n",
"* **`Dataset` 和 `DataLoader`** 用于高效地加载和批处理数据。\n",
"* 使用 **`.to(device)`** 将模型和数据移动到 GPU (如果可用)。\n",
"* 推荐保存和加载模型的 **`state_dict`**。\n",
"\n",
"PyTorch 的功能非常丰富,本教程只介绍了基础。要深入学习,需要探索更复杂的网络结构 (CNN, RNN), TorchVision/TorchText 等库,以及更高级的训练技巧。官方文档和教程是极佳的学习资源。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 5
}

Some Reflections on Python as a Dynamic Language

As everyone knows, Python is a fully dynamic language, as reflected in the following (a sketch follows the list):

  1. Dynamic type binding
  2. Runtime checking
  3. Object structure and contents can be modified at runtime (not just their values)
  4. Reflection
  5. Everything is an object (instances, classes, methods)
  6. Code can be executed dynamically (eval, exec)
  7. Duck typing
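A minimal sketch of several of these behaviors (standard library only; the class names are made up for illustration):

```python
# Dynamic features of Python in one place.

class Duck:
    def quack(self):
        return "quack"

d = Duck()

# (3) Object structure can be modified at runtime, not just values:
d.color = "white"                    # add a brand-new attribute to an instance
Duck.fly = lambda self: "flying"     # graft a new method onto the class
print(d.color, d.fly())

# (4) Reflection: inspect and call members by name at runtime.
print(hasattr(d, "quack"), getattr(d, "quack")())

# (6) Dynamic code execution:
print(eval("1 + 2"))

# (7) Duck typing: anything with a quack() method is accepted,
#     regardless of declared type or inheritance.
class Robot:
    def quack(self):
        return "beep"

def make_it_quack(obj):
    return obj.quack()

print(make_it_quack(Duck()), make_it_quack(Robot()))
```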

Dynamic languages impose fewer constraints, which makes them easier for beginners to pick up, but the corresponding price is heavy runtime overhead and complete decoupling from the underlying assembly-level execution logic, so you never quite know how the code actually executes.

Beyond that, there are a few defects I consider fairly serious, outlined below.

It breaks the semantics of OOP

Most popular programming languages support the OOP paradigm, i.e., inheritance and polymorphism. Likewise, Python can be written purely imperatively for simple tasks, or with full object-oriented (OOP) structure for complex ones.

However, its dynamic nature undermines the structure of OOP:

  1. Type ambiguity: an instance of any type can have attributes or methods added or removed at runtime (whereas a static language can only modify their values at runtime). An instance modified this way arguably no longer belongs to its original type, since it now differs from it visibly. Yet the instance's built-in __class__ attribute still points to the original type, which muddles any understanding of what the type means. Conforming to a class should not be merely nominal; it should hold in content as well (see the sketch after this list).
  2. Broken inheritance, in two respects:
    1. Most practice skips abstract interface inheritance. The abc module provides ABC, the base class for abstract interfaces; the classic approach is to derive your own abstract class from ABC, derive concrete classes from that abstract class, and implement the abstract methods. But PEP guidance holds that the Pythonic way is to use typing.Protocol instead of ABC: a concrete class inherits from no abstract class at all, and as long as it implements the required methods, a static checker considers it to satisfy the Protocol.
    2. No need to inherit from a concrete parent either. As with the previous point, even a class with no parent (other than object) can define methods with the same names, exposing the same call interface as a parent class would. Semantically, the class definition then reveals no relationship to any other class; the codebase can be a completely loose organization in which no two classes are related by inheritance.
  3. Broken polymorphism: parameters and return values are inherently unconstrained by type. Passing a subtype where a parent type is expected becomes meaningless, again because any type can be dynamically modified to meet the requirement.
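A small sketch of points 1 and 2.1 (requires Python 3.8+ for typing.Protocol; the class names are invented):

```python
from typing import Protocol

class Point:
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

p = Point(1, 2)
del p.y                       # strip an attribute the class declared
p.label = "origin-ish"        # add one it never declared
print(p.__class__)            # still <class '__main__.Point'>, despite the surgery

# typing.Protocol: structural typing with no inheritance at all.
class SupportsArea(Protocol):
    def area(self) -> float: ...

class Square:                 # note: does NOT inherit from SupportsArea
    def __init__(self, side: float):
        self.side = side

    def area(self) -> float:
        return self.side ** 2

def total_area(shape: SupportsArea) -> float:
    return shape.area()

print(total_area(Square(3.0)))  # a static checker accepts this structurally
```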

It undermines design patterns

Classic patterns such as Factory, Abstract Factory, and Visitor rely heavily on inheritance and polymorphism. In Python's design, however, the language's dynamic capabilities render such patterns largely nominal. Among widely used libraries, transformers makes use of design patterns: its from_pretrained family is a Factory, in which a string name selects the concrete constructor for a concrete subclass, while the factory's declared output type is a base class shared by all models. A toy version of the idea is sketched below.
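A toy factory in that spirit (the registry and class names here are hypothetical, not the actual transformers internals):

```python
class BaseModel:
    """Common base class: the factory's declared return type."""
    def predict(self, x):
        raise NotImplementedError

class BertLike(BaseModel):
    def predict(self, x):
        return f"bert({x})"

class GptLike(BaseModel):
    def predict(self, x):
        return f"gpt({x})"

# A string name maps to a concrete constructor, as in from_pretrained.
_REGISTRY = {"bert-like": BertLike, "gpt-like": GptLike}

def from_pretrained(name: str) -> BaseModel:
    return _REGISTRY[name]()

model = from_pretrained("bert-like")
print(type(model).__name__, model.predict("hello"))  # BertLike bert(hello)
```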

Security issues

At the code level, Python generally does not manage pointers directly, so problems like out-of-bounds pointers, wild pointers, and dangling pointers largely do not arise, and the GC handles memory reclamation automatically, so this class of safety issue need not concern the programmer. Python has its own security problems, however. Unmanaged (native) code is comparatively hard to attack: injected code must run reliably without corrupting the original structures, or the program simply crashes (segmentation fault). Python, by contrast, lets arbitrary code be injected to alter the original logic, and since code does not live in a fixed text segment, an attacker needs no such extra care. The contents of globals() and locals() can also be modified at runtime, which carries risks of its own (see the snippet below). Another danger is code failure caused by type mismatches: since types are determined only at runtime, no guarantee can be made ahead of time, and a type error exception may be raised and crash the program.
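A benign illustration of how easily runtime state can be rewritten; the function here is hypothetical, but the mechanism is exactly what injected code could exploit:

```python
def is_authorized(user: str) -> bool:
    return user == "admin"

print(is_authorized("guest"))   # False

# Any code running in the same process can replace the function at runtime:
globals()["is_authorized"] = lambda user: True

print(is_authorized("guest"))   # True: the original logic is silently gone
```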

Summary

I come from a C++ background, but in recent years I have been programming mostly in Python. Python has held the top spot in language market share for years now, and by a wide margin; that is inseparable from its flexibility. For a language aimed at the general public, such qualities are necessary. Even after all of the above criticisms of Python's looseness, a programmer can still choose a rigorous object-oriented style. So the quality of a program depends not on the language but on the programmer. Programmers have a responsibility to write maintainable, clear, and well-organized code ~

KuRRe8 commented May 8, 2025

Back to top

Insights, questions, or just the urge to join the thread: post them here!

Because there are many documents, an ipynb sometimes fails to render due to browser performance; just refresh.

Or git clone the Gist to read it locally.
