@KuRRe8
Last active June 6, 2025 17:35
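A collection of Python tutorials, split into separate files by category

Python Tutorials

Python is a beginner-friendly language, and the machine learning community now depends heavily on languages such as Python, C++, CUDA C, and R, which keeps Python's popularity firmly in first place. This Gist provides a set of Python tutorials that can be run directly in Jupyter Notebook.

  1. Language-level tutorials, which generally skip beginner topics;
  2. Standard library tutorials, covering basic usage of the most common standard library modules;
  3. Third-party library tutorials, mainly common libraries such as numpy and pytorch, covering basic usage only and ignoring new features

Nothing else goes into this Gist. Note that a Gist is still version-controlled by git, so you can git clone it locally, or open the relevant ipynb file directly in Google Colab or Kaggle.

When browsing on the web there is no file list, so press Ctrl + F to search for the relevant contents entry, or click the hyperlinks below.

To contribute, leave a message in the comments; questions are welcome there too ^.^

Contents - Language

Contents - Libraries

Contents - Domain-specific libraries (this tutorial focuses mostly on machine learning and deep learning)

Contents - Appendix

  • sigh.md: personal thoughts on Python as a dynamic language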
{
"cells": [
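{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy - The Foundation of Numerical Computing in Python\n",
"\n",
"Welcome to the NumPy tutorial! NumPy (Numerical Python) is the core library of Python's scientific computing ecosystem. It provides a powerful N-dimensional array object (`ndarray`) along with a wide range of functions for working with these arrays efficiently.\n",
"\n",
"**Why is NumPy so important for ML/DL/data science?**\n",
"\n",
"1. **Efficient numerical operations**: NumPy's core is written in C, and its array operations (vectorized operations) are much faster than looping over plain Python lists.\n",
"2. **The `ndarray` object**: a homogeneous (all elements share one type), multidimensional, fixed-size array, well suited to representing numerical data, vectors, matrices, images, and more.\n",
"3. **Mathematical function library**: a large collection of functions for mathematics, linear algebra, Fourier transforms, and random number generation.\n",
"4. **Foundation of the ecosystem**: Pandas, Scikit-learn, SciPy, Matplotlib, PyTorch, TensorFlow, and nearly every other scientific computing and ML/DL library is either built on NumPy or tightly integrated with it.\n",
"\n",
"**This tutorial covers NumPy's core concepts and common operations:**\n",
"\n",
"1. Creating NumPy arrays (`ndarray`)\n",
"2. Basic array attributes\n",
"3. Array indexing and slicing\n",
"4. Vectorized operations and universal functions (ufuncs)\n",
"5. Broadcasting\n",
"6. Basic linear algebra\n",
"7. Random number generation\n",
"8. Array shape manipulation"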
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup: Importing NumPy\n",
"\n",
"By convention, NumPy is imported under the alias `np`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"print(f\"NumPy version: {np.__version__}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Creating NumPy Arrays (`ndarray`)\n",
"\n",
"There are several ways to create a NumPy array:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create from a Python list or tuple\n",
"list_data = [1, 2, 3, 4, 5]\n",
"arr_from_list = np.array(list_data)\n",
"print(f\"Array from list: {arr_from_list}\")\n",
"print(f\"Type: {type(arr_from_list)}\")\n",
"\n",
"tuple_data = (6, 7, 8)\n",
"arr_from_tuple = np.array(tuple_data)\n",
"print(f\"Array from tuple: {arr_from_tuple}\")\n",
"\n",
"# Create a multidimensional array (for example, from a nested list)\n",
"nested_list = [[1, 2, 3], [4, 5, 6]]\n",
"arr_2d = np.array(nested_list)\n",
"print(f\"\\n2D Array:\\n{arr_2d}\")\n",
"\n",
"# Use built-in functions to create specific arrays\n",
"arr_zeros = np.zeros((2, 3)) # a 2x3 array of zeros (float64 by default)\n",
"print(f\"\\nZeros array (2x3):\\n{arr_zeros}\")\n",
"\n",
"arr_ones = np.ones((3, 2), dtype=int) # a 3x2 array of ones with dtype int\n",
"print(f\"\\nOnes array (3x2, int):\\n{arr_ones}\")\n",
"\n",
"arr_full = np.full((2, 2), 7.5) # a 2x2 array with every element set to 7.5\n",
"print(f\"\\nFull array (2x2, fill 7.5):\\n{arr_full}\")\n",
"\n",
"arr_eye = np.eye(3) # a 3x3 identity matrix\n",
"print(f\"\\nIdentity matrix (3x3):\\n{arr_eye}\")\n",
"\n",
"# Use sequence-generating functions\n",
"arr_arange = np.arange(0, 10, 2) # like Python's range: the stop value is excluded and a step can be given\n",
"print(f\"\\nArray from arange(0, 10, 2): {arr_arange}\")\n",
"\n",
"arr_linspace = np.linspace(0, 1, 5) # 5 evenly spaced numbers over [0, 1] (stop value included)\n",
"print(f\"\\nArray from linspace(0, 1, 5): {arr_linspace}\")\n",
"\n",
"# Use random number functions\n",
"arr_rand = np.random.rand(2, 2) # uniformly distributed random numbers in [0, 1) (2x2)\n",
"print(f\"\\nRandom array (2x2, uniform [0,1)):\\n{arr_rand}\")\n",
"\n",
"arr_randn = np.random.randn(3, 1) # standard normal random numbers (mean 0, variance 1) (3x1)\n",
"print(f\"\\nRandom array (3x1, standard normal):\\n{arr_randn}\")\n",
"\n",
"arr_randint = np.random.randint(0, 10, size=(2, 4)) # random integers in [0, 10) (2x4)\n",
"print(f\"\\nRandom integer array (2x4, [0,10)):\\n{arr_randint}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Basic Array Attributes\n",
"\n",
"Every `ndarray` object has a few attributes that describe it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arr = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n",
"print(f\"Array:\\n{arr}\")\n",
"\n",
"# shape: the array's dimensions (a tuple)\n",
"print(f\"Shape: {arr.shape}\") # (2, 3) -> 2 rows, 3 columns\n",
"\n",
"# ndim: the number of axes (dimensions)\n",
"print(f\"Number of dimensions (ndim): {arr.ndim}\") # 2\n",
"\n",
"# dtype: the data type of the array's elements\n",
"print(f\"Data type (dtype): {arr.dtype}\") # float64 (the default for Python floats)\n",
"\n",
"# size: the total number of elements\n",
"print(f\"Total number of elements (size): {arr.size}\") # 6\n",
"\n",
"# itemsize: the size of each element in bytes\n",
"print(f\"Size of each element in bytes (itemsize): {arr.itemsize}\") # 8 (float64 is 8 bytes)\n",
"\n",
"# nbytes: total bytes used by the array's elements (size * itemsize)\n",
"print(f\"Total bytes consumed by the elements (nbytes): {arr.nbytes}\") # 48"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Array Indexing and Slicing\n",
"\n",
"NumPy array indexing and slicing are very flexible and essential for accessing and manipulating data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1D array indexing and slicing (similar to Python lists)\n",
"arr1d = np.arange(10)\n",
"print(f\"1D Array: {arr1d}\")\n",
"print(f\"Element at index 3: {arr1d[3]}\")\n",
"print(f\"Elements from index 2 to 5 (exclusive): {arr1d[2:5]}\")\n",
"print(f\"Elements from index 5 onwards: {arr1d[5:]}\")\n",
"print(f\"Elements up to index 4 (exclusive): {arr1d[:4]}\")\n",
"print(f\"Every other element: {arr1d[::2]}\")\n",
"print(f\"Reverse array: {arr1d[::-1]}\")\n",
"\n",
"# Multidimensional array indexing and slicing\n",
"arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n",
"print(f\"\\n2D Array:\\n{arr2d}\")\n",
"\n",
"# Access a single element: arr[row, column]\n",
"print(f\"Element at row 1, col 2: {arr2d[1, 2]}\") # 6\n",
"# arr[row][column] also works, but it is usually less efficient because it creates an intermediate array\n",
"print(f\"Element using arr[1][2]: {arr2d[1][2]}\") # 6\n",
"\n",
"# Slicing: separate the slices for each dimension with commas\n",
"print(f\"\\nFirst 2 rows, columns 1 to 3 (exclusive):\\n{arr2d[:2, 1:3]}\")\n",
"# [[2 3]\n",
"# [5 6]]\n",
"\n",
"print(f\"\\nRow at index 1: {arr2d[1, :]}\") # or arr2d[1]\n",
"# [4 5 6]\n",
"\n",
"print(f\"\\nColumn at index 1:\\n{arr2d[:, 1]}\")\n",
"# [2 5 8] (note that a 1D array is returned)\n",
"\n",
"# *** NumPy slices are views ***\n",
"# Modifying a slice changes the original array!\n",
"arr2d_slice = arr2d[:2, :2]\n",
"print(f\"\\nOriginal slice:\\n{arr2d_slice}\")\n",
"arr2d_slice[0, 0] = 99\n",
"print(f\"Slice after modification:\\n{arr2d_slice}\")\n",
"print(f\"Original arr2d after slice modification:\\n{arr2d}\") # the original array changed too!\n",
"\n",
"# If you need a copy, use .copy()\n",
"arr2d_copy = arr2d[:2, :2].copy()\n",
"arr2d_copy[0, 0] = 111\n",
"print(f\"\\nCopy after modification:\\n{arr2d_copy}\")\n",
"print(f\"Original arr2d (should be unchanged by copy modification):\\n{arr2d}\")\n",
"\n",
"# *** Advanced indexing ***\n",
"print(\"\\n--- Advanced Indexing ---\")\n",
"arr_adv = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n",
"\n",
"# 1. Integer array indexing: specify indices with integer arrays\n",
"# Get all columns of rows 0 and 2\n",
"print(f\"Rows 0 and 2:\\n{arr_adv[[0, 2]]}\")\n",
"# Get the elements at (0, 1), (1, 2), (2, 0)\n",
"row_indices = np.array([0, 1, 2])\n",
"col_indices = np.array([1, 2, 0])\n",
"print(f\"Elements at (0,1), (1,2), (2,0): {arr_adv[row_indices, col_indices]}\") # [2 6 7]\n",
"\n",
"# 2. Boolean indexing: select elements with a boolean array\n",
"bool_mask = arr_adv > 5\n",
"print(f\"\\nBoolean mask (arr_adv > 5):\\n{bool_mask}\")\n",
"print(f\"Elements greater than 5: {arr_adv[bool_mask]}\") # returns a 1D array of the elements that satisfy the condition\n",
"\n",
"# The condition can also be used directly\n",
"print(f\"Elements where arr_adv % 2 == 0: {arr_adv[arr_adv % 2 == 0]}\")\n",
"\n",
"# *** Note: advanced indexing always returns a copy of the data, not a view! ***"
]
},
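{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Vectorized Operations and Universal Functions (ufuncs)\n",
"\n",
"NumPy's core strength is **vectorization**: many operations can be applied to an entire array without writing explicit Python loops. These operations are usually implemented as **universal functions (ufuncs)**, which are written in C under the hood and are very efficient.\n",
"\n",
"**Benefits:**\n",
"* Code is more concise and easier to read.\n",
"* Execution is much faster."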
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arr_a = np.array([1, 2, 3])\n",
"arr_b = np.array([4, 5, 6])\n",
"\n",
"# Basic arithmetic (element-wise)\n",
"print(f\"arr_a + arr_b = {arr_a + arr_b}\")\n",
"print(f\"arr_a - arr_b = {arr_a - arr_b}\")\n",
"print(f\"arr_a * arr_b = {arr_a * arr_b}\") # element-wise multiplication\n",
"print(f\"arr_a / arr_b = {arr_a / arr_b}\")\n",
"print(f\"arr_a ** 2 = {arr_a ** 2}\")\n",
"\n",
"# Scalars work too (via broadcasting, see the next section)\n",
"print(f\"arr_a + 5 = {arr_a + 5}\")\n",
"print(f\"arr_a * 2 = {arr_a * 2}\")\n",
"\n",
"# Comparison operations (element-wise)\n",
"print(f\"\\narr_a > 1 = {arr_a > 1}\") # returns a boolean array\n",
"print(f\"arr_a == arr_b = {arr_a == arr_b}\")\n",
"\n",
"# Universal functions (ufuncs)\n",
"print(f\"\\nnp.sqrt(arr_a) = {np.sqrt(arr_a)}\")\n",
"print(f\"np.exp(arr_a) = {np.exp(arr_a)}\") # e^x\n",
"print(f\"np.sin(arr_a) = {np.sin(arr_a)}\")\n",
"print(f\"np.log(arr_a) = {np.log(arr_a)}\") # natural logarithm\n",
"print(f\"np.add(arr_a, arr_b) = {np.add(arr_a, arr_b)}\") # same as arr_a + arr_b\n",
"print(f\"np.maximum(arr_a, np.array([0, 5, 2])) = {np.maximum(arr_a, np.array([0, 5, 2]))}\") # element-wise maximum\n",
"\n",
"# Aggregation functions\n",
"arr_agg = np.array([[1, 2, 3], [4, 5, 6]])\n",
"print(f\"\\nArray for aggregation:\\n{arr_agg}\")\n",
"print(f\"Sum of all elements: {np.sum(arr_agg)} or {arr_agg.sum()}\")\n",
"print(f\"Sum along columns (axis=0): {np.sum(arr_agg, axis=0)} or {arr_agg.sum(axis=0)}\") # [5 7 9]\n",
"print(f\"Sum along rows (axis=1): {np.sum(arr_agg, axis=1)} or {arr_agg.sum(axis=1)}\") # [ 6 15]\n",
"print(f\"Minimum value: {np.min(arr_agg)} or {arr_agg.min()}\")\n",
"print(f\"Maximum value in each column: {np.max(arr_agg, axis=0)}\") # [4 5 6]\n",
"print(f\"Mean of all elements: {np.mean(arr_agg)} or {arr_agg.mean()}\")\n",
"print(f\"Standard deviation: {np.std(arr_agg)} or {arr_agg.std()}\")"
]
},
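{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Broadcasting\n",
"\n",
"Broadcasting is a powerful NumPy mechanism that lets arithmetic operations work on arrays of different shapes, provided the shapes satisfy certain compatibility rules.\n",
"\n",
"**Broadcasting rules:**\n",
"When operating on two arrays, NumPy compares their dimensions one by one (starting from the trailing dimension and working backward):\n",
"1. If the arrays have different numbers of dimensions, the shape of the one with fewer dimensions is padded with 1s on the left until both have the same number of dimensions.\n",
"2. In any given dimension, the two arrays are compatible if their sizes are equal or if one of them has size 1.\n",
"3. If the arrays are compatible in every dimension, they can be broadcast together.\n",
"4. After broadcasting, each array behaves as if its shape had been stretched (copied) along its size-1 dimensions to match the other array's shape.\n",
"5. If there is any dimension in which both sizes are greater than 1 and differ, broadcasting fails and a `ValueError` is raised."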
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Example 1: scalar and array\n",
"arr = np.array([1, 2, 3])\n",
"scalar = 5\n",
"# arr shape: (3,)\n",
"# scalar conceptually has shape ()\n",
"# Broadcasting makes scalar act like [5, 5, 5]\n",
"result = arr + scalar\n",
"print(f\"Array + Scalar:\\n {arr} + {scalar} = {result}\")\n",
"\n",
"# Example 2: 1D array and 2D array\n",
"arr2d = np.array([[10, 20, 30], [40, 50, 60]]) # shape (2, 3)\n",
"arr1d = np.array([1, 2, 3]) # shape (3,)\n",
"# Broadcasting rules:\n",
"# arr2d shape: (2, 3)\n",
"# arr1d shape: (3,) -> promoted to (1, 3)\n",
"# Dimension 2: 3 == 3 (compatible)\n",
"# Dimension 1: 2 vs 1 (compatible, 1 will be stretched)\n",
"# arr1d acts like [[1, 2, 3], [1, 2, 3]]\n",
"result = arr2d + arr1d\n",
"print(f\"\\n2D Array + 1D Array:\\n{arr2d}\\n + \\n{arr1d}\\n = \\n{result}\")\n",
"\n",
"# Example 3: column vector and row vector\n",
"col_vector = np.array([[10], [20], [30]]) # shape (3, 1)\n",
"row_vector = np.array([1, 2, 3]) # shape (3,) -> treated as (1, 3)\n",
"# Broadcasting rules:\n",
"# col_vector shape: (3, 1)\n",
"# row_vector shape: (1, 3)\n",
"# Dimension 2: 1 vs 3 (compatible, 1 stretched to 3)\n",
"# Dimension 1: 3 vs 1 (compatible, 1 stretched to 3)\n",
"# col acts like [[10, 10, 10], [20, 20, 20], [30, 30, 30]]\n",
"# row acts like [[ 1, 2, 3], [ 1, 2, 3], [ 1, 2, 3]]\n",
"result = col_vector + row_vector\n",
"print(f\"\\nColumn Vector + Row Vector:\\n{col_vector}\\n + \\n{row_vector}\\n = \\n{result}\")\n",
"\n",
"# Example 4: incompatible shapes\n",
"arr_a = np.array([[1, 2], [3, 4]]) # shape (2, 2)\n",
"arr_b = np.array([10, 20, 30]) # shape (3,)\n",
"try:\n",
"    result = arr_a + arr_b\n",
"except ValueError as e:\n",
"    print(f\"\\nError broadcasting incompatible shapes (2,2) and (3,): {e}\")"
]
},
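{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Basic Linear Algebra\n",
"\n",
"NumPy provides linear algebra operations, which are used constantly in ML/DL.\n",
"* **Matrix multiplication**: use the `@` operator (Python 3.5+) or the `np.dot()` function.\n",
"* **Other operations**: found in the `np.linalg` submodule, such as inverse, determinant, eigenvalues, and singular value decomposition (SVD)."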
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mat_a = np.array([[1, 2], [3, 4]]) # 2x2 matrix\n",
"mat_b = np.array([[5, 6], [7, 8]]) # 2x2 matrix\n",
"vec_v = np.array([10, 20]) # 1D vector (shape (2,))\n",
"\n",
"print(f\"Matrix A:\\n{mat_a}\")\n",
"print(f\"Matrix B:\\n{mat_b}\")\n",
"print(f\"Vector V: {vec_v}\")\n",
"\n",
"# Element-wise multiplication (recap)\n",
"print(f\"\\nElement-wise product A * B:\\n{mat_a * mat_b}\")\n",
"\n",
"# Matrix multiplication (Dot Product / Matrix Multiplication)\n",
"# 1. Using the @ operator (recommended)\n",
"mat_product = mat_a @ mat_b\n",
"print(f\"\\nMatrix product A @ B:\\n{mat_product}\")\n",
"\n",
"# 2. Using np.dot()\n",
"mat_product_dot = np.dot(mat_a, mat_b)\n",
"print(f\"Matrix product np.dot(A, B):\\n{mat_product_dot}\")\n",
"\n",
"# Matrix-vector multiplication\n",
"mat_vec_product = mat_a @ vec_v # or np.dot(mat_a, vec_v)\n",
"print(f\"\\nMatrix-vector product A @ V: {mat_vec_product}\") # Result is 1D array [50, 110]\n",
"\n",
"# Transpose\n",
"print(f\"\\nTranspose of A (A.T):\\n{mat_a.T}\")\n",
"print(f\"Transpose using np.transpose(A):\\n{np.transpose(mat_a)}\")\n",
"\n",
"# Matrix inverse - only valid for square, invertible matrices\n",
"try:\n",
"    mat_a_inv = np.linalg.inv(mat_a)\n",
"    print(f\"\\nInverse of A:\\n{mat_a_inv}\")\n",
"    # Check that A @ A_inv is approximately the identity matrix\n",
"    print(f\"A @ A_inv (should be close to identity):\\n{mat_a @ mat_a_inv}\")\n",
"except np.linalg.LinAlgError as e:\n",
"    print(f\"\\nCould not compute inverse of A: {e}\")\n",
"\n",
"# Determinant\n",
"det_a = np.linalg.det(mat_a)\n",
"print(f\"\\nDeterminant of A: {det_a:.2f}\")\n",
"\n",
"# Singular Value Decomposition (SVD)\n",
"U, s, Vh = np.linalg.svd(mat_a)\n",
"print(\"\\nSVD of A:\")\n",
"print(f\" U:\\n{U}\")\n",
"print(f\" Singular values (s): {s}\")\n",
"print(f\" Vh (V transpose):\\n{Vh}\")"
]
},
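{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Random Number Generation\n",
"\n",
"The `np.random` module provides rich, efficient random number generation, commonly used for weight initialization, data augmentation, simulation, and more."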
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the random seed so results are reproducible\n",
"# (legacy global-state API; newer code often uses np.random.default_rng instead)\n",
"np.random.seed(42)\n",
"\n",
"# Uniform distribution over [0.0, 1.0)\n",
"rand_uniform = np.random.rand(3, 2) # 3x2 array\n",
"print(f\"Uniform random [0,1) (3x2):\\n{rand_uniform}\")\n",
"\n",
"# Standard normal distribution (mean=0, std=1)\n",
"rand_normal = np.random.randn(4) # 1D array of size 4\n",
"print(f\"\\nStandard normal random (size 4): {rand_normal}\")\n",
"\n",
"# Random integers in [low, high)\n",
"rand_int = np.random.randint(1, 100, size=5) # 5 integers from [1, 100)\n",
"print(f\"\\nRandom integers [1, 100): {rand_int}\")\n",
"\n",
"# Random selection from an array (with or without replacement)\n",
"population = np.arange(10)\n",
"choice_replace = np.random.choice(population, size=5, replace=True) # with replacement\n",
"print(f\"\\nRandom choice (with replacement) from {population}: {choice_replace}\")\n",
"choice_no_replace = np.random.choice(population, size=5, replace=False) # without replacement\n",
"print(f\"Random choice (without replacement) from {population}: {choice_no_replace}\")\n",
"\n",
"# Shuffle an array (in place)\n",
"arr_to_shuffle = np.arange(9)\n",
"np.random.shuffle(arr_to_shuffle)\n",
"print(f\"\\nShuffled array: {arr_to_shuffle}\")\n",
"\n",
"# Random numbers from a specific distribution (for example, the normal distribution)\n",
"mu, sigma = 10, 2 # mean and standard deviation\n",
"normal_dist = np.random.normal(mu, sigma, size=5)\n",
"print(f\"\\nNormal distribution (mean=10, std=2): {normal_dist}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Array Shape Manipulation\n",
"\n",
"Changing an array's shape without changing its data is a common operation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arr = np.arange(12) # [ 0 1 2 3 4 5 6 7 8 9 10 11]\n",
"print(f\"Original array: {arr}, shape: {arr.shape}\")\n",
"\n",
"# reshape(): returns an array with a new shape (a view or a copy, depending on memory layout)\n",
"reshaped_arr = arr.reshape((3, 4))\n",
"print(f\"\\nReshaped array (3x4):\\n{reshaped_arr}\")\n",
"print(f\"Shape after reshape: {reshaped_arr.shape}\")\n",
"\n",
"# -1 tells NumPy to infer that dimension's size\n",
"reshaped_auto = arr.reshape((2, -1)) # -1 is computed automatically as 6\n",
"print(f\"\\nReshaped array (2x-1):\\n{reshaped_auto}\")\n",
"print(f\"Shape with -1: {reshaped_auto.shape}\")\n",
"\n",
"# ravel() or flatten(): flatten a multidimensional array into 1D\n",
"# ravel() usually returns a view (when possible)\n",
"raveled_arr = reshaped_arr.ravel()\n",
"print(f\"\\nRaveled array: {raveled_arr}\")\n",
"# flatten() always returns a copy\n",
"flattened_arr = reshaped_arr.flatten()\n",
"print(f\"Flattened array: {flattened_arr}\")\n",
"\n",
"# Modifying the view returned by ravel affects the original array (if it is a view)\n",
"raveled_arr[0] = 99\n",
"print(f\"Original reshaped_arr after modifying raveled view:\\n{reshaped_arr}\")\n",
"\n",
"# T attribute or transpose(): transpose the array\n",
"print(f\"\\nOriginal reshaped array (3x4):\\n{reshaped_arr}\")\n",
"transposed_arr = reshaped_arr.T\n",
"print(f\"Transposed array (4x3):\\n{transposed_arr}\")\n",
"print(f\"Shape after transpose: {transposed_arr.shape}\")\n",
"\n",
"# Use np.newaxis to add a dimension\n",
"arr1d = np.array([1, 2, 3])\n",
"print(f\"\\nOriginal 1D array: {arr1d}, shape: {arr1d.shape}\")\n",
"row_vec = arr1d[np.newaxis, :] # becomes a row vector\n",
"print(f\"Row vector: {row_vec}, shape: {row_vec.shape}\")\n",
"col_vec = arr1d[:, np.newaxis] # becomes a column vector\n",
"print(f\"Column vector:\\n{col_vec}, shape: {col_vec.shape}\")"
]
},
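{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"NumPy is the foundation of scientific computing, machine learning, and deep learning in Python. Its core `ndarray` object and the functions around it provide powerful, efficient handling of numerical data.\n",
"\n",
"**Key takeaways:**\n",
"* `ndarray` stores and manipulates homogeneous data efficiently.\n",
"* Indexing and slicing give flexible access to data.\n",
"* Vectorized operations and ufuncs improve performance significantly and avoid Python loops.\n",
"* Broadcasting allows arrays of different shapes to be combined in operations.\n",
"* `np.linalg` provides basic linear algebra functionality.\n",
"* `np.random` generates many kinds of random numbers.\n",
"* `reshape`, `ravel`, `transpose`, and related functions manipulate array shapes.\n",
"\n",
"Fluency with NumPy is an essential skill for data science and machine learning work. Practice often, and consult the official NumPy documentation to learn about more advanced features."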
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 5
}

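Some Reflections on Python as a Dynamic Language

As is well known, Python is a thoroughly dynamic language, which shows in:

  1. Dynamic type binding
  2. Runtime checking
  3. Object structure can be modified dynamically (not just values)
  4. Reflection
  5. Everything is an object (instance, class, method)
  6. Code can be executed dynamically (eval, exec)
  7. Duck typing support

Dynamic languages impose fewer constraints and are easier for newcomers to pick up, but the price is substantial runtime overhead, and the code is so decoupled from the underlying machine-level execution that it is hard to know how it actually runs.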
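Several of the dynamic features listed above fit into a few lines. The following is only a minimal sketch; the `Duck` and `Robot` classes and the `make_it_speak` helper are hypothetical names invented for illustration.

```python
import types


class Duck:
    def speak(self):
        return "Quack"


class Robot:
    # Unrelated to Duck by inheritance, but exposes the same method name.
    def speak(self):
        return "Beep"


def make_it_speak(thing):
    # Duck typing: only the presence of .speak() matters, not the class.
    return thing.speak()


# Dynamic type binding: the same name is rebound to objects of different types.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

# Object structure can be changed at runtime, not just attribute values.
duck = Duck()
duck.name = "Donald"  # add a data attribute
duck.fly = types.MethodType(lambda self: f"{self.name} flies", duck)  # add a method

# Reflection: look up a method by its string name, then call it.
reflected = getattr(duck, "speak")()

# Dynamic code execution.
evaluated = eval("1 + 2 * 3")

sounds = [make_it_speak(obj) for obj in (duck, Robot())]
print(first_type, second_type, duck.fly(), reflected, evaluated, sounds)
```

Nothing here is checked before the program runs: each lookup is resolved at the moment the line executes, which is where both the flexibility and the runtime cost come from.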

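There are also a few points I consider fairly serious flaws. They are laid out below.

It breaks the semantics of OOP

Most popular programming languages support the OOP paradigm, that is, inheritance and polymorphism. Python likewise can be written in a purely imperative style (Imperative Programming) for simple tasks, or in a full object-oriented style for complex ones.

However, its dynamic features undermine the structure of OOP:

  1. Blurred types: an instance of almost any class can have attributes or methods added or removed at runtime (built-in types and classes that define __slots__ are exceptions), whereas static languages only let you change their values at runtime. An instance modified this way arguably no longer belongs to its original type, since it now clearly differs from it. Yet the instance's built-in __class__ attribute still points to the original type, which muddies one's understanding of what the type means. Conforming to a class should not be merely nominal; it should hold in content as well.
  2. Broken inheritance, in two respects:
    1. Most practice does without virtual-interface inheritance. The abc module provides ABC, a base class for virtual interfaces; the classic approach is to have your own abstract class inherit from ABC, then have concrete classes inherit from that abstract class and implement the abstract methods. PEP 544, however, introduced typing.Protocol as an alternative to ABC that is often regarded as more Pythonic: the concrete class inherits from no abstract class at all, and as long as it implements the required methods, a static checker treats it as conforming to the Protocol.
    2. There is no need to inherit from a concrete parent class. As with the previous point, even a class with no parent at all (other than object) can still define methods with the same names and so offer the same call interface as the parent's methods. Semantically, the class definition then reveals no relationship to any other class. The whole program can be a loose organization in which no two classes are related by inheritance.
  3. Broken polymorphism: no parameter or return value is inherently restricted in type. This makes passing a subtype where a parent type is expected seem pointless, again because any type can be modified dynamically to meet the requirement.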
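The contrast between nominal and structural conformance described above can be sketched as follows. All class names (`Shape`, `Square`, `Circle`, `HasArea`) are hypothetical, and `runtime_checkable` is used here only so the structural check can be observed with `isinstance`; a static checker would normally do this job.

```python
import math
from abc import ABC, abstractmethod
from typing import Iterable, Protocol, runtime_checkable


class Shape(ABC):
    """Nominal interface: concrete classes must inherit from it."""

    @abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


@runtime_checkable
class HasArea(Protocol):
    """Structural interface: anything with an area() method conforms."""

    def area(self) -> float: ...


class Circle:  # no base class other than object
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


def total_area(shapes: Iterable[HasArea]) -> float:
    return sum(shape.area() for shape in shapes)


total = total_area([Square(2.0), Circle(1.0)])

# "Blurred types": strip the data the class relies on, then ask what it is.
broken = Square(3.0)
del broken.side
still_reports_square = broken.__class__ is Square and isinstance(broken, Square)
try:
    broken.area()
    area_failed = False
except AttributeError:
    area_failed = True

print(total, still_reports_square, area_failed)
```

`Circle` satisfies `HasArea` without inheriting from anything, while `broken` still reports itself as a `Square` even though it can no longer compute an area.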

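It breaks design patterns

Classic patterns such as the factory pattern, abstract factory, and visitor pattern all rely heavily on inheritance and polymorphism. In Python's design, however, the language's dynamic capabilities leave design patterns largely moot. Among commonly used libraries, transformers does use a design pattern: its from_pretrained family is a factory pattern, which uses a string name to determine the concrete constructor and obtain a concrete subclass, while the factory constructor's output type is a base class shared by all models.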
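A string-keyed factory of the kind described above might look roughly like this. It is a generic sketch, not how transformers implements from_pretrained; every name in it (`BaseModel`, `register`, `from_name`, the model names) is hypothetical.

```python
class BaseModel:
    """Hypothetical common base class for every model in this sketch."""

    def describe(self) -> str:
        raise NotImplementedError


# Maps a string name to the concrete class that should be constructed.
_REGISTRY = {}


def register(name):
    """Class decorator that records a concrete model under a string name."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("tiny-encoder")
class TinyEncoder(BaseModel):
    def describe(self) -> str:
        return "encoder"


@register("tiny-decoder")
class TinyDecoder(BaseModel):
    def describe(self) -> str:
        return "decoder"


def from_name(name: str) -> BaseModel:
    """Factory: the caller passes a string and gets back some BaseModel subclass."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown model name: {name!r}") from None
    return cls()


model = from_name("tiny-encoder")
print(type(model).__name__, model.describe())
```

The declared return type is the shared base class, while the concrete class is chosen only at runtime from the string.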

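Security concerns

Python code generally does not manage pointers directly, so out-of-bounds pointers, wild pointers, dangling pointers, and similar problems generally do not arise. The gc mechanism also handles garbage collection automatically, so these safety issues need no attention while coding. In exchange, however, Python has security problems of its own. Attacking unmanaged code used to be relatively hard: injected code that is meant to run reliably has to avoid corrupting the original structure and crashing the program outright (segmentation fault). Python, by contrast, can have arbitrary code injected directly to alter the original logic, and because that logic is not fixed content in a code segment, the attacker needs no such extra care. The contents of globals() locals() can be modified by hand at runtime, which also carries some risk. Another danger is code execution problems caused by type mismatches: since types are only determined at runtime, no guarantee can be given in advance, and a type-error exception may be raised and crash the program.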
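Two of the risks just mentioned can be reproduced in a few lines, assuming the code runs as an ordinary module-level script. The function names are hypothetical, and only globals() is shown because writes through locals() inside a function are not reliable.

```python
def greet(name):
    return "Hello, " + name


def shout(name):
    # greet is looked up by name in the module globals on every call.
    return greet(name).upper()


before = shout("Ada")

# Rebinding the name through globals() silently changes what shout() runs.
original_greet = greet
globals()["greet"] = lambda name: "Goodbye, " + name
after = shout("Ada")
globals()["greet"] = original_greet  # restore the original binding

# A type mismatch only surfaces when the offending line actually executes.
try:
    greet(42)
    type_error_message = ""
except TypeError as exc:
    type_error_message = str(exc)

print(before, after, type_error_message)
```

No part of `shout` was edited, yet its behavior changed, and the bad argument type went unnoticed until the call was made.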

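Summary

I come from a C++ background, but in recent years I have been programming in Python all along. Python's market share has been number one for years, by a wide margin, and that is inseparable from its flexibility. For a programming language aimed at the general public, such traits are necessary. Despite all the lax aspects of Python described above, programmers can still choose a rigorous object-oriented style. So the quality of a program depends not on the language but on the programmer. Programmers have a responsibility to write maintainable, clear, well-structured code~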


KuRRe8 commented May 8, 2025

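Back to top

Insights, questions, or just casual chatter are all welcome here!

Because there are many documents, an ipynb that fails to render is sometimes a browser performance issue; refreshing fixes it

Or git clone it locally to read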
