@KuRRe8
Last active June 6, 2025 17:35
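Python-related tutorials, split into separate files by category

Python Tutorials

Python is a beginner-friendly language, and the machine learning community now relies heavily on Python alongside C++, CUDA C, and R, which keeps Python's popularity firmly in first place. This Gist provides Python tutorials that can be run directly in Jupyter Notebook.

  1. Language-level tutorials, which generally skip beginner topics;
  2. Standard library tutorials, covering basic usage of the most common standard-library modules;
  3. Third-party library tutorials, mainly common libraries such as numpy and pytorch, covering basic usage only and not new features.

Nothing else will be added to this Gist. Note that a Gist is still version-controlled by git, so you can git clone it locally, or open the relevant ipynb file directly in Google Colab or Kaggle.

When browsing on the web there is no file list, so press Ctrl + F to search for the relevant section, or click the hyperlinks below.

If you would like to contribute, leave a comment; questions are welcome in the comments too ^.^

Contents: Language

Contents: Libraries

Contents: Domain-specific libraries (this tutorial focuses mostly on machine learning and deep learning)

Contents: Appendix

  • sigh.md: personal views on Python as a dynamic language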
{
"cells": [
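{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python C Extensions and `ctypes`/`cffi` Tutorial\n",
"\n",
"Welcome to the Python C extensions and `ctypes`/`cffi` tutorial! It shows how to call functions and libraries written in C from Python, mainly through two tools: `ctypes` (standard library) and `cffi` (third-party).\n",
"\n",
"**Why interact with C code?**\n",
"\n",
"1. **Performance**: for compute-intensive code, a C implementation is usually much faster than pure Python.\n",
"2. **Reusing existing C libraries**: many mature, efficient libraries are written in C/C++, and calling them from Python reuses that functionality.\n",
"3. **Access to low-level system features**: some operating-system functionality is only exposed through a C API.\n",
"\n",
"**Main approaches:**\n",
"\n",
"* **Python C API (writing C extensions)**: write a Python extension module directly in C. This is the most powerful but also the most complex approach, and it requires a solid understanding of Python internals. This tutorial does not cover writing native C extensions; it focuses on calling existing C code.\n",
"* **`ctypes`**: a foreign function interface (FFI) library included in the Python standard library. It loads dynamic/shared libraries (`.dll`, `.so`, `.dylib`) directly from Python and calls their functions without any extra C code.\n",
"* **`cffi`**: another popular FFI library (requires `pip install cffi`). It offers a higher-level abstraction than `ctypes` and often better performance, especially for complex C APIs. It supports API mode (a C compiler builds a wrapper against the real C headers) and ABI mode (similar to `ctypes`, talking to the binary interface directly).\n",
"* **Cython**: a superset of Python with optional static typing that compiles Python-like code into efficient C extensions. It sits between pure Python and pure C and is another strong option, but this tutorial focuses on `ctypes` and `cffi`.\n",
"\n",
"**Prerequisites:**\n",
"\n",
"* Basic Python knowledge.\n",
"* Some familiarity with C (data types, pointers, functions) helps a lot.\n",
"* A C compiler (such as GCC, Clang, or MSVC) to build the example C code.\n",
"\n",
"**This tutorial covers:**\n",
"\n",
"1. Preparing a simple example C library.\n",
"2. Calling C functions with `ctypes`.\n",
"3. Calling C functions with `cffi` (ABI mode).\n",
"4. Calling C functions with `cffi` (API mode).\n",
"5. Passing different kinds of data (basic types, pointers, structs, callbacks).\n",
"6. `ctypes` vs `cffi`: comparison and how to choose."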
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Preparing a simple example C library\n",
"\n",
"We will create a small C library with a few basic functions to call from Python in the later sections.\n",
"\n",
"**Contents of `my_c_library.c`:**\n",
"\n",
"```c\n",
"// my_c_library.c\n",
"#include <stdio.h>\n",
"#include <stdlib.h> // For malloc, free\n",
"#include <string.h> // For strlen, strcpy, strcat\n",
"\n",
"// Export macro (the name MYLIB_API is this tutorial's own choice):\n",
"// MSVC only exports functions that are marked __declspec(dllexport)\n",
"#ifdef _MSC_VER\n",
"#define MYLIB_API __declspec(dllexport)\n",
"#else\n",
"#define MYLIB_API\n",
"#endif\n",
"\n",
"// Keep symbol names unmangled when built with a C++ compiler\n",
"#ifdef __cplusplus\n",
"extern \"C\" {\n",
"#endif\n",
"\n",
"// Simple addition function\n",
"MYLIB_API int add_integers(int a, int b) {\n",
"    printf(\"C: add_integers(%d, %d) called\\n\", a, b);\n",
"    return a + b;\n",
"}\n",
"\n",
"// String function: builds a greeting and returns a newly allocated buffer.\n",
"// Note: allocating inside the library and freeing in the caller is error-prone.\n",
"// A better design lets the caller pass in a buffer; at minimum the library must\n",
"// provide a matching release function (free_greeting_string below).\n",
"// It is used here only to demonstrate pointers and strings.\n",
"MYLIB_API char* greet_person(const char* name) {\n",
"    printf(\"C: greet_person(\\\"%s\\\") called\\n\", name);\n",
"    const char* prefix = \"Hello, \";\n",
"    // The +1 is for the trailing '\\0'\n",
"    char* greeting = (char*)malloc(strlen(prefix) + strlen(name) + 1);\n",
"    if (greeting == NULL) {\n",
"        return NULL; // Allocation failed\n",
"    }\n",
"    strcpy(greeting, prefix);\n",
"    strcat(greeting, name);\n",
"    return greeting;\n",
"}\n",
"\n",
"// Release memory allocated by greet_person\n",
"MYLIB_API void free_greeting_string(char* greeting_str) {\n",
"    printf(\"C: free_greeting_string called for string at %p\\n\", (void*)greeting_str);\n",
"    if (greeting_str != NULL) {\n",
"        free(greeting_str);\n",
"    }\n",
"}\n",
"\n",
"// Struct example\n",
"typedef struct {\n",
"    int id;\n",
"    double value;\n",
"} PointData;\n",
"\n",
"// Take a struct pointer and modify the struct in place\n",
"MYLIB_API void process_point_data(PointData* p_data) {\n",
"    if (p_data != NULL) {\n",
"        printf(\"C: process_point_data called. Original id=%d, value=%.2f\\n\", p_data->id, p_data->value);\n",
"        p_data->id += 100;\n",
"        p_data->value *= 2.0;\n",
"        printf(\"C: Modified id=%d, value=%.2f\\n\", p_data->id, p_data->value);\n",
"    }\n",
"}\n",
"\n",
"// Callback example\n",
"// Type of the callback function\n",
"typedef int (*callback_func_type)(int, int);\n",
"\n",
"// C function that invokes the callback\n",
"MYLIB_API int apply_callback(int x, int y, callback_func_type cb) {\n",
"    printf(\"C: apply_callback called with x=%d, y=%d\\n\", x, y);\n",
"    if (cb != NULL) {\n",
"        return cb(x, y);\n",
"    }\n",
"    return -1; // Error, or no callback supplied\n",
"}\n",
"\n",
"#ifdef __cplusplus\n",
"}\n",
"#endif\n",
"```\n",
"\n",
"**Compiling the C library:**\n",
"The C code above has to be compiled into a dynamic/shared library.\n",
"\n",
"* **Linux/macOS (GCC/Clang):**\n",
"    ```bash\n",
"    # Save my_c_library.c in the current directory first\n",
"    gcc -shared -o libmyclibrary.so -fPIC my_c_library.c # Linux\n",
"    # or\n",
"    clang -shared -o libmyclibrary.dylib my_c_library.c # macOS\n",
"    ```\n",
"    (`-fPIC` generates position-independent code, which shared libraries require.)\n",
"\n",
"* **Windows (MinGW GCC):**\n",
"    ```bash\n",
"    gcc -shared -o myclibrary.dll my_c_library.c\n",
"    ```\n",
"\n",
"* **Windows (MSVC, Visual Studio command-line tools):**\n",
"    ```bash\n",
"    cl /LD my_c_library.c /Fe:myclibrary.dll\n",
"    ```\n",
"\n",
"**Important**: make sure the generated library file (`libmyclibrary.so`, `libmyclibrary.dylib`, or `myclibrary.dll`) is somewhere it can be found when loading. Typically that means one of:\n",
"1. The same directory as the Python script (the code in this notebook loads it through an explicit full path).\n",
"2. A standard system library path (for example `/usr/lib` or `/usr/local/lib`).\n",
"3. A directory listed in the `LD_LIBRARY_PATH` (Linux/macOS) or `PATH` (Windows) environment variable.\n",
"\n",
"For convenience, this tutorial assumes that you have compiled the C code successfully and placed the resulting library in the **same directory** as this Jupyter Notebook, named for your operating system (for example `libmyclibrary.so` on Linux)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Calling C functions with `ctypes`\n",
"\n",
"`ctypes` is part of the Python standard library and can load and use compiled C shared libraries directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ctypes\n",
"import os\n",
"import platform\n",
"\n",
"# Pick the library file name for the current operating system\n",
"if platform.system() == \"Windows\":\n",
"    lib_name = \"myclibrary.dll\"\n",
"elif platform.system() == \"Darwin\":  # macOS\n",
"    lib_name = \"libmyclibrary.dylib\"\n",
"else:  # Linux and other Unix-like systems\n",
"    lib_name = \"libmyclibrary.so\"\n",
"\n",
"# Build the full path (assumes the library is in the notebook's working directory)\n",
"lib_path = os.path.join(os.getcwd(), lib_name)\n",
"\n",
"print(f\"Attempting to load C library: {lib_path}\")\n",
"\n",
"try:\n",
"    # Load the shared library.\n",
"    # CDLL suits libraries that use the standard cdecl calling convention;\n",
"    # WinDLL (Windows only) is for stdcall, and PyDLL keeps the GIL held during calls.\n",
"    c_lib = ctypes.CDLL(lib_path)\n",
"    print(f\"Successfully loaded {lib_name}\")\n",
"except OSError as e:\n",
"    print(f\"Error loading library {lib_name}: {e}\")\n",
"    print(\"Please ensure you have compiled 'my_c_library.c' into a shared library\")\n",
"    print(\"and placed it in the same directory as this notebook.\")\n",
"    c_lib = None  # Marks that the library failed to load\n",
"\n",
"if c_lib:\n",
"    # --- 1. Call the simple integer addition function ---\n",
"    print(\"\\n--- Calling add_integers --- \")\n",
"    # Look up the function\n",
"    add_func_ctypes = c_lib.add_integers\n",
"\n",
"    # **Important**: declare the argument types (argtypes) and return type (restype).\n",
"    # This is essential for type safety and correct data conversion.\n",
"    add_func_ctypes.argtypes = [ctypes.c_int, ctypes.c_int]  # two int arguments\n",
"    add_func_ctypes.restype = ctypes.c_int  # returns int\n",
"\n",
"    result_add = add_func_ctypes(10, 25)\n",
"    print(f\"Python: add_integers(10, 25) result = {result_add}\")\n",
"\n",
"    # --- 2. Call the string function (returns a malloc'ed char*) ---\n",
"    print(\"\\n--- Calling greet_person --- \")\n",
"    greet_func_ctypes = c_lib.greet_person\n",
"    greet_func_ctypes.argtypes = [ctypes.c_char_p]  # the parameter is a char*\n",
"    # Use c_void_p rather than c_char_p as restype: a c_char_p result is converted\n",
"    # to a Python bytes object and the original address, needed to free it, is lost.\n",
"    greet_func_ctypes.restype = ctypes.c_void_p\n",
"\n",
"    free_greeting_func_ctypes = c_lib.free_greeting_string\n",
"    free_greeting_func_ctypes.argtypes = [ctypes.c_void_p]\n",
"    free_greeting_func_ctypes.restype = None  # void return type\n",
"\n",
"    # A Python str must be encoded to bytes (UTF-8 is the usual choice)\n",
"    py_name = \"Alice (ctypes)\"\n",
"    c_name = py_name.encode('utf-8')\n",
"\n",
"    # Call the C function; the result is an integer address, or None for NULL\n",
"    c_greeting_ptr = greet_func_ctypes(c_name)\n",
"\n",
"    if c_greeting_ptr:\n",
"        # Copy the NUL-terminated string at that address into a Python str\n",
"        py_greeting = ctypes.string_at(c_greeting_ptr).decode('utf-8')\n",
"        print(f\"Python: Greeting from C: {py_greeting}\")\n",
"\n",
"        # **Important**: release the memory allocated by the C library\n",
"        free_greeting_func_ctypes(c_greeting_ptr)\n",
"        print(f\"Python: Called C to free greeting string for '{py_name}'.\")\n",
"    else:\n",
"        print(f\"Python: greet_person returned NULL for '{py_name}'.\")\n",
"\n",
"    # --- 3. Work with a struct ---\n",
"    print(\"\\n--- Processing PointData struct --- \")\n",
"    # Define a ctypes.Structure that matches the C struct\n",
"    class PointDataStruct(ctypes.Structure):\n",
"        _fields_ = [(\"id\", ctypes.c_int),\n",
"                    (\"value\", ctypes.c_double)]\n",
"\n",
"    # Create a struct instance\n",
"    point_instance_ctypes = PointDataStruct(id=10, value=3.14)\n",
"    print(f\"Python: Initial PointData: id={point_instance_ctypes.id}, value={point_instance_ctypes.value:.2f}\")\n",
"\n",
"    process_point_func_ctypes = c_lib.process_point_data\n",
"    process_point_func_ctypes.argtypes = [ctypes.POINTER(PointDataStruct)]  # pointer to the struct\n",
"    process_point_func_ctypes.restype = None\n",
"\n",
"    # Pass a pointer to the instance (ctypes.byref() or ctypes.pointer())\n",
"    process_point_func_ctypes(ctypes.byref(point_instance_ctypes))\n",
"    # process_point_func_ctypes(ctypes.pointer(point_instance_ctypes))  # also works\n",
"\n",
"    print(f\"Python: Modified PointData: id={point_instance_ctypes.id}, value={point_instance_ctypes.value:.2f}\")\n",
"\n",
"    # --- 4. Work with a callback ---\n",
"    print(\"\\n--- Using callback function --- \")\n",
"    # Python implementation of the callback\n",
"    def python_callback(a, b):\n",
"        print(f\"Python callback: called with {a}, {b}\")\n",
"        return a * b  # example: multiply\n",
"\n",
"    # Wrap the Python function in a ctypes callback type.\n",
"    # The first argument is the return type, followed by the parameter types.\n",
"    CALLBACK_FUNC_TYPE_CTYPES = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)\n",
"    # Keep a reference to this object for as long as C may call it\n",
"    c_callback_ctypes = CALLBACK_FUNC_TYPE_CTYPES(python_callback)\n",
"\n",
"    apply_callback_func_ctypes = c_lib.apply_callback\n",
"    apply_callback_func_ctypes.argtypes = [ctypes.c_int, ctypes.c_int, CALLBACK_FUNC_TYPE_CTYPES]\n",
"    apply_callback_func_ctypes.restype = ctypes.c_int\n",
"\n",
"    callback_result = apply_callback_func_ctypes(7, 6, c_callback_ctypes)\n",
"    print(f\"Python: Result from C apply_callback (using Python callback): {callback_result}\")\n",
"else:\n",
"    print(\"Skipping ctypes examples as C library was not loaded.\")"
]
},
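{
"cell_type": "markdown",
"metadata": {},
"source": [
"**`ctypes` key points:**\n",
"* Declare the parameter types (`argtypes`) and return type (`restype`) of every C function explicitly; without them `ctypes` assumes `int`.\n",
"* A Python string must be encoded to bytes before it is passed to C (`ctypes.c_char_p` expects bytes).\n",
"* A `char*` returned from C arrives as bytes and has to be decoded into a Python string.\n",
"* **Memory management**: if the C library allocates memory and returns a pointer, the Python code normally has to call the matching release function provided by the library to avoid a leak. Declare such a return value as `ctypes.c_void_p` (not `c_char_p`) so the original address is still available to pass back.\n",
"* Structs are mapped by subclassing `ctypes.Structure`.\n",
"* Callbacks are created with `ctypes.CFUNCTYPE` (or `WINFUNCTYPE` for `stdcall` on Windows)."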
]
},
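{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Calling C functions with `cffi` (ABI mode)\n",
"\n",
"`cffi` (C Foreign Function Interface for Python) is a powerful third-party library that is usually easier to use than `ctypes` and often performs better, especially for complex C APIs.\n",
"Install it first: `pip install cffi`\n",
"\n",
"**ABI (Application Binary Interface) mode**:\n",
"This mode resembles `ctypes`: it talks directly to the binary interface of an already compiled shared library. No C compiler or header file is needed at run time, but the function signatures must be declared by hand with `ffi.cdef()`."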
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
"    from cffi import FFI\n",
"    cffi_available = True\n",
"except ImportError:\n",
"    print(\"cffi library not found. Please install it: pip install cffi\")\n",
"    cffi_available = False\n",
"\n",
"if cffi_available and c_lib:  # c_lib (ctypes) only tells us the library file could be loaded\n",
"    print(f\"\\n=== CFFI ABI Mode Examples (library: {lib_path}) ===\")\n",
"    ffi_abi = FFI()\n",
"\n",
"    # cffi does not reuse the ctypes handle and does not guess signatures:\n",
"    # every function called through the library object must first be declared with cdef()\n",
"    ffi_abi.cdef(\"\"\"\n",
"        int add_integers(int a, int b);\n",
"        char *greet_person(const char *name);\n",
"        void free_greeting_string(char *greeting_str);\n",
"    \"\"\")\n",
"\n",
"    # Load the library with cffi's own dlopen()\n",
"    try:\n",
"        c_lib_cffi_abi = ffi_abi.dlopen(lib_path)\n",
"        print(f\"Successfully loaded {lib_name} using cffi.dlopen()\")\n",
"    except OSError as e:\n",
"        print(f\"Error loading library with cffi.dlopen(): {e}\")\n",
"        c_lib_cffi_abi = None\n",
"\n",
"    if c_lib_cffi_abi:\n",
"        # --- 1. Call the simple integer addition function ---\n",
"        print(\"\\n--- CFFI ABI: Calling add_integers --- \")\n",
"        # Python ints are converted to C int according to the cdef() declaration;\n",
"        # no ffi.cast() is needed\n",
"        result_add_cffi_abi = c_lib_cffi_abi.add_integers(15, 30)\n",
"        print(f\"Python (CFFI ABI): add_integers(15, 30) result = {result_add_cffi_abi}\")\n",
"\n",
"        # --- 2. Call the string function ---\n",
"        print(\"\\n--- CFFI ABI: Calling greet_person --- \")\n",
"        py_name_cffi = \"Bob (cffi-abi)\"\n",
"        # A char* argument accepts bytes, not str, so encode explicitly.\n",
"        # The returned char* is a CData pointer that still refers to C memory.\n",
"        c_greeting_ptr_cffi_abi = c_lib_cffi_abi.greet_person(py_name_cffi.encode('utf-8'))\n",
"\n",
"        if c_greeting_ptr_cffi_abi != ffi_abi.NULL:\n",
"            # ffi.string() copies a char* (ffi.CData <char*>) into Python bytes\n",
"            py_greeting_cffi_abi_bytes = ffi_abi.string(c_greeting_ptr_cffi_abi)\n",
"            py_greeting_cffi_abi = py_greeting_cffi_abi_bytes.decode('utf-8')\n",
"            print(f\"Python (CFFI ABI): Greeting from C: {py_greeting_cffi_abi}\")\n",
"\n",
"            # Release the memory through the library's own function\n",
"            c_lib_cffi_abi.free_greeting_string(c_greeting_ptr_cffi_abi)\n",
"            print(f\"Python (CFFI ABI): Called C to free greeting string for '{py_name_cffi}'.\")\n",
"        else:\n",
"            print(f\"Python (CFFI ABI): greet_person returned NULL for '{py_name_cffi}'.\")\n",
"\n",
"    else:\n",
"        print(\"Skipping CFFI ABI examples as C library (c_lib_cffi_abi) was not loaded.\")\n",
"elif c_lib:  # cffi is unavailable, but ctypes did load the library\n",
"    print(\"cffi not available, skipping CFFI ABI examples.\")"
]
},
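{
"cell_type": "markdown",
"metadata": {},
"source": [
"**ABI mode key points:**\n",
"* Load the library with `ffi.dlopen()`.\n",
"* Functions must be declared with `ffi.cdef()` before they can be called through the library object; arguments are then converted according to the declared types, so `ffi.cast()` is not needed for ordinary calls.\n",
"* `ffi.string()` converts a C `char*` into Python bytes.\n",
"* A Python string must be encoded to bytes before it is passed as a `char*`.\n",
"* Responsibility for memory management is the same as with `ctypes`.\n",
"* The declarations are not checked against the real headers, so a mistake in `cdef()` can crash the process. ABI mode suits simple cases or situations where no C compiler is available; for complex APIs, API mode is the better choice."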
]
},
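{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Calling C functions with `cffi` (API mode)\n",
"\n",
"**API (Application Programming Interface) mode**:\n",
"This is the mode `cffi` recommends for serious use. It works from C declarations (copied from a header or written directly in Python) so that `cffi` understands the structure and types of the C API.\n",
"\n",
"A note on terminology: in the `cffi` documentation, API mode strictly means building a small C extension with `ffi.set_source()` and `ffi.compile()`, so that a C compiler checks the declarations against the real headers. Loading a prebuilt library with `ffi.dlopen()`, as the example below does, is still ABI-level access; what this section adds over section 3 is the full set of declarations, including structs and callbacks.\n",
"\n",
"Steps:\n",
"1. Create an `FFI` object: `ffi = FFI()`\n",
"2. Provide declarations of the C functions, types, and structs with `ffi.cdef(\"C declarations...\")`.\n",
"3. Load the library with `ffi.dlopen(\"library_path\")`, or, for true API mode, call `ffi.set_source(...)` followed by `ffi.compile()` to build a small C wrapper module (this replaces the deprecated `ffi.verify()`)."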
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if cffi_available:\n",
"    print(\"\\n=== CFFI API Mode Examples ===\")\n",
"    ffi_api = FFI()\n",
"\n",
"    # 1. Provide the C declarations (usually copied or simplified from a header)\n",
"    ffi_api.cdef(\"\"\"\n",
"        // Function declarations\n",
"        int add_integers(int a, int b);\n",
"        char* greet_person(const char* name);\n",
"        void free_greeting_string(char* greeting_str);\n",
"\n",
"        // Struct declaration (cffi handles the typedef)\n",
"        typedef struct {\n",
"            int id;\n",
"            double value;\n",
"        } PointData;\n",
"        // Alternatively: struct PointData_tag { int id; double value; };\n",
"        //                typedef struct PointData_tag PointData;\n",
"\n",
"        void process_point_data(PointData* p_data);\n",
"\n",
"        // Callback function type declaration\n",
"        typedef int (*callback_func_type)(int, int);\n",
"        int apply_callback(int x, int y, callback_func_type cb);\n",
"    \"\"\")\n",
"\n",
"    # 2. Load the library.\n",
"    # ffi.dlopen() is used because the library is already compiled.\n",
"    # Building a wrapper with ffi.set_source() + ffi.compile() is more involved\n",
"    # but lets a C compiler verify the declarations.\n",
"    try:\n",
"        c_lib_cffi_api = ffi_api.dlopen(lib_path)\n",
"        print(f\"Successfully loaded {lib_name} for CFFI API mode.\")\n",
"    except OSError as e:\n",
"        print(f\"Error loading library for CFFI API mode: {e}\")\n",
"        c_lib_cffi_api = None\n",
"\n",
"    if c_lib_cffi_api:\n",
"        # --- 1. Call the simple integer addition function ---\n",
"        print(\"\\n--- CFFI API: Calling add_integers --- \")\n",
"        result_add_cffi_api = c_lib_cffi_api.add_integers(20, 35)\n",
"        print(f\"Python (CFFI API): add_integers(20, 35) result = {result_add_cffi_api}\")\n",
"\n",
"        # --- 2. Call the string function ---\n",
"        print(\"\\n--- CFFI API: Calling greet_person --- \")\n",
"        py_name_cffi_api = \"Charlie (cffi-api)\"\n",
"        # A const char* argument takes bytes, so the str is encoded explicitly\n",
"        c_greeting_ptr_cffi_api = c_lib_cffi_api.greet_person(py_name_cffi_api.encode('utf-8'))\n",
"\n",
"        if c_greeting_ptr_cffi_api != ffi_api.NULL:\n",
"            # ffi.string() copies the char* into Python bytes\n",
"            py_greeting_cffi_api_bytes = ffi_api.string(c_greeting_ptr_cffi_api)\n",
"            py_greeting_cffi_api = py_greeting_cffi_api_bytes.decode('utf-8')\n",
"            print(f\"Python (CFFI API): Greeting from C: {py_greeting_cffi_api}\")\n",
"\n",
"            # Release the memory\n",
"            c_lib_cffi_api.free_greeting_string(c_greeting_ptr_cffi_api)\n",
"            print(f\"Python (CFFI API): Called C to free greeting string for '{py_name_cffi_api}'.\")\n",
"        else:\n",
"            print(f\"Python (CFFI API): greet_person returned NULL for '{py_name_cffi_api}'.\")\n",
"\n",
"        # --- 3. Work with a struct ---\n",
"        print(\"\\n--- CFFI API: Processing PointData struct --- \")\n",
"        # ffi.new() always takes a pointer or array type and returns memory owned\n",
"        # by the resulting CData object.\n",
"        # Option 1: allocate zero-initialized memory, then assign the fields\n",
"        # point_instance_cffi_api = ffi_api.new(\"PointData*\")\n",
"        # point_instance_cffi_api.id = 20\n",
"        # point_instance_cffi_api.value = 6.28\n",
"        # Option 2: initialize directly from a dict\n",
"        point_instance_cffi_api = ffi_api.new(\"PointData*\", {\"id\": 20, \"value\": 6.28})\n",
"\n",
"        print(f\"Python (CFFI API): Initial PointData: id={point_instance_cffi_api.id}, value={point_instance_cffi_api.value:.2f}\")\n",
"\n",
"        # The C function expects a PointData*\n",
"        c_lib_cffi_api.process_point_data(point_instance_cffi_api)\n",
"        print(f\"Python (CFFI API): Modified PointData: id={point_instance_cffi_api.id}, value={point_instance_cffi_api.value:.2f}\")\n",
"\n",
"        # --- 4. Work with a callback ---\n",
"        print(\"\\n--- CFFI API: Using callback function --- \")\n",
"        # Define the Python callback; the decorator declares its C signature\n",
"        @ffi_api.callback(\"int(int, int)\")\n",
"        def python_callback_cffi_api(a, b):\n",
"            print(f\"Python callback (CFFI API): called with {a}, {b}\")\n",
"            return a - b  # example: subtract\n",
"\n",
"        # ffi_api.callback has already produced a C-compatible function pointer\n",
"        callback_result_cffi_api = c_lib_cffi_api.apply_callback(100, 30, python_callback_cffi_api)\n",
"        print(f\"Python (CFFI API): Result from C apply_callback: {callback_result_cffi_api}\")\n",
"    else:\n",
"        print(\"Skipping CFFI API examples as C library (c_lib_cffi_api) was not loaded.\")\n",
"elif not cffi_available:\n",
"    print(\"cffi not available, skipping CFFI API examples.\")"
]
},
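{
"cell_type": "markdown",
"metadata": {},
"source": [
"**API mode key points:**\n",
"* **C declarations**: the C API is declared through `ffi.cdef()`, and `cffi` parses the declarations.\n",
"* **Type safety**: because `cffi` knows the types, it can check arguments more strictly and convert them automatically.\n",
"* **Ease of use**: calling C functions and creating structs usually feels more natural.\n",
"* `ffi.new(\"C_type*\", initializer)` creates a C data structure and returns a pointer to it.\n",
"* `ffi.string()` and `ffi.buffer()` handle C strings and raw memory blocks.\n",
"* Callbacks are conveniently created with the `@ffi.callback(\"return_type(arg_types)\")` decorator."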
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Passing different kinds of data (review and additions)\n",
"\n",
"| C Type | `ctypes` Type | `cffi` Declaration | Python Type (to C) | Python Type (from C) |\n",
"|--------|---------------|--------------------|--------------------|----------------------|\n",
"| `int` | `ctypes.c_int` | `int` | `int` | `int` |\n",
"| `long` | `ctypes.c_long` | `long` | `int` | `int` |\n",
"| `float` | `ctypes.c_float` | `float` | `float` | `float` |\n",
"| `double` | `ctypes.c_double` | `double` | `float` | `float` |\n",
"| `char` | `ctypes.c_char` | `char` | `bytes` (len 1) | `bytes` (len 1) |\n",
"| `char*` (string) | `ctypes.c_char_p` | `char*` | `bytes` (encode `str` first) | `bytes` (ctypes) / CData `char*`, read with `ffi.string()` (cffi) |\n",
"| `void*` | `ctypes.c_void_p` | `void*` | `int` (address) or `None` (ctypes) / any CData pointer (cffi) | `int` (address) or `None` (ctypes) / CData (cffi) |\n",
"| `struct {…}` | `ctypes.Structure` subclass | `struct T {…};` | `ctypes` struct instance / CData struct | `ctypes` struct instance / CData struct |\n",
"| `TYPE*` (pointer) | `ctypes.POINTER(TYPE)` | `TYPE*` | `ctypes` pointer object / CData pointer | `ctypes` pointer object / CData pointer |\n",
"| function pointer | `ctypes.CFUNCTYPE` | `ret (*)(args)` | `ctypes` function pointer / `ffi.callback` object | `ctypes` function pointer / CData function pointer |\n",
"| `NULL` (pointer) | `None` | `ffi.NULL` | `None` (ctypes) / `ffi.NULL` (cffi) | `None` or a false pointer object (ctypes) / `ffi.NULL` (cffi) |\n",
"\n",
"**Arrays:**\n",
"* `ctypes`: `(ctypes.c_int * 5)()` creates an array of 5 integers.\n",
"* `cffi`: `ffi.new(\"int[5]\")`, or declare a global such as `int my_array[5];` in `cdef`.\n",
"\n",
"**Error handling:**\n",
"C functions usually signal errors through a return value (such as -1 or NULL) or by setting the global `errno`, which can be read with `ctypes.get_errno()` (when the library was loaded with `use_errno=True`) or `ffi.errno`. Python code has to check these."
]
},
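{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. `ctypes` vs `cffi`: comparison and how to choose\n",
"\n",
"| Feature | `ctypes` | `cffi` |\n",
"|---------|----------|--------|\n",
"| **Standard library** | Yes (nothing to install) | No (needs `pip install cffi`) |\n",
"| **Ease of use** | Relatively low; types and memory are managed by hand | API mode is very convenient and close to writing C; ABI mode is similar to ctypes |\n",
"| **Performance** | Relatively high call overhead | Usually faster than `ctypes`, especially with complex types and callbacks |\n",
"| **Type safety** | Depends on the user setting `argtypes`/`restype` correctly | API mode gets strong type checking from the C declarations |\n",
"| **Header dependency** | None (talks to the ABI directly) | API mode normally needs C declarations (which can come from headers) |\n",
"| **Complex APIs** | Complex structs and callbacks can be tedious | Better suited to complex C APIs |\n",
"| **Build step** | None | API mode (`set_source`/`compile`) compiles C wrapper code |\n",
"| **Python implementations** | PyPy's `ctypes` support may lag behind its `cffi` support | PyPy recommends `cffi`, which integrates well with its JIT |\n",
"\n",
"**Recommendations:**\n",
"\n",
"* **A few simple function calls and no wish to add a dependency**: `ctypes` is probably enough.\n",
"* **A complex C API, with performance and type safety in mind**: `cffi` (especially API mode) is usually the better choice.\n",
"* **Good PyPy integration is required**: `cffi` is the first choice.\n",
"* **No C headers, only a binary to talk to**: `ctypes` or `cffi` ABI mode.\n",
"\n",
"## Summary\n",
"\n",
"Through tools such as `ctypes` and `cffi`, Python offers strong support for interacting with C code. This lets Python reuse existing C libraries and optimize performance-critical parts.\n",
"\n",
"`ctypes`, as part of the standard library, is easy to pick up for simple cases. `cffi` provides a more modern, more capable, and usually faster FFI solution, particularly in API mode.\n",
"\n",
"When interacting with C code, pay close attention to mapping data types correctly, to memory management (who allocates, who frees), and to error handling.\n",
"\n",
"For deeper integration, or for writing high-performance extensions from scratch, look further into writing Python C API extensions directly or into Cython."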
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 5
}
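The memory-ownership rule from the `ctypes` section of the notebook above (a pointer allocated in C must be handed back to C for release) can be tried without compiling `my_c_library.c`. The following is only a minimal sketch: it assumes a POSIX system where `ctypes.CDLL(None)` exposes the C runtime's `strdup` and `free`, and the helper name `c_strdup_roundtrip` is made up for illustration.

```python
import ctypes

# Assumption: POSIX only. CDLL(None) opens the symbols already loaded into the
# current process, which include the C runtime's strdup() and free().
libc = ctypes.CDLL(None)

libc.strdup.argtypes = [ctypes.c_char_p]
# c_void_p keeps the raw address. With c_char_p the result would be converted
# to a Python bytes object and the address needed by free() would be lost.
libc.strdup.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def c_strdup_roundtrip(text: str) -> str:
    """Hypothetical helper: copy text into malloc'ed memory, read it back, free it."""
    address = libc.strdup(text.encode("utf-8"))
    if not address:  # a NULL pointer comes back as None
        raise MemoryError("strdup returned NULL")
    try:
        # string_at() copies the NUL-terminated bytes found at the address
        return ctypes.string_at(address).decode("utf-8")
    finally:
        libc.free(address)  # the side that allocated (C) also frees


print(c_strdup_roundtrip("Hello from C memory"))
```

The same pattern applies to `greet_person` and `free_greeting_string`: declare the allocating function's `restype` as `ctypes.c_void_p`, read the text with `ctypes.string_at()`, then pass the unchanged address to the library's release function.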

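Some Reflections on Python as a Dynamic Language

As is well known, Python is a fully dynamic language, which shows in:

  1. Dynamic type binding
  2. Runtime checking
  3. Object structure can be modified dynamically (not just values)
  4. Reflection
  5. Everything is an object (instance, class, method)
  6. Code can be executed dynamically (eval, exec)
  7. Duck typing

A dynamic language imposes fewer constraints and is easier for newcomers, but the price is a large runtime overhead, and execution is so decoupled from the underlying machine code that it is hard to tell how the code actually runs.

There are also several points that I consider fairly serious flaws. They are laid out below.

It breaks the semantics of OOP

Most popular programming languages support the OOP paradigm, that is, inheritance and polymorphism. Python likewise can be written in a purely imperative style for simple tasks, or with full object-oriented design.

However, its dynamic features undermine the structure of OOP:

  1. Blurred types: any instance of any type can have attributes or methods added or removed at runtime (in a static language only their values can change at runtime). An instance modified this way arguably no longer belongs to its original type, since it now clearly differs from it. Yet its built-in __class__ attribute still points to the original type, which confuses the notion of type. Conforming to a class should not be nominal only; the content should conform as well.
  2. Broken inheritance, in two respects:
    1. Most code in practice does not inherit from abstract interfaces. The abc module provides ABC as the base class for abstract interfaces; the classic approach is to derive your own abstract class from ABC and have concrete classes derive from that and implement the abstract methods. PEP 544, however, offers typing.Protocol as a structural alternative: the concrete class inherits from no abstract class at all, and as long as it implements the required methods a static checker treats it as conforming to the Protocol.
    2. No need to inherit from a concrete parent class. As with the previous point, a class with no parent (other than object) can still define methods with the same names and so expose the same calling interface as a would-be parent. Semantically, the class definition then shows no relationship to any other class. The whole codebase can be a loose structure in which no two classes are related by inheritance.
  3. Broken polymorphism: no parameter or return value is restricted in type by default. Passing a subtype where a parent type is required therefore carries little meaning, again because any type can be modified dynamically to satisfy the requirement.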
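The points in the list above can be seen in a few lines. This is a small sketch with made-up names (`Greeter`, `English`, `Plain`, `welcome`): a class satisfies a `typing.Protocol` without inheriting from it, and an instance that gains a method at runtime starts to satisfy it too, while its type stays the same.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    """Structural interface: anything with a greet() method conforms."""

    def greet(self) -> str: ...


class English:  # no base class other than object
    def greet(self) -> str:
        return "hello"


class Plain:  # defines nothing at all
    pass


def welcome(g: Greeter) -> str:
    # The annotation is only a hint; nothing is enforced at call time
    return g.greet().upper()


plain = Plain()
before = isinstance(plain, Greeter)  # False: no greet attribute yet
plain.greet = lambda: "hi"           # attribute added to one instance at runtime
after = isinstance(plain, Greeter)   # True: the instance now has greet

print(before, after)                      # False True
print(type(plain).__name__)               # Plain (the class has not changed)
print(welcome(English()), welcome(plain))  # HELLO HI
```

Note that `isinstance` against a `runtime_checkable` protocol only checks that the attribute exists, not its signature, which is part of the looseness described above.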

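It breaks design patterns

Classic patterns such as factory, abstract factory, and visitor depend heavily on inheritance and polymorphism. In Python's design, however, the dynamic capabilities make these patterns largely a formality. A familiar library that does use design patterns is transformers: its from_pretrained family is a factory pattern, where a string name selects the concrete constructor and yields a concrete subclass, while the declared output type of the factory is a base class shared by all models.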
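To make the factory idea concrete, here is a minimal sketch of a string-keyed factory. It is not how transformers implements `from_pretrained`; the names `BaseModel`, `TinyEncoder`, `TinyDecoder`, and `from_name` are hypothetical and only illustrate "a name selects the subclass, the caller sees the base class".

```python
class BaseModel:
    """Hypothetical common base class returned by the factory."""

    registry: dict = {}  # maps a string key to a concrete subclass

    def __init_subclass__(cls, *, key: str, **kwargs):
        # Runs once for every subclass definition and records it under its key
        super().__init_subclass__(**kwargs)
        BaseModel.registry[key] = cls

    @classmethod
    def from_name(cls, name: str) -> "BaseModel":
        """Factory: look up the subclass registered under name and build it."""
        model_cls = cls.registry.get(name)
        if model_cls is None:
            raise ValueError(f"unknown model name: {name!r}")
        return model_cls()

    def describe(self) -> str:
        raise NotImplementedError


class TinyEncoder(BaseModel, key="tiny-encoder"):
    def describe(self) -> str:
        return "encoder"


class TinyDecoder(BaseModel, key="tiny-decoder"):
    def describe(self) -> str:
        return "decoder"


model = BaseModel.from_name("tiny-decoder")
print(type(model).__name__, model.describe())  # TinyDecoder decoder
```

The return annotation says `BaseModel`, yet nothing prevents a registered class from returning any object that merely has a `describe` method, which is the sense in which the pattern's guarantees rest on convention.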

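Security issues

Python code generally does not manage pointers directly, so out-of-bounds pointers, wild pointers, and dangling pointers are mostly not a concern. The garbage collector also reclaims memory automatically, so this class of safety problem needs little attention while coding. On the other hand, Python has security problems of its own. Attacking unmanaged code used to be hard: for injected code to run reliably it must avoid corrupting the original structure and crashing the program outright (a segmentation fault). In Python, by contrast, arbitrary code can be injected to change the original logic, and because the logic does not live as fixed content in a code segment, the attacker needs no such extra care. The contents of globals() and locals() can be modified by hand at runtime, which also carries some risk. Another danger is type mismatch during execution: types are only determined at runtime, so nothing can be guaranteed in advance, and a TypeError may be raised that crashes the program if it goes unhandled.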
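Both risks from the paragraph above fit in a short sketch. The names `average`, `namespace`, and `is_admin` are invented for the illustration; the `exec` call stands in for any code that manages to run inside the process.

```python
def average(values: list[float]) -> float:
    # The annotation is documentation only; Python does not enforce it
    return sum(values) / len(values)


ok = average([1.0, 2.0, 3.0])
print(ok)  # 2.0

# Passing strings is accepted at call time and fails only when sum() runs
try:
    average(["1.0", "2.0"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(f"TypeError surfaced at runtime: {error_message}")

# Code executed at runtime can rebind names in a namespace.
# A private dict is used here instead of globals() to keep the demo contained.
namespace = {"is_admin": lambda user: user == "root"}
before = namespace["is_admin"]("guest")   # False
exec("is_admin = lambda user: True", namespace)
after = namespace["is_admin"]("guest")    # True: the check was replaced
print(before, after)
```

A static type checker would flag the second `average` call before the program runs, which is one way to recover part of the guarantee that the language itself does not give.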

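Summary

My background is in C++, but in recent years I have been programming in Python all the time. Python has also held the top spot in market share for years, by a wide margin, and that cannot be separated from its flexibility. For a programming language aimed at a mass audience, such traits are necessary. Despite the many ways Python is less than rigorous, as described above, a programmer can still choose to write in a rigorous object-oriented style. So the quality of a program depends not on the language but on the programmer. Programmers have a responsibility to write code that is maintainable, clear, and well structured~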

KuRRe8 commented May 8, 2025

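Whether you have insights, questions, or just want to chat, feel free to post here!

Because there are many documents, an ipynb file sometimes fails to render; that is a browser performance issue, and refreshing the page fixes it

Or git clone the Gist and read it locally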
