Python Performance Optimization

Profiling Tools

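# 1. cProfile: function-level profiling
import cProfile
cProfile.run("your_function()")  # your_function is a placeholder for the code under test

# From the command line, sorted by cumulative time:
# python -m cProfile -s cumtime script.py

# 2. timeit: accurate timing of small code snippets
import timeit
timeit.timeit("[x**2 for x in range(1000)]", number=1000)  # total seconds for 1000 runs

# 3. line_profiler (third-party): line-level profiling
# pip install line_profiler
# Decorate the function with @profile, then run: kernprof -l script.py

# 4. memory_profiler (third-party): memory profiling
# pip install memory_profiler
# Decorate the function with @memory_profiler.profile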
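
When the report needs to be captured inside a program rather than printed by `cProfile.run`, the profiler can be driven by hand and formatted with `pstats`. The following is a minimal sketch; `busy_work` is a hypothetical workload invented for the illustration.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical workload used only for this illustration."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = busy_work(100_000)
profiler.disable()

# Send the report to a string buffer instead of stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # at most 5 entries, by cumulative time
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time matches the `-s cumtime` command-line flag and tends to surface the top-level functions that dominate the run.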

Common Optimization Techniques

1. Use local variables

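import math

# Slow: every iteration looks up the global name math, then its sqrt attribute
def slow():
    result = []
    for i in range(10000):
        result.append(math.sqrt(i))
    return result

# Fast: local variable lookups are cheaper (LOAD_FAST vs LOAD_GLOBAL)
# The gain is usually modest, and tends to be smaller on Python 3.11+
def fast():
    sqrt = math.sqrt  # cache the function in a local variable
    result = []
    append = result.append  # bound methods can be cached too
    for i in range(10000):
        append(sqrt(i))
    return result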
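
Whether this trick pays off depends on the interpreter version, so it is worth measuring. The sketch below re-declares both variants with an `n` parameter (an addition made for the illustration) and times them with `timeit.repeat`; no particular speedup is assumed.

```python
import math
import timeit


def slow(n=10_000):
    result = []
    for i in range(n):
        result.append(math.sqrt(i))  # global lookup + attribute lookup each time
    return result


def fast(n=10_000):
    sqrt = math.sqrt          # cached in a local variable
    result = []
    append = result.append    # cached bound method
    for i in range(n):
        append(sqrt(i))
    return result


# Both variants must produce the same list before timing means anything.
assert slow() == fast()

# Take the minimum of several repeats to reduce noise from other processes.
t_slow = min(timeit.repeat(slow, number=20, repeat=3))
t_fast = min(timeit.repeat(fast, number=20, repeat=3))
print(f"slow: {t_slow:.4f}s  fast: {t_fast:.4f}s")
```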

2. List comprehensions vs. loops

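# A list comprehension is usually noticeably faster than the equivalent for loop;
# the exact gain depends on the Python version and the workload.
# It still runs as Python bytecode, but it appends with a dedicated opcode (LIST_APPEND)
# instead of looking up and calling result.append on every iteration.

# Slow
result = []
for x in range(1000):
    if x % 2 == 0:
        result.append(x * x)

# Fast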
result = [x * x for x in range(1000) if x % 2 == 0]
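
The difference between the two forms can be inspected with the standard `dis` module. This is a sketch under the assumption that CPython is the interpreter; opcode names are an implementation detail, and the helper `opnames` is introduced here only for the illustration.

```python
import dis


def with_loop(n):
    result = []
    for x in range(n):
        if x % 2 == 0:
            result.append(x * x)
    return result


def with_comprehension(n):
    return [x * x for x in range(n) if x % 2 == 0]


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object, e.g. a comprehension body
            names |= opnames(const)
    return names


loop_ops = opnames(with_loop.__code__)
comp_ops = opnames(with_comprehension.__code__)
print("LIST_APPEND in loop version:", "LIST_APPEND" in loop_ops)
print("LIST_APPEND in comprehension:", "LIST_APPEND" in comp_ops)
```

The helper walks nested code objects because, depending on the Python version, the comprehension body is either compiled as a separate code object or inlined into the enclosing function.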

3. String concatenation

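words = ["alpha", "beta", "gamma"]  # sample input (illustrative)

# Slow: each + may build a new string, O(n²) in the worst case
# (CPython can sometimes extend the string in place, but that is an implementation detail)
s = ""
for word in words:
    s += word

# Fast: join computes the total size once and copies each piece once, O(n)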
s = "".join(words)
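
In practice the pieces are often produced one at a time inside a loop. A common pattern, sketched below with the hypothetical helpers `build_csv_row` and `build_csv_row_short`, is to collect them in a list and join once at the end.

```python
def build_csv_row(values):
    """Collect the pieces in a list, then join once at the end."""
    parts = []
    for value in values:
        parts.append(str(value))
    return ",".join(parts)


def build_csv_row_short(values):
    """Same result, passing a generator expression directly to join."""
    return ",".join(str(value) for value in values)


print(build_csv_row([1, "a", 2.5]))
```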

4. Membership tests with `in`

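x = 3  # sample value (illustrative)

# list membership test: O(n), scans the elements one by one
if x in [1, 2, 3, 4, 5]:  # slow for large collections
    ...

# set/dict membership test: O(1) on average (hash lookup; elements must be hashable)
lookup = {1, 2, 3, 4, 5}
if x in lookup:  # fast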
    ...
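
The gap only becomes visible with larger collections and repeated lookups. The sketch below times a worst-case lookup in a list against the same lookup in a set, and shows the usual fix of converting once before a loop; `keep_allowed` is a hypothetical helper.

```python
import timeit

haystack_list = list(range(10_000))
haystack_set = set(haystack_list)
needle = 9_999  # last element: the worst case for a linear scan

t_list = timeit.timeit(lambda: needle in haystack_list, number=1_000)
t_set = timeit.timeit(lambda: needle in haystack_set, number=1_000)
print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")


def keep_allowed(items, allowed):
    """Convert `allowed` to a set once, then test membership per item."""
    allowed_set = set(allowed)
    return [item for item in items if item in allowed_set]
```

Building the set costs O(n) up front, so the conversion is worthwhile when the same collection is queried many times.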

5. Avoid repeated computation

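my_list = [3, 1, 2]  # sample data (illustrative)

# Not a problem: range(len(my_list)) evaluates len() once, before the loop starts
for i in range(len(my_list)):
    ...

# Slow: an invariant expression in the loop condition is re-evaluated on every iteration
i = 0
while i < len(my_list):
    i += 1

# Fast: hoist it out of the loop (matters most when the expression is expensive)
n = len(my_list)
i = 0
while i < n:
    i += 1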
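
Repeated computation also appears across function calls, not only inside a single loop. One way to avoid it, swapped in here as an illustration rather than taken from the notes above, is memoization with `functools.lru_cache`.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache removes the repeated subcalls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))
print(fib.cache_info())  # hits/misses show how many calls were answered from the cache
```

Memoization only suits pure functions with hashable arguments, and an unbounded cache (`maxsize=None`) trades memory for speed.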

NumPy Vectorization

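import numpy as np

# Slow: Python-level loop, one interpreter iteration per element
def python_sum(arr):
    total = 0
    for x in arr:
        total += x * x
    return total

# Fast: vectorized; the loop runs in compiled C code (many NumPy operations also release the GIL)
def numpy_sum(arr):
    return np.sum(arr ** 2)

# Explicit int64 so the squares cannot overflow where the default integer is 32-bit
arr = np.arange(10**6, dtype=np.int64)
# The NumPy version is often 100x faster or more at this size; measure with timeit on your machine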
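
The claim is easy to check locally. The sketch below assumes NumPy is installed, uses a smaller array so the pure-Python loop finishes quickly, and adds `numpy_dot` (a variant not in the notes above) that avoids the temporary array created by `values ** 2`.

```python
import timeit

import numpy as np

arr = np.arange(100_000, dtype=np.int64)


def python_sum(values):
    total = 0
    for x in values:          # one interpreter iteration per element
        total += x * x
    return total


def numpy_sum(values):
    return np.sum(values ** 2)      # allocates a temporary array of squares


def numpy_dot(values):
    return np.dot(values, values)   # sum of squares without the temporary


# All three must agree before the timings are comparable.
assert python_sum(arr) == numpy_sum(arr) == numpy_dot(arr)

for func in (python_sum, numpy_sum, numpy_dot):
    seconds = timeit.timeit(lambda: func(arr), number=3)
    print(f"{func.__name__}: {seconds:.4f}s")
```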

Context Managers

import time
from contextlib import contextmanager

# Class-based: implement __enter__ and __exit__
class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.perf_counter() - self.start
        print(f"Elapsed: {self.elapsed:.3f}s")
        return False  # do not suppress exceptions

with Timer() as t:
    sum(range(10**7))
print(t.elapsed)

# contextlib makes this more concise
@contextmanager
def timer():
    start = time.perf_counter()
    try:
        yield
    finally:  # report the time even if the with-body raises
        print(f"Elapsed: {time.perf_counter() - start:.3f}s")

with timer():
    sum(range(10**7))
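
When several stages need timing, printing inside the context manager gets noisy. A small variation, sketched here with the hypothetical names `timed` and `timings`, records each duration in a dictionary so the caller decides how to report it.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Add the elapsed wall-clock time of the with-body to results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:  # runs even if the with-body raises
        results[label] = results.get(label, 0.0) + time.perf_counter() - start


timings = {}
with timed("build", timings):
    data = [i * i for i in range(100_000)]
with timed("sum", timings):
    total = sum(data)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```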

Common Pitfalls

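# 1. Global variables are slower to access than local variables
# 2. Attribute access has a cost (obj.attr is slower than a local variable)
# 3. Function calls have a cost (avoid calling tiny functions on hot paths)
# 4. Raising and catching exceptions is expensive (don't use try/except for normal control flow); a try block that never raises is cheap
# 5. isinstance() can be slower than type() == in some cases, but the difference is usually negligible; prefer isinstance() (more Pythonic, handles subclasses)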
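
Pitfall 4 can be measured directly. The sketch below compares a lookup that raises on every miss with `dict.get`, which returns a default without raising; the function names are hypothetical and no specific ratio is assumed.

```python
import timeit

counts = {}  # empty on purpose: every lookup below is a miss


def get_with_exception(key):
    # Raises and catches KeyError whenever the key is missing.
    try:
        return counts[key]
    except KeyError:
        return 0


def get_with_default(key):
    # dict.get returns the default without raising.
    return counts.get(key, 0)


t_exc = timeit.timeit(lambda: get_with_exception("missing"), number=100_000)
t_get = timeit.timeit(lambda: get_with_default("missing"), number=100_000)
print(f"try/except on miss: {t_exc:.4f}s  dict.get: {t_get:.4f}s")
```

When misses are rare, the try/except form is perfectly reasonable; the cost appears only when the exception path is the common case.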

Python Version Feature Cheat Sheet

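Version	Key features
3.7	dataclasses, guaranteed dict insertion order
3.8	walrus operator `:=`, f-string `=` specifier
3.9	`dict | dict` merge, `list[int]` type hints
3.10	match/case (structural pattern matching)
3.11	10-60% faster than 3.10, `tomllib`
3.12	better error messages, f-string enhancements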

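# Walrus operator (Python 3.8+): assignment inside an expression
import re
text = "order 42"  # sample input (illustrative)
if m := re.search(r'\d+', text):
    print(m.group())  # reuse the match without calling re.search a second time

# match/case (Python 3.10+): structural pattern matching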
def process(command):
    match command.split():
        case ["quit"]:
            return "Quitting"
        case ["go", direction]:
            return f"Going {direction}"
        case ["pick", "up", item]:
            return f"Picking up {item}"
        case _:
            return "Unknown command"
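
Two other entries from the table are short enough to show in a few lines. This sketch assumes Python 3.9 or newer; the variable names are illustrative.

```python
# f-string `=` specifier (Python 3.8+): prints the expression and its value.
x = 3
debug_line = f"{x=}"
print(debug_line)

# dict merge operator (Python 3.9+): the right-hand operand wins on key conflicts.
defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000}
config = defaults | overrides
print(config)
```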
