A Complete Guide to Concurrent Programming with Threads and Processes in Python: the GIL, threading, multiprocessing, and concurrent.futures

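Concurrent programming is a skill every intermediate-to-advanced Python developer needs. Whether you are writing a concurrent crawler, a web service, a data-processing job, or a GUI application, threads and processes are the core tools.

This article walks through Python concurrency systematically:

The GIL (Global Interpreter Lock): the underlying limit on Python threads
The threading module: multithreaded programming and synchronization primitives
The multiprocessing module: true parallel computation
concurrent.futures: the modern high-level concurrency API
Choosing among the three concurrency models
Complete worked examples

Prerequisites: this article assumes you know basic Python syntax. If you would like to learn about asynchronous programming first, see: A Complete Guide to Python asyncio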

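1. The GIL in Depth

1.1 What is the GIL?

The GIL (Global Interpreter Lock) is a mutex inside the CPython interpreter. Its core rule is simple:

At any given moment, only one thread can execute Python bytecode.

So even on a 16-core CPU, the threads of a CPython process can use only one core at a time to run Python code.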

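1.2 Why does Python have a GIL?

The GIL exists for both historical and practical reasons:

Historical reason: CPython's memory management (reference counting) is not thread-safe. Without the GIL, threads updating the same object's reference count at the same time could cause memory leaks or crashes. A single global lock was the simplest and cheapest way to prevent that.

Practical benefits:

Single-threaded CPython stays fast, because no fine-grained locking is needed
C extensions are easier to write (they can assume the GIL is held)
I/O-bound work is barely affected (the GIL is released while a thread waits on I/O)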

1.3 How the GIL works

import sys

# Show the GIL switch interval (5 ms by default since Python 3.2)
print(sys.getswitchinterval())  # 0.005 (seconds)

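The GIL is released and re-acquired according to these rules:

Switch interval expires: when another thread is waiting for the GIL, the running thread is asked to release it once the switch interval (5 ms by default) has elapsed
I/O operations: a thread releases the GIL while it blocks on I/O (network requests, file reads and writes, sleep)
Voluntary release: calling time.sleep(0) explicitly gives other threads a chance to take the GIL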
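The switch interval in the first rule can be read and changed at runtime. The following is a minimal sketch that uses only `sys.getswitchinterval()` and `sys.setswitchinterval()` from the standard library; the value 0.001 is an arbitrary choice for illustration, not a tuning recommendation.

```python
import sys

original = sys.getswitchinterval()   # 0.005 by default on CPython

# Ask the interpreter to request a GIL hand-off more often (every 1 ms).
sys.setswitchinterval(0.001)
shortened = sys.getswitchinterval()
print(f"switch interval: {original} -> {shortened}")

# Restore the previous value so the rest of the program is unaffected.
sys.setswitchinterval(original)
```

A shorter interval makes threads take turns more often, at the price of more switching overhead; it does not let CPU-bound threads run in parallel.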

1.4 Measuring the GIL's real impact

The following experiment compares how the GIL affects CPU-bound and I/O-bound tasks:

import time
import threading

# === CPU-bound task: compute a Fibonacci number ===
def cpu_bound(n=35):
    def fib(x):
        if x <= 1:
            return x
        return fib(x - 1) + fib(x - 2)
    return fib(n)

# === I/O-bound task: simulate a network request ===
def io_bound():
    time.sleep(0.1)  # simulate 100 ms of I/O wait
    return "done"

# === Serial baseline ===
def benchmark_serial(task, count):
    start = time.time()
    for _ in range(count):
        task()
    return time.time() - start

# === Threaded run ===
def benchmark_threaded(task, count):
    start = time.time()
    threads = [threading.Thread(target=task) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

print("=== CPU-bound (8 tasks) ===")
print(f"Serial:   {benchmark_serial(cpu_bound, 8):.2f}s")
print(f"Threaded: {benchmark_threaded(cpu_bound, 8):.2f}s")
# Expected: threaded ≈ serial (possibly even slower)

print("\n=== I/O-bound (8 tasks) ===")
print(f"Serial:   {benchmark_serial(io_bound, 8):.2f}s")
print(f"Threaded: {benchmark_threaded(io_bound, 8):.2f}s")
# Expected: threaded ≈ 0.1 s (far faster than the serial 0.8 s)

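Sample output:

=== CPU-bound (8 tasks) ===
Serial:   12.34s
Threaded: 13.15s        ← threads are actually slower

=== I/O-bound (8 tasks) ===
Serial:   0.80s
Threaded: 0.11s         ← threads are about 7x faster

Conclusions:

CPU-bound: threads are limited by the GIL and give no speedup; thread-switching overhead can even make them slightly slower
I/O-bound: threads release the GIL while waiting on I/O, so concurrency improves dramatically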

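1.5 The future of the GIL

Python 3.13 introduced an experimental free-threaded build (PEP 703), which lets you disable the GIL at build time:

# Python 3.13+: build the experimental free-threaded interpreter
./configure --disable-gil
make

At the time of writing (2026), however, the default CPython build still ships with the GIL, and many C extensions have not yet been adapted to free threading. For the foreseeable future, the GIL remains the default in CPython.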

2. The threading module in detail

2.1 Creating threads

There are two ways to create a thread in Python:

import threading
import time

# Option 1: pass a target function
def worker(name, delay):
    """Worker thread function"""
    print(f"[{name}] started")
    time.sleep(delay)
    print(f"[{name}] finished after {delay}s")

# Create the threads (passing daemon=True would make them daemon threads; see section 2.3)
t1 = threading.Thread(target=worker, args=("worker-1", 2))
t2 = threading.Thread(target=worker, kwargs={"name": "worker-2", "delay": 1})

t1.start()
t2.start()

# Wait for both threads to finish
t1.join()
t2.join()

print("All threads finished")

# Option 2: subclass Thread
class MyThread(threading.Thread):
    def __init__(self, name, delay):
        super().__init__(name=name)  # set the thread name
        self.delay = delay

    def run(self):
        print(f"[{self.name}] started")
        time.sleep(self.delay)
        print(f"[{self.name}] finished")

t = MyThread("custom-thread", 1.5)
t.start()
t.join()

2.2 Thread attributes and methods

Attribute / method	Description
`threading.active_count()`	Number of threads currently alive
`threading.current_thread()`	The current Thread object
`threading.main_thread()`	The main Thread object
`threading.enumerate()`	List of all threads currently alive
`t.name`	Thread name
`t.daemon`	Whether the thread is a daemon thread
`t.ident`	Thread identifier assigned by Python (the OS-level thread ID is `t.native_id`, Python 3.8+)
`t.is_alive()`	Whether the thread is still running
`t.join(timeout)`	Wait for the thread to finish

import threading

def task():
    current = threading.current_thread()
    print(f"Name: {current.name}")
    print(f"ID: {current.ident}")
    print(f"Daemon: {current.daemon}")
    print(f"Thread count: {threading.active_count()}")

t = threading.Thread(target=task, name="demo-thread", daemon=True)
t.start()
t.join()

# Output:
# Name: demo-thread
# ID: 123145307795456   (varies from run to run)
# Daemon: True
# Thread count: 2

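2.3 Daemon threads

Daemon threads behave specially:

When the process exits (that is, once the main thread and all non-daemon threads have finished), any daemon threads still running are terminated abruptly, without waiting for them to complete
Non-daemon threads (the default) keep the process alive until all of them have finished

import threading
import time

def daemon_task():
    for i in range(5):
        print(f"Daemon thread running... {i}")
        time.sleep(0.5)

def normal_task():
    for i in range(3):
        print(f"Normal thread running... {i}")
        time.sleep(0.5)

# Daemon thread
d = threading.Thread(target=daemon_task, daemon=True)
d.start()

# Normal (non-daemon) thread
n = threading.Thread(target=normal_task)
n.start()

print("Main thread done")
# The interpreter still waits for the normal thread n (about 1.5 s).
# The daemon thread d keeps running during that time and is killed as soon
# as n finishes, so its 5-iteration loop never reaches the end.

Best practice: use daemon threads for background helpers (logging, heartbeats, monitoring); use normal threads plus join() for work that must run to completion.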

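3. Thread synchronization

The biggest trap in multithreaded code is the race condition: several threads modify shared data at the same time and the result becomes unpredictable.

3.1 Lock (mutex)

import threading

# The classic problem without a lock: the count comes out wrong
counter = 0

def increment_without_lock():
    global counter
    for _ in range(1_000_000):
        counter += 1  # not atomic: it is really three steps, read → add → write

threads = [threading.Thread(target=increment_without_lock) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Without lock: counter = {counter}")
# Expected 10,000,000; the actual value can be lower and differ on every run
# (for example 4,328,917). How often updates are lost depends on the CPython
# version, but the code is never guaranteed to be correct.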

# The fix: protect the update with a Lock
counter = 0
lock = threading.Lock()

def increment_with_lock():
    global counter
    for _ in range(1_000_000):
        with lock:  # acquire/release handled automatically
            counter += 1

threads = [threading.Thread(target=increment_with_lock) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"With lock: counter = {counter}")  # correct: 10,000,000

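Core Lock methods:

lock = threading.Lock()

# Blocking acquire (waits until the lock is available)
lock.acquire()
lock.release()  # release it again so the examples below can acquire it

# Non-blocking acquire (returns immediately with True or False)
if lock.acquire(blocking=False):  # or lock.acquire(False)
    # acquired
    lock.release()

# Acquire with a timeout
if lock.acquire(timeout=2.0):
    # acquired within 2 seconds
    lock.release()
else:
    print("Timed out!")

# Recommended: let a with statement manage the lock
with lock:
    # critical section
    pass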

3.2 RLock (re-entrant lock)

An RLock can be acquired several times by the same thread, which suits recursive or nested calls:

import threading

rlock = threading.RLock()

class RecursiveCounter:
    def __init__(self):
        self.value = 0

    def increment(self):
        with rlock:
            self.value += 1
            return self.value

    def reset_and_increment(self):
        with rlock:
            self.value = 0
            # The same thread now calls increment(), which acquires rlock again.
            # A plain Lock would deadlock here; an RLock works fine.
            return self.increment()

counter = RecursiveCounter()
print(counter.reset_and_increment())  # 1

Which to choose: use Lock in general (slightly faster) and RLock for recursive or nested locking.
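The difference can be observed without risking a real deadlock by making the second acquisition non-blocking. This is a small sketch rather than a pattern for production code; the variable names are chosen only for illustration.

```python
import threading

lock = threading.Lock()
rlock = threading.RLock()

with lock:
    # A second, non-blocking attempt from the same thread fails:
    # a plain Lock is not re-entrant.
    lock_reacquired = lock.acquire(blocking=False)

with rlock:
    # The owning thread may acquire an RLock again;
    # every successful acquire needs a matching release.
    rlock_reacquired = rlock.acquire(blocking=False)
    if rlock_reacquired:
        rlock.release()

print(f"Lock re-acquired:  {lock_reacquired}")   # False
print(f"RLock re-acquired: {rlock_reacquired}")  # True
```

With a blocking `lock.acquire()` in the first block, the thread would wait forever for a lock it already holds.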

3.3 Semaphore

A Semaphore limits how many threads may use a resource at the same time (like the capacity of a car park):

import threading
import time
import random

# Allow at most 3 threads to use the database at once
db_semaphore = threading.Semaphore(3)

def database_query(query_id):
    with db_semaphore:
        print(f"Query {query_id} started...")
        time.sleep(random.uniform(0.5, 1.5))  # simulate the query
        print(f"Query {query_id} finished")

# Start 10 concurrent queries
for i in range(10):
    threading.Thread(target=database_query, args=(i,)).start()

# At most 3 run at the same time; the rest wait their turn

3.4 Event

An Event signals between threads: one thread waits for the signal and another thread sends it:

import threading
import time

def downloader(stop_event):
    """Download thread: keeps downloading until it is told to stop"""
    chunk = 0
    while not stop_event.is_set():
        chunk += 1
        print(f"Downloading chunk {chunk}...")
        time.sleep(0.5)
    print("Stop signal received, download aborted")

def controller(stop_event):
    """Control thread: sends the stop signal after 5 seconds"""
    time.sleep(5)
    print("→ Sending stop signal")
    stop_event.set()

stop_event = threading.Event()

d = threading.Thread(target=downloader, args=(stop_event,))
c = threading.Thread(target=controller, args=(stop_event,))

d.start()
c.start()
d.join()
c.join()

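Event methods:

event.set(): set the event (turn the signal on)
event.clear(): clear the event (turn the signal off)
event.wait(timeout): wait until the event is set (optionally with a timeout)
event.is_set(): check whether the event is set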
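The return value of `wait()` tells the caller whether the event was set or the timeout expired. A minimal sketch of that behaviour, assuming a `threading.Timer` stands in for whatever thread would normally send the signal:

```python
import threading

ready = threading.Event()

# Nobody has called set() yet, so wait() gives up after the timeout and returns False.
timed_out_result = ready.wait(timeout=0.1)

# A timer thread sets the event shortly afterwards; this wait() returns True.
threading.Timer(0.1, ready.set).start()
signalled_result = ready.wait(timeout=5)

ready.clear()  # reset the flag so the event can be reused
print(timed_out_result, signalled_result, ready.is_set())  # False True False
```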

3.5 Condition

A Condition is more powerful than an Event and supports more elaborate wait/notify patterns such as producer-consumer:

import threading
import time
import random

class BoundedQueue:
    """Bounded queue: producer-consumer implemented with a Condition"""

    def __init__(self, capacity):
        self.queue = []
        self.capacity = capacity
        self.condition = threading.Condition()

    def put(self, item):
        with self.condition:
            # While the queue is full, wait for a consumer to free a slot
            while len(self.queue) >= self.capacity:
                print("  Queue full, producer waiting...")
                self.condition.wait()
            self.queue.append(item)
            print(f"  Produced: {item} (queue: {len(self.queue)}/{self.capacity})")
            self.condition.notify()  # wake one waiting consumer

    def get(self):
        with self.condition:
            # While the queue is empty, wait for a producer to add an item
            while len(self.queue) == 0:
                print("  Queue empty, consumer waiting...")
                self.condition.wait()
            item = self.queue.pop(0)
            print(f"     Consumed: {item} (queue: {len(self.queue)}/{self.capacity})")
            self.condition.notify()  # wake one waiting producer
            return item

# Try it with one producer and one consumer
q = BoundedQueue(3)

def producer(q, n):
    for i in range(n):
        time.sleep(random.uniform(0.1, 0.5))
        q.put(f"P-{i}")

def consumer(q, n):
    for _ in range(n):
        time.sleep(random.uniform(0.2, 1.0))
        q.get()

threading.Thread(target=producer, args=(q, 5)).start()
threading.Thread(target=consumer, args=(q, 5)).start()

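3.6 Barrier

A Barrier makes several threads wait for each other at a given point and lets them continue together once all have arrived:

import threading
import time
import random

def racer(barrier, name):
    prep_time = random.uniform(0.5, 2.0)
    print(f"{name} getting ready... ({prep_time:.1f}s)")
    time.sleep(prep_time)
    print(f"{name} is at the starting line, waiting for the others...")
    barrier.wait()  # wait for all racers
    print(f"{name} is off!")

# All 5 racers must be in position before anyone starts
barrier = threading.Barrier(5, timeout=10)
for i in range(5):
    # Non-daemon threads: a daemon thread would be killed when the main thread
    # exits, which here happens immediately after the loop
    threading.Thread(target=racer, args=(barrier, f"Racer-{i+1}")).start()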

3.7 Timer

import threading

def remind(msg):
    print(f"⏰ Reminder: {msg}")

# Run after 5 seconds
timer = threading.Timer(5.0, remind, args=["Time for a break!"])
timer.start()

# Cancel it if it is no longer needed
# timer.cancel()

3.8 local(): thread-local storage

import threading

# Each thread gets its own independent storage; threads do not affect each other
thread_data = threading.local()

def process():
    thread_data.value = threading.current_thread().name
    # Another thread assigning thread_data.value does not change this thread's value
    print(f"{threading.current_thread().name}: value = {thread_data.value}")

for i in range(3):
    threading.Thread(target=process, name=f"thread-{i}").start()

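3.9 Synchronization primitives: quick reference

Scenario	Recommended tool
Protect a shared variable from concurrent modification	`Lock`
Locking in recursive or nested calls	`RLock`
Limit the number of concurrent accesses	`Semaphore`
Wait for an event to happen	`Event`
Producer-consumer	`Condition` / `queue.Queue`
Make several threads wait for each other	`Barrier`
Run something after a delay	`Timer`
Per-thread storage	`local()`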

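4. queue.Queue: the thread-safe queue

queue.Queue is Python's built-in thread-safe queue and fits producer-consumer scenarios better than a hand-written Condition:

import threading
import queue
import time

# Create a bounded queue
q = queue.Queue(maxsize=5)

def producer(q, n):
    for i in range(n):
        time.sleep(0.3)
        item = f"item-{i}"
        q.put(item)  # blocks until a slot is free
        print(f"Produced: {item} (queue: {q.qsize()})")

def consumer(q, n):
    for _ in range(n):
        item = q.get()  # blocks until an item is available
        print(f"  Consumed: {item} (queue: {q.qsize()})")
        q.task_done()  # mark the item as processed
        time.sleep(0.8)

# Start the threads
producer_thread = threading.Thread(target=producer, args=(q, 10), daemon=True)
consumer_thread = threading.Thread(target=consumer, args=(q, 10), daemon=True)
producer_thread.start()
consumer_thread.start()

# Wait until everything has been produced, then until everything has been processed.
# q.join() on its own could return at once, while the queue is still empty at startup.
producer_thread.join()
q.join()  # blocks until task_done() has been called for every item that was put
print("All tasks done")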

Queue types:

queue.Queue(maxsize): FIFO queue (first in, first out)
queue.LifoQueue(maxsize): LIFO queue (last in, first out; a stack)
queue.PriorityQueue(maxsize): priority queue

import queue

# Priority queue: the smaller the number, the higher the priority
pq = queue.PriorityQueue()
pq.put((2, "normal task"))
pq.put((1, "urgent task"))
pq.put((3, "low-priority task"))

while not pq.empty():
    print(pq.get())  # (1, 'urgent task') → (2, 'normal task') → (3, 'low-priority task')
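Tuples are compared element by element, so two entries with the same priority fall back to comparing their payloads, which raises `TypeError` for payloads that cannot be ordered (dicts, for example). One way around this, sketched below, is a small dataclass that compares on the priority only; `PrioritizedItem` is an illustrative name, not part of the `queue` module.

```python
import queue
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    payload: Any = field(compare=False)  # ignored when items are compared

pq = queue.PriorityQueue()
pq.put(PrioritizedItem(2, {"job": "normal"}))
pq.put(PrioritizedItem(1, {"job": "urgent"}))
pq.put(PrioritizedItem(2, {"job": "also normal"}))  # same priority, dict payload is fine

order = []
while not pq.empty():
    order.append(pq.get().priority)
print(order)  # [1, 2, 2]
```

Items with equal priority come out in no guaranteed order; add a sequence number as a second compared field if insertion order matters.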

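5. The multiprocessing module

5.1 Why use multiple processes?

Multiple processes sidestep the GIL and give true parallel computation. Each process has its own Python interpreter and its own memory space.

5.2 Creating processes

import multiprocessing
import os
import time

def cpu_heavy(n):
    """CPU-bound: sum the primes below n"""
    print(f"Process {os.getpid()} started, n={n}")
    total = 0
    for i in range(2, n):
        is_prime = True
        for j in range(2, int(i ** 0.5) + 1):
            if i % j == 0:
                is_prime = False
                break
        if is_prime:
            total += i
    return total

if __name__ == "__main__":
    # Create the processes (the API is almost identical to Thread)
    p1 = multiprocessing.Process(target=cpu_heavy, args=(200000,))
    p2 = multiprocessing.Process(target=cpu_heavy, args=(200000,))

    start = time.time()
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(f"2 processes took: {time.time() - start:.2f}s")
    # True parallelism: with two free cores, both calls run at the same time,
    # so this takes about half as long as running the two calls back to back

Note: on Windows, multiprocessing code must be placed inside an if __name__ == "__main__": block. Otherwise every child process re-imports the module and tries to create processes again (CPython aborts this with a RuntimeError). The same applies wherever the spawn start method is used, which has been the macOS default since Python 3.8, so it is best to add the guard on every platform.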
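Whether the guard is strictly required depends on the start method in use. The sketch below only inspects the start methods and creates a context; it starts no processes. It relies on `multiprocessing.get_all_start_methods()` and `multiprocessing.get_context()` from the standard library, and picking "spawn" is simply an example.

```python
import multiprocessing

# Start methods this platform supports, for example ['fork', 'spawn', 'forkserver'] on Linux.
available = multiprocessing.get_all_start_methods()
print(f"available: {available}")

# A context pins one start method without changing the process-wide default.
ctx = multiprocessing.get_context("spawn")
print(f"context start method: {ctx.get_start_method()}")
```

The context object offers the same API as the module itself (`ctx.Process`, `ctx.Pool`, `ctx.Queue`), so testing code under "spawn" on Linux is a cheap way to catch a missing main guard before the code reaches Windows or macOS.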

5.3 Pool

A Pool manages a fixed number of worker processes and distributes tasks among them automatically:

import multiprocessing
import time
import random

def task(n):
    """Simulate a slow computation"""
    time.sleep(random.uniform(0.5, 1.5))
    result = n * n
    print(f"Task {n}: {n}² = {result}")
    return result

if __name__ == "__main__":
    # Create 4 worker processes
    with multiprocessing.Pool(processes=4) as pool:
        # map: blocks, returns results in input order
        results = pool.map(task, range(10))
        print(f"map results: {results}")

        # imap: lazy iterator, yields results in submission order
        for result in pool.imap(task, range(5)):
            print(f"imap got: {result}")

        # imap_unordered: yields results in completion order (not submission order)
        for result in pool.imap_unordered(task, range(5)):
            print(f"imap_unordered got: {result}")

        # apply_async: submit one task asynchronously, returns an AsyncResult
        async_result = pool.apply_async(task, args=(42,))
        # ... other work can happen here ...
        result = async_result.get(timeout=10)  # block until the result is ready
        print(f"42² = {result}")

        # map_async: asynchronous map, returns an AsyncResult
        async_results = pool.map_async(task, range(8))
        results = async_results.get()
        print(f"map_async results: {results}")

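Choosing a Pool method:

Method	Blocking / async	Returns	Use when
`pool.map()`	Blocking	List	A batch of tasks; wait for all of them
`pool.imap()`	Lazy	Iterator	Many tasks; handle results as they become available, in submission order
`pool.imap_unordered()`	Lazy	Iterator	Many tasks; handle results in completion order
`pool.apply()`	Blocking	Single value	A single task
`pool.apply_async()`	Async	AsyncResult	A single task, without blocking the main process
`pool.map_async()`	Async	AsyncResult	A batch of tasks, without blocking the main process
`pool.starmap()`	Blocking	List	Tasks with several arguments (like itertools.starmap)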
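`starmap()` is the one method in the table that the earlier example does not exercise. The sketch below uses `multiprocessing.pool.ThreadPool`, which exposes the same methods as `Pool` but runs the workers as threads, so the snippet runs as-is without a main guard. That substitution is an assumption made for brevity; for CPU-bound work, use `multiprocessing.Pool` inside an `if __name__ == "__main__":` block instead.

```python
from multiprocessing.pool import ThreadPool

def power(base, exponent):
    return base ** exponent

pairs = [(2, 3), (3, 2), (10, 0)]

with ThreadPool(processes=2) as pool:
    # starmap unpacks each tuple into positional arguments: power(2, 3), power(3, 2), ...
    starmap_results = pool.starmap(power, pairs)
    # imap_unordered yields results as they finish, so sort before relying on the order.
    unordered_results = sorted(pool.imap_unordered(abs, [-3, 1, -2]))

print(starmap_results)    # [8, 9, 1]
print(unordered_results)  # [1, 2, 3]
```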

5.4 Inter-process communication with Queue and Pipe

Processes cannot share variables directly (each has its own memory space), so they communicate through IPC mechanisms:

import multiprocessing

# === Option 1: Queue ===
def producer(q):
    for i in range(5):
        q.put(f"msg-{i}")
    q.put("EOF")  # end-of-stream marker

def consumer(q):
    while True:
        msg = q.get()
        if msg == "EOF":
            break
        print(f"Received: {msg}")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=producer, args=(q,))
    p2 = multiprocessing.Process(target=consumer, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

# === Option 2: Pipe ===
import multiprocessing

# Defined at module level so that child processes started with spawn
# (Windows, macOS) can import them
def sender(conn):
    conn.send("Hello from child")
    conn.send([1, 2, 3])
    conn.send({"key": "value"})
    conn.close()

def receiver(conn):
    print(conn.recv())   # "Hello from child"
    print(conn.recv())   # [1, 2, 3]
    print(conn.recv())   # {"key": "value"}
    conn.close()

if __name__ == "__main__":
    # Pipe returns two connection objects; it is duplex (two-way) by default
    parent_conn, child_conn = multiprocessing.Pipe()

    p = multiprocessing.Process(target=sender, args=(child_conn,))
    p.start()
    receiver(parent_conn)
    p.join()

5.5 Manager: shared data

A Manager provides a way to share Python objects between processes:

import multiprocessing

def worker(mgr_dict, mgr_list, name, value):
    mgr_dict[name] = value
    mgr_list.append(f"{name}:{value}")

if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        # Create data structures shared across processes
        shared_dict = manager.dict()
        shared_list = manager.list()

        processes = []
        for i in range(5):
            p = multiprocessing.Process(
                target=worker,
                args=(shared_dict, shared_list, f"key-{i}", i * 100)
            )
            processes.append(p)
            p.start()

        for p in processes:
            p.join()

        print(f"Shared dict: {dict(shared_dict)}")
        print(f"Shared list: {list(shared_list)}")

Types supported by Manager: dict, list, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value, Array

5.6 Value and Array: shared memory

A more efficient way to share data than a Manager (it uses C types directly):

import multiprocessing

def increment(val, arr):
    # arr is not modified here; it is passed only to show that an Array
    # is handed to a child process the same way as a Value
    for _ in range(1000):
        with val.get_lock():  # explicit locking is required
            val.value += 1

if __name__ == "__main__":
    # Value: a single value in shared memory
    counter = multiprocessing.Value("i", 0)  # "i" = signed int

    # Array: an array in shared memory
    shared_array = multiprocessing.Array("d", [1.0, 2.0, 3.0])  # "d" = double

    processes = [
        multiprocessing.Process(target=increment, args=(counter, shared_array))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(f"Counter: {counter.value}")  # 4000
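The example above creates an Array but never reads or writes it. The sketch below shows the element access and locking pattern on its own, within a single process so that it runs without a main guard; in real use the array would be passed to child processes through `args`, exactly like the Value.

```python
import multiprocessing

shared = multiprocessing.Array("d", [1.0, 2.0, 3.0])  # synchronized array of C doubles

with shared.get_lock():           # same locking pattern as Value
    for i in range(len(shared)):
        shared[i] *= 2

doubled = shared[:]               # slicing copies the contents into a plain list
print(doubled)                    # [2.0, 4.0, 6.0]
```

Holding the lock around the whole loop makes the update atomic with respect to other processes; without it, each individual element access is still synchronized, but the read-modify-write as a whole is not.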

Type codes:

"i": signed int
"f": float
"d": double
"c": char
"b": signed char (note: not a boolean)

6. concurrent.futures: the high-level concurrency API

concurrent.futures, introduced in Python 3.2, is a high-level concurrency API that offers one interface for both thread pools and process pools.

6.1 ThreadPoolExecutor

from concurrent.futures import ThreadPoolExecutor, as_completed
import time
import random

def fetch_url(url):
    """Simulate fetching a URL"""
    time.sleep(random.uniform(0.3, 1.0))  # simulate network latency
    return f"{url} → 200 OK"

urls = [f"https://api.example.com/v1/item/{i}" for i in range(10)]

# Option 1: submit + as_completed (handle results in completion order)
with ThreadPoolExecutor(max_workers=5) as executor:
    # Submit all tasks
    futures = {executor.submit(fetch_url, url): url for url in urls}

    # Handle results as they complete
    for future in as_completed(futures):
        url = futures[future]
        try:
            result = future.result(timeout=5)
            print(f"✓ {result}")
        except Exception as e:
            print(f"✗ {url} failed: {e}")

# Option 2: map (results come back in submission order)
with ThreadPoolExecutor(max_workers=5) as executor:
    results = executor.map(fetch_url, urls, timeout=10)
    for url, result in zip(urls, results):
        print(f"{url}: {result}")

6.2 ProcessPoolExecutor

Same API as the thread pool, backed by processes:

from concurrent.futures import ProcessPoolExecutor, as_completed
import os

def cpu_task(n):
    return f"PID {os.getpid()}: {n}² = {n * n}"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        # submit style
        futures = [executor.submit(cpu_task, i) for i in range(20)]
        for future in as_completed(futures):
            print(future.result())

        # map style
        results = executor.map(cpu_task, range(10, 15))
        for r in results:
            print(r)

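6.3 Future objects

A Future represents the result of an asynchronous operation and offers one interface for querying and controlling it:

Method	Description
`future.result(timeout)`	Get the result (blocks until the task finishes or the timeout expires)
`future.exception(timeout)`	Get the exception raised by the task, or None (also blocks until the task finishes)
`future.done()`	Whether the task has finished
`future.running()`	Whether the task is currently running
`future.cancel()`	Try to cancel the task (only tasks that have not started can be cancelled)
`future.add_done_callback(fn)`	Register a completion callback

from concurrent.futures import ThreadPoolExecutor
import time

def check_done(future):
    """Callback invoked when the task finishes"""
    try:
        result = future.result()
        print(f"Callback: result = {result}")
    except Exception as e:
        print(f"Callback: exception = {e}")

with ThreadPoolExecutor() as executor:
    future = executor.submit(time.sleep, 1)  # time.sleep returns None
    future.add_done_callback(check_done)

    print("Main thread carries on with other work...")
    time.sleep(1.5)
    print("Main thread check: done =", future.done())  # True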
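`cancel()` only succeeds while a task is still waiting in the executor's queue. The sketch below makes that visible by giving the pool a single worker and keeping it busy; the `release` event and the choice of `pow` as the queued task are illustrative only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()

with ThreadPoolExecutor(max_workers=1) as executor:
    # The only worker is occupied until `release` is set...
    blocker = executor.submit(release.wait, 5)
    # ...so this second task stays queued and can still be cancelled.
    queued = executor.submit(pow, 2, 10)

    cancelled = queued.cancel()
    release.set()                      # let the blocking task finish
    blocker_result = blocker.result()  # Event.wait() returns True once the event is set

print(cancelled, queued.cancelled(), blocker_result)  # True True True
```

Calling `result()` on a cancelled future raises `concurrent.futures.CancelledError`, so check `cancelled()` first when a task may have been cancelled.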

6.4 wait: waiting on several tasks

from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED, ALL_COMPLETED
import time
import random

def task(n):
    time.sleep(random.uniform(0.5, 2.0))
    return n * 2

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(task, i) for i in range(10)]

    # Wait until at least one task has finished
    done, not_done = wait(futures, return_when=FIRST_COMPLETED)
    print(f"Finished first: {[f.result() for f in done]}")

    # Wait for all the remaining tasks
    done, not_done = wait(not_done, return_when=ALL_COMPLETED)
    print(f"The rest: {[f.result() for f in done]}")

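7. Comparing and choosing among the three concurrency models

7.1 Comparison table

Dimension	threading (threads)	multiprocessing (processes)	asyncio (async)
Best for	I/O-bound work	CPU-bound work	Highly concurrent I/O-bound work
GIL impact	Limited by it (concurrency only during I/O)	None across processes (each has its own interpreter)	No contention (single thread), but no CPU parallelism either
True parallelism	❌ (for CPU work)	✅	❌ (single thread)
Memory overhead	Low (shared memory)	High (separate memory per process)	Very low (one small object per coroutine)
Creation cost	Low	High (a new process is forked or spawned)	Very low (coroutine switch)
Communication	Shared variables + Lock	Queue / Pipe / Manager	No cross-thread communication needed (single thread)
Data sharing	Simple (shared memory)	Complex (requires IPC)	N/A
Typical concurrency	Tens to hundreds	A few to a few dozen	Tens to hundreds of thousands
Code complexity	Medium (locks to manage)	Medium (IPC to manage)	Higher (async/await spreads through the codebase)
Typical uses	Crawlers, API calls, file handling	Image processing, scientific computing, video encoding	High-concurrency web services, WebSocket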

7.2 Decision tree

Need concurrency?
├── CPU-bound (mostly computing)
│   └── → multiprocessing / ProcessPoolExecutor
│
└── I/O-bound (mostly waiting)
    ├── Simple tasks / low concurrency (< 100)
    │   └── → threading / ThreadPoolExecutor
    │
    └── High concurrency (> 1000) / fine-grained control needed
        └── → asyncio

7.3 Mixing threads and processes

For more complex workloads, the two can be combined:

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time

def cpu_heavy(n):
    """CPU-bound work"""
    return sum(i * i for i in range(n * 10000))

def io_heavy(n):
    """I/O-bound work"""
    time.sleep(0.1)
    return f"io-task-{n}"

if __name__ == "__main__":
    # Process pool for the CPU tasks, thread pool for the I/O tasks
    with ProcessPoolExecutor(max_workers=4) as cpu_pool, \
         ThreadPoolExecutor(max_workers=10) as io_pool:

        # Submit both kinds of task so they run side by side
        cpu_futures = [cpu_pool.submit(cpu_heavy, i) for i in range(8)]
        io_futures = [io_pool.submit(io_heavy, i) for i in range(20)]

        # Collect the results
        for f in cpu_futures:
            print(f"CPU result: {f.result()}")
        for f in io_futures:
            print(f"IO result: {f.result()}")

8. Worked examples

8.1 Speeding up a crawler with threads

from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
import time
import random

def crawl_page(url):
    """Simulate crawling one page"""
    thread = threading.current_thread()
    delay = random.uniform(0.2, 0.8)
    time.sleep(delay)  # simulate network latency
    return {
        "url": url,
        "status": 200,
        "size": random.randint(1000, 50000),
        "thread": thread.name,
        "time": delay,
    }

# URL list
urls = [f"https://example.com/page/{i}" for i in range(50)]

print(f"Crawling {len(urls)} pages...\n")
start = time.time()

with ThreadPoolExecutor(max_workers=10) as executor:
    futures = {executor.submit(crawl_page, url): url for url in urls}

    completed = 0
    for future in as_completed(futures):
        completed += 1
        result = future.result()
        print(
            f"[{completed:2d}/{len(urls)}] {result['url']:35s} | "
            f"status={result['status']} | "
            f"{result['size']:5d} bytes | "
            f"{result['time']:.2f}s | "
            f"{result['thread']}"
        )

print(f"\nTotal time: {time.time() - start:.2f}s")
# 50 pages on 10 threads: roughly 2.5-3 s (50 × 0.5 s average ÷ 10 threads)
# A single thread would need about 0.5 × 50 = 25 s

8.2 Batch image processing with processes

from concurrent.futures import ProcessPoolExecutor
import os
import time

def process_image(args):
    """Simulate image processing: resize, compress, watermark"""
    filename, size = args
    time.sleep(0.3)  # simulate processing time
    new_size = (size[0] // 2, size[1] // 2)
    new_name = filename.replace(".jpg", "_thumb.jpg")

    # In real code Pillow would process the image here:
    # from PIL import Image
    # img = Image.open(filename)
    # img.thumbnail(new_size)
    # img.save(new_name, quality=80)

    return {
        "file": new_name,
        "old_size": size,
        "new_size": new_size,
        "pid": os.getpid(),
    }

if __name__ == "__main__":
    # Simulate 30 images
    images = [(f"photo_{i}.jpg", (4000, 3000)) for i in range(30)]

    print(f"Processing {len(images)} images...\n")
    start = time.time()

    with ProcessPoolExecutor(max_workers=6) as executor:
        futures = [executor.submit(process_image, img) for img in images]

        for i, future in enumerate(futures, 1):
            result = future.result()
            print(
                f"[{i:2d}/{len(images)}] {result['file']:25s} | "
                f"{result['old_size']} → {result['new_size']} | "
                f"PID={result['pid']}"
            )

    print(f"\nTotal time: {time.time() - start:.2f}s")
    # 30 images on 6 processes: about 1.5 s, plus process start-up time
    # A single process would need 0.3 × 30 = 9 s

8.3 A thread-safe connection pool

import threading
import time
import random
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager

class ConnectionPool:
    """
    Thread-safe database connection pool.
    A Semaphore limits concurrency; a Lock protects the data structures.
    """

    def __init__(self, pool_size=5):
        self.pool_size = pool_size
        self.semaphore = threading.Semaphore(pool_size)
        self.lock = threading.Lock()
        self.connections = []  # available connections
        self.in_use = 0        # number of connections in use

        # Fill the pool
        for i in range(pool_size):
            self.connections.append(self._create_connection(i))

    def _create_connection(self, conn_id):
        return {"id": conn_id, "connected_at": time.time()}

    @contextmanager
    def get_connection(self):
        """Borrow a connection (context manager; returns it automatically)"""
        self.semaphore.acquire()
        with self.lock:
            conn = self.connections.pop()
            self.in_use += 1
            print(f"  → Took connection #{conn['id']} (in use: {self.in_use}/{self.pool_size})")

        try:
            yield conn
        finally:
            with self.lock:
                self.connections.append(conn)
                self.in_use -= 1
                print(f"  ← Returned connection #{conn['id']} (in use: {self.in_use}/{self.pool_size})")
            self.semaphore.release()

# Using the pool
pool = ConnectionPool(pool_size=3)

def execute_query(query_id):
    with pool.get_connection() as conn:
        duration = random.uniform(0.5, 1.5)
        print(f"Query {query_id} running (connection #{conn['id']})...")
        time.sleep(duration)
        print(f"Query {query_id} finished ({duration:.1f}s)")

# 8 concurrent queries, but at most 3 run at the same time
with ThreadPoolExecutor(max_workers=8) as executor:
    executor.map(execute_query, range(8))

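9. Common pitfalls and best practices

9.1 Pitfall 1: forgetting join()

import threading
import time

def long_task():
    for i in range(100):
        print(f"Progress: {i}%")
        time.sleep(0.1)

# ❌ Wrong: a daemon thread is killed as soon as the main thread exits
t = threading.Thread(target=long_task, daemon=True)
t.start()
# If the program ends here, long_task never finishes.
# (A non-daemon thread would keep the process alive, but code placed here
# would still run before the thread's work is done.)

# ✅ Right: wait for the thread before moving on
t = threading.Thread(target=long_task, daemon=True)
t.start()
t.join()  # block until the thread has finished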

9.2 Pitfall 2: shared variables without a lock

# ❌ += is not an atomic operation in Python
counter = 0
def increment():
    global counter
    counter += 1  # read → add → write; a thread switch can happen between the steps

# ✅ Use a lock or a thread-safe data structure
import threading
counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:
        counter += 1

9.3 Pitfall 3: deadlock

import threading
import time

# ❌ The classic deadlock
lock_a = threading.Lock()
lock_b = threading.Lock()

def thread1():
    with lock_a:
        time.sleep(0.1)
        with lock_b:  # waits for lock_b
            pass

def thread2():
    with lock_b:
        time.sleep(0.1)
        with lock_a:  # waits for lock_a
            pass

# thread1 holds lock_a and waits for lock_b
# thread2 holds lock_b and waits for lock_a
# → deadlock!

# ✅ The fix: always acquire locks in the same order
def thread1():
    with lock_a:
        with lock_b:
            pass

def thread2():
    with lock_a:  # take lock_a first (same order as thread1)
        with lock_b:
            pass

9.4 Pitfall 4: multiprocessing on Windows

# ❌ On Windows the child re-imports this module and reaches this code again
import multiprocessing

def worker():
    pass

p = multiprocessing.Process(target=worker)
p.start()  # Windows re-imports the whole module; CPython stops with a RuntimeError

# ✅ Right: protect the entry point with a __name__ check
if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()

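9.5 Pitfall 5: passing unpicklable objects to a process pool

# With ProcessPoolExecutor, the callable, its arguments, and its return value must all be picklable
from concurrent.futures import ProcessPoolExecutor

lambda_fn = lambda x: x * 2

# ✅ Use a regular module-level function instead
def double(x):
    return x * 2

if __name__ == "__main__":
    # ❌ A lambda cannot be pickled
    with ProcessPoolExecutor() as executor:
        bad = executor.submit(lambda_fn, 5)
        # bad.result() would raise a pickling error here

    # ✅ A module-level function works
    with ProcessPoolExecutor() as executor:
        f = executor.submit(double, 5)
        print(f.result())  # 10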

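9.6 Best-practice checklist

Prefer thread-safe data structures: if queue.Queue does the job, do not use list + Lock
Keep locked regions small: lock only the critical code, not the whole function
Use with statements: they make it impossible to forget to release a lock or semaphore
Acquire locks in a consistent order: this prevents deadlock
Use ThreadPoolExecutor for I/O-bound work and ProcessPoolExecutor for CPU-bound work
Always protect ProcessPoolExecutor code with if __name__ == "__main__":
Set sensible timeouts: future.result(timeout=...) avoids blocking forever
Handle exceptions: an exception raised in a task is stored on its Future and re-raised by future.result(), so wrap that call (or the task body) in try/except to keep one failed task from aborting the rest of the batch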
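The last two items can be combined in one loop. The sketch below is one possible arrangement, assuming a hypothetical task function `risky` that fails for a single input; a timeout on `as_completed` bounds the total wait, and each `future.result()` call is wrapped so that one failure does not stop the others from being collected.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def risky(n):
    """Hypothetical task: fails for one particular input."""
    if n == 3:
        raise ValueError(f"bad input: {n}")
    return n * 10

results, errors = {}, {}
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = {executor.submit(risky, n): n for n in range(5)}
    # as_completed raises TimeoutError if the whole batch takes longer than 10 s.
    for future in as_completed(futures, timeout=10):
        n = futures[future]
        try:
            results[n] = future.result()
        except ValueError as exc:
            errors[n] = str(exc)   # record the failure and keep going

print(sorted(results.items()))  # [(0, 0), (1, 10), (2, 20), (4, 40)]
print(errors)                   # {3: 'bad input: 3'}
```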

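10. Summary

Scenario	Recommended approach	Key reason
Web crawling / API calls	`ThreadPoolExecutor`	I/O-bound + simple API
Image / video processing	`ProcessPoolExecutor`	CPU-bound + true parallelism
High-concurrency web services	`asyncio` + `aiohttp`	Very low overhead + tens of thousands of connections
Simple parallel tasks	`ThreadPoolExecutor`	Lightweight + no IPC overhead
Database operations	`ThreadPoolExecutor`	I/O-bound + thread-safe
Scientific computing / data processing	`ProcessPoolExecutor`	Sidesteps the GIL
Mixed workloads (I/O + CPU)	`ProcessPoolExecutor` + `ThreadPoolExecutor`	The best of both

Three rules of thumb to remember:

Waiting on I/O → use threads (network, files, databases)
CPU can't keep up → use processes (computation, encoding, model inference)
Huge numbers of concurrent waits → use async (coroutines, event loop)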

Recommended reading:

A Complete Guide to Python asyncio
A Complete Guide to Python Metaclasses
A Complete Guide to Python Type Annotations
A Complete Guide to Python Context Managers
