Is there a Python threadpool like multiprocessing.Pool?

Is there a Python threadpool similar to the multiprocessing.Pool for worker threads?

I like the simplicity of using multiprocessing.Pool to parallelize tasks, for example:

import multiprocessing

def long_running_func(p):
    c_func_no_gil(p)

p = multiprocessing.Pool(4)
xs = p.map(long_running_func, range(100))

However, I want to achieve the same parallelization without the overhead of creating new processes.

I understand the Global Interpreter Lock (GIL), but in my use case, the function being called is an IO-bound C function, and the Python wrapper releases the GIL before invoking the actual function.

Do I need to implement my own threading pool for this, or is there an existing Python threadpool solution that can handle this scenario?

Hey @MiroslavRalevic

May your code always be bug-free and your threads never deadlock!

Using concurrent.futures.ThreadPoolExecutor: The concurrent.futures module provides a high-level interface for asynchronously executing callables. You can use ThreadPoolExecutor to manage a pool of threads easily. This is especially handy for IO-bound functions as it avoids the overhead of creating new processes. Here’s an example:

from concurrent.futures import ThreadPoolExecutor

def long_running_func(p):
    c_func_no_gil(p)

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(long_running_func, range(100)))

It’s a simple yet powerful way to implement a Python thread pool!
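If you want results as each task finishes (or per-task exception handling) rather than in input order, ThreadPoolExecutor also supports submit together with as_completed. Here is a minimal sketch; since c_func_no_gil isn't defined in this thread, it uses a stand-in function that just doubles its input:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def long_running_func(p):
    # Stand-in for the IO-bound C call (c_func_no_gil in the question).
    return p * 2

with ThreadPoolExecutor(max_workers=4) as executor:
    # Map each Future back to its input so we can pair results with arguments.
    futures = {executor.submit(long_running_func, p): p for p in range(10)}
    results = {}
    for future in as_completed(futures):
        # result() re-raises any exception the task raised, per task.
        results[futures[future]] = future.result()
```

Unlike executor.map, this loop yields futures in completion order, which is handy when task durations vary a lot.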

Hey Everyone!

Building on what @ian-partridge mentioned, another approach is using queue.Queue with worker threads. This method lets you manually manage the thread pool, providing more control over task distribution among threads. Here’s an example:

import threading
import queue

def worker(task_queue):
    while True:
        p = task_queue.get()
        if p is None:  # sentinel value: no more work, shut this worker down
            break
        c_func_no_gil(p)

task_queue = queue.Queue()
threads = []

# Create worker threads
for _ in range(4):
    t = threading.Thread(target=worker, args=(task_queue,))
    t.start()
    threads.append(t)

# Add tasks to the queue
for p in range(100):
    task_queue.put(p)

# Stop workers
for _ in range(4):
    task_queue.put(None)
for t in threads:
    t.join()

This gives you the flexibility to tailor thread behavior exactly as needed!
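One refinement to the manual approach: queue.Queue has task_done() and join(), which let the main thread block until every queued task has actually been processed, instead of relying only on sentinels for shutdown. A sketch below, again with a stand-in function doubling its input in place of c_func_no_gil, and a lock-protected list for collecting results:

```python
import threading
import queue

def process(p):
    # Stand-in for the IO-bound C call.
    return p * 2

results = []
results_lock = threading.Lock()
task_queue = queue.Queue()

def worker():
    while True:
        p = task_queue.get()
        if p is None:  # sentinel: shut this worker down
            task_queue.task_done()
            break
        r = process(p)
        with results_lock:
            results.append(r)
        task_queue.task_done()  # mark this item as fully processed

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for p in range(20):
    task_queue.put(p)

task_queue.join()  # blocks until task_done() has been called for every item

for _ in threads:
    task_queue.put(None)
for t in threads:
    t.join()
```

The join()/task_done() pairing guarantees all work is finished before you read the results, independent of how shutdown is signaled.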

Adding to Toby’s detailed explanation, another efficient method is using multiprocessing.dummy.Pool. This version of Pool uses threads rather than processes, which reduces overhead while still allowing thread-based parallelism. Here’s how it works:

from multiprocessing.dummy import Pool as ThreadPool

def long_running_func(p):
    c_func_no_gil(p)

pool = ThreadPool(4)
results = pool.map(long_running_func, range(100))
pool.close()
pool.join()

It’s perfect if you want the simplicity of multiprocessing.Pool but with the lighter weight of threads!
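Since Pool is also a context manager, you can let a with block handle the close/join bookkeeping for you. Same sketch as above, with a doubling stand-in for c_func_no_gil:

```python
from multiprocessing.dummy import Pool as ThreadPool

def long_running_func(p):
    # Stand-in for the IO-bound C call.
    return p * 2

# The with block terminates the pool on exit; map() has already
# completed by then, so all results are in.
with ThreadPool(4) as pool:
    results = pool.map(long_running_func, range(100))
```

This keeps the familiar Pool API while making cleanup automatic, even if an exception is raised inside the block.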