How do I know when to use Python multiprocessing Pool or Process?

I’m working on an algorithm in which a method, Multiprocess(), is supposed to call another method, CreateMatrixMp(), in parallel as many times as there are CPUs available. Since I have never worked with multiprocessing before, I’m unsure whether to use Pool or Process.

Efficiency matters here because CreateMatrixMp() may be called thousands of times. I’ve read through the Python documentation on the multiprocessing module and have considered two possibilities:

Using the Pool class:

def MatrixHelper(self, args):
    # Unpack the argument tuple and forward it to the real worker method
    return self.CreateMatrixMp(*args)

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print(f'Number of CPUs to process WM: {cpus}')
    poolCount = cpus * 2
    args = [(sigmaI, sigmaX, i) for i in range(self.numPixels)]

    pool = mp.Pool(processes=poolCount, maxtasksperchild=2)
    tempData = pool.map(self.MatrixHelper, args)
    pool.close()
    pool.join()

Using the Process class:

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print(f'Number of CPUs to process WM: {cpus}')
    
    processes = [mp.Process(target=self.CreateMatrixMp, args=(sigmaI, sigmaX, i,)) for i in range(self.numPixels)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

From my research, Pool seems like the better choice: it has lower overhead, and it can size itself to the number of CPUs on the machine automatically. The only issue with using Pool is that I keep running into errors, and every time I fix one, another pops up. On the other hand, Process seems simpler to implement, and for all I know it might be the better choice.

Should I use Pool instead of Process, and is map() the right function to use if I want to maintain the order of results? I need to collect the results of each process in a list. How does Process handle returned data, and is it a good fit for this case?

In short: when should I use Pool versus Process, and what is the best approach for handling the data returned by each parallel task?

If you want to handle parallel tasks efficiently while maintaining the order of results, Pool combined with map() is the better approach. map() returns results in the same order as the inputs, and Pool handles distributing tasks across worker processes for you (if you omit the processes argument, it defaults to the machine’s CPU count).

Here’s an updated approach using Pool:

import multiprocessing as mp

def MatrixHelper(self, args):
    # Unpack the argument tuple and forward it to the real worker method
    return self.CreateMatrixMp(*args)

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print(f'Number of CPUs to process WM: {cpus}')
    poolCount = cpus * 2  # For CPU-bound work, cpus itself is usually enough; tune for your workload

    # Prepare the arguments for each process
    args = [(sigmaI, sigmaX, i) for i in range(self.numPixels)]

    # Use Pool to distribute tasks and collect results
    with mp.Pool(processes=poolCount, maxtasksperchild=2) as pool:
        tempData = pool.map(self.MatrixHelper, args)

    return tempData

In this solution, the Pool takes care of distributing the tasks, and map() ensures the results are collected in input order. This is typically more efficient than spawning a Process per task, because Pool reuses a fixed set of workers. One caveat: maxtasksperchild=2 restarts each worker after only two tasks, which adds overhead when the function is called thousands of times; leave it at the default (None) unless your workers leak memory.
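Here is a minimal, standalone sketch of the same pattern, with a hypothetical create_matrix() standing in for CreateMatrixMp(). Note the if __name__ == '__main__': guard: on platforms that use the spawn start method (Windows, and macOS by default on recent Python versions), child processes re-import the module, and without the guard the pool would be created recursively.

import multiprocessing as mp

def create_matrix(sigmaI, sigmaX, i):
    # Hypothetical stand-in for CreateMatrixMp(); any picklable return works
    return sigmaI * sigmaX * i

def matrix_helper(args):
    # Unpack the argument tuple, since map() passes a single argument per task
    return create_matrix(*args)

if __name__ == "__main__":
    args = [(0.5, 2.0, i) for i in range(10)]
    with mp.Pool() as pool:  # defaults to os.cpu_count() workers
        results = pool.map(matrix_helper, args)
    # map() preserves input order: results[i] corresponds to args[i]
    print(results)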

If you choose to use Process for fine-grained control over task execution, you can use a Queue to collect the results from each process. This approach lets you manage task execution explicitly and handle each process’s return value yourself.

Here’s how you can implement it:

import multiprocessing as mp

def worker(result_queue, sigmaI, sigmaX, index):
    # Assumes a module-level CreateMatrixMp(); each worker computes one
    # result and puts it on the shared queue (in completion order)
    result = CreateMatrixMp(sigmaI, sigmaX, index)
    result_queue.put(result)

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print(f'Number of CPUs to process WM: {cpus}')
    
    # Create a Queue for collecting results
    result_queue = mp.Queue()

    # Create processes for each task
    processes = [mp.Process(target=worker, args=(result_queue, sigmaI, sigmaX, i)) for i in range(self.numPixels)]

    # Start all processes
    for p in processes:
        p.start()

    # Drain the queue BEFORE joining: a child blocks on exit until the
    # data it put on the queue has been consumed, so joining first can
    # deadlock. Getting exactly len(processes) items is also more reliable
    # than polling result_queue.empty(), which can race.
    tempData = [result_queue.get() for _ in processes]

    # Now wait for all processes to finish
    for p in processes:
        p.join()

    return tempData

This approach lets you use a Process per task while collecting the results via a Queue. Two caveats: the results arrive in completion order rather than input order (unlike map()), and spawning one process per task gets expensive when there are thousands of tasks. It gives you flexibility for custom task handling, but introduces more complexity than Pool. If you need ordered results with this pattern, tag each result with its task index, as in the sketch below.
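As a sketch of that index-tagging idea, again with a hypothetical create_matrix() standing in for CreateMatrixMp(): each worker puts an (index, result) pair on the queue, and the parent sorts by index to recover the input order that map() would give.

import multiprocessing as mp

def create_matrix(sigmaI, sigmaX, i):
    # Hypothetical stand-in for CreateMatrixMp()
    return sigmaI * sigmaX * i

def worker(result_queue, sigmaI, sigmaX, index):
    # Tag the result with its task index so the parent can restore order
    result_queue.put((index, create_matrix(sigmaI, sigmaX, index)))

if __name__ == "__main__":
    result_queue = mp.Queue()
    processes = [mp.Process(target=worker, args=(result_queue, 0.5, 2.0, i))
                 for i in range(10)]
    for p in processes:
        p.start()
    tagged = [result_queue.get() for _ in processes]  # drain before join
    for p in processes:
        p.join()
    # Sort by index to recover input order, then strip the tags
    tempData = [result for _, result in sorted(tagged)]
    print(tempData)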

If each task requires multiple arguments (as in your case), Pool.starmap() is more convenient than map(): it unpacks each argument tuple for you, so the MatrixHelper wrapper that forwards *args is no longer needed.

Here’s an updated solution using starmap():

import multiprocessing as mp

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print(f'Number of CPUs to process WM: {cpus}')
    poolCount = cpus * 2

    # Prepare the argument tuples for each task
    args = [(sigmaI, sigmaX, i) for i in range(self.numPixels)]

    # starmap() unpacks each tuple into separate arguments, so the target
    # method can be passed directly and no MatrixHelper wrapper is needed
    with mp.Pool(processes=poolCount, maxtasksperchild=2) as pool:
        tempData = pool.starmap(self.CreateMatrixMp, args)

    return tempData

Using starmap() lets you call functions that take multiple arguments directly, keeping the code simple while still maintaining the order of results. Finally, since CreateMatrixMp() may be called thousands of times, Pool.imap() is worth knowing about as well; see the sketch below.
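At the “thousands of calls” scale, map() and starmap() build the full result list in memory before returning. Pool.imap() yields results lazily, still in input order, and a larger chunksize reduces inter-process communication overhead for many small tasks. A minimal sketch, again with a hypothetical create_matrix() standing in for CreateMatrixMp():

import multiprocessing as mp

def create_matrix(sigmaI, sigmaX, i):
    # Hypothetical stand-in for CreateMatrixMp()
    return sigmaI * sigmaX * i

def matrix_helper(args):
    # Unpack the tuple, since imap() passes a single argument per task
    return create_matrix(*args)

if __name__ == "__main__":
    args = [(0.5, 2.0, i) for i in range(100_000)]
    with mp.Pool() as pool:
        # imap() yields results lazily and in input order; chunksize batches
        # tasks to cut down on inter-process communication overhead
        tempData = list(pool.imap(matrix_helper, args, chunksize=256))
    print(len(tempData))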