ProcessPoolExecutor in Python: The Complete Guide

Python Processes and the Need for Process Pools

So, what are processes and why do we care about process pools?

What Are Python Processes

A process refers to a running instance of a computer program.

Every Python program is a process and has one thread called the main thread used to execute your program instructions. Each process is, in fact, one instance of the Python interpreter that executes Python instructions (Python byte-code), which is a slightly lower level than the code you type into your Python program.
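To make this concrete, the sketch below asks a running program to report its own process and thread. It uses only the standard library; the helper name describe_current_process() is hypothetical and chosen just for this illustration.

```python
import os
import threading
from multiprocessing import current_process


def describe_current_process():
    """Return basic facts about the process and thread running this code."""
    thread = threading.current_thread()
    return {
        'pid': os.getpid(),                      # OS-level process identifier
        'process_name': current_process().name,  # name assigned by multiprocessing
        'thread_name': thread.name,              # name of the executing thread
        'is_main_thread': thread is threading.main_thread(),
    }


if __name__ == '__main__':
    for key, value in describe_current_process().items():
        print(f'{key}: {value}')
```

Run as a script, this typically reports a process named MainProcess executing in a thread named MainThread, although the exact names depend on how the program is launched.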

Sometimes we may need to create new processes to run additional tasks concurrently.

Python provides real system-level processes via the Process class in the multiprocessing module.

The underlying operating system controls how new processes are created. On some systems, that may require spawning a new process, and on others, it may require that the process is forked. The operating-system-specific method used for creating new processes in Python is not something we need to worry about, as it is managed by your installed Python interpreter.
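Although we rarely need to care, it can be useful to see which method our platform uses. The sketch below is a minimal example that queries the multiprocessing module for the default and the available start methods. The helper name start_method_report() is hypothetical.

```python
import multiprocessing


def start_method_report():
    """Return the default start method and all methods this platform supports."""
    return {
        # note: calling get_start_method() also fixes the default for this program
        'default': multiprocessing.get_start_method(),
        'available': multiprocessing.get_all_start_methods(),
    }


if __name__ == '__main__':
    report = start_method_report()
    print('default start method:', report['default'])
    print('available start methods:', report['available'])
```

The values reported vary by operating system and Python version, so the output on your machine may differ from someone else's.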

A task can be run in a new process by creating an instance of the Process class and specifying the function to run in the new process via the “target” argument.

...
# define a task to run in a new process
p = Process(target=task)
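Creating the Process instance only configures the new process; nothing runs yet. The sketch below illustrates this by inspecting a process object before it is started. The task() function and the make_process() helper are placeholders for this example, and the "name" argument is an optional Process argument not discussed above.

```python
from multiprocessing import Process


def task():
    # hypothetical placeholder task for this sketch
    print('This is another process', flush=True)


def make_process():
    """Create, but do not start, a process configured to run task()."""
    return Process(target=task, name='worker-1')


if __name__ == '__main__':
    p = make_process()
    # no operating system process exists until start() is called
    print('name:', p.name)
    print('pid:', p.pid)
    print('alive:', p.is_alive())
    print('exit code:', p.exitcode)
```

Before the process is started, it has no process identifier and no exit code, and it is not reported as alive.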

Once the process is created, it must be started by calling the start() method.

...
# start the task in a new process
p.start()

We can then wait around for the task to complete by joining the process; for example:

...
# wait for the task to complete
p.join()
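Once join() returns, the child process has terminated and we can check how it ended. The sketch below is one way to do this using the exitcode attribute of the Process class. The task() function and the run_and_wait() helper are placeholders for this example.

```python
from multiprocessing import Process


def task():
    # hypothetical placeholder task for this sketch
    print('This is another process', flush=True)


def run_and_wait():
    """Start task() in a child process, wait for it, and return its exit code."""
    p = Process(target=task)
    p.start()
    # block until the child process has terminated
    p.join()
    # the exit code is 0 when the target function returned normally
    return p.exitcode


if __name__ == '__main__':
    print('exit code:', run_and_wait())
```

A non-zero exit code would suggest the task raised an unhandled exception or the process was terminated.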

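Whenever we create new processes, we must protect the entry point of the program. This matters most when processes are spawned rather than forked: each child process imports the main module, so any unprotected code that creates processes would run again in every child.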

# entry point for the program
if __name__ == '__main__':
    # do things...

Tying this together, the complete example of creating a Process to run an ad hoc task function is listed below.

# SuperFastPython.com
# example of running a function in a new process
from multiprocessing import Process

# a task to execute in another process
def task():
    print('This is another process', flush=True)

# entry point for the program
if __name__ == '__main__':
    # define a task to run in a new process
    p = Process(target=task)
    # start the task in a new process
    p.start()
    # wait for the task to complete
    p.join()

This is useful for running one-off ad hoc tasks in a separate process, although it becomes cumbersome when you have many tasks to run.

Each process that is created requires the allocation of resources (e.g. an instance of the Python interpreter and memory for the stack of the process’s main thread). The computational cost of setting up processes can become significant if we are creating and destroying many processes over and over for ad hoc tasks.

Instead, we would prefer to keep worker processes around for reuse if we expect to run many ad hoc tasks throughout our program.

This can be achieved using a process pool.
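As a preview, the sketch below shows worker reuse with the ProcessPoolExecutor class from the concurrent.futures module. Each task returns the identifier of the process that ran it, so we can count how many worker processes handled the work. The task() function, the run_tasks() helper, and the choice of 10 tasks and 2 workers are assumptions made for this illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def task(value):
    # hypothetical task: report which worker process handled this value
    return value, os.getpid()


def run_tasks(n_tasks=10, n_workers=2):
    """Run n_tasks calls of task() on a pool of at most n_workers processes."""
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        # map() returns results in the same order as the inputs
        return list(executor.map(task, range(n_tasks)))


if __name__ == '__main__':
    results = run_tasks()
    worker_pids = {pid for _, pid in results}
    print(f'{len(results)} tasks ran on {len(worker_pids)} worker process(es)')
```

Many tasks are executed, but no more than the configured number of worker processes are created, which is the reuse we are after.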
