Multiprocessing in Python
1. Introduction
Python's multiprocessing module lets a program run work in several processes at once, each of which can execute on a different CPU core. This is particularly useful for CPU-bound tasks that can be split into independent pieces and run in parallel.
2. Key Concepts
- Process: An instance of a running program, with its own memory space.
- Thread: A unit of execution inside a process; threads in the same process share its memory.
- Concurrency: Making progress on multiple tasks over overlapping time periods, not necessarily at the same instant.
- Parallelism: Executing multiple operations at the same instant, for example on separate CPU cores.
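The difference between a thread and a process can be observed through process IDs. The sketch below is only an illustration: the helper names current_pid_in_thread and show_pid are made up for this example. A thread reports the same PID as the program that started it, while a child process reports a different one.

```python
import multiprocessing
import os
import threading


def current_pid_in_thread():
    """Return the process ID observed from inside a new thread."""
    seen = []
    t = threading.Thread(target=lambda: seen.append(os.getpid()))
    t.start()
    t.join()
    return seen[0]


def show_pid(label):
    """Print the ID of whichever process runs this function."""
    print(f"{label}: pid={os.getpid()}")


if __name__ == "__main__":
    show_pid("parent")
    # A thread lives inside the parent, so it reports the parent's PID.
    print(f"thread: pid={current_pid_in_thread()}")
    # A child process is a separate program instance with its own PID.
    p = multiprocessing.Process(target=show_pid, args=("child process",))
    p.start()
    p.join()
```

The actual PID values vary from run to run; what matters is which of them match.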
3. Getting Started
To use the multiprocessing module, you first need to import it. Below is a basic setup:
import multiprocessing

def worker():
    # Runs in the child process.
    print("Worker Function")

if __name__ == "__main__":
    # Create the child process, start it, then wait for it to finish.
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
4. Code Example
Here is a more detailed example that demonstrates the use of multiple processes.
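import multiprocessing
import time

def square(n):
    time.sleep(1)  # simulate one second of work
    print(f'Square of {n} is {n * n}')

if __name__ == "__main__":
    processes = []
    numbers = [1, 2, 3, 4, 5]
    # Start one process per number so the five calls run side by side.
    for number in numbers:
        p = multiprocessing.Process(target=square, args=(number,))
        processes.append(p)
        p.start()
    # Wait for every process to finish before the program exits.
    for p in processes:
        p.join()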
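The example above prints its results from inside each child, because the return value of a Process target is discarded. When the parent needs the values back, one option is multiprocessing.Pool, whose map method returns them in input order. The following is a minimal sketch of that variant; the function name squares_in_parallel and the default of two workers are assumptions made for illustration.

```python
import multiprocessing


def square(n):
    """Return the square of n instead of printing it."""
    return n * n


def squares_in_parallel(numbers, workers=2):
    """Map square() over numbers using a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # map() blocks until all results are ready and keeps the input order.
        return pool.map(square, numbers)


if __name__ == "__main__":
    print(squares_in_parallel([1, 2, 3, 4, 5]))  # [1, 4, 9, 16, 25]
```

Because the pool reuses a fixed number of workers, this form also avoids starting one process per input item.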
5. Best Practices
- Always protect the entry point of the program by using if __name__ == "__main__".
- Use a Pool for managing multiple processes efficiently.
- Be cautious of shared data; use Manager or Lock when necessary.
- Keep track of processes and ensure they are properly joined.
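As one possible way to keep track of processes, the sketch below joins each process with a timeout and then inspects its exitcode attribute. The names task and failed_names are hypothetical helpers, and the 10-second timeout is an arbitrary choice for the example.

```python
import multiprocessing
import sys


def task(should_fail):
    """Exit with a non-zero status when asked to simulate a failure."""
    if should_fail:
        sys.exit(1)


def failed_names(exit_codes):
    """Return the names whose exit code is anything other than 0.

    An exit code of None means the process had not finished yet.
    """
    return [name for name, code in exit_codes.items() if code != 0]


if __name__ == "__main__":
    procs = [
        multiprocessing.Process(target=task, args=(flag,), name=f"task-{i}")
        for i, flag in enumerate([False, True])
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join(timeout=10)  # wait, but do not block forever
    codes = {p.name: p.exitcode for p in procs}
    print(failed_names(codes))  # ['task-1']
```

Checking exitcode after join is what distinguishes a process that finished cleanly from one that crashed or is still running.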
6. FAQ
What is the difference between threading and multiprocessing in Python?
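Threads run inside one process and share its memory, which makes them cheap to start and a good fit for I/O-bound tasks that spend most of their time waiting. Multiprocessing runs separate processes, each with its own memory, and is usually the better choice for CPU-bound tasks.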
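The sketch below pairs each kind of task with the tool that usually suits it. It is an illustration only: fake_download stands in for real I/O by sleeping, and sum_of_squares stands in for real computation; both names, and run_io_in_threads, are made up for this example.

```python
import multiprocessing
import threading
import time


def fake_download(delay, out, index):
    """I/O-bound stand-in: sleeping releases the GIL, like a blocking read."""
    time.sleep(delay)
    out[index] = delay


def sum_of_squares(n):
    """CPU-bound stand-in: pure Python arithmetic."""
    return sum(i * i for i in range(n))


def run_io_in_threads(delays):
    """Run one fake_download per delay, each in its own thread."""
    out = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_download, args=(d, out, i))
        for i, d in enumerate(delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    start = time.perf_counter()
    run_io_in_threads([0.2] * 4)  # the four waits overlap
    print(f"threads took {time.perf_counter() - start:.2f}s")

    # CPU-bound work goes to separate processes instead.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(sum_of_squares, [10, 100]))  # [285, 328350]
```

The threads can write into the shared list out directly because they live in the same process; the pool workers cannot, which is why their results come back as return values.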
How does the Global Interpreter Lock (GIL) affect multiprocessing?
In CPython, the GIL allows only one thread per process to execute Python bytecode at a time. It does not limit multiprocessing, because each process has its own Python interpreter and its own GIL.
Can I share data between processes?
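Yes. You can share data through shared memory (for example multiprocessing.Value and multiprocessing.Array) or through a multiprocessing.Manager. Guard concurrent updates with a Lock to avoid race conditions.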
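As a minimal sketch of the shared-memory route, the example below has four processes increment one multiprocessing.Value. The function name add_many and the counts used are assumptions for illustration. Each update holds the lock that comes with the Value, since += on a shared value is a read followed by a write and is not atomic.

```python
import multiprocessing


def add_many(shared, times):
    """Increment a shared counter `times` times, holding its lock each time."""
    for _ in range(times):
        with shared.get_lock():
            shared.value += 1


if __name__ == "__main__":
    total = multiprocessing.Value("i", 0)  # "i" = signed C int in shared memory
    workers = [
        multiprocessing.Process(target=add_many, args=(total, 1000))
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(total.value)  # 4000
```

Without the lock, two processes could read the same old value and one increment would be lost, so the final total could come out below 4000.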