What Is the Python Global Interpreter Lock (GIL)?
The Python Global Interpreter Lock, or GIL, is in simple terms a mutex (a lock) that allows only one thread to hold control of the Python interpreter. This means that only one thread can be in a state of execution at any point in time. The impact of the GIL isn’t visible to developers who execute single-threaded programs, but it can be a performance bottleneck in CPU-bound and multi-threaded code.
Since the GIL allows only one thread to execute at a time, even in a multi-threaded architecture with more than one CPU core, it has gained a reputation as an “infamous” feature of Python.
In this article you’ll learn how the GIL affects the performance of your Python programs, and how you can mitigate the impact it might have on your code.
What Problem Did the GIL Solve for Python?
Python uses reference counting for memory management. It means that objects created in Python have a reference count variable that keeps track of the number of references that point to the object. When this count reaches zero, the memory occupied by the object is released.
Let’s take a look at a brief code example to demonstrate how reference counting works:
>>> import sys
>>> a = []
>>> b = a
>>> sys.getrefcount(a)
3
In the above example, the reference count for the empty list object [] was 3. The list object was referenced by a, by b, and by the argument passed to sys.getrefcount().
Back to the GIL:
The problem was that this reference count variable needed protection from race conditions where two threads increase or decrease its value simultaneously. If this happens, it can cause either leaked memory that is never released or, even worse, memory that is incorrectly released while a reference to that object still exists. This can lead to crashes or other “weird” bugs in your Python programs.
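To make that race concrete, here is a minimal sketch at the Python level (an analogy only, not CPython’s actual C refcount code, and the names are made up for illustration). Two threads perform an unprotected read-modify-write on a shared counter; the time.sleep(0) call simply invites a thread switch between the read and the write, so the exact result will vary from run to run:

# race_sketch.py - an analogy for an unprotected shared counter (illustrative only)
import time
from threading import Thread

counter = 0
N = 10_000

def unsafe_add():
    global counter
    for _ in range(N):
        current = counter      # read the shared value
        time.sleep(0)          # yield: another thread may update counter now
        counter = current + 1  # write back, possibly overwriting that update

threads = [Thread(target=unsafe_add) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"expected {2 * N}, got {counter}")  # usually prints a smaller number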
This reference count variable can be kept safe by adding locks to all data structures that are shared across threads so that they are not modified inconsistently.
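Continuing the same sketch, guarding the read-modify-write with a threading.Lock removes the lost updates, at the cost of acquiring and releasing the lock on every change (again, an illustrative toy, not CPython internals):

# lock_sketch.py - the same counter, now protected by a lock (illustrative only)
import time
from threading import Thread, Lock

counter = 0
counter_lock = Lock()
N = 10_000

def safe_add():
    global counter
    for _ in range(N):
        with counter_lock:         # only one thread can run this block at a time
            current = counter
            time.sleep(0)          # even if a switch happens, the lock is still held
            counter = current + 1

threads = [Thread(target=safe_add) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"expected {2 * N}, got {counter}")  # now always matches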
But adding a lock to each object or group of objects means multiple locks will exist, which can cause another problem: deadlocks (deadlocks can only happen if there is more than one lock). Another side effect would be decreased performance caused by the repeated acquisition and release of locks.
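As a rough illustration of the deadlock risk (again a contrived sketch, unrelated to CPython’s internals), two threads that each hold one of two locks while waiting for the other can block each other forever; a timeout is used here only so the script terminates instead of hanging:

# deadlock_sketch.py - a contrived two-lock deadlock (illustrative only)
import time
from threading import Thread, Lock

lock_a = Lock()
lock_b = Lock()

def worker(first, second, name):
    with first:
        time.sleep(0.1)                  # give the other thread time to grab its lock
        if second.acquire(timeout=1):    # without the timeout, this would hang forever
            second.release()
            print(f"{name}: finished")
        else:
            print(f"{name}: gave up waiting - deadlock")

t1 = Thread(target=worker, args=(lock_a, lock_b, "thread-1"))
t2 = Thread(target=worker, args=(lock_b, lock_a, "thread-2"))
t1.start()
t2.start()
t1.join()
t2.join()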
The GIL is a single lock on the interpreter itself which adds a rule that execution of any Python bytecode requires acquiring the interpreter lock. This prevents deadlocks (as there is only one lock) and doesn’t introduce much performance overhead. But it effectively makes any CPU-bound Python program single-threaded.
The GIL, although used by interpreters for other languages like Ruby, is not the only solution to this problem. Some languages avoid the requirement of a GIL for thread-safe memory management by using approaches other than reference counting, such as garbage collection.
On the other hand, this means that those languages often have to compensate for the loss of the single-threaded performance benefits of a GIL by adding other performance-boosting features such as JIT compilers.
Why Was the GIL Chosen as the Solution?
So, why was an approach that is seemingly so obstructing used in Python? Was it a bad decision by the developers of Python?
Well, in the words of Larry Hastings, the design decision of the GIL is one of the things that made Python as popular as it is today.
Python has been around since the days when operating systems did not have a concept of threads. Python was designed to be easy to use in order to make development quicker, and more and more developers started using it.
A lot of extensions were being written for existing C libraries whose features were needed in Python. To prevent inconsistent changes, these C extensions required thread-safe memory management, which the GIL provided.
The GIL is simple to implement and was easily added to Python. It provides a performance increase to single-threaded programs as only one lock needs to be managed.
C libraries that were not thread-safe became easier to integrate. And these C extensions became one of the reasons why Python was readily adopted by different communities.
As you can see, the GIL was a pragmatic solution to a difficult problem that the CPython developers faced early on in Python’s life.
The Impact on Multi-Threaded Python Programs
When you look at a typical Python program—or any computer program for that matter—there’s a difference between those that are CPU-bound in their performance and those that are I/O-bound.
CPU-bound programs are those that are pushing the CPU to its limit. This includes programs that do mathematical computations like matrix multiplications, searching, image processing, etc.
I/O-bound programs are the ones that spend time waiting for input/output, which can come from a user, a file, a database, a network connection, and so on. Such programs sometimes have to wait a significant amount of time before they get what they need, because the source may have to do its own processing first: a user thinking about what to enter into an input prompt, for example, or a database query running in its own process.
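For instance, here is a small sketch that uses time.sleep() to stand in for waiting on I/O. Because the GIL is released while a thread is blocked waiting (or sleeping), two one-second waits running in two threads finish in roughly one second rather than two:

# io_bound_sketch.py - sleep() stands in for waiting on I/O (illustrative only)
import time
from threading import Thread

def fake_io_task():
    time.sleep(1)   # while sleeping, the GIL is released and other threads can run

start = time.time()
t1 = Thread(target=fake_io_task)
t2 = Thread(target=fake_io_task)
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('Time taken in seconds -', end - start)  # roughly 1 second, not 2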
Let’s have a look at a simple CPU-bound program that performs a countdown:
# single_threaded.py
import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

start = time.time()
countdown(COUNT)
end = time.time()

print('Time taken in seconds -', end - start)
Running this code on my system with 4 cores gave the following output:
$ python single_threaded.py
Time taken in seconds - 6.20024037361145
Now I modified the code a bit to do the same countdown using two threads in parallel:
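# multi_threaded.py - a sketch of the two-thread version, splitting the same
# total count between two threads
import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

t1 = Thread(target=countdown, args=(COUNT // 2,))
t2 = Thread(target=countdown, args=(COUNT // 2,))

start = time.time()
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('Time taken in seconds -', end - start)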