Optimization using Celery: Asynchronous Processing in Django
Consider a situation where a task runs exceptionally slowly even after hours of optimization, whether that was database query optimization or changing how the application processes things. We often face these kinds of situations. If you are facing a similar problem, your code is most probably doing what it should, but it takes a long time because of the way it does it.
Some tasks are inherently time- or resource-consuming because they are either CPU-intensive or I/O-intensive. For example, if your code calls an API endpoint for thousands of records, processing them sequentially will take a long time, since each call can spend a noticeable amount of time between request and response. Sequential processing is not the right way to tackle this kind of problem.
We will discuss these key points in this blog:
Our Experience with Celery
Asynchronous Processing as Solution
Ways in Python
Why should you use Celery
If you don’t want to read about these points, you can skip this blog and go straight to Part 2.
Our Experience with Celery
Here at Technoarch, we ran into a situation where we had to parse a list of source code texts written in a DSL (Domain Specific Language), a programming language designed specifically for a feature in the project. Each source code parsing and execution takes around 5-7 seconds, and we have to run 10-12 of them at once, so executing them sequentially takes around 50-84 seconds, which is far too long for user-facing functionality. Just by switching to parallel processing, the waiting time dropped drastically to 6-7 seconds, which is a great gain in efficiency without changing the way we parse the source code.
Asynchronous Processing as Solution
By an asynchronous operation, I mean a process that operates independently of other processes, for example, sending a welcome email on a new signup. If we can run such processes asynchronously, efficiency increases a lot. There are 2 different terms related to asynchronous processing: Parallelism and Concurrency.
Parallelism: It simply means processing 2 or more independent tasks simultaneously.
Concurrency: It is a broader term which means running multiple tasks at once. The tasks may be executed simultaneously, or they may take turns in an order that does not matter. For example, if 2 tasks A and B are running concurrently, then B can be executed first, or A can be executed first, or A can run partially, be paused to let B run to completion, and then resume.
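To make the "run partially, pause, resume" idea concrete, here is a minimal sketch that interleaves two tasks on a single thread using generators. The task and run_concurrently names are made up for this illustration, and the round-robin order is just one of the many orders a concurrent system could choose.

```python
from collections import deque


def task(name, steps):
    """A task that pauses after every step by yielding (hypothetical helper)."""
    for i in range(1, steps + 1):
        yield f"{name}{i}"


def run_concurrently(tasks):
    """Round-robin scheduler: run one step of a task, then switch to the next."""
    pending = deque(tasks)
    log = []
    while pending:
        current = pending.popleft()
        try:
            log.append(next(current))  # resume the task until its next pause
        except StopIteration:
            continue  # task finished, do not reschedule it
        pending.append(current)  # paused task goes to the back of the line
    return log


order = run_concurrently([task("A", 3), task("B", 2)])
print(order)
```

Nothing here runs simultaneously, yet both tasks make progress together. That is concurrency without parallelism.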
Ways in Python
Python provides various ways to achieve asynchronous processing. So let’s discuss them briefly.
Multiple Processes -
The most obvious way is to spawn multiple processes. These processes do not share memory and can run on different cores of the CPU. This is best suited to CPU-intensive jobs rather than I/O-intensive jobs.
Python provides a standard library to create and manage processes: multiprocessing.
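As a rough sketch of what this looks like, the snippet below spreads a CPU-heavy function over a pool of processes with multiprocessing.Pool. The count_primes function is only a stand-in for real CPU-bound work, and the limits and pool size are arbitrary values picked for the example.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound stand-in: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=4):
    """Run count_primes for each limit in a separate worker process."""
    with Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing this module.
    print(count_primes_parallel([10_000, 20_000, 30_000, 40_000]))
```

Each call runs in its own process, so the work can be spread over several cores instead of queuing up behind one another.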
Multiple Threads -
Here, asynchronous tasks run in the same process but in different threads. Although the OS can schedule threads on different cores, Python’s GIL (Global Interpreter Lock) allows only one thread to execute Python code at a time, even on a machine with more than one core. Because of this, threads in Python are not parallel in the true sense but concurrent: they are not executed simultaneously but take turns. The threads are switched in time slices, and a thread that is waiting for user input or a response from a REST API endpoint gives up the GIL so that a thread which actually needs the CPU can run. This can increase efficiency drastically and makes threads best suited for I/O-intensive jobs, as no CPU resources are spent on tasks which are doing nothing but waiting for input. It has one downside: developers have no control over the execution, because when a thread is paused or resumed is decided for them and not by their code.
Python provides a standard library to create and manage threads: threading.
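A minimal sketch of the idea, assuming a fetch function that stands in for a slow network call (here it only sleeps, and its name and return value are invented for the example):

```python
import threading
import time


def fetch(item_id, results):
    """Stand-in for an I/O-bound call such as an HTTP request."""
    time.sleep(0.1)  # while waiting, this thread lets other threads run
    results[item_id] = f"response-{item_id}"


def fetch_all(item_ids):
    """Start one thread per item and wait for all of them to finish."""
    results = {}
    threads = [threading.Thread(target=fetch, args=(i, results)) for i in item_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


start = time.perf_counter()
responses = fetch_all(range(5))
elapsed = time.perf_counter() - start
print(responses)
print(f"elapsed: {elapsed:.2f}s")
```

Because the five waits overlap, the total time should be close to a single wait rather than five of them added together.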
Coroutines -
Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed. Think of them as threads, but with greater flexibility and control for developers. Previously, coroutines could be written in Python using yield. The asyncio package was introduced in Python 3.4 and gained a lot of hype because of its usefulness with I/O-intensive jobs. Since Python 3.5, the language provides the keyword async, which creates a coroutine, and the keyword await, with which a coroutine gives control away to other tasks that need resources for execution.
Python provides a standard library to create and manage coroutines: asyncio.
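Here is a small sketch of the same kind of I/O-bound work written with coroutines. The fetch coroutine is again a made-up stand-in that sleeps instead of calling a real endpoint.

```python
import asyncio


async def fetch(item_id):
    """Stand-in for an I/O-bound call; await hands control back to the event loop."""
    await asyncio.sleep(0.1)
    return f"response-{item_id}"


async def main():
    # gather() runs the coroutines concurrently and keeps the results in input order.
    return await asyncio.gather(*(fetch(i) for i in range(5)))


responses = asyncio.run(main())
print(responses)
```

Unlike threads, a coroutine can only be paused at an await, so the developer decides exactly where control may switch.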
Why should you use Celery
After getting to know all these ways, you must be thinking: why should we use anything else for asynchronous processing when we could simply pick whichever of the previously discussed methods, provided by Python itself, best suits our requirements? You should use Celery because it abstracts the way concurrency is achieved, so the only thing we need to know is how to use Celery instead of how to work with all of the packages discussed above. Now let’s discuss what Celery is. The following is a quote from the Celery Project website.
Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operations but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
In simple words, Celery receives messages from a message broker (Redis/RabbitMQ) and executes the queued tasks using a pool of concurrent workers. Celery lets us choose between different kinds of pools, and that choice decides what kind of concurrency it will achieve.
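The queue-and-workers idea can be sketched with nothing but the standard library. To be clear, this is a toy model and not how Celery is implemented or used: an in-memory queue.Queue plays the role of the broker, plain threads play the role of the worker pool, and names such as run_pool and square are invented for the illustration.

```python
import queue
import threading


def worker(task_queue, results, lock):
    """Keep pulling messages off the queue until a stop signal (None) arrives."""
    while True:
        message = task_queue.get()
        if message is None:
            break
        func, args = message
        outcome = func(*args)
        with lock:  # protect the shared results list
            results.append(outcome)


def run_pool(messages, concurrency=3):
    """Feed messages to a pool of `concurrency` workers and collect the results."""
    task_queue = queue.Queue()  # stands in for the message broker
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(task_queue, results, lock))
        for _ in range(concurrency)
    ]
    for w in workers:
        w.start()
    for message in messages:
        task_queue.put(message)
    for _ in workers:
        task_queue.put(None)  # one stop signal per worker
    for w in workers:
        w.join()
    return results


def square(n):
    return n * n


results = run_pool([(square, (n,)) for n in range(6)])
print(sorted(results))
```

The caller only puts messages on the queue and never decides which worker runs which task, which is the same separation Celery gives between the code that sends a task and the pool that executes it.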
There are mainly 2 kinds of pool worker:
Multiprocessing-based pool worker -
This will spawn a process for each worker in the pool. This achieves parallel processing of tasks and is hence suited for CPU-intensive jobs. The pool worker in Celery for multiprocessing is prefork. It is the default choice of pool worker in Celery.
The following is the command to start a pool of prefork workers (proj is a placeholder for your own Celery application module). You can also use the --concurrency or -c argument to set the number of concurrent workers in the pool. By default this number is set to the number of cores of the CPU in the system.
celery -A proj worker --pool=prefork --concurrency=4
Thread-based pool worker -
These types of workers don’t spawn a new process for each worker but a lightweight thread. This helps Celery avoid unnecessary consumption of CPU resources by these workers, and they are best suited for I/O-bound tasks. The commonly used thread-based pool workers are gevent and eventlet, both of which are built on greenlets.
The following is the command for starting a pool of workers with gevent (proj is again a placeholder, and the gevent package must be installed). As threads are less expensive than processes, you can have a very large number of these concurrent workers.
celery -A proj worker --pool=gevent --concurrency=500
Now you know about the different ways of processing other than the synchronous one, the tools available in Python to execute asynchronous code concurrently, and how Celery abstracts all of these complexities: you run concurrent tasks with different kinds of worker pools instead of programming directly with the multiprocessing, threading or asyncio packages. You can now read the next part of this blog to learn how to actually use Celery for asynchronous processing.