Add the ability to run parallel tasks #200
base: master
Conversation
…sks and processing them in batches
…al of multiple locked jobs at once.
…se and DatabaseTaskResultTestCase tests
…atabaseBackendWorkerTestCase
…ple task results.
…concurrent task execution. Adds a --max-workers argument to define the maximum number of worker threads.
…o reflect changes in the worker's execution logic.
There are some things in this PR that need to be taken into account. With my proposed changes, signal.SIGINT cannot terminate a running task, because signals cannot terminate threads other than the main thread; as I have left it, the running task finishes and then the worker closes.
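A minimal sketch of that shutdown behaviour (illustrative names, not the PR's actual code): in CPython, signal handlers run only in the main thread, so the handler can only set a flag that worker threads check *between* tasks — the current task finishes, then the worker exits.

```python
import signal
import threading

# Sketch of cooperative shutdown, assuming a flag checked between tasks.
shutdown = threading.Event()


def handle_sigint(signum, frame):
    # Signal handlers run in the main thread only; we can merely
    # request a stop, not interrupt a task mid-execution.
    shutdown.set()


def worker_loop(run_one_task):
    # Finish the current task, then exit once shutdown is requested.
    while not shutdown.is_set():
        run_one_task()


def start_worker(run_one_task, num_threads=2):
    # Install the handler in the main thread, then spawn worker threads.
    signal.signal(signal.SIGINT, handle_sigint)
    threads = [
        threading.Thread(target=worker_loop, args=(run_one_task,))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    return threads
```

This matches the behaviour described above: SIGINT never kills a task outright; it only stops the loop from claiming the next one.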
```python
for thread in threads:
    thread.join()
```
Issue: I don't think this approach is ideal. If a worker process is set to run 5 threads, and receives 4 fast tasks and 1 long task, the worker will sit processing the long task and never pick up the extra 4 tasks it has capacity for.
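One way to avoid that starvation, sketched here with `concurrent.futures` (hypothetical, not the PR's code): let each slot refill as soon as its task completes, instead of joining a whole batch of threads at once.

```python
import concurrent.futures


def process_all(tasks, max_workers=5):
    """Run tasks on a pool where a finished slot is immediately reusable.

    Hypothetical sketch: `tasks` stands in for claimed jobs. Unlike
    joining a fixed batch of threads, one long task only occupies one
    slot while the remaining slots keep taking new work.
    """
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # as_completed yields each future as soon as it finishes,
        # so fast tasks are not held back by the slow one.
        for future in concurrent.futures.as_completed(futures):
            results.append(future.result())
    return results
```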
I'm not convinced this is necessarily a good idea. There are other systems which can be used above the worker process to handle running multiple workers, rather than adding that complexity to the worker process itself. Tools like supervisord, Kubernetes etc. have done the work on how to manage multiple processes properly; that complexity probably shouldn't live in the worker.
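As a concrete instance of the above, running multiple worker processes can be delegated to supervisord; a minimal sketch (the program name and command line are assumptions, not part of this repo):

```ini
[program:db_worker]
; Assumed invocation of the db_worker management command.
command=python manage.py db_worker
process_name=%(program_name)s_%(process_num)02d
; Run 5 separate worker processes instead of 5 threads in one worker.
numprocs=5
autorestart=true
```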
…cutor and update related configurations
…to use the valid_max_tasks validation function.
Hello @RealOrangeOne. According to the tests I've run, each worker command consumes approximately 190MB of RAM. If I have to launch 5 workers to execute simple tasks (database queries, some HTTP requests, etc.), that could consume close to 1GB. However, if threads are integrated into the command, the consumption of the 5 workers would remain around the original 190MB. I've modified the code so that no thread blocks another: if there's a large task, it doesn't block any smaller tasks. However, we've encountered two problems:
…ncy in the Worker thread configuration.
…ecutor for task execution
After using the changes I proposed in a work project, I realized that over time threads can become disabled, preventing code execution. Changing the threads to …
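A sketch of what an executor-based loop with defensive error handling can look like (hypothetical names; `claim_tasks` is an assumed callable, not this repo's API). One common cause of a pool that appears "disabled" is that an exception raised in a submitted callable is stored on its `Future` and silently lost if nobody calls `.result()`:

```python
import concurrent.futures
import logging

logger = logging.getLogger("db_worker")


def safe_run(task):
    # Catch errors inside the submitted callable: an uncaught exception
    # is captured by the Future and discarded if never inspected, which
    # can look like the pool silently stopped doing work.
    try:
        return task()
    except Exception:
        logger.exception("Task failed")
        return None


def worker_loop(claim_tasks, max_workers=5, max_iterations=None):
    # Hypothetical sketch: claim_tasks(n) is assumed to return up to n
    # locked tasks. The executor replaces manually created threads.
    iterations = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        while max_iterations is None or iterations < max_iterations:
            for task in claim_tasks(max_workers):
                pool.submit(safe_run, task)
            iterations += 1
    # Exiting the with-block waits for all submitted tasks to finish.
```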
This pull request adds support for concurrent task processing in the database-backed worker by introducing multi-threading. The worker can now claim and process multiple tasks in parallel, controlled by a new `--max-workers` option. The changes also update the core query and locking logic to support batch task retrieval, and adjust related tests to reflect the new behavior.

Concurrency and worker configuration:

- Added a `--max-workers` command-line option to the worker, allowing configuration of the maximum number of concurrent worker threads (default is 1, set by `MAX_WORKERS`) (`django_tasks/backends/database/management/commands/db_worker.py`, `django_tasks/base.py`). [1] [2] [3]
- … `max_workers` parameter (`django_tasks/backends/database/management/commands/db_worker.py`). [1] [2] [3] [4]

Task claiming and processing logic:

- The worker now claims and processes up to `max_workers` tasks concurrently using threads, instead of a single task at a time (`django_tasks/backends/database/management/commands/db_worker.py`). [1] [2]
- Updated the `get_locked` method in the queryset to return a batch of locked tasks (as a queryset slice) instead of a single result, supporting batch locking (`django_tasks/backends/database/models.py`).

Testing updates:

- … (`tests/tests/test_database_backend.py`). [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]

These changes collectively enable the worker to process multiple tasks in parallel, improving throughput and efficiency for database-backed task queues.
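To illustrate the batch-locking idea without a database, here is a pure-Python simulation of `FOR UPDATE SKIP LOCKED`-style claiming (the real change uses a Django queryset slice; this class and its names are hypothetical):

```python
import threading


class TaskStore:
    """Simulates claiming a batch of tasks while skipping locked ones."""

    def __init__(self, task_ids):
        self._lock = threading.Lock()
        self._pending = list(task_ids)
        self._locked = set()

    def get_locked(self, batch_size):
        # Atomically claim up to batch_size unlocked tasks, skipping
        # tasks already held by another worker — analogous to slicing a
        # queryset locked with SELECT ... FOR UPDATE SKIP LOCKED.
        with self._lock:
            claimed = [
                t for t in self._pending if t not in self._locked
            ][:batch_size]
            self._locked.update(claimed)
            return claimed
```

Two workers calling `get_locked` concurrently each receive disjoint batches, which is the property the batched `get_locked` change relies on.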