I’m curious about how load balancing of workers works in Drone and whether I can configure it in some way. How does Drone decide which worker to start a new job on?
The runners connect to a FIFO queue. There is no explicit load balancing. Runners are assigned work based on the order in which they connect. This ends up being somewhat randomized because runners perform long polling, and therefore connect / disconnect / reconnect every 30 seconds.
Does this mean that there’s only one work item per runner? Maybe I’m missing something obvious here. Let me give an example to see if I understand you correctly.
Let’s say we have two hosts, A and B, with jobs 1 and 2 running on A and B respectively.
Now job 3 comes in (jobs 1 and 2 are both still running). A has a load of 0.3 and B a load of 0.9. Does this mean that job 3 can still end up on B?
the number of work items per runner is configurable:
Correct, the queue is FIFO, which means runners pull jobs from the queue in the order in which they connect. Because runners connect and disconnect every 30 seconds (to avoid network timeouts commonly enforced by reverse proxies and load balancers), it is possible for one runner to pull multiple jobs from the queue while other runners pull none. Distribution of jobs is effectively random: it depends on which runner with an available slot connects first, not on optimal allocation or balancing.
TL;DR: let’s pretend you have 2 runners, each with capacity to execute 4 jobs. It is entirely possible to observe one runner executing 4 jobs while the other runner sits completely idle.
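To make that scenario concrete, here is a toy model of the pull-based behaviour described above. It is only a sketch, not Drone's actual code: the `distribute` function, the `poll_order` list and the runner and job names are all hypothetical, and a "poll" is reduced to "take the next job if a slot is free". Note that host load never appears anywhere in the logic.

```python
import random
from collections import deque


def distribute(jobs, capacities, poll_order):
    """Hand out queued jobs to runners in the order the runners poll.

    jobs:        job names in FIFO order
    capacities:  runner name -> maximum concurrent jobs (hypothetical values)
    poll_order:  runner names, one entry per poll against the queue
    """
    queue = deque(jobs)
    running = {name: [] for name in capacities}
    for runner in poll_order:
        if not queue:
            break  # nothing left to hand out
        # A runner only takes a job if it has a free slot; load is not considered.
        if len(running[runner]) < capacities[runner]:
            running[runner].append(queue.popleft())
    return running


capacities = {"A": 4, "B": 4}
jobs = ["job1", "job2", "job3", "job4"]

# Runner A happens to reach the queue first on every poll: it takes every job.
skewed = distribute(jobs, capacities, ["A", "A", "A", "A"])
print(skewed)

# A pseudo-random poll order gives some arbitrary split between A and B.
rng = random.Random(0)
order = [rng.choice(["A", "B"]) for _ in range(20)]
mixed = distribute(jobs, capacities, order)
print(mixed)
```

In the first call runner A ends up with all four jobs and B with none, which is the "one busy, one idle" outcome from the TL;DR. The second call just shows that the split depends on who polls when.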
thank you for a quick and very good answer!