Runners must be terminated gracefully. If a runner is force-terminated while pipelines are running, those pipelines remain stuck in a running state.
The server does not track runner connectivity, for several reasons. For example, connections are not persistent: runners use long polling and frequently connect and disconnect to avoid TCP timeouts, which are common in many corporate networks. If you stop or restart the server while builds are running, or a runner loses connectivity with the server, the runner keeps running its pipelines and uploads the results, retrying with a backoff, once it re-establishes a connection. This decentralized design makes the system more resilient to outages and flaky networks, but the tradeoff is that you must not shut down a runner while it is running a pipeline.
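To make the retry behavior described above concrete, here is a minimal sketch of uploading results with an exponential backoff. It is not the runner's actual implementation: the function `upload_with_backoff`, its parameters, and the delay values are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Optional


def upload_with_backoff(
    upload: Callable[[], None],
    initial_delay: float = 1.0,
    max_delay: float = 60.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call upload() until it succeeds, doubling the wait after each failure.

    Hypothetical helper, not part of Drone. Returns the number of attempts
    made. If max_attempts is set and exhausted, the last ConnectionError
    is re-raised.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            upload()
            return attempt
        except ConnectionError:
            # Give up only if the caller set a limit on attempts.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
            # Double the wait, but never exceed the configured ceiling.
            delay = min(delay * 2, max_delay)
```

Capping the delay keeps a runner from waiting arbitrarily long after an extended outage, and leaving `max_attempts` unset models a runner that keeps retrying until the server is reachable again.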
The server scans for stuck jobs every 24 hours and terminates them. To scan more frequently, adjust the cleanup intervals and deadlines by passing the following environment variables to your Drone server: