I have received reports of builds getting stuck and I wanted to create a dedicated thread to track the issue. This is impacting a subset of Drone users, typically running older versions of Docker. This is not something I’ve been able to personally reproduce.
Please note that this is for builds stuck in running status. If your builds are stuck in pending status (with the clock) please do not use this thread.
The issue seems to be related to logs freezing either while streaming or uploading. We know this because we can follow the agent logs and pinpoint exactly where the agent appears to be blocking. We see the agent blocking after the first log line below:
logger.Debug().Msg("uploading logs")
uploads.Wait()
logger.Debug().Msg("uploading logs complete")
There is a known issue with older Docker versions (17.03 and below) that can cause the docker logs command to lock, which in turn causes Drone to lock. So if you are experiencing this issue, upgrading Docker should be your first step.
See the following Docker issue: https://github.com/moby/moby/issues/30135
It has been suggested that very large log outputs could also cause the system to lock. This is a theory I plan to test prior to an 0.8-final release. This should be tested using Docker 17.06 or higher to avoid confusion with the Docker issue described above.
How to Help
This has not yet been confirmed to be a Drone issue because I have been unable to reproduce it. If you upgrade Docker to 17.06 and continue to experience issues, we need your help. We need someone who can reproduce the problem to step through the code, debug, and pinpoint a root cause.
PLEASE NOTE: to keep this thread concise and focused, we will delete all +1 or me too comments, including comments that describe in detail how you can reproduce this issue but do not provide new or useful information to advance this discussion.