Drone-runner keeps receiving HTML when doing RPC calls and fails to run jobs

I’m doing a fresh install of Drone-1.6.2 on a docker host which is also running Gitea-1.10.0.

The Drone server is started like this:

docker run --restart=always --detach=true \
    --volume=/srv/drone:/data \
    --env=DRONE_GITEA_SERVER=http://my.internal.tld:3080/ \
    --env=DRONE_GIT_ALWAYS_AUTH=false \
    --env=DRONE_AGENTS_ENABLED=true \
    --env=DRONE_RPC_HOST=my.internal.tld:4080 \
    --env=DRONE_RPC_PROTO=http \
    --env=DRONE_SERVER_HOST=my.internal.tld:4080 \
    --env=DRONE_SERVER_PROTO=http \
    --env=DRONE_SERVER_PORT=:4080 \
    --env=DRONE_RPC_DUMP_HTTP=true \
    --env=DRONE_RPC_DUMP_HTTP_BODY=true \
    --env=DRONE_LOGS_TEXT=true \
    --env=DRONE_LOGS_PRETTY=true \
    --env=DRONE_LOGS_TRACE=true \
    --publish=4080:4080 \
    --name=drone drone/drone:1.6.2

…and I started a runner like this:

docker run --restart=always --detach=true \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    --env=DRONE_RPC_HOST=my.internal.tld:4080 \
    --env=DRONE_RPC_PROTO=http \
    --env=DRONE_UI_USERNAME=root \
    --env=DRONE_UI_PASSWORD=drone \
    --publish 3000:3000 \
    --name=drone-runner drone/drone-runner-docker:1.0.1

Drone picks up the repos just fine, and when I activate them the webhook is set up correctly. Drone-runner reports "successfully pinged the remote server".

When I push a commit, Drone picks it up but the runner does not build it. Instead it prints this in the logs:
level=error msg="cannot get stage details" error="invalid character '<' looking for beginning of value" stage.id=3 stage.name=default stage.number=1 thread=1

If I force a wrong RPC secret and restart the runner, it correctly says that it is not authorised. If I docker exec into the drone-runner container and apk add curl, I can run curl http://my.internal.tld:4080 and I see Drone responding with its HTML home page.

With DRONE_RPC_DUMP_HTTP and DRONE_RPC_DUMP_HTTP_BODY I can see the following calls from the runner logs:

POST /rpc/v2/stage/4?machine=docker HTTP/1.1
HTTP/1.1 200 OK
GET /rpc/v2/stage/4 HTTP/1.1
HTTP/1.1 500 Internal Server Error

Enabling traces and debug on the Drone server I get these in the logs:

DEBU[0031] webhook parsed                                commit=059baca4df90d22678f81a3877366ff1eac96ffc event=push name=prova namespace=luca
DEBU[0031] trigger: received                             commit=059baca4df90d22678f81a3877366ff1eac96ffc event=push ref=refs/heads/master repo=luca/prova
DEBU[0031]                                               fields.time="2019-11-25T16:09:25Z" latency=163.644565ms method=POST remote="" request="/hook?secret=<REDACTED>" request-id=<REDACTED>
DEBU[0031] manager: accept stage                         machine=docker stage-id=5
DEBU[0031] manager: stage accepted                       machine=docker stage-id=5
DEBU[0031] manager: fetching stage details               step-id=5
WARN[0032] manager: cannot generate netrc                error="invalid character '<' looking for beginning of value" repo.id=5 repo.name=luca/prova
DEBU[0032] manager: request queue item                   arch=amd64 kernel= kind=pipeline os=linux type=docker variant=
DEBU[0037] manager: context canceled                     arch=amd64 kernel= kind=pipeline os=linux type=docker variant=

I understand that something, somewhere, is receiving HTML where it expects a structured (presumably JSON) response and is then choking on the first unexpected < character, but I can’t work out how to debug and fix it. As far as I can tell, Drone is talking to Gitea correctly (it sees the repos, sets up the hooks, and receives them), and the runner is talking to the Drone server correctly when it starts.
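One rough way to narrow this kind of error down is to look at the first character of whatever each endpoint returns. The sketch below uses a hypothetical helper, classify_body, which is not part of Drone or Gitea; it only assumes that a body starting with < is HTML and one starting with { or [ is JSON.

```shell
# Hypothetical helper: guess whether a response body is HTML or JSON
# by looking at its first non-whitespace character.
classify_body() {
    body=$1
    # Remove leading whitespace (spaces, tabs, newlines).
    lead=${body%%[![:space:]]*}
    body=${body#"$lead"}
    case "$body" in
        '<'*)        echo html ;;
        '{'*|'['*)   echo json ;;
        *)           echo unknown ;;
    esac
}

# Literal sample bodies, so the sketch runs without network access.
classify_body '<!DOCTYPE html><html></html>'
classify_body '{"version":"1.10.0"}'
```

In practice the argument would be the output of curl -s against the URL under suspicion, run from inside the container that makes the failing request.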

Any help or any idea about debugging this would be really appreciated, thanks.

Have you taken a look at this thread, which discusses common root causes for runners not picking up builds?

Yes I did, sorry for not being explicit about it.

I do get the "successfully pinged the remote server" message (and a corresponding trace with POST /rpc/v2/ping and an HTTP 200 OK reply). According to the official docs, I take that to mean I have a successful connection between the runner and the server.

For the same reasons I ruled out networking problems and invalid endpoints (I’m getting HTTP 200 OK responses). I’m not using any proxy in between (I plan to, but I’m not adding anything else until I get it running in the first place).

I’m also confident I’ve not got a wrong RPC secret because if I put in a wrong one it does complain about that (contradicting the FAQ thread).

Not using protected mode, arm architectures, and so on.

P.S.

I also tried running drone-runner under Docker for Mac, still connecting to the same linux docker host running Drone server. I’m using the same command line as above and I’m getting exactly the same results as above.

Do you have a reverse proxy or load balancer routing traffic? One common issue we see is the reverse proxy or load balancer mutating the request. One example is a redirect from http to https, which in turn changes the request from a POST to a GET.

No, it’s all direct connections in the same LAN/subnet and there is no proxy or anything else.

After a lot of poking around I solved the problem.

I was using DRONE_GITEA_SERVER=http://my.internal.tld:3080/

It must be changed to avoid the trailing slash: DRONE_GITEA_SERVER=http://my.internal.tld:3080

I know I didn’t follow the example to the letter, but I would suggest making it very clear in the documentation that the trailing / must be omitted, and/or modifying the Drone code so that it avoids putting // in the URLs it requests from Gitea.
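To illustrate how the // appears, here is a minimal sketch. It assumes the base URL and the API path are joined by plain concatenation; I have not checked how Drone actually builds its requests. The helpers strip_trailing_slash and join_url are hypothetical, and /api/v1/version is only an example path.

```shell
# Hypothetical helper: drop every trailing slash from a base URL.
strip_trailing_slash() {
    url=$1
    while [ "${url%/}" != "$url" ]; do
        url=${url%/}
    done
    printf '%s\n' "$url"
}

# Hypothetical join that mimics naive base + path concatenation.
join_url() {
    printf '%s%s\n' "$1" "$2"
}

# With the trailing slash the joined URL contains "//" after the port.
join_url "http://my.internal.tld:3080/" "/api/v1/version"

# With the base normalised first, the URL is well formed.
join_url "$(strip_trailing_slash "http://my.internal.tld:3080/")" "/api/v1/version"
```

Normalising the value once, before passing it as --env=DRONE_GITEA_SERVER=..., sidesteps the problem regardless of how the server joins its URLs.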