Steps failing in Kubernetes Native


#21

1.10.7-gke.11 did not work


#22

Tobias mentions this is fixed in more recent versions of Docker. He tested with 18.09 and I tested locally with 18.06.1. What version of Docker is 1.10.7-gke.1 running? I think at this point it is helpful to narrow down the minimum version of Docker that is supported.


#23

Yeah, correct. I just upgraded again to v1.11.5-gke.5, which is docker://17.3.2. No result. I'm going to attempt to change the node image type from Container-Optimised to Ubuntu … a long shot


#24

Ok, my repo now clones! Using GKE v1.11.5-gke.5 with containerd://1.1.5 rather than docker. There are other pipeline issues, but I think that solves this one, since it's using containerd instead of docker?

That's the node runtime Container-Optimised OS with Containerd (cos_containerd) (beta). It's not really a long-term solution, as containerd introduces a host of other issues for most builds.


#25

Can confirm this issue with Amazon EKS as well:

[root@ip-192-168-101-190 ~]# docker -v
Docker version 17.06.2-ce, build 3dfb8343b139d6342acfd9975d7f1068b5b1c3d3

I was able to work around it by disabling clone and adding a custom step:

workspace:
  base: /go
  path: /

clone:
  disable: true

steps:
  - name: docker-workspace-fix
    image: golang:1.9.1
    commands:
      - mkdir -p /go/src/github.com/terraform-providers
      - cd /go/src/github.com/terraform-providers
      - git clone $DRONE_REPO_LINK
      - cd terraform-provider-spotinst
      - git checkout $DRONE_COMMIT

I then do a cd to the actual working directory in subsequent steps.
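To illustrate, a follow-up step under this workaround might look like the following; the step name, image, and `go test` command are illustrative, not part of the original pipeline:

```yaml
steps:
  # ... docker-workspace-fix step from above ...
  - name: test
    image: golang:1.9.1
    commands:
      # cd into the cloned repository before running anything,
      # since the default working directory no longer matches
      - cd /go/src/github.com/terraform-providers/terraform-provider-spotinst
      - go test ./...
```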

The downside is local exec is now failing:

$ drone exec
2019/01/01 22:29:26 Error response from daemon: Duplicate mount point: /go

#26

Good catch. The reason is that we create two mounts for drone exec. We create a data volume at the base path (e.g. /drone) and a host mount at the base + path (e.g. /drone/src). If the workspace.base and workspace.base + workspace.path have the same absolute path, we should avoid the data volume mount. I created an issue to track here.
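To make the collision concrete, here is how the workspace from the workaround above resolves to the same mount point twice (annotations are mine):

```yaml
# workspace.base = /go  -> data volume mounted at /go
# base + path   = /go/  -> host mount, same absolute path /go
# result: "Error response from daemon: Duplicate mount point: /go"
workspace:
  base: /go
  path: /
```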

I should have a fix for rc.4 once I have some time to do more testing.


#27

As mentioned above I have a fix planned for rc.4 but in the meantime there has been some luck upgrading the docker version. This from @brandom in our gitter channel -

Regarding this error: container_linux.go:247: starting container process caused "chdir to cwd (\"/drone/src\") set in config.json failed: no such file or directory" - I was already on kubernetes v1.12.3 and noticed that newer versions of docker were finally officially supported. I upgraded from 17.03 to 18.06 and this issue was resolved. I’m not sure why, but thought I would let you know.


#28

Really nice; unfortunately, manual Docker upgrades on a managed Kubernetes cluster require too much manual intervention for us currently.

Really looking forward to rc.4 :+1: , do you already have a vague eta?


#29

I’m hoping to push something end of next week. It will take me a few days to ramp up development after such a long holiday break :slight_smile:

In the meantime the best option is to use the workaround presented by Kevin. It is a pretty good option to unblock while you are waiting for a fix, as long as you do not need drone exec for local testing.


#30

Very much looking forward to rc-4 :slight_smile:


#31

Just ran into this on GKE and was glad to see so much info on the problem already. Great detective work everyone!

I’ll be awaiting rc-4 as well :slight_smile:. In the meantime, Tobias’s workaround above works well enough for now :+1:.


#32

FWIW, I didn’t have this issue with CRI-O. It seems pretty clear that attributing this to certain versions of Docker was/is correct. Good find!

Is RC4’s release still planned for this week? Is there a ticket in Drone’s Github that we can use to track this?


#33

Happy Friday everyone!

I just wanted to thank everyone for their help troubleshooting these issues. I just published 1.0.0-rc.4 which includes a fix for this issue. I performed a decent amount of testing to ensure it did not cause any unexpected regressions, and can also confirm it works with drone exec (thanks @kmcgrath for pointing this out).

The only caveat to this fix is that you should not define a custom workspace path with Kubernetes. You can define a custom base, but if you define a custom subpath you will experience this same error.

This is good:

workspace:
  base: /go/src/github.com/octocat/hello-world

This is bad:

workspace:
  base: /go
  path: /src/github.com/octocat/hello-world

I do not foresee this (above) being an issue in the long term. If you are interested in reading more, see the details below; but unless you are a Go developer with no plans to adopt Go modules, this won’t impact you.

Planned changes to the workspace.

The concept of base and path is a legacy concept. It was created for Go projects so the GOPATH could be preserved in between build steps. The base was mounted as a volume, and the path was the working directory. This allowed mounting /go and preserving /go/pkg and /go/bin across pipeline steps. Anyway … with the ability to mount empty directory volumes and the advent of Go modules the workspace will probably be deprecated later in 2019.
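As an example of why the workspace becomes unnecessary, a modules-enabled Go project can build from the default workspace with no `workspace` block at all. This is a sketch, assuming a project that has opted into Go modules:

```yaml
# With Go modules enabled, the code can live anywhere on disk,
# so no GOPATH-shaped workspace is needed.
kind: pipeline
name: default

steps:
  - name: test
    image: golang:1.11
    environment:
      GO111MODULE: on
    commands:
      - go test ./...
```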

Version 1.0.0-rc.4 also includes some additional Kubernetes improvements (thanks to paulreimer):

  1. Fixed an issue with cloning private gcr images
  2. Fixed an issue when cloning private images, missing imagePullSecrets in spec
  3. Enabled nodeAffinity so that you can send Pipelines to specific nodes.

The Kubernetes runtime is coming along nicely, however, it still lacks feature parity with the Docker runtime. Here are some things that I plan to address over the coming weeks:

  1. Global secrets not enabled
  2. Global registry credentials not enabled
  3. Global volumes not enabled
  4. Global resource limits not enabled
  5. Non-x86_64 Pipelines are not supported
  6. Other features I have not considered? Probably.

I will close this topic once I have broader confirmation the fix is working. If you experience any further issues unrelated to this thread, please start a new thread. Thanks everyone, and have a good weekend!


#34

Works for me on GKE. Thanks for the update and happy weekend Brad!


#35

Builds are running 100% fine now on GKE node version - 1.11.5-gke.5. Thank you for your hard work on this!!!