Troubleshoot a DNS error on a GitLab runner utilizing a Kubernetes executor.

Posted by Jie Gao on November 29, 2023 · 7 mins read

Background and issues

I’m configuring GitLab Runners on a Kubernetes cluster where my self-hosted GitLab instance is located at https://gitlab.valid.fqdn.com. I’ve used Helm to install the runners on Kubernetes [1]. The installation command looks like this:

helm install --namespace gitlab-k8s-ci-test --generate-name gitlab/gitlab-runner

The runner now runs in the Kubernetes cluster, and the GitLab admin page shows it as registered. I then tried to run a basic pipeline on a test repository with the following GitLab CI YAML configuration:

variables:
  KUBE_CONFIG_PATH: ~/.kube/config

MergeSimilarCompile:
  image: 
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - echo "hello"
  stage: deploy
  tags:
  - test

stages:
- deploy

Although the setup appeared promising, I encountered the following error:

Running with gitlab-runner 16.6.1 (f5da3c5a)
  on gitlab-runner-1701218999-598996988-8t7k6 7eGb55yu, system ID: r_Hy3cLQpcxMJJ
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab-k8s-ci-test
Using Kubernetes executor with image bitnami/kubectl:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:25
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-7egb55yu-project-978-concurrent-0-i0w1q0nu to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Running on runner-7egb55yu-project-978-concurrent-0-i0w1q0nu via gitlab-runner-1701218999-598996988-8t7k6...
Getting source from Git repository
00:05
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/root/k8s-data/.git/
Created fresh repository.
fatal: unable to access 'https://gitlab.valid.fqdn.com/root/k8s-data.git/': Could not resolve host: gitlab.valid.fqdn.com
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: command terminated with exit code 1

Troubleshooting DNS issue

The job pod started by the GitLab Runner cannot resolve the hostname of the GitLab server.

As a first step, I looked up the DNS name from the Kubernetes master node.

$ host gitlab.valid.fqdn.com

gitlab.valid.fqdn.com has address 111.222.333.444

Since the lookup returns a valid address there, I moved on to the DNS debugging steps in the Kubernetes documentation [2]:

  1. Created a DNS utils pod.

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml

  2. Validated its running status.

kubectl get pods dnsutils

  3. Performed a DNS lookup.

kubectl exec -i -t dnsutils -- nslookup kubernetes.default

This successfully found kubernetes.default.

  4. Attempted GitLab host lookup.

kubectl exec -i -t dnsutils -- nslookup gitlab.valid.fqdn.com

This failed: the resolver reported that it could not find gitlab.valid.fqdn.com.home.apra, a name with an extra suffix appended.

  5. Checked the DNS configuration.

kubectl exec -ti dnsutils -- cat /etc/resolv.conf

This returns:

search default.svc.cluster.local svc.cluster.local cluster.local google.internal home.apra
nameserver 10.0.0.10
options ndots:5

Notably, the search list ends with home.apra, a suffix I did not expect and the same one that was appended to the GitLab hostname in the failed lookup. This points to a possible source of the problem.
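To make the three directives easier to inspect, here is a minimal Python sketch that parses the resolv.conf content shown above. The function name parse_resolv_conf and the dictionary layout are my own illustration, not part of any Kubernetes or GitLab tooling, and it only handles the directives that appear in this post.

```python
def parse_resolv_conf(text: str) -> dict:
    """Parse the search, nameserver and options lines of a resolv.conf."""
    conf = {"search": [], "nameservers": [], "options": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "search":
            conf["search"] = values  # a later search line replaces an earlier one
        elif keyword == "nameserver":
            conf["nameservers"].extend(values)
        elif keyword == "options":
            for option in values:
                # "ndots:5" becomes {"ndots": "5"}; flags without a value map to True.
                name, _, value = option.partition(":")
                conf["options"][name] = value or True
    return conf


# The exact content returned by the dnsutils pod in this post.
RESOLV_CONF = """\
search default.svc.cluster.local svc.cluster.local cluster.local google.internal home.apra
nameserver 10.0.0.10
options ndots:5
"""

conf = parse_resolv_conf(RESOLV_CONF)
print("search list:", conf["search"])
print("ndots:", conf["options"]["ndots"])
```

Running it shows the five search domains and the ndots value side by side, which are the two settings the rest of this post is about.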

Some findings

The last line options ndots:5 seems interesting. What does it mean?

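I searched online and found some discussions about ndots [3] [4]. With ndots:5, any name that contains fewer than five dots is first tried with each search domain appended, and only then as written. gitlab.valid.fqdn.com has three dots, so it goes through the search list first, which is where the home.apra lookup comes from. @thockin has mentioned that ndots:5 is the Kubernetes default and is unlikely to change in the future. It is now clear that the GitLab Runner job pods inherit this setting from k8s.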
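The effect of ndots is easier to see with the actual names involved. The following is a rough Python sketch of how a glibc-style resolver orders its lookups; candidate_names is a hypothetical helper written for this post, and other resolver implementations may order or skip candidates differently.

```python
def candidate_names(name: str, search: list[str], ndots: int) -> list[str]:
    """Return the names a glibc-style resolver would try, in order."""
    # A trailing dot marks the name as fully qualified: no search list is used.
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as written first, then the search list.
        return [name] + expanded
    # Too few dots: walk the search list first, the name as written comes last.
    return expanded + [name]


# Search list taken from the resolv.conf shown in this post.
SEARCH = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "google.internal",
    "home.apra",
]
HOST = "gitlab.valid.fqdn.com"  # three dots

for ndots in (5, 2):
    print(f"ndots:{ndots}")
    for candidate in candidate_names(HOST, SEARCH, ndots):
        print("  ", candidate)
```

With ndots set to 5 the hostname is only tried as written after all five suffixed names, including gitlab.valid.fqdn.com.home.apra. With ndots set to 2 it is tried as written first.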

Solutions

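The workaround is easy once the root cause has been located. One option is to add extra host entries for the GitLab server in Kubernetes. Another is to lower ndots to 2 for the pods created by the GitLab Runner, so that a name with at least two dots, such as gitlab.valid.fqdn.com, is tried as written first.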

  1. Create a file called values.yaml.

gitlabUrl: https://gitlab.valid.fqdn.com
runnerRegistrationToken: "token"
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        [runners.kubernetes.dns_config]
          [[runners.kubernetes.dns_config.options]]
            name = "ndots"
            value = "2"

  2. Install the GitLab Runner with this file.

helm install --namespace gitlab-k8s-ci-test --generate-name -f values.yaml gitlab/gitlab-runner

Let’s see how the pipeline goes this time.

Running with gitlab-runner 16.6.1 (f5da3c5a)
  on gitlab-runner-1701235149-84bccd9fc8-89lhj UbuxdMob, system ID: r_BCADAzlRTjJe
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab-k8s-ci-test
Using Kubernetes executor with image bitnami/kubectl:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:10
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Waiting for pod gitlab-k8s-ci-test/runner-ubuxdmob-project-978-concurrent-0-jwi6kvkw to be running, status is Pending
Waiting for pod gitlab-k8s-ci-test/runner-ubuxdmob-project-978-concurrent-0-jwi6kvkw to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-k8s-ci-test/runner-ubuxdmob-project-978-concurrent-0-jwi6kvkw to be running, status is Pending
	ContainersNotReady: "containers with unready status: [build helper]"
	ContainersNotReady: "containers with unready status: [build helper]"
Running on runner-ubuxdmob-project-978-concurrent-0-jwi6kvkw via gitlab-runner-1701235149-84bccd9fc8-89lhj...
Getting source from Git repository
00:00
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/root/k8s-data/.git/
Created fresh repository.
Checking out b60f5bf7 as detached HEAD (ref is main)...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:01
$ echo "hello"
hello
Cleaning up project directory and file based variables
00:00
Job succeeded

Oh, it’s working!!

References

[1]. https://docs.gitlab.com/runner/install/kubernetes.html

[2]. https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/

[3]. https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html

[4]. https://github.com/kubernetes/kubernetes/issues/33554#issuecomment-266251056