gitlab.yml sample config

The GitLab Helm install is designed to work with cert-manager. If you are not using cert-manager at all, you can manually bind your certs to nginx: just create a secret with tls.crt and tls.key, where tls.crt must include the entire bundle from the root CA all the way down to your cert. Otherwise you may face weird errors such as: x509: certificate signed by unknown authority and error authorizing context: authorization token required.

To manually create the cert secrets, use:

kubectl create secret generic gitlab-registry-tls --from-file=tls.key=certs/tdlabwild.key --from-file=tls.crt=certs/tdlabwild.crt -n gitlab
kubectl create secret generic gitlab-gitlab-tls --from-file=tls.key=certs/tdlabwild.key --from-file=tls.crt=certs/tdlabwild.crt -n gitlab
kubectl create secret generic gitlab-minio-tls --from-file=tls.key=certs/tdlabwild.key --from-file=tls.crt=certs/tdlabwild.crt -n gitlab

nginx will automatically pick up and mount these secrets.

Note: minio, registry and unicorn also need to mount them, because they use the certs to validate each other through the domain names that are managed and hosted by nginx.
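A quick way to sanity-check that tls.crt really carries the full chain before creating the secrets, sketched here with a throwaway CA (all filenames are hypothetical; substitute your real CA bundle and wildcard cert):

```shell
set -e
# Throwaway root CA, purely for demonstration
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -days 1 -subj "/CN=demo-root-ca"
# Leaf key + CSR, then sign it with the throwaway CA
openssl req -newkey rsa:2048 -nodes -keyout tls.key -out leaf.csr \
  -subj "/CN=*.example.test"
openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out leaf.crt -days 1
# tls.crt must hold the whole chain: leaf first, then the CA bundle
cat leaf.crt ca.crt > tls.crt
# "tls.crt: OK" confirms the chain is complete
openssl verify -CAfile ca.crt tls.crt
```

If the verify step fails here, the registry and unicorn pods will hit the same x509 errors at runtime.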

Enable LDAP

production:
  ldap:
    enabled: true
    servers:
      main:
        label: 'LAB AD'
        host: '10.10.10.x'
        port: 389
        uid: 'sAMAccountName'
        bind_dn: 'CN=xxx,CN=Users,DC=xxx,DC=ca'
        password: 'xxxx'
        encryption: 'plain'
        active_directory: true
        base: 'OU=yyy,DC=tdlab,DC=ca'
        group_base: 'OU=CloudOps,DC=tdlab,DC=ca'
        admin_group: 'yyy'

Jenkins Integration

Install the Gitlab-plugin on Jenkins, and enable a webhook on GitLab, with or without a secret. Push events will trigger on any pushed commits, while merge request events will trigger on merge-related actions. If you choose both (push and merge selected on both the GitLab and Jenkins sides), it will trigger twice, since a successful merge also becomes a push action. The setup on the GitLab side only makes GitLab send a JSON request to the Jenkins API; the final determination is actually done by Jenkins. So, say you choose push and merge on the GitLab side; the following options on the Jenkins side will then have different results:

  1. Push events and Opened Merge Request Events: any new merge request will trigger, and any push will trigger.
  2. Push events and Accepted Merge Request Events: any accepted merge will trigger, and any push will trigger. So if you only want a single trigger, consider using only Accepted Merge Request Events.
  3. Filter branch forces a trigger only when changes land on that branch. Say you filter on master: a merge from dev => master will trigger, while a merge from master => dev will not.

Jenkins Pipeline Setup Example: Gitlab-Jenkins

In this example of integrating Jenkins with GitLab, Jenkins reports stage progress back to GitLab, where it can be viewed and analyzed further.

podTemplate(label: 'docker', containers: [
      containerTemplate(name: 'docker', image: "${env.EXECUTOR_IMAGE}", ttyEnabled: true, command: 'cat')],
      volumes: [hostPathVolume(hostPath: '/var/run/docker.sock', mountPath: '/var/run/docker.sock')])
    {
      node('docker') {
          stage('Clone repository') {
              /* Let's make sure we have the repository cloned to our workspace */
            container('docker'){
              checkout scm
            }
          }

          gitlabBuilds(builds: ["Build image", "Push image"]) {
            stage('Build image') {
              /* This builds the actual image; synonymous to
               * docker build on the command line */
              gitlabCommitStatus("Build image") {
                container('docker'){
                  sh 'docker build --no-cache -t $REGISTRY/$IMAGE_NAME:v$BUILD_NUMBER .'
                }
              }
            }

            stage('Push image') {
              /* Finally, we'll push the image with two tags:
               * First, the incremental build number from Jenkins
               * Second, the 'latest' tag.
               * Pushing multiple tags is cheap, as all the layers are reused. */
              gitlabCommitStatus("Push image") {
                container('docker'){
                  docker.withRegistry("${env.REGISTRY_URL}", 'gitlab-registry') {
                    echo "${env.BUILD_NUMBER}"
                    sh 'docker push $REGISTRY/$IMAGE_NAME:v$BUILD_NUMBER'
                  }
                }
              }
            }
          }
       }
    }

Internal Variables

A few useful built-in variables are listed here:

CI_COMMIT_REF_NAME: The branch or tag name for which the project is built.
CI_COMMIT_TAG: The commit tag name. Present only when building tags.
CI_JOB_ID: The unique id of the current job that GitLab CI uses internally.
CI_JOB_TOKEN: Token used for authenticating with the GitLab Container Registry. Also used to authenticate with multi-project pipelines when triggers are involved.
CI_REPOSITORY_URL: The URL to clone the Git repository.
CI_REGISTRY: If the Container Registry is enabled, returns the address of GitLab's Container Registry.
CI_REGISTRY_IMAGE: If the Container Registry is enabled for the project, returns the address of the registry tied to the specific project.
CI_REGISTRY_PASSWORD: The password used to push containers to the GitLab Container Registry.
CI_REGISTRY_USER: The username used to push containers to the GitLab Container Registry.
GITLAB_USER_LOGIN: The login username of the user who started the job.
GITLAB_USER_NAME: The real name of the user who started the job.

Sample CI Script

Sample code for building a docker image tagged by version and job id:

variables:
  CI_VERSION: "1.0.${CI_JOB_ID}"

build-master:
  stage: build
  script:
    - docker build --pull -t "$CI_REGISTRY_IMAGE" -t "$CI_REGISTRY_IMAGE:$CI_VERSION"   ./postfix
    - docker push "$CI_REGISTRY_IMAGE"
  only:
    - master

Sample code for building a docker image tagged by a manual version tag:

build-master:
  stage: build
  script:
    - docker build --pull -t $CI_REGISTRY/releases/application:$CI_COMMIT_TAG .
    - docker push $CI_REGISTRY/releases/application:$CI_COMMIT_TAG
  only:
    - /^my-release-.*$/

Sample code for building a docker image tagged by branch name. Note: the following code sets $CI_COMMIT_REF_NAME=dev the first time TEST_IMAGE is built, but when changes happen (push or merge) on master, the pull will use $CI_COMMIT_REF_NAME=master, which leads to failure. $CI_COMMIT_REF_NAME always equals the branch the job is currently running on.

image: docker:stable
services:
- docker:dind

stages:
- build
- test
- release

variables:
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_DRIVER: overlay2
  CONTAINER_TEST_IMAGE: registry.tdlab.ca/tdlab/pysite/my-image:$CI_COMMIT_REF_NAME
  CONTAINER_RELEASE_IMAGE: registry.tdlab.ca/tdlab/pysite/my-image:latest

before_script:
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN registry.tdlab.ca
#registry.tdlab.ca can be replaced by $CI_REGISTRY for built-in gitlab registry

build:
  stage: build
  script:
    - docker build --pull -t $CONTAINER_TEST_IMAGE .
    - docker push $CONTAINER_TEST_IMAGE
  only:
    - dev

test1:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE python /app/network_test.py
  only:
    - dev

release-image:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE
  only:
    - master

Sample code for building the image on a different registry, with tags:

image: docker:stable
services:
- docker:dind

stages:
- build
- test
- release

variables:
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_DRIVER: overlay2
  CONTAINER_TEST_IMAGE: registry.tdlab.ca/tdlab/pysite/my-image:dev
  CONTAINER_RELEASE_IMAGE: dockerlab.tdlab.ca:5001/pysite:$CI_COMMIT_TAG

before_script:
  - docker login -u $CI_REGISTRY_USER -p $CI_JOB_TOKEN $CI_REGISTRY

build:
  stage: build
  script:
    - docker build --pull -t $CONTAINER_TEST_IMAGE .
    - docker push $CONTAINER_TEST_IMAGE
  only:
    - dev

test1:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE python /app/network_test.py
  only:
    - dev

release-image:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker login -u admin -p $DOCKERLAB_TOKEN dockerlab.tdlab.ca:5001
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE
  only:
    - /^v.*$/

Sample ENV output in the build environment:

CI_SERVER_REVISION=3007b0e3
GITLAB_USER_LOGIN=xxx
CI_BUILD_ID=70
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.233.0.1:443
CI_BUILD_REF=5c908c0220d8d34e1df8e05303252fdd57728138
CI=true
CI_PROJECT_NAME=pysite
HOSTNAME=runner-49d47ada-project-4-concurrent-0gxpj4
CI_JOB_STAGE=build
CI_COMMIT_DESCRIPTION=
TILLER_DEPLOY_SERVICE_HOST=10.233.54.63
CI_SERVER_VERSION=11.2.0-pre
SHLVL=2
OLDPWD=/
HOME=/root
CI_COMMIT_REF_NAME=dev
CI_JOB_ID=70
CI_PIPELINE_SOURCE=push
CI_REGISTRY_PASSWORD=xxxxxxxxxxxxxxxxxxxx
CI_BUILD_TOKEN=xxxxxxxxxxxxxxxxxxxx
GITLAB_FEATURES=
CI_REGISTRY_IMAGE=registry.tdlab.ca/tdlab/pysite
CI_PROJECT_ID=4
TILLER_DEPLOY_PORT=tcp://10.233.54.63:44134
TILLER_DEPLOY_SERVICE_PORT=44134
GITLAB_CI=true
CI_COMMIT_SHA=5c908c0220d8d34e1df8e05303252fdd57728138
TILLER_DEPLOY_PORT_44134_TCP_ADDR=10.233.54.63
CI_REGISTRY_USER=gitlab-ci-token
CI_PROJECT_PATH=tdlab/pysite
CI_PROJECT_DIR=/tdlab/pysite
DOCKER_DRIVER=overlay2
CI_PROJECT_NAMESPACE=tdlab
TILLER_DEPLOY_PORT_44134_TCP_PORT=44134
CI_JOB_TOKEN=xxxxxxxxxxxxxxxxxxxx
CI_SERVER_NAME=GitLab
TILLER_DEPLOY_PORT_44134_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.233.0.1
CI_PIPELINE_URL=https://gitlab.tdlab.ca/tdlab/pysite/pipelines/22
TILLER_DEPLOY_SERVICE_PORT_TILLER=44134
CI_RUNNER_DESCRIPTION=
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
CI_PROJECT_VISIBILITY=public
CI_BUILD_REF_SLUG=dev
CI_COMMIT_TITLE=check env
[email protected]
KUBERNETES_PORT_443_TCP_PORT=443
DOCKER_CHANNEL=stable
CI_SERVER=yes
KUBERNETES_PORT_443_TCP_PROTO=tcp
CI_BUILD_BEFORE_SHA=4a0e6b94f84cfac68dea76985df90a8563701ec1
TILLER_DEPLOY_PORT_44134_TCP=tcp://10.233.54.63:44134
CI_REPOSITORY_URL=https://gitlab-ci-token:[email protected]/tdlab/pysite.git
CI_RUNNER_ID=2
DOCKER_VERSION=18.03.1-ce
CI_REGISTRY=registry.tdlab.ca
GITLAB_USER_NAME=xxx yyy
CI_PIPELINE_IID=19
DOCKER_HOST=tcp://localhost:2375
CI_JOB_URL=https://gitlab.tdlab.ca/tdlab/pysite/-/jobs/70
CI_BUILD_NAME=build
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.233.0.1:443
CI_COMMIT_REF_SLUG=dev
CI_DISPOSABLE_ENVIRONMENT=true
KUBERNETES_SERVICE_HOST=10.233.0.1
PWD=/tdlab/pysite
CI_RUNNER_TAGS=cluster, kubernetes
CI_BUILD_REF_NAME=dev
CI_PROJECT_URL=https://gitlab.tdlab.ca/tdlab/pysite
CI_CONFIG_PATH=.gitlab-ci.yml
CI_COMMIT_BEFORE_SHA=4a0e6b94f84cfac68dea76985df90a8563701ec1
CI_SERVER_TLS_CA_FILE=/tdlab/pysite.tmp/CI_SERVER_TLS_CA_FILE
CI_PROJECT_PATH_SLUG=tdlab-pysite
CI_PIPELINE_ID=22
CONTAINER_RELEASE_IMAGE=registry.tdlab.ca/tdlab/pysite/my-image:latest
CI_BUILD_STAGE=build
CI_COMMIT_MESSAGE=check env
GITLAB_USER_ID=2
CI_JOB_NAME=build
CONTAINER_TEST_IMAGE=registry.tdlab.ca/tdlab/pysite/my-image:dev

Gitlab CI Runner

The GitLab CI runner uses a registration token to register with GitLab, then talks to k8s to create a pod for each stage to use. A runner can be globally available to all groups and individual repos in a GitLab environment; this is configured under Admin->Overview->Runner. It can also be shared at group scope, configured under Groups->Settings->CI/CD, or dedicated to a single repo, configured under Project->Settings->CI/CD. Once a runner is configured and added, it can only be removed from Admin->Overview->Runner.

If you see the error Warning: failed to get default registry endpoint from daemon (Cannot connect to the Docker daemon at tcp://localhost:2375. Is the docker daemon running?). Using system default: https://index.docker.io/v1/ when the runner runs dind for building docker images, it means the runner is not running in privileged mode. Simply enable it in the runner helm chart, or set KUBERNETES_PRIVILEGED=true in the runner's deployment.
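With the official gitlab-runner helm chart this can be a single value in values.yaml (a sketch; chart versions differ, and newer chart versions set privileged inside runners.config instead):

```yaml
runners:
  privileged: true
```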

How to get raw file from gitlab

Simply find your project ID at the top of the project page and use it against your GitLab endpoint like the following in a browser:

https://gitlab.test.ca/api/v4/projects/14/repository/files/deployment.yml/raw?ref=master

If calling via curl, you need to percent-encode the file path:

curl --request GET  --header 'PRIVATE-TOKEN: xxxxxxxxx' https://gitlab.test.ca/api/v4/projects/14/repository/files/deployment%2Eyml/raw?ref=master

The PRIVATE-TOKEN header is only needed if it's a private repo.
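The whole file path must be URL-encoded, so for nested paths the / has to become %2F as well. One way to produce the encoded path from a shell, assuming python3 is available (the path below is a made-up example):

```shell
path="manifests/deployment.yml"
# quote() with safe="" also encodes "/", which the GitLab files API requires
encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$path")
echo "$encoded"   # prints: manifests%2Fdeployment.yml
```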

Make git update at the end of pipeline

A very useful practice in IaC is to upload/update your computed results back into the same git repo, since we use GitLab as the infra's script database. To do this, we need to define a user and an SSH private key.

Update Gitlab Repo:
  stage: update
  tags:
    - maas
    - creekbank
    - labrat
  script:
    - echo "StrictHostKeyChecking no" > /etc/ssh/ssh_config
    - mkdir ~/.ssh
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    - chmod -v 700 $(pwd)
    - git config --global user.email "$GITLAB_USER_EMAIL"
    - git config --global user.name "$GITLAB_USER_NAME"
    - git add -A
    - git commit -m "$GITLAB_USER_NAME $ACTION $(date)"
    - git push [email protected]:$CI_PROJECT_PATH.git HEAD:$CI_COMMIT_REF_NAME
  only:
    - schedules

Here echo "$SSH_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa makes sure there are no stray \r (carriage return) characters on each line of the copy of your SSH key stored in GitLab variables.
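The effect is easy to verify locally: a key pasted from Windows carries \r\n line endings, and tr -d '\r' strips the carriage returns while leaving the key content intact (the key lines below are obviously fake):

```shell
# Simulate a key pasted with Windows line endings, then strip the \r characters
printf 'FAKE-KEY-LINE-1\r\nFAKE-KEY-LINE-2\r\n' | tr -d '\r' > cleaned.txt
# od -c shows only \n remains at the end of each line
od -c cleaned.txt
```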

Remap container entrypoint

A lot of the time a container image already has an entrypoint set, and any further command you pass will start from that point. E.g., a terraform container may have the entrypoint terraform, so it accepts a cmd like apply but gives an error if you pass terraform apply. To force the container to ignore its original entrypoint and reuse a plain shell PATH:

  image:
    name: alpine/git
    entrypoint:
      - '/usr/bin/env'
      - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

Always return exit 0

Sometimes you may prefer to ignore errors inside a job. E.g., when using terraform apply to create resources on OpenStack, the apply action may fail while the actual state and resources are already partially created on OpenStack, and the original terraform apply's exit 1 will fail the job, so further actions in this stage (such as caching terraform state files) will not run. This creates drift between the actual OpenStack state and Terraform's database.

A workaround is to simply add logic in the CI file to force the exit status from 1 to 0.

  script:
    - terraform apply -auto-approve || if [ $? -eq 1 ]; then echo "success"; fi
    - terraform output > terraform-output.txt || if [ $? -eq 1 ]; then echo "success"; fi

This is based on a GitLab runner ticket: Accept specific exit code (https://gitlab.com/gitlab-org/gitlab-runner/issues/3013).

# Normal run
/ # ls /asd
ls: /asd: No such file or directory
/ # echo $?
1

# Catch exit code 1
/ # ls /asd || if [ $? -eq 1 ]; then echo "success"; else /bin/false; fi
ls: /asd: No such file or directory
success
/ # echo $?
0

# Catching exit code 2 yields the following
/ # ls /asd || if [ $? -eq 2 ]; then echo "success"; else /bin/false; fi
ls: /asd: No such file or directory
/ # echo $?
1
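The same pattern in a directly runnable form, substituting false (which always exits 1) for the real command:

```shell
# Catch exit code 1 and turn it into success; any other code falls through to /bin/false
false || if [ $? -eq 1 ]; then echo "success"; else /bin/false; fi
echo "exit: $?"   # prints: exit: 0
```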

Always run git push at the end of a pipeline

If you want a git update every time there are changes to your code inside a pipeline, you can do it by injecting a dedicated stage. Note that GitLab itself does not support pushing from a runner using the runner user.

Update Gitlab Repo:
  stage: update
  image: 
    name: alpine/git
    entrypoint:
      - '/usr/bin/env'
      - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
  script:
    - chmod -v 700 $(pwd)
    - git config --global user.email "$GITLAB_USER_EMAIL"
    - git config --global user.name "$GITLAB_USER_NAME"
    - git add -A
    - git commit -m "$GITLAB_USER_NAME $ACTION $(date)"
    - git push [email protected]:$CI_PROJECT_PATH.git HEAD:$CI_COMMIT_REF_NAME
  when: always
  only:
    refs:
      - master

Difference between Cache and Artifacts

Cache must be defined at the top of the CI yaml, before any stages, and it will be shared with all stages in a pipeline, even across the entire repo's pipelines:

image: 
  name: alpine/git
  entrypoint:
    - '/usr/bin/env'
    - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
cache:
  key: ${CI_COMMIT_REF_SLUG}    #enable per-branch cache
  paths:
    - ./*.yaml
    - ./root-ssh.sh

stages:
- create_inventory
- update
- deploy

Cache is stored in the place the GitLab config defines, and every stage loads it during boot-up. It cannot be accessed from outside of GitLab stages.

Running with gitlab-runner 12.5.0 (577f813d)
on maas deployment runner wNo_tdSQ
Using Docker executor with image python:3.7.5-slim ...
Pulling docker image python:3.7.5-slim ...
Using docker image sha256:9f4008bf3f119728447a7112ff04e016d8eb756158525ffec07c7f2e4e80cf90 for python:3.7.5-slim ...
Running on runner-wNo_tdSQ-project-77-concurrent-0 via cbjump...
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/tdlab/maas/.git/
Created fresh repository.
From https://gitlab.tdlab.ca/tdlab/maas
 * [new ref]         refs/pipelines/1166 -> refs/pipelines/1166
 * [new branch]      master              -> origin/master
Checking out 9d352881 as master...
Skipping Git submodules setup
Checking cache for master-26...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache

Artifacts are different; they have to be defined under each stage:

Initialize Terraform:
  stage: initialize
  tags:
    - OpenIAASMAAS
    - Caniff
  script:
    - sh init.sh
    - terraform init
  artifacts:
    name: "$CI_JOB_NAME"
    paths:
      - .terraform/
      - variables.tf
      - provider.tf
    expire_in: 1 sec

It uploads the specified files to GitLab; they are shared with other stages and can later be downloaded from GitLab from anywhere.

Uploading artifacts...
.terraform/: found 5 matching files
variables.tf: found 1 matching files
provider.tf: found 1 matching files
Uploading artifacts to coordinator... ok            id=776 responseStatus=201 Created token=YRaxP11R
Job succeeded
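A later stage can consume exactly these artifacts by naming the producing job under dependencies (a sketch; the Apply Terraform job name is hypothetical):

```yaml
Apply Terraform:
  stage: apply
  dependencies:
    - Initialize Terraform
  script:
    - terraform apply -auto-approve
```

By default a job downloads artifacts from all jobs in earlier stages; dependencies narrows that to the listed jobs.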

How to make a stage continue only if a manual stage approves

A lot of the time we need manual stages to decide whether further actions should continue, e.g. in a scenario like "only when this passes do we push to production", where the pass is a manual human check on the result of a stage.

when: manual defines a stage as manual, and manual stages default to allow_failure: true, which means the pipeline does not actually wait for the manual stage: later stages run regardless of its outcome. To make other stages follow this manual stage's decision, set allow_failure: false on it, and then use when: on_success on the next stage so it only runs once the manual stage has succeeded.

Destroy Project Resources:
  stage: destroy 
  allow_failure: false
  tags:
    - Openstack
    - Caniff
  script:
    - terraform destroy -auto-approve || if [ $? -eq 1 ]; then echo "success"; fi
  when: manual
  only:
    variables:
      - $ACTION == "destroy"

Update Gitlab Repo:
  stage: update
  tags:
    - Openstack
    - Caniff
  image: 
    name: alpine/git
    entrypoint:
      - '/usr/bin/env'
      - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
  script:
    - chmod -v 700 $(pwd)
  when: on_success
  only:
    variables:
      - $ACTION == "deploy"
      - $ACTION == "destroy"

Database Restore Causes 500 Error on all CI pages

The built-in backup/restore script backs up all repo-related values but does not cover sensitive data such as passwords and variables, which are encrypted with the old instance's secrets; this is the root cause of the 500 error on all CI-related pages.
One workaround is to reset all related data in postgres.

  1. Docker login to the postgres container as root.
  2. psql login with the postgres admin account.
  3. Set all token-related values to null; this fixes the runner page errors:
    -- Clear project tokens
    UPDATE projects SET runners_token = null, runners_token_encrypted = null;
    -- Clear group tokens
    UPDATE namespaces SET runners_token = null, runners_token_encrypted = null;
    -- Clear instance tokens
    UPDATE application_settings SET runners_registration_token_encrypted = null;
    -- Clear runner tokens
    UPDATE ci_runners SET token = null, token_encrypted = null;

  4. Reset all group/project variables, as they are encrypted with the old instance's secrets:
    UPDATE ci_variables SET encrypted_value = null, encrypted_value_salt = null, encrypted_value_iv = null;
    UPDATE ci_group_variables SET encrypted_value = null, encrypted_value_salt = null, encrypted_value_iv = null;

  5. Reset all k8s vars:
    UPDATE cluster_platforms_kubernetes SET ca_cert=null, encrypted_token=null, encrypted_token_iv=null;
  6. Re-register all runners.