Enroll Worker Nodes


Enrolling a worker node into our OpenStack requires the following steps:

  1. Installing k3s on the worker node

  2. Importing the worker node

Installing k3s on the worker node

On the controller, get mynodetoken by running:

$ sudo cat /var/lib/rancher/k3s/server/node-token

On the controller, get server by reading the EXTERNAL-IP of the control-plane node:

$ kubectl get nodes -o wide
NAME       STATUS   ROLES                  AGE    VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
edge-vm1   Ready    control-plane,master   148m   v1.22.5+k3s1   10.10.2.31    10.0.87.20    Ubuntu 20.04.5 LTS   5.4.0-128-generic   containerd://1.5.8-k3s1
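
In the commands below, server is this EXTERNAL-IP and mynodetoken is the token you just read. One way to avoid pasting them inline is to export them on the worker node first (the values here are placeholders):

$ export K3S_URL='https://10.0.87.20:6443'    # EXTERNAL-IP of the controller, port 6443
$ export K3S_TOKEN='<contents of node-token>'

The k3s installer reads K3S_URL and K3S_TOKEN from the environment, so with these exported you can run the installer as plain curl -sfL https://get.k3s.io | sh -.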

Here you can choose between two different installations:


1. Simple K3S installation

On the worker node run

$ curl -sfL https://get.k3s.io | K3S_URL=https://server:6443 K3S_TOKEN=mynodetoken sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.24.6+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.24.6+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.24.6+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO]  systemd: Starting k3s-agent
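
Optionally, confirm on the worker node that the agent unit created by the installer is up:

$ sudo systemctl status k3s-agent
$ sudo journalctl -u k3s-agent -e      # recent agent logs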

2. K3S installation with a static CPU manager policy

If you would like to enroll a worker node with a static CPU manager policy, run the following on the worker node:

$ curl -sfL https://get.k3s.io -o install.sh
$ chmod +x install.sh
$ K3S_URL=https://server:6443 K3S_TOKEN=token ./install.sh agent --kubelet-arg 'cpu-manager-policy=static' --kubelet-arg 'kube-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi' --kubelet-arg 'system-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi'

Check that the CPU manager state is set by running this on the worker node:

$ sudo cat /var/lib/kubelet/cpu_manager_state
{"policyName":"static","defaultCpuSet":"0-63","checksum":1058907510}

On the controller, check that the worker node has been added:

$ kubectl get nodes -o wide
NAME       STATUS   ROLES                  AGE    VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
edge-vm2   Ready    <none>                 71s    v1.24.6+k3s1   10.10.2.32    <none>        Ubuntu 20.04.5 LTS   5.4.0-125-generic   containerd://1.6.8-k3s1
edge-vm1   Ready    control-plane,master   152m   v1.22.5+k3s1   10.10.2.31    10.0.87.20    Ubuntu 20.04.5 LTS   5.4.0-128-generic   containerd://1.5.8-k3s1

Check that all kube pods on the new node are running and healthy:

(venv) expeca@controller-01:/opt/chi-in-a-box$ kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=worker-02
NAMESPACE       NAME                            READY   STATUS    RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
calico-system   calico-typha-697fd494b5-b4cf2   1/1     Running   0          84s   10.10.2.5        worker-02   <none>           <none>
kube-system     kube-multus-ds-9544c            1/1     Running   0          89s   10.10.2.5        worker-02   <none>           <none>
calico-system   csi-node-driver-jhqf8           2/2     Running   0          88s   192.168.73.112   worker-02   <none>           <none>
calico-system   calico-node-2jfd5               1/1     Running   0          89s   10.10.2.5        worker-02   <none>           <none>

On the controller, describe the node and check that the reserved resources have been subtracted from the allocatable values (here 2 of the 32 CPUs are reserved):

$ kubectl describe node <node-name>
...
Capacity:
  cpu:                32
  ephemeral-storage:  425991584Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             131518696Ki
  pods:               110
Allocatable:
  cpu:                30
  ephemeral-storage:  412257128943
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             127324392Ki
  pods:               110
...

On the worker node, install the CNI plugins (requires Go):

git clone https://github.com/containernetworking/plugins.git
cd plugins
./build_linux.sh
sudo cp bin/host-device /opt/cni/bin/
sudo cp bin/dhcp /opt/cni/bin/
sudo cp bin/macvlan /opt/cni/bin/
sudo cp bin/host-local /opt/cni/bin/
sudo cp bin/bridge /opt/cni/bin/
sudo cp bin/static /opt/cni/bin/
ls -l /opt/cni/bin

On the worker node, check that all interfaces are up, set up with the proper MTU, and connected according to the testbed's inventory.
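
For example (interface names here match the JSON inventory used below; adjust to yours):

$ ip -br link show                                   # state of every interface
$ ip link show dev enp4s0np0 | grep -o 'mtu [0-9]*'  # expect mtu 9000 here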

Importing the worker node

Here we add the node to OpenStack. First, there must be a JSON file containing the worker node's information; the format must be as below.

$ cat radiohost.json
[
  {
    "name": "worker-02",
    "hardware_type": "workernode",
    "properties": {
      "blazar_device_driver": "k8s",
      "machine_name": "worker-02",
      "device_name": "Dell PowerEdge R750xs",
      "vendor": "Dell",
      "model": "PowerEdge",
      "cpu_arch": "x86_64",
      "bm_interfaces": [
        {
          "name": "ens1f0",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te1/0/20",
              "switch_info":"tenant-switch-01"
            }
          ]
        },
        { 
          "name": "enp4s0np0",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te2/0/11",
              "switch_info":"tenant-switch-01"
            }
          ]
        },
        { 
          "name": "enp5s0np1",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te2/0/12",
              "switch_info":"tenant-switch-01"
            }
          ]
        }
      ]
    }
  }
]

Then we import the JSON file using the openstack CLI:

openstack hardware import -f radiohost.json

Note that the names and MTUs of the interfaces must match those on the worker node.
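
One way to cross-check this, assuming jq is installed on both machines, is to list the name/MTU pairs declared in the JSON and compare them with what the worker node reports:

$ jq -r '.[0].properties.bm_interfaces[] | "\(.name) mtu \(.mtu)"' radiohost.json   # on the controller
$ ip -j link show | jq -r '.[] | "\(.ifname) mtu \(.mtu)"'                          # on the worker node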

Check that it has been added to Doni's database and check its status. Each worker's state must be STEADY; otherwise, check the services' logs.

$ openstack hardware list
--------------------------------------------------------------------------
| UUID                                 | Name            | Properties  
--------------------------------------------------------------------------
| 6e29e738-c770-45e8-8747-cc6d8f6fd48f | radio-host3-vm1 | {'blazar_device_driver': 'k8s', 'bm_interfaces': [...], 'cpu_arch': 'x86_64', 'machine_name': 'k8s-worker'} |
--------------------------------------------------------------------------
$ openstack hardware show 6e29e738-c770-45e8-8747-cc6d8f6fd48f
--------------------------------------------------------------------------
| Field         | Value
--------------------------------------------------------------------------
| created_at    | 2022-11-07T23:16:21+00:00
| hardware_type | workernode 
| name          | radio-host3-vm1
| project_id    | e27227248c2b425ba4e2b2f548dbcd85
| workers       | [{'state': 'STEADY', 'state_details': {'blazar_resource_id': '628168f7-0217-483a-b4ad-884344762a6f', 'resource_created_at': '2022-11-07 23:17:07'}, 'worker_type': 'blazar.device'}, {'state': 'STEADY', 'state_details': {'num_labels': 0}, 'worker_type': 'k8s'}]
--------------------------------------------------------------------------

Doni adds the node to Blazar for user reservation and creates network attachment definitions on k8s for the baremetal ports. Check that they exist:

$ openstack reservation device list
+--------------------------------------+-----------------+---------------+-------------+
| id                                   | name            | device_driver | device_type |
+--------------------------------------+-----------------+---------------+-------------+
| 628168f7-0217-483a-b4ad-884344762a6f | radio-host3-vm1 | k8s           | container   |
+--------------------------------------+-----------------+---------------+-------------+
$ openstack reservation device show 628168f7-0217-483a-b4ad-884344762a6f
+------------------+--------------------------------------+
| Field            | Value                                |
+------------------+--------------------------------------+
| created_at       | 2022-11-07 23:17:07                  |
| device_driver    | k8s                                  |
| device_name      | K8S Worker Node                      |
| device_type      | container                            |
| id               | 628168f7-0217-483a-b4ad-884344762a6f |
| machine_name     | k8s-worker                           |
| model            | Ubuntu 20.04.5 Virtual Machine       |
| name             | radio-host3-vm1                      |
| platform_version | 2                                    |
| reservable       | True                                 |
| uid              | 6e29e738-c770-45e8-8747-cc6d8f6fd48f |
| updated_at       |                                      |
| vendor           | KTH Royal Institue of Technology     |
+------------------+--------------------------------------+
$ kubectl get network-attachment-definitions
NAME                        AGE
radio-host3-vm1.enp4s0np0   34h
radio-host3-vm1.enp5s0np1   34h
$ kubectl describe network-attachment-definition radio-host3-vm1.enp4s0np0
Name:         radio-host3-vm1.enp4s0np0
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  k8s.cni.cncf.io/v1
Kind:         NetworkAttachmentDefinition
Metadata:
  Creation Timestamp:  2022-11-07T23:17:08Z
  Generation:          1
  Managed Fields:
    API Version:  k8s.cni.cncf.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:config:
    Manager:         OpenAPI-Generator
    Operation:       Update
    Time:            2022-11-07T23:17:08Z
  Resource Version:  599993
  UID:               4f77252e-0b7d-4ced-9630-d00d273b19c5
Spec:
  Config:  { "cniVersion": "0.3.1", "local_link_information":[{"switch_id": "a0:f8:49:f7:89:d1", "port_id": "te2/0/11", "switch_info": "tenant-switch"}], "plugins": [{ "type": "macvlan", "master": "enp4s0np0", "mode": "bridge", "ipam": {} },{ "capabilities":{ "mac": true}, "type": "tuning" }] }
Events:    <none>

Remove and clean a worker node

  • Make sure the node has no reservations; delete any that exist. (A verification sketch follows this list.)

  • Delete the hardware in Doni

    openstack hardware delete <hardware-uuid>
  • On the worker node, run the following to stop and remove k3s services.

    /usr/local/bin/k3s-killall.sh
    /usr/local/bin/k3s-agent-uninstall.sh
  • On the controller, list the registered nodes and delete the desired node

    kubectl get nodes
    kubectl delete node <node-name>
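
After the cleanup, a quick verification sketch (using the same CLIs as above):

$ openstack reservation lease list   # no lease should reference the node
$ openstack hardware list            # the node is gone from Doni
$ kubectl get nodes                  # the node is gone from k8s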

Attach networks to the containers

This feature is only enabled for physical VLAN networks registered in Neutron.

When creating a container, to attach networks to its interfaces, make sure to provide the proper labels in the following format:

networks.<network-number>.interface=<interface-name>,
networks.<network-number>.ip=<ip>/<subnet>,
networks.<network-number>.routes=<source-network>/<subnet>-<via-ip>,

If you don't specify an IP for an interface, the container will get an IP from the DHCP agent; the network must therefore be DHCP-enabled. For example, the following settings attach two networks: the first requests a static IP, while the second will get an IP from the DHCP agent. The routes option can be used multiple times, resulting in multiple route commands.

networks.1.interface=enp4s0np0,networks.1.ip=192.168.100.200/24,networks.2.interface=enp5s0np1
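
How the labels are passed depends on your container client. A hypothetical sketch using the Zun CLI and its --label option (container name and image are examples, not part of this workflow):

$ openstack appcontainer run --name net-demo \
    --label networks.1.interface=enp4s0np0 \
    --label networks.1.ip=192.168.100.200/24 \
    --label networks.2.interface=enp5s0np1 \
    ubuntu:20.04 sleep infinity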

Note that the first container that takes a baremetal interface decides which network the interface gets attached to. Subsequent containers can use the same baremetal interface, but they must be on the same network and subnet as the first container. For example:

Container 1 on worker-1:

networks.1.interface=enp4s0np0,networks.1.ip=192.168.0.1/24

Container 2 on worker-1 (correct):

networks.1.interface=enp4s0np0,networks.1.ip=192.168.0.2/24

Container 2 on worker-1 (wrong):

networks.1.interface=enp4s0np0,networks.1.ip=10.10.0.2/24

Routing example

networks.1.interface=enp4s0np0,networks.1.ip=10.99.99.3/24,networks.1.routes=172.16.0.0/16-10.99.99.1
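
Inside the container you should then see the extra route on the attached interface (Multus typically names the first attachment net1), roughly:

$ ip route | grep 172.16
172.16.0.0/16 via 10.99.99.1 dev net1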

Create containers with privileges

When creating a container, to enable any capabilities, make sure to provide the proper labels in the following format:

capabilities.privileged=true
capabilities.drop=ALL
capabilities.add.1=NET_ADMIN
capabilities.add.2=SYS_ADMIN
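
To verify from within the container which capabilities actually took effect, inspect the status of PID 1 and decode the bitmask (capsh is part of libcap and may need to be installed in the image):

$ grep CapEff /proc/1/status
$ capsh --decode=$(grep CapEff /proc/1/status | awk '{print $2}')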

For more information, see the Kubernetes documentation on security contexts.

Control CPU and memory of the container

When creating a container, provide resource requests and limits as labels in the following format:

resources.limits.memory=200Mi
resources.limits.cpu=1.5
resources.requests.memory=200Mi
resources.requests.cpu=1.5

Check the assigned CPU set from within the container:

cat /sys/fs/cgroup/cpuset/cpuset.cpus
4,6
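
The memory limit can be checked the same way; the path below assumes cgroup v1, like the cpuset example above (under cgroup v2 it is /sys/fs/cgroup/memory.max). For the 200Mi limit above, expect:

$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
209715200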

For more information about how Kubernetes controls a node's resources, see the Kubernetes documentation.

More troubleshooting

Users can request access to the local bridge; see https://devopstales.github.io/kubernetes/multus/ for the bridge CNI. If the option is checked, Multus will add an interface that is connected to a bridge on the container's worker node, meaning all containers running on this node can be connected to each other through this bridge. In the JSON, we specify the IP address management (ipam) of this interface. It is better to use different subnets for different container workers.

For the DHCP IPAM plugin to work, follow https://www.cni.dev/plugins/current/ipam/dhcp/ and see https://superuser.com/questions/1727321/podman-macvlan-network-error.
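
In short, the dhcp IPAM plugin delegates DHCP requests to a daemon that must already be running on the worker node. A minimal sketch of starting it by hand (in production you would wrap this in a systemd unit):

$ sudo rm -f /run/cni/dhcp.sock   # remove a stale socket if present
$ sudo /opt/cni/bin/dhcp daemon &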

Other useful commands:

kubectl get pods -A -o wide                    # all pods, with the node each runs on
openstack baremetal node list                  # Ironic's view of the baremetal nodes
kubectl get deployments -A -o wide             # deployments in all namespaces
kubectl delete deployment -n <namespace> <deployment-name>
