# Enroll Worker Nodes

## Register Worker Nodes

Enrolling a worker node in our OpenStack deployment requires the following steps:

1. Installing k3s on the worker node
2. Importing the worker node

### Installing k3s on the worker node

On the controller, read the node token (`mynodetoken`):

```
$ sudo cat /var/lib/rancher/k3s/server/node-token
```

On the controller, get `server` by reading the `EXTERNAL-IP` of the controller node:

```
$ kubectl get nodes -o wide
NAME       STATUS   ROLES                  AGE    VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
edge-vm1   Ready    control-plane,master   148m   v1.22.5+k3s1   10.10.2.31    10.0.87.20    Ubuntu 20.04.5 LTS   5.4.0-128-generic   containerd://1.5.8-k3s1
```
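
As a small convenience, the `EXTERNAL-IP` column can be turned into the `K3S_URL` value used in the next step. The sketch below assumes the column layout of `kubectl get nodes -o wide` shown above (the external IP is the seventh column) and the API port 6443 used in the install commands; `k3s_url_for` is a hypothetical helper name, not part of k3s or kubectl.

```shell
# Hypothetical helper: read the table printed by `kubectl get nodes -o wide`
# on stdin and print the K3S_URL for the node named in $1.
# Assumes EXTERNAL-IP is the 7th whitespace-separated column, as in the
# output above, and that the API server listens on port 6443.
k3s_url_for() {
    awk -v node="$1" '$1 == node { print "https://" $7 ":6443" }'
}
```

For the example output above, `kubectl get nodes -o wide | k3s_url_for edge-vm1` would print `https://10.0.87.20:6443`.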

Choose one of the following two installation options:

***

#### 1. Simple K3S installation

On the worker node, run the following, replacing `server` and `mynodetoken` with the values obtained on the controller:

```
$ curl -sfL https://get.k3s.io | K3S_URL=https://server:6443 K3S_TOKEN=mynodetoken sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.24.6+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.24.6+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.24.6+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO]  systemd: Starting k3s-agent
```

***

#### 2. K3S installation with a static cpu manager policy

To enroll a worker node with a static CPU manager policy, run the following on the worker node instead:

```bash
$ curl -sfL https://get.k3s.io -o install.sh
$ chmod +x install.sh
$ K3S_URL=https://server:6443 K3S_TOKEN=mynodetoken ./install.sh agent \
    --kubelet-arg 'cpu-manager-policy=static' \
    --kubelet-arg 'kube-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi' \
    --kubelet-arg 'system-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi'
```

To check that the CPU manager state is set, run the following on the worker node:

```
$ sudo cat /var/lib/kubelet/cpu_manager_state
{"policyName":"static","defaultCpuSet":"0-63","checksum":1058907510}
```
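
To script this check, the policy name can be pulled out of the state file. This is a minimal sketch that assumes the compact single-line JSON shown above; `cpu_manager_policy` is a hypothetical helper, and a proper JSON parser would be more robust.

```shell
# Hypothetical helper: print the value of "policyName" from the kubelet
# CPU manager state read on stdin. Relies on the compact JSON layout shown
# above; it is a text match, not a real JSON parser.
cpu_manager_policy() {
    sed -n 's/.*"policyName":"\([^"]*\)".*/\1/p'
}
```

On a node enrolled with option 2, `sudo cat /var/lib/kubelet/cpu_manager_state | cpu_manager_policy` should print `static`.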

***

On the controller, check that the worker node has been added:

```
$ kubectl get nodes -o wide
NAME       STATUS   ROLES                  AGE    VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
edge-vm2   Ready    <none>                 71s    v1.24.6+k3s1   10.10.2.32    <none>        Ubuntu 20.04.5 LTS   5.4.0-125-generic   containerd://1.6.8-k3s1
edge-vm1   Ready    control-plane,master   152m   v1.22.5+k3s1   10.10.2.31    10.0.87.20    Ubuntu 20.04.5 LTS   5.4.0-128-generic   containerd://1.5.8-k3s1
```

Check that all Kubernetes pods on the new node are running and healthy:

```
$ kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=worker-02
NAMESPACE       NAME                            READY   STATUS    RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
calico-system   calico-typha-697fd494b5-b4cf2   1/1     Running   0          84s   10.10.2.5        worker-02   <none>           <none>
kube-system     kube-multus-ds-9544c            1/1     Running   0          89s   10.10.2.5        worker-02   <none>           <none>
calico-system   csi-node-driver-jhqf8           2/2     Running   0          88s   192.168.73.112   worker-02   <none>           <none>
calico-system   calico-node-2jfd5               1/1     Running   0          89s   10.10.2.5        worker-02   <none>           <none>
```

On the controller, inspect the node and check its capacity and allocatable resources:

```
$ kubectl describe node <node-name>
...
Capacity:
  cpu:                32
  ephemeral-storage:  425991584Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             131518696Ki
  pods:               110
Allocatable:
  cpu:                30
  ephemeral-storage:  412257128943
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             127324392Ki
  pods:               110
...
```
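
The gap between `Capacity` and `Allocatable` can be cross-checked by hand. Assuming the node above was enrolled with the `kube-reserved` and `system-reserved` values from option 2 (1 CPU and 2Gi of memory each), and that no memory eviction threshold is subtracted, the CPU and memory figures work out as follows. The variable names are illustrative only.

```shell
# Capacity values taken from the `kubectl describe node` output above.
capacity_cpu=32
capacity_mem_ki=131518696

# Reservations assumed from option 2: kube-reserved and system-reserved
# each hold back 1 CPU and 2Gi (2 * 1024 * 1024 Ki) of memory.
reserved_cpu=$(( 1 + 1 ))
reserved_mem_ki=$(( 2 * 2 * 1024 * 1024 ))

echo "allocatable cpu:    $(( capacity_cpu - reserved_cpu ))"
echo "allocatable memory: $(( capacity_mem_ki - reserved_mem_ki ))Ki"
```

This prints 30 CPUs and 127324392Ki of memory, which matches the `Allocatable` section above.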

On the worker node, build and install the CNI plugins (this requires Go):

```
git clone https://github.com/containernetworking/plugins.git
cd plugins
./build_linux.sh
sudo cp bin/host-device /opt/cni/bin/
sudo cp bin/dhcp /opt/cni/bin/
sudo cp bin/macvlan /opt/cni/bin/
sudo cp bin/host-local /opt/cni/bin/
sudo cp bin/bridge /opt/cni/bin/
sudo cp bin/static /opt/cni/bin/
ls -l /opt/cni/bin
```

On the worker node, check that all interfaces are up, configured with the proper MTU, and connected according to the testbed's inventory.
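
The interface check can be scripted by reading the kernel's sysfs entries. The sketch below assumes the standard Linux files `/sys/class/net/<interface>/operstate` and `/sys/class/net/<interface>/mtu`; `check_iface` and the `SYSFS_NET` override are hypothetical names introduced here. It does not verify the switch cabling, which still has to be compared against the inventory.

```shell
# Hypothetical helper: succeed only if interface $1 is up and its MTU
# equals $2. SYSFS_NET can point at another directory for dry runs.
check_iface() {
    iface_dir="${SYSFS_NET:-/sys/class/net}/$1"
    iface_state=$(cat "$iface_dir/operstate" 2>/dev/null)
    iface_mtu=$(cat "$iface_dir/mtu" 2>/dev/null)
    if [ "$iface_state" = "up" ] && [ "$iface_mtu" = "$2" ]; then
        echo "$1: ok (state=$iface_state mtu=$iface_mtu)"
    else
        echo "$1: MISMATCH (state=${iface_state:-unknown} mtu=${iface_mtu:-unknown}, expected up/$2)"
        return 1
    fi
}
```

For example, `check_iface enp4s0np0 9000` reports whether that interface is up with an MTU of 9000.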

### Importing the worker node

Next, add the node to OpenStack. First, create a JSON file that describes the worker node in the following format:

```
$ cat radiohost.json
[
  {
    "name": "worker-02",
    "hardware_type": "workernode",
    "properties": {
      "blazar_device_driver": "k8s",
      "machine_name": "worker-02",
      "device_name": "Dell PowerEdge R750xs",
      "vendor": "Dell",
      "model": "PowerEdge",
      "cpu_arch": "x86_64",
      "bm_interfaces": [
        {
          "name": "ens1f0",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te1/0/20",
              "switch_info":"tenant-switch"
            }
          ]
        },
        { 
          "name": "enp4s0np0",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te2/0/11",
              "switch_info":"tenant-switch"
            }
          ]
        },
        { 
          "name": "enp5s0np1",
          "mtu": 9000,
          "local_link_information":[
            {
              "switch_id":"a0:f8:49:f7:89:d1",
              "port_id":"te2/0/12",
              "switch_info":"tenant-switch"
            }
          ]
        }
      ]
    }
  }
]
```

Then import the JSON file with the `openstack` command:

```
openstack hardware import -f radiohost.json
```

Note that the `name` and `mtu` of each interface must match the configuration on the worker node.

Check that the node has been added to Doni's database and check its status. The `state` of every worker must be `STEADY`; otherwise, check the services' logs.

```
$ openstack hardware list
--------------------------------------------------------------------------
| UUID                                 | Name            | Properties  
--------------------------------------------------------------------------
| 6e29e738-c770-45e8-8747-cc6d8f6fd48f | radio-host3-vm1 | {'blazar_device_driver': 'k8s', 'bm_interfaces': [...], 'cpu_arch': 'x86_64', 'machine_name': 'k8s-worker'} |
--------------------------------------------------------------------------
$ openstack hardware show 6e29e738-c770-45e8-8747-cc6d8f6fd48f
--------------------------------------------------------------------------
| Field         | Value
--------------------------------------------------------------------------
| created_at    | 2022-11-07T23:16:21+00:00
| hardware_type | workernode 
| name          | radio-host3-vm1
| project_id    | e27227248c2b425ba4e2b2f548dbcd85
| workers       | [{'state': 'STEADY', 'state_details': {'blazar_resource_id': '628168f7-0217-483a-b4ad-884344762a6f', 'resource_created_at': '2022-11-07 23:17:07'}, 'worker_type': 'blazar.device'}, {'state': 'STEADY', 'state_details': {'num_labels': 0}, 'worker_type': 'k8s'}]
--------------------------------------------------------------------------
```
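
The state check can also be scripted. The sketch below scans the text printed by `openstack hardware show` for `'state': '...'` entries, as they appear in the `workers` row above, and succeeds only if every entry is `STEADY`. `all_workers_steady` is a hypothetical helper, and matching on the table text is an assumption about the output format rather than a stable interface.

```shell
# Hypothetical helper: read `openstack hardware show <uuid>` output on stdin
# and succeed only if at least one worker state is present and all of them
# are STEADY.
all_workers_steady() {
    worker_states=$(grep -o "'state': '[A-Z_]*'" | sort -u)
    [ "$worker_states" = "'state': 'STEADY'" ]
}
```

A possible use is `openstack hardware show <hardware-uuid> | all_workers_steady && echo "all workers are STEADY"`.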

Doni adds the node to Blazar so that users can reserve it, and creates network attachment definitions on Kubernetes for the baremetal ports. Check that both exist.

```
$ openstack reservation device list
+--------------------------------------+-----------------+---------------+-------------+
| id                                   | name            | device_driver | device_type |
+--------------------------------------+-----------------+---------------+-------------+
| 628168f7-0217-483a-b4ad-884344762a6f | radio-host3-vm1 | k8s           | container   |
+--------------------------------------+-----------------+---------------+-------------+
```

```
$ openstack reservation device show 628168f7-0217-483a-b4ad-884344762a6f
+------------------+--------------------------------------+
| Field            | Value                                |
+------------------+--------------------------------------+
| created_at       | 2022-11-07 23:17:07                  |
| device_driver    | k8s                                  |
| device_name      | K8S Worker Node                      |
| device_type      | container                            |
| id               | 628168f7-0217-483a-b4ad-884344762a6f |
| machine_name     | k8s-worker                           |
| model            | Ubuntu 20.04.5 Virtual Machine       |
| name             | radio-host3-vm1                      |
| platform_version | 2                                    |
| reservable       | True                                 |
| uid              | 6e29e738-c770-45e8-8747-cc6d8f6fd48f |
| updated_at       |                                      |
| vendor           | KTH Royal Institue of Technology     |
+------------------+--------------------------------------+
```

```
$ kubectl get network-attachment-definitions
NAME                        AGE
radio-host3-vm1.enp4s0np0   34h
radio-host3-vm1.enp5s0np1   34h
```

```
$ kubectl describe network-attachment-definition radio-host3-vm1.enp4s0np0
Name:         radio-host3-vm1.enp4s0np0
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  k8s.cni.cncf.io/v1
Kind:         NetworkAttachmentDefinition
Metadata:
  Creation Timestamp:  2022-11-07T23:17:08Z
  Generation:          1
  Managed Fields:
    API Version:  k8s.cni.cncf.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:config:
    Manager:         OpenAPI-Generator
    Operation:       Update
    Time:            2022-11-07T23:17:08Z
  Resource Version:  599993
  UID:               4f77252e-0b7d-4ced-9630-d00d273b19c5
Spec:
  Config:  { "cniVersion": "0.3.1", "local_link_information":[{"switch_id": "a0:f8:49:f7:89:d1", "port_id": "te2/0/11", "switch_info": "tenant-switch"}], "plugins": [{ "type": "macvlan", "master": "enp4s0np0", "mode": "bridge", "ipam": {} },{ "capabilities":{ "mac": true}, "type": "tuning" }] }
Events:    <none>
```

## Remove and clean a worker node

* Make sure the node has no reservations. Delete any that exist.
* Delete the hardware in `Doni`:

  ```
  openstack hardware delete <hardware-uuid>
  ```
* On the worker node, run the following to stop and remove k3s services.

  ```
  /usr/local/bin/k3s-killall.sh
  /usr/local/bin/k3s-agent-uninstall.sh
  ```
* On the controller, list the registered nodes and delete the desired node:

  ```
  kubectl get nodes
  kubectl delete node <node-name>
  ```

## Attach networks to the containers

This feature is only enabled for physical VLAN networks registered in Neutron.

To attach networks to a container's network interfaces, provide labels in the following format when creating the container:

```
networks.<network-number>.interface=<interface-name>,
networks.<network-number>.ip=<ip>/<subnet>,
networks.<network-number>.routes=<source-network>/<subnet>-<via-ip>,
```

If you don't specify an `ip` for an interface, the container gets its IP address from the DHCP agent, so the network must have DHCP enabled. The `routes` option can be used multiple times, resulting in multiple route commands. For example, the following setting attaches 2 networks: the first one with a static IP, and the second one with an IP obtained from the DHCP agent.

```
networks.1.interface=enp4s0np0,networks.1.ip=192.168.100.200/24,networks.2.interface=enp5s0np1
```

Note that the first container to take a baremetal interface determines the network that the interface is attached to. Later containers can use the same baremetal interface, but they must be in the same network and subnet as the first container. For example:

Container 1 on `worker-1`:

```
networks.1.interface=enp4s0np0,networks.1.ip=192.168.0.1/24
```

Container 2 on `worker-1` (correct):

```
networks.1.interface=enp4s0np0,networks.1.ip=192.168.0.2/24
```

Container 2 on `worker-1` (wrong):

```
networks.1.interface=enp4s0np0,networks.1.ip=10.10.0.2/24
```
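
The subnet rule above can be checked before creating the second container. The following sketch compares the network part of two IPv4 addresses in CIDR notation; `ip_to_int`, `network_of`, and `same_subnet` are hypothetical helpers, and the check covers only the address arithmetic, not which Neutron network the interface is attached to.

```shell
# Convert a dotted-quad IPv4 address to an integer. The function body runs
# in a subshell so that changing IFS does not leak to the caller.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print "<network address as integer>/<prefix length>" for an address
# given as <ip>/<prefix>, e.g. 192.168.0.1/24.
network_of() (
    addr=${1%/*}
    prefix=${1#*/}
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    echo "$(( $(ip_to_int "$addr") & mask ))/$prefix"
)

# Succeed if both CIDR addresses fall in the same network and prefix.
same_subnet() {
    [ "$(network_of "$1")" = "$(network_of "$2")" ]
}
```

With the examples above, `same_subnet 192.168.0.1/24 192.168.0.2/24` succeeds, while `same_subnet 192.168.0.1/24 10.10.0.2/24` fails.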

Routing example:

```
networks.1.interface=enp4s0np0,networks.1.ip=10.99.99.3/24,networks.1.routes=172.16.0.0/16-10.99.99.1
```
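
Since the label string follows a fixed pattern, it can be assembled by a small helper. This is a sketch only: `network_labels` is a hypothetical function that takes the network number, the interface name, an optional IP in CIDR form, and an optional single route. It leaves out the `ip` label when no address is given, so that the DHCP agent assigns one.

```shell
# Hypothetical helper: print the label string for one network.
#   $1 network number, $2 interface name,
#   $3 optional <ip>/<subnet>, $4 optional <source-network>/<subnet>-<via-ip>
network_labels() {
    labels="networks.$1.interface=$2"
    if [ -n "${3:-}" ]; then
        labels="$labels,networks.$1.ip=$3"
    fi
    if [ -n "${4:-}" ]; then
        labels="$labels,networks.$1.routes=$4"
    fi
    echo "$labels"
}
```

For example, `network_labels 1 enp4s0np0 10.99.99.3/24 172.16.0.0/16-10.99.99.1` reproduces the routing example above, and `network_labels 2 enp5s0np1` prints `networks.2.interface=enp5s0np1` for a DHCP-assigned interface.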

## Create containers with privileges

When creating a container, to enable any capabilities, provide the proper labels in the following format:

```
capabilities.privileged=true
capabilities.drop=ALL
capabilities.add.1=NET_ADMIN
capabilities.add.2=SYS_ADMIN
```

For more information about the Kubernetes security context, see [here](https://github.com/hub-kubernetes/securitycontext/blob/master/README.md).

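## Control CPU and memory of the container

When creating a container, to control its CPU and memory, provide the proper labels in the following format: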

```
resources.limits.memory=200Mi
resources.limits.cpu=1.5
resources.requests.memory=200Mi
resources.requests.cpu=1.5
```

Check the assigned CPUs from within the container:

```
cat /sys/fs/cgroup/cpuset/cpuset.cpus
4,6
```
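
The kernel reports the CPU set as a comma-separated list of single CPUs and ranges, such as `4,6` above or `0-63` in the CPU manager state. To compare it with the requested CPU count, the list can be reduced to a number of CPUs. `cpuset_count` is a hypothetical helper written for this sketch.

```shell
# Hypothetical helper: count the CPUs in a cpuset list such as "4,6" or
# "0-3,8". The function body runs in a subshell so that changing IFS does
# not leak to the caller.
cpuset_count() (
    total=0
    IFS=,
    for part in $1; do
        case $part in
            *-*) total=$(( total + ${part#*-} - ${part%-*} + 1 )) ;;
            *)   total=$(( total + 1 )) ;;
        esac
    done
    echo "$total"
)
```

Inside the container, `cpuset_count "$(cat /sys/fs/cgroup/cpuset/cpuset.cpus)"` prints `2` for the output above.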

For more information about how Kubernetes controls a node's resources, see [here](https://github.com/hub-kubernetes/securitycontext/blob/master/README.md).

## More troubleshooting

Users can request access to the local bridge. See <https://devopstales.github.io/kubernetes/multus/> for the bridge CNI. If the option is checked, Multus adds an interface that is connected to a bridge on the container's worker node. This means that all containers running on this node can be connected to each other through this bridge. In the JSON, we specify the IP address management (IPAM) of this interface. It is better to use different subnets for different container worker nodes.

For the `dhcp` IPAM to work, follow the instructions in:

* <https://www.cni.dev/plugins/current/ipam/dhcp/>
* <https://superuser.com/questions/1727321/podman-macvlan-network-error>

The following commands help to inspect and clean up the cluster state:

```
kubectl get pods -A -o wide
openstack baremetal node list
kubectl get deployments -A -o wide
kubectl delete deployment -n <namespace> <deployment-name>
```

