During Deployment

cc-ansible deploy issues (first round)

Prevent Zun's Known Error

During the deployment of Zun, Ansible fails at the Copying over kubeconfig for k8s agent task.

TASK [zun : Copying over kubeconfig for k8s agent] 
fatal: [edge]: FAILED! => {"msg": "No file was found when using first_found. Use errors='ignore' to allow this task to be skipped if no files are found"}

To avoid this, edit the Ansible task and add ignore_errors: yes to it.

cd /opt/chi-in-a-box
vim venv/src/kolla-ansible/ansible/roles/zun/tasks/config.yml

Then change

...
- name: Copying over kubeconfig for k8s agent
  vars:
    service_name: zun-compute-k8s
...

to

...
- name: Copying over kubeconfig for k8s agent
  ignore_errors: yes
  vars:
    service_name: zun-compute-k8s
...
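With ignore_errors: yes in place, the task still reports its failure, but the play continues. Afterwards, re-run the deployment; a sketch, assuming the same site-config path used elsewhere in this guide:

./cc-ansible --site /opt/site-config/ deploy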

MariaDB Access denied for root

TASK [mariadb : Creating shard root mysql user] ************************************************************************************************************************************************************
fatal: [edge]: FAILED! => {"action": "mysql_user", "changed": false, "msg": "unable to connect to database, check login_user and login_password are correct or /var/lib/ansible/.my.cnf has the credentials. Exception message: (1045, \"Access denied for user 'root'@'edge' (using password: YES)\")"}

Solution:

The main solution is to start from a clean slate: stop all Docker containers and delete all Docker volumes, as sketched below. If that does not work, there is an alternative solution, though it is risky.
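A minimal clean-slate sketch follows. Note that it is destructive and assumes every container and volume on the host belongs to this deployment:

# WARNING: destroys ALL containers and volumes on this host
docker stop $(docker ps -aq)
docker rm $(docker ps -aq)
docker volume rm $(docker volume ls -q)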

If the clean slate does not help, the riskier alternative is to grant root access via docker exec instead of the Ansible mysql_user module. Replace the Creating shard root mysql user task in the file venv/src/kolla-ansible/ansible/roles/mariadb/tasks/register.yml with the following:

- name: Creating shard root mysql user
  become: true
  # Grant access for the root user from any host via docker exec,
  # bypassing the failing mysql_user module.
  command: >
    docker exec {{ mariadb_service.container_name }}
    mysql -u{{ database_user }} -p{{ database_password }} -e "GRANT ALL PRIVILEGES ON *.* TO '{{ database_user }}'@'%' IDENTIFIED BY '{{ database_password }}';"
  when: mariadb_shard_id == mariadb_default_database_shard_id
  notify:
    - restart mariadb
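After re-running the deployment, you can verify that the grant took effect. A quick check, assuming the container is named mariadb and substituting the database root password from your passwords file:

docker exec mariadb mysql -uroot -p<database_password> -e "SELECT User, Host FROM mysql.user;"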


Keystone certificate

During keystone deployment:

"AnsibleUndefinedVariable: 'keystone_federation_openid_certificate_key_ids' is undefined.

The solution is to comment out the following lines in /opt/site-config/defaults.yml:

# enable_keystone_federation: no
# enable_keystone_federation_openid: no
# keystone_idp_client_id: null

Keystone register blazar

TASK [service-ks-register : blazar | Granting user roles]

FAILED - RETRYING: blazar | Granting user roles (5 retries left).
FAILED - RETRYING: blazar | Granting user roles (4 retries left).
FAILED - RETRYING: blazar | Granting user roles (3 retries left).
FAILED - RETRYING: blazar | Granting user roles (2 retries left).
FAILED - RETRYING: blazar | Granting user roles (1 retries left).
failed: [edge-vm1] (item={'user': 'blazar', 'role': 'admin', 'project': 'service'}) => {"action": "os_user_role", "attempts": 5, "changed": false, "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

The key lines of the traceback are:

Failed to discover available identity versions when contacting http://10.20.111.254:35357. Attempting to parse version from URL.
OSError: [Errno 113] No route to host
keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Unable to establish connection to http://10.20.111.254:35357: Failed to establish a new connection: [Errno 113] No route to host
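The traceback points to a connectivity problem rather than a Keystone misconfiguration: the deploy host cannot reach the admin endpoint at 10.20.111.254:35357. Before retrying, verify the endpoint is reachable; a quick check, adjusting the address to your deployment's internal VIP:

curl -sv http://10.20.111.254:35357/v3/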

cc-ansible deploy errors (second round)

Before running the second round, make sure to run ./cc-ansible --site /opt/site-config/ bootstrap-servers.

Then we get the following error again:

TASK [zun : Copying over kubeconfig for k8s agent] *********************************************************************************************************************************************************
fatal: [edge-mv]: FAILED! => {"msg": "No file was found when using first_found. Use errors='ignore' to allow this task to be skipped if no files are found"}

To fix this, the task needs to find the kubeconfig that k3s writes to /etc/rancher/k3s/k3s.yaml. Add the following task to the file venv/src/kolla-ansible/ansible/roles/zun/tasks/config.yml, right before Copying over kubeconfig for k8s agent.

- name: Set kubeconfig copy permission from etc
  become: true
  vars:
    service_name: zun-compute-k8s
  file:
    path: "/etc/rancher/k3s/k3s.yaml"
    mode: o+r
  when:
    - zun_services[service_name].enabled | bool
    - inventory_hostname in groups[zun_services[service_name].group]

Then add the line below to the with_first_found list of the Copying over kubeconfig for k8s agent task:

- "/etc/rancher/k3s/k3s.yaml"

Finally, check that the zun_compute_k8s container is healthy.
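For example, assuming the container name matches the service name above:

docker ps --filter name=zun_compute_k8s
docker logs --tail 50 zun_compute_k8s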
