- Sep 16, 2019
-
-
Mark Goddard authored
The kolla_toolbox Ansible module executes ad-hoc ansible commands in the kolla_toolbox container, and parses the output to make it look as if ansible-playbook executed the command. Currently, however, this module sometimes fails to catch failures of the underlying command, and also sometimes shows tasks as 'ok' when the underlying command reported a change. This has been tested both before and after the upgrade to ansible 2.8. This change fixes this issue by configuring ansible to emit output in JSON format, to make parsing simpler. We can now pick up errors and changes, and signal them to the caller. This change also adds an ansible playbook, tests/test-kolla-toolbox.yml, that can be executed to test the module. It's not currently integrated with any CI jobs. Note that this change cannot be backported as the JSON output callback plugin was added in Ansible 2.5. Change-Id: I8236dd4165f760c819ca972b75cbebc62015fada Closes-Bug: #1844114
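The idea of reading failures and changes from JSON output can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name is hypothetical, and the assumed layout (plays, then tasks, then per-host results with `failed`, `unreachable` and `changed` keys) follows the general shape of Ansible's JSON stdout callback.

```python
import json


def summarize_ansible_json(raw_output):
    """Reduce JSON-callback output to overall failed/changed flags.

    Assumes a plays -> tasks -> hosts layout; this is an illustrative
    assumption, not the kolla_toolbox module's own parser.
    """
    result = json.loads(raw_output)
    failed = False
    changed = False
    for play in result.get("plays", []):
        for task in play.get("tasks", []):
            for host_result in task.get("hosts", {}).values():
                if host_result.get("failed") or host_result.get("unreachable"):
                    failed = True
                if host_result.get("changed"):
                    changed = True
    return {"failed": failed, "changed": changed}


# Hypothetical sample output: one task on one host that reported a change.
sample_output = json.dumps({
    "plays": [
        {"tasks": [{"hosts": {"localhost": {"changed": True}}}]}
    ]
})
summary = summarize_ansible_json(sample_output)
```

With structured output, a caller can pass both flags on instead of guessing them from free-form text.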
-
- Sep 10, 2019
-
-
Hongbin Lu authored
After the integration with placement [1], we need to configure how zun-compute is going to work with nova-compute. * If zun-compute and nova-compute run on the same compute node, we need to set 'host_shared_with_nova' as true so that Zun will use the resource provider (compute node) created by nova. In this mode, containers and VMs could claim allocations against the same resource provider. * If zun-compute runs on a node without nova-compute, no extra configuration is needed. By default, each zun-compute will create a resource provider in placement to represent the compute node it manages. [1] https://blueprints.launchpad.net/zun/+spec/use-placement-resource-management Change-Id: I2d85911c4504e541d2994ce3d48e2fbb1090b813
-
- Sep 05, 2019
-
-
Marcin Juszkiewicz authored
Instead of changing the Docker daemon command line, let's change the Docker config in the /etc/docker/daemon.json file, as it should be. Custom Docker options can be set with the 'docker_custom_config' variable. The old 'docker_custom_option' is still present but should be avoided. Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com> Change-Id: I1215e04ec15b01c0b43bac8c0e81293f6724f278
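As a rough illustration of config-file-based customisation, the sketch below merges a dictionary of custom options over a base configuration and renders the JSON that would go into /etc/docker/daemon.json. The function name, the shallow-merge behaviour and the sample values are assumptions for illustration only; how kolla-ansible actually combines 'docker_custom_config' with its defaults is not stated in this commit message.

```python
import json


def render_daemon_json(base_config, docker_custom_config):
    """Return daemon.json content with custom options layered over a base.

    A shallow merge is assumed here: custom keys replace base keys.
    """
    merged = dict(base_config)
    merged.update(docker_custom_config)
    return json.dumps(merged, indent=4, sort_keys=True)


# Hypothetical values; the key names are standard daemon.json options.
base = {"log-opts": {"max-file": "5", "max-size": "50m"}}
custom = {"insecure-registries": ["registry.example.com:5000"]}
daemon_json = render_daemon_json(base, custom)
```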
-
- Aug 22, 2019
-
-
Michal Nasiadka authored
In order to orchestrate smooth transition to fluentd 0.14.x aka 1.0 stable branch aka td-agent 3 from td-agent repository - use image labels (fluentd_version and fluentd_binary). Depends-On: https://review.opendev.org/676411 Change-Id: Iab8518c34ef876056c6abcdb5f2e9fc9f1f7dbdd
-
- Aug 16, 2019
-
-
Mark Goddard authored
At the end of a CI run, check all log files. Change-Id: I99afc1c5207757e35beabf7daebd86c56151c96d
-
Radosław Piliszek authored
- Test Zun on CentOS too - Make etcd change also trigger Zun jobs (like kuryr and zun) - Test multinode Zun deployments instead of AIO (more likely to break) - In Zun scenario, stop configuring docker for legacy swarm mode (Zun is no swarm) - Separate test-zun.sh testing script - Show appcontainer to see which node it has been started on Change-Id: I289b1009fe00aedb9b78cbd83298b14da5fd9670 Depends-On: https://review.opendev.org/676736 Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
Michal Nasiadka authored
Change-Id: I081f2f4762651bca935f08a67b20f21946aaf051
-
- Aug 14, 2019
-
-
Kien Nguyen authored
Add Masakari testing into the Gate. Change-Id: I52df33f963e7d2ae4059887df3d24d9e6642134e Depends-On: https://review.opendev.org/#/c/615469/ Depends-On: https://review.opendev.org/#/c/615715 Implements: blueprint ansible-masakari Co-Authored-By:
Gaëtan Trellu <gaetan.trellu@incloudus.com>
-
- Aug 06, 2019
-
-
Mark Goddard authored
During the MariaDB testing we saw a number of cases where this IP address was not assigned to one or more hosts, which caused various issues later on. Change-Id: I61b54483e4553b926e9ddc0a8848b2daa6bc49f1
-
- Aug 05, 2019
-
-
Radosław Piliszek authored
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream. v3 and UDP require portmapper (aka rpcbind), which we do not want, except where the Ubuntu ganesha version (2.6) forces it by requiring enabled UDP, see [1]. The issue has been fixed in 2.8, included in CentOS. Additionally disable v3 helper protocols and kerberos to avoid meaningless warnings.

2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try handling it on the host.

3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy. The order has been corrected - nfs goes after mds. Additionally upgrade takes care of rgw for keystone (for swift emulation).

4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after any failure. This used to hide the real issue.

5) Retry ceph admin keyring update until cluster works
Reordering deployment caused an issue with the ceph cluster not being fully operational before taking actions on it.

6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by a healthy MON and no healthy MGR. A descriptive note is left in its place.

7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.

[1] https://review.opendev.org/669315

Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jul 26, 2019
-
-
Radosław Piliszek authored
This actually replaces two ad-hoc fixes with a more unified solution (with comment for posterity). Change-Id: I62f57cb489c900f68a0c7aeb3e20e4715c0e2661 Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
Radosław Piliszek authored
Multinode jobs did not run sanity checks for all the hosts, only primary. Now they check all. Additionally upgrades are now checked using the proper (pre-upgrade) scripts (not that it matters too much as they are the same atm) and both checks are done, not only failures, but also config. Change-Id: I10552e256edbddd5b1f8a8a7f8805262e72ce8d8 Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jul 18, 2019
-
-
Radosław Piliszek authored
Docker has no restart policy named 'never'. It has 'no'. This has bitten us already (see [1]) and might bite us again whenever we want to change the restart policy to 'no'. This patch makes our docker integration honor all valid restart policies and only valid restart policies. All relevant docker restart policy usages are patched as well. I added some FIXMEs around which are relevant to kolla-ansible docker integration. They are not fixed in here to not alter behavior. [1] https://review.opendev.org/667363 Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6 Signed-off-by:
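The "all valid restart policies and only valid restart policies" rule can be sketched as a small validation helper. The four policy names are Docker's documented restart policies; the function name and error message are hypothetical and do not reflect the actual kolla-ansible code.

```python
# Restart policy names accepted by Docker. Note there is no 'never'.
VALID_RESTART_POLICIES = ("no", "on-failure", "always", "unless-stopped")


def validate_restart_policy(policy):
    """Return the policy unchanged if Docker accepts it, else raise."""
    if policy not in VALID_RESTART_POLICIES:
        raise ValueError(
            "Invalid restart policy %r; expected one of: %s"
            % (policy, ", ".join(VALID_RESTART_POLICIES))
        )
    return policy
```

Rejecting unknown names up front means a typo such as 'never' fails loudly instead of being silently ignored.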
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jul 16, 2019
-
-
Radosław Piliszek authored
We install kolla-ansible requirements in Zuul's Ansible playbooks. This patch cleans up the installation in scripts so that they are only concerned with auxiliary requirements: - ansible (since we do not track it in requirements) - ara (for log summaries) - openstack clients (for first init and tests after deployment) Additionally this patch installs openstack clients in a separate virtualenv. Note that all kolla-ansible requirements, ansible and ara are still installed system-wide. Change-Id: Iac04082ad39a9d823c515ba11c5db9af50ed225f Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
Michal Nasiadka authored
Depends-On: https://review.opendev.org/669315 Change-Id: I6946290cd890f74c59ed5394e8382a8b75c0c4cd
-
- Jul 09, 2019
-
-
Radosław Piliszek authored
Missed by me in a recent merge. TrivialFix Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com> Change-Id: I83b1e84a43f014ce20be8677868be3f66017e3c2
-
- Jul 04, 2019
-
-
Mark Goddard authored
This is the documented procedure. Change-Id: I09ca99e92b112621d66b564a88b13658632242f5
-
- Jul 03, 2019
-
-
Radosław Piliszek authored
Change-Id: I59a05e8a0a2656596d2cced61bd98f2aa790d60b Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jul 02, 2019
-
-
Radosław Piliszek authored
Otherwise ara had only the stderr part and logs only the stdout part which made ordered analysis harder. Additionally add -vvv for the bootstrap-servers run. Change-Id: Ia42ac9b90a17245e9df277c40bda24308ebcd11d Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jul 01, 2019
-
-
Radosław Piliszek authored
Some kolla-ansible jobs failed due to using external mirrors instead of local ones. This was due to not using the template override provided by kolla. This patch fixes that. Depends-On: https://review.opendev.org/668226 Change-Id: I27f714fdf05e521aa8ce25c5683a452ceb35eeb8 Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
Radosław Piliszek authored
Change-Id: Ifc898015b9b523ef4c50fc969e464f05762f2151 Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
Mark Goddard authored
This reverts commit 8ce5ffd0. Change-Id: I81ce7c007ff267ebbbb721bcdb7eebc0dd575bf8
-
- Jun 28, 2019
-
-
Mark Goddard authored
Previously we sourced this script in tests/deploy.sh, but this was recently changed. Following that change we lost the errexit setting, meaning we ignore errors in init-runonce. Adding errexit in the script itself means that all callers get error handling. Also log init-runonce output. TrivialFix Change-Id: I9b35bd5f0f76eec26ddd968d093a3a5fd55a7ce2
-
- Jun 27, 2019
-
-
Mark Goddard authored
These were not templated, so always evaluated to true. This shouldn't be causing any issues. Change-Id: I7b8e407e688ba201c4f7d1a94bbd41af0918e7df
-
- Jun 21, 2019
-
-
Radosław Piliszek authored
Docker registry being insecure is handled by docker_registry_insecure which is set to true by default when docker_registry is set. The removed code had no effect because docker_registry is not changed anyway for base (pre-upgrade) install. This change makes config more readable and also prevents a potential conflict with the zun profile if ever used in upgrade mode. Change-Id: I9b5ae8c5b534fa6cce9dbaca8af191e2ca79d19f Signed-off-by:
Radosław Piliszek <radoslaw.piliszek@gmail.com>
-
- Jun 16, 2019
-
-
Jeffrey Zhang authored
The nova-consoleauth service was deprecated during the Rocky release [1] and has not been necessary since unless you're using cells v1. As Kolla has never supported cells v1, which is finally being removed during Train [2], we can get ahead of the curve and stop deploying nova-consoleauth immediately. [1] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ Change-Id: I099080979f5497537e390f531005a517ab12aa7a
-
- Jun 11, 2019
-
-
Mark Goddard authored
Adds four new CI jobs for testing centos/ubuntu binary/source deploys with ironic enabled. These are run only when there are changes to the ironic role. Performs some simple testing by creating a node using the fake-hardware hardware type and creating a server. Change-Id: Ie669e57ce2af53257b4ca05f45193cb73f48827a Depends-On: https://review.opendev.org/664011
-
- Jun 07, 2019
-
-
Carlos Goncalves authored
The project has been retired and there will be no Train release [1]. This patch removes Neutron LBaaS support in Kolla. [1] https://review.opendev.org/#/c/658494/ Change-Id: Ic0d3da02b9556a34d8c27ca21a1ebb3af1f5d34c
-
Mark Goddard authored
This is useful when removing a container that is no longer supported. Change-Id: I08d79ce7dd2f3d11e466930de85412017cd5f747
-
- Jun 03, 2019
-
-
Mark Goddard authored
Add CI jobs for testing an upgrade of a multinode system with Ceph enabled. As for the existing upgrade job, we upgrade from the previous release to the current release. Change-Id: I931772ca4c63757769467a57c80dc0726a11167a Depends-On: https://review.opendev.org/658163
-
- May 31, 2019
-
-
Gaetan Trellu authored
Qinling is an OpenStack project to provide "Function as a Service". This project aims to provide a platform to support serverless functions. Change-Id: I239a0130f8c8b061b531dab530d65172b0914d7c Implements: blueprint ansible-qinling-support Story: 2005760 Task: 33468
-
- May 17, 2019
-
-
Mark Goddard authored
Right now every controller rotates fernet keys. This is nice because should any controller die, we know the remaining ones will rotate the keys. However, we are currently over-rotating the keys.

When we over-rotate keys, we get logs like this:

    This is not a recognized Fernet token <token> TokenNotFound

Most clients can recover and get a new token, but some clients (like Nova passing tokens to other services) can't do that because they don't have the password to regenerate a new token.

With three controllers, in the keystone-fernet crontab we see the once-a-day rotation correctly staggered across the three controllers:

    ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
    0 0 * * * /usr/bin/fernet-rotate.sh
    ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
    0 8 * * * /usr/bin/fernet-rotate.sh
    ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
    0 16 * * * /usr/bin/fernet-rotate.sh

Currently with three controllers we have this keystone config:

    [token]
    expiration = 86400 (although, keystone default is one hour)
    allow_expired_window = 172800 (this is the keystone default)

    [fernet_tokens]
    max_active_keys = 4

Currently, kolla-ansible configures key rotation according to the following:

    rotation_interval = token_expiration / num_hosts

This means we rotate keys more quickly the more hosts we have, which doesn't make much sense.

Keystone docs state:

    max_active_keys = ((token_expiration + allow_expired_window) / rotation_interval) + 2

For details see: https://docs.openstack.org/keystone/stein/admin/fernet-token-faq.html

Rotation is based on pushing out a staging key, so should any server start using that key, other servers will consider that valid. Then each server in turn starts using the staging key, each in turn demoting the existing primary key to a secondary key. Eventually you prune the secondary keys when there is no token in the wild that would need to be decrypted using that key. So this all makes sense.

This change adds new variables for fernet_token_allow_expired_window and fernet_key_rotation_interval, so that we can calculate the correct number of active keys. We now set the default rotation interval so as to minimise the number of active keys to 3 - one primary, one secondary, one buffer.

This change also fixes the fernet cron job generator, which was broken in the following cases:

* requesting an interval of more than 1 day resulted in no jobs
* requesting an interval of more than 60 minutes, unless an exact multiple of 60 minutes, resulted in no jobs

It should now be possible to request any interval up to a week divided by the number of hosts.

Change-Id: I10c82dc5f83653beb60ddb86d558c5602153341a
Closes-Bug: #1809469
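To make the arithmetic concrete, here is a small sketch that applies the Keystone formula quoted in this message to the values it mentions. The function name is hypothetical, and rounding the division up is an assumption for intervals that do not divide evenly.

```python
import math


def max_active_keys(token_expiration, allow_expired_window, rotation_interval):
    """Apply the formula from the Keystone fernet FAQ (all values in seconds).

    Rounding up is an assumption made for this sketch.
    """
    lifetime = token_expiration + allow_expired_window
    return math.ceil(lifetime / rotation_interval) + 2


token_expiration = 86400        # one day, as in the config shown above
allow_expired_window = 172800   # two days

# Old behaviour: rotation_interval = token_expiration / num_hosts, 3 hosts.
old_interval = token_expiration // 3
keys_needed_old = max_active_keys(token_expiration, allow_expired_window, old_interval)

# An interval equal to the full token lifetime gives the minimum under this formula.
new_interval = token_expiration + allow_expired_window
keys_needed_new = max_active_keys(token_expiration, allow_expired_window, new_interval)
```

Under these assumptions the old eight-hour interval would call for 11 active keys, well above the configured max_active_keys = 4, while an interval equal to the full token lifetime brings the requirement down to 3.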
-
Mark Goddard authored
Before making changes to this script, document its behaviour with a unit test. There are two major issues: * requesting an interval of more than 1 day results in no jobs * requesting an interval of more than 60 minutes, unless an exact multiple of 60 minutes, results in no jobs Change-Id: I655da1102dfb4ca12437b7db0b79c9a61568f79e Related-Bug: #1809469
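For reference, the simple once-a-day case, where each host rotates daily and the hosts are spread evenly over 24 hours, can be sketched as below. This is a hypothetical illustration of the staggering only; it is not the generator script under test and does not cover the longer or uneven intervals listed above.

```python
MINUTES_PER_DAY = 24 * 60


def staggered_daily_crontab(num_hosts, command="/usr/bin/fernet-rotate.sh"):
    """Return one daily crontab line per host, evenly spaced over the day."""
    lines = []
    for index in range(num_hosts):
        offset = index * MINUTES_PER_DAY // num_hosts  # minutes after midnight
        hour, minute = divmod(offset, 60)
        lines.append("%d %d * * * %s" % (minute, hour, command))
    return lines


crontab_lines = staggered_daily_crontab(3)
```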
-
- Apr 19, 2019
-
-
OpenDev Sysadmins authored
This commit was bulk generated and pushed by the OpenDev sysadmins as a part of the Git hosting and code review systems migration detailed in these mailing list posts: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003603.html http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004920.html Attempts have been made to correct repository namespaces and hostnames based on simple pattern matching, but it's possible some were updated incorrectly or missed entirely. Please reach out to us via the contact information listed at https://opendev.org/ with any questions you may have.
-
- Apr 14, 2019
-
-
Mark Goddard authored
Periodic jobs don't have zuul.change defined, since there is no change being tested. This causes an early failure when referencing zuul.change to set the image tag for built images. In periodic jobs we'll never need to build images because there is no dependent kolla change under test. Change-Id: I6d9d81cf17b7d0d7aaf87cd96418c904c46681f2
-
- Apr 10, 2019
-
-
Mark Goddard authored
During the Train cycle, Bifrost switched to using JSON-RPC by default for Ironic's internal communication [1], avoiding the need to install RabbitMQ. This simplifies things, so we may as well remove our custom configuration of RabbitMQ. [1] https://review.openstack.org/645093 Change-Id: I3107349530aa753d68fd59baaf13eb7dd5485ae6
-
- Apr 08, 2019
-
-
Mark Goddard authored
Make an early start on the TODOs for the Train cycle.

1. Remove the task that removes the vitrage_collector container, which was added in the Stein cycle to clean up this container which is no longer deployed.
2. Remove globals.yml configuration in CI to disable Heat for upgrade jobs. Heat is now enabled in the previous release (Stein).
3. Remove the deprecated variable cinder_iscsi_helper, which was renamed to cinder_target_helper in Stein.

Change-Id: I774bf395e0bdd4db9c20c6289a22cf059fa42e1a
-
- Apr 03, 2019
-
-
Mark Goddard authored
Typically, non-executable files should have permissions 660 or 600, and executable files and directories should have 770. All should be owned by the user and group given by the 'config_owner_user' and 'config_owner_group' variables. This change adds a script to check the owner and permissions of config files under /etc/kolla, and runs it at the end of CI jobs. Change-Id: Icdbabf36e284b9030017a0dc07b9dc81a37758ab Related-Bug: #1821579
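The permission rule described here can be sketched as a directory walk. This is an illustrative approximation rather than the script added by the change: the function name is hypothetical, the ownership check is left out, and symlinks are skipped for simplicity.

```python
import os
import stat

FILE_MODES = {0o600, 0o660}      # non-executable files
EXEC_OR_DIR_MODES = {0o770}      # executable files and directories


def find_bad_permissions(root):
    """Return sorted (path, mode) pairs that break the rule above."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.path.islink(path):
                continue  # symlinks are ignored in this sketch
            st = os.stat(path)
            mode = stat.S_IMODE(st.st_mode)
            is_exec_or_dir = stat.S_ISDIR(st.st_mode) or bool(mode & 0o111)
            allowed = EXEC_OR_DIR_MODES if is_exec_or_dir else FILE_MODES
            if mode not in allowed:
                bad.append((path, oct(mode)))
    return sorted(bad)
```

A CI step could call `find_bad_permissions("/etc/kolla")` and fail the job if the returned list is not empty.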
-
- Mar 27, 2019
-
-
Mark Goddard authored
This patch adds two new jobs: * kolla-ansible-centos-source-upgrade * kolla-ansible-ubuntu-source-upgrade These jobs first deploy a control plane using the previous release of Kolla Ansible, then upgrade to the current release. Because we can't change the branch of the git repository on the Zuul executor, we change the branch of the kolla-ansible repository on the primary node to the branch of the previous release, in this case stable/rocky. A new remote-template role has been added that supports generating templates using a remote template source, to generate config files using the previous kolla-ansible branch. If the change being tested depends on a kolla change for the current branch, then we build images. Rather than using the current kolla-ansible version to tag the images, we now tag them with change_<gerrit change ID>. This is because the version of kolla-ansible will change from the previous release to the current one as we upgrade the system. Finally, it should be noted that the 'previous_release' variable in the Zuul config needs to be updated with each release, since this sets the release of kolla-ansible that is installed initially. Depends-On: https://review.openstack.org/645089/ Depends-On: https://review.openstack.org/644250/ Depends-On: https://review.openstack.org/645816/ Depends-On: https://review.openstack.org/645840/ Change-Id: If301e0affcd55360fefe3b105f023ae5c47b0853
-
- Mar 21, 2019
-
-
Mark Goddard authored
Fixes a race condition where sometimes a volume would still be in the 'creating' state when trying to attach it to a server. Invalid volume: Volume <id> status must be available or downloading to reserve, but the current status is creating. Change-Id: I0687ddfd78c384650cb361ff07aa64c5c3806a93
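A common way to avoid this kind of race is to poll until the volume reports the expected status before attaching it. The sketch below shows that pattern with a caller-supplied status function. All names are hypothetical, and this commit message does not say how the actual test script waits.

```python
import time


def wait_for_status(get_status, wanted="available", timeout=60.0, interval=1.0):
    """Poll get_status() until it returns `wanted`, or raise on error/timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == wanted:
            return status
        if status == "error":
            raise RuntimeError("resource entered the error state")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "status is still %r after %.0f seconds" % (status, timeout)
            )
        time.sleep(interval)


# Usage with a fake status source standing in for a real volume lookup.
fake_statuses = iter(["creating", "creating", "available"])
final_status = wait_for_status(lambda: next(fake_statuses), interval=0)
```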
-