- May 16, 2024
-
-
Christian Berendt authored
Also rename task to "Copying over custom pipeline.yaml file" for clarity. Change-Id: I04e3eb9620830a15781f9bab2549b557a9d1d9cb
-
- Jul 21, 2023
-
-
Doug Szumski authored
The OpenSearch Dashboards container does not have a health check defined when created. This causes the container to always restart when reconfigured, even if no change has been made. Change-Id: I0b437a77aeb61bc5ae9238f900a1fa00cbc34e18 Partial-Bug: #2028362
-
- Jun 28, 2023
-
-
Michal Nasiadka authored
Use case: exposing single external https frontend and load balancing services using FQDNs. Support different ports for internal and external endpoints. Introduced kolla_url filter to normalize urls like: - https://magnum.external:443/v1 - http://magnum.external:80/v1 Change-Id: I9fb03fe1cebce5c7198d523e015280c69f139cd0 Co-Authored-By:
Jakub Darmach <jakub@stackhpc.com>
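A minimal sketch of the kind of normalization described above, assuming that "normalize" means dropping a port that matches the scheme's default. The function name `normalize_url` is hypothetical and is not the actual `kolla_url` filter, whose signature is not shown in this log.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports per scheme; a URL that spells these out is treated as redundant.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Drop the port from a URL when it equals the scheme's default port.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals must be re-wrapped in brackets inside the netloc.
        host = f"[{host}]"
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

Under this assumption, both example URLs from the commit message lose their explicit port, while a URL on a non-default port is returned unchanged.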
-
Michal Nasiadka authored
We've seen issues in CI when the keepalived haproxy check script returns an error and keepalived switches to backup and then back to primary in a single-node environment. Closes-Bug: #2025219 Change-Id: Iba62e76b3cf83f3ade6df81288d2d77129ffc725
-
- Jun 21, 2023
-
-
Michal Arbet authored
This patch fixes an issue with Octavia security group rule creation when the Octavia management network uses an IPv6 configuration. Closes-Bug: #2023502 Change-Id: I3f8fbb0632ec6ecdc9f3820ebbcf01480de59e1f
-
- Jun 20, 2023
-
-
Dawud authored
Replaces the instance label on prometheus metrics with the inventory hostname as opposed to the ip address. The ip address is still used as the target address which means that there is no issue of the hostname being unresolvable. Can be optionally enabled or set to FQDNs by changing the prometheus_instance_label variable as mentioned in the release notes. Co-Authored-By:
Will Szumski <will@stackhpc.com> Change-Id: I387c9d8f5c01baf6054381834ecf4e554d0fff35
-
- Jun 17, 2023
-
-
Mark Goddard authored
Ansible 2.14.3 introduced a change that broke the method used for restarting MariaDB and RabbitMQ serially [1][2]. In I57425680a4cdbf0daeb9b2cc35920f1b933aa4a8 we limited to 2.14.2 to work around this. Ansible upstream claim this behaviour was unintentional, and will not fix it. This change moves to a different approach where we use separate plays with a 'serial' keyword to execute the restart. This change also removes the restriction on the maximum supported version of 2.14.2 on ansible-core - any 2.14 release is now supported. [1] https://github.com/ansible/ansible/commit/65366f663de7d044f42ae6dd53368fd4c1f88b35 [2] https://github.com/ansible/ansible/issues/80848 Depends-On: https://review.opendev.org/c/openstack/kolla/+/884208 Change-Id: I5a12670d07077d24047aaff57ce8d33ccf7156ff
-
- Jun 14, 2023
-
-
Michal Arbet authored
This patch adds an option to copy different Ceph configuration files and the corresponding keyrings for the cinder, glance, manila, gnocchi and nova services. This is especially useful when the deployment uses availability zones, as in the following example. - Each compute node can read from and write to the Ceph cluster in its own AZ. - Cinder can write to several Ceph clusters in several AZs. - Glance can use multistore and upload images to several Ceph clusters in several AZs at once. Change-Id: Ie4d8ab5a3df748137835cae1c943b9180cd10eb1
-
- Jun 12, 2023
-
-
Mathias Fechner authored
Fix permissions for opensearch-dashboard data directory. Closes-bug: #2020152 Change-Id: Ie4cec7649d89df5b8bb306563da2c62ea0cdd2c0 Signed-off-by:
Mathias Fechner <fechner@osism.tech>
-
- Jun 07, 2023
-
-
Maksim Malchuk authored
According to the documentation [1], the type of the Cyborg service should be 'accelerator' and its description 'Acceleration Service'. This change also fixes incorrect endpoint URLs and does not configure an admin endpoint [2], because the documentation [1] has not been updated yet. 1. https://docs.openstack.org/cyborg/latest/install/common.html 2. Icf3bf08deab2c445361f0a0124d87ad8b0e4e9d9 Closes-Bug: #2020080 Change-Id: I002db50cbad5a90e479498e605bdeab343e129c7 Signed-off-by:
Maksim Malchuk <maksim.malchuk@gmail.com>
-
- May 31, 2023
-
-
Maksim Malchuk authored
The kolla-genpwd, kolla-mergepwd, kolla-readpwd and kolla-writepwd commands now create or update passwords.yml with correct permissions. They also display a warning message about incorrect permissions. Closes-Bug: #2018338 Change-Id: I4b50053ced9150499d1d09fd4a0ec2e243cf938b Signed-off-by:
Maksim Malchuk <maksim.malchuk@gmail.com>
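The behaviour described above can be sketched roughly as follows. This is not the code from the change: the mode `0o600` is an assumption (the log does not state which permissions count as "correct"), and the helper names are hypothetical.

```python
import os
import stat

# Assumption: "correct permissions" means owner read/write only.
ASSUMED_MODE = 0o600


def write_private(path: str, data: str) -> None:
    """Create or overwrite a file so that only its owner can read or write it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, ASSUMED_MODE)
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    # os.open only applies the mode to newly created files, so tighten
    # the mode of a pre-existing file explicitly.
    os.chmod(path, ASSUMED_MODE)


def has_loose_permissions(path: str) -> bool:
    """Return True if the file grants any permission beyond ASSUMED_MODE."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & ~ASSUMED_MODE & 0o777)
```

A caller could use `has_loose_permissions` to decide whether to print a warning before reading an existing passwords file.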
-
- May 26, 2023
-
-
OpenStack Release Bot authored
Add file to the reno documentation build to show release notes for stable/2023.1. Use pbr instruction to increment the minor version number automatically so that master versions are higher than the versions on stable/2023.1. Sem-Ver: feature Change-Id: I870c0569a1e175ac5df59fc495812ba81c5147e6
-
- May 19, 2023
-
-
Michal Nasiadka authored
Depends-On: https://review.opendev.org/c/openstack/neutron/+/878535 Change-Id: I05d8b29b59a7de76da488f68775547a8f0f11d0f
-
- May 18, 2023
-
-
Michal Nasiadka authored
We limit to 2.14.2 due to a regression in ansible-core [1] that breaks conditional include_task loops in handlers. This is used for controlled restarts of MariaDB and RabbitMQ. [1]: https://github.com/ansible/ansible/commit/65366f663de7d044f42ae6dd53368fd4c1f88b35 Change-Id: I57425680a4cdbf0daeb9b2cc35920f1b933aa4a8 Co-Authored-By:
Michal Nasiadka <michal@stackhpc.com>
-
- May 16, 2023
-
-
Sean Mooney authored
As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova now requires the service_user section to be configured to address CVE-2023-2088. This change adds the service user section to the nova.conf template in the nova and nova-cell roles. Related-Bug: #2004555 Signed-off-by:
Sven Kieske <kieske@osism.tech> Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f (cherry picked from commit a77ea13ef1991543df29b7eea14b1f91ef26f858) (cherry picked from commit 03c12abbcc107bfec451f4558bc97d14facae01c) (cherry picked from commit cb105dc293ff1cdb11ab63fa3e3bf39fd17e0ee0) (cherry picked from commit efe6650d09441b02cf93738a94a59723d84c5b19)
-
- May 04, 2023
-
-
Matt Crees authored
The flags ``--db-nb-pid`` and ``--db-sb-pid`` are corrected to be ``--db-nb-pidfile`` and ``--db-sb-pidfile`` respectively. See here for reference: https://github.com/ovn-org/ovn/blob/6c6a7ad1c64a21923dc9b5bea7069fd88bcdd6a8/utilities/ovn-ctl#L1045 Closes-Bug: #2018436 Change-Id: Ic1e8768374566eb2198302807ecc644a19cd3062
-
- Apr 26, 2023
-
-
Sven Kieske authored
as agreed in the Kolla meeting: https://meetings.opendev.org/meetings/kolla/2023/kolla.2023-04-19-13.00.html Signed-off-by:
Sven Kieske <kieske@osism.tech> Change-Id: I099a5328e0837e1f5dcf7f21b7fd7bea1748456d
-
- Apr 20, 2023
-
-
Magnus Lööf authored
When using externally managed certificates, according to [1], one should set `kolla_externally_managed_cert: yes` and ensure that the certificates are in the correct place. However, RabbitMQ precheck still expects the certificates to be available on the controller node. This is incorrect. Fix by not running the tasks in question when `kolla_externally_managed_cert: yes` [1] https://docs.openstack.org/kolla-ansible/latest/admin/tls.html Closes-Bug: 1999081 Related-Bug: 1940286 Signed-off-by:
Magnus Lööf <magnus.loof@basalt.se> Change-Id: I9f845a7bdf5055165e199ab1887ed3ccbfb9d808
-
Dr. Jens Harbott authored
This reverts commit 9867060b. Reason for revert: seems this broke some jobs Change-Id: I1ca81214ece403351c0a522ea05bf07802e4c4c0
-
- Apr 17, 2023
-
-
Michal Arbet authored
This patch introduces a distributed lock for the masakari-api service, used when handling concurrent notifications for the same host failure from multiple masakari-hostmonitor services. Change-Id: I46985202dc8da22601357eefe2727599e7a413e5
-
- Apr 13, 2023
-
-
Michal Nasiadka authored
Change-Id: Ibc9cc91f64b0450de3cae6e2830b4ff2c52c0395
-
Matt Crees authored
With the addition of the variable `om_enable_rabbitmq_high_availability`, this feature in the upgrade task should be brought back. It is also now used in the deploy task. The `ha-all` policy is cleared only when `om_enable_rabbitmq_high_availability` is set to `false`. Change-Id: Ia056aa40e996b1f0fed43c0f672466c7e4a2f547
-
- Apr 12, 2023
-
-
Matt Crees authored
Puts the RabbitMQ node into maintenance mode before restarting the container. This will make the node shutdown less disruptive. For details on what maintenance mode does, see: https://www.rabbitmq.com/upgrade.html#maintenance-mode Change-Id: Ia61573f3fb95fe8fcde6b789ca77ef5b45fe0a65
-
Michal Nasiadka authored
Since RMQ 3.8 we can use rolling upgrade [1]. Depends-On: https://review.opendev.org/c/openstack/kolla/+/872393 [1]: https://www.rabbitmq.com/upgrade.html#rolling-upgrades Change-Id: If6a7c6c12d9226a2406728108b3c87b3485ac55f
-
- Apr 08, 2023
-
-
gamerslouis authored
Add a check for container readiness before creating the SASL user. Closes-Bug: #2015589 Change-Id: Ic650ba6be1f192e3cbeaa94de3d00507636c1c92
-
- Mar 29, 2023
-
-
Maksim Malchuk authored
Since CVE-2022-29404 was fixed [1,2], the default value for the LimitRequestBody directive in the Apache HTTP Server has been changed from 0 (unlimited) to 1 GiB. This limits the size of uploads in Horizon, for example images. This change adds the ability to configure the limit. 1. https://access.redhat.com/articles/6975397 2. https://ubuntu.com/security/CVE-2022-29404 Closes-Bug: #2012588 Change-Id: I4cd9dd088cbcf38ff6f8d188ebcc56be7d9ea1c9 Signed-off-by:
Maksim Malchuk <maksim.malchuk@gmail.com>
-
- Mar 28, 2023
-
-
Matt Crees authored
When upgrading Nova, we sometimes hit an error where an old hypervisor that hasn’t been upgraded recently (for example due to broken hardware) is preventing Nova API from starting properly. This can be detected using the tool ``nova-status upgrade check`` to make sure that there are no ``nova-compute`` that are older than N-1 releases. This is already used in the Kolla Ansible upgrade task for Nova. However, this task uses the current ``nova-api`` container, so computes which will be too old after the upgrade are not caught. This patch changes Kolla Ansible so that the upgraded ``nova-api`` image is used to run the upgrade checks, allowing computes that will be too old to be detected before the upgrades are performed. Depends-On: https://review.opendev.org/c/openstack/kolla/+/878744 Closes-Bug: #1957080 Co-Authored-By:
Pierre Riteau <pierre@stackhpc.com> Change-Id: I3a899411001834a0c88e37f45a756247ee11563d
-
- Mar 21, 2023
-
-
John Garbutt authored
Following ideas here: https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit Make sure old messages with no consumer are dropped after the message TTL of 10 mins, longer than the 1 min RPC timeout. Also ensure queues expire after an hour of inactivity, so queues from removed nodes or renamed nodes don't grow over time. Change-Id: Ifb28ac68b6328adb604a7474d01e5f7a47b2e788
-
Matt Crees authored
Adds two new flags to alter behaviour in RabbitMQ: * `rabbitmq_message_ttl_ms`, which lets you set a TTL on messages. * `rabbitmq_queue_expiry_ms`, which lets you set an expiry time on queues. See https://www.rabbitmq.com/ttl.html for more information on both. Change-Id: I51ca37ffbb1bb5c07f2d39873f0f33ca20263f2a
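As a rough illustration of how the two values might end up in a RabbitMQ policy definition, the sketch below builds the definition as a Python dict. The keys `message-ttl` and `expires` are the RabbitMQ policy keys for per-queue message TTL and queue expiry; the function name and the way unset values are skipped are assumptions, and the actual template in the role may look different.

```python
from typing import Dict, Optional

# Values taken from the commit messages above: 10 minutes and 1 hour.
TEN_MINUTES_MS = 10 * 60 * 1000
ONE_HOUR_MS = 60 * 60 * 1000


def build_policy_definition(
    message_ttl_ms: Optional[int] = None,
    queue_expiry_ms: Optional[int] = None,
) -> Dict[str, int]:
    """Build a RabbitMQ policy definition, omitting any setting left unset."""
    definition: Dict[str, int] = {}
    if message_ttl_ms is not None:
        # Messages with no consumer are dropped after this many milliseconds.
        definition["message-ttl"] = int(message_ttl_ms)
    if queue_expiry_ms is not None:
        # Unused queues are deleted after this many milliseconds.
        definition["expires"] = int(queue_expiry_ms)
    return definition
```

Calling `build_policy_definition(TEN_MINUTES_MS, ONE_HOUR_MS)` gives a definition with a 600000 ms message TTL and a 3600000 ms queue expiry.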
-
Matt Crees authored
Changes the default value of `rabbitmq-ha-promote-on-shutdown` to `"always"`. We are seeing issues with RabbitMQ automatically recovering when nodes are restarted. https://www.rabbitmq.com/ha.html#cluster-shutdown Rather than waiting for operator intervention, it is better to allow recovery to happen, even if that means we may lose some messages. A few failed and timed-out operations are better than a totally broken cloud. This is achieved using ha-promote-on-shutdown=always. Note, when a node failure is detected, this is already the default behaviour from 3.7.5 onwards: https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors Related-Bug: #1954925 Change-Id: I484a81163f703fa27112df22473d657e2a9ab964
-
- Mar 06, 2023
-
-
Christian Berendt authored
With the parameter ``mariadb_datadir_volume`` it is possible to use a directory as volume for the mariadb service. By default, a volume named mariadb is used (the previous default). Change-Id: Ic61fe981825c5fa6f50e53c9555b6a102f42f522
-
Christian Berendt authored
With the new ``neutron_ovn_availability_zones`` parameter it is possible to define network availability zones for OVN. Further details can be found in the Neutron OVN documentation: https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html#how-to-configure-it Change-Id: I203e0d400a3218d0b4a41f2a948207032c4febec
-
- Mar 02, 2023
-
-
Matthew N Heler authored
This allows services to work with etcd when coordination is enabled for TLS internal deployments. Without this fix, we fail to connect to etcd with the coordination backend and the service itself crashes. Change-Id: I0c1d6b87e663e48c15a846a2774b0a4531a3ca68
-
- Feb 14, 2023
-
-
Mark Goddard authored
Previously, when running either `kolla-ansible deploy --check` or `kolla-ansible genconfig --check`, deployment or configuration generation failed for various reasons: MariaDB failed to look up the existing cluster, Keystone failed to generate the cron config, and Nova-cell failed to get the cell settings. Closes-Bug: #2002661 Change-Id: I5e765f498ae86d213d0a4379ca5d473db1499962
-
John Garbutt authored
Currently we do not follow the RabbitMQ advice on replicas here: https://www.rabbitmq.com/ha.html#replication-factor Here we reduce the number of replicas to n // 2 + 1 as advised above. The hope is that this helps speed up recovery from RabbitMQ issues. Related-Bug: #1954925 Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096
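The n // 2 + 1 formula above is a simple majority of the cluster. A small worked sketch, with a hypothetical function name:

```python
def ha_replica_count(cluster_size: int) -> int:
    """Return the majority (quorum) replica count, n // 2 + 1, for n nodes."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1
```

For a three-node cluster this gives 2 replicas, and for a five-node cluster it gives 3, so each queue is mirrored to a majority of nodes rather than to all of them.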
-
Matt Crees authored
Adds the flag `rabbitmq_ha_replica_count` to change how many different nodes a queue should be mirrored across. If the value is not set, then it defaults to "ha-mode":"all". This value is unset by default to avoid any unexpected changes to the RabbitMQ definitions.json file, as that would trigger an unexpected restart of RabbitMQ during the next deploy. Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24
-
Will Szumski authored
Hardcoding the first etcd host creates a single point of failure. Change-Id: I0f83030fcd84ddcdc4bf2226e76605c7cab84cbb
-
- Feb 13, 2023
-
-
Will Szumski authored
etcd-compatible tooz drivers do not support multiple endpoints via backend_url. We can put a loadbalancer in front of etcd and configure backend_url to use the VIP instead. The issue with hard coding the first host is that we break coordination if we take this host offline. In the case of cinder, we would not be able to perform any volume related operations. Co-Authored-By:
Mark Goddard <mark@stackhpc.com> Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665
-
- Feb 09, 2023
-
-
John Garbutt authored
By default ha-promote-on-shutdown=when-synced. However, we are seeing issues with RabbitMQ automatically recovering when nodes are restarted. https://www.rabbitmq.com/ha.html#cluster-shutdown Rather than waiting for operator intervention, it is better to allow recovery to happen, even if that means we may lose some messages. A few failed and timed-out operations are better than a totally broken cloud. This is achieved using ha-promote-on-shutdown=always. Note, when a node failure is detected, this is already the default behaviour from 3.7.5 onwards: https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors This patch adds the option to change the ha-promote-on-shutdown definition, using the flag `rabbitmq_ha_promote_on_shutdown`. This value is unset by default to avoid any unexpected changes to the RabbitMQ definitions.json file, as that would trigger an unexpected restart of RabbitMQ during the next deploy. Related-Bug: #1954925 Change-Id: I2146bda2c72ddac2c9923c6941b0596395fd9ab5
-
- Feb 04, 2023
-
-
Michal Arbet authored
This patch fixes the kolla_docker module, which did not take the common_options parameter into account. The patchset shows that the module's default values were always used, even if the user overrode a parameter in the common_options dict. Closes-Bug: #2003079 Change-Id: I677fde708dd004decaff4bd39f2173d8d81052fb
-