  1. Feb 15, 2024
  2. Feb 07, 2024
    • Michal Arbet's avatar
      Fix horizon deployment · 4108aea8
      Michal Arbet authored
       The new Horizon release uses [1] as its cache backend
       instead of [2], which was used in previous versions.
      
      This patch:
      
       1. Removes the override from the config and configures
          only the memcached endpoints, not the backend
          specification itself. This avoids future bugs
          should the BACKEND be switched again.
      
       2. Removes the 'memcached' context from the kolla_address
          filter and uses 'url' instead, as [1] does not support
          inet6:[{address}] for IPv6 but does support
          [{address}], which 'url' provides.
      
      [1] django.core.cache.backends.memcached.PyMemcacheCache
      [2] django.core.cache.backends.memcached.MemcachedCache
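A minimal sketch (not the actual kolla-ansible template; addresses are hypothetical) of what the resulting Django settings look like after this change: only the memcached endpoints are configured, and the backend specification is left to Horizon's own defaults.

```python
# Sketch of Horizon's Django CACHES setting as kolla-ansible now renders it.
CACHES = {
    "default": {
        # BACKEND is intentionally omitted: Horizon's own defaults select
        # PyMemcacheCache, so a future backend switch needs no change here.
        # IPv6 endpoints use the plain [{address}]:port form produced by the
        # 'url' filter, not inet6:[{address}], which PyMemcacheCache does
        # not understand.
        "LOCATION": ["[fd00::10]:11211", "[fd00::11]:11211"],
    }
}
```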
      
      Change-Id: Ie3a8f47e7b776b6aa2bb9b1522fdd4514ea1484b
      4108aea8
    • Michal Arbet's avatar
      Rework horizon role to support local_settings.d · b5aa63de
      Michal Arbet authored
       This patch implements Horizon's preferred way of
       configuring itself, as described in the docs [1].
      
      [1] https://docs.openstack.org/horizon/latest/configuration/settings.html
      
      Depends-On: https://review.opendev.org/c/openstack/kolla/+/906339
      Change-Id: I60ab4634bf4333c47d00b12fc4ec00570062bd18
      b5aa63de
    • Michal Nasiadka's avatar
      openvswitch: Set fail_mode to standalone for external bridges · 5016b3ef
      Michal Nasiadka authored
       That is the ovs-vsctl default, but the Ansible module fails in the
       reconfigure step, and 'secure' breaks external connectivity in
       OVN.
      
      From OVS docs:
      fail_mode: optional string, either secure or standalone
      
       When a controller is configured, it is, ordinarily, responsible
       for setting up all flows on the switch. Thus, if the connection
       to the controller fails, no new network connections can be set
       up. If the connection to the controller stays down long enough,
       no packets can pass through the switch at all. This setting
       determines the switch's response to such a situation. It may be
       set to one of the following:
       
       standalone
           If no message is received from the controller for three
           times the inactivity probe interval (see inactivity_probe),
           then Open vSwitch will take over responsibility for setting
           up flows. In this mode, Open vSwitch causes the bridge to
           act like an ordinary MAC-learning switch. Open vSwitch will
           continue to retry connecting to the controller in the
           background and, when the connection succeeds, it will
           discontinue its standalone behavior.
       
       secure
           Open vSwitch will not set up flows on its own when the
           controller connection fails or when no controllers are
           defined. The bridge will continue to retry connecting to
           any defined controllers forever.
       
       The default is standalone if the value is unset, but future
       versions of Open vSwitch may change the default.
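The equivalent manual configuration can be sketched with ovs-vsctl (the bridge name br-ex is illustrative):

```shell
# Set the external bridge back to the ovs-vsctl default fail mode.
ovs-vsctl set-fail-mode br-ex standalone
# Inspect the currently configured fail mode.
ovs-vsctl get-fail-mode br-ex
```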
      
      Change-Id: Ica4dda2914113e8f8349e7227161cb81a02b33ee
      5016b3ef
  3. Feb 05, 2024
  4. Jan 31, 2024
  5. Jan 30, 2024
  6. Jan 29, 2024
    • Alex-Welsh's avatar
      Update keystone service user passwords · ffd6e3bf
      Alex-Welsh authored
      Service user passwords will now be updated in keystone if services are
      reconfigured with new passwords set in config. This behaviour can be
      overridden.
      
      Closes-Bug: #2045990
      Change-Id: I91671dda2242255e789b521d19348b0cccec266f
      ffd6e3bf
  7. Jan 24, 2024
  8. Jan 17, 2024
    • Matt Crees's avatar
      Fix OpenSearch upgrade tasks idempotency · e502b65b
      Matt Crees authored
      Shard allocation is disabled at the start of the OpenSearch upgrade
      task. This is set as a transient setting, meaning it will be removed
       once the containers are restarted. However, if there is no change in
       the OpenSearch container, it will not be restarted, so the cluster is
       left in a broken state: unable to allocate shards.
      
      This patch moves the pre-upgrade tasks to within the handlers, so shard
      allocation and the flush are only performed when the OpenSearch
      container is going to be restarted.
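The transient setting in question can be sketched with the standard OpenSearch cluster settings API (endpoint illustrative); because it is transient, it disappears on restart, which is why it must be paired with an actual container restart:

```shell
# Disable shard allocation (except primaries) as a transient setting.
curl -XPUT http://localhost:9200/_cluster/settings \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.enable": "primaries"}}'
```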
      
      Closes-Bug: #2049512
      Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
      e502b65b
  9. Jan 11, 2024
  10. Jan 08, 2024
    • Pierre Riteau's avatar
      Fix Nova scp failures on Debian Bookworm · bfa9dd97
      Pierre Riteau authored
      The addition of an instance resize operation [1] to CI testing is
      triggering a failure in kolla-ansible-debian-ovn jobs, which are using a
      nodeset with multiple nodes:
      
          oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
          Command: scp -r /var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8_resize/disk 192.0.2.2:/var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8/disk
          Exit code: 255
          Stdout: ''
          Stderr: "Warning: Permanently added '[192.0.2.2]:8022' (ED25519) to the list of known hosts.\r\nsubsystem request failed on channel 0\r\nscp: Connection closed\r\n"
      
      This is not seen on Ubuntu Jammy, which uses OpenSSH 8.9, while Debian
      Bookworm uses OpenSSH 9.2. This is likely related to this change in
      OpenSSH 9.0 [2]:
      
          This release switches scp(1) from using the legacy scp/rcp protocol
          to using the SFTP protocol by default.
      
       Configure the sftp subsystem as on RHEL9 derivatives. Even though it
       is not yet required for Ubuntu, we configure it there too so we are
       ready for the Noble release.
      
      [1] https://review.opendev.org/c/openstack/kolla-ansible/+/904249
      [2] https://www.openssh.com/txt/release-9.0
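The fix amounts to an sshd_config fragment along these lines (the sftp-server paths are distro-specific and shown here as illustration):

```
# RHEL9 derivatives:
Subsystem sftp /usr/libexec/openssh/sftp-server
# Debian/Ubuntu typically use:
# Subsystem sftp /usr/lib/openssh/sftp-server
```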
      
      Closes-Bug: #2048700
      Change-Id: I9f1129136d7664d5cc3b57ae5f7e8d05c499a2a5
      bfa9dd97
    • Michal Arbet's avatar
      Enable glance proxying behaviour · 9ecfcf5a
      Michal Arbet authored
       This patch sets the URL to the Glance worker.
      If this is set, other glance workers will know how to contact this one
      directly if needed. For image import, a single worker stages the image
      and other workers need to be able to proxy the import request to the
      right one.
      
       With the current setup, Glance image import simply does not work.
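A sketch of the resulting glance-api.conf (host and port are illustrative), using Glance's worker_self_reference_url option:

```ini
[DEFAULT]
# Lets other Glance workers proxy image-import requests to the worker
# that staged the image.
worker_self_reference_url = http://192.0.2.10:9292
```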
      
      Closes-Bug: #2048525
      
      Change-Id: I4246dc8a80038358cd5b6e44e991b3e2ed72be0e
      9ecfcf5a
  11. Jan 05, 2024
    • Mark Goddard's avatar
      cadvisor: Set housekeeping interval to Prometheus scrape interval · 97e5c0e9
      Mark Goddard authored
      The prometheus_cadvisor container has high CPU usage. On various
      production systems I checked it sits around 13-16% on controllers,
      averaged over the prometheus 1m scrape interval. When viewed with top we
       can see that it is a bit spiky and can jump over 100%.
      
      There are various bugs about this, but I found
      https://github.com/google/cadvisor/issues/2523 which suggests reducing
      the per-container housekeeping interval. This defaults to 1s, which
      provides far greater granularity than we need with the default
      prometheus scrape interval of 60s.
      
      Reducing the housekeeping interval to 60s on a production controller
      reduced the CPU usage from 13% to 3.5% average. This still seems high,
      but is more reasonable.
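The change boils down to passing cadvisor's housekeeping flag, roughly as follows (aligning it with a 60s Prometheus scrape interval):

```shell
# Reduce per-container housekeeping from the 1s default to 60s.
cadvisor --housekeeping_interval=60s
```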
      
      Change-Id: I89c62a45b1f358aafadcc0317ce882f4609543e7
      Closes-Bug: #2048223
      97e5c0e9
    • Dawud's avatar
      Enable HAProxy Prometheus metrics endpoint · 140722f7
      Dawud authored
      
       HAProxy exposes a Prometheus metrics endpoint; it just needs to be
       enabled. Enable this and remove the configuration for
      prometheus-haproxy-exporter. Remaining prometheus-haproxy-exporter
      containers will automatically be removed.
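A minimal haproxy.cfg sketch of the built-in exporter (bind port illustrative; the prometheus-exporter service ships with HAProxy 2.0+):

```
frontend prometheus
    bind *:8405
    http-request use-service prometheus-exporter if { path /metrics }
```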
      
      Change-Id: If6e75691d2a996b06a9b95cb0aae772db54389fb
      Co-Authored-By: default avatarMatt Anson <matta@stackhpc.com>
      140722f7
    • Michal Arbet's avatar
      Fix long service restarts while using systemd · b1fd2b40
      Michal Arbet authored
       Some containers exit with code 143 instead of 0, but
       this is still OK. This patch simply allows
       exit code 143 (SIGTERM) as a fix. Details are in the
       bug report.
      
      Services which exited with 143 (SIGTERM):
      
      kolla-cron-container.service
      kolla-designate_producer-container.service
      kolla-keystone_fernet-container.service
      kolla-letsencrypt_lego-container.service
      kolla-magnum_api-container.service
      kolla-mariadb_clustercheck-container.service
      kolla-neutron_l3_agent-container.service
      kolla-openvswitch_db-container.service
      kolla-openvswitch_vswitchd-container.service
      kolla-proxysql-container.service
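The fix can be sketched as a systemd unit override (unit name illustrative) using the standard SuccessExitStatus directive:

```ini
[Service]
# Treat SIGTERM's exit code (143) as a clean stop in addition to 0.
SuccessExitStatus=143
```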
      
      Partial-Bug: #2048130
      Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
      b1fd2b40
  12. Jan 04, 2024
  13. Jan 03, 2024
  14. Jan 02, 2024
  15. Dec 28, 2023
  16. Dec 21, 2023
    • Doug Szumski's avatar
      Set a log retention policy for OpenSearch · 5e5a2dca
      Doug Szumski authored
      We previously used ElasticSearch Curator for managing log
      retention. Now that we have moved to OpenSearch, we can use
      the Index State Management (ISM) plugin which is bundled with
      OpenSearch.
      
      This change adds support for automating the configuration of
      the ISM plugin via the OpenSearch API. By default, it has
      similar behaviour to the previous ElasticSearch Curator
      default policy.
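An ISM policy applied through the OpenSearch API has roughly this shape (retention period and index pattern are illustrative, not the exact defaults this change ships):

```json
{
  "policy": {
    "description": "Illustrative retention policy: delete indices after 31 days",
    "default_state": "retain",
    "states": [
      {
        "name": "retain",
        "actions": [],
        "transitions": [
          {"state_name": "delete", "conditions": {"min_index_age": "31d"}}
        ]
      },
      {"name": "delete", "actions": [{"delete": {}}], "transitions": []}
    ],
    "ism_template": {"index_patterns": ["flog-*"]}
  }
}
```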
      
      Closes-Bug: #2047037
      
      Change-Id: I5c6d938f2bc380f1575ee4f16fe17c6dca37dcba
      5e5a2dca
    • Alex-Welsh's avatar
      Remove nova cell sync comment · e9e7362f
      Alex-Welsh authored
      Removed a comment suggesting we use nova-manage db sync --local_cell
      when bootstrapping the nova service, since that suggestion has now been
      implemented in Kolla. See [1] for more details.
      
      [1]: https://review.opendev.org/c/openstack/kolla/+/902057
      
      Related-Bug: #2045558
      Depends-On: Ic64eb51325b3503a14ebab9b9ff2f4d9caec734a
      Change-Id: I591f83c4886f5718e36011982c77c0ece6c4cbd7
      e9e7362f
  17. Dec 20, 2023
  18. Dec 19, 2023
  19. Dec 18, 2023
  20. Dec 14, 2023
  21. Dec 13, 2023
  22. Dec 05, 2023
    • Andrey Kurilin's avatar
      Fix broken list concatenation in horizon role · 97cd1731
      Andrey Kurilin authored
      
       Starting with ansible-core 2.13, the list concatenation format changed
       and no longer supports concatenation operations outside of the Jinja
       template.
      
      The format change:
      
        "[1] + {{ [2] }}" -> "{{ [1] + [2] }}"
      
      This affects the horizon role that iterates over existing policy files to
      override and concatenate them into a single variable.
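As a sketch (variable names are hypothetical, not the role's actual ones), the concatenation has to move entirely inside the Jinja template:

```yaml
# Old form, rejected by ansible-core >= 2.13:
# custom_policy_files: "existing_files + {{ [item] }}"

# New form, with the whole expression inside the template:
custom_policy_files: "{{ existing_files + [item] }}"
```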
      
      Co-Authored-By: default avatarDr. Jens Harbott <harbott@osism.tech>
      
      Closes-Bug: #2045660
      Change-Id: I91a2101ff26cb8568f4615b4cdca52dcf09e6978
      97cd1731
    • Mark Goddard's avatar
      Support Ansible max_fail_percentage · af6e1ca4
      Mark Goddard authored
       This allows execution to continue until a certain proportion of hosts
       fail. This can be useful at scale, where failures are common and
       restarting a deployment is time-consuming.
      
      The default max failure percentage is 100, keeping the default
      behaviour. A global max failure percentage may be set via
      kolla_max_fail_percentage, and individual services may define a max
      failure percentage via <service>_max_fail_percentage.
      
      Note that all hosts in the inventory must be reachable for fact
      gathering, even those not included in a --limit.
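In globals.yml this can be sketched as follows (the percentages and the nova example are illustrative, following the <service>_max_fail_percentage pattern described above):

```yaml
# Global limit: abort if more than 20% of hosts fail.
kolla_max_fail_percentage: 20
# Per-service override for nova.
nova_max_fail_percentage: 10
```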
      
      Closes-Bug: #1833737
      Change-Id: I808474a75c0f0e8b539dc0421374b06cea44be4f
      af6e1ca4
  23. Dec 02, 2023