Commits · 9503308a8787c136e7351e6801a565c5e00c313c · Very Demiurge Very Mindful / Kolla Ansible

Apr 06, 2022

Ironic: Support both plain PXE and iPXE · 9503308a

Depends-On: https://review.opendev.org/c/openstack/kolla/+/832163
Change-Id: Ia2dba1854e925041ae23c731273b810bb2d5ec30

9503308a

Mar 30, 2022

neutron: add ssh key · 7fcf3ca3

Michal Nasiadka authored 3 years ago

This key can be used in a networking-generic-switch scenario instead of
putting a cleartext password in ml2_conf.ini.

Change-Id: I10003e6526a55a97f22678ab81c411e4645c5157

7fcf3ca3

Mar 24, 2022

designate: allow designate_ns_record to be a list · f193d1af

Michał Nasiadka authored 3 years ago

Most real-world deployments have multiple backend DNS servers, so allow
all of them to be specified in the pool configuration.
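
As a rough illustration only (not the code from this change), accepting either a single value or a list can be sketched as below. The helper name `normalize_ns_records` and the example hostnames are hypothetical.

```python
from typing import List, Union


def normalize_ns_records(value: Union[str, List[str]]) -> List[str]:
    """Return NS records as a list, accepting a single string or a list."""
    if isinstance(value, str):
        # A lone string is treated as a one-element list.
        return [value]
    return list(value)


# Hypothetical hostnames, used only for demonstration.
print(normalize_ns_records("ns1.example.org."))
print(normalize_ns_records(["ns1.example.org.", "ns2.example.org."]))
```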

Change-Id: Ic9737d0446a807891b429f080ae1bf048a3c8e4a

f193d1af

Mar 17, 2022

ADD venus for kolla-ansible · 3ccb176f

jinyuanliu authored 3 years ago

This project [1] provides a one-stop solution for log collection,
cleaning, indexing, analysis, alarms, visualization, report generation
and other needs. It helps operators and maintainers quickly solve
retrieval problems, grasp the operational health of the platform, and
improve the level of platform management.

[1] https://wiki.openstack.org/wiki/Venus

Change-Id: If3562bbed6181002b76831bab54f863041c5a885

3ccb176f

Mar 10, 2022

libvirt: support SASL authentication · d2d4b53d

Mark Goddard authored 3 years ago

In Kolla Ansible OpenStack deployments, by default, libvirt is
configured to allow read-write access via an unauthenticated,
unencrypted TCP connection, using the internal API network. This is to
facilitate migration between hosts.

By default, Kolla Ansible does not use encryption for services on the
internal network (and did not support it until Ussuri). However, most
other services on the internal network are at least authenticated
(usually via passwords), ensuring that they cannot be used by anyone
with access to the network, unless they have credentials.

The main issue here is the lack of authentication. Any client with
access to the internal network is able to connect to the libvirt TCP
port and make arbitrary changes to the hypervisor. This could include
starting a VM, modifying an existing VM, etc. Given the flexibility of
the domain options, it could be seen as equivalent to having root access
to the hypervisor.

Kolla Ansible has supported libvirt TLS [1] since the Train release, using
client and server certificates for mutual authentication and encryption.
However, this feature is not enabled by default, and requires
certificates to be generated for each compute host.

This change adds support for libvirt SASL authentication, and enables it
by default. This provides a base level of security. Deployments requiring
further security should use libvirt TLS.

[1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls

Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021
Closes-Bug: #1964013
Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e

d2d4b53d

Mar 08, 2022

Adds etcd endpoints as a Prometheus scrape target · 0f2794a0

Nathan Taylor authored 3 years ago

Add "enable_prometheus_etcd_integration" configuration parameter which
can be used to configure Prometheus to scrape etcd metrics endpoints.
The default value of "enable_prometheus_etcd_integration" is set to
the combined values of "enable_prometheus" and "enable_etcd".
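
A minimal sketch of that default, assuming "combined" means a logical AND of the two flags (the message does not spell this out). The function name is hypothetical.

```python
def default_etcd_integration(enable_prometheus: bool, enable_etcd: bool) -> bool:
    """Assumed default: scrape etcd only when both services are enabled."""
    return enable_prometheus and enable_etcd


# An operator could still override the derived value explicitly.
print(default_etcd_integration(True, True))
print(default_etcd_integration(True, False))
```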

Change-Id: I7a0b802c5687e2d508e06baf55e355d9761e806f

0f2794a0

Feb 25, 2022

Enable Ironic iPXE support by default · baeca81a

Radosław Piliszek authored 3 years ago

In Yoga, Ironic changed its default from plain PXE to iPXE. Kolla
Ansible supports either one or the other, and we tend to stick to
upstream defaults, so this change enables iPXE instead of plain PXE
by default. Users can change back, but they need to take one other
action, so it is good to remind them via upgrade notes either way.

Change-Id: If14ec83670d2212906c6e22c7013c475f3c4748a

baeca81a

Feb 18, 2022

Add support for VMware First Class Disk (FCD) · 812e03f7

alecorps authored 3 years ago

An FCD, also known as an Improved Virtual Disk (IVD) or
Managed Virtual Disk, is a named virtual disk independent of
a virtual machine. Using FCDs for Cinder volumes eliminates
the need for shadow virtual machines.
This patch adds Kolla support.

Change-Id: Ic0b66269e6d32762e786c95cf6da78cb201d2765

812e03f7

Allow to define extra parameters for Prometheus exporters · dcba8297

Pierre Riteau authored 3 years ago

The following variables are added:

* prometheus_blackbox_exporter_cmdline_extras
* prometheus_elasticsearch_exporter_cmdline_extras
* prometheus_haproxy_exporter_cmdline_extras
* prometheus_memcached_exporter_cmdline_extras
* prometheus_mysqld_exporter_cmdline_extras
* prometheus_node_exporter_cmdline_extras
* prometheus_openstack_exporter_cmdline_extras

Change-Id: I5da2031b9367115384045775c515628e2acb1aa4

dcba8297

Feb 17, 2022

Add support for VMware NSXP · 458c8b13

Alban Lecorps authored 3 years ago

NSXP is the OpenStack support for the NSX Policy platform.
Neutron has supported it since the Stein release. This patch
adds Kolla support.

This adds a new neutron_plugin_agent type 'vmware_nsxp'. The plugin
does not run any neutron agents.

Change-Id: I9e9d8f07e586bdc143d293e572031368af7f3fca

458c8b13

Jan 25, 2022

update the default value of node_custom_config · 825ef7ac

likui authored 3 years ago

The value of node_custom_config should be {{ node_config }}/config
when specified using --configdir.

Change-Id: I076b7d2c8980ddd3baa28f998f84a6b7005dc352

825ef7ac

Jan 05, 2022

Add support for deploying Prometheus libvirt exporter · 491d4184

Doug Szumski authored 6 years ago


Add support for deploying the Kolla Prometheus libvirt exporter image to
facilitate gathering metrics from the Nova libvirt service.

Co-Authored-by: Dr. Jens Harbott <harbott@osism.tech>
Change-Id: Ib27e60c39297b86ae674297370f9543ab08cda05
Partially-Implements: blueprint libvirt-exporter

491d4184

Dec 23, 2021

Deprecate storage_interface variable · 8cc56930

Radosław Piliszek authored 3 years ago

Per [1] and exchange on IRC.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026437.html

Change-Id: I322500e7204eb129d7bf085006627e8c4aaaa934

8cc56930

Dec 21, 2021

Drop vmtp · 0cbdedd0

Radosław Piliszek authored 3 years ago

Details in the attached reno.

Change-Id: I438a453ca522493524fdb9760c1edb330916084b

0cbdedd0

Nov 15, 2021

The deprecated iscsi deploy interface has been removed since xena · 42035e21

likui authored 3 years ago

[1] https://docs.openstack.org/releasenotes/ironic/xena.html

Change-Id: Ic0dd9fa7ef76b647682e124b1bae52e931a38225

42035e21

Oct 12, 2021

Add support for Ironic inspection through DHCP-relay · 37e4dba8

Maksim Malchuk authored 3 years ago

This change updates documentation, examples and tests to support
Ironic inspection through DHCP-relay. The dnsmasq service should be
configured with a more specific format, set in the variable
``ironic_dnsmasq_dhcp_range``. See the dnsmasq manual page [1].

[1] https://thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html



Change-Id: I9488a72db588e31289907668f1997596a8ccdec6
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>

37e4dba8

Sep 30, 2021

Remove chrony role from kolla · 1f71df1a

wu.chunyang authored 3 years ago


chrony is not supported in the Xena cycle, so remove it from kolla.

Moved tasks from chrony role to chrony-cleanup.yml playbook to avoid a
vestigial chrony role.

Co-Authored-By: Mark Goddard <mark@stackhpc.com>

Change-Id: I5a730d55afb49d517c85aeb9208188c81e2c84cf

1f71df1a

Add support for Ceph RadosGW integration · 8c5012e9

Mark Goddard authored 4 years ago

* Register Swift-compatible endpoints in Keystone
* Load balance across RadosGW API servers using HAProxy

The support is exercised in the cephadm CI jobs, but since RGW is
not currently enabled via cephadm, it is not yet tested.

https://docs.ceph.com/en/latest/radosgw/keystone/

Implements: blueprint ceph-rgw

Change-Id: I891c3ed4ed93512607afe65a42dd99596fd4dbf9

8c5012e9

Deploy source type images by default · 66c84843

Mark Goddard authored 3 years ago

Source images get the most test coverage, so it makes sense to deploy
these by default.

Change-Id: I8d0c8750e2c1600e84cc2e677a4eae0e9f502dac

66c84843

Aug 20, 2021

Never make Docker registry insecure by default · 802f7c62

Radosław Piliszek authored 3 years ago

To follow best security practices and help fellow operators.

More details inline and in the linked bug report.

Closes-Bug: #1940547
Change-Id: Ide9e9009a6e272f20a43319f27d257efdf315f68

802f7c62

Aug 17, 2021

Update Manila deploy steps for Wallaby · 8d5dde37

Skylar Kelty authored 3 years ago

Manila has changed from using subfolders to subvolumes.
We need a bit of a tidy up to prevent deploy errors.
This change also adds the ability to specify the ceph FS
Manila uses instead of relying on the default "first found".

Closes-Bug: #1938285
Closes-Bug: #1935784
Change-Id: I1d0d34919fbbe74a4022cd496bf84b8b764b5e0f

8d5dde37

Aug 09, 2021

Support monitoring Fluentd with Prometheus · b692ce7a

Doug Szumski authored 3 years ago

This patch adds support for integrating Prometheus with Fluentd.
This can be used to extract useful information about the status
of Fluentd, such as output buffer capacity and logging rate,
and also to extract metrics from logs via custom Fluentd
configuration. More information can be found in [1].

[1] https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus

Change-Id: I233d6dd744848ef1f1589a462dbf272ed0f3aaae

b692ce7a

Aug 05, 2021

Remove support for Prometheus v1 · 0d79d25f

Piotr Parczewski authored 3 years ago

Change-Id: I0d7c7f47e6653cf2903589a9c86798a8c6404af5

0d79d25f

Jul 28, 2021

Use more RMQ flags for less busy wait · d7cdad53

Radosław Piliszek authored 3 years ago

As mentioned in the Iced014acee7e590c10848e73feca166f48b622dc
commit message, in Ussuri+ we can use ``+sbwtdcpu none
+sbwtdio none`` as well. This is due to relying on RMQ-provided
Erlang version 23.x.

This change adds the extra arguments by default.
It should be backported down to Ussuri before we do a release with
Iced014acee7e590c10848e73feca166f48b622dc.

Change-Id: I32e247a6cb34d7f6763b544f247fd408dce2b3a2

d7cdad53

Jul 08, 2021

Reduce container metrics cardinality · c2ae21fd

Piotr Parczewski authored 3 years ago

Adds support for passing extra runtime options to cAdvisor.
By default, the new options stop cAdvisor from exporting rarely useful
metrics and labels. This helps reduce the load on Prometheus
and cAdvisor itself.

Change-Id: I81f3845d6cd03a70a0c8569f8d0ea421027df083

c2ae21fd

Jul 07, 2021

Remove tempest role · 52619984

wu.chunyang authored 3 years ago

Remove tempest role as planned

Change-Id: If3cf073e88c83f670c867a49afe48845f9e81008

52619984

Jul 02, 2021

Make setup module arguments configurable · 15f2fdcd

Rafael Weingärtner authored 4 years ago


Ansible facts can have a large impact on the performance of the Ansible
control host. This patch introduces some control over which facts are
gathered (kolla_ansible_setup_gather_subset) and which facts are stored
(kolla_ansible_setup_filter). By default we do not change the default
values of these arguments to the setup module. The flexibility of these
arguments is limited, but they do provide enough for a large performance
improvement in a typical moderate to large OpenStack cloud.

In particular, the large complex dict fact for each interface has a
large effect, and on an OpenStack controller or hypervisor there may be
many virtual interfaces. We can use the kolla_ansible_setup_filter
variable to help:

    kolla_ansible_setup_filter: 'ansible_[!qt]*'

This causes Ansible to collect but not store facts that fail to match
that pattern, which excludes the virtual interface facts. Currently we
are not referencing other facts excluded by the pattern within Kolla Ansible.
Note that including the 'ansible_' prefix causes meta facts module_setup
and gather_subset to be filtered, but this seems to be the only way to
get a good match on the interface facts. To work around this, we use
ansible_facts rather than module_setup to detect whether facts exist in
the cache.
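
The setup module's filter is a shell-style (fnmatch) pattern, so its effect can be approximated with Python's `fnmatch` module. The sketch below shows which names the pattern matches. The fact names are made-up examples, on the assumption that virtual interface facts start with `ansible_q` or `ansible_t` (for example qbr or tap devices).

```python
from fnmatch import fnmatchcase

# Pattern taken from the example value of kolla_ansible_setup_filter.
PATTERN = "ansible_[!qt]*"


def kept_facts(names):
    """Return the fact names that match the filter pattern."""
    return [name for name in names if fnmatchcase(name, PATTERN)]


# Hypothetical fact names, for illustration only.
facts = [
    "ansible_eth0",
    "ansible_hostname",
    "ansible_qbr1a2b3c4d_5e",
    "ansible_tap1a2b3c4d_5e",
    "module_setup",
]
print(kept_facts(facts))  # ['ansible_eth0', 'ansible_hostname']
```

Names without the `ansible_` prefix, such as `module_setup`, do not match either, which is why the cache check relies on ansible_facts instead.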

The exact improvement will vary, but has been reported to be as large as
18x on systems with many virtual interfaces.

For reference, here are some other tunings tried:

* Increased the number of forks (great speedup depending on the size of
  the deployment)
* Use `strategy = mitogen_linear` (cut processing time in half)
* Ansible caching (little speed up)
* SSH tuning (little speed up)

Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Closes-Bug: #1921538
Change-Id: Iae8ca4aae945892f1dc65e1b10381d2e26e88805

15f2fdcd

Jun 21, 2021

Drop support for Cinder ZFSSA backend · 0158221f

Radosław Piliszek authored 3 years ago

Following upstream which removed ZFSSA support in Ussuri [1].

[1] https://review.opendev.org/c/openstack/cinder/+/690137

Change-Id: Idb311e18b437fba696759ecb1cf2a6b4803aa5c5

0158221f

Jun 20, 2021

Revert "Reduce container metrics cardinality" · 640dbb03

Radosław Piliszek authored 3 years ago

This reverts commit c6259158.

Reason for revert: cAdvisor fails with:

invalid value "percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process" for flag -disable_metrics: unsupported metric "referenced_memory" specified in disable_metrics

Change-Id: I1a0eea5c20f95f38c707401b56b7d2454484377d

640dbb03

Jun 16, 2021

Reduce container metrics cardinality · c6259158

Piotr Parczewski authored 3 years ago

Adds support for passing extra runtime options to cAdvisor.
By default, the new options stop cAdvisor from exporting rarely useful
metrics and labels. This helps reduce the load on Prometheus
and cAdvisor itself.

Change-Id: Id0144e8fa518e3236cb94ba2e3961fb455d36443

c6259158

Remove rally deployment · 30091096

wu.chunyang authored 3 years ago

Remove rally role as planned

Change-Id: Ic898efe42b21b01c45d4621af2cf90ecd7afc398

30091096

Jun 11, 2021

Remove support for panko · ccf8cc5d

Matthias Runge authored 3 years ago

The project is deprecated and in the process of being removed
from OpenStack upstream.

Change-Id: I9d5ebed293a5fb25f4cd7daa473df152440e8b50

ccf8cc5d

Jun 07, 2021

Reduce RabbitMQ busy waiting, lowering CPU load · 70f6f8e4

John Garbutt authored 4 years ago

On machines with many cores, we were seeing excessive CPU load on systems
that were not very busy. With the following Erlang VM argument we saw
RabbitMQ CPU usage drop from about 150% to around 20%, on a system with
40 hyperthreads.

    +S 2:2

By default RabbitMQ starts N schedulers where N is the number of CPU
cores, including hyper-threaded cores. This is fine when you assume all
your CPUs are dedicated to RabbitMQ. It's not a good idea in a typical
Kolla Ansible setup. Here we go for two scheduler threads.
More details can be found here:
https://www.rabbitmq.com/runtime.html#scheduling
and here:
https://erlang.org/doc/man/erl.html#emulator-flags

    +sbwt none

This stops busy waiting of the scheduler, for more details see:
https://www.rabbitmq.com/runtime.html#busy-waiting
Newer versions of RabbitMQ may need additional flags:
"+sbwt none +sbwtdcpu none +sbwtdio none"
But this patch should be backportable to older versions of RabbitMQ
used in Train and Stein.
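
For illustration, the flag combinations discussed above can be assembled as follows. This is a sketch, not the template used by this change. The function name and the `extra_busy_wait_flags` switch are hypothetical; the message states only that newer versions may need the extra flags.

```python
def erl_scheduler_args(schedulers: int = 2, extra_busy_wait_flags: bool = False) -> str:
    """Build the Erlang VM arguments discussed in this commit message."""
    # "+S N:N" limits the scheduler threads; "+sbwt none" disables busy waiting.
    args = [f"+S {schedulers}:{schedulers}", "+sbwt none"]
    if extra_busy_wait_flags:
        # Extra flags mentioned for newer RabbitMQ/Erlang versions.
        args += ["+sbwtdcpu none", "+sbwtdio none"]
    return " ".join(args)


print(erl_scheduler_args())
print(erl_scheduler_args(extra_busy_wait_flags=True))
```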

Note that information on this tuning was found by looking at data from:
rabbitmq-diagnostics runtime_thread_stats
More details on that can be found here:
https://www.rabbitmq.com/runtime.html#thread-stats

Related-Bug: #1846467

Change-Id: Iced014acee7e590c10848e73feca166f48b622dc

70f6f8e4

May 11, 2021

Add ability to use the Neutron packet logging framework · e9232360

Florian LEDUC authored 5 years ago

* Enables the Neutron packet logging framework for OVS
(https://docs.openstack.org/neutron/latest/admin/config-logging.html).
* Adds a toggle variable "enable_neutron_packet_logging"

Change-Id: Ica3594cdac634b496949a06ed813dccd18090af4
Implements: blueprint neutron-log-service-plugin

e9232360

Apr 27, 2021

Remove Monasca Grafana service · 82cf40ed

Doug Szumski authored 3 years ago

In the Xena cycle it was decided to remove the Monasca
Grafana fork due to lack of maintenance. This commit removes
the service and provides a limited workaround using the
Monasca Grafana datasource with vanilla Grafana.

Depends-On: I9db7ec2df050fa20317d84f6cea40d1f5fd42e60
Change-Id: I4917ece1951084f6665722ba9a91d47764d3709a

82cf40ed

Apr 08, 2021

masakari: support host monitor · db517a44

Mark Goddard authored 4 years ago


Change-Id: I3f43df7766c57622ab8d01a759fbeeef0a0c2b93
Implements: blueprint masakari-hostmonitor
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>

db517a44

Add HAcluster Ansible role · 9f578c85

Gaëtan Trellu authored 5 years ago

Adds HAcluster Ansible role. This role contains a High Availability
clustering solution composed of Corosync, Pacemaker and Pacemaker Remote.

HAcluster is added as a helper role for Masakari, which requires it for
its host monitoring, making it possible to provide HA to instances on a
failed compute host.

Kolla hacluster images merged in [1].

[1] https://review.opendev.org/#/c/668765/



Change-Id: I91e5c1840ace8f567daf462c4eb3ec1f0c503823
Implements: blueprint ansible-pacemaker-support
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Co-Authored-By: Mark Goddard <mark@stackhpc.com>

9f578c85

Apr 06, 2021

Deprecate and disable chrony by default · b647cb41

Radosław Piliszek authored 4 years ago

Per [1].

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020707.html

Change-Id: Id6f3cd158bf5d01750971249b11364b6a8631789
Closes-Bug: #1885689

b647cb41

Mar 24, 2021

Add missing octavia-driver-agent · 7a066f71

Michal Nasiadka authored 4 years ago

To use 3rd party Octavia providers (such as the OVN provider), an
octavia-driver-agent container must be running to expose those providers
for use.

OVN CI job has been extended with deploying Octavia and testing OVN Load
Balancer.

Closes-Bug: #1903506
Depends-On: https://review.opendev.org/c/openstack/kolla/+/771191

Change-Id: Ibafa8b7307981f2a51e630cc113d18af6162171c

7a066f71

Mar 04, 2021

Add variable for changing Apache HTTP timeout · 647ff667

Doug Szumski authored 4 years ago

In services which use the Apache HTTP server to service HTTP requests,
there exists a TimeOut directive [1] which defaults to 60 seconds. APIs
which come under heavy load, such as Cinder, can sometimes exceed this
which results in an HTTP 504 Gateway timeout, or similar. However, the
request can still be serviced without error. For example, if Nova calls
the Cinder API to detach a volume, and this operation takes longer
than the shorter of the two timeouts, Nova will emit a stack trace
with a 504 Gateway timeout. At some time later, the request to detach
the volume will succeed. The Nova and Cinder DBs then become
out-of-sync with each other, and frequently DB surgery is required.
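
The failure mode can be modelled in a deliberately simplified way. The sketch assumes the caller sees a 504 once the operation outlasts the shorter of two timeouts, while the backend still completes the work. The function, the `other_timeout` parameter and the durations are all hypothetical.

```python
def caller_outcome(operation_seconds: float,
                   apache_timeout: float = 60.0,
                   other_timeout: float = 60.0) -> str:
    """Return what the calling service observes in this simplified model."""
    # The caller gives up at the shorter of the two timeouts.
    effective_timeout = min(apache_timeout, other_timeout)
    if operation_seconds > effective_timeout:
        # The backend keeps working, so the operation may still succeed later.
        return "504 Gateway timeout"
    return "ok"


# A 90 second detach fails for the caller under 60 second timeouts...
print(caller_outcome(90))
# ...but is reported correctly once both timeouts exceed the duration.
print(caller_outcome(90, apache_timeout=120, other_timeout=120))
```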

Although strictly this category of bugs should be fixed in OpenStack
services, it is not realistic to expect this to happen in the short
term. Therefore, this change makes it easier to set the Apache HTTP
timeout via a new variable.

An example of a related bug is here:

https://bugs.launchpad.net/nova/+bug/1888665

Whilst this timeout can currently be set by overriding the WSGI
config for individual services, this change makes it much easier.

Change-Id: Ie452516655cbd40d63bdad3635fd66693e40ce34
Closes-Bug: #1917648

647ff667