  1. Sep 21, 2021
  2. Aug 10, 2021
    • Refactor and optimise image pulling · 9ff2ecb0
      Radosław Piliszek authored
      We get a nice optimisation by using a filtered loop instead
      of skipping tasks per service with 'when'.
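      
      A minimal sketch of the pattern, with a hypothetical service
      dict (the role's real data structure may differ):
      
        # Before: every service is visited, then skipped via 'when'
        - name: Pull images
          kolla_docker:
            action: pull_image
            image: "{{ item.value.image }}"
          when: item.value.enabled | bool
          with_dict: "{{ services }}"
      
        # After: disabled services are filtered out before looping
        - name: Pull images
          kolla_docker:
            action: pull_image
            image: "{{ item.value.image }}"
          with_dict: "{{ services | dict2items
                         | selectattr('value.enabled')
                         | items2dict }}"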
      
      Partially-Implements: blueprint performance-improvements
      Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
  3. Jun 23, 2021
    • Use ansible_facts to reference facts · ade5bfa3
      Mark Goddard authored
      By default, Ansible injects a variable for every fact, prefixed with
      ansible_. This can result in a large number of variables for each host,
      which at scale can incur a performance penalty. Ansible provides a
      configuration option [0] that can be set to False to prevent this
      injection of facts. In this case, facts should be referenced via
      ansible_facts.<fact>.
      
      This change updates all references to Ansible facts within Kolla Ansible
      from using individual fact variables to using the items in the
      ansible_facts dictionary. This allows users to disable fact variable
      injection in their Ansible configuration, which may provide some
      performance improvement.
      
      This change disables fact variable injection in the ansible
      configuration used in CI, to catch any attempts to use the injected
      variables.
      
      [0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars
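      
      A minimal sketch of both sides of the change (module and fact
      chosen for illustration):
      
        # ansible.cfg
        [defaults]
        inject_facts_as_vars = False
      
        # Before: injected per-fact variable
        - debug:
            msg: "{{ ansible_os_family }}"
      
        # After: the same fact via the ansible_facts dictionary
        - debug:
            msg: "{{ ansible_facts.os_family }}"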
      
      Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
      Partially-Implements: blueprint performance-improvements
  4. May 28, 2021
  5. Apr 15, 2021
  6. Apr 14, 2021
  7. Apr 07, 2021
    • Refactor mariadb to support shards · 09b3c6ca
      Michal Arbet authored
      
      Kolla-ansible currently installs a mariadb
      cluster on the hosts defined in group['mariadb']
      and renders the haproxy configuration for these hosts.
      
      This is not enough if a user wants to have several
      service databases in several mariadb clusters (shards).
      
      Spreading service databases across multiple clusters (shards)
      is useful especially for databases with high load
      (neutron, nova).
      
      How does it work?
      
      It works exactly the same as before, but the 'mariadb' group
      is now used as the group where all mariadb clusters (shards)
      are located, and the mariadb clusters are installed into
      dynamic groups created by group_by from the host variable
      'mariadb_shard_id'.
      
      It also adds a special user 'shard_X' which will be used
      for creating users and databases, but only if haproxy
      is not used as the load-balancing solution.
      
      This patch will not affect users who have all databases
      on the same db cluster on hosts in the 'mariadb' group; the
      host variable 'mariadb_shard_id' is set to 0 if not defined.
      
      The mariadb task in loadbalancer.yml (haproxy) configures the
      default shard's hosts as haproxy backends. If the mariadb role
      is used to install several clusters (shards), only the default
      one is load-balanced via haproxy.
      
      Mariadb backup works only for the default shard (cluster)
      when haproxy is used as the mariadb load balancer; if proxysql
      is used, all shards are backed up.
      
      Once this patch is merged, it will open the way for proxysql
      patches implementing L7 SQL balancing based on
      users and schemas.
      
      Example of inventory:
      
      [mariadb]
      server1
      server2
      server3 mariadb_shard_id=1
      server4 mariadb_shard_id=1
      server5 mariadb_shard_id=2
      server6 mariadb_shard_id=3
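      
      A minimal sketch of the dynamic grouping described above (the
      group name prefix is hypothetical):
      
        - name: Group hosts by MariaDB shard
          group_by:
            key: "mariadb_shard_{{ mariadb_shard_id | default(0) }}"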
      
      Extra:
      wait_for_loadbalancer is removed rather than modified, as its
      role is already served by the existing check. The relevant
      refactor is applied as well.
      
      Change-Id: I933067f22ecabc03247ea42baf04f19100dffd08
      Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
  8. Mar 13, 2021
  9. Jan 29, 2021
    • Negative seqno needs to be considered when comparing seqnos · 068f3fea
      fudunwei authored
      Negative seqno values need to be considered in the comparison
      in some cases, but the task does not support that; we need to
      make it work.
      
      1. We use mariabackup to restore data on control1, delete the
      mariadb data on control2 and control3, and then use cluster
      recovery; as a result, the seqno of the other two nodes will
      be '-1'.
      
      2. We add one more control node to our existing mariadb cluster
      and then use cluster recovery; the seqno of the new node will
      be '-1'.
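      
      A hypothetical sketch of the needed comparison; casting to int
      makes '-1' sort below any non-negative seqno (variable names
      are illustrative, not the role's actual ones):
      
        - name: Find the highest seqno across nodes
          set_fact:
            highest_seqno: "{{ node_seqnos | map('int') | max }}"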
      
      Change-Id: Ic1ac8656f28c3835e091637014f075ac5479d390
  10. Jan 25, 2021
  11. Jan 21, 2021
  12. Dec 14, 2020
    • Revert "Performance: Use import_tasks in the main plays" · db4fc85c
      Mark Goddard authored
      This reverts commit 9cae59be.
      
      Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.
      
      Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
      Closes-Bug: #1906288
  13. Dec 10, 2020
    • Fix mariadb_recovery when mariadb container is missing · f903d774
      Mark Goddard authored
      Mariadb recovery fails if a cluster has previously been deployed, but any of
      the mariadb containers do not exist.
      
      Steps to reproduce
      ==================
      
      * Deploy a mariadb galera cluster
      * Remove the mariadb container from at least one host (docker rm -f mariadb)
      * Run kolla-ansible mariadb_recovery
      
      Expected results
      ================
      
      The cluster is recovered, and a new container deployed where necessary.
      
      Actual results
      ==============
      
      The task 'Stop MariaDB containers' fails on any host where the container does
      not exist.
      
      Solution
      ========
      
      This change fixes the issue by using the 'ignore_missing' flag for kolla_docker
      with the stop_container action. This means the task does not fail when the
      container does not exist. It is also necessary to swap some 'docker cp'
      commands for 'cp' on the host, using the path to the volume.
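      
      A minimal sketch of the stop task after the fix:
      
        - name: Stop MariaDB containers
          kolla_docker:
            action: stop_container
            name: "mariadb"
            ignore_missing: true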
      
      Closes-Bug: #1907658
      
      Change-Id: Ibd4a6adeb8443e12c45cbab65f501392ffb16fc7
  14. Nov 08, 2020
  15. Oct 27, 2020
    • Performance: Use import_tasks in the main plays · 9cae59be
      Radosław Piliszek authored
      Main plays are action-redirect-stubs, ideal for import_tasks.
      
      This avoids the 'include' penalty and makes logs/ara look nicer.
      
      It also fixes haproxy and rabbitmq so that they do not check
      the host group.
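      
      A minimal sketch of the swap (kolla_action is resolvable at
      parse time because it is passed as an extra var):
      
        # Before: dynamic include, evaluated per host at runtime
        - include_tasks: "{{ kolla_action }}.yml"
      
        # After: static import, resolved once when the play is parsed
        - import_tasks: "{{ kolla_action }}.yml"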
      
      Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
      Partially-Implements: blueprint performance-improvements
  16. Oct 12, 2020
    • Performance: optimize genconfig · 3411b9e4
      Radosław Piliszek authored
      Config plays do not need to check containers. This avoids skipping
      tasks during the genconfig action.
      
      Ironic and Glance rolling upgrades are handled specially.
      
      Swift and Bifrost do not use the handlers at all.
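      
      A hypothetical sketch of the kind of guard this implies (the
      actual change may be structured differently):
      
        - import_tasks: check-containers.yml
          when: kolla_action != 'config'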
      
      Partially-Implements: blueprint performance-improvements
      Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
  17. Sep 29, 2020
    • Fix invalid mariadb log options · 414677a6
      Isaac Prior authored
      Trivial: log-error & log-bin are both invalid mariadb config options.
      The appropriate options are log_error & log_bin.
      Note: this change is mostly unnecessary, as log_error is provided
      via the CLI and the log_bin value is the default.
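      
      A minimal sketch of the corrected options (values illustrative):
      
        # galera.cnf
        [mysqld]
        log_error = /var/log/kolla/mariadb/mariadb.log
        log_bin = mysql-bin
      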
      Change-Id: If7051f7139a68864e599cccffaf17c21855fc4a8
  18. Sep 17, 2020
  19. Aug 28, 2020
  20. Aug 10, 2020
    • Mount /etc/timezone based on host OS · 146b00ef
      Mark Goddard authored
      Previously we mounted /etc/timezone if the kolla_base_distro is debian
      or ubuntu. This would fail prechecks if debian or ubuntu images were
      deployed on CentOS. While this is not a supported combination, for
      correctness we should fix the condition to reference the host OS rather
      than the container OS, since that is where the /etc/timezone file is
      located.
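      
      A minimal sketch of the corrected condition (exact fact usage
      assumed):
      
        # Before: keyed on the container image OS
        when: kolla_base_distro in ['debian', 'ubuntu']
      
        # After: keyed on the host OS, where /etc/timezone lives
        when: (ansible_facts.distribution | lower) in ['debian', 'ubuntu']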
      
      Change-Id: Ifc252ae793e6974356fcdca810b373f362d24ba5
      Closes-Bug: #1882553
  21. Jul 28, 2020
    • Performance: use import_tasks for check-containers.yml · 9702d4c3
      Mark Goddard authored
      Including tasks has a performance penalty when compared with importing
      tasks. If the include has a condition associated with it, then the
      overhead of the include may be lower than the overhead of skipping all
      imported tasks. In the case of the check-containers.yml include, the
      included file only has a single task, so the overhead of skipping this
      task will not be greater than the overhead of the task import. It
      therefore makes sense to switch to use import_tasks there.
      
      Partially-Implements: blueprint performance-improvements
      
      Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
  22. Jul 07, 2020
    • Performance: Run common role in a separate play · 56ae2db7
      Mark Goddard authored
      The common role was previously added as a dependency to all other roles.
      It would set a fact after running on a host to avoid running twice. This
      had the nice effect that deploying any service would automatically pull
      in the common services for that host. When using tags, any services with
      matching tags would also run the common role. This could be both
      surprising and sometimes useful.
      
      When using Ansible at large scale, there is a penalty associated with
      executing a task against a large number of hosts, even if it is skipped.
      The common role introduces some overhead, just in determining that it
      has already run.
      
      This change extracts the common role into a separate play, and removes
      the dependency on it from all other roles. New groups have been added
      for cron, fluentd, and kolla-toolbox, similar to other services. This
      changes the behaviour in the following ways:
      
      * The common role is now run for all hosts at the beginning, rather than
        prior to their first enabled service
      * Hosts must be in the necessary group for each of the common services
        in order to have that service deployed. This is mostly to avoid
        deploying on localhost or the deployment host
      * If tags are specified for another service e.g. nova, the common role
        will *not* automatically run for matching hosts. The common tag must
        be specified explicitly
      
      The last of these is probably the largest behaviour change. While it
      would be possible to determine which hosts should automatically run the
      common role, it would be quite complex, and would introduce some
      overhead that would probably negate the benefit of splitting out the
      common role.
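      
      A minimal sketch of the extracted play (group names as listed
      above; exact play layout assumed):
      
        - name: Apply role common
          hosts:
            - cron
            - fluentd
            - kolla-toolbox
          tags: common
          roles:
            - role: common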
      
      Partially-Implements: blueprint performance-improvements
      
      Change-Id: I6a4676bf6efeebc61383ec7a406db07c7a868b2a
  23. Jun 08, 2020
  24. May 20, 2020
  25. Apr 09, 2020
    • Introduce /etc/timezone to Debian/Ubuntu containers · 4b5df0d8
      Dincer Celik authored
      Some services look for /etc/timezone on Debian/Ubuntu, so we should
      introduce it to the containers.
      
      In addition, this adds prechecks for /etc/localtime and /etc/timezone.
      
      Closes-Bug: #1821592
      Change-Id: I9fef14643d1bcc7eee9547eb87fa1fb436d8a6b3
  26. Mar 25, 2020
    • mariadb container name variable · 8a206699
      LinPeiWen authored
      The mariadb container name is hard-coded in some places, even
      though the defaults directory defines it as the container_name
      variable. If the mariadb container_name variable is changed
      during deployment, those places do not pick up the new value
      and keep using the fixed name 'mariadb'.
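      
      A hypothetical sketch of the fix (the real lookup may differ):
      
        # Before: fixed name
        - kolla_docker:
            name: "mariadb"
      
        # After: honour the configurable container name
        - kolla_docker:
            name: "{{ mariadb_services['mariadb'].container_name }}"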
      
      Change-Id: Ie8efa509953d5efa5c3073c9b550be051a7f4f9b
  27. Mar 02, 2020
  28. Feb 28, 2020
    • Add Ansible group check to prechecks · 49fb55f1
      Mark Goddard authored
      We assume that all groups are present in the inventory, and quite
      obscure errors can result if any are not.
      
      This change adds a precheck that checks for the presence of all expected
      groups in the inventory for each service. It also introduces a common
      service-precheck role that we can use for other common prechecks.
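      
      A hypothetical sketch of such a precheck (variable and message
      wording invented):
      
        - name: Validate that service groups exist
          assert:
            that: item in groups
            fail_msg: "Group '{{ item }}' not found in the inventory"
          with_items: "{{ service_groups }}"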
      
      Change-Id: Ia0af1e7df4fff7f07cd6530e5b017db8fba530b3
      Partially-Implements: blueprint improve-prechecks
  29. Feb 19, 2020
  30. Feb 02, 2020
    • Followup on MariaDB handling fixes · 1ea029a9
      Radosław Piliszek authored
      This fixes issues reported by Mark:
      - possible failure with a 4-node cluster (however unlikely)
      - failure to stop all nodes from progressing when conditions are
        not valid (due to: "any_errors_fatal: False")
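      
      A minimal sketch of the second fix (play content illustrative):
      
        # With any_errors_fatal, a failure on any host aborts the
        # play for all hosts instead of letting the rest progress
        - name: Check MariaDB cluster preconditions
          hosts: mariadb
          any_errors_fatal: true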
      
      Change-Id: Ib6995bf4c99202c9813859b3d9e2f420448f0445
  31. Jan 15, 2020
    • Fix multiple issues with MariaDB handling · 9f14ad65
      Radosław Piliszek authored
      These affected both deploy (and reconfigure) and upgrade,
      resulting in WSREP issues, failed deploys or the need to
      recover the cluster.
      
      This patch makes sure k-a does not abruptly terminate
      nodes and thereby break the cluster.
      This is achieved by cleaner separation between stages
      (bootstrap, restart current, deploy new) and 3 phases
      for restarts (to keep the quorum).
      
      Upgrade actions, which operate on a healthy cluster,
      have moved to their own section.
      
      Service restart was refactored.
      
      We no longer rely on the master/slave distinction as
      all nodes are masters in Galera.
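      
      A hypothetical sketch of the quorum-keeping restart phases (the
      real change may batch hosts differently):
      
        # Restart at most a third of the nodes at a time so that a
        # Galera quorum always remains available
        - name: Restart MariaDB services
          hosts: mariadb
          serial: "33%"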
      
      Closes-bug: #1857908
      Closes-bug: #1859145
      Change-Id: I83600c69141714fc412df0976f49019a857655f5
  32. Jan 13, 2020
  33. Jan 10, 2020
    • CentOS 8: Support variable image tag suffix · 9755c924
      Mark Goddard authored
      For the CentOS 7 to 8 transition, we will have a period where both
      CentOS 7 and 8 images are available. We differentiate these images via a
      tag - the CentOS 8 images will have a tag of train-centos8 (or
      master-centos8 temporarily).
      
      To achieve this, and maintain backwards compatibility for the
      openstack_release variable, we introduce a new 'openstack_tag' variable.
      This variable is based on openstack_release, but has a suffix of
      'openstack_tag_suffix', which is empty except on CentOS 8 where it has a
      value of '-centos8'.
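      
      A minimal sketch of the variable composition (the suffix
      condition is an assumption):
      
        openstack_tag_suffix: "{{ '-centos8' if ansible_facts.distribution_major_version == '8' else '' }}"
        openstack_tag: "{{ openstack_release }}{{ openstack_tag_suffix }}"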
      
      Change-Id: I12ce4661afb3c255136cdc1aabe7cbd25560d625
      Partially-Implements: blueprint centos-rhel-8
  34. Jan 02, 2020
  35. Nov 22, 2019
    • Change local_action to delegate_to: localhost · 10099311
      Michal Nasiadka authored
      As part of the effort to implement Ansible code linting in CI
      (using ansible-lint) - we need to implement recommendations from
      ansible-lint output [1].
      
      One of them is to stop using local_action in favor of delegate_to -
      to increase readability and match the style of typical ansible
      tasks.
      
      [1]: https://review.opendev.org/694779/
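      
      A minimal sketch of the style change (module and arguments
      chosen for illustration):
      
        # Before
        - name: Run a command on the deploy host
          local_action: command echo hello
      
        # After
        - name: Run a command on the deploy host
          command: echo hello
          delegate_to: localhost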
      
      Partially implements: blueprint ansible-lint
      
      Change-Id: I46c259ddad5a6aaf9c7301e6c44cd8a1d5c457d3
  36. Nov 07, 2019
    • Fix restart policy after MariaDB recovery · f979ae1f
      Mark Goddard authored
      After performing a recovery of MariaDB, the mariadb containers are left
      without a restart policy. This leaves them unable to recover from the
      crash of a single galera node. There is another issue, in that the
      'master' node is left in a bootstrap configuration, with the
      --wsrep-new-cluster argument configured as BOOTSTRAP_ARGS.
      
      This change fixes these issues by removing the restart policy of 'no'
      from the 'slave' containers, and recreating the master container without
      the restart policy or bootstrap arguments.
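      
      A minimal sketch of the master-side fix (argument values
      assumed from the commit message):
      
        - name: Recreate MariaDB container without bootstrap arguments
          kolla_docker:
            action: recreate_or_restart_container
            name: "mariadb"
            restart_policy: unless-stopped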
      
      Change-Id: I36c875611931163ca2c29ae93b71d3af64cb197c
      Closes-Bug: #1851594
  37. Nov 04, 2019