  1. Jul 23, 2019
  2. Jul 18, 2019
    • Do not recreate Blazar DB if using preconfigured · 7d284761
      Jason authored
      Most other services already gate the DB bootstrap operations with the
      'use_preconfigured_databases' variable; Blazar did not.
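      For illustration, the gating pattern looks roughly like this (a sketch
      only; everything except 'use_preconfigured_databases' is an assumed
      name):

        - name: Creating Blazar database
          kolla_toolbox:
            module_name: mysql_db
            module_args:
              name: "{{ blazar_database_name }}"   # assumed variable name
          run_once: True
          when: not use_preconfigured_databases | bool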
      
      Change-Id: I772b1cb92612c7e6936f052ed9947f93582f264c
      7d284761
    • Fix glance bootstrap with file backend · 1abd15d4
      Mark Goddard authored
      Change https://review.opendev.org/#/c/670247/ attempted to fix glance
      deployment with the file backend. However, it introduced a new bug by
      being stricter about only generating configuration where the container
      will be deployed. This can break the current method of running the
      glance bootstrap container on any host in the glance-api group, since
      the bootstrap container needs that configuration.
      
      This change only runs the bootstrap container on hosts in the
      glance_api_hosts list, which in the case of the file backend typically
      only contains one host.
      
      This change also fixes up some logic during rolling upgrade, where we
      might not generate new configuration for the bootstrap host.
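      As a rough sketch of the intended behaviour (task and variable names
      other than 'glance_api_hosts' are assumptions, not the exact patch):

        - name: Running Glance bootstrap container
          kolla_docker:
            action: start_container
            name: bootstrap_glance
            image: "{{ glance_api_image_full }}"   # assumed variable name
            detach: False
          when: inventory_hostname in glance_api_hosts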
      
      Change-Id: I83547cd83b06ddefb3a9e1f39844537bdb32bd7f
      Related-Bug: #1836151
      1abd15d4
  3. Jul 16, 2019
    • ceph-nfs: Add rpcbind to Ubuntu host bootstrap · efcaf400
      Michal Nasiadka authored
      * Ubuntu ships with nfs-ganesha 2.6.0, which performs an rpcbind UDP
      test on startup (this was fixed in later releases)
      * Add the rpcbind package to the kolla-ansible host bootstrap when
      ceph_nfs is enabled
      * Update Ceph deployment docs with a note
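      A minimal sketch of the second point above (module usage illustrative;
      'enable_ceph_nfs' is assumed to be the relevant flag):

        - name: Installing rpcbind for nfs-ganesha on Ubuntu hosts
          package:
            name: rpcbind
            state: present
          become: True
          when:
            - ansible_distribution == "Ubuntu"
            - enable_ceph_nfs | bool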
      
      Change-Id: Ic19264191a0ed418fa959fdc122cef543446fbe5
      efcaf400
  4. Jul 12, 2019
    • Fix ironic inspector iPXE boot with UEFI · 7b939756
      Mark Goddard authored
      The ironic inspector iPXE configuration includes the following kernel
      argument:
      
      initrd=agent.ramdisk
      
      However, the ramdisk is actually called ironic-agent.initramfs, so the
      argument should be:
      
      initrd=ironic-agent.initramfs
      
      In BIOS boot mode this does not cause a problem, but compute nodes
      with UEFI enabled appear to be stricter about this argument and fail
      to boot.
      
      Change-Id: Ic84f3b79fdd3cd1730ca2fb79c11c7a4e4d824de
      Closes-Bug: #1836375
      7b939756
    • During deploy, always sync DB · d5e5e885
      Mark Goddard authored
      A common class of problems goes like this:
      
      * kolla-ansible deploy
      * Hit a problem, often in ansible/roles/*/tasks/bootstrap.yml
      * Re-run kolla-ansible deploy
      * Service fails to start
      
      This happens because the DB is created during the first run, but for some
      reason we fail before performing the DB sync. This means that on the second run
      we don't include ansible/roles/*/tasks/bootstrap_service.yml because the DB
      already exists, and therefore still don't perform the DB sync. However this
      time, the command may complete without apparent error.
      
      We should be less conservative about when we perform the DB sync, and do it
      whenever it may be necessary. There is an argument for not doing the sync
      during a 'reconfigure' command, although we will not change that here.
      
      This change always performs the DB sync during the 'deploy' and
      'reconfigure' commands.
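      Sketched in Ansible terms (file and variable names assumed), the intent
      is roughly:

        # bootstrap.yml: create the database only on first deployment
        - name: Creating service database
          kolla_toolbox:
            module_name: mysql_db
            module_args:
              name: "{{ service_database_name }}"   # assumed variable name
          when: not use_preconfigured_databases | bool

        # deploy.yml: always run the DB sync, even if the database existed
        - include_tasks: bootstrap_service.yml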
      
      Change-Id: I82d30f3fcf325a3fdff3c59f19a1f88055b566cc
      Closes-Bug: #1823766
      Closes-Bug: #1797814
      d5e5e885
  5. Jul 11, 2019
    • Fix glance with file backend · 602f89ba
      Mark Goddard authored
      Since https://review.opendev.org/647699/, we lost the logic to only
      deploy glance-api on a single host when using the file backend.
      
      This code was always a bit custom, and would be better supported by
      using the 'host_in_groups' pattern we have in a few other places where a
      single group name does not describe the placement of containers for a
      service.
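      A sketch of the 'host_in_groups' pattern as used in service definitions
      (exact keys and variables are assumptions):

        glance_services:
          glance-api:
            container_name: glance_api
            group: glance-api
            host_in_groups: "{{ inventory_hostname in glance_api_hosts }}"
            enabled: True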
      
      Change-Id: I21ce4a3b0beee0009ac69fecd0ce24efebaf158d
      Closes-Bug: #1836151
      602f89ba
  6. Jul 10, 2019
  7. Jul 08, 2019
  8. Jul 05, 2019
    • Fixes for MariaDB bootstrap and recovery · 86f373a1
      Mark Goddard authored
      * Fix wsrep sequence number detection. Log message format is
        'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
        the UUID rather than the sequence number. This is as good as random.
      
      * Add become: true to log file reading and removal since
        I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
        'docker cp' command which creates it.
      
      * Don't run handlers during recovery. If the config files change we
        would end up restarting the cluster twice.
      
      * Wait for wsrep recovery container completion (don't detach). This
        avoids a potential race between wsrep recovery and the subsequent
        'stop_container'.
      
      * Finally, we now wait for the bootstrap host to report that it is in
        an OPERATIONAL state. Without this we can see errors where the
        MariaDB cluster is not ready when used by other services.
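      Illustrating the first point above, picking out the sequence number
      rather than the UUID could look like this (variable names are
      assumptions):

        - name: Extracting wsrep sequence number from recovery output
          set_fact:
            wsrep_seqno: "{{ recovery_log.stdout | regex_search('Recovered position: .*:(-?[0-9]+)', '\\1') | first }}"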
      
      Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
      Closes-Bug: #1834467
      86f373a1
  9. Jul 04, 2019
    • Deprecate Ceph deployment · e6d0e610
      Mark Goddard authored
      There are now several good tools for deploying Ceph, including Ceph
      Ansible and ceph-deploy. Maintaining our own Ceph deployment is a
      significant maintenance burden, and we should focus on our core mission
      of deploying OpenStack. Given that Ceph deployment is currently a
      significant part of Kolla Ansible, we will need a long deprecation
      period and a migration path to another tool.
      
      Change-Id: Ic603c85c04d8794580a19f9efaa7a8589565f4f6
      Partially-Implements: blueprint remove-ceph
      e6d0e610
    • Add parameters to configure number of processes and threads of horizon · dc3489df
      Christian Berendt authored
      Change-Id: Ib5490d504a5b7c9a37dda7babf1257aa661c11de
      dc3489df
    • Wait for all compute services before cell discovery · c38dd767
      Mark Goddard authored
      There is a race condition during nova deploy since we wait for at least
      one compute service to register itself before performing cells v2 host
      discovery.  It's quite possible that other compute nodes will not yet
      have registered and will therefore not be discovered. This leaves them
      not mapped into a cell, and results in the following error if the
      scheduler picks one when booting an instance:
      
      Host 'xyz' is not mapped to any cell
      
      The problem has been exacerbated by merging a fix [1][2] for a nova race
      condition, which disabled the dynamic periodic discovery mechanism in
      the nova scheduler.
      
      This change fixes the issue by waiting for all expected compute services
      to register themselves before performing host discovery. This includes
      both virtualised compute services and bare metal compute services.
      
      [1] https://bugs.launchpad.net/kolla-ansible/+bug/1832987
      [2] https://review.opendev.org/665554
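      A sketch of the waiting logic (the exact task differs; command usage
      and variable names such as 'expected_compute_service_hosts' are
      assumptions):

        - name: Waiting for all expected nova-compute services to register
          command: >
            docker exec kolla_toolbox openstack
            --os-interface internal
            compute service list --service nova-compute -f value -c Host
          register: compute_services
          changed_when: false
          retries: 20
          delay: 10
          until: compute_services.stdout_lines | length >= expected_compute_service_hosts | length
          run_once: True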
      
      Change-Id: I2915e2610e5c0b8d67412e7ec77f7575b8fe9921
      Closes-Bug: #1835002
      c38dd767
  10. Jul 02, 2019
    • Cloudkitty InfluxDB Storage backend via Kolla-ansible · 97cb30cd
      Rafael Weingärtner authored
      This proposal adds support to Kolla-Ansible for deploying the
      CloudKitty InfluxDB storage backend. InfluxDB support as a storage
      backend for CloudKitty was introduced by the following commit:
      https://github.com/openstack/cloudkitty/commit/c4758e78b49386145309a44623502f8095a2c7ee
      
      Problem Description
      ===================
      
      With the addition of support for InfluxDB in CloudKitty, which reaches
      general availability in the Stein release, we need a method to easily
      configure and support this storage backend via Kolla-Ansible.
      
      Kolla-ansible is already able to deploy and configure an InfluxDB
      system. Therefore, this proposal will use the InfluxDB deployment
      configured via Kolla-ansible to connect to CloudKitty and use it as a
      storage backend.
      
      If we do not provide a method for users (operators) to manage the
      CloudKitty storage backend via Kolla-Ansible, they have to apply these
      configurations manually (or via some other set of automated scripts),
      which creates a scattered set of configuration files and scripts with
      different versioning schemes and life cycles.
      
      Proposed Change
      ===============
      
      Architecture
      ------------
      
      We propose a flag that users can set to make Kolla-Ansible configure
      CloudKitty to use InfluxDB as the storage backend. When this flag is
      enabled, Kolla-Ansible will also enable the deployment of InfluxDB
      automatically.
      
      CloudKitty will be configured according to [1] and [2]. We will also
      expose the "retention_policy", "use_ssl", and "insecure" options to
      allow fine-grained configuration by operators. These options are only
      applied when set; otherwise, the default values/behaviour defined in
      CloudKitty are used. Moreover, when "use_ssl" is set to "true", the
      user will be able to set "cafile" to a custom trusted CA file. Again,
      if these variables are not set, the CloudKitty defaults are used.
      
      Implementation
      --------------
      We need to introduce a new variable called
      `cloudkitty_storage_backend`. Valid options are `sqlalchemy` or
      `influxdb`. The default value in Kolla-Ansible is `sqlalchemy` for
      backward compatibility. The first step is then to change the
      definition of the following variable:
      `/ansible/group_vars/all.yml:enable_influxdb: "{{ enable_monasca | bool }}"`
      
      We also need to enable InfluxDB when CloudKitty is configured to use
      it as the storage backend. Afterwards, we need to add tasks to the
      CloudKitty role that create the InfluxDB schema and render the
      configuration files accordingly.
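      For illustration, the resulting defaults could look like this (a sketch
      consistent with the description above, not the final patch):

        # ansible/group_vars/all.yml
        cloudkitty_storage_backend: "sqlalchemy"
        enable_influxdb: "{{ enable_monasca | bool or cloudkitty_storage_backend == 'influxdb' }}"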
      
      Alternatives
      ------------
      The alternative would be to execute the configurations manually or
      handle it via a different set of scripts and configurations files,
      which can become cumbersome with time.
      
      Security Impact
      ---------------
      None identified by the author of this spec
      
      Notifications Impact
      --------------------
      Operators that are already deploying CloudKitty with InfluxDB as
      storage backend would need to convert their configurations to
      Kolla-ansible (if they wish to adopt Kolla-ansible to execute these
      tasks).
      
      Also, deployments (OpenStack environments) that were created with
      Cloudkitty using storage v1 will need to migrate all of their data to
      V2 before enabling InfluxDB as the storage system.
      
      Other End User Impact
      ---------------------
      None.
      
      Performance Impact
      ------------------
      None.
      
      Other Deployer Impact
      ---------------------
      New configuration options will be available for CloudKitty.
      * cloudkitty_storage_backend
      * cloudkitty_influxdb_retention_policy
      * cloudkitty_influxdb_use_ssl
      * cloudkitty_influxdb_cafile
      * cloudkitty_influxdb_insecure_connections
      * cloudkitty_influxdb_name
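      Example usage in globals.yml (values are illustrative only):

        cloudkitty_storage_backend: "influxdb"
        cloudkitty_influxdb_name: "cloudkitty"
        cloudkitty_influxdb_retention_policy: "autogen"
        cloudkitty_influxdb_use_ssl: true
        cloudkitty_influxdb_cafile: "/etc/ssl/certs/ca-bundle.crt"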
      
      Developer Impact
      ----------------
      None
      
      Implementation
      ==============
      
      Assignee
      --------
      * `Rafael Weingärtner <rafaelweingartne>`
      
      Work Items
      ----------
       * Extend the InfluxDB "enable/disable" variable
       * Add new tasks to configure CloudKitty according to the new
       variables presented above
       * Write documentation and release notes
      
      Dependencies
      ============
      None
      
      Documentation Impact
      ====================
      New documentation for the feature.
      
      References
      ==========
      [1] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/storage.html#influxdb-v2`
      [2] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/collector.html#metric-collection`
      
      
      
      Change-Id: I65670cb827f8ca5f8529e1786ece635fe44475b0
      Signed-off-by: Rafael Weingärtner <rafael@apache.org>
      97cb30cd
    • Add upgrade-bifrost command · 9cac1137
      Mark Goddard authored
      This performs the same steps as deploy-bifrost, but first stops the
      bifrost services and container if they are running.
      
      This can help where a docker stop may lead to an ungraceful shutdown,
      possibly due to running multiple services in one container.
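      Roughly, the extra step before the regular deploy-bifrost flow is
      (container name and module usage are illustrative, not the exact
      patch):

        - name: Stopping bifrost deploy container
          kolla_docker:
            action: stop_container
            name: bifrost_deploy
          ignore_errors: true   # the container may not exist yet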
      
      Change-Id: I131ab3c0e850a1d7f5c814ab65385e3a03dfcc74
      Implements: blueprint bifrost-upgrade
      Closes-Bug: #1834332
      9cac1137
  11. Jul 01, 2019
    • Bump minimum Ansible version to 2.5 · 0a769dc3
      Mark Goddard authored
      This is necessary for some Ansible tests which were renamed in 2.5 -
      including 'version' and 'successful'.
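      For example, with Ansible >= 2.5 the renamed forms are used as tests
      (a minimal sketch):

        - command: /bin/true
          register: result

        - debug:
            msg: "new-style tests"
          when:
            - result is successful                           # formerly 'result | succeeded'
            - ansible_version.full is version('2.5', '>=')   # formerly the 'version_compare' filter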
      
      Change-Id: Iacf88ef5589c7571fcf56ba8b99d3dbe76975195
      0a769dc3
  12. Jun 28, 2019
    • Specify endpoint when creating monasca user · 9074da56
      Will Szumski authored
      otherwise I'm seeing:
      
      TASK [monasca : Creating the monasca agent user] ****************************************************************************************************************************
      fatal: [monitor1]: FAILED! => {"changed": false, "module_stderr": "Shared connection to 172.16.3.24 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  F
      ile \"/tmp/ansible_I0RmxQ/ansible_module_kolla_toolbox.py\", line 163, in <module>\r\n    main()\r\n  File \"/tmp/ansible_I0RmxQ/ansible_module_kolla_toolbox.py\", line 141,
       in main\r\n    output = client.exec_start(job)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/decorators.py\", line 19, in wrapped\r\n
          return f(self, resource_id, *args, **kwargs)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/api/exec_api.py\", line 165, in exec_start\r\
      n    return self._read_from_socket(res, stream, tty)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/api/client.py\", line 377, in _read_from_
      socket\r\n    return six.binary_type().join(gen)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 75, in frames_iter\r\
      n    n = next_frame_size(socket)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 62, in next_frame_size\r\n    data = read_exactly(socket, 8)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 47, in read_exactly\r\n    next_data = read(socket, n - len(data))\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 31, in read\r\n    return socket.recv(n)\r\nsocket.timeout: timed out\r\n", "msg": "MODULE FAILURE", "rc": 1}
      
      when the monitoring nodes aren't on the public API network.
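      A sketch of the fix (the parameter name is an assumption; the point is
      to direct the user-creation call at the internal endpoint rather than
      the public one, which the monitoring hosts cannot reach):

        - name: Creating the monasca agent user
          kolla_toolbox:
            module_name: os_user
            module_args:
              name: "{{ monasca_agent_user }}"          # assumed variable name
              password: "{{ monasca_agent_password }}"  # assumed variable name
              auth: "{{ openstack_auth }}"
              interface: internal    # assumed; also known as endpoint_type
          run_once: True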
      
      Change-Id: I7a93f69da0e02c9264da0b081d2e60626f899e3a
      9074da56
  13. Jun 27, 2019
    • Simplify handler conditionals · de00bf49
      Mark Goddard authored
      Currently, we have a lot of logic for checking if a handler should run,
      depending on whether config files have changed and whether the
      container configuration has changed. As rm_work pointed out during
      the recent haproxy refactor, these conditionals are typically
      unnecessary - we can rely on Ansible's handler notification system
      to only trigger handlers when they need to run. This removes a lot
      of error prone code.
      
      This patch removes conditional handler logic for all services. It is
      important to ensure that we do not notify handlers unnecessarily,
      because without these checks in place any notification will now
      trigger a restart of the containers.
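      The resulting pattern is plain Ansible notification, roughly (names
      are illustrative):

        - name: Copying over glance-api.conf
          template:
            src: glance-api.conf.j2
            dest: "{{ node_config_directory }}/glance-api/glance-api.conf"
          notify:
            - Restart glance-api container

        # handlers/main.yml
        - name: Restart glance-api container
          kolla_docker:
            action: recreate_or_restart_container
            name: glance_api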
      
      Implements: blueprint simplify-handlers
      
      Change-Id: I4f1aa03e9a9faaf8aecd556dfeafdb834042e4cd
      de00bf49
    • Add support for neutron custom dnsmasq.conf · a3f1ded3
      Christian Berendt authored
      Change-Id: Ia7041be384ac07d0a790c2c5c68b1b31ff0e567a
      a3f1ded3
    • Restart all nova services after upgrade · e6d2b922
      Mark Goddard authored
      During an upgrade, nova pins the version of RPC calls to the minimum
      seen across all services. This ensures that old services do not receive
      data they cannot handle. After the upgrade is complete, all nova
      services are supposed to be reloaded via SIGHUP to cause them to check
      again the RPC versions of services and use the new latest version which
      should now be supported by all running services.
      
      Due to a bug [1] in oslo.service, sending services SIGHUP is currently
      broken. We replaced the HUP with a restart for the nova_compute
      container for bug 1821362, but not for the other nova services. It
      seems we need to restart all nova services to allow the RPC version
      pin to be removed.
      
      Testing in a Queens to Rocky upgrade, we find the following in the logs:
      
      Automatically selected compute RPC version 5.0 from minimum service
      version 30
      
      However, the service version in Rocky is 35.
      
      There is a second issue in that it takes some time for the upgraded
      services to update the nova services database table with their new
      version. We need to wait until all nova-compute services have done this
      before the restart is performed, otherwise the RPC version cap will
      remain in place. There is currently no interface in nova available for
      checking these versions [2], so as a workaround we use a configurable
      delay with a default duration of 30 seconds. Testing showed it takes
      about 10 seconds for the version to be updated, so this gives us some
      headroom.
      
      This change restarts all nova services after an upgrade, after a 30
      second delay.
      
      [1] https://bugs.launchpad.net/oslo.service/+bug/1715374
      [2] https://bugs.launchpad.net/nova/+bug/1833542
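      A sketch of the resulting upgrade steps (variable and task names are
      assumptions):

        - name: Waiting for nova-compute services to update their service version
          pause:
            seconds: "{{ nova_service_version_delay | default(30) }}"   # assumed variable name
          run_once: True

        - name: Restarting nova services to lift the RPC version pin
          kolla_docker:
            action: restart_container
            name: "{{ item }}"
          with_items:
            - nova_api
            - nova_scheduler
            - nova_conductor
            - nova_compute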
      
      Change-Id: Ia6fc9011ee6f5461f40a1307b72709d769814a79
      Closes-Bug: #1833069
      Related-Bug: #1833542
      e6d2b922
    • Don't rotate keystone fernet keys during deploy · 09e29d0d
      Mark Goddard authored
      When running deploy or reconfigure for Keystone,
      ansible/roles/keystone/tasks/deploy.yml calls init_fernet.yml,
      which runs /usr/bin/fernet-rotate.sh, which calls keystone-manage
      fernet_rotate.
      
      This means that a token can become invalid if the operator runs
      deploy or reconfigure too often.
      
      This change splits out fernet-push.sh from the fernet-rotate.sh
      script, then calls fernet-push.sh after the fernet bootstrap
      performed in deploy.
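      A sketch of the resulting deploy-time step (the exec mechanism and
      task layout are assumptions; fernet-push.sh is the new script split
      out by this change):

        - name: Distributing fernet keys without rotating them
          command: docker exec keystone_fernet /usr/bin/fernet-push.sh
          run_once: True
          delegate_to: "{{ groups['keystone'][0] }}"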
      
      Change-Id: I824857ddfb1dd026f93994a4ac8db8f80e64072e
      Closes-Bug: #1833729
      09e29d0d
  14. Jun 26, 2019
  15. Jun 24, 2019
  16. Jun 21, 2019
  17. Jun 19, 2019
  18. Jun 18, 2019
    • Fix default deployment of freezer, use mariadb. · 10bf6b05
      Marek Svensson authored
      
      This change defaults freezer to use mariadb as the database backend
      and adds elasticsearch as an optional backend, since freezer requires
      elasticsearch version 2.3.0. The default elasticsearch in
      kolla-ansible is 5.6.x, which does not work with freezer.
      
      Added the needed options for the elasticsearch backend:
       - protocol
       - address
       - port
       - number of replicas
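      For illustration, the added options might be set like this (the
      variable names here are assumptions based on the list above):

        freezer_database_backend: "mariadb"       # or "elasticsearch"
        freezer_es_protocol: "http"
        freezer_es_address: "{{ kolla_internal_fqdn }}"
        freezer_es_port: "9200"
        freezer_es_number_of_replicas: "1"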
      
      Change-Id: I88616c285bdb297fd1f738846ddffe1b08a7a827
      Signed-off-by: Marek Svensson <marek@marex.st>
      10bf6b05
    • Format internal Fluentd logs · d89a89c2
      Doug Szumski authored
      This change formats internal Fluentd logs in a similar way to other
      logs. It makes it easier for a user to identify issues with Fluentd
      parsing logs. Any failure to parse a log will be ingested into the
      logging framework and can easily be located by searching for
      'pattern not match' or by filtering for Fluentd log warnings.
      
      Change-Id: Iea6d12c07a2f4152f2038d3de2ef589479b3332b
      d89a89c2
    • Fix the redis_connection_string for osprofiler and make it generic · cd836dd3
      ZijianGuo authored
      
      * When using Redis as the osprofiler backend, it cannot connect to
      Redis because the redis_connection_string is incorrect.
      
      * Let other places that use Redis also use this variable.
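      For illustration only (the exact variable layout is an assumption), a
      generic connection string of the redis://host:port form used by Redis
      clients looks like:

        redis_connection_string: "redis://{{ kolla_internal_vip_address }}:{{ redis_port }}"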
      
      Change-Id: I14de6597932d05cd7f804a35c6764ba4ae9087cd
      Closes-Bug: #1833200
      Signed-off-by: ZijianGuo <guozijn@gmail.com>
      cd836dd3
    • Don't drop unmatched Kolla service logs · cfeb9dd9
      Doug Szumski authored
      Kolla service logs which don't match a Fluentd rewriterule get dropped.
      This change prevents that by tagging them with 'unmatched'.
      
      Change-Id: I0a2484d878d5c86977fb232a57c52f874ca7a34c
      cfeb9dd9
    • Increase log coverage for Monasca · cb404743
      Doug Szumski authored
      Monasca Python service logs prior to this change were being dropped
      due to missing entries in the Fluent record_transformer config file.
      This change adds support for ingesting those logs, and explicitly
      removes support for ingesting Monasca Log API logs to reduce the risk
      of feedback, for example if debug logging is turned on in the Monasca
      Log API.
      
      Change-Id: I9e3436a8f946873867900eed5ff0643d84584358
      cb404743
    • Ingest non-standard Monasca logs · 4b31fdcf
      Doug Szumski authored
      Presently, errors can appear in Fluentd and Monasca Log API logs due
      to log output from some Monasca services, which do not use Oslo log,
      being processed alongside other OpenStack logs which do.
      
      This change parses these log files separately to prevent these errors.
      
      Change-Id: Ie3cbb51424989b01727b5ebaaeba032767073462
      4b31fdcf
    • Make Ceph upgrade check Ceph release to avoid EPERM · 0ea991e4
      Radosław Piliszek authored
      
      Since we have different upgrade paths, we must use the actually
      installed Ceph release name when running require-osd-release.
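      A sketch of the idea ('ceph osd require-osd-release' is the real Ceph
      command; how the installed release name is obtained here is assumed):

        # 'ceph_release' is assumed to hold the installed release name,
        # e.g. 'luminous' or 'mimic', discovered from the running cluster.
        - name: Setting require-osd-release to the installed Ceph release
          command: docker exec ceph_mon ceph osd require-osd-release {{ ceph_release }}
          run_once: True
          delegate_to: "{{ groups['ceph-mon'][0] }}"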
      
      Closes-Bug: #1832989
      
      Change-Id: I6aaa4b4ac0fb739f7ad885c13f55b6db969996a2
      Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
      0ea991e4
  19. Jun 17, 2019