  1. Apr 08, 2019
    • Mark Goddard's avatar
      Use ironic inspector 'dnsmasq' PXE filter by default · 86e83fae
      Mark Goddard authored
      With Docker CE, the daemon sets the default policy of the iptables
      FORWARD chain to DROP. This causes problems for provisioning bare metal
      servers when ironic inspector is used with the 'iptables' PXE filter.
      It's not entirely clear why these two things interact in this way,
      but switching to the 'dnsmasq' filter works around the issue, and is
      probably a good move anyway because it is more efficient.
      
      We have added a migration task here to flush and remove the ironic-inspector
      iptables chain since inspector does not do this itself currently.
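
      A minimal sketch of such a migration task, assuming the chain created
      by ironic inspector is named 'ironic-inspector' (the actual task may
      differ):

      - name: Flush and remove the ironic-inspector iptables chain (sketch)
        become: true
        command: "{{ item }}"
        with_items:
          # Hypothetical chain name; flush the chain, then delete it.
          - iptables -w -F ironic-inspector
          - iptables -w -X ironic-inspector
        failed_when: false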
      
      Change-Id: Iceed5a096819203eb2b92466d39575d3adf8e218
      Closes-Bug: #1823044
      86e83fae
  2. Apr 03, 2019
    • Mark Goddard's avatar
      Check configuration file permissions in CI · 8c4ab41f
      Mark Goddard authored
      Typically, non-executable files should have 660 or 600 permissions,
      and executable files and directories should have 770. All should be
      owned by the user and group set in the 'config_owner_user' and
      'config_owner_group' variables.
      
      This change adds a script to check the owner and permissions of config
      files under /etc/kolla, and runs it at the end of CI jobs.
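
      As a rough illustration of the kind of check involved, assuming
      'config_owner_user' holds the expected owner (the real CI check is a
      standalone script, not this task):

      - name: List /etc/kolla files not owned by the expected user (sketch)
        become: true
        command: find /etc/kolla ! -user "{{ config_owner_user }}"
        register: unexpected_owner
        changed_when: false
        failed_when: unexpected_owner.stdout != ""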
      
      Change-Id: Icdbabf36e284b9030017a0dc07b9dc81a37758ab
      Related-Bug: #1821579
      8c4ab41f
  3. Apr 02, 2019
  4. Apr 01, 2019
  5. Mar 29, 2019
  6. Mar 28, 2019
  7. Mar 27, 2019
    • Mark Goddard's avatar
      Test upgrades in CI · c23c9b2c
      Mark Goddard authored
      This patch adds two new jobs:
      
      * kolla-ansible-centos-source-upgrade
      * kolla-ansible-ubuntu-source-upgrade
      
      These jobs first deploy a control plane using the previous release of
      Kolla Ansible, then upgrade to the current release.
      
      Because we can't change the branch of the git repository on the Zuul
      executor, we change the branch of the kolla-ansible repository on the
      primary node to the branch of the previous release, in this case
      stable/rocky. A new remote-template role has been added that supports
      generating templates using a remote template source, to generate config
      files using the previous kolla-ansible branch.
      
      If the change being tested depends on a kolla change for the current
      branch, then we build images. Rather than using the current
      kolla-ansible version to tag the images, we now tag them with
      change_<gerrit change ID>. This is because the version of kolla-ansible
      will change from the previous release to the current one as we upgrade
      the system.
      
      Finally, it should be noted that the 'previous_release' variable in the
      Zuul config needs to be updated with each release, since this sets the
      release of kolla-ansible that is installed initially.
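
      A rough sketch of one of the job definitions in the Zuul config; the
      parent job name and variable layout here are assumptions:

      - job:
          name: kolla-ansible-centos-source-upgrade
          parent: kolla-ansible-centos-source  # assumed parent job
          vars:
            # Must be updated with each release; sets the release of
            # kolla-ansible that is installed initially.
            previous_release: rocky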
      
      Depends-On: https://review.openstack.org/645089/
      Depends-On: https://review.openstack.org/644250/
      Depends-On: https://review.openstack.org/645816/
      Depends-On: https://review.openstack.org/645840/
      Change-Id: If301e0affcd55360fefe3b105f023ae5c47b0853
      c23c9b2c
    • jamesbagwell's avatar
      Removing '/certificates' entry in generate.yml as this causes an incorrect path when generating certificates · c0a3970e
      jamesbagwell authored
      
      The 'Setting permissions on key' task fails because it looks for
      haproxy.key in an invalid path. certificates_dir is defined as
      '{{ node_config }}/certificates' in main.yml, while the task uses a
      path of '{{ certificates_dir }}/certificates/private/haproxy.key',
      which is incorrect. Removing the extra 'certificates' component from
      the path corrects this problem and allows the user to successfully
      create certificates using 'kolla-ansible certificates'.
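
      A sketch of the corrected task with the duplicate path component
      removed (the mode shown is an assumption):

      - name: Setting permissions on key
        file:
          path: "{{ certificates_dir }}/private/haproxy.key"  # no extra /certificates
          mode: "0600"  # assumed mode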
      
      Change-Id: I37b10b994b05d955b6f67c908df1472231a91160
      Closes-Bug: 1821805
      c0a3970e
    • Serhat Demircan's avatar
      Retry the synced flush task while upgrading elasticsearch · adb02958
      Serhat Demircan authored
      The synced flush fails due to concurrent indexing operations.
      The HTTP status code in that case will be 409 CONFLICT. We can
      retry this task until it succeeds.
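
      A sketch of such a retry loop, assuming the Elasticsearch endpoint is
      available in an 'elasticsearch_url' variable:

      - name: Perform a synced flush (sketch)
        uri:
          url: "{{ elasticsearch_url }}/_flush/synced"  # assumed variable
          method: POST
          status_code: 200
        register: result
        until: result.status == 200
        retries: 10
        delay: 5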
      
      Change-Id: I57f9a009b12715eed8dfcf829a71f418d2ce437b
      adb02958
  8. Mar 26, 2019
  9. Mar 25, 2019
    • Mark Goddard's avatar
      Remove recurse: yes for owner/perms on /etc/kolla · 6b0be5c5
      Mark Goddard authored
      When kolla-ansible bootstrap-servers is run, it executes one of the
      following two tasks:
      
      - name: Ensure node_config_directory directory exists for user kolla
        file:
          path: "{{ node_config_directory }}"
          state: directory
          recurse: true
          owner: "{{ kolla_user }}"
          group: "{{ kolla_group }}"
          mode: "0755"
        become: True
        when: create_kolla_user | bool
      
      - name: Ensure node_config_directory directory exists
        file:
          path: "{{ node_config_directory }}"
          state: directory
          recurse: true
          mode: "0755"
        become: True
        when: not create_kolla_user | bool
      
      On the first run, normally node_config_directory (/etc/kolla/) doesn't
      exist, so it is created with kolla:kolla ownership and 0755 permissions.
      
      If we then run 'kolla-ansible deploy', config files are created for
      containers in this directory, e.g. /etc/kolla/nova-compute/. Permissions
      for those files should be set according to 'config_owner_user' and
      'config_owner_group'.
      
      If at some point we again run kolla-ansible bootstrap-servers, it will
      recursively set the ownership and permissions of all files in /etc/kolla
      to kolla:kolla / 0755.
      
      The solution is to change bootstrap-servers to not set the owner and
      permissions recursively. It's also arguable that /etc/kolla should be
      owned by 'config_owner_user' and 'config_owner_group', rather than
      kolla:kolla, although that's a separate issue.
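
      For illustration, the first task without the recursion looks like
      this, affecting only the /etc/kolla directory itself:

      - name: Ensure node_config_directory directory exists for user kolla
        file:
          path: "{{ node_config_directory }}"
          state: directory
          owner: "{{ kolla_user }}"
          group: "{{ kolla_group }}"
          mode: "0755"
        become: True
        when: create_kolla_user | bool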
      
      Change-Id: I24668914a9cedc94d5a6cb835648740ce9ce6e39
      Closes-Bug: #1821599
      6b0be5c5
    • Zuul's avatar
      Merge "Bump up timeout for ceph jobs" · def2ac9a
      Zuul authored
      def2ac9a
    • Zuul's avatar
      14a52eff
    • Zuul's avatar
      9ef0d6d5
    • Zuul's avatar
      Merge "Fix neutron rolling upgrade" · 42d664c1
      Zuul authored
      42d664c1
    • Zuul's avatar
      e4693e8d
    • Michal Nasiadka's avatar
      Bump up timeout for ceph jobs · ab04ab93
      Michal Nasiadka authored
      Ceph jobs are currently often ending in TIMED_OUT, so increase the job timeout.
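
      For reference, a Zuul job timeout is raised like this (the job name
      and value below are illustrative, not the actual change):

      - job:
          name: kolla-ansible-centos-source-ceph  # illustrative
          timeout: 9000  # seconds; illustrative value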
      
      Change-Id: I3c6684984930d55a56da846bd8c3f19df2754b06
      ab04ab93
  10. Mar 23, 2019
    • Mark Goddard's avatar
      Fix MariaDB 10.3 upgrade · b25c0ee4
      Mark Goddard authored
      Upgrading MariaDB from Rocky to Stein currently fails, with the new
      container left continually restarting. The problem is that the Rocky
      container does not shutdown cleanly, leaving behind state that the new
      container cannot recover. The container does not shutdown cleanly
      because we run dumb-init with a --single-child argument, causing it to
      forward signals to only the process executed by dumb-init. In our case
      this is mysqld_safe, which ignores various signals, including SIGTERM.
      After a (default 10 second) timeout, Docker then kills the container.
      
      A Kolla change [1] removes the --single-child argument from dumb-init
      for the MariaDB container, however we still need to support upgrading
      from Rocky images that don't have this change. To do that, we add new
      handlers to execute 'mysqladmin shutdown' to cleanly shutdown the
      service.
      
      A second issue with the current upgrade approach is that we don't
      execute mysql_upgrade after starting the new service. This can leave the
      database state using the format of the previous release. This patch also
      adds handlers to execute mysql_upgrade.
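
      A rough sketch of what such handlers could look like, assuming the
      container is named 'mariadb' and plain 'docker exec' is used (the
      real handlers may use kolla's own modules and different credentials):

      - name: Shut down the old MariaDB service cleanly (sketch)
        become: true
        command: docker exec mariadb mysqladmin --user=root --password={{ database_password }} shutdown

      - name: Run mysql_upgrade against the new service (sketch)
        become: true
        command: docker exec mariadb mysql_upgrade --user=root --password={{ database_password }}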
      
      [1] https://review.openstack.org/644244
      
      Depends-On: https://review.openstack.org/644244
      Depends-On: https://review.openstack.org/645990
      Change-Id: I08a655a359ff9cfa79043f2166dca59199c7d67f
      Closes-Bug: #1820325
      b25c0ee4
  11. Mar 22, 2019
    • Mark Goddard's avatar
      Fix booting instances after nova-compute upgrade · 192dcd1e
      Mark Goddard authored
      After upgrading from Rocky to Stein, nova-compute services fail to start
      new instances with the following error message:
      
      Failed to allocate the network(s), not rescheduling.
      
      Looking in the nova-compute logs, we also see this:
      
      Neutron Reported failure on event
      network-vif-plugged-60c05a0d-8758-44c9-81e4-754551567be5 for instance
      32c493c4-d88c-4f14-98db-c7af64bf3324: NovaException: In shutdown, no new
      events can be scheduled
      
      During the upgrade process, we send nova containers a SIGHUP to cause
      them to reload their object version state. Speaking to the nova team in
      IRC, there is a known issue with this, caused by oslo.service performing
      a full shutdown in response to a SIGHUP, which breaks nova-compute.
      There is a patch [1] in review to address this.
      
      The workaround employed here is to restart the nova-compute service.
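
      A minimal sketch of that workaround, assuming the container is named
      'nova_compute':

      - name: Restart nova-compute to work around the SIGHUP issue (sketch)
        become: true
        command: docker restart nova_compute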
      
      [1] https://review.openstack.org/#/c/641907
      
      Change-Id: Ia4fcc558a3f62ced2d629d7a22d0bc1eb6b879f1
      Closes-Bug: #1821362
      192dcd1e
    • Mark Goddard's avatar
      Update openstack_previous_release_name to rocky · 98df4dd8
      Mark Goddard authored
      This is used for version pinning during rolling upgrades.
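
      The setting itself is a single variable, for example (file location
      assumed):

      openstack_previous_release_name: "rocky"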
      
      Change-Id: I6e878a8f7c9e0747d8d60cb4527c5f8f039ec15a
      98df4dd8
    • Zuul's avatar
      33a92b9f
    • Scott Solkhon's avatar
      Add missing handlers for external Ceph · c70d8066
      Scott Solkhon authored
      When Nova, Glance, or Cinder are deployed alongside an external Ceph deployment,
      handlers fail to trigger when keyring files are updated, which results in the
      containers not being restarted.
      
      This change adds the missing 'when' conditions for nova-libvirt, nova-compute,
      cinder-volume, cinder-backup, and glance-api containers.
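
      A sketch of how such a condition might look on one of the handlers,
      assuming the keyring copy task registers its result in a
      'ceph_keyring' variable (the variable name is hypothetical):

      - name: Restart cinder-volume container (sketch)
        become: true
        command: docker restart cinder_volume  # the real handler recreates the container
        when: ceph_keyring is changed  # hypothetical registered keyring copy result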
      
      Change-Id: I8e183aac9a72e7a7210f7edc7cdcbaedd4fbcaa9
      c70d8066
  12. Mar 21, 2019
    • Mark Goddard's avatar
      Wait for cinder volume to become available in CI · e956cd87
      Mark Goddard authored
      Fixes a race condition where sometimes a volume would still be in the
      'creating' state when trying to attach it to a server.
      
      Invalid volume: Volume <id> status must be available or downloading to
      reserve, but the current status is creating.
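
      A sketch of such a wait, assuming the test volume is named
      'test_volume' and the OpenStack CLI is available:

      - name: Wait for the volume to become available (sketch)
        command: openstack volume show test_volume -f value -c status
        register: volume_status
        until: volume_status.stdout == "available"
        retries: 10
        delay: 10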
      
      Change-Id: I0687ddfd78c384650cb361ff07aa64c5c3806a93
      e956cd87
    • Zuul's avatar
      77419255
    • Zuul's avatar
      Merge "Fix placement-api WSGI error" · 5841ec78
      Zuul authored
      5841ec78
    • Mark Goddard's avatar
      Fix neutron rolling upgrade · 55633ebf
      Mark Goddard authored
      Services were being passed as a JSON list, then iterated over in the
      neutron-server container's extend_start.sh script like this:
      
      ['neutron-server'
      'neutron-fwaas'
      'neutron-vpnaas']
      
      I'm not actually sure why we have to specify services explicitly; it
      seems liable to break if we have other plugins that need migrating.
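
      One way to avoid the quoting problem is to pass the services as a
      space-separated string rather than a rendered Python list, for
      example (the variable name here is hypothetical):

      neutron_rolling_upgrade_services: "{{ ['neutron-server', 'neutron-fwaas', 'neutron-vpnaas'] | join(' ') }}"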
      
      Change-Id: Ic8ce595793cbe0772e44c041246d5af3a9471d44
      55633ebf