  1. Sep 16, 2019
    • Catch errors and changes in kolla_toolbox module · 70b515bf
      Mark Goddard authored
      The kolla_toolbox Ansible module executes ad-hoc ansible commands in the
      kolla_toolbox container, and parses the output to make it look as if
      ansible-playbook executed the command. Currently, however, this module
      sometimes fails to catch failures of the underlying command, and also
      sometimes shows tasks as 'ok' when the underlying command reported a
      change. This has been tested both before and after the upgrade to
      ansible 2.8.
      
      This change fixes this issue by configuring ansible to emit output in
      JSON format, to make parsing simpler. We can now pick up errors and
      changes, and signal them to the caller.
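
The parsing step described above might look roughly like the following. This is a minimal sketch, not the module's actual code: it assumes the ad-hoc command was run with Ansible's JSON stdout callback (for example via the ANSIBLE_STDOUT_CALLBACK=json and ANSIBLE_LOAD_CALLBACK_PLUGINS=True environment variables), and that the output nests per-host results under plays, tasks and hosts. The function name and the sample payload are hypothetical.

```python
import json


def summarise_result(raw_output, host="localhost"):
    """Return (failed, changed, result) for one host from JSON callback output."""
    data = json.loads(raw_output)
    # Assumed layout: plays -> tasks -> hosts -> per-host result dict.
    result = data["plays"][0]["tasks"][0]["hosts"][host]
    failed = bool(result.get("failed") or result.get("unreachable"))
    changed = bool(result.get("changed"))
    return failed, changed, result


# Hypothetical sample output, trimmed to the fields used above.
SAMPLE_CHANGED = json.dumps({
    "plays": [{"tasks": [{"hosts": {"localhost": {"changed": True}}}]}],
})
SAMPLE_FAILED = json.dumps({
    "plays": [{"tasks": [{"hosts": {"localhost": {"failed": True, "msg": "boom"}}}]}],
})

print(summarise_result(SAMPLE_CHANGED)[:2])
print(summarise_result(SAMPLE_FAILED)[:2])
```

With the two flags extracted, the caller can report the task as failed, changed or ok instead of guessing from free-form text.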
      
      This change also adds an ansible playbook, tests/test-kolla-toolbox.yml,
      that can be executed to test the module. It's not currently integrated
      with any CI jobs.
      
      Note that this change cannot be backported to branches that support
      Ansible releases older than 2.5, since the JSON output callback plugin
      was added in Ansible 2.5.
      
      Change-Id: I8236dd4165f760c819ca972b75cbebc62015fada
      Closes-Bug: #1844114
  2. Sep 10, 2019
    • Configure Zun for Placement (Train+) · 0f5e0658
      Hongbin Lu authored
      After the integration with placement [1], we need to configure how
      zun-compute is going to work with nova-compute.
      
      * If zun-compute and nova-compute run on the same compute node,
        we need to set 'host_shared_with_nova' as true so that Zun
        will use the resource provider (compute node) created by nova.
        In this mode, containers and VMs could claim allocations against
        the same resource provider.
      * If zun-compute runs on a node without nova-compute, no extra
        configuration is needed. By default, each zun-compute will create
        a resource provider in placement to represent the compute node
        it manages.
      
      [1] https://blueprints.launchpad.net/zun/+spec/use-placement-resource-management
      
      Change-Id: I2d85911c4504e541d2994ce3d48e2fbb1090b813
  3. Sep 05, 2019
  4. Aug 22, 2019
  5. Aug 16, 2019
  6. Aug 14, 2019
  7. Aug 06, 2019
  8. Aug 05, 2019
    • ceph: fixes to deployment and upgrade · 826f6850
      Radosław Piliszek authored
      1) ceph-nfs (ganesha-ceph) - use NFSv4 only
      This is recommended upstream.
      v3 and UDP require portmapper (aka rpcbind), which we
      do not want, except where the Ubuntu ganesha version (2.6)
      forces it by requiring UDP to be enabled, see [1].
      The issue has been fixed in 2.8, included in CentOS.
      Additionally disable v3 helper protocols and kerberos
      to avoid meaningless warnings.
      
      2) ceph-nfs (ganesha-ceph) - do not export host dbus
      It is not in use. This avoids the temptation to try
      handling it on host.
      
      3) Properly handle ceph services deploy and upgrade
      Upgrade runs deploy.
      The order has been corrected - nfs goes after mds.
      Additionally upgrade takes care of rgw for keystone
      (for swift emulation).
      
      4) Enhance ceph keyring module with error detection
      Now it does not blindly try to create a keyring after
      any failure. This used to hide the real issue.
      
      5) Retry ceph admin keyring update until cluster works
      Reordering deployment caused an issue with the ceph cluster not being
      fully operational before actions were taken on it.
      
      6) CI: Remove osd df from collected logs as it may hang CI
      Hangs are caused by healthy MON and no healthy MGR.
      A descriptive note is left in its place.
      
      7) CI: Add 5s timeout to ceph informational commands
      This decreases the timeout from the default 300s.
      
      [1] https://review.opendev.org/669315
      
      
      
      Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
      Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
  9. Jul 26, 2019
  10. Jul 18, 2019
    • Fix handling of docker restart policy · 6a737b19
      Radosław Piliszek authored
      Docker has no restart policy named 'never'. It has 'no'.
      This has bitten us already (see [1]) and might bite us again whenever
      we want to change the restart policy to 'no'.
      
      This patch makes our docker integration honor all valid restart policies
      and only valid restart policies.
      All relevant docker restart policy usages are patched as well.
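
As an illustration of honoring all valid restart policies and only those, a validation helper could be sketched as below. The policy names are the four that Docker accepts; the function and constant names are hypothetical and are not taken from the patch.

```python
# The restart policies Docker accepts; note that 'never' is not one of them.
VALID_RESTART_POLICIES = ("no", "on-failure", "always", "unless-stopped")


def validate_restart_policy(policy):
    """Return the policy unchanged if Docker accepts it, else raise ValueError."""
    if policy not in VALID_RESTART_POLICIES:
        raise ValueError(
            "Invalid restart policy %r; expected one of: %s"
            % (policy, ", ".join(VALID_RESTART_POLICIES))
        )
    return policy


print(validate_restart_policy("no"))
try:
    validate_restart_policy("never")
except ValueError as exc:
    print(exc)
```

Rejecting unknown names up front surfaces a typo such as 'never' as an explicit error rather than as unexpected container behavior later.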
      
      I added some FIXMEs that are relevant to the kolla-ansible docker
      integration. They are not fixed here, so as not to alter behavior.
      
      [1] https://review.opendev.org/667363
      
      
      
      Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
      Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
  11. Jul 16, 2019
  12. Jul 09, 2019
  13. Jul 04, 2019
  14. Jul 03, 2019
  15. Jul 02, 2019
  16. Jul 01, 2019
  17. Jun 28, 2019
    • Exit on failure in init-runonce · bc08b44f
      Mark Goddard authored
      Previously we sourced this script in tests/deploy.sh, but this was
      recently changed. Following that change we lost the errexit setting,
      meaning we ignore errors in init-runonce.
      
      Adding errexit in the script itself means that all callers get error
      handling.
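
The effect of errexit can be demonstrated with a small experiment. The sketch below assumes a POSIX `sh` is available on the PATH; the two-line script is a hypothetical stand-in for init-runonce, not its real content.

```python
import subprocess

# A stand-in script: the first command fails, the second reports progress.
SCRIPT = "false\necho reached the end\n"


def run(script):
    """Run a script with sh and capture its exit status and output."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


without_errexit = run(SCRIPT)
with_errexit = run("set -o errexit\n" + SCRIPT)

# Without errexit the failure is ignored and the script carries on.
print(without_errexit.returncode, repr(without_errexit.stdout))
# With errexit the script stops at the failing command and exits non-zero.
print(with_errexit.returncode, repr(with_errexit.stdout))
```

Because the option is set inside the script, the behavior no longer depends on whether a caller sources the script or executes it.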
      
      Also log init-runonce output.
      
      TrivialFix
      
      Change-Id: I9b35bd5f0f76eec26ddd968d093a3a5fd55a7ce2
  18. Jun 27, 2019
    • Fix conditionals in CI playbook · 3b218fd0
      Mark Goddard authored
      These were not templated, so always evaluated to true. This shouldn't be
      causing any issues.
      
      Change-Id: I7b8e407e688ba201c4f7d1a94bbd41af0918e7df
  19. Jun 21, 2019
  20. Jun 16, 2019
  21. Jun 11, 2019
    • Add CI job for ironic · 845040ad
      Mark Goddard authored
      Adds four new CI jobs for testing centos/ubuntu binary/source deploys
      with ironic enabled. These are run only when there are changes to the
      ironic role.
      
      Performs some simple testing by creating a node using the fake-hardware
      hardware type and creating a server.
      
      Change-Id: Ie669e57ce2af53257b4ca05f45193cb73f48827a
      Depends-On: https://review.opendev.org/664011
  22. Jun 07, 2019
  23. Jun 03, 2019
    • Test Ceph upgrade in CI · 78ee0287
      Mark Goddard authored
      Add CI jobs for testing an upgrade of a multinode system with Ceph
      enabled. As for the existing upgrade job, we upgrade from the previous
      release to the current release.
      
      Change-Id: I931772ca4c63757769467a57c80dc0726a11167a
      Depends-On: https://review.opendev.org/658163
  24. May 31, 2019
    • Adds Qinling Ansible role · edb34898
      Gaetan Trellu authored
      Qinling is an OpenStack project to provide "Function as a Service".
      This project aims to provide a platform to support serverless functions.
      
      Change-Id: I239a0130f8c8b061b531dab530d65172b0914d7c
      Implements: blueprint ansible-qinling-support
      Story: 2005760
      Task: 33468
  25. May 17, 2019
    • Fix keystone fernet key rotation scheduling · 6c1442c3
      Mark Goddard authored
      Right now every controller rotates fernet keys. This is nice because
      should any controller die, we know the remaining ones will rotate the
      keys. However, we are currently over-rotating the keys.
      
      When we over-rotate keys, we get logs like this:
      
       This is not a recognized Fernet token <token> TokenNotFound
      
      Most clients can recover and get a new token, but some clients (like
      Nova passing tokens to other services) can't do that because they don't
      have the password needed to generate a new token.
      
      With three controllers, the keystone-fernet crontab shows the once-a-day
      rotation correctly staggered across the three controllers:
      
      ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
      0 0 * * * /usr/bin/fernet-rotate.sh
      ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
      0 8 * * * /usr/bin/fernet-rotate.sh
      ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
      0 16 * * * /usr/bin/fernet-rotate.sh
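
The staggering in these crontabs can be reproduced with a few lines of arithmetic. This is a simplified sketch that only handles the case shown here (each host rotating once a day); it is not the project's cron generator, and the function name is hypothetical.

```python
def staggered_daily_crontab(host_index, num_hosts,
                            command="/usr/bin/fernet-rotate.sh"):
    """Return a once-a-day crontab line, offset evenly per host."""
    minutes_per_day = 24 * 60
    # Spread the hosts evenly across the day.
    offset = host_index * minutes_per_day // num_hosts
    hour, minute = divmod(offset, 60)
    return "%d %d * * * %s" % (minute, hour, command)


for index in range(3):
    print(staggered_daily_crontab(index, 3))
# 0 0 * * * /usr/bin/fernet-rotate.sh
# 0 8 * * * /usr/bin/fernet-rotate.sh
# 0 16 * * * /usr/bin/fernet-rotate.sh
```

Each host rotates once a day, but taken together the three hosts rotate the keys every eight hours.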
      
      Currently with three controllers we have this keystone config:
      
      [token]
      expiration = 86400 (although, keystone default is one hour)
      allow_expired_window = 172800 (this is the keystone default)
      
      [fernet_tokens]
      max_active_keys = 4
      
      Currently, kolla-ansible configures key rotation according to the following:
      
         rotation_interval = token_expiration / num_hosts
      
      This means we rotate keys more quickly the more hosts we have, which doesn't
      make much sense.
      
      Keystone docs state:
      
         max_active_keys =
           ((token_expiration + allow_expired_window) / rotation_interval) + 2
      
      For details see:
      https://docs.openstack.org/keystone/stein/admin/fernet-token-faq.html
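
Plugging the values quoted in this message into the formula shows the size of the mismatch. The sketch below is illustrative arithmetic only; rounding the division up is an assumption made here so that partial intervals still count, and the function name is hypothetical.

```python
def max_active_keys(token_expiration, allow_expired_window, rotation_interval):
    """Number of keys needed so that no live token loses its key."""
    lifetime = token_expiration + allow_expired_window
    # Ceiling division (an assumption): a partial interval still needs a key.
    return -(-lifetime // rotation_interval) + 2


token_expiration = 86400        # [token] expiration, in seconds
allow_expired_window = 172800   # [token] allow_expired_window, in seconds
num_hosts = 3

# Interval used before this change: token_expiration / num_hosts.
old_interval = token_expiration // num_hosts
print(old_interval)  # 28800
print(max_active_keys(token_expiration, allow_expired_window, old_interval))  # 11

# An interval equal to the full token lifetime needs only three keys.
long_interval = token_expiration + allow_expired_window
print(max_active_keys(token_expiration, allow_expired_window, long_interval))  # 3
```

With an eight-hour interval the formula asks for 11 keys, while the configuration above keeps only 4, which is consistent with keys being pruned while tokens that need them are still in use.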
      
      Rotation is based on pushing out a staging key, so should any server
      start using that key, other servers will consider that valid. Then each
      server in turn starts using the staging key, each in turn demoting the
      existing primary key to a secondary key. Eventually you prune the
      secondary keys when there is no token in the wild that would need to be
      decrypted using that key. So this all makes sense.
      
      This change adds new variables for fernet_token_allow_expired_window and
      fernet_key_rotation_interval, so that we can calculate the correct
      number of active keys. We now set the default rotation interval
      so as to minimise the number of active keys to 3 - one primary, one
      secondary, one buffer.
      
      This change also fixes the fernet cron job generator, which was broken
      in the following cases:
      
      * requesting an interval of more than 1 day resulted in no jobs
      * requesting an interval of more than 60 minutes, unless an exact
        multiple of 60 minutes, resulted in no jobs
      
      It should now be possible to request any interval up to a week divided
      by the number of hosts.
      
      Change-Id: I10c82dc5f83653beb60ddb86d558c5602153341a
      Closes-Bug: #1809469
    • Add unit test for keystone fernet cron generator · 25ac955a
      Mark Goddard authored
      Before making changes to this script, document its behaviour with a unit
      test.
      
      There are two major issues:
      
      * requesting an interval of more than 1 day results in no jobs
      * requesting an interval of more than 60 minutes, unless an exact
        multiple of 60 minutes, results in no jobs
      
      Change-Id: I655da1102dfb4ca12437b7db0b79c9a61568f79e
      Related-Bug: #1809469
  26. Apr 19, 2019
  27. Apr 14, 2019
    • Fix periodic CI jobs · 2b7a9dc2
      Mark Goddard authored
      Periodic jobs don't have zuul.change defined, since there is no change
      being tested. This causes an early failure when referencing zuul.change
      to set the image tag for built images. In periodic jobs we'll never need
      to build images because there is no dependent kolla change under test.
      
      Change-Id: I6d9d81cf17b7d0d7aaf87cd96418c904c46681f2
  28. Apr 10, 2019
    • Remove RabbitMQ support from Bifrost · 33564a00
      Mark Goddard authored
      During the Train cycle, Bifrost switched to using JSON-RPC by default
      for Ironic's internal communication [1], avoiding the need to install
      RabbitMQ. This simplifies things, so we may as well remove our custom
      configuration of RabbitMQ.
      
      [1] https://review.openstack.org/645093
      
      Change-Id: I3107349530aa753d68fd59baaf13eb7dd5485ae6
  29. Apr 08, 2019
    • Do some Train TODOs · bb9d51e2
      Mark Goddard authored
      Make an early start on the TODOs for the Train cycle.
      
      1. Remove the task that removes the vitrage_collector container, which
         was added in the Stein cycle to clean up this container which is no
         longer deployed.
      
      2. Remove globals.yml configuration in CI to disable Heat for upgrade
         jobs. Heat is now enabled in the previous release (Stein).
      
      3. Remove the deprecated variable cinder_iscsi_helper, which was renamed
         to cinder_target_helper in Stein.
      
      Change-Id: I774bf395e0bdd4db9c20c6289a22cf059fa42e1a
  30. Apr 03, 2019
    • Check configuration file permissions in CI · 8c4ab41f
      Mark Goddard authored
      Typically, non-executable files should have mode 660 or 600, and
      executable files and directories should have mode 770. All should be
      owned by the user and group named in the 'config_owner_user' and
      'config_owner_group' variables.
      
      This change adds a script to check the owner and permissions of config
      files under /etc/kolla, and runs it at the end of CI jobs.
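
A check along these lines could be sketched as follows. This is not the script added by the change: it only covers the mode rules stated above, skips symbolic links, and leaves out the ownership check. The function name is hypothetical.

```python
import os
import stat


def find_bad_modes(root):
    """Return sorted (path, mode) pairs that break the mode convention."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue  # Symlink modes are not meaningful; skip them.
            mode = stat.S_IMODE(st.st_mode)
            # Directories and files with any execute bit count as executable.
            executable = stat.S_ISDIR(st.st_mode) or bool(mode & 0o111)
            allowed = (0o770,) if executable else (0o660, 0o600)
            if mode not in allowed:
                bad.append((path, oct(mode)))
    return sorted(bad)
```

Calling `find_bad_modes("/etc/kolla")` would return an empty list when every file and directory follows the convention, which makes it easy to fail a CI job on any non-empty result.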
      
      Change-Id: Icdbabf36e284b9030017a0dc07b9dc81a37758ab
      Related-Bug: #1821579
  31. Mar 27, 2019
    • Test upgrades in CI · c23c9b2c
      Mark Goddard authored
      This patch adds two new jobs:
      
      * kolla-ansible-centos-source-upgrade
      * kolla-ansible-ubuntu-source-upgrade
      
      These jobs first deploy a control plane using the previous release of
      Kolla Ansible, then upgrade to the current release.
      
      Because we can't change the branch of the git repository on the Zuul
      executor, we change the branch of the kolla-ansible repository on the
      primary node to the branch of the previous release, in this case
      stable/rocky. A new remote-template role has been added that supports
      generating templates using a remote template source, to generate config
      files using the previous kolla-ansible branch.
      
      If the change being tested depends on a kolla change for the current
      branch, then we build images. Rather than using the current
      kolla-ansible version to tag the images, we now tag them with
      change_<gerrit change ID>. This is because the version of kolla-ansible
      will change from the previous release to the current one as we upgrade
      the system.
      
      Finally, it should be noted that the 'previous_release' variable in the
      Zuul config needs to be updated with each release, since this sets the
      release of kolla-ansible that is installed initially.
      
      Depends-On: https://review.openstack.org/645089/
      Depends-On: https://review.openstack.org/644250/
      Depends-On: https://review.openstack.org/645816/
      Depends-On: https://review.openstack.org/645840/
      Change-Id: If301e0affcd55360fefe3b105f023ae5c47b0853
  32. Mar 21, 2019
    • Wait for cinder volume to become available in CI · e956cd87
      Mark Goddard authored
      Fixes a race condition where sometimes a volume would still be in the
      'creating' state when trying to attach it to a server.
      
      Invalid volume: Volume <id> status must be available or downloading to
      reserve, but the current status is creating.
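
The general shape of such a wait is a bounded polling loop. The sketch below is generic and hypothetical: `get_status` stands in for whatever call reports the volume status, and the attempt count and delay are arbitrary example values, not those used in the CI scripts.

```python
import time


def wait_for_status(get_status, wanted="available", attempts=10, delay=1.0):
    """Poll get_status() until it returns the wanted status or attempts run out."""
    for _ in range(attempts):
        status = get_status()
        if status == wanted:
            return status
        if status == "error":
            raise RuntimeError("volume entered the error state")
        time.sleep(delay)
    raise TimeoutError("volume did not become %s in time" % wanted)


# Simulated status sequence standing in for a real API call.
statuses = iter(["creating", "creating", "available"])
print(wait_for_status(lambda: next(statuses), delay=0))
```

Only once the loop returns would the attach step run, which removes the window in which the volume is still in the 'creating' state.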
      
      Change-Id: I0687ddfd78c384650cb361ff07aa64c5c3806a93