  1. May 17, 2019
      Fix keystone fernet key rotation scheduling · 6c1442c3
      Mark Goddard authored
      Right now every controller rotates fernet keys. This is nice because
      should any controller die, we know the remaining ones will rotate the
      keys. However, we are currently over-rotating the keys.
      
      When we over-rotate keys, we get logs like this:
      
       This is not a recognized Fernet token <token> TokenNotFound
      
      Most clients can recover and get a new token, but some clients (like
      Nova passing tokens to other services) can't do that because they don't
      have the password needed to generate a new token.
      
      With three controllers, the keystone-fernet crontab shows the once-a-day
      rotation correctly staggered across the three controllers:
      
      ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
      0 0 * * * /usr/bin/fernet-rotate.sh
      ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
      0 8 * * * /usr/bin/fernet-rotate.sh
      ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
      0 16 * * * /usr/bin/fernet-rotate.sh
      
      Currently with three controllers we have this keystone config:
      
      [token]
      expiration = 86400 (although, keystone default is one hour)
      allow_expired_window = 172800 (this is the keystone default)
      
      [fernet_tokens]
      max_active_keys = 4
      
      Currently, kolla-ansible configures key rotation according to the following:
      
         rotation_interval = token_expiration / num_hosts
      
      This means we rotate keys more quickly the more hosts we have, which doesn't
      make much sense.
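To make the scaling concrete, here is a minimal sketch of the formula above
in Python. The function name is hypothetical and is not taken from
kolla-ansible; the 86400 second expiration is the value quoted earlier.

```python
def rotation_interval(token_expiration, num_hosts):
    """Seconds between key rotations under the pre-change formula."""
    return token_expiration // num_hosts


# Token expiration of one day, as in the config quoted above.
for hosts in (1, 3, 6):
    seconds = rotation_interval(86400, hosts)
    print(f"{hosts} host(s): rotate every {seconds // 3600} h")
```

With three hosts this gives one rotation every 8 hours, which matches the
0, 8 and 16 o'clock crontab entries shown earlier. Doubling the host count
halves the interval.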
      
      Keystone docs state:
      
         max_active_keys =
           ((token_expiration + allow_expired_window) / rotation_interval) + 2
      
      For details see:
      https://docs.openstack.org/keystone/stein/admin/fernet-token-faq.html
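As a worked example of that formula, the sketch below plugs in the values
quoted in this message. The function name is hypothetical, and rounding a
fractional quotient up is an assumption (the formula as quoted does not say
how to round).

```python
import math


def max_active_keys(token_expiration, allow_expired_window, rotation_interval):
    """Keys to keep so that every token still in the wild can be decrypted."""
    lifetime = token_expiration + allow_expired_window
    # Assumption: round up when the interval does not divide the lifetime.
    return math.ceil(lifetime / rotation_interval) + 2


TOKEN_EXPIRATION = 86400       # [token] expiration
ALLOW_EXPIRED_WINDOW = 172800  # [token] allow_expired_window

# One rotation every 8 hours, as implied by the three staggered cron jobs.
print(max_active_keys(TOKEN_EXPIRATION, ALLOW_EXPIRED_WINDOW, 28800))

# An interval equal to the whole token lifetime.
print(max_active_keys(TOKEN_EXPIRATION, ALLOW_EXPIRED_WINDOW,
                      TOKEN_EXPIRATION + ALLOW_EXPIRED_WINDOW))
```

The first call gives 11, well above the configured max_active_keys = 4,
which is why tokens stop being recognised. The second call gives 3.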
      
      Rotation is based on pushing out a staging key, so should any server
      start using that key, other servers will consider it valid. Each server
      then starts using the staging key in turn, demoting the existing primary
      key to a secondary key as it does so. Eventually you prune the secondary
      keys when there is no token in the wild that would need to be decrypted
      using that key. So this all makes sense.
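The rotation described above can be sketched as a toy model of a single key
repository. It assumes keystone's documented index convention (index 0 is the
staging key, the highest index is the primary, everything in between is
secondary) and tracks only indices, not key material. The function name is
hypothetical.

```python
def rotate(keys, max_active_keys):
    """Return the key indices present after one rotation.

    keys: indices currently in the repository; must include 0 (staging).
    """
    new_primary = max(keys) + 1
    # The staged key is promoted to primary, the old primary becomes a
    # secondary, and a fresh staging key takes index 0.
    secondaries = sorted(k for k in keys if k != 0)
    rotated = [0] + secondaries + [new_primary]
    # Prune the oldest secondary keys until we are within the limit.
    while len(rotated) > max_active_keys:
        rotated.pop(1)
    return rotated


repo = [0, 1]  # a freshly initialised repository: staging and primary
for _ in range(3):
    repo = rotate(repo, max_active_keys=3)
    print(repo)
```

With max_active_keys = 3 the repository settles at one staging key, one
secondary and one primary. A token encrypted with a pruned key can no longer
be decrypted, which is the failure seen when keys are rotated too often.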
      
      This change adds new variables for fernet_token_allow_expired_window and
      fernet_key_rotation_interval, so that we can calculate the correct
      number of active keys. We now set the default rotation interval so that
      the number of active keys is kept to the minimum of 3 - one primary,
      one secondary, one buffer.
      
      This change also fixes the fernet cron job generator, which was broken
      in the following cases:
      
      * requesting an interval of more than 1 day resulted in no jobs
      * requesting an interval of more than 60 minutes, unless an exact
        multiple of 60 minutes, resulted in no jobs
      
      It should now be possible to request any interval up to a week divided
      by the number of hosts.
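The one-week limit can be illustrated with a simplified sketch of a staggered
schedule. This is not the generator script from this change: the function
name, the cron line layout and the use of the day-of-week field are
assumptions made for illustration.

```python
WEEK_MINUTES = 7 * 24 * 60


def host_cron_lines(host_index, total_hosts, interval_minutes,
                    command="/usr/bin/fernet-rotate.sh"):
    """Weekly cron lines for one host in a round-robin rotation."""
    if not 0 <= host_index < total_hosts:
        raise ValueError("host_index must be in range(total_hosts)")
    # Each host rotates once every total_hosts intervals, offset by its index.
    per_host_period = interval_minutes * total_hosts
    if per_host_period > WEEK_MINUTES:
        raise ValueError("interval exceeds one week divided by host count")
    offset = interval_minutes * host_index
    lines = []
    # Only whole periods are scheduled, so the gap across the week boundary
    # is never shorter than one period (under-rotation, not over-rotation).
    for k in range(WEEK_MINUTES // per_host_period):
        day, rest = divmod(offset + k * per_host_period, 24 * 60)
        hour, minute = divmod(rest, 60)
        lines.append(f"{minute} {hour} * * {day} {command}")
    return lines


# Second of three hosts, one rotation every 8 hours overall.
for line in host_cron_lines(1, 3, 8 * 60):
    print(line)
```

Because the schedule repeats weekly, a host cannot be asked to rotate less
often than once a week, so the overall interval times the number of hosts
must fit within 10080 minutes.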
      
      Change-Id: I10c82dc5f83653beb60ddb86d558c5602153341a
      Closes-Bug: #1809469
      Add unit test for keystone fernet cron generator · 25ac955a
      Mark Goddard authored
      Before making changes to this script, document its behaviour with a unit
      test.
      
      There are two major issues:
      
      * requesting an interval of more than 1 day results in no jobs
      * requesting an interval of more than 60 minutes, unless an exact
        multiple of 60 minutes, results in no jobs
      
      Change-Id: I655da1102dfb4ca12437b7db0b79c9a61568f79e
      Related-Bug: #1809469