    ..
       This work is licensed under a Creative Commons Attribution 3.0 Unported
     License.
    
     http://creativecommons.org/licenses/by/3.0/legalcode
    
    ======================
    Containerize OpenStack
    ======================
    
    When upgrading or downgrading OpenStack, it is possible to use package-based
    management or image-based management.  Containerizing OpenStack is meant to
    optimize image-based management of OpenStack.  Containerizing OpenStack
    solves a manageability and availability problem with the current
    state-of-the-art deployment systems in OpenStack.
    
    Problem description
    ===================
    
    Current state-of-the-art deployment systems use either image-based or
    package-based upgrades.

    Image-based upgrades are utilized by TripleO.  When TripleO updates a system,
    it creates an image of the entire disk and deploys that rather than just the
    parts that compose the OpenStack deployment.  This results in significant
    loss of availability.  Furthermore, running VMs are shut down during the
    imaging process.  However, image-based systems offer atomicity, because all
    related software for a service is updated in one atomic action by reimaging
    the system.
    
    Other systems use package-based upgrades.  Package-based upgrades suffer from
    a non-atomic nature.  An update may update one or more RPM packages.  The
    update process could fail for any number of reasons, and there is no way to
    back out the changes already applied.  Typically in an OpenStack deployment
    it is desirable to update a service that does one thing, including its
    dependencies, as an atomic unit.  Package-based upgrades do not offer
    atomicity.
    
    To solve this problem, containers can be used to provide an image-based update
    approach which offers atomic upgrade of a running system with minimal
    interruption in service.  A rough prototype of compute upgrade [1] shows
    approximately a 10 second window of unavailability during a software update.
    The prototype keeps virtual machines running without interruption.
    
    Use cases
    ---------
    1. Upgrade or roll back OpenStack deployments atomically.  End-user wants to
       change the running software versions in her system to deploy a new upstream
       release without interrupting service for significant periods.
    2. Upgrade OpenStack by component.  End-user wants to upgrade her system
       in fine-grained chunks to limit damage from a failed upgrade.
    3. Roll back OpenStack by component.  End-user experienced a failed
       upgrade and wishes to roll back to the last known good working version.
    
    
    Proposed change
    ===============
    An OpenStack deployment based on containers is represented as a tree
    structure, with each interior node representing a container set and each
    leaf representing a container.
    
    The full properties of a container set:
    
    * A container set is composed of one or more container subsets or one or more
      individual containers
    * A container set provides a single logical service
    * A container set is managed as a unit during startup, shutdown, and version
      changes
    * Each container set is launched together as one unit
    * A container set with subsets is launched as one unit including all subsets
    * A container set is not atomically managed
    * A container set provides appropriate hooks for high availability monitoring
    
    The full properties of a container:
    
    * A container is atomically upgraded or rolled back
    * A container includes a monotonically increasing generation number to identify
      the container's age in comparison with other containers
    * A container has a single responsibility
    * A container may be super-privileged when it needs significant access to the
      host including:
    
      * The network namespace of the host
      * The UTS namespace of the host
      * The IPC namespace of the host
      * Filesystem sharing of the host for persistent storage
    
    * A container may lack any privileges when it does not require significant
      access to the host.
    * A container should include a check function for evaluating its own health.
    * A container will include proper PID 1 handling for reaping exited child
      processes.
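
    The tree of container sets and containers described above can be sketched
    as a small data model.  This is only an illustration under assumptions: the
    class names, the ``privileged`` field, and the ``newest`` helper are
    hypothetical and are not part of Kolla.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Container:
    """Leaf node: a single-responsibility container, upgraded atomically."""
    name: str
    generation: int = 0       # monotonically increasing; larger means newer
    privileged: bool = False  # True for a super-privileged container


@dataclass
class ContainerSet:
    """Interior node: one logical service, managed as a unit."""
    name: str
    children: List[Union["ContainerSet", Container]] = field(default_factory=list)

    def leaves(self) -> Iterator[Container]:
        """Yield every container in this set, including those in subsets."""
        for child in self.children:
            if isinstance(child, ContainerSet):
                yield from child.leaves()
            else:
                yield child


def newest(a: Container, b: Container) -> Container:
    """Return whichever container has the higher generation number."""
    return a if a.generation >= b.generation else b


# Hypothetical slice of the tree, using names from this specification.
control = ContainerSet("openstack-control", [
    ContainerSet("nova-controller", [
        Container("nova-conductor"),
        Container("nova-scheduler"),
    ]),
    ContainerSet("glance-controller", [Container("glance-registry")]),
])
```

    Walking ``control.leaves()`` visits each container that would be launched
    when the set is started as one unit, while ``newest`` shows how generation
    numbers could order two builds of the same container.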
    
    The top-level container sets are:

    * database control
    * messaging control
    * high availability control
    * OpenStack interface
    * OpenStack control
    * OpenStack compute operation
    * OpenStack network operation
    * OpenStack storage operation
    
    The various container sets are composed in more detail as follows:
    
    * Database control
    
      * galera
      * mariadb
      * mongodb
    
    * Messaging control
    
      * rabbitmq
    
    * High availability control
    
      * keepalived
    
    * OpenStack interface
    
      * glance-api
      * nova-api
      * ceilometer-api
      * heat-api
    
    * OpenStack control

      * glance-controller
    
        * glance-registry
    
      * nova-controller
    
        * nova-conductor
        * nova-scheduler
        * metadata-service
    
      * cinder-controller
    
      * neutron-controller
    
        * neutron-server
    
      * ceilometer-controller
    
        * ceilometer-alarm
        * ceilometer-base
        * ceilometer-central
        * ceilometer-collector
        * ceilometer-notification
    
      * heat-controller
    
    * OpenStack compute operation
    
      * nova-compute
      * nova-libvirt
      * neutron-agents-linux-bridge
      * neutron-agents-ovs
    
    
    * OpenStack network operation
    
      * dhcp-agent
      * l3-agent
      * metadata-agent
      * lbaas-agent
      * fwaas-agent
    
    
    * OpenStack storage operation

      * swift-account
      * swift-base
      * swift-container
      * swift-object
      * swift-proxy-server
    
    In order to achieve the desired results, we plan to permit super-privileged
    containers.  A super-privileged container is defined as any container launched
    with the --privileged=true flag to docker that:
    
    * bind-mounts specific security-crucial host operating system directories
      with -v.  This includes nearly all directories in the filesystem except for
      leaf directories with no other host operating system use.
    * shares any namespace with the --ipc=host, --pid=host, or --net=host flags
    
    We will use the docker flag --restart=always to provide some measure of
    high availability for the individual containers and ensure they operate
    correctly as currently designed.
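
    As a rough sketch of how these flags combine, the helper below assembles a
    docker run command line for a super-privileged container.  It only builds
    the argument list and does not start anything.  The function name, its
    parameters, and the image name in the usage note are hypothetical; the
    docker flags themselves are the ones named in this specification, plus
    --detach and --name.

```python
from typing import Dict, List


def docker_run_args(image: str, name: str, volumes: Dict[str, str],
                    share_net: bool = False, share_pid: bool = False,
                    share_ipc: bool = False) -> List[str]:
    """Build (but do not execute) a docker run command for a
    super-privileged container that is restarted automatically."""
    args = ["docker", "run", "--detach", "--name", name,
            "--privileged=true", "--restart=always"]
    # Share host namespaces only when explicitly requested.
    if share_net:
        args.append("--net=host")
    if share_pid:
        args.append("--pid=host")
    if share_ipc:
        args.append("--ipc=host")
    # Bind-mount host directories into the container with -v host:container.
    for host_path, container_path in sorted(volumes.items()):
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    return args
```

    For example, ``docker_run_args("example/nova-compute", "nova-compute",
    {"/run": "/run"}, share_net=True)`` yields a command that shares the host
    network namespace and bind-mounts /run.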
    
    On a pre-configured timer, a host tool will run the container's built-in
    check script via docker exec to validate that the container is operational.
    If the container does not pass its health check, it should be restarted.
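
    A minimal sketch of such a host tool follows, assuming the check script
    signals failure with a non-zero exit status and that a failed container is
    restarted with docker restart.  The function names, the injectable ``run``
    parameter, and the 30 second default interval are illustrative assumptions
    rather than a defined Kolla interface.

```python
import subprocess
import time
from typing import Callable, List, Optional

Runner = Callable[[List[str]], int]


def run_command(argv: List[str]) -> int:
    """Run a command on the host and return its exit status."""
    return subprocess.call(argv)


def check_once(container: str, check_cmd: List[str],
               run: Runner = run_command) -> bool:
    """Run the check inside the container; restart the container on failure.

    Returns True if the container was healthy, False if it was restarted.
    """
    if run(["docker", "exec", container] + check_cmd) == 0:
        return True
    run(["docker", "restart", container])
    return False


def monitor(container: str, check_cmd: List[str],
            interval_seconds: float = 30.0, run: Runner = run_command,
            rounds: Optional[int] = None) -> None:
    """Call check_once on a fixed timer; rounds=None means loop forever."""
    done = 0
    while rounds is None or done < rounds:
        check_once(container, check_cmd, run)
        done += 1
        time.sleep(interval_seconds)
```

    Passing the command runner as a parameter keeps the restart logic testable
    without a Docker daemon, since a stub can stand in for ``run_command``.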
    
    Integration of metadata with fig or a similar single-node Docker
    orchestration tool will be implemented.  Even though fig executes on a
    single node, the containers will be designed to run multi-node and the
    deploy tool should take some form of information to allow it to operate
    multi-node.  The deploy tool should take a set of key/value pairs as inputs
    and convert them into the environment passed to Docker.  These key/value
    pairs could come from a file or from environment variables.  We will not
    offer integration with multi-node scheduling or orchestration tools, but
    instead expect our consumers to manage each bare metal machine using our
    fig integration or a similar tool.
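
    One way the key/value conversion could work is sketched below, assuming a
    simple KEY=VALUE file format and a hypothetical ``KOLLA_`` prefix for
    selecting environment variables, with environment values overriding file
    values.  None of these choices is specified here; only docker's -e flag is
    an existing interface.

```python
import os
from typing import Dict, List, Mapping, Optional


def parse_key_values(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    pairs: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {line!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def docker_env_args(file_pairs: Mapping[str, str],
                    environ: Optional[Mapping[str, str]] = None,
                    prefix: str = "KOLLA_") -> List[str]:
    """Merge file pairs with prefixed environment variables into -e flags.

    Environment variables starting with the prefix override file values;
    the prefix is stripped before the variable is passed to Docker.
    """
    environ = os.environ if environ is None else environ
    merged = dict(file_pairs)
    for key, value in environ.items():
        if key.startswith(prefix):
            merged[key[len(prefix):]] = value
    args: List[str] = []
    for key in sorted(merged):
        args += ["-e", f"{key}={merged[key]}"]
    return args
```

    The resulting list can be appended to a docker run command line, so the
    same container image can be pointed at different multi-node endpoints
    purely through its environment.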
    
    Any contributions from the community of the required metadata to run these
    containers using a multi-node orchestration tool will be warmly received but
    generally won't be maintained by the core team.
    
    The technique for launching the deploy script is not handled by Kolla.  This
    is a problem for a higher-level deployment tool such as TripleO or Fuel to
    tackle.
    
    Logs from the individual containers will be retrievable in some consistent way.
    
    Security impact
    ---------------
    
    Container usage with super-privileged mode may impact security.  For
    example, when using --net=host mode and bind-mounting /run, which is
    necessary for a compute node, it is possible that a compute breakout could
    corrupt the host operating system.
    
    To mitigate security concerns, solutions such as SELinux and AppArmor should
    be used where appropriate to contain the security privileges of the containers.
    
    Performance Impact
    ------------------
    
    
    The upgrade or downgrade process changes from a multi-hour outage to an
    outage of approximately 10 seconds across the system.
    
    Implementation
    ==============
    
    
    Assignee(s)
    -----------
    
    Primary assignee:
    
    kolla maintainers
    
    Work Items
    ----------
    
    1. Container Sets
    2. Containers
    3. A minimal proof of concept single-node fig deployment integration
    4. A minimal proof of concept fig healthchecking integration
    
    Testing
    =======
    
    Functional tests will be implemented in the OpenStack check/gating system to
    automatically check that each container passes its functional tests, which
    are stored in the project's repositories.
    
    Documentation Impact
    ====================
    
    The documentation impact is unclear as this project is a proof of concept
    with no clear delivery consumer.
    
    
    References
    ==========
    
    * [1] https://github.com/sdake/compute-upgrade