    ==========
    Nova Cells
    ==========
    
    Overview
    ========
    
    Nova cells V2 is a feature that allows Nova deployments to be scaled out to
    a larger size than would otherwise be possible. This is achieved through
    sharding of the compute nodes into pools known as *cells*, with each cell
    having a separate message queue and database.
    
    Further information on cells can be found in the Nova documentation: see
    :nova-doc:`Cells <user/cells.html>` and :nova-doc:`Cells layout
    <user/cellsv2-layout.html>`. This document assumes the reader is familiar with
    the concepts of cells.
    
    Cells: deployment perspective
    =============================
    
    From a deployment perspective, Nova cells support involves separating the Nova
    services into two sets: global services and per-cell services.
    
    Global services:
    
    * ``nova-api``
    * ``nova-scheduler``
    * ``nova-super-conductor`` (in multi-cell mode)
    
    Per-cell control services:
    
    * ``nova-compute-ironic`` (for Ironic cells)
    * ``nova-conductor``
    * ``nova-novncproxy``
    * ``nova-serialproxy``
    * ``nova-spicehtml5proxy``
    
    Per-cell compute services:
    
    * ``nova-compute``
    * ``nova-libvirt``
    * ``nova-ssh``
    
    Another consideration is the database and message queue clusters that the cells
    depend on. This will be discussed later.
    
    Service placement
    -----------------
    
    There are a number of ways to place services in a multi-cell environment.
    
    Single cell topology
    ~~~~~~~~~~~~~~~~~~~~
    
    The single cell topology is used by default, and is limited to a single cell::
    
                +----------------+
                |                ++
                |                |-+
                |   controllers  |-|
                |                |-|
                |                |-|
                +------------------|
                 +-----------------|
                  +----------------+
    
        +--------------+     +--------------+
        |              |     |              |
        |   cell 1     |     |   cell 1     |
        |   compute 1  |     |   compute 2  |
        |              |     |              |
        +--------------+     +--------------+
    
    All control services run on the controllers, and there is no superconductor.
    
    Dedicated cell controller topology
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    In this topology, each cell has a dedicated group of controllers to run cell
    control services. The following diagram shows the topology for a cloud with two
    cells::
    
                                        +----------------+
                                        |                ++
                                        |                |-+
                                        |   controllers  |-|
                                        |                |-|
                                        |                |-|
                                        +------------------|
                                         +-----------------|
                                          +----------------+
    
                           +----------------+        +----------------+
                           |                ++       |                ++
                           |   cell 1       |-+      |   cell 2       |-+
                           |   controllers  |-|      |   controllers  |-|
                           |                |-|      |                |-|
                           +------------------|      +------------------|
                            +-----------------|       +-----------------|
                             +----------------+        +----------------+
    
        +--------------+     +--------------+        +--------------+     +--------------+
        |              |     |              |        |              |     |              |
        |   cell 1     |     |   cell 1     |        |   cell 2     |     |   cell 2     |
        |   compute 1  |     |   compute 2  |        |   compute 1  |     |   compute 2  |
        |              |     |              |        |              |     |              |
        +--------------+     +--------------+        +--------------+     +--------------+
    
    Shared cell controller topology
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    .. note::
    
       This topology is not yet supported by Kolla Ansible.
    
    An alternative configuration is to place the cell control services for multiple
    cells on a single shared group of cell controllers. This might allow for more
    efficient use of hardware where the control services for a single cell do not
    fully consume the resources of a set of cell controllers::
    
                                        +----------------+
                                        |                ++
                                        |                |-+
                                        |   controllers  |-|
                                        |                |-|
                                        |                |-|
                                        +------------------|
                                         +-----------------|
                                          +----------------+
    
                                        +----------------+
                                        |                ++
                                        |   shared cell  |-+
                                        |   controllers  |-|
                                        |                |-|
                                        +------------------|
                                         +-----------------|
                                          +----------------+
    
        +--------------+     +--------------+        +--------------+     +--------------+
        |              |     |              |        |              |     |              |
        |   cell 1     |     |   cell 1     |        |   cell 2     |     |   cell 2     |
        |   compute 1  |     |   compute 2  |        |   compute 1  |     |   compute 2  |
        |              |     |              |        |              |     |              |
        +--------------+     +--------------+        +--------------+     +--------------+
    
    Databases & message queues
    --------------------------
    
    The global services require access to a database for the API and cell0
    databases, in addition to a message queue. Each cell requires its own database
    and message queue instance. These could be separate database and message queue
    clusters, or shared database and message queue clusters partitioned via
    database names and virtual hosts. Currently Kolla Ansible supports deployment
    of a shared database cluster and a shared message queue cluster.
    
    Configuration
    =============
    
    .. seealso::
    
       Configuring Kolla Ansible for deployment of multiple cells typically
       requires use of :ref:`inventory host and group variables
       <multinode-host-and-group-variables>`.
    
    Enabling multi-cell support
    ---------------------------
    
    Support for deployment of multiple cells is disabled by default, in which case
    Nova is deployed in single conductor mode.
    
    Deployment of multiple cells may be enabled by setting ``enable_cells`` to
    ``yes`` in ``globals.yml``. This deploys nova in superconductor mode, with
    separate conductors for each cell.
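
    For example, a minimal sketch of the relevant setting in ``globals.yml``:

    .. code-block:: yaml

       enable_cells: "yes"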
    
    Naming cells
    ------------
    
    By default, all cell services are deployed in a single unnamed cell. This
    behaviour is backwards compatible with previous releases of Kolla Ansible.
    
    To deploy hosts in a different cell, set the ``nova_cell_name`` variable
    for the hosts in the cell. This can be done either using host variables or
    group variables.
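
    For example, a group variables file for the hosts of a hypothetical ``cell1``
    group might contain:

    .. code-block:: yaml

       nova_cell_name: cell1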
    
    Groups
    ------
    
    In a single cell deployment, the following Ansible groups are used to determine
    the placement of services:
    
    * ``compute``: ``nova-compute``, ``nova-libvirt``, ``nova-ssh``
    * ``nova-compute-ironic``: ``nova-compute-ironic``
    * ``nova-conductor``: ``nova-conductor``
    * ``nova-novncproxy``: ``nova-novncproxy``
    * ``nova-serialproxy``: ``nova-serialproxy``
    * ``nova-spicehtml5proxy``: ``nova-spicehtml5proxy``
    
    In a multi-cell deployment, this is still necessary - compute hosts must be in
    the ``compute`` group. However, to provide further control over where cell
    services are placed, the following variables are used:
    
    * ``nova_cell_compute_group``
    * ``nova_cell_compute_ironic_group``
    * ``nova_cell_conductor_group``
    * ``nova_cell_novncproxy_group``
    * ``nova_cell_serialproxy_group``
    * ``nova_cell_spicehtml5proxy_group``
    
    For backwards compatibility, these are set by default to the original group
    names. For a multi-cell deployment, each variable should be set to the name of
    a group containing only the hosts in that cell that should run the
    corresponding service.
    
    Example
    ~~~~~~~
    
    In the following example we have two cells, ``cell1`` and ``cell2``. Each cell
    has two compute nodes and a cell controller.
    
    Inventory:
    
    .. code-block:: INI
    
       [compute:children]
       compute-cell1
       compute-cell2
    
       [nova-conductor:children]
       cell-control-cell1
       cell-control-cell2
    
       [nova-novncproxy:children]
       cell-control-cell1
       cell-control-cell2
    
       [nova-spicehtml5proxy:children]
       cell-control-cell1
       cell-control-cell2
    
       [nova-serialproxy:children]
       cell-control-cell1
       cell-control-cell2
    
       [cell1:children]
       compute-cell1
       cell-control-cell1
    
       [cell2:children]
       compute-cell2
       cell-control-cell2
    
       [compute-cell1]
       compute01
       compute02
    
       [compute-cell2]
       compute03
       compute04
    
       [cell-control-cell1]
       cell-control01
    
       [cell-control-cell2]
       cell-control02
    
    Cell1 group variables (``group_vars/cell1``):
    
    .. code-block:: yaml
    
       nova_cell_name: cell1
       nova_cell_compute_group: compute-cell1
       nova_cell_conductor_group: cell-control-cell1
       nova_cell_novncproxy_group: cell-control-cell1
       nova_cell_serialproxy_group: cell-control-cell1
       nova_cell_spicehtml5proxy_group: cell-control-cell1
    
    Cell2 group variables (``group_vars/cell2``):
    
    .. code-block:: yaml
    
       nova_cell_name: cell2
       nova_cell_compute_group: compute-cell2
       nova_cell_conductor_group: cell-control-cell2
       nova_cell_novncproxy_group: cell-control-cell2
       nova_cell_serialproxy_group: cell-control-cell2
       nova_cell_spicehtml5proxy_group: cell-control-cell2
    
    Note that, for completeness, these example cell group variables specify groups
    for all console proxy services. You will need to ensure that there are no port
    collisions. For example, if both cell1 and cell2 use the default ``novncproxy``
    console proxy, you could add ``nova_novncproxy_port: 6082`` to the cell2 group
    variables to prevent a collision with cell1, as shown below.
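
    A minimal sketch of the corresponding addition to ``group_vars/cell2``:

    .. code-block:: yaml

       nova_novncproxy_port: 6082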
    
    Databases
    ---------
    
    The database connection for each cell is configured via the following
    variables:
    
    * ``nova_cell_database_name``
    * ``nova_cell_database_user``
    * ``nova_cell_database_password``
    * ``nova_cell_database_address``
    * ``nova_cell_database_port``
    
    By default the MariaDB cluster deployed by Kolla Ansible is used.  For an
    unnamed cell, the ``nova`` database is used for backwards compatibility.  For a
    named cell, the database is named ``nova_<cell name>``.
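
    For example, a sketch of cell group variables pointing a cell at a different
    database (all values shown are illustrative):

    .. code-block:: yaml

       nova_cell_database_name: nova_cell1
       nova_cell_database_user: nova_cell1
       nova_cell_database_address: 10.0.0.50
       nova_cell_database_port: 3306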
    
    Message queues
    --------------
    
    The RPC message queue for each cell is configured via the following variables:
    
    * ``nova_cell_rpc_user``
    * ``nova_cell_rpc_password``
    * ``nova_cell_rpc_port``
    * ``nova_cell_rpc_group_name``
    * ``nova_cell_rpc_transport``
    * ``nova_cell_rpc_vhost``
    
    And for notifications:
    
    * ``nova_cell_notify_user``
    * ``nova_cell_notify_password``
    * ``nova_cell_notify_port``
    * ``nova_cell_notify_group_name``
    * ``nova_cell_notify_transport``
    * ``nova_cell_notify_vhost``
    
    By default the message queue cluster deployed by Kolla Ansible is used. For an
    unnamed cell, the ``/`` virtual host used by all OpenStack services is used for
    backwards compatibility.  For a named cell, a virtual host named ``nova_<cell
    name>`` is used.
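
    For example, a sketch of cell group variables overriding the RPC user and
    virtual host for a cell (the values shown are illustrative):

    .. code-block:: yaml

       nova_cell_rpc_user: nova_cell1
       nova_cell_rpc_vhost: nova_cell1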
    
    Conductor & API database
    ------------------------
    
    By default the cell conductors are configured with access to the API database.
    This is currently necessary for `some operations
    <https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls>`__
    in Nova which require an *upcall*.
    
    If those operations are not required, it is possible to prevent cell conductors
    from accessing the API database by setting
    ``nova_cell_conductor_has_api_database`` to ``no``.
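
    For example, this could be set in ``globals.yml``:

    .. code-block:: yaml

       nova_cell_conductor_has_api_database: "no"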
    
    Console proxies
    ---------------
    
    General information on configuring console access in Nova is available in
    :ref:`Nova consoles <nova-consoles>`. For deployments with multiple cells, the
    console proxies for each cell must be accessible via a unique endpoint. We
    achieve this by adding an HAProxy frontend for each cell that forwards to the
    console proxies for that cell. Each frontend must use a different port. The
    ports may be configured via the following variables:
    
    * ``nova_novncproxy_port``
    * ``nova_spicehtml5proxy_port``
    * ``nova_serialproxy_port``
    
    Ironic
    ------
    
    Currently all Ironic-based instances are deployed in a single cell. The name of
    that cell is configured via ``nova_cell_ironic_cell_name``, and defaults to the
    unnamed cell. ``nova_cell_compute_ironic_group`` can be used to set the group
    that the ``nova-compute-ironic`` services are deployed to.
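
    For example, these could be set as follows (the cell name ``cell1`` and group
    name ``ironic-control`` are illustrative):

    .. code-block:: yaml

       nova_cell_ironic_cell_name: cell1
       nova_cell_compute_ironic_group: ironic-control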
    
    Deployment
    ==========
    
    Deployment in a multi-cell environment is performed in the same way as in a
    single-cell environment, using the ``kolla-ansible deploy`` command.
    
    Scaling out
    -----------
    
    A common operational task in large scale environments is to add new compute
    resources to an existing deployment. In a multi-cell environment it is likely
    that these will all be added to one or more new or existing cells. Ideally we
    would not risk affecting other cells, or even the control hosts, when deploying
    these new resources.
    
    The Nova cells support in Kolla Ansible has been built such that it is possible
    to add new cells or extend existing ones without affecting the rest of the
    cloud. This is achieved via the ``--limit`` argument to ``kolla-ansible``. For
    example, if we are adding a new cell ``cell03`` to an existing cloud, and all
    hosts for that cell (control and compute) are in a ``cell03`` group, we could
    use this as our limit:
    
    .. code-block:: console
    
       kolla-ansible deploy --limit cell03
    
    When adding a new cell, we also need to ensure that HAProxy is configured for
    the console proxies in that cell:
    
    .. code-block:: console
    
       kolla-ansible deploy --tags haproxy
    
    Another benefit of this approach is that it should be faster to complete, as
    the number of hosts Ansible manages is reduced.
    
    .. _nova-cells-upgrade:
    
    Upgrades
    ========
    
    As with deployment, upgrades in a multi-cell environment can be performed in
    the same way as in single-cell environments, via ``kolla-ansible upgrade``.
    
    Staged upgrades
    ---------------
    
    .. note::
    
       Staged upgrades are not applicable when ``nova_safety_upgrade`` is ``yes``.
    
    In large environments the risk involved with upgrading an entire site can be
    significant, and the ability to upgrade one cell at a time is crucial. This
    is very much an advanced procedure, and operators attempting this should be
    familiar with the :nova-doc:`Nova upgrade documentation <user/upgrade>`.
    
    Here we use Ansible tags and limits to control the upgrade process. We will
    only consider the Nova upgrade here. It is assumed that all dependent services
    have been upgraded (see ``ansible/site.yml`` for correct ordering).
    
    The first step, which may be performed in advance of the upgrade, is to apply
    the database schema migrations:

    .. code-block:: console

       kolla-ansible upgrade --tags nova-bootstrap
    
    Next, we upgrade the global services:

    .. code-block:: console

       kolla-ansible upgrade --tags nova-api-upgrade
    
    Now the cell services can be upgraded. This can be performed in batches of
    one or more cells at a time, using ``--limit``. For example, to upgrade
    services in ``cell03``:
    
    .. code-block:: console

       kolla-ansible upgrade --tags nova-cell-upgrade --limit cell03
    
    At this stage, we might wish to perform testing of the new services, to check
    that they are functioning correctly before proceeding to other cells.
    
    Once all cells have been upgraded, we can reload the services to remove RPC
    version pinning, and perform online data migrations:

    .. code-block:: console

       kolla-ansible upgrade --tags nova-reload,nova-online-data-migrations
    
    The Nova upgrade is now complete, and upgrading of other services may continue.