zCompute Load Mitigation

Introduction

zCompute includes three main capabilities to ensure load mitigation on the system:

  1. VM-Placement. When a new VM powers on, the VM-Placement service is designed to place the VM on the node that can best sustain the new load under a given set of placement rules. The many factors behind these decisions are not described in this document.

  2. VM live migration - the zCompute equivalent of VMware vMotion. It can be triggered manually by the admin or automatically by the load mitigation service.

  3. The load mitigation service - the zCompute equivalent of VMware DRS. It balances the cluster based on actual load after placement and when load patterns change in the cluster.

zCompute load mitigation service

The service monitors the load on all the nodes and continually calculates the optimal placement of the virtual machines across the cluster. However, moving a VM between nodes carries a performance cost, as it does in any virtualization solution (including VMware). This is why load mitigation only moves VMs when a node is genuinely overloaded and moving a VM will actually benefit that node. The VM selected for migration is the one that, if moved, will bring the node's load back to acceptable values while causing minimal impact on the migrated VM and on the destination node. This is not necessarily the VM that loads the node the most, since such VMs are usually much harder to migrate and need all the resources they can get.
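
To make the selection logic concrete, the following is a minimal illustrative sketch, not zCompute source code: it assumes the factory defaults listed further below (70% CPU/memory thresholds, 0.8 memory weight, 0.05 minimal improvement), and names such as Vm and pick_vm_to_move are hypothetical.

```python
# Illustrative sketch only: how a mitigation pass might pick a VM to move,
# based on the documented thresholds and minimal-improvement rule.
from __future__ import annotations
from dataclasses import dataclass

CPU_THRESHOLD = 0.7      # mitigation-cpu-threshold
MEM_THRESHOLD = 0.7      # mitigation-mem-threshold
MEM_WEIGHT = 0.8         # memory-weight-in-load (CPU weight is 1 - MEM_WEIGHT)
MIN_IMPROVEMENT = 0.05   # mitigation-minimal-improvement

@dataclass
class Vm:
    name: str
    cpu: float   # fraction of the node's CPU this VM consumes
    mem: float   # fraction of the node's memory this VM consumes

def load(cpu: float, mem: float) -> float:
    """Combined load: memory weighted by memory-weight-in-load, CPU by the rest."""
    return MEM_WEIGHT * mem + (1.0 - MEM_WEIGHT) * cpu

def pick_vm_to_move(node_cpu: float, node_mem: float, vms: list[Vm]) -> Vm | None:
    """Return a VM whose removal improves the node load by at least
    MIN_IMPROVEMENT, preferring the lightest such VM (minimal impact)."""
    if node_cpu < CPU_THRESHOLD and node_mem < MEM_THRESHOLD:
        return None  # node is not overloaded, nothing to do
    current = load(node_cpu, node_mem)
    candidates = [vm for vm in vms
                  if current - load(node_cpu - vm.cpu, node_mem - vm.mem) >= MIN_IMPROVEMENT]
    if not candidates:
        return None
    # Prefer the lightest qualifying VM, not the heaviest one on the node.
    return min(candidates, key=lambda vm: load(vm.cpu, vm.mem))
```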

All of the above (when to start migrating VMs, which parameters influence the selected VMs, and how aggressive the mitigation is) is fully configurable by the cloud_admin user. However, these settings are intended for L4 support use only, as they have been carefully tuned with factory defaults.

A full list of the configurable parameters:

Symphony > placement configure

| name | default | description | type | value |
| --- | --- | --- | --- | --- |
| mitigation-minimal-improvement | 0.05 | Minimal weight improvement for migrating a VM between nodes during load mitigation | config | 0.05 |
| max-placement-rate-tolerance | 0.05 | Tolerance of the max-placement-rate | config | 0.05 |
| vcpu-min-allocation | 0.1 | Minimum vCPU allocation for VMs (including Spot instances) | config | 0.1 |
| over-provision-size-mem | 0.4 | System over-provision factor for memory of on-demand VMs | config | 0.4 |
| mitigation-cpu-threshold | 0.7 | CPU usage threshold for triggering load mitigation | config | 0.7 |
| mitigation-mem-threshold | 0.7 | Memory usage threshold for triggering load mitigation | config | 0.7 |
| rule-consideration-load-threshold | 0.75 | If the load on a node is above this threshold, only mandatory rules are considered for placement. Below it, the load is ignored and only the number of soft rules that are complied with is used to rank nodes for placement. | config | 0.75 |
| default-vm-memory-share | 0.8 | Default maximum memory share reserved for VMs in a node | config | 0.8 |
| expected-cpu-usage-factor | 0.8 | Expected CPU usage factor of incoming VMs | config | 0.8 |
| expected-memory-usage-factor | 0.8 | Expected memory usage factor of incoming VMs | config | 0.8 |
| memory-weight-in-load | 0.8 | Weight given to memory when deciding VM/node load (CPU weight will be 1 - memory-weight) | config | 0.8 |
| default-vm-cpu-share | 0.9 | Default maximum CPU share reserved for VMs in a node | config | 0.9 |
| cpu-allocation-disparity | 1.0 | Disparity in CPU allocation as a function of priority (0 = disregard priority) | config | 1.0 |
| load-mitigation-enable | true | Enable load mitigation | config | true |
| mitigation-throttling-enable | true | Enable load mitigation throttling | config | true |
| evac-timeout-enable | true | Enable stopping non-compliant VMs after timeout when executing a hard compute rule | config | true |
| mitigation-invocation-period-lower-limit | 1 | Lower limit of the period between NRAD load mitigation process invocations | config | 1 |
| memory-oversubscription-factor | 1.0 | Memory oversubscription factor - ratio of VM RAM to physical RAM | config | 1.0 |
| swap-oversubscription-factor | 1.0 | Swap memory oversubscription factor - ratio of VM swap space to available swap space | config | 1.0 |
| suggestions-max-num | 3 | VM rejection: maximum number of suggestions on how to succeed in placing a rejected VM | config | 3 |
| cpu-oversubscription-factor | 4.0 | CPU oversubscription factor - ratio of vCPUs to cores | config | 4.0 |
| max-active-migrations | 4 | Maximal number of active migrations on a node | config | 4 |
| mitigation-throttling-period | 5 | Period between Placement Service load mitigation throttling invocations | config | 5 |
| over-provision-size-cpu | 5.66666666667 | System over-provision factor for CPU of on-demand VMs | config | 5.66666666667 |
| max-placement-rate | 10 | Weak upper limit to the number of placements per second | config | 10 |
| reservation-vm-timeout | 20 | Reservation object lifetime period. If the VM does not start during this period, the resources will be freed | config | 20 |
| pending-placement-timeout | 20 | Timeout for remembering pending VMs in the Placement Service | config | 20 |
| suggestions-tag-bias | 25 | VM rejection: bias of suggestions to remove tags (when a VM is rejected) over suggestions to free resources | config | 25 |
| node-max-age | 30 | Maximum delay in node metrics before the node is considered failed | config | 30 |
| vm-migration-attempt-cooldown | 60 | Minimum duration between migration attempts, per VM | config | 60 |
| mitigation-invocation-period-upper-limit | 60 | Upper limit of the period between NRAD load mitigation process invocations | config | 60 |
| vm-list-update-period | 300 | Period of NRAD VM list updates to the Placement Service | config | 300 |
| migration-request-timeout | 300 | Timeout for a migration attempt (per VM; the migration is assumed to have failed if the timeout is exceeded) | config | 300 |
| evac-stalled-event-timeout | 600 | Timeout from the last successful migration to an event about stalled maintenance | config | 600 |
| evac-timeout | 1200 | Timeout for migrating VMs when executing a hard compute rule, before non-compliant VMs are stopped | config | 1200 |
| blacklisted-nodes | [] | Blacklisted nodes on which new VMs cannot be placed. Nodes are specified either by UUID or by hostname (not name!) | config | [] |
| swappiness | 100, 100, 80, 0 | Swappiness of VMs per priority, starting from priority 0 (swappiness of system services is 60) | config | 100, 100, 80, 0 |
| required-services-for-placement | strato-snld, openstack-nova-compute | Required services on a node to receive VMs | config | strato-snld, openstack-nova-compute |
| suggestions-free-resources-format | Free up some %(resources)s in the cluster | VM rejection: format of the suggestion to free up resources (must include %(resources)s) | config | Free up some %(resources)s in the cluster |
| suggestions-remove-tag-format | Remove tag %(tag)s from VM | VM rejection: format of the suggestion to remove a single tag (must include %(tag)s) | config | Remove tag %(tag)s from VM |
| suggestions-remove-tags-format | Remove tags %(tags)s from VM | VM rejection: format of the suggestion to remove more than one tag (must include %(tags)s) | config | Remove tags %(tags)s from VM |
| mitigation-invocation-period | none | Approximate period between migration requests for load mitigation | status | 1 |
| placement-count | none | Number of VM placements | status | 787447 |
| evac-migration-count | none | Number of evac VM migrations | status | 0 |
| evac-migration-failures | none | Number of failures of evac VM migrations | status | 0 |
| migration-failures | none | Number of failures of load mitigation VM migrations | status | 0 |
| migration-count | none | Number of load mitigation VM migrations | status | 0 |
| last-placement | none | Time of last VM placement | status | 1626687126.0 |
| last-evac-migration | none | Time of last evac VM migration | status | none |
| last-migration | none | Time of last load mitigation VM migration | status | none |

This is a long list that affects both placement and load balancing. Some of the most important parameters are mitigation-minimal-improvement (do not move VMs whose migration improves the node load by less than 5%), mitigation-cpu-threshold (70%), mitigation-mem-threshold (70%), and load-mitigation-enable.
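
As a worked example of what the oversubscription factors mean, the arithmetic below only illustrates the documented ratios; the node size is hypothetical.

```python
# cpu-oversubscription-factor is the ratio of vCPUs to physical cores, and
# memory-oversubscription-factor is the ratio of VM RAM to physical RAM.
# For a hypothetical node with 32 cores and 256 GiB of RAM, using the defaults:
CPU_OVERSUBSCRIPTION = 4.0   # cpu-oversubscription-factor
MEM_OVERSUBSCRIPTION = 1.0   # memory-oversubscription-factor

cores, ram_gib = 32, 256
print(cores * CPU_OVERSUBSCRIPTION)    # 128.0 vCPUs may be allocated on the node
print(ram_gib * MEM_OVERSUBSCRIPTION)  # 256.0 GiB of VM RAM (no memory overcommit)
```

Note that default-vm-cpu-share and default-vm-memory-share additionally cap how much of a node is reserved for VMs, so the effective schedulable capacity may be lower than these raw figures.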

How to test Load balancing

  1. Use live migration to move VMs to a specific node until the node is fully, or almost fully, loaded memory-wise.

    1. Live migration invoked by the admin bypasses VM-Placement and forces the VM to move to a specific node, as long as the node can accept it, even at the expense of some memory overcommit.

      10-migrate-vm

    2. The VMs may be running workloads, but if they are not, open the heatmap view and verify that the node is indeed loaded with virtual resources.

      11-heatmap-view

    A node may be loaded with virtual resources while not actually being loaded in terms of real resource consumption, as can be seen in the following screenshot, where the memory pressure due to reservations is 100% (we reached that state by forcing VM migrations for the test itself).

    12-memory-pressure

  2. Now go into the VMs on the node and start running loads (a sketch of a suitable synthetic load follows this list). After a couple of minutes, VMs will start moving off the node. In the screenshot below, the VMs can be seen migrating while the CPU load on the node has reached 75%.

    13-vm-migrating

    In the event log, these migrations can be distinguished by the fact that the VMs do not show the account and project while being migrated.

    14-event-log
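
The "loads" in step 2 can be anything that pushes CPU and memory usage on the node above the mitigation thresholds. The following is a minimal, hypothetical load generator you could run inside each VM on the loaded node; the memory size and duration are placeholders to adjust to the VM's actual resources.

```python
# Minimal synthetic load: spin every CPU and hold some memory resident so that
# node CPU/memory usage climbs past the 70% mitigation thresholds.
import multiprocessing
import time

MEM_PER_WORKER_MIB = 512    # adjust to the VM's memory size
DURATION_SECONDS = 900      # keep the load up long enough for mitigation to react

def burn(mem_mib: int, seconds: int) -> None:
    ballast = bytearray(mem_mib * 1024 * 1024)  # allocate (and zero-fill) memory
    deadline = time.time() + seconds
    x = 0
    while time.time() < deadline:               # busy-loop to load the CPU
        x = (x * 31 + 7) % 1_000_003
    del ballast

if __name__ == "__main__":
    workers = [multiprocessing.Process(target=burn, args=(MEM_PER_WORKER_MIB, DURATION_SECONDS))
               for _ in range(multiprocessing.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```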