Installing a VIRL OpenStack Cluster

Why Use a VIRL OpenStack Cluster?

Most users will install VIRL onto a single bare-metal or virtual machine node. Every node (router, switch, server, or Linux Container (LXC)) used within a given VIRL simulation runs on this single machine.

Depending on the number and type of nodes used, it is possible for a simulation to require more compute and memory resources than can be supported by a single machine.

VIRL on OpenStack Clusters enables you to combine multiple machines (up to five) into a cluster, and to distribute the nodes in large, resource-intensive simulations across this cluster so they can take advantage of the additional compute and memory resources.

At a minimum, a cluster must be composed of one 'controller' and one 'compute' node. Today, VIRL clusters can be scaled to a maximum of one controller and four compute nodes.

VIRL OpenStack Cluster Terminology

Term Description
Controller The primary VIRL node that includes a complete installation of the VIRL server software, including full compute, storage, and network functionality and all of the node and container images.
Compute node A node that includes a partial installation of the VIRL server software that enables it to provide additional compute and networking resources for use by a VIRL simulation.
Cluster A collection of nodes operating in concert. At a minimum, a cluster is composed of one 'controller' and one 'compute' node.
VIRL Server Image A standard VIRL installation source (OVA or ISO) that contains the full complement of VIRL software.
VIRL Compute Image A VIRL installation source (OVA or ISO) that contains only the VIRL software necessary to provide compute and networking services.

When using a VIRL cluster, the IP address used to reach the User Workspace Manager and used by VM Maestro is that of the Controller.

Installation Requirements

Just so you know.

The instructions that follow were developed for use with Cisco UCS C-series servers - either bare-metal or running vSphere ESXi.  You may need to adapt them to suit other site-unique deployment scenarios.

Cluster-Member Resources

To successfully deploy a VIRL OpenStack Cluster, please ensure that the following minimum requirements are met:

Software

Please observe the following minimum requirements for VIRL software when deploying VIRL OpenStack Clusters:
Node-Type VIRL Software Release
Controller 1.3.286 or later
Compute Node 1.3.286 or later

Network Time Protocol (NTP)

Every bare-metal node, ESXi host, and cluster member must be configured to synchronize properly with a valid NTP clock source.

If you are in a lab or other environment where special requirements apply to the use of NTP, please work with your network administrators to ensure that NTP is properly and successfully configured.
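
On the Ubuntu-based VIRL nodes, synchronization can be checked from the command line.  The commands below are a minimal sketch that assumes the standard 'ntp' service is in use; substitute your site's clock sources as needed.

    # Confirm the NTP service is running and inspect peer associations.
    # A '*' next to a server indicates a valid, selected clock source.
    sudo service ntp status
    sudo ntpq -p

    # To point the node at site-specific clock sources, edit /etc/ntp.conf,
    # replace the default 'server' lines, and restart the service.
    sudo nano /etc/ntp.conf
    sudo service ntp restart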

Networking

The VIRL networks are named 'Management', 'Flat', 'Flat1', 'SNAT', and 'INT'.  The 'Management' network carries management traffic, 'Flat' and 'Flat1' provide Layer-2 connectivity, 'SNAT' provides Layer-3 connectivity, and 'INT' carries the cluster control-plane and data-plane traffic.

Each of the five required interfaces on a cluster member is connected to these networks in order, as shown below for both bare-metal and virtualized environments.

Bare-Metal Interface Mapping

In bare-metal deployments, multiple LAN switches or VLANs must be used to provide seamless, isolated connectivity for each of the VIRL networks. The five physical network interfaces on each node are connected as illustrated below:

Interface Switch or VLAN
eth0 Management
eth1 Flat
eth2 Flat1
eth3 SNAT
eth4 INT

vSphere ESXi Interface Mapping

In vSphere ESXi deployments, multiple port-groups should be used to provide seamless, isolated connectivity for each of the VIRL networks. The five virtual network interfaces (vNICs) on each virtual machine are connected as illustrated below:

vNIC Port-Group
eth0 VM Network (default)
eth1 Flat
eth2 Flat1
eth3 SNAT
eth4 INT

The default vSphere ESXi port-group used for the 'Management' network is 'VM Network', but any port-group may be used.  Please adapt as needed to conform to site-specific configurations.

The 'Flat' and 'Flat1' port-groups must be configured to allow 'Promiscuous Mode' in order to allow communication between nodes running in different simulations.  Please refer to the vSphere Client or Web-Client installation sections for detailed steps.
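
If you prefer to make the change from the ESXi command line instead of the vSphere Client, the security policy of a standard-vSwitch port-group can also be set with 'esxcli', as sketched below.  The port-group names and the use of a standard (non-distributed) vSwitch are assumptions; adjust them to match your host.

    # Allow promiscuous mode on the 'Flat' and 'Flat1' port-groups (standard vSwitch).
    esxcli network vswitch standard portgroup policy security set \
        --portgroup-name=Flat --allow-promiscuous=true
    esxcli network vswitch standard portgroup policy security set \
        --portgroup-name=Flat1 --allow-promiscuous=true

    # Verify the resulting policy.
    esxcli network vswitch standard portgroup policy security get --portgroup-name=Flat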

In deployments where cluster members are spread across multiple vSphere ESXi hosts, care must be taken to ensure that seamless connectivity is maintained for each VIRL network.  This can be done in one of two ways:


Regardless of the method used, the logical connectivity between ESXi hosts and within the VIRL OpenStack Cluster must be as illustrated below:

Ethernet MTU Considerations

When configured for clusters, the VIRL controller and compute nodes use Virtual Extensible LAN (VXLAN) over the 'INT' / eth4 network to provide a communications path between virtual routers, switches, and other nodes within a simulation that exist on different compute nodes.

To account for the IP, UDP, VXLAN, and any 802.1Q headers that may be present, while still allowing 1500-byte frames to be conveyed between virtual end-points, it is necessary to configure Ethernet jumbo frames on all physical and/or virtual switching elements that service the 'INT' / eth4 network between the VIRL controller and compute nodes.

Specifically, please ensure that jumbo frames are enabled on all such switching elements before deploying the VIRL controller or compute nodes.
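
For vSphere ESXi deployments, the MTU of the standard vSwitch backing the 'INT' port-group can be raised from the ESXi command line, as sketched below.  The vSwitch name 'vSwitch1' and the 9000-byte MTU are assumptions; use the vSwitch that actually carries the 'INT' port-group, and apply the equivalent jumbo-frame setting on physical switches in bare-metal deployments.

    # Raise the MTU of the vSwitch carrying the 'INT' port-group to 9000 bytes.
    esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000

    # Confirm the new MTU value.
    esxcli network vswitch standard list --vswitch-name=vSwitch1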

IGMP Snooping / Querier

VXLAN uses IP multicast to transport Layer-2 broadcast, unknown end-point, and multicast traffic.  To assist with end-point discovery and group management, please ensure that IGMP snooping is enabled, and that an IGMP querier is present, on the switches servicing the 'INT' network.

Failure to ensure that the above requirements for Jumbo Frames, IGMP Snooping, and IGMP Queriers are met will prevent nodes within simulations from communicating.
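
Once the cluster members are running, one way to sanity-check multicast behaviour from the Linux side is to look at the multicast groups joined on the 'INT' interface and watch for IGMP traffic.  This is a diagnostic sketch only - it assumes 'eth4' is the 'INT' interface and that 'tcpdump' is installed - and it does not replace configuring snooping and a querier on your switches.

    # List the multicast groups joined on the 'INT' interface; the VXLAN
    # multicast group used by the cluster should appear here.
    ip maddr show dev eth4

    # Watch for IGMP membership reports and queries on the 'INT' interface.
    sudo tcpdump -i eth4 -n igmp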

Interface Addressing

The default interface addressing convention for VIRL on OpenStack Clusters is described below.  The addresses for the 'Management', 'Flat', 'Flat1', and 'SNAT' networks can and should be adjusted to suit your exact deployment requirements when necessary.

Interface Controller Compute-1 Compute-2 Compute-3 Compute-4
eth0 DHCP or Static DHCP or Static DHCP or Static DHCP or Static DHCP or Static
eth1 172.16.1.254 172.16.1.241 172.16.1.242 172.16.1.243 172.16.1.244
eth2 172.16.2.254 172.16.2.241 172.16.2.242 172.16.2.243 172.16.2.244
eth3 172.16.3.254 172.16.3.241 172.16.3.242 172.16.3.243 172.16.3.244
eth4 172.16.10.250 172.16.10.241 172.16.10.242 172.16.10.243 172.16.10.244

You must not change the subnet used for the 'INT' network.  This must remain on the 172.16.10.0/24 subnet, and the Controller must be assigned 172.16.10.250 on interface 'eth4'.

If you are installing a VIRL OpenStack Cluster alongside an existing standalone VIRL deployment you must ensure that they remain isolated using distinct switches, VLANs, or port-groups.  Otherwise, conflicts will occur on one or more of the Controller interfaces.

Controller Deployment

Installing the Controller Software

The Controller in a VIRL OpenStack Cluster is adapted from a VIRL standalone node.  As such, start by using the menu at the left to select and follow the full installation process appropriate to your target environment.

Do not proceed until you have fully installed, configured, licensed, and verified your VIRL node using the installation process described for your environment.  A validated connection to the VIRL Salt infrastructure is required for successful cluster deployment.

Configuring the Controller

The first series of steps informs the new VIRL standalone node that it will be operating in a cluster:

  1. Log in to the controller at the IP address recorded during installation using the username 'virl' and password 'VIRL':

    ssh virl@<controller-ip-address>

  2. Make a copy of the VIRL configuration file 'virl.ini':

    sudo cp /etc/virl.ini /etc/virl.ini.orig

  3. Open 'virl.ini' using the 'nano' editor:

    sudo nano /etc/virl.ini

  4. Locate the configuration element 'virl_cluster: FALSE' and change its value to 'TRUE'.

Next you must identify how many compute nodes - from 1 to 4 - will be present in the cluster:

  1. Locate the configuration element for the first compute node, 'compute1_active:', and if it is set to 'FALSE', change its value to 'TRUE'.


  2. Repeat for each additional compute node to be included in the cluster (2 through 4), as illustrated in the sketch following this list.
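
For reference, the relevant portion of '/etc/virl.ini' for a two-compute-node cluster would then resemble the sketch below.  Only the cluster and 'computeN_active' flags are shown, and the comment line is illustrative; the surrounding lines in your file may differ.

    # /etc/virl.ini (excerpt) - enable clustering with two compute nodes
    virl_cluster: True
    compute1_active: True
    compute2_active: True
    compute3_active: False
    compute4_active: False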

Save and apply the configuration changes:

  1. Enter 'Control-X', 'Y' and 'Enter' to save the file and exit.


  2. Apply the changes and update the controller's Salt configuration using the following commands:

    vinstall salt
    sudo salt-call state.sls virl.cluster
    vinstall salt
    sudo service virl-std restart
    sudo service virl-uwm restart

Continue configuring the controller using the User Workspace Manager (UWM):

  1. Open a browser and navigate to the controller's UWM interface - 'http://<controller-ip-address>:19400'.


  2. Log in using the username 'uwmadmin' and password 'password'.


  3. Select 'VIRL Server' from the menu.


  4. Select 'System Configuration'.


The 'System Configuration' page will include tabs for each of the enabled compute nodes.

  1. For each compute node in your cluster, select the appropriate tab and adjust its configuration to match your environment.

    The 'Cluster Configuration and Defaults' section below includes a description of each cluster configuration element and the default values associated with each of the four possible compute nodes.


  2. Once you have made all of the necessary changes, select the 'Apply Changes' button and follow the instructions that are provided to reconfigure VIRL.


Restarting the Controller

You must now restart the controller in order to finalize the configuration and services:

  1. Restart the controller:

    sudo reboot now

Compute Node Deployment

Compute nodes are installed using specialized OVAs or ISOs named 'compute.n.n.n.ova' or 'compute.n.n.n.iso', respectively.  Please refer to your license confirmation email for information on how to download these installation sources.

Install the Compute Node Software

The installation of a compute node starts as an abbreviated version of a standalone installation:

Are your compute nodes up already?

If you jumped ahead and deployed your compute nodes before you configured your controller, you must reboot them now.  Compute nodes try to initialize with the controller on startup and will not retry if the controller is not found.

  1. Download the compute node OVA or ISO files (1 up to 4, depending on the number of compute nodes you wish to deploy).

  2. Deploy each of the OVAs or ISOs to a different server in your vCenter / ESXi environment (using the compute node OVAs) or to a different bare metal server (using the compute node ISOs).

    Pay attention to your port-groups.

    It is critical that the VIRL cluster controller and compute nodes share common port-groups for each of the five network interfaces.  These are typically 'VM Network', 'Flat', 'Flat1', 'SNAT', and 'INT'.


  3. Boot the newly deployed compute node virtual machine or bare-metal server.

Validate Compute Node Operations

Once each compute node has been deployed and booted, continue by validating connectivity to the controller and enabling the VIRL compute node software using the steps below:

  1. Log in to the compute node using the username 'virl' and password 'VIRL':

    ssh virl@<compute-node-ip-address>

  2. Ensure that connectivity to the controller exists:

    ping 172.16.10.250

  3. Confirm that connectivity to the controller exists.  If it does not, investigate and resolve any virtual or physical networking issues before proceeding.
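
As an optional extra check (not part of the documented procedure), you can confirm that jumbo frames actually traverse the 'INT' path, since an ordinary ping will succeed even when the MTU is misconfigured.  The example below assumes an MTU of 9000 on 'eth4':

    # From the compute node, send non-fragmentable 9000-byte packets to the controller.
    # 8972 bytes of ICMP data + 28 bytes of IP/ICMP headers = 9000 bytes on the wire.
    ping -M do -s 8972 -c 5 172.16.10.250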

Cluster Validation

Once the controller and each compute node have been deployed, you should validate that the cluster is properly configured and operational:

  1. Log in to the controller at the IP address recorded during installation using the username 'virl' and password 'VIRL':

    ssh virl@<controller-ip-address>

  2. Verify that each compute node is registered with OpenStack Nova:

    nova service-list

    In the example below there are five 'nova-compute' services registered - one on the controller and another for each compute node that has been deployed:


  3. Verify that each compute node is registered with OpenStack Neutron:

    neutron agent-list

    In the example below there are five 'Linux bridge agents' registered - one on the controller and another for each compute node that has been deployed:

At this point your VIRL OpenStack Cluster should be fully operational and you should be able to start a VIRL simulation and observe all of the nodes become 'Active'.
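
If you also want to confirm that instances are being scheduled onto the compute nodes rather than only onto the controller, the OpenStack CLI on the controller can list the instances hosted by each hypervisor.  This is an optional check; the hostname 'compute1' below is an example and should match the node names shown by 'nova service-list'.

    # List the instances currently running on a given compute node.
    nova hypervisor-servers compute1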

Cluster Troubleshooting

In situations where communication is lost between the controller and a compute node, or if a compute node becomes inoperable, you can determine the state of each compute node from the controller using the 'nova service-list' and 'neutron agent-list' commands you learned above.

For example, in the illustration below communication has been lost with 'compute4'.  Observe that Nova shows the node as 'down', and Neutron shows the agent as 'xxx' (dead):


In these circumstances, proper operation may be restored by restarting the affected compute node.

If restarting the compute node does not restore proper operation you may also want to check that the node has associated with a valid NTP clock source:

    sudo ntpq -p

Valid peer associations are indicated by a '*' alongside the clock-source name, as illustrated:

Cluster Maintenance

Adding Additional Compute Nodes

To add additional compute nodes to an existing VIRL cluster:

  1. Log in to the controller at the IP address recorded during installation using the username 'virl' and password 'VIRL'.

  2. Edit '/etc/virl.ini' and set the configuration element for the new compute node to 'True'.

  3. Apply the changes and update the controller's cluster configuration:

    vinstall salt
    sudo salt-call state.sls virl.cluster
    vinstall salt
    sudo service virl-std restart
    sudo service virl-uwm restart

  4. Complete the instructions for 'Compute Node Deployment' for the new compute node.

  5. Complete the instructions in 'Cluster Validation' to ensure that the new compute node was properly deployed.

Considerations for Cluster Use

The primary reason for deploying a VIRL cluster is to simulate large complex topologies, but these simulations can be very taxing on VIRL compute and networking resources.  The following guidelines should be followed to minimize simulation start times and ensure successful launches.

Increase Project Quotas

By default, VIRL projects permit 200 instances (nodes), 200 vCPUs, and 512,000 MB of RAM.  Running large simulations may require that you increase these limits using the UWM.  For example:
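
If you are comfortable with the OpenStack CLI on the controller, the quotas can also be inspected and raised there as a supplement to the UWM, as sketched below.  The tenant ID placeholder and the new limit values are illustrative, not recommendations.

    # Display the current quotas for a project.
    nova quota-show --tenant <tenant-id>

    # Raise the limits to suit your largest simulations.
    nova quota-update --instances 300 --cores 300 --ram 768000 <tenant-id>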



Enable RAMdisk

Enabling RAMdisk on systems with more than 32 GB of memory will improve simulation start times by caching router, switch, and other node images in memory, rather than reading the images from disk for each node:



Pacing Simulation Startup

When working with very large (>100 nodes) simulations it is recommended that the nodes be started in groups of 100 or fewer.  This can be achieved by selecting groups of nodes in VM Maestro and excluding them from launch, as shown:




Once the nodes that were not excluded have started, select the nodes that were excluded - again in groups of 100 or fewer - and start them, as shown:




Cluster Configuration and Defaults

The following configuration elements, defined in '/etc/virl.ini' or via the UWM, are used to describe a VIRL OpenStack Cluster configuration:

Parameter Default Description Notes
computeN_active False Specifies the absence or presence of the compute node 'N' in the cluster. Set to 'True' for each available compute node (1 through 4).
computeN_nodename computeN Specifies the hostname associated with the compute node. This field must match the nodename defined on the compute node.
computeN_public_port eth0 Specifies the name of the port used to reach the Internet on the compute node. This field must match exactly the public port name and format specified on the compute node.
computeN_using_dhcp_on_the_public_port True Specifies the addressing method used for the public port on the compute node. Set to 'False' if using Static IP addressing.
computeN_static_ip 10.10.10.10 The Static IP address assigned to the public port. Not used if DHCP is enabled.  Review and modify to match deployment requirements.
computeN_public_netmask 255.255.255.0 The network mask assigned to the public port. Not used if DHCP is enabled.  Review and modify to match deployment requirements.
computeN_public_gateway 10.10.10.1 The IP address of the default gateway assigned to the public port. Not used if DHCP is enabled.  Review and modify to match deployment requirements.
computeN_first_nameserver 8.8.8.8 The IP address of the first name-server assigned to the public port. Not used if DHCP is enabled.  Review and modify to match deployment requirements.
computeN_second_nameserver 8.8.4.4 The IP address of the second name-server assigned to the public port. Not used if DHCP is enabled.  Review and modify to match deployment requirements.
computeN_l2_port eth1 The name of the first layer-2 network port ('Flat') on the compute node. This field must match exactly the name and format specified on the compute node.
computeN_l2_address 172.16.1.24N The IP address assigned to the first layer-2 network port ('Flat') on the compute node. This field must match exactly the IP address specified on the compute node.  'N' must match the nodename / position in the cluster.
computeN_l2_port2 eth2 The name of the second layer-2 network port ('Flat1') on the compute node. This field must match exactly the name and format specified on the compute node.
computeN_l2_address2 172.16.2.24N The IP address assigned to the second layer-2 network port ('Flat1') on the compute node. This field must match exactly the IP address specified on the compute node.  'N' must match the nodename / position in the cluster.
computeN_l3_port eth3 The name of the layer-3 network port ('SNAT') on the compute node. This field must match exactly the name and format specified on the compute node.
computeN_l3_address 172.16.3.24N The IP address assigned to the layer-3 network port ('SNAT') on the compute node. This field must match exactly the IP address specified on the compute node.  'N' must match the nodename / position in the cluster.
computeN_internalnet_ip 172.16.10.24N The IP address assigned to the internal / cluster network interface ('eth4'). This field must match exactly the IP address specified on the compute node.  'N' must match the nodename / position in the cluster.

The default configuration elements for each compute node in a VIRL OpenStack Cluster are as follows:

Most of the configuration element defaults are identical across compute nodes; only the node name and the per-node IP addresses differ in the table below.

Parameter Compute Node 1 Compute Node 2 Compute Node 3 Compute Node 4
computeN_active False False False False
computeN_nodename compute1 compute2 compute3 compute4
computeN_public_port eth0 eth0 eth0 eth0
computeN_using_dhcp_on_the_public_port True True True True
computeN_static_ip 10.10.10.10 10.10.10.10 10.10.10.10 10.10.10.10
computeN_public_netmask 255.255.255.0 255.255.255.0 255.255.255.0 255.255.255.0
computeN_public_gateway 10.10.10.1 10.10.10.1 10.10.10.1 10.10.10.1
computeN_first_nameserver 8.8.8.8 8.8.8.8 8.8.8.8 8.8.8.8
computeN_second_nameserver 8.8.4.4 8.8.4.4 8.8.4.4 8.8.4.4
computeN_l2_port eth1 eth1 eth1 eth1
computeN_l2_address 172.16.1.241 172.16.1.242 172.16.1.243 172.16.1.244
computeN_l2_port2 eth2 eth2 eth2 eth2
computeN_l2_address2 172.16.2.241 172.16.2.242 172.16.2.243 172.16.2.244
computeN_l3_port eth3 eth3 eth3 eth3
computeN_l3_address 172.16.3.241 172.16.3.242 172.16.3.243 172.16.3.244
computeN_internalnet_port eth4 eth4 eth4 eth4
computeN_internalnet_ip 172.16.10.241 172.16.10.242 172.16.10.243 172.16.10.244
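
Pulling the tables together, a '/etc/virl.ini' fragment for a statically addressed 'compute1' that follows the defaults above would look roughly like the sketch below.  The public-port values (10.10.10.x and the Google name-servers) are the documented defaults and should be replaced with site-specific settings; treat this as illustrative rather than a file to copy verbatim.

    compute1_active: True
    compute1_nodename: compute1
    compute1_public_port: eth0
    compute1_using_dhcp_on_the_public_port: False
    compute1_static_ip: 10.10.10.10
    compute1_public_netmask: 255.255.255.0
    compute1_public_gateway: 10.10.10.1
    compute1_first_nameserver: 8.8.8.8
    compute1_second_nameserver: 8.8.4.4
    compute1_l2_port: eth1
    compute1_l2_address: 172.16.1.241
    compute1_l2_port2: eth2
    compute1_l2_address2: 172.16.2.241
    compute1_l3_port: eth3
    compute1_l3_address: 172.16.3.241
    compute1_internalnet_port: eth4
    compute1_internalnet_ip: 172.16.10.241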

End of Installation.