Learn Terraform – Define a virtual machine scale set

Now that we have one VM serving a web site, a common pattern is to deploy not just one VM but several, and distribute the load across them. In Azure, this feature is called a virtual machine scale set (see the docs).

To build this in Terraform we need the azurerm_linux_virtual_machine_scale_set resource type. The documentation shows a sample of how to use it.

Please read first!

But CAUTION – I have gone through this several times and tried a lot of possible parameters to deploy the scale set including the Apache web server. I could not find out why the configuration of the custom script extension does not work during the initial deployment. Only if you change the VM count after the deployment will the custom script be deployed. You can see this issue here.

So I will go through the whole sample, and afterward I would like to show how I would build the sample from Yevgeniy Brikman’s book by leveraging App Services in Azure.

Let’s first go the route of the virtual machine scale set:
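A note before we start: the snippets below assume an azurerm provider configuration is already in place, for example from the previous posts in this series. Roughly like this (2.x syntax):

```hcl
# assumed provider configuration (azurerm 2.x syntax)
provider "azurerm" {
  features {}
}
```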

We need a resource group to deploy everything to.

### Resource Group
resource "azurerm_resource_group" "rg" {
  name     = "rg-vmssssample-test"
  location = "East US"
  tags = {
    App    = "VMSS"
    Source = "Terraform"
  }
}

In this sample, we start using tags at the resource group level for the app we deploy, the source, and the kind of environment we have. Also, I want to establish a naming convention based on the Microsoft best practices shared in this article.

So for a resource group, the suggested pattern is:
rg-<app or service name>-<subscription type>-<###>

Next – the vNet

### Network
resource "azurerm_virtual_network" "vNet" {
  name                = "vnet-shared-eastus-001"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  address_space       = ["10.0.0.0/16"]
  tags = azurerm_resource_group.rg.tags
}

In the VNet we have to define the internal subnet for the VMs in the scale set:

### Subnet
resource "azurerm_subnet" "sNet" {
  name                 = "snet-shared-vmsssample-001"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vNet.name
  address_prefix       = "10.0.2.0/24"
}

In my script, I add the following resource in front of the VM scale set definition, because I want to assign an FQDN to the public IP attached to the load balancer. Terraform has a helpful resource to build a random string to be used for the FQDN:

### Random FQDN String
resource "random_string" "fqdn" {
  length  = 6
  special = false
  upper   = false
  number  = false
}
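One piece the load balancer below depends on is the public IP azurerm_public_ip.vmss-pip, which is referenced but not shown in the original listing. Here is a minimal sketch of what it could look like, with the random string as the DNS label – the resource name vmss-pip is taken from the reference in the load balancer; the name value is my assumption:

### Public IP for the load balancer

```hcl
resource "azurerm_public_ip" "vmss-pip" {
  name                = "pip-vmsssample-test-001" # assumed name
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Static"
  domain_name_label   = random_string.fqdn.result # builds the FQDN
  tags                = azurerm_resource_group.rg.tags
}
```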

Implementing a load balancer in our script

The common design pattern is to deploy a load balancer in front of the VMs in the scale set. With this, the incoming traffic can be distributed between the virtual machines in the scale set. We add a load balancer definition to the script:

### Loadbalancer definition
resource "azurerm_lb" "vmsssample" {
  name                = "lb-vmsssample-test-001"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  frontend_ip_configuration {
    name                 = "ipconf-PublicIPAddress-test"
    public_ip_address_id = azurerm_public_ip.vmss-pip.id
  }

  tags = azurerm_resource_group.rg.tags
}

The load balancer needs some more configuration. We need to define a backend IP pool as well as a probe to check the health status of VMs in the backend pool:

### Define the backend pool
resource "azurerm_lb_backend_address_pool" "vmsssample" {
  resource_group_name = azurerm_resource_group.rg.name
  loadbalancer_id     = azurerm_lb.vmsssample.id
  name                = "ipconf-BackEndAddressPool-test"
}

### Define the lb probes
resource "azurerm_lb_probe" "vmsssample" {
  resource_group_name = azurerm_resource_group.rg.name
  loadbalancer_id     = azurerm_lb.vmsssample.id
  name                = "http-running-probe"
  port                = 80
}

The last step in the configuration is the load balancing rule – that is, which port should be balanced:

### Define the lb rule
resource "azurerm_lb_rule" "vmsssample" {
  resource_group_name            = azurerm_resource_group.rg.name
  loadbalancer_id                = azurerm_lb.vmsssample.id
  name                           = "http"
  protocol                       = "Tcp"
  frontend_port                  = 80
  backend_port                   = 80
  backend_address_pool_id        = azurerm_lb_backend_address_pool.vmsssample.id
  frontend_ip_configuration_name = "ipconf-PublicIPAddress-test"
  probe_id                       = azurerm_lb_probe.vmsssample.id
}

Now we have defined the basic components of our architecture and can go ahead. As in our sample for a single VM, it is important to define the network security group. But we do not need the SSH port opened; we just need port 80 for our web server.

### Define the NSG
resource "azurerm_network_security_group" "vmsssample" {
  name                = "nsg-weballow-001"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  security_rule {
    name                       = "WebServer"
    priority                   = 1002
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "80"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}

Finally, the VM scale set itself:

### The VM Scale Set (VMSS)
resource "azurerm_linux_virtual_machine_scale_set" "vmsssample" {
  name                = "vmss-vmsssample-test-001"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  sku                 = "Standard_B2s"
  instances           = 1
  admin_username      = "adminuser"
  admin_password      = "Password1234!"
  disable_password_authentication = false
  tags                = azurerm_resource_group.rg.tags

  #### define the os image
  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  #### define the os disk
  os_disk {
    storage_account_type = "Standard_LRS"
    caching              = "ReadWrite"
  }

  #### define the network
  network_interface {
    name                      = "nic-01-vmsssample-test-001"
    primary                   = true
    network_security_group_id = azurerm_network_security_group.vmsssample.id

    ip_configuration {
      name      = "ipconf-vmssample-test"
      primary   = true
      subnet_id = azurerm_subnet.sNet.id
      load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.vmsssample.id]
    }
  }
}
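With all resources defined, deployment follows the usual Terraform workflow (shown here for completeness):

```shell
terraform init    # download the azurerm and random providers
terraform plan    # preview what will be created
terraform apply   # create the resources in the Azure subscription
```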

Now we can plan our script and apply it to our Azure account. Once our VM scale set is up and running, we need our web server on the machines again. To achieve this, we deploy a new resource – “azurerm_virtual_machine_scale_set_extension”. It is essentially the same kind of extension we used for the single VM, so our additional entry in the script will look like this:

### Add the Webserver to the VMSS
resource "azurerm_virtual_machine_scale_set_extension" "vmsssampleextension" {
  name                         = "ext-vmsssample-test"
  virtual_machine_scale_set_id = azurerm_linux_virtual_machine_scale_set.vmsssample.id
  publisher                    = "Microsoft.Azure.Extensions"
  type                         = "CustomScript"
  type_handler_version         = "2.0"
  auto_upgrade_minor_version   = true
  force_update_tag             = true
  
  settings = jsonencode({
    commandToExecute = "apt-get -y update && apt-get install -y apache2"
  })
}

During my research on the web, I found that Terraform version 0.12 introduced the jsonencode function. With it, it is easier to convert a given value to JSON. I used this function for the commandToExecute attribute.
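For illustration, the jsonencode call above renders the same JSON you would otherwise write by hand as a heredoc string (the pre-0.12 style):

```hcl
# pre-0.12 style: hand-written JSON in a heredoc string
settings = <<SETTINGS
  {
    "commandToExecute": "apt-get -y update && apt-get install -y apache2"
  }
SETTINGS

# 0.12+ style: let Terraform render the JSON
settings = jsonencode({
  commandToExecute = "apt-get -y update && apt-get install -y apache2"
})
```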

But

If we now deploy our script to Azure, we will have all components in place for a virtual machine scale set with a web server installed. As mentioned at the beginning, the custom script extension does not work as expected. If you go to the portal and change the number of deployed instances in the scaling options of the scale set, the custom script extension will be deployed to the VMs. If we then browse to the URL of the public IP, we will be presented with the Apache web server’s default website.

VMSS scaling Option

So after scaling up, our deployment will show the desired state when browsing to the FQDN.

The next post will show the deployment using Azure App Services to solve the same challenge and add a real website to the script.