Automated installation

Introduction

This guide covers setting up your Composabl training cluster using Pulumi, an Infrastructure as Code tool.

This example uses Azure Kubernetes Service (AKS), but it can be adapted to other supported providers.

Prerequisites

  1. An Azure subscription with sufficient permissions to create and update various resources
  2. A working installation of Pulumi
  3. If you're following along in TypeScript, a working installation of Node.js
  4. A new Pulumi project, created as per the Pulumi documentation. You can find the documentation for Azure here

Overview

We will be deploying the following resources to your Azure subscription:

  1. A resource group, containing all resources
  2. A container registry, to hold simulator images
  3. An AKS cluster

Resource group

The resource group will contain all resources. It also determines the Azure location in which the resources will be deployed.

typescript
import * as pulumi from "@pulumi/pulumi";
import * as resources from "@pulumi/azure-native/resources/index.js";

const resourceGroup = new resources.ResourceGroup('my-resource-group-', {
  location: 'eastus'
});

export const rgName = pulumi.interpolate`${resourceGroup.name}`;

At the end, we export the name of the resource group (Pulumi appends a random suffix to it) for further use in our definition.

Container registry

The container registry is where you can privately store your simulator Docker images, if any.

typescript
import * as containerregistry from "@pulumi/azure-native/containerregistry/index.js";

const registry = new containerregistry.Registry("registry", {
  resourceGroupName: resourceGroup.name,
  sku: {
    name: "Basic", // the smallest tier; upgrade if you need more storage or throughput
  },
  adminUserEnabled: true, // enables username/password authentication for docker push/pull
});

export const registryName = pulumi.interpolate`${registry.name}`;
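
Since adminUserEnabled is set, the registry's admin credentials can be used to push simulator images to it. This is not part of the original template; a minimal sketch of looking the credentials up and exporting them as stack outputs (the output names are arbitrary, and the password is marked as a secret) could look like this:

typescript
// Look up the admin credentials of the registry (possible because adminUserEnabled is true)
const credentials = containerregistry.listRegistryCredentialsOutput({
  resourceGroupName: resourceGroup.name,
  registryName: registry.name,
});

export const registryServer = registry.loginServer;
export const registryUsername = credentials.apply(c => c.username);
export const registryPassword = pulumi.secret(credentials.apply(c => c.passwords?.[0]?.value));

You can then retrieve these with pulumi stack output (use --show-secrets for the password) and log in with docker login before pushing images.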

Kubernetes cluster

The cluster is where both the Composabl components and your training workloads will run. This configuration is more complex, so additional information is provided as comments in the TypeScript definition below.
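
The definition below references a user-assigned managed identity, appMiAKS, which is not created anywhere in the original snippet. A minimal sketch of creating one with the azure-native managedidentity module (the resource name aks-identity is an arbitrary choice) could look like this:

typescript
import * as managedidentity from "@pulumi/azure-native/managedidentity/index.js";

// User-assigned managed identity for the AKS cluster, referenced below as appMiAKS
const appMiAKS = new managedidentity.UserAssignedIdentity("aks-identity", {
  resourceGroupName: resourceGroup.name,
});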

typescript
import * as containerservice from "@pulumi/azure-native/containerservice/index.js";

const k8sCluster = new containerservice.ManagedCluster("aks", {
  resourceGroupName: resourceGroup.name, // Here, we reference the resourceGroup we created earlier
  location: resourceGroup.location,

  dnsPrefix: "composabl-aks",
  kubernetesVersion: "1.29.2", // you can list supported versions with the Azure CLI: az aks get-versions -l <location> -o table (replace <location> with the location you set in your resource group)
  enableRBAC: true,

  // Assign the user-assigned managed identity created above to the cluster
  identity: {
    type: "UserAssigned",
    userAssignedIdentities: [appMiAKS.id],
  },

  // Configure 3 pools
  // 1. Main (the kubernetes control plane nodes)
  // 2. Train (Composabl system components and training workers)
  // 3. Sims (Composabl simulators)
  agentPoolProfiles: [
    // The Main pool has 3 small nodes to act as a control plane
    {
      name: "main",
      count: 3,
      vmSize: "Standard_B2s", // 2 vCPU, 4 GB RAM, ~$0.041/hour
      osType: "Linux",
      osSKU: "Ubuntu",
      mode: "System",
    },
    // The train pool will run the actual Composabl workers and system components
    {
      name: "train",
      vmSize: "Standard_D8s_v5", // 8 vCPU, 32 GB RAM, x64, ~$0.40/hour
      count: 1,
      minCount: 0,
      maxCount: 10,
      enableAutoScaling: true,
      osType: "Linux",
      osSKU: "Ubuntu",
      osDiskSizeGB: 100,
      osDiskType: "Premium_LRS",
    },

    // The Sims pool will run the simulators
    {
      name: "sims",
      vmSize: "Standard_D8s_v5", // 8 vCPU, 32 GB RAM, x64, ~$0.40/hour
      count: 1,
      minCount: 0,
      maxCount: 10,
      enableAutoScaling: true,
      osType: "Linux",
      osSKU: "Ubuntu",
      osDiskSizeGB: 100,
      osDiskType: "Premium_LRS",
    },
  ],
  sku: {
    name: "Base",
    tier: "Standard"
  },
  // This part is optional unless you run very large clusters with several hundred nodes.
  networkProfile: {
    networkPlugin: "azure",
    networkPolicy: "calico",
  }
});

Notes:

  1. Autoscaling:
    • This template enables autoscaling to have the cluster automatically scale to the required size and back down afterward to reduce costs.
    • You can disable autoscaling by removing the minCount, maxCount and enableAutoScaling properties, but you'll have to set the count value accordingly.
  2. vmSize: The VM sizes used above can be adjusted to instance types that better suit your needs.
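
Connecting to the cluster

Once the stack is deployed, you will need the cluster's kubeconfig to connect to it (for example with kubectl) and install the Composabl components. This step is not part of the template above; a minimal sketch of exporting the kubeconfig as a secret stack output, reusing the k8sCluster resource defined earlier, could look like this:

typescript
// Fetch the user credentials of the cluster; the returned kubeconfig is base64-encoded
const clusterCredentials = containerservice.listManagedClusterUserCredentialsOutput({
  resourceGroupName: resourceGroup.name,
  resourceName: k8sCluster.name,
});

export const kubeconfig = pulumi.secret(
  clusterCredentials.kubeconfigs.apply(kc => Buffer.from(kc[0].value, "base64").toString()),
);

You can then write it to a file with pulumi stack output kubeconfig --show-secrets and point kubectl at it.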