
Azure and Terraform Initial Setup


[[TOC]]

Intro

This code is the first infrastructure configuration to run.

Its purpose is to configure Azure so that Terraform can manage Azure resources. (Terraform requires environment variables such as ARM_CLIENT_ID, ARM_CLIENT_SECRET, etc., and this script generates the values for those variables.)

You must execute this code manually (it must be run by a user with Azure Owner permissions).

In this project, GitLab pipelines are used only to validate script syntax and tag versions; they do not perform any other automated tasks.

This code creates:

  • An Azure Active Directory service principal (an app registration that works like a user token) that lets Terraform access Azure when running in GitLab pipelines;
  • Role assignments for that app: Contributor, Network Contributor, and Resource Policy Contributor;
  • A resource group, an Azure storage account, and a storage container to store all the Terraform "state" files.
graph TB

subgraph human["Primetag Azure Administrator (Human)"]
style human fill:#FFFFF5
  D1[Copy Variables to GitLab CI]
end

subgraph script["./create-storage-and-service-principal-for-terraform.sh"]
style script fill:#FFFFF5

  subgraph s3["Show Variables"]
  C1[Print subscription id, resource group name, password, ...] --> D1
  end

  subgraph s2["Service principals so Terraform has permissions"]
  B1[Create service principals in Active Directory] --> B2
  B2[Set roles for service principals] --> C1
  end

  subgraph s1["Storage for Terraform 'State' files"]
  A1[Create resource group if it does not exist] --> A2[Create storage account if it does not exist]
  A2 --> |Get storage account key| A3[Create storage container if it does not exist]
  A3 --> B1
  end
end
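
The steps in the diagram can be sketched as the Azure CLI commands below. This is a minimal sketch, not the actual content of `create-storage-and-service-principal-for-terraform.sh`: the function name, the `westeurope` location, and the `terraform-<env>` service principal name are assumptions, and `<key>` and `<app-id>` stand for values returned by earlier commands. The function only prints the commands, so it is safe to run without Owner permissions.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not run) the az commands for one environment.
# Usage: print_bootstrap_commands ENV SUBSCRIPTION_ID [LOCATION]
# The default location and the service principal name are hypothetical.
print_bootstrap_commands() {
    local env="$1" subscription_id="$2" location="${3:-westeurope}"
    local rg="terraform-state-${env}"          # resource group and container name
    local account="primetagterraform${env}"    # storage account name
    cat <<EOF
az group create --name ${rg} --location ${location}
az storage account create --name ${account} --resource-group ${rg} --location ${location} --sku Standard_LRS
az storage account keys list --account-name ${account} --resource-group ${rg} --query "[0].value" -o tsv
az storage container create --name ${rg} --account-name ${account} --account-key <key>
az ad sp create-for-rbac --name terraform-${env} --role Contributor --scopes /subscriptions/${subscription_id}
az role assignment create --assignee <app-id> --role "Network Contributor" --scope /subscriptions/${subscription_id}
az role assignment create --assignee <app-id> --role "Resource Policy Contributor" --scope /subscriptions/${subscription_id}
EOF
}
```

For example, `print_bootstrap_commands dev "$(az account show --query id -o tsv)"` lists the commands for the dev environment.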

How to execute

Run in bash (you must be a user with Owner permission, like Zé Carlos):

az login
./create-storage-and-service-principal-for-terraform.sh

Note: The script requires jq, and only some Azure users have the permissions needed to run it.

After the script runs, it prints the resources it created and their credentials.

Example output:
ENVIRONMENT: dev
AZURE_SUBSCRIPTION_DEV: 1e34e640-66d9-4058-bab3-b5b5efe2dbdb
                         "Development Subscription"
# Storage:
RESOURCE_GROUP_NAME_DEV: terraform-state-dev
STORAGE_ACCOUNT_NAME_DEV: primetagterraformdev
CONTAINER_NAME_DEV: terraform-state-dev
# APP:
ACCOUNT_KEY_DEV: em******Q==
CREDENTIAL_APP_ID_DEV: 8f****e2f
CREDENTIAL_PASSWORD_DEV: 0bca****a5b
CREDENTIAL_TENANT_DEV: e07*****a

ENVIRONMENT: staging
AZURE_SUBSCRIPTION_STAGING: 1e34e640-66d9-4058-bab3-b5b5efe2dbdb
                         "Development Subscription"
# Storage:
RESOURCE_GROUP_NAME_STAGING: terraform-state-staging
STORAGE_ACCOUNT_NAME_STAGING: primetagterraformstaging
CONTAINER_NAME_STAGING: terraform-state-staging
ACCOUNT_KEY_STAGING: 9y*****g==
# Service Principal:
CREDENTIAL_APP_ID_STAGING: b1*****0c
CREDENTIAL_PASSWORD_STAGING: d*****9
CREDENTIAL_TENANT_STAGING: e*****2a

ENVIRONMENT: prod
AZURE_SUBSCRIPTION_PROD: 1e34e640-66d9-4058-bab3-b5b5efe2dbdb
                         "Development Subscription"
# Storage:
RESOURCE_GROUP_NAME_PROD: terraform-state-prod
STORAGE_ACCOUNT_NAME_PROD: primetagterraformprod
CONTAINER_NAME_PROD: terraform-state-prod
# APP:
ACCOUNT_KEY_PROD: ******
CREDENTIAL_APP_ID_PROD: ****
CREDENTIAL_PASSWORD_PROD: ****
CREDENTIAL_TENANT_PROD: *****

You must copy the following variables to ...

  • ARM_CLIENT_ID_DEV: CREDENTIAL_APP_ID_DEV
  • ARM_CLIENT_SECRET_DEV: CREDENTIAL_PASSWORD_DEV
  • ARM_SUBSCRIPTION_ID_DEV: AZURE_SUBSCRIPTION_DEV
  • ARM_TENANT_ID_DEV: CREDENTIAL_TENANT_DEV
  • ARM_CLIENT_ID_PROD: CREDENTIAL_APP_ID_PROD
  • ARM_CLIENT_SECRET_PROD: CREDENTIAL_PASSWORD_PROD
  • ARM_SUBSCRIPTION_ID_PROD: AZURE_SUBSCRIPTION_PROD
  • ARM_TENANT_ID_PROD: CREDENTIAL_TENANT_PROD
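
Terraform reads the unsuffixed names (`ARM_CLIENT_ID`, `ARM_CLIENT_SECRET`, and so on), so a pipeline job has to map the per-environment variables listed above onto them. The helper below is a minimal sketch of one way to do that; the function name is hypothetical and the real pipelines may do this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy ARM_*_<ENV> into the ARM_* names Terraform reads.
# Usage: export_arm_variables dev
export_arm_variables() {
    local env_upcase name suffixed
    env_upcase=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    for name in ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_SUBSCRIPTION_ID ARM_TENANT_ID; do
        suffixed="${name}_${env_upcase}"
        # ${!suffixed} is bash indirect expansion: the value of the variable named by $suffixed.
        if [ -z "${!suffixed:-}" ]; then
            echo "Missing variable: ${suffixed}" >&2
            return 1
        fi
        export "${name}=${!suffixed}"
    done
}
```

A job would call `export_arm_variables "${ENV}"` before `terraform init`; the call fails early, naming the missing variable, if one of the four values was not copied.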

Then mask the CREDENTIAL* variables in GitLab.

Some passwords are not compatible with "Masked".
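
A password can be checked before it is pasted into GitLab. The sketch below assumes GitLab's masking rule is "a single line of at least 8 characters, using only letters, digits, and `_ + = / @ : . ~ -`"; the accepted character set has changed between GitLab versions, so treat the pattern as an assumption and confirm it against the documentation for your version. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: return 0 if VALUE looks compatible with GitLab "Masked".
# Assumed rule (verify for your GitLab version): 8+ characters, one line,
# only A-Z a-z 0-9 and _ + = / @ : . ~ -
is_maskable() {
    local value="$1"
    local pattern='^[A-Za-z0-9_+=/@:.~-]{8,}$'
    [[ "${value}" =~ ${pattern} ]]
}
```

If `is_maskable "${SECRET}"` fails, the simplest fix is usually to generate a new password and check again.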

It's done!


Now you have a new app registration per environment: https://portal.azure.com/#blade/Microsoft_AAD_IAM/ActiveDirectoryMenuBlade/RegisteredApps

  • DEV: https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps/ApplicationMenuBlade/~/Overview/appId/**/isMSAApp~/false
  • STAGING: https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps/ApplicationMenuBlade/~/Overview/appId/*/isMSAApp~/false
  • PROD: https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps/ApplicationMenuBlade/~/Overview/appId/**/isMSAApp~/false

And a storage account per environment: https://portal.azure.com/#blade/HubsExtension/BrowseResource/resourceType/Microsoft.Storage%2FStorageAccounts

  • DEV: https://portal.azure.com/#@primetag.net/resource/subscriptions/*/resourceGroups/terraform-state-dev/providers/Microsoft.Storage/storageAccounts/primetagterraformdev
  • STAGING: https://portal.azure.com/#@primetag.net/resource/subscriptions/*/resourceGroups/terraform-state-staging/providers/Microsoft.Storage/storageAccounts/primetagterraformstaging
  • PROD: https://portal.azure.com/#@primetag.net/resource/subscriptions/*/resourceGroups/terraform-state-prod/providers/Microsoft.Storage/storageAccounts/primetagterraformprod


To run the script for a single environment, or for a different set of environments (for example, adding "staging"), edit the following line in the script:

declare -r ENV_LIST=("dev" "prod")
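
As an illustration of how that list drives the rest of the script, the sketch below loops over `ENV_LIST` and derives the per-environment names shown in the example output (`terraform-state-<env>` and `primetagterraform<env>`). The helper name is hypothetical and the real script may build these names differently.

```shell
#!/usr/bin/env bash
declare -r ENV_LIST=("dev" "prod")

# Hypothetical helper: print the storage names for one environment,
# following the naming pattern seen in the example output.
print_state_names() {
    local env="$1"
    local env_upcase
    env_upcase=$(echo "${env}" | tr '[:lower:]' '[:upper:]')
    echo "RESOURCE_GROUP_NAME_${env_upcase}: terraform-state-${env}"
    echo "STORAGE_ACCOUNT_NAME_${env_upcase}: primetagterraform${env}"
    echo "CONTAINER_NAME_${env_upcase}: terraform-state-${env}"
}

# Each entry in ENV_LIST produces one set of names.
for env in "${ENV_LIST[@]}"; do
    print_state_names "${env}"
done
```

Adding `"staging"` to the array is enough for the loop to cover a third environment.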

Note:

Every time you run the script, a new CREDENTIAL_PASSWORD_xxx is generated for the same CREDENTIAL_APP_ID_xxx. If you don't update the GitLab variable ARM_CLIENT_SECRET_xxx with the new CREDENTIAL_PASSWORD_xxx, pipelines that update infrastructure in Azure will start failing with this error:

$ terraform init -backend-config=terraform-backend/backend-config-"${ENV}".tfvars -var-file=".env-${ENV}.tfvars" && (terraform workspace select ${WORKSPACE}-${ENV} || terraform workspace new ${WORKSPACE}-${ENV})
 Initializing the backend...

 Successfully configured the backend "azurerm"! Terraform will automatically
 use this backend unless the backend configuration changes.

 Error: Failed to get existing workspaces: Error retrieving keys for Storage Account "primetagterraformprod": azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/1e34e640-66d9-4058-bab3-b5b5efe2dbdb/resourceGroups/terraform-state-prod/providers/Microsoft.Storage/storageAccounts/primetagterraformprod/listKeys?api-version=2016-01-01: StatusCode=401 -- Original Error: adal: Refresh request failed. Status Code = '401'. Response body: {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret is provided.\r\nTrace ID: 4fdd573e-c96c-4b9f-b524-bec450a18d01\r\nCorrelation ID: 2185cb33-b71d-47fd-a9eb-9beac665fd44\r\nTimestamp: 2020-06-29 10:03:00Z","error_codes":[7000215],"timestamp":"2020-06-29 10:03:00Z","trace_id":"4fdd573e-c96c-4b9f-b524-bec450a18d01","correlation_id":"2185cb33-b71d-47fd-a9eb-9beac665fd44","error_uri":"https://login.microsoftonline.com/error?code=7000215"}

$ terraform plan -input=false  -var-file=".env-${ENV}.tfvars" -out "planfile-${ENV}"
 Error: Error loading state: Error retrieving keys for Storage Account "primetagterraformprod": azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/1e34e640-66d9-4058-bab3-b5b5efe2dbdb/resourceGroups/terraform-state-prod/providers/Microsoft.Storage/storageAccounts/primetagterraformprod/listKeys?api-version=2016-01-01: StatusCode=401 -- Original Error: adal: Refresh request failed. Status Code = '401'. Response body: {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret is provided.\r\nTrace ID: 153f2eae-25e2-49a0-827f-2e68b2ea9001\r\nCorrelation ID: e59dd860-76f8-4947-b715-6f0853dbbf01\r\nTimestamp: 2020-06-29 10:03:01Z","error_codes":[7000215],"timestamp":"2020-06-29 10:03:01Z","trace_id":"153f2eae-25e2-49a0-827f-2e68b2ea9001","correlation_id":"e59dd860-76f8-4947-b715-6f0853dbbf01","error_uri":"https://login.microsoftonline.com/error?code=7000215"}
 ERROR: Job failed: exit code 1
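
The error code `AADSTS7000215` in the log above is the reliable marker for this situation. As a small sketch, a job log could be scanned for it and turned into a direct hint; the function name is hypothetical, and the log is read from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read job output on stdin and explain the
# "invalid client secret" failure when it is present.
# Returns 0 if the error was found, 1 otherwise.
explain_terraform_auth_error() {
    if grep -q 'AADSTS7000215'; then
        echo "Invalid client secret: update ARM_CLIENT_SECRET_<env> in GitLab with the newest CREDENTIAL_PASSWORD_<env>."
        return 0
    fi
    return 1
}
```

For example, after saving the job output to a file: `explain_terraform_auth_error < job.log`.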

Rotate the service principal credentials

Option 1 - Command line

# Select ONE environment by uncommenting the matching line:
# ENV="prod"
# ENV="dev"
: "${ENV:?Set ENV to dev or prod before continuing}"     # Fail if ENV is not set
ENV=$(echo "${ENV}" | tr '[:upper:]' '[:lower:]')        # Convert ENV to lowercase
ENV_UPCASE=$(echo "${ENV}" | tr '[:lower:]' '[:upper:]') # Convert ENV to UPPERCASE

SP_ID=$(az aks show --resource-group primetag-$ENV --name k8s-cluster-1-$ENV \
    --query servicePrincipalProfile.clientId -o tsv)
echo "SP_ID: ${SP_ID}"

# This command takes `--name` in some cases and `--id` in others (this appears to depend on the az CLI version); if one is rejected, try the other.
SP_SECRET=$(az ad sp credential reset --name "$SP_ID" --query password -o tsv)
# SP_SECRET=$(az ad sp credential reset --id "$SP_ID" --query password -o tsv)

echo "Please replace the value of the variable 'ARM_CLIENT_SECRET_${ENV_UPCASE}' by this secret: '${SP_SECRET}'. (SP_ID:'${SP_ID}')"

Option 2 - Azure Portal

Add a new client secret and delete the old one.

Set "Expires" to "6 months" or "1 year" and add a Google Calendar reminder for yourself and your teammates (you may be away on that date, so it is important to invite your teammates to the reminder).
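
To pick a date for the reminder, it helps to know how many days a secret has left. The function below is a minimal sketch that assumes GNU `date` (for the `-d` option); its name is hypothetical. The commented usage line assumes a recent az CLI where the expiry field is called `endDateTime` (older versions may call it `endDate`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole days from NOW_DATE (default: now) until END_DATE.
# Requires GNU date. Dates are interpreted as UTC.
# Usage: days_until END_DATE [NOW_DATE]
days_until() {
    local end_epoch now_epoch
    end_epoch=$(date -u -d "$1" +%s) || return 1
    now_epoch=$(date -u -d "${2:-now}" +%s) || return 1
    echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Possible usage (field name is an assumption, see above):
# END_DATE=$(az ad app credential list --id "$SP_ID" --query "[0].endDateTime" -o tsv)
# echo "Secret expires in $(days_until "$END_DATE") days"
```

A negative result means the secret has already expired.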

Update GitLab variables

Note: If you rotate the "service principal" passwords, you need to update the GitLab CI Variables ARM_CLIENT_SECRET_DEV and ARM_CLIENT_SECRET_PROD with the new value.

Update K8s

Note: If you rotate the "service principal" passwords, you need to update the GitLab CI Variables, and you also need to update the Kubernetes clusters using one of the two options below.

See more details here: https://gitlab.com/primetag/infrastructure/kubernetes-cluster-1#reset-the-credentials-to-extend-the-service-principal-for-one-year

IMPORTANT: Kubernetes will restart after the service principal is updated. As far as we know, it replaces the pods progressively, so services with more than one replica should see zero downtime, but all cronjobs and workers will be restarted!

Option 1 - Command line

az aks update-credentials \
    --resource-group primetag-$ENV \
    --name k8s-cluster-1-$ENV \
    --reset-service-principal \
    --service-principal "$SP_ID" \
    --client-secret "$SP_SECRET"

Option 2 - Running GitLab Pipelines

After you update the ARM_CLIENT_SECRET_<env> variables, you can run the pipelines again to apply the latest changes to K8s. The new client secret will be detected and K8s will be updated.

Warnings before the secret is blocked

Pay attention to these warnings.

Price per environment

Terraform state files need only a few KB or MB per workspace.

See the Azure calculator estimate: ![](./.img/Price.png)

With ❤️ from Primetag - Engineering Team