# Azure OpenAI
Configure HolmesGPT to use Azure OpenAI Service.
## Setup

Create an Azure OpenAI resource and deploy at least one model to it; the deployment name is what HolmesGPT references as `azure/<deployment-name>`.
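If you prefer the CLI over the Azure portal, the resource and a model deployment can be created roughly as follows. The resource name, resource group, region, model version, and SKU below are placeholder examples; adjust them to what your subscription supports:

```bash
# Create an Azure OpenAI resource (names/regions are examples)
az cognitiveservices account create \
  --name my-openai-resource \
  --resource-group my-rg \
  --kind OpenAI \
  --sku S0 \
  --location eastus2

# Deploy a model; the deployment name is what HolmesGPT later
# references as azure/<deployment-name>
az cognitiveservices account deployment create \
  --name my-openai-resource \
  --resource-group my-rg \
  --deployment-name gpt-4.1 \
  --model-name gpt-4.1 \
  --model-version "<model-version>" \
  --model-format OpenAI \
  --sku-name "Standard" \
  --sku-capacity 10
```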
## Configuration

**Standalone Holmes Helm chart**

Create a Kubernetes Secret:

```bash
kubectl create secret generic holmes-secrets \
  --from-literal=azure-api-key="your-azure-api-key" \
  -n <namespace>
```
Configure Helm values:

```yaml
# values.yaml
additionalEnvVars:
  - name: AZURE_API_KEY
    valueFrom:
      secretKeyRef:
        name: holmes-secrets
        key: azure-api-key

# Configure at least one model using modelList
modelList:
  azure-gpt-41:
    api_key: "{{ env.AZURE_API_KEY }}"
    model: azure/gpt-4.1
    api_base: https://your-resource.openai.azure.com/
    api_version: "2025-01-01-preview"
    temperature: 0
  azure-gpt-5:
    api_key: "{{ env.AZURE_API_KEY }}"
    model: azure/gpt-5
    api_base: https://your-resource.openai.azure.com/
    api_version: "2025-01-01-preview"
    temperature: 1

# Optional: set a default model (use a modelList key name)
config:
  model: "azure-gpt-41"  # refers to the key name in modelList above
```
**Robusta Helm chart (bundled Holmes)**

Create a Kubernetes Secret:

```bash
kubectl create secret generic robusta-holmes-secret \
  --from-literal=azure-api-key="your-azure-api-key" \
  -n <namespace>
```
Configure Helm values:

```yaml
# values.yaml
holmes:
  additionalEnvVars:
    - name: AZURE_API_KEY
      valueFrom:
        secretKeyRef:
          name: robusta-holmes-secret
          key: azure-api-key

  # Configure at least one model using modelList
  modelList:
    azure-gpt-41:
      api_key: "{{ env.AZURE_API_KEY }}"
      model: azure/gpt-4.1
      api_base: https://your-resource.openai.azure.com/
      api_version: "2025-01-01-preview"
      temperature: 0
    azure-gpt-5:
      api_key: "{{ env.AZURE_API_KEY }}"
      model: azure/gpt-5
      api_base: https://your-resource.openai.azure.com/
      api_version: "2025-01-01-preview"
      temperature: 1

  # Optional: set a default model (use a modelList key name)
  config:
    model: "azure-gpt-41"  # refers to the key name in modelList above
```
## Using CLI Parameters
You can also pass the API key directly as a command-line parameter:
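A typical invocation looks like the following, assuming your version of the `holmes` CLI exposes an `--api-key` flag; the endpoint, API version, and deployment name are placeholders:

```bash
# Azure endpoint details are read from the environment
export AZURE_API_BASE="https://your-resource.openai.azure.com"
export AZURE_API_VERSION="2025-01-01-preview"

# Pass the key directly on the command line
holmes ask "what pods are failing?" \
  --model="azure/<your-deployment-name>" \
  --api-key="your-azure-api-key"
```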
## Microsoft Entra ID Authentication
Authenticate HolmesGPT to Azure OpenAI or Microsoft Foundry using Microsoft Entra ID (formerly Azure AD) instead of an API key.
This uses DefaultAzureCredential to obtain a bearer token, which is ideal for organizations that enforce identity-based access control and disable local authentication (API keys) on their Azure AI resources.
### Why Disable Local Auth?
Microsoft Foundry and Azure OpenAI resources support two authentication methods: API keys (local auth) and Microsoft Entra ID. For production environments, Microsoft recommends disabling local authentication so that:
- All access is governed by Azure RBAC — no static secrets to rotate or leak
- Every request is tied to an auditable identity
- Conditional Access policies and Privileged Identity Management apply
When local auth is disabled on your Azure resource, API key access is completely blocked and only Entra ID tokens are accepted. HolmesGPT supports this scenario via the AZURE_AD_TOKEN_AUTH environment variable.
To disable local auth on your resource:
```bash
az cognitiveservices account update \
  --name <resource-name> \
  --resource-group <rg> \
  --custom-domain <resource-name> \
  --disable-local-auth true
```
For more details, see Disable local authentication in Azure AI Services.
### How It Works
Set `AZURE_AD_TOKEN_AUTH=true` to enable Entra ID authentication. When enabled, HolmesGPT will:

- Skip the `AZURE_API_KEY` requirement during startup validation
- Obtain a token via `DefaultAzureCredential` and pass it to each LLM request
- Cache the token for 1 hour and refresh it automatically
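The refresh behavior is conceptually similar to the sketch below. This is a hypothetical helper, not HolmesGPT's actual code; `fetch_token` stands in for the `DefaultAzureCredential().get_token(...)` call:

```python
import time

class BearerTokenCache:
    """Caches a bearer token and refreshes it after a fixed TTL."""

    def __init__(self, fetch_token, ttl_seconds=3600):
        self._fetch_token = fetch_token  # callable returning a token string
        self._ttl = ttl_seconds          # 1 hour by default
        self._token = None
        self._fetched_at = 0.0

    def get(self):
        # Refresh when no token is cached or the TTL has elapsed
        if self._token is None or time.monotonic() - self._fetched_at >= self._ttl:
            self._token = self._fetch_token()
            self._fetched_at = time.monotonic()
        return self._token
```

Each LLM request then sends the cached value as an `Authorization: Bearer <token>` header instead of an API key.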
### Required Permissions
The identity used by HolmesGPT must have permission to call the model inference APIs on the target Azure OpenAI or Microsoft Foundry resource. You can grant this with the `Cognitive Services OpenAI User` or `Azure AI User` built-in role, or a custom role that includes the equivalent data action:
```bash
az role assignment create \
  --assignee "<identity-principal-id>" \
  --role "Cognitive Services OpenAI User" \
  --scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<resource>"
```
### Running Locally
When running HolmesGPT on your machine, DefaultAzureCredential tries credentials in order: environment variables, Azure CLI, Azure PowerShell, etc. The simplest method is to sign in with the Azure CLI:
```bash
az login

export AZURE_AD_TOKEN_AUTH=true
export AZURE_API_VERSION="2024-02-15-preview"
export AZURE_API_BASE="https://your-resource.openai.azure.com"

holmes ask "what pods are failing?" --model="azure/<your-deployment-name>"
```
No `AZURE_API_KEY` is needed.
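To confirm your local credentials can actually mint a token for the Cognitive Services scope before running Holmes, you can ask the Azure CLI directly:

```bash
# Prints the token expiry if your signed-in identity can get a
# Cognitive Services token; errors out otherwise
az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query "expiresOn" -o tsv
```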
For service-principal auth locally, set these environment variables instead:
```bash
export AZURE_CLIENT_ID="<app-client-id>"
export AZURE_CLIENT_SECRET="<app-client-secret>"
export AZURE_TENANT_ID="<tenant-id>"
```
### Running in Kubernetes with Workload Identity
When running as a pod in AKS, use AKS Workload Identity so the pod authenticates via a federated credential rather than a stored secret. The managed identity (or service principal) bound to the pod must have the Cognitive Services OpenAI User role on the target resource.
Prerequisites:
- AKS cluster with OIDC issuer and workload identity enabled
- A managed identity with the Cognitive Services OpenAI User role on your Azure OpenAI resource
- A federated credential linking the managed identity to the Holmes ServiceAccount
Set up the identity and federation:
```bash
# Get the OIDC issuer URL
OIDC_ISSUER=$(az aks show -n <cluster> -g <rg> --query "oidcIssuerProfile.issuerUrl" -o tsv)

# Create a managed identity
az identity create -n holmes-identity -g <rg>
IDENTITY_CLIENT_ID=$(az identity show -n holmes-identity -g <rg> --query clientId -o tsv)
IDENTITY_PRINCIPAL_ID=$(az identity show -n holmes-identity -g <rg> --query principalId -o tsv)

# Assign the Cognitive Services OpenAI User role
az role assignment create \
  --assignee "$IDENTITY_PRINCIPAL_ID" \
  --role "Cognitive Services OpenAI User" \
  --scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<resource>"

# Create a federated credential for the Holmes ServiceAccount
az identity federated-credential create \
  --name holmes-federated \
  --identity-name holmes-identity \
  --resource-group <rg> \
  --issuer "$OIDC_ISSUER" \
  --subject "system:serviceaccount:<namespace>:holmes" \
  --audiences "api://AzureADTokenExchange"
```
**Standalone Holmes Helm chart**

Configure Helm values:

```yaml
# values.yaml
additionalEnvVars:
  - name: AZURE_AD_TOKEN_AUTH
    value: "true"
  - name: AZURE_CLIENT_ID
    value: "<managed-identity-client-id>"
  - name: AZURE_TENANT_ID
    value: "<tenant-id>"

serviceAccount:
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"

podLabels:
  azure.workload.identity/use: "true"

modelList:
  azure-gpt-41:
    model: azure/gpt-4.1
    api_base: https://your-resource.openai.azure.com/
    api_version: "2025-01-01-preview"
    temperature: 0

config:
  model: "azure-gpt-41"
```
Note that `api_key` is omitted from the `modelList` entries; authentication is handled entirely by the workload identity token.
**Robusta Helm chart (bundled Holmes)**

Configure Helm values:

```yaml
# values.yaml
holmes:
  additionalEnvVars:
    - name: AZURE_AD_TOKEN_AUTH
      value: "true"
    - name: AZURE_CLIENT_ID
      value: "<managed-identity-client-id>"
    - name: AZURE_TENANT_ID
      value: "<tenant-id>"

  serviceAccount:
    annotations:
      azure.workload.identity/client-id: "<managed-identity-client-id>"

  podLabels:
    azure.workload.identity/use: "true"

  modelList:
    azure-gpt-41:
      model: azure/gpt-4.1
      api_base: https://your-resource.openai.azure.com/
      api_version: "2025-01-01-preview"
      temperature: 0

  config:
    model: "azure-gpt-41"
```
### Troubleshooting
```bash
# Verify the pod has workload identity labels and env vars injected
kubectl describe pod -l app=holmes -n <namespace> | grep -A5 "AZURE_"

# Test that the identity can obtain a token (from inside the pod)
kubectl exec -n <namespace> deploy/holmes -- python -c "
from azure.identity import DefaultAzureCredential
token = DefaultAzureCredential().get_token('https://cognitiveservices.azure.com/.default')
print('Token obtained, expires at:', token.expires_on)
"

# Check the role assignment
az role assignment list \
  --assignee "<identity-principal-id>" \
  --scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<resource>" \
  --output table
```