Prometheus¶

Connect HolmesGPT to Prometheus for metrics analysis and query generation.

Prerequisites¶

A running and accessible Prometheus server
Ensure HolmesGPT can connect to the Prometheus endpoint (see Finding your Prometheus URL)

Configuration¶

Holmes CLI Holmes Helm Chart Robusta Helm Chart

Add the following to ~/.holmes/config.yaml. Create the file if it doesn't exist:

toolsets:
    prometheus/metrics:
        enabled: true
        config:
            prometheus_url: http://<your-prometheus-service>:9090

            # Optional:
            #additional_headers:
            #    Authorization: "Basic <base_64_encoded_string>"

When using the standalone Holmes Helm Chart, update your values.yaml:

toolsets:
    prometheus/metrics:
        enabled: true
        config:
            prometheus_url: http://<your-prometheus-service>:9090

            # Optional:
            #additional_headers:
            #    Authorization: "Basic <base_64_encoded_string>"

Apply the configuration:

helm upgrade holmes holmes/holmes --values=values.yaml

When using the Robusta Helm Chart (which includes HolmesGPT), update your generated_values.yaml:

holmes:
  toolsets:
      prometheus/metrics:
          enabled: true
          config:
              prometheus_url: http://<your-prometheus-service>:9090

              # Optional:
              #additional_headers:
              #    Authorization: "Basic <base_64_encoded_string>"

Apply the configuration:

helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Finding your Prometheus URL¶

There are several ways to find your Prometheus URL:

Option 1: Simple method (port-forwarding)

# Find Prometheus services
kubectl get svc -A | grep prometheus

# Port forward for testing
kubectl port-forward svc/<your-prometheus-service> 9090:9090 -n <namespace>
# Then access Prometheus at: http://localhost:9090

Option 2: Advanced method (get full cluster DNS URL)

If you want to find the full internal DNS URL for Prometheus, run:

kubectl get svc --all-namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"."}{.metadata.namespace}{".svc.cluster.local:"}{.spec.ports[0].port}{"\n"}{end}' | grep prometheus | grep -Ev 'operat|alertmanager|node|coredns|kubelet|kube-scheduler|etcd|controller' | awk '{print "http://"$1}'

This will print all possible Prometheus service URLs in your cluster. Pick the one that matches your deployment.

Specific Providers¶

Coralogix Prometheus¶

To use a Coralogix PromQL endpoint with HolmesGPT:

Go to Coralogix Documentation and choose the relevant PromQL endpoint for your region.
In Coralogix, create an API key with permissions to query metrics (Data Flow → API Keys).

Create a Kubernetes secret for the API key and expose it as an environment variable in your Helm values:

holmes:
  additionalEnvVars:
    - name: CORALOGIX_API_KEY
      valueFrom:
        secretKeyRef:
          name: coralogix-api-key
          key: CORALOGIX_API_KEY

Add the following under your toolsets in the Helm chart:

holmes:
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        prometheus_url: "https://prom-api.eu2.coralogix.com"  # Use your region's endpoint
        additional_headers:
          token: "{{ env.CORALOGIX_API_KEY }}"
        discover_metrics_from_last_hours: 72  # Look back 72 hours for metrics
        tool_calls_return_data: true

AWS Managed Prometheus (AMP)¶

To connect HolmesGPT to AWS Managed Prometheus:

holmes:
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        prometheus_url: https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/
        aws_region: us-east-1
        aws_service_name: aps  # Default value, can be omitted
        # Optional: Specify credentials (otherwise uses default AWS credential chain)
        aws_access_key: "{{ env.AWS_ACCESS_KEY_ID }}"
        aws_secret_access_key: "{{ env.AWS_SECRET_ACCESS_KEY }}"
        # Optional: Assume a role for cross-account access
        assume_role_arn: "arn:aws:iam::123456789012:role/PrometheusReadRole"
        refresh_interval_seconds: 900  # Refresh AWS credentials every 15 minutes (default)

Notes: - The toolset automatically detects AWS configuration when aws_region is present - Uses SigV4 authentication for all requests - Supports IAM roles and cross-account access via assume_role_arn - Credentials refresh automatically based on refresh_interval_seconds

Google Managed Prometheus¶

Before configuring Holmes, make sure you have:

Google Managed Prometheus enabled
A Prometheus Frontend endpoint accessible from your cluster (If you don’t already have one, you can create it following the instructions here )

To connect HolmesGPT to Google Cloud Managed Prometheus:

holmes:
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        # Set this to the URL of your Prometheus Frontend endpoint, it may change based on the namespace you deployed frontend to.
        prometheus_url: http://frontend.default.svc.cluster.local:9090

Notes:

Authentication is handled automatically via Google Cloud (Workload Identity or default service account in the frontend deployed app)
No additional headers or credentials are required
The Prometheus Frontend endpoint must be accessible from the cluster

Azure Managed Prometheus¶

Before configuring Holmes, make sure you have:

An Azure Monitor workspace with Managed Prometheus enabled
A service principal (or managed identity) that has access to the workspace

Using a service principal (client secret)¶

holmes:
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        prometheus_url: "https://<your-workspace>.<region>.prometheus.monitor.azure.com:443/"
  additionalEnvVars:
    - name: AZURE_CLIENT_ID
      value: "<your-app-client-id>"
    - name: AZURE_TENANT_ID
      value: "<your-tenant-id>"
    - name: AZURE_CLIENT_SECRET
      value: "<your-client-secret>"

Notes: - prometheus_url must point to the Azure Managed Prometheus workspace endpoint (include the trailing slash). - No extra headers are required; authentication is handled through Azure AD (service principal or managed identity). - SSL is enabled by default (verify_ssl: true). Disable only if you know you need to trust a custom cert.

Grafana Cloud (Mimir)¶

There are two ways to connect HolmesGPT to Grafana Cloud's Prometheus/Mimir endpoint.

Option 1: Direct Prometheus Endpoint (Recommended)¶

Use Grafana Cloud's direct Prometheus endpoint with Basic authentication. This is the simplest approach.

Find your credentials:

Go to your Grafana Cloud portal → your stack → Prometheus card → Details
Note the remote write endpoint URL — remove the /push suffix to get the query endpoint
Note the Username / Instance ID (a numeric ID)
Generate a Cloud Access Policy token with metrics:read scope

The query endpoint URL format is: https://prometheus-prod-XX-prod-REGION.grafana.net/api/prom

Holmes CLIHolmes Helm ChartRobusta Helm Chart

Add the following to ~/.holmes/config.yaml. Create the file if it doesn't exist:

toolsets:
  prometheus/metrics:
    enabled: true
    config:
      prometheus_url: https://prometheus-prod-XX-prod-REGION.grafana.net/api/prom
      additional_headers:
        Authorization: "Basic <base64_encoded_credentials>"

The Basic auth credentials are <instance_id>:<cloud_access_policy_token> base64-encoded.

After making changes to your configuration, run:

holmes toolset refresh

First, create a Kubernetes secret with your credentials:

# Base64-encode your credentials: <instance_id>:<cloud_access_policy_token>
kubectl create secret generic grafana-cloud-prometheus \
  --from-literal=auth-header="Basic $(echo -n 'INSTANCE_ID:CLOUD_ACCESS_POLICY_TOKEN' | base64)"

Then add to your Holmes Helm values:

additionalEnvVars:
  - name: GRAFANA_CLOUD_PROM_AUTH
    valueFrom:
      secretKeyRef:
        name: grafana-cloud-prometheus
        key: auth-header

toolsets:
  prometheus/metrics:
    enabled: true
    config:
      prometheus_url: "https://prometheus-prod-XX-prod-REGION.grafana.net/api/prom"
      additional_headers:
        Authorization: "{{ env.GRAFANA_CLOUD_PROM_AUTH }}"

First, create a Kubernetes secret with your credentials:

# Base64-encode your credentials: <instance_id>:<cloud_access_policy_token>
kubectl create secret generic grafana-cloud-prometheus \
  --from-literal=auth-header="Basic $(echo -n 'INSTANCE_ID:CLOUD_ACCESS_POLICY_TOKEN' | base64)"

Then add to your Robusta Helm values:

holmes:
  additionalEnvVars:
    - name: GRAFANA_CLOUD_PROM_AUTH
      valueFrom:
        secretKeyRef:
          name: grafana-cloud-prometheus
          key: auth-header
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        prometheus_url: "https://prometheus-prod-XX-prod-REGION.grafana.net/api/prom"
        additional_headers:
          Authorization: "{{ env.GRAFANA_CLOUD_PROM_AUTH }}"

Update your Helm values and run a Helm upgrade:

helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Option 2: Grafana API Proxy¶

Use Grafana's datasource proxy to route requests through the Grafana API. This approach uses a Grafana service account token.

Find your credentials:

Navigate to "Administration → Service accounts" in Grafana Cloud
Create a new service account and generate a token (starts with glsa_)
Find your Prometheus datasource UID:

curl -H "Authorization: Bearer YOUR_GLSA_TOKEN" \
     "https://YOUR-INSTANCE.grafana.net/api/datasources" | \
     jq '.[] | select(.type=="prometheus") | {name, uid}'

Holmes CLIHolmes Helm ChartRobusta Helm Chart

Add the following to ~/.holmes/config.yaml. Create the file if it doesn't exist:

toolsets:
  prometheus/metrics:
    enabled: true
    config:
      prometheus_url: https://YOUR-INSTANCE.grafana.net/api/datasources/proxy/uid/PROMETHEUS_DATASOURCE_UID
      additional_headers:
        Authorization: Bearer YOUR_GLSA_TOKEN

After making changes to your configuration, run:

holmes toolset refresh

First, create a Kubernetes secret with your service account token:

kubectl create secret generic grafana-cloud-sa-token \
  --from-literal=token=YOUR_GLSA_TOKEN

Then add to your Holmes Helm values:

additionalEnvVars:
  - name: GRAFANA_CLOUD_SA_TOKEN
    valueFrom:
      secretKeyRef:
        name: grafana-cloud-sa-token
        key: token

toolsets:
  prometheus/metrics:
    enabled: true
    config:
      prometheus_url: "https://YOUR-INSTANCE.grafana.net/api/datasources/proxy/uid/PROMETHEUS_DATASOURCE_UID"
      additional_headers:
        Authorization: "Bearer {{ env.GRAFANA_CLOUD_SA_TOKEN }}"

First, create a Kubernetes secret with your service account token:

kubectl create secret generic grafana-cloud-sa-token \
  --from-literal=token=YOUR_GLSA_TOKEN

Then add to your Robusta Helm values:

holmes:
  additionalEnvVars:
    - name: GRAFANA_CLOUD_SA_TOKEN
      valueFrom:
        secretKeyRef:
          name: grafana-cloud-sa-token
          key: token
  toolsets:
    prometheus/metrics:
      enabled: true
      config:
        prometheus_url: "https://YOUR-INSTANCE.grafana.net/api/datasources/proxy/uid/PROMETHEUS_DATASOURCE_UID"
        additional_headers:
          Authorization: "Bearer {{ env.GRAFANA_CLOUD_SA_TOKEN }}"

Update your Helm values and run a Helm upgrade:

helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Advanced Configuration¶

You can further customize the Prometheus toolset with the following options:

toolsets:
  prometheus/metrics:
    enabled: true
    config:
      prometheus_url: http://prometheus-server.monitoring.svc.cluster.local:9090
      additional_headers:
        Authorization: "Basic <base64_encoded_credentials>"

      # Discovery settings
      discover_metrics_from_last_hours: 1  # Only return metrics with data in last N hours (default: 1)

      # Timeout configuration
      query_timeout_seconds_default: 20  # Default timeout for PromQL queries (default: 20)
      query_timeout_seconds_hard_max: 180  # Maximum allowed timeout for PromQL queries (default: 180)
      metadata_timeout_seconds_default: 20  # Default timeout for metadata/discovery APIs (default: 20)
      metadata_timeout_seconds_hard_max: 60  # Maximum allowed timeout for metadata APIs (default: 60)

      # Other options
      rules_cache_duration_seconds: 1800  # Cache duration for Prometheus rules (default: 30 minutes)
      verify_ssl: true  # Enable SSL verification (default: true)
      tool_calls_return_data: true  # If false, disables returning Prometheus data (default: true)
      additional_labels:  # Additional labels to add to all queries
        cluster: "production"

Configuration options:

Option	Default	Description
`prometheus_url`	-	Prometheus server URL (include protocol and port)
`additional_headers`	`{}`	Authentication headers (e.g., `Authorization: Bearer token`)
`discover_metrics_from_last_hours`	`1`	Only discover metrics with data in last N hours
`query_timeout_seconds_default`	`20`	Default PromQL query timeout
`query_timeout_seconds_hard_max`	`180`	Maximum query timeout
`metadata_timeout_seconds_default`	`20`	Default metadata/discovery API timeout
`metadata_timeout_seconds_hard_max`	`60`	Maximum metadata API timeout
`rules_cache_duration_seconds`	`1800`	Cache duration for rules (set to `null` to disable)
`verify_ssl`	`true`	Enable SSL certificate verification
`tool_calls_return_data`	`true`	Return Prometheus data (disable if hitting token limits)
`additional_labels`	`{}`	Labels to add to all queries (AWS/AMP only)

Capabilities¶

Tool Name	Description
list_prometheus_rules	List all defined Prometheus rules with descriptions and annotations
get_metric_names	Get list of metric names (fastest discovery method) - requires match filter
get_label_values	Get all values for a specific label (e.g., pod names, namespaces)
get_all_labels	Get list of all label names available in Prometheus
get_series	Get time series matching a selector (returns full label sets)
get_metric_metadata	Get metadata (type, description, unit) for metrics
execute_prometheus_instant_query	Execute an instant PromQL query (single point in time)
execute_prometheus_range_query	Execute a range PromQL query for time series data with graph generation