
Overview

Connect your on-premises or private Kubernetes clusters to IncidentFox SaaS without any firewall changes. IncidentFox uses an outbound agent pattern to reach your private Kubernetes clusters:
Your Kubernetes Cluster              IncidentFox SaaS
====================                 ================

┌──────────────────┐                ┌──────────────────┐
│ incidentfox-     │   outbound     │  K8s Gateway     │
│ k8s-agent        │───────────────>│  Service         │
│ (Helm chart)     │   HTTPS/SSE    │                  │
└────────┬─────────┘                └────────┬─────────┘
         │                                   │
         ▼                                   ▼
┌──────────────────┐                ┌──────────────────┐
│ K8s API Server   │                │ AI Agent         │
│ (your cluster)   │                │ (investigations) │
└──────────────────┘                └──────────────────┘
Key benefits:
  • No inbound firewall rules needed
  • Agent connects outbound to IncidentFox (port 443)
  • You control RBAC permissions via Helm values
  • Multiple clusters supported per team
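The outbound pattern boils down to a reconnect loop with exponential backoff on the agent side. The sketch below illustrates the idea only; the real agent's internals, retry base, and cap are assumptions, and `connect` stands in for opening the HTTPS/SSE stream to the gateway:

```python
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def run_agent(connect, max_attempts=5, base=1.0):
    """Try to establish the outbound stream, backing off between failures."""
    for attempt in range(max_attempts):
        try:
            return connect()  # e.g. open the HTTPS/SSE stream to the gateway
        except ConnectionError:
            time.sleep(backoff_delay(attempt, base=base))
    raise ConnectionError("gateway unreachable after retries")
```

Because the agent dials out, only this loop needs network access; nothing ever connects in.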

Prerequisites

Before you start:
  • IncidentFox SaaS account with a team created
  • Kubernetes cluster (v1.24+)
  • kubectl configured and able to access your cluster
  • helm v3.x installed
  • Outbound HTTPS access to ui.incidentfox.ai (or your self-hosted gateway)
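The last two prerequisites can be checked with a small preflight script before installing anything. This is a sketch: the hostname comes from the list above, and the version check only compares major.minor:

```python
import socket
import ssl

MIN_VERSION = (1, 24)  # minimum supported Kubernetes version

def version_ok(server_version, minimum=MIN_VERSION):
    """Accept 'v1.24.3'-style strings (e.g. from `kubectl version`)."""
    major, minor = server_version.lstrip("v").split(".")[:2]
    return (int(major), int(minor)) >= minimum

def can_reach_gateway(host="ui.incidentfox.ai", port=443, timeout=5):
    """Confirm outbound TLS on port 443 is allowed from this network."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ssl.create_default_context().wrap_socket(sock, server_hostname=host):
                return True
    except OSError:
        return False
```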

Setup

Step 1: Generate API Key

  1. Log in to the IncidentFox dashboard
  2. Navigate to Settings → Integrations → Kubernetes
  3. Click “Add Cluster”
  4. Enter a Cluster Name (e.g., prod-us-east-1, staging)
  5. Click “Generate API Key”
  6. Copy the API key (starts with ixfx_k8s_) — you won’t see it again!
The API key authenticates your agent with IncidentFox. Each cluster needs its own key.
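Because the key is shown only once, it is worth sanity-checking what you pasted before running the install. The `ixfx_k8s_` prefix is documented above; the charset check is an illustrative assumption:

```python
import re

KEY_PREFIX = "ixfx_k8s_"

def looks_like_cluster_key(key):
    """Cheap sanity check for a pasted IncidentFox cluster API key."""
    return key.startswith(KEY_PREFIX) and re.fullmatch(r"\w+", key) is not None
```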
Step 2: Add the Helm Repository

helm repo add incidentfox https://charts.incidentfox.ai
helm repo update
Step 3: Install the Agent

Create a namespace and install the agent:
# Create namespace
kubectl create namespace incidentfox

# Install the agent
helm install incidentfox-agent incidentfox/incidentfox-k8s-agent \
  --namespace incidentfox \
  --set apiKey=ixfx_k8s_YOUR_API_KEY \
  --set clusterName=prod-us-east-1
Configuration options:
Parameter      Description                                 Default
apiKey         API key from Step 1 (required)              (none)
clusterName    Name shown in the IncidentFox dashboard     (none)
gatewayUrl     IncidentFox gateway URL                     https://orchestrator.incidentfox.ai/gateway
replicaCount   Number of agent replicas                    1
logLevel       Logging verbosity (DEBUG, INFO, WARNING)    INFO
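Rather than repeating --set flags, the same options can live in a values file. This fragment shows the defaults from the table plus placeholder values you must replace:

```yaml
# values.yaml
apiKey: ixfx_k8s_YOUR_API_KEY   # required (from Step 1)
clusterName: prod-us-east-1     # shown in the IncidentFox dashboard
gatewayUrl: https://orchestrator.incidentfox.ai/gateway
replicaCount: 1
logLevel: INFO                  # DEBUG, INFO, or WARNING
```

Install with `helm install incidentfox-agent incidentfox/incidentfox-k8s-agent -n incidentfox -f values.yaml`.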
Step 4: Verify Connection

  1. Check agent pod is running:
kubectl get pods -n incidentfox
You should see:
NAME                                READY   STATUS    RESTARTS   AGE
incidentfox-agent-xxx-yyy           1/1     Running   0          30s
  2. Check agent logs for successful connection:
kubectl logs -n incidentfox -l app.kubernetes.io/name=incidentfox-k8s-agent
Look for:
{"event": "connected_to_gateway", "cluster_name": "prod-us-east-1"}
  3. Verify in dashboard:
    • Go to Settings → Integrations → Kubernetes
    • Your cluster should show Status: Connected
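The connected_to_gateway event from step 2 is plain JSON, so a wrapper script can assert on it directly. This sketch assumes only the two fields shown in the example log line:

```python
import json

def is_connected(log_line, expected_cluster):
    """True if a structured log line reports a successful gateway connection."""
    try:
        event = json.loads(log_line)
    except ValueError:
        return False  # not a JSON log line
    return (event.get("event") == "connected_to_gateway"
            and event.get("cluster_name") == expected_cluster)
```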

Usage

Once connected, ask IncidentFox about your cluster:
@incidentfox show me failing pods in prod-us-east-1
@incidentfox what's happening with deployment nginx in staging?
@incidentfox get logs from pod api-server-xxx in production
If you have multiple clusters, specify which one:
@incidentfox list pods in namespace payments on cluster prod-us-east-1

RBAC Permissions

The agent uses a ClusterRole to access Kubernetes resources. By default, it has read-only access to:
Resource      Permissions
Pods          get, list, watch
Pod logs      get
Deployments   get, list, watch
ReplicaSets   get, list, watch
Services      get, list, watch
Nodes         get, list, watch
Events        get, list, watch
ConfigMaps    get, list, watch
Namespaces    get, list
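Expressed as RBAC policy rules, the defaults in the table correspond to something like the following. This is reconstructed from the table above; the actual chart may group the resources differently:

```python
READ = ["get", "list", "watch"]

# Default read-only rules, mirroring the table above.
DEFAULT_RULES = [
    {"apiGroups": [""],     "resources": ["pods"], "verbs": READ},
    {"apiGroups": [""],     "resources": ["pods/log"], "verbs": ["get"]},
    {"apiGroups": ["apps"], "resources": ["deployments", "replicasets"], "verbs": READ},
    {"apiGroups": [""],     "resources": ["services", "nodes", "events", "configmaps"], "verbs": READ},
    {"apiGroups": [""],     "resources": ["namespaces"], "verbs": ["get", "list"]},
]

def is_read_only(rules):
    """Verify no rule grants write verbs (create, update, patch, delete)."""
    allowed = {"get", "list", "watch"}
    return all(set(rule["verbs"]) <= allowed for rule in rules)
```

A check like `is_read_only` is a useful guard if you extend `additionalRules` and want to be sure the agent stays read-only.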

Customizing RBAC

To restrict or expand permissions, use Helm values:
# values.yaml
rbac:
  # Only allow access to specific namespaces
  namespaceRestriction:
    enabled: true
    namespaces:
      - production
      - staging

  # Add custom rules
  additionalRules:
    - apiGroups: ["apps"]
      resources: ["statefulsets"]
      verbs: ["get", "list", "watch"]
Apply with:
helm upgrade incidentfox-agent incidentfox/incidentfox-k8s-agent \
  --namespace incidentfox \
  -f values.yaml

Managing Multiple Clusters

Add multiple clusters by repeating the setup for each:
  1. Generate a new API key for each cluster
  2. Install the agent with a unique release name:
# Production cluster
helm install incidentfox-agent-prod incidentfox/incidentfox-k8s-agent \
  --namespace incidentfox \
  --set apiKey=ixfx_k8s_PROD_KEY \
  --set clusterName=prod-us-east-1

# Staging cluster (in a different cluster context)
helm install incidentfox-agent-staging incidentfox/incidentfox-k8s-agent \
  --namespace incidentfox \
  --set apiKey=ixfx_k8s_STAGING_KEY \
  --set clusterName=staging
In the dashboard, you’ll see all connected clusters and can query any of them.

Revoking Access

To disconnect a cluster:
  1. Uninstall the agent:
helm uninstall incidentfox-agent -n incidentfox
  2. Revoke the API key in the dashboard:
    • Go to Settings → Integrations → Kubernetes
    • Find the cluster and click “Revoke”
Revoking the key immediately disconnects the agent, even if it’s still running.

Troubleshooting

Agent not connecting

Check pod status:
kubectl describe pod -n incidentfox -l app.kubernetes.io/name=incidentfox-k8s-agent
Common issues:
Symptom                     Cause                    Solution
ImagePullBackOff            Can't pull agent image   Check network/registry access
CrashLoopBackOff            Invalid API key          Verify the API key in the secret
Running but not connected   Network blocked          Allow outbound HTTPS to the gateway
Check logs:
kubectl logs -n incidentfox -l app.kubernetes.io/name=incidentfox-k8s-agent --tail=100

Connection drops frequently

The agent automatically reconnects with exponential backoff. Frequent disconnections may indicate:
  • Unstable network connection
  • Gateway maintenance (check status.incidentfox.ai)
  • Resource constraints on the agent pod
Check resource usage:
kubectl top pod -n incidentfox
Increase resources if needed:
helm upgrade incidentfox-agent incidentfox/incidentfox-k8s-agent \
  --namespace incidentfox \
  --set resources.requests.memory=256Mi \
  --set resources.limits.memory=512Mi

Permission denied errors

If IncidentFox reports permission errors when querying resources:
  1. Check the ClusterRole exists:
kubectl get clusterrole incidentfox-agent
  2. Verify ClusterRoleBinding:
kubectl get clusterrolebinding incidentfox-agent
  3. Test permissions manually:
kubectl auth can-i list pods --as=system:serviceaccount:incidentfox:incidentfox-agent

Security

Concern                  How we address it
API key security         Keys are hashed with SHA-256 + pepper; plaintext is never stored
Transport                All traffic encrypted via TLS (HTTPS)
Agent permissions        You control RBAC; the default is read-only
Multi-tenant isolation   Each team's clusters are isolated; agents can only access their team's data
Audit logging            All commands from IncidentFox are logged
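The "SHA-256 + pepper" scheme works along these lines. This is an illustrative sketch, not IncidentFox's actual implementation; here the server-side pepper is used as an HMAC key:

```python
import hashlib
import hmac

def hash_api_key(key, pepper):
    """Hash an API key with a server-side pepper; only the hash is stored."""
    return hmac.new(pepper, key.encode(), hashlib.sha256).hexdigest()

def verify_api_key(presented, stored_hash, pepper):
    """Constant-time comparison of the presented key against the stored hash."""
    return hmac.compare_digest(hash_api_key(presented, pepper), stored_hash)
```

Because the pepper never leaves the server, a leaked database of key hashes cannot be checked against candidate keys offline.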
