
Overview

This guide walks you through setting up IncidentFox for your team. By the end, you’ll have an AI SRE agent ready to investigate incidents from Slack.
IncidentFox is deployed as a managed service. Contact your account team for access credentials and your team token.

Step 1: Get Your Credentials

After onboarding, you’ll receive:
  • Team Token - Used to authenticate API requests
  • Web UI Access - Dashboard to configure agents and view investigation history
  • Slack App - Bot to install in your workspace
Your team token follows the format tokid.toksecret. Keep it secure and never commit it to version control.
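One way to keep the token out of version control is to read it from the environment at runtime. The following is a minimal sketch, not part of IncidentFox itself: the variable name INCIDENTFOX_TEAM_TOKEN and the helper functions are hypothetical, and the code assumes the token splits on its first dot.

```python
import os


def split_team_token(token: str) -> tuple[str, str]:
    """Split a team token of the form tokid.toksecret into its two parts.

    Assumption: the id never contains a dot, so we split on the first one.
    """
    token_id, sep, token_secret = token.strip().partition(".")
    if not sep or not token_id or not token_secret:
        raise ValueError("team token must look like tokid.toksecret")
    return token_id, token_secret


def load_team_token(env_var: str = "INCIDENTFOX_TEAM_TOKEN") -> str:
    """Read the token from the environment (the variable name is hypothetical)."""
    token = os.environ.get(env_var)
    if token is None:
        raise RuntimeError(f"{env_var} is not set")
    split_team_token(token)  # raises ValueError if the shape is wrong
    return token.strip()


def redact(token: str) -> str:
    """Return a log-safe form that keeps the id and hides the secret."""
    token_id, _ = split_team_token(token)
    return f"{token_id}.***"
```

Logging only the redacted form makes it possible to tell which token is in use without exposing the secret half.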

Step 2: Install the Slack Bot

The Slack bot is the primary interface for triggering investigations.
1. Install the App

Your IncidentFox admin will provide an installation link. Click it and authorize the app for your Slack workspace.
2. Invite to Channels

Invite @incidentfox to channels where you want to trigger investigations:
/invite @incidentfox
3. Test the Connection

Send a test message:
@incidentfox hello
The bot should respond confirming it’s connected.

Step 3: Configure Data Sources

Connect IncidentFox to your observability stack to enable investigations.

Via Web UI

  1. Log in to your IncidentFox dashboard
  2. Navigate to Team Console > Integrations
  3. Click Add Integration and select your data source
  4. Enter the required credentials

Common Data Sources

Platform | Required Credentials
Coralogix | API Key, Domain
Snowflake | Account, Username, Password, Warehouse
AWS | Access Key, Secret Key, Region
Datadog | API Key, Application Key
Grafana | URL, API Key
GitHub | Personal Access Token
See the Data Sources section for detailed setup instructions for each platform.
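Before opening the Add Integration form, it can help to confirm you have every credential on hand. The sketch below encodes the table above as a Python dictionary and reports which fields are still missing. The function name and dictionary are illustrative only and do not correspond to any IncidentFox API.

```python
# Required credentials per platform, copied from the table above.
REQUIRED_CREDENTIALS = {
    "Coralogix": ["API Key", "Domain"],
    "Snowflake": ["Account", "Username", "Password", "Warehouse"],
    "AWS": ["Access Key", "Secret Key", "Region"],
    "Datadog": ["API Key", "Application Key"],
    "Grafana": ["URL", "API Key"],
    "GitHub": ["Personal Access Token"],
}


def missing_credentials(platform: str, provided: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank for a platform."""
    required = REQUIRED_CREDENTIALS.get(platform)
    if required is None:
        raise KeyError(f"unknown platform: {platform}")
    return [field for field in required if not provided.get(field, "").strip()]
```

For example, calling missing_credentials("Datadog", {"API Key": "..."}) would report that the Application Key is still needed.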

Step 4: Enable Tools

IncidentFox comes with 50+ built-in tools. Enable the ones relevant to your stack. In the Web UI:
  1. Go to Team Console > Tools
  2. Toggle on the tools you need
  3. Configure any tool-specific settings
Example configuration:
{
  "tools": {
    "kubernetes": {
      "enabled": true,
      "kubeconfig_path": "~/.kube/config"
    },
    "aws": {
      "enabled": true,
      "region": "us-west-2"
    },
    "coralogix": {
      "enabled": true,
      "api_key": "vault://secrets/coralogix"
    }
  }
}
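If you keep a copy of this configuration alongside your other team settings, a small check can catch mistakes before you apply it. The sketch below parses the example above, lists the enabled tools, and flags any api_key value that is written inline rather than as a vault:// reference. Treating api_key as the only secret-bearing setting is an assumption based on the example, and both helper functions are hypothetical.

```python
import json

# The example configuration from this guide.
EXAMPLE_CONFIG = """
{
  "tools": {
    "kubernetes": {"enabled": true, "kubeconfig_path": "~/.kube/config"},
    "aws": {"enabled": true, "region": "us-west-2"},
    "coralogix": {"enabled": true, "api_key": "vault://secrets/coralogix"}
  }
}
"""

# Assumption: these setting names hold secrets; extend as needed.
SECRET_KEYS = {"api_key"}


def enabled_tools(config: dict) -> list[str]:
    """Return the names of tools whose "enabled" flag is true, sorted."""
    tools = config.get("tools", {})
    return sorted(name for name, cfg in tools.items() if cfg.get("enabled") is True)


def inline_secrets(config: dict) -> list[str]:
    """Return "tool.key" for secret settings that are not vault:// references."""
    findings = []
    for name, cfg in config.get("tools", {}).items():
        for key, value in cfg.items():
            if key in SECRET_KEYS and not str(value).startswith("vault://"):
                findings.append(f"{name}.{key}")
    return findings


if __name__ == "__main__":
    config = json.loads(EXAMPLE_CONFIG)
    print("enabled:", enabled_tools(config))
    print("inline secrets:", inline_secrets(config))
```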

Step 5: Start Investigating

You’re ready to go! Try these commands in Slack:

Basic Investigation

@incidentfox investigate high latency in the payments service

Check Pod Status

@incidentfox check the status of cart pods in production

Analyze CI Failure

@incidentfox why did the build fail on PR #123?

Query Logs

@incidentfox search for errors in the checkout service logs from the last hour

What Happens During an Investigation

When you trigger an investigation, IncidentFox:
  1. Acknowledges - Reacts to your message to confirm it’s working
  2. Plans - The Planner agent creates an investigation strategy
  3. Gathers Data - Pulls logs, metrics, and events from your data sources
  4. Analyzes - Correlates data to identify the root cause
  5. Reports - Posts findings back to the Slack thread
[Figure: Investigation Flow]

Example Investigation Output

{
  "summary": "Payment service experiencing elevated latency due to database connection pool exhaustion",
  "root_cause": {
    "description": "RDS connection pool at maximum capacity (100/100 connections)",
    "confidence": 92,
    "evidence": [
      "CloudWatch RDS connections metric at 100%",
      "Application logs show 'connection timeout' errors",
      "Spike correlates with deployment at 14:32 UTC"
    ]
  },
  "timeline": [
    "14:32 - New deployment rolled out",
    "14:35 - Connection count started increasing",
    "14:42 - Connection pool exhausted",
    "14:43 - Latency alerts fired"
  ],
  "affected_systems": ["payment-service", "checkout-service", "RDS primary"],
  "recommendations": [
    "Increase RDS max_connections parameter",
    "Review connection pool settings in application config",
    "Consider rolling back deployment if issue persists"
  ]
}
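If you want to post-process a result like the one above, for example to paste it into a ticket, the following sketch renders it as plain text. It assumes the field names shown in the example and a confidence value on a 0 to 100 scale. The render_report function and its 80-point threshold are hypothetical choices for illustration, not IncidentFox behavior.

```python
def render_report(report: dict, min_confidence: int = 80) -> str:
    """Render an investigation result as a short plain-text summary.

    Field names follow the example output; missing fields are tolerated.
    """
    root = report.get("root_cause", {})
    confidence = root.get("confidence", 0)
    # Assumption: confidence is on a 0-100 scale; the threshold is arbitrary.
    label = "likely" if confidence >= min_confidence else "tentative"
    lines = [
        f"Summary: {report.get('summary', 'n/a')}",
        f"Root cause ({label}, confidence {confidence}): "
        f"{root.get('description', 'n/a')}",
    ]
    sections = [
        ("Evidence", root.get("evidence", [])),
        ("Timeline", report.get("timeline", [])),
        ("Affected systems", report.get("affected_systems", [])),
        ("Recommendations", report.get("recommendations", [])),
    ]
    for heading, items in sections:
        if items:  # skip empty sections entirely
            lines.append(f"{heading}:")
            lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)
```

Load the JSON with json.loads and pass the resulting dictionary to render_report to get the text form.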

Next Steps