Welcome to IncidentFox
IncidentFox is an AI SRE / AI On-Call engineer that integrates with your observability stack, infrastructure, and collaboration tools to automatically investigate incidents, find root causes, and suggest fixes.
Quick Start
Get IncidentFox up and running in minutes
How It Works
Understand the multi-agent architecture
Configuration
Configure agents, tools, and prompts
Integrations
Connect to Slack, GitHub, PagerDuty, and more
Key Features
Dual-Runtime Agent Architecture
IncidentFox uses two agent runtimes, each suited to a different mode of work:
- OpenAI SDK Agent - Production automation with multi-agent orchestration (Planner + Specialists)
- Claude SDK SRE Agent - Interactive debugging with Kubernetes sandbox isolation
300+ Built-in Tools
Pre-built integrations across 20+ categories:
- Kubernetes: Pod logs, events, deployments, resource usage (9 tools)
- AWS: EC2, Lambda, RDS, ECS, CloudWatch (8+ tools)
- Observability: Grafana, Datadog, Prometheus, Coralogix, Sentry, New Relic (15+ tools)
- Log Analysis: Statistics, sampling, pattern search, anomaly detection (7 tools)
- Docker: Container logs, stats, exec, events (15 tools)
- GitHub: Code search, PRs, issues, Actions, commits (16 tools)
- Database: MySQL, PostgreSQL, Snowflake, BigQuery (70+ tools)
- And more: PagerDuty, Slack, Linear, Jira, Confluence, Terraform…
RAPTOR Knowledge Base
Hierarchical knowledge retrieval system based on ICLR 2024 research:
- Handles 100+ page runbooks without context loss
- Knowledge graphs for service dependencies and ownership
- Learns from past investigations to improve over time
- Multi-level abstraction: procedural, factual, temporal, policy
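To make the idea of hierarchical retrieval concrete, here is a minimal toy sketch of a RAPTOR-style tree. It is not IncidentFox's implementation: the published method relies on embedding-based clustering and LLM-written summaries, whereas this sketch groups neighboring chunks and uses a truncating placeholder summarizer and word-overlap scoring. The names `Node`, `summarize`, `build_tree`, and `retrieve` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    level: int  # 0 = original chunk, higher = more abstract summary
    children: list = field(default_factory=list)


def summarize(texts):
    # Placeholder for an LLM summary (assumption): keep the first 8 words of each child.
    return " ".join(" ".join(t.split()[:8]) for t in texts)


def build_tree(chunks, group_size=2):
    """Build summary levels bottom-up and return every node from every level."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    level = [Node(c, 0) for c in chunks]
    all_nodes = list(level)
    depth = 0
    while len(level) > 1:
        depth += 1
        parents = []
        for i in range(0, len(level), group_size):
            group = level[i:i + group_size]
            parents.append(Node(summarize([n.text for n in group]), depth, group))
        all_nodes.extend(parents)
        level = parents
    return all_nodes


def retrieve(nodes, query, k=2):
    """Rank nodes from all levels together by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(
        nodes,
        key=lambda n: len(q & set(n.text.lower().split())),
        reverse=True,
    )[:k]


runbook = [
    "restart the payment pod",
    "check database replica lag",
    "rotate api credentials",
    "scale the worker deployment",
]
nodes = build_tree(runbook)
for hit in retrieve(nodes, "database replica lag"):
    print(hit.level, hit.text)
```

Searching leaves and summaries in one pool lets a query match either a specific step or a broader summary, which is the property that helps with long runbooks.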
Multiple Trigger Sources
Invoke IncidentFox from wherever your team works:
- Slack - Mention the bot in any channel
- GitHub - Comment on issues or PRs
- PagerDuty - Automatic investigation on alerts
- Incident.io - Integrated incident response
- REST API - Programmatic access
- Web UI - Dashboard for investigations and configuration
Advanced AI/ML Capabilities
Statistical and ML-based analysis of metrics and incident history:
- Anomaly Detection - Z-score, Prophet-based seasonal detection
- Forecasting - Capacity planning with uncertainty bounds
- Correlation Analysis - Cross-service metric relationships
- Change Point Detection - Identify when issues started
- Pattern Learning - Records and reuses incident patterns
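As an illustration of the simplest technique in the list above, the following is a minimal sketch of z-score anomaly detection using only the Python standard library. The function name `zscore_anomalies` and the default threshold of 3.0 are assumptions for this example, not IncidentFox's actual API or settings.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of points whose |z-score| exceeds the threshold."""
    if len(values) < 2:
        return []  # not enough data to estimate spread
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # constant series: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical latency samples in milliseconds: steady traffic, then one spike.
latencies = [10] * 20 + [100]
print(zscore_anomalies(latencies))  # [20]
```

A plain z-score assumes a roughly stationary series. Metrics with daily or weekly cycles need a seasonal model, which is the case the Prophet-based detection in the list addresses.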
Enterprise Security & Compliance
Built for enterprise security and compliance:
- SOC 2 compliant infrastructure
- Claude Sandbox isolation with Kubernetes + gVisor
- Credentials proxy (Envoy) - secrets never touch the agent
- SSO/OIDC authentication (Google, Azure AD, Okta)
- Approval workflows for critical changes
- Full audit logging
- On-premise and air-gapped deployment options
What Can IncidentFox Do?
Incident Investigation
When an incident occurs, IncidentFox automatically:
- Gathers Context - Pulls logs, metrics, and recent changes from your observability stack
- Analyzes Root Cause - Correlates data across services to identify the issue
- Provides Timeline - Reconstructs what happened and when
- Suggests Fixes - Recommends actionable remediation steps
CI/CD Auto-Fix
When your CI pipeline fails, IncidentFox can:
- Detect Failures - Monitors GitHub Actions, CodePipeline, and other CI systems
- Analyze Logs - Reads test output and build errors
- Identify Root Cause - Correlates failures with code changes in the PR
- Propose Fixes - Suggests code changes to resolve the issue
Proactive Monitoring
IncidentFox can monitor your systems and alert before issues escalate:
- Anomaly Detection - Prophet-based forecasting identifies unusual patterns
- Correlation Analysis - Links metrics across services to find relationships
- Knowledge Base - RAPTOR hierarchical retrieval learns from runbooks and past incidents
- Alert Correlation - Connects Prometheus, Alertmanager, and PagerDuty alerts
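The correlation analysis mentioned above can be illustrated with a Pearson correlation between two metric series. This is a minimal sketch under stated assumptions: the function name `pearson` and the sample metrics are hypothetical, and nothing here reflects how IncidentFox computes correlations internally.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series has no defined correlation; treat as none
    return cov / sqrt(vx * vy)


# Hypothetical per-minute samples from two services.
db_latency_ms = [12, 15, 30, 55, 80]
api_error_rate = [0.1, 0.2, 0.9, 2.1, 3.5]
print(round(pearson(db_latency_ms, api_error_rate), 3))
```

A coefficient near 1 or -1 suggests the two metrics move together, which makes the pair a candidate for closer investigation. Correlation alone does not establish which service caused the problem.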
Getting Started
Connect Your Data Sources
Configure connections to your observability stack (Coralogix, Datadog, Grafana, etc.)
Set Up Integrations
Connect IncidentFox to Slack, GitHub, or PagerDuty for triggering investigations

