Overview
The Incident.io integration enables IncidentFox to:
- Automatically investigate when incidents are created
- Post findings to incident channels
- Enrich incident timelines with investigation data
- Correlate incidents with recent changes
Prerequisites
- Incident.io account with API access
- Webhook configuration permissions
- Slack integration configured (for responses)
Setup
Step 1: Configure the Webhook in Incident.io
- Log in to Incident.io
- Go to Settings > Integrations > Webhooks
- Click Add Webhook
- Configure:
  - URL: https://api.incidentfox.ai/api/incident-io/webhook
  - Events: incident.created, incident.updated
- Copy the signing secret
- Save
Step 2: Add to IncidentFox
{
  "integrations": {
    "incident_io": {
      "enabled": true,
      "webhook_secret": "vault://secrets/incident-io-webhook-secret",
      "auto_investigate": true,
      "severity_threshold": "high"
    }
  }
}
Configuration Options
| Option | Description | Default |
|---|---|---|
| auto_investigate | Automatically start investigation | true |
| severity_threshold | Minimum severity to auto-investigate | medium |
| post_to_channel | Post findings to incident channel | true |
| create_timeline_entry | Add to incident timeline | true |
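The defaults in the table can be read as a base layer that your configuration overrides. The following is a minimal sketch of that merge, not IncidentFox's actual loader: the function name `effective_options` is hypothetical, and it assumes that any option you omit simply falls back to the documented default.

```python
# Defaults copied from the Configuration Options table.
DEFAULTS = {
    "auto_investigate": True,
    "severity_threshold": "medium",
    "post_to_channel": True,
    "create_timeline_entry": True,
}


def effective_options(config: dict) -> dict:
    """Return the incident_io options with documented defaults filled in.

    Hypothetical helper for illustration; assumes omitted keys use defaults.
    """
    overrides = config.get("integrations", {}).get("incident_io", {})
    # Later keys win, so explicit settings replace the defaults.
    return {**DEFAULTS, **overrides}


config = {"integrations": {"incident_io": {"enabled": True, "severity_threshold": "high"}}}
options = effective_options(config)
print(options["severity_threshold"])  # explicit value from the config
print(options["post_to_channel"])     # falls back to the default
```

With this config, only `severity_threshold` differs from the table; every other option keeps its default.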
How It Works
- Incident created in Incident.io
- Webhook fires to IncidentFox
- Investigation starts with incident context
- Findings posted to incident Slack channel
- Timeline updated with investigation summary
Automatic Investigation
When an incident is created, IncidentFox:
- Extracts context from incident title and description
- Identifies services mentioned in the incident
- Queries data sources for relevant logs/metrics
- Correlates with changes in the last 4 hours
- Posts findings to the incident channel
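The change-correlation step can be pictured as a time-window filter. The sketch below is illustrative only: the change records (dicts with `service` and `at` keys), the function name `recent_changes`, and the example date are assumptions, and the way IncidentFox actually sources and ranks changes is not described here. Only the 4-hour window comes from the steps above.

```python
from datetime import datetime, timedelta, timezone

# Lookback window taken from the step list above.
LOOKBACK = timedelta(hours=4)


def recent_changes(changes, incident_time, lookback=LOOKBACK):
    """Return changes made within the lookback window, newest first.

    `changes` is a hypothetical list of dicts with "service" and "at" keys.
    """
    window_start = incident_time - lookback
    in_window = [c for c in changes if window_start <= c["at"] <= incident_time]
    return sorted(in_window, key=lambda c: c["at"], reverse=True)


# Arbitrary example date; the times mirror the example further down the page.
incident_time = datetime(2024, 1, 1, 14, 36, tzinfo=timezone.utc)
changes = [
    {"service": "payment-gateway", "at": datetime(2024, 1, 1, 14, 30, tzinfo=timezone.utc)},
    {"service": "search", "at": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)},
]
for change in recent_changes(changes, incident_time):
    print(change["service"], change["at"].isoformat())
```

Here the 09:00 change falls outside the window, so only the payment-gateway deploy is kept as a candidate.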
Example
Incident created:
Title: High error rate on checkout service
Description: PagerDuty alert fired. Customers reporting failed checkouts.
IncidentFox response (in incident channel):
Investigation Started
Context: High error rate detected on checkout-service
Severity: High
Investigating...
---
Preliminary Findings:
Summary: Checkout service experiencing 503 errors due to
upstream dependency failure.
Root Cause (Confidence: 87%):
• Payment gateway returning timeout errors
• Started at 14:32 UTC
• Correlates with payment-gateway deploy at 14:30
Evidence:
• Error logs: "upstream connect error: connection timeout"
• 99.9th percentile latency: 30s (normal: 200ms)
• Payment gateway pod restarted 3 times
Recommended Actions:
1. Check payment-gateway pod logs
2. Consider rollback of payment-gateway deployment
3. Enable circuit breaker if not already active
Timeline:
• 14:30 - payment-gateway v2.1.0 deployed
• 14:32 - First timeout errors
• 14:35 - Error rate exceeded threshold
• 14:36 - PagerDuty alert fired
• 14:36 - This incident created
Timeline Integration
IncidentFox can add entries to the Incident.io timeline:
{
  "integrations": {
    "incident_io": {
      "create_timeline_entry": true,
      "timeline_entry_types": [
        "investigation_start",
        "root_cause_found",
        "investigation_complete"
      ]
    }
  }
}
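One way to read this configuration is as an allow-list of entry types. The sketch below rests on two assumptions that are not confirmed by this page: that setting `create_timeline_entry` to false disables all entries, and that omitting `timeline_entry_types` means no filtering. The function name `should_record` and the event type `hypothesis_rejected` are hypothetical.

```python
def should_record(event_type: str, config: dict) -> bool:
    """Decide whether an investigation event becomes a timeline entry."""
    settings = config.get("integrations", {}).get("incident_io", {})
    # create_timeline_entry defaults to true per the Configuration Options table.
    if not settings.get("create_timeline_entry", True):
        return False
    allowed = settings.get("timeline_entry_types")
    # Assumption: a missing list means every entry type is allowed.
    return allowed is None or event_type in allowed


config = {
    "integrations": {
        "incident_io": {
            "create_timeline_entry": True,
            "timeline_entry_types": [
                "investigation_start",
                "root_cause_found",
                "investigation_complete",
            ],
        }
    }
}
print(should_record("root_cause_found", config))
print(should_record("hypothesis_rejected", config))  # hypothetical type, not in the list
```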
Severity Mapping
| Incident.io Severity | IncidentFox Priority |
|---|---|
| Critical | P0 |
| High | P1 |
| Medium | P2 |
| Low | P3 |
Configure the severity threshold:
{
  "integrations": {
    "incident_io": {
      "severity_threshold": "medium",
      "skip_low_severity": true
    }
  }
}
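The mapping and threshold can be expressed as a lookup plus an ordered comparison. This is a minimal sketch, assuming the threshold is inclusive (the options table describes it as the minimum severity to auto-investigate) and that severity names are matched case-insensitively. The function names are hypothetical.

```python
# Mapping copied from the Severity Mapping table.
SEVERITY_TO_PRIORITY = {"critical": "P0", "high": "P1", "medium": "P2", "low": "P3"}

# Severities ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def priority_for(severity: str) -> str:
    """Map an Incident.io severity to an IncidentFox priority."""
    return SEVERITY_TO_PRIORITY[severity.lower()]


def meets_threshold(severity: str, threshold: str = "medium") -> bool:
    """Return True when severity is at or above the threshold (assumed inclusive)."""
    return SEVERITY_ORDER.index(severity.lower()) >= SEVERITY_ORDER.index(threshold.lower())


print(priority_for("High"))                       # P1
print(meets_threshold("Low", threshold="medium"))  # False
```

Under these assumptions, a threshold of `medium` lets Medium, High, and Critical incidents through and skips Low ones.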
Best Practices
Set up IncidentFox investigation as one of your first actions in the incident workflow.
- Include service names in incident titles
- Add PagerDuty context when creating incidents
- Use structured descriptions for better parsing
- Review and iterate on auto-investigation findings
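To see why service names in titles help, consider a simple name matcher. The sketch below is not IncidentFox's parser, which is not documented here. It assumes a known list of service names and matches them against the title after normalizing case and punctuation. The names `normalize` and `services_mentioned` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def services_mentioned(text: str, known_services: list[str]) -> list[str]:
    """Return the known services whose normalized name appears as whole words."""
    haystack = f" {normalize(text)} "
    return [s for s in known_services if f" {normalize(s)} " in haystack]


known = ["checkout-service", "payment-gateway"]
print(services_mentioned("High error rate on checkout service", known))
print(services_mentioned("Checkout is broken", known))
```

The first title names the service in full and matches `checkout-service`. The second is too vague for this matcher to match anything.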
Troubleshooting
Webhook Not Triggering
- Verify webhook URL is correct
- Check signing secret matches
- Review Incident.io webhook logs
- Ensure events are selected
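A signing-secret mismatch usually shows up as a failed signature comparison on the receiving side. The sketch below shows the general idea with plain HMAC-SHA256 over the raw request body. This is an assumption for illustration: the real signing scheme, header names, and secret encoding used by Incident.io may differ, so check the Incident.io webhook documentation. The function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def signature_matches(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(secret, body), received_signature)


body = b'{"event_type": "incident.created"}'  # illustrative payload only
sent = sign(b"secret-copied-from-incident-io", body)

print(signature_matches(b"secret-copied-from-incident-io", body, sent))  # True
print(signature_matches(b"stale-or-mistyped-secret", body, sent))        # False
```

If the secret stored in IncidentFox differs from the one Incident.io signs with, every delivery fails this kind of check, even when the URL and events are correct.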
Missing Findings
- Check data source connectivity
- Verify services are named correctly
- Review investigation logs in Web UI
Next Steps