Agent Configuration

Overview

IncidentFox agents can be customized through configuration to:

Modify system prompts
Enable or disable specific tools
Add custom context about your infrastructure
Tune behavior for your team’s needs

Agent Types

Agent	Purpose	Default Tools
`planner`	Orchestrate investigations	None (delegates to others)
`k8s_agent`	Kubernetes troubleshooting	9 K8s-specific tools
`aws_agent`	AWS resource debugging	8 AWS-specific tools
`metrics_agent`	Anomaly detection	22 metrics/analytics tools
`coding_agent`	Code analysis	15 code/git tools
`investigation_agent`	Full toolkit investigations	30+ tools from all categories

Configuration Structure

Each agent is configured under the top-level `agents` key, keyed by agent name:

{
  "agents": {
    "investigation_agent": {
      "prompt": "System prompt for the agent...",
      "enabled": true,
      "disable_default_tools": ["shell", "docker_exec"],
      "enable_extra_tools": ["custom-runbook-search"]
    },
    "code_fix_agent": {
      "enabled": false
    }
  }
}
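As an illustration of the structure above, here is a minimal Python sketch that checks a configuration document before it is applied. It is not part of IncidentFox; the function name `check_agents_config` is hypothetical, and the only option names it accepts are the four documented on this page.

```python
import json

# Option names documented on this page; anything else is flagged as unknown.
KNOWN_OPTIONS = {"prompt", "enabled", "disable_default_tools", "enable_extra_tools"}


def check_agents_config(raw: str) -> list[str]:
    """Return a list of human-readable problems found in a config document."""
    config = json.loads(raw)
    agents = config.get("agents") if isinstance(config, dict) else None
    if not isinstance(agents, dict):
        return ['top-level "agents" key must be an object']

    problems = []
    for name, options in agents.items():
        if not isinstance(options, dict):
            problems.append(f"{name}: options must be an object")
            continue
        # Catch typos such as "enable" instead of "enabled".
        for key in sorted(set(options) - KNOWN_OPTIONS):
            problems.append(f"{name}: unknown option {key!r}")
        if "prompt" in options and not isinstance(options["prompt"], str):
            problems.append(f"{name}: prompt must be a string")
        if "enabled" in options and not isinstance(options["enabled"], bool):
            problems.append(f"{name}: enabled must be true or false")
        for key in ("disable_default_tools", "enable_extra_tools"):
            value = options.get(key, [])
            if not (isinstance(value, list) and all(isinstance(t, str) for t in value)):
                problems.append(f"{name}: {key} must be a list of tool names")
    return problems


if __name__ == "__main__":
    sample = '{"agents": {"code_fix_agent": {"enable": false}}}'
    for problem in check_agents_config(sample):
        print(problem)
```

Running a check like this in CI can catch misspelled option names before a configuration change reaches an agent.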

Configuration Options

`prompt`

The system prompt that defines the agent’s behavior, knowledge, and communication style.

{
  "agents": {
    "investigation_agent": {
      "prompt": "You are an AI SRE agent for Acme Corp. Our infrastructure runs on AWS EKS in us-west-2. Key services include: payments (critical), cart (high), catalog (medium). Always check CloudWatch metrics first, then pod logs. Escalate P1 incidents immediately to #incidents-critical."
    }
  }
}

Include context about your infrastructure in the prompt:

Service criticality tiers
Common failure patterns
Escalation procedures
Team-specific runbooks to reference

`enabled`

Toggle an agent on or off. Defaults to `true`.

{
  "agents": {
    "code_fix_agent": {
      "enabled": false
    }
  }
}
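The default can be made concrete with a short Python sketch. It assumes the configuration has already been parsed into a dictionary; `enabled_agents` is a hypothetical helper, not an IncidentFox function.

```python
def enabled_agents(config: dict) -> list[str]:
    """Names of agents that are not explicitly switched off."""
    agents = config.get("agents", {})
    return sorted(
        name
        for name, options in agents.items()
        if options.get("enabled", True)  # a missing key means the agent is on
    )


config = {
    "agents": {
        "investigation_agent": {"prompt": "..."},  # no "enabled" key
        "code_fix_agent": {"enabled": False},
    }
}
print(enabled_agents(config))  # ['investigation_agent']
```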

`disable_default_tools`

Remove specific tools from an agent’s default toolkit. Useful for security or compliance.

{
  "agents": {
    "investigation_agent": {
      "disable_default_tools": [
        "shell",
        "docker_exec",
        "db_write"
      ]
    }
  }
}

Disabling critical tools can reduce investigation effectiveness. Test the change thoroughly before applying it in production.

`enable_extra_tools`

Add tools beyond the agent’s default set.

{
  "agents": {
    "investigation_agent": {
      "enable_extra_tools": [
        "coralogix",
        "snowflake",
        "custom-runbooks"
      ]
    }
  }
}
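The sketch below shows one plausible way `disable_default_tools` and `enable_extra_tools` could combine into an agent's final toolset. The order of operations (remove disabled defaults first, then append extras) is an assumption, and the default tool names in the example are hypothetical placeholders rather than IncidentFox's real defaults.

```python
def effective_tools(default_tools: list[str], options: dict) -> list[str]:
    """Apply disable_default_tools, then enable_extra_tools, keeping order."""
    disabled = set(options.get("disable_default_tools", []))
    tools = [tool for tool in default_tools if tool not in disabled]
    for extra in options.get("enable_extra_tools", []):
        if extra not in tools:  # avoid listing a tool twice
            tools.append(extra)
    return tools


# Hypothetical defaults, for illustration only.
defaults = ["pod_logs", "shell", "docker_exec"]
options = {
    "disable_default_tools": ["shell", "docker_exec"],
    "enable_extra_tools": ["coralogix"],
}
print(effective_tools(defaults, options))  # ['pod_logs', 'coralogix']
```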

Writing Effective Prompts

Structure

A well-structured agent prompt includes:

Role definition - What the agent is and does
Context - Information about your infrastructure
Guidelines - How to approach investigations
Constraints - What to avoid or be careful about
Output format - How to structure responses
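If prompts are maintained in version control, the five parts listed above can be assembled from structured data rather than edited as one long string. The following is a minimal sketch under that assumption; `build_prompt` and `agent_config_json` are hypothetical helpers, and the `##` headings simply mirror the style of the example prompts on this page.

```python
import json


def build_prompt(role: str, sections: dict[str, list[str]]) -> str:
    """Join a role statement and bulleted sections into one prompt string."""
    parts = [role.strip()]
    for heading, items in sections.items():
        if not items:  # skip sections that have nothing to say
            continue
        bullets = "\n".join(f"- {item}" for item in items)
        parts.append(f"## {heading}\n{bullets}")
    return "\n\n".join(parts)


def agent_config_json(agent: str, prompt: str) -> str:
    """Wrap the prompt in the configuration structure used on this page."""
    return json.dumps({"agents": {agent: {"prompt": prompt}}}, indent=2)


prompt = build_prompt(
    "You are an AI SRE agent for Acme Corp's platform team.",
    {
        "Infrastructure Context": ["Cloud: AWS (us-west-2, us-east-1)"],
        "Investigation Guidelines": ["Check recent deployments (last 4 hours) first"],
        "Constraints": ["Never execute remediation without approval"],
        "Response Format": ["Summary (1-2 sentences)", "Root cause with confidence level"],
    },
)
print(agent_config_json("investigation_agent", prompt))
```

Letting `json.dumps` handle the escaping avoids hand-editing newlines and quotes inside the `prompt` value.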

Example: Investigation Agent

You are an AI SRE agent for Acme Corp's platform team.

## Infrastructure Context
- Cloud: AWS (us-west-2, us-east-1)
- Orchestration: EKS (Kubernetes 1.28)
- Key Services:
  - payments-service (P0 - business critical)
  - cart-service (P1 - customer facing)
  - catalog-service (P2 - internal)
  - analytics-service (P3 - batch processing)

## Observability Stack
- Logs: Coralogix (primary), CloudWatch (backup)
- Metrics: Grafana Cloud + Prometheus
- Traces: Datadog APM
- Alerts: PagerDuty -> Slack #incidents

## Investigation Guidelines
1. Always start by identifying affected services and their criticality
2. Check recent deployments (last 4 hours) first
3. Query Coralogix for error logs before CloudWatch
4. For database issues, check RDS Performance Insights
5. Correlate with recent PRs merged to main

## Response Format
Always include:
- Summary (1-2 sentences)
- Root cause with confidence level
- Evidence (specific logs, metrics, or events)
- Timeline of events
- Recommended actions with priority

## Constraints
- Never execute remediation without approval
- Escalate P0/P1 incidents immediately
- Don't access production databases directly

Example: Slack Bot Agent

You are the IncidentFox Slack bot for Acme Corp.

## Communication Style
- Be concise and actionable
- Use bullet points for multiple items
- Include confidence levels when uncertain
- Link to dashboards and runbooks when relevant

## Quick Commands
When users say:
- "check [service]" -> Run health check on service
- "logs [service]" -> Fetch recent error logs
- "who's oncall" -> Check PagerDuty schedule
- "deploy status" -> Check recent deployments

## Escalation
For P0/P1, immediately ping @oncall-platform and post to #incidents-critical.

Agent Specialization

Creating Workflow-Specific Agents

You can create specialized agents for different workflows.

CI/CD Investigation Agent:

{
  "agents": {
    "ci_investigation_agent": {
      "prompt": "You specialize in CI/CD failures. Focus on: build logs, test output, dependency changes, environment differences between PR and main.",
      "enable_extra_tools": ["github_actions", "codepipeline", "ecr"]
    }
  }
}

Database Investigation Agent:

{
  "agents": {
    "db_investigation_agent": {
      "prompt": "You specialize in database performance issues. Check RDS metrics, slow query logs, connection pools, and recent schema changes.",
      "enable_extra_tools": ["rds_insights", "pg_stat_statements", "snowflake"]
    }
  }
}

Tuning Tips

Improve Root Cause Accuracy

Add service dependencies to the prompt
Include common failure patterns you’ve seen
Specify data source priority (which to check first)
Add context about recent changes (migrations, refactors)

Reduce Investigation Time

Prioritize fast data sources in the prompt
Include known quick wins (common issues and solutions)
Set appropriate timeouts for tool execution

Improve Response Quality

Define output format explicitly
Include examples of good responses
Specify confidence thresholds for recommendations
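Once the output format is defined explicitly, responses can be checked against it automatically. This sketch assumes the response is plain text and uses the section names from the investigation agent example on this page; `missing_sections` is a hypothetical helper and the substring match is deliberately simple.

```python
# Section names taken from the "Response Format" part of the example prompt.
REQUIRED_SECTIONS = [
    "Summary",
    "Root cause",
    "Evidence",
    "Timeline",
    "Recommended actions",
]


def missing_sections(response: str, required: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return the required section names that never appear in the response."""
    text = response.lower()
    return [name for name in required if name.lower() not in text]


response = """Summary: checkout latency spiked after a deploy.
Root cause (confidence: medium): connection pool exhaustion.
Evidence: error logs show repeated pool timeouts."""
print(missing_sections(response))  # ['Timeline', 'Recommended actions']
```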

Validation

Before deploying prompt changes:

Test in staging with known scenarios
Compare results with previous prompt version
Check for regressions in accuracy or speed

If approval workflows are enabled, prompt changes require admin approval before taking effect.
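The comparison against a previous prompt version can be sketched as a small regression check. Everything here is an assumption for illustration: `investigate` stands in for whatever function runs an investigation with a given prompt version in your staging environment, and a scenario counts as passed when the expected root cause appears in the answer.

```python
from typing import Callable


def accuracy(investigate: Callable[[str], str], scenarios: dict[str, str]) -> float:
    """Fraction of known scenarios whose expected root cause is in the answer."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    hits = sum(
        1
        for incident, expected in scenarios.items()
        if expected.lower() in investigate(incident).lower()
    )
    return hits / len(scenarios)


def has_regressed(old, new, scenarios: dict[str, str], tolerance: float = 0.0) -> bool:
    """True when the new prompt scores lower than the old one beyond tolerance."""
    return accuracy(new, scenarios) < accuracy(old, scenarios) - tolerance


# Known incidents mapped to the root cause a good answer should mention.
scenarios = {
    "checkout 5xx spike": "connection pool",
    "cart pods crashlooping": "OOMKilled",
}


# Stubs standing in for real staging runs of each prompt version.
def old_prompt_run(incident: str) -> str:
    return "Root cause: connection pool exhaustion"


def new_prompt_run(incident: str) -> str:
    if "cart" in incident:
        return "Root cause: pods were OOMKilled"
    return "Root cause: connection pool exhaustion"


print(has_regressed(old_prompt_run, new_prompt_run, scenarios))  # False
```

Timing each run in the same loop would extend the check from accuracy to speed.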

Next Steps

Tool Configuration

Custom Prompts
