Debug a Failing Agent

When an autonomous agent fails, it rarely spits out a simple stack trace. It usually gets confused, stuck in a loop, or hallucinates bad tool payloads. Here is how to use Savine to diagnose the failure.

1. Agent Loops ("Loop threshold exceeded")

The Symptom: The task fails after 2 minutes with ERROR: Loop threshold exceeded.

The Diagnosis: Open the Trace view in the Dashboard. Look at the ACT steps. You will likely see the agent calling the exact same tool with the exact same arguments repeatedly.

Example: Calling python_exec with code that causes a SyntaxError: expected ':'. The agent sees the error, thinks about it, and then submits the exact same broken code again.

The Fix:

  1. Update your system_prompt to be explicit: "When writing Python, double-check your indentation and trailing colons before executing."
  2. If it still fails, the model you are using may simply lack the reasoning strength for the task. Switch from a weaker model (e.g. llama-3.1-8b) to a stronger one (e.g. gpt-4o or claude-3-5-sonnet).
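Both fixes live in the agent configuration. A minimal sketch of the relevant agent.json fields, assuming model and system_prompt sit at the top level (the exact layout of your config may differ):

```json
{
  "model": "claude-3-5-sonnet",
  "system_prompt": "When writing Python, double-check your indentation and trailing colons before executing."
}
```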

2. Agent Times Out

The Symptom: The task fails with ERROR: Task execution timed out (300s limit).

The Diagnosis: Look at the duration_ms on the OBSERVE steps in the Trace view. Is web_search taking 45 seconds per call? Is the agent making 20 sequential web searches?

The Fix:

  1. Increase timeout_seconds in agent.json config if the workload is genuinely large.
  2. If the agent is making many small, redundant calls, instruct it to gather data in bulk: "Search once for everything you need, rather than searching 10 different times."
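Both adjustments can be sketched in agent.json as follows (assuming timeout_seconds is a top-level key alongside the system prompt):

```json
{
  "timeout_seconds": 600,
  "system_prompt": "Search once for everything you need, rather than searching 10 different times."
}
```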

3. Tool Failure (Status 400s/500s)

The Symptom: The OBSERVE step shows a raw HTTP 403 or 500 error from an API you called via http_request.

The Diagnosis: The agent is constructing an invalid payload or is unauthorized. Check the headers property the agent passed in the ACT step. Did it include the Authorization token?

The Fix: Pass the API keys to the agent explicitly via the env block in agent.json:

```json
"env": {
  "STRIPE_KEY": "sk_test_123"
}
```

And add to the system prompt: "When making HTTP requests to Stripe, you must include the Bearer token located in the STRIPE_KEY environment variable."
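After the fix, a healthy ACT step should show the agent constructing a request along these lines (the exact payload shape in the Trace view is an assumption here; the part that matters is the Authorization header):

```json
{
  "tool": "http_request",
  "args": {
    "url": "https://api.stripe.com/v1/customers",
    "headers": {
      "Authorization": "Bearer sk_test_123"
    }
  }
}
```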

4. Model Does Not Support Tools

The Symptom: The task fails instantly with ProviderError: Model does not support function calling.

The Diagnosis: You selected a model (like o1 or an older open-source model) that does not natively support the Tool Calling/Function Calling API schema.

The Fix: Select a model marked "Function Calling Support" in the LLM Providers reference table.