
Error Handling & Troubleshooting

TokenPak provides normalized error handling across all providers, automatic retries, and fallback chains.


Common Errors & Solutions

1. Connection Refused (Proxy Not Running)

Error Message:

ConnectionRefusedError: [Errno 111] Connection refused
Failed to connect to http://127.0.0.1:8766

Cause: The TokenPak proxy server is not running.

Solution:

# Start the proxy
tokenpak serve

# (in another terminal)
python your_script.py

Prevention: Run the proxy as a background process or as a systemd service so that it stays up between script runs.
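A script can also check that the proxy port is accepting connections before it sends any requests. The following is a minimal sketch using only the standard library; the function name `proxy_is_listening` is hypothetical and not part of TokenPak, and the default host and port are the ones used throughout this guide.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=8766, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `proxy_is_listening()` at the top of a script and exiting with a hint to run `tokenpak serve` gives a clearer failure than a connection error deep inside a request. It only confirms that a TCP listener exists, not that the listener is TokenPak.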


2. Authentication Failed (Invalid API Key)

Error Message:

AuthenticationError: Invalid API key for provider: anthropic
Check your ANTHROPIC_API_KEY environment variable

Cause: Missing or incorrect API key.

Solution:

# Check if key is set
echo $ANTHROPIC_API_KEY

# Set the key
export ANTHROPIC_API_KEY="sk-ant-..."

# Restart the proxy
tokenpak serve

Prevention:
- Use a .env file (see Installation)
- Check the key format (keys should start with sk-ant-, sk-, or AIza)
- Rotate expired keys immediately
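The key-format check can be automated before the proxy starts. This sketch assumes the usual pairing of prefixes to providers (sk-ant- for Anthropic, sk- for OpenAI, AIza for Google); the mapping and the function name `key_format_problem` are illustrative and not taken from TokenPak. A matching prefix does not prove the key is valid, only that it is not obviously wrong.

```python
# Assumed provider-to-prefix mapping; treat it as a heuristic only.
KEY_PREFIXES = {
    "anthropic": "sk-ant-",
    "openai": "sk-",
    "google": "AIza",
}


def key_format_problem(provider, key):
    """Return a short description of the problem, or None if the key looks plausible."""
    if not key:
        return "key is empty or unset"
    if key != key.strip():
        return "key has leading or trailing whitespace"
    prefix = KEY_PREFIXES.get(provider)
    if prefix and not key.startswith(prefix):
        return f"key does not start with {prefix!r}"
    return None
```

For example, `key_format_problem("anthropic", os.environ.get("ANTHROPIC_API_KEY", ""))` returns None for a plausible key and a message otherwise.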


3. Rate Limit Exceeded

Error Message:

RateLimitError: Rate limit exceeded (429)
Retry-After: 60

Cause: Too many requests to the provider in a short time.

Solution (Automatic): TokenPak automatically retries with exponential backoff:

Attempt 1: Wait 1 second, retry
Attempt 2: Wait 2 seconds, retry
Attempt 3: Wait 4 seconds, retry
Attempt 4: Wait 8 seconds, retry
(Circuit breaker opens, switch to fallback provider)
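The schedule above doubles the wait on each attempt. As a worked example, the delays and the total time spent waiting before the fallback kicks in can be computed as follows. The function name and parameters are illustrative, not TokenPak settings.

```python
def backoff_delays(attempts=4, base=1.0, factor=2.0):
    """Delay in seconds before each retry: base, base*factor, base*factor**2, ..."""
    return [base * factor ** i for i in range(attempts)]


delays = backoff_delays()
print(delays)       # [1.0, 2.0, 4.0, 8.0]
print(sum(delays))  # 15.0 seconds of waiting in total
```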

Solution (Manual):

When a request is rate-limited, the proxy returns a 429 with a rate_limit_exceeded error body. With the TokenPak SDK adapter, this surfaces as a TokenPakAdapterError carrying status_code == 429:

import time
from tokenpak.sdk import AnthropicAdapter
from tokenpak.sdk.base import TokenPakAdapterError

adapter = AnthropicAdapter(base_url="http://127.0.0.1:8766", api_key="sk-ant-...")
request = {"model": "claude-opus-4-8", "max_tokens": 100,
           "messages": [{"role": "user", "content": "Hello"}]}

try:
    response = adapter.call(request)
except TokenPakAdapterError as e:
    if e.status_code == 429:
        print("Rate limited. Waiting 60s...")
        time.sleep(60)
        response = adapter.call(request)
    else:
        raise
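The example above sleeps for a fixed 60 seconds, which matches the Retry-After value in the sample error. If the Retry-After value is available to your client (this guide does not say whether the SDK exposes it, so that part is an assumption), a small helper can turn it into a wait time. Per the HTTP specification the value is either a number of seconds or an HTTP date; the helper name `retry_after_seconds` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=60.0):
    """Convert a Retry-After header value into a non-negative wait in seconds."""
    value = value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "60"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to a fixed wait
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```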

Prevention:
- Implement request batching (fewer, larger requests)
- Use fallback chains for load balancing
- Monitor your request frequency


4. Model Not Found

Cause: Using a model name that the upstream provider doesn't support. The proxy forwards the request and the upstream provider rejects it — the error is passed back to the client as a 4xx with the provider's message.

Solution: Use a valid model name for the target provider. The proxy lists the models it knows about at GET /v1/models:

curl http://127.0.0.1:8766/v1/models

Common Model Names:

| Provider | Models |
| --- | --- |
| Anthropic | claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5 |
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo |
| Google | gemini-1.5-pro, gemini-1.5-flash |

Prevention: Hardcode model names. If the model name comes from user input, validate it against a known list instead of passing it through directly.
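One way to keep unvalidated model names out of requests is an allowlist check in the client. The sketch below copies the names from the table above; the `ALLOWED_MODELS` mapping and the `check_model` function are illustrative and would need updating as providers add or retire models.

```python
# Model names copied from the "Common Model Names" table; keep in sync manually.
ALLOWED_MODELS = {
    "anthropic": {"claude-opus-4-8", "claude-sonnet-4-6", "claude-haiku-4-5"},
    "openai": {"gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"},
    "google": {"gemini-1.5-pro", "gemini-1.5-flash"},
}


def check_model(provider, model):
    """Return the model name if it is allowed for the provider, else raise ValueError."""
    if model not in ALLOWED_MODELS.get(provider, set()):
        raise ValueError(f"unsupported model {model!r} for provider {provider!r}")
    return model
```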


5. Provider Timeout

Cause: The upstream provider took too long to respond. The TokenPak SDK adapter raises TokenPakTimeoutError when the proxy does not respond within timeout_s.

Solution (Manual):

from tokenpak.sdk import AnthropicAdapter
from tokenpak.sdk.base import TokenPakTimeoutError

adapter = AnthropicAdapter(
    base_url="http://127.0.0.1:8766",
    api_key="sk-ant-...",
    timeout_s=60.0,  # 60 second timeout
)

# `request` is the same payload dict as in the rate-limit example
try:
    response = adapter.call(request)
except TokenPakTimeoutError:
    print("Proxy/upstream timed out — retry or use a fallback")

Prevention:
- Set reasonable timeouts (timeout_s)
- Configure fallback chains in config.yaml
- Monitor provider status


6. Request Too Large

Cause: Request exceeds the target model's context window. The upstream provider rejects oversized requests; the error is passed back through the proxy.

Solution:

# Option 1: Reduce message size — keep only relevant context
short_context = "Summary of relevant context only..."

# Option 2: Let the proxy compress context automatically
#   (compression is enabled by default; tune via config.yaml / env vars)

# Option 3: Split into multiple smaller requests

Prevention:
- Enable compression (on by default; see config.yaml)
- Use vault context injection selectively
- Preview compression savings on a file with tokenpak preview <file>
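Option 3 (splitting into smaller requests) can be sketched as a paragraph-based splitter. This is an illustration only: it uses character count as a rough stand-in for tokens, which is an assumption, and `split_by_paragraph` is not a TokenPak function.

```python
def split_by_paragraph(text, max_chars):
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole in its own chunk.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, with the partial results combined afterwards.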


7. Invalid Configuration

Error Message:

ConfigError: Invalid config.yaml syntax at line 5:
  compression.enabled must be a boolean, got 'yes'

Cause: Malformed YAML or invalid option.

Solution:

# Wrong
compression:
  enabled: yes  # ❌ Should be true/false

# Right
compression:
  enabled: true  # ✅

Validation:

# Validate config before starting
tokenpak config validate

# Shows all errors

Prevention:
- Use a YAML validator: https://yamllint.com/
- Check indentation (spaces, not tabs)
- Refer to the Installation guide for examples
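The type check behind the error above can be illustrated on an already-parsed config dictionary. This is a hypothetical sketch, not the logic of tokenpak config validate; it covers only the two options shown in this guide and assumes the YAML has already been loaded into a dict.

```python
def validate_config(cfg):
    """Return a list of human-readable problems found in a parsed config dict."""
    errors = []
    enabled = (cfg.get("compression") or {}).get("enabled")
    if enabled is not None and not isinstance(enabled, bool):
        errors.append(f"compression.enabled must be a boolean, got {enabled!r}")
    fallback = cfg.get("fallback", [])
    if not isinstance(fallback, list) or not all(isinstance(p, str) for p in fallback):
        errors.append("fallback must be a list of provider names")
    return errors
```

Collecting every problem into a list, rather than stopping at the first, mirrors the "shows all errors" behaviour described above.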


Fallback Chains & Circuit Breaker

TokenPak automatically switches providers when the primary fails.

How It Works

provider: anthropic
fallback:
  - google      # Try if Anthropic fails
  - openai      # Try if Google fails

Request flow:

1. Try Anthropic
   ├─ Success? ✅ Return response
   ├─ Timeout? → Try Google
   ├─ Rate limit? → Wait then retry
   └─ Permanent error? → Try Google

2. Try Google
   ├─ Success? ✅ Return response
   └─ Fail? → Try OpenAI

3. Try OpenAI
   ├─ Success? ✅ Return response
   └─ Fail? → Return error to client
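The request flow above amounts to trying each provider in order and returning the first success. A minimal sketch, assuming each provider is represented by a plain callable and that failures surface as a single hypothetical exception type `ProviderFailed`. The rate-limit branch (wait, then retry the same provider) is left out for brevity.

```python
class ProviderFailed(Exception):
    """Hypothetical error standing in for a timeout or permanent provider error."""


def call_with_fallback(providers, request):
    """Try (name, callable) pairs in order; return (name, response) of the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderFailed as exc:
            errors.append(f"{name}: {exc}")  # remember why this provider was skipped
    raise ProviderFailed("all providers failed: " + "; ".join(errors))
```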

Circuit Breaker

When a provider fails repeatedly, TokenPak opens the circuit breaker to prevent cascading failures:

State: CLOSED (normal operation)
  └─ 3 failures in 60 seconds → OPEN

State: OPEN (provider is down)
  └─ Skip to fallback provider
  └─ After 300 seconds → HALF_OPEN

State: HALF_OPEN (testing recovery)
  └─ Try 1 request
  ├─ Success? → CLOSED
  └─ Fail? → OPEN (restart 300s timer)
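The three states can be modelled with a small class. This is an illustrative sketch of the state machine described above, not TokenPak's implementation; the class, method, and parameter names are assumptions, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time


class CircuitBreaker:
    """CLOSED -> OPEN after N failures in a window; OPEN -> HALF_OPEN after a timeout."""

    def __init__(self, failure_threshold=3, window_s=60.0,
                 recovery_timeout_s=300.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_s = window_s
        self.recovery_timeout_s = recovery_timeout_s
        self.clock = clock
        self.state = "CLOSED"
        self.failures = []    # timestamps of recent failures
        self.opened_at = None

    def allow_request(self):
        if self.state == "CLOSED":
            return True
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.recovery_timeout_s:
                return False          # still cooling down: use the fallback
            self.state = "HALF_OPEN"  # let exactly one probe request through
            return True
        return False                  # HALF_OPEN: a probe is already in flight

    def record_success(self):
        self.state = "CLOSED"
        self.failures.clear()

    def record_failure(self):
        now = self.clock()
        if self.state == "HALF_OPEN":
            self._open(now)           # probe failed: restart the recovery timer
            return
        # Keep only failures inside the sliding window, then add this one
        self.failures = [t for t in self.failures if now - t < self.window_s]
        self.failures.append(now)
        if len(self.failures) >= self.failure_threshold:
            self._open(now)

    def _open(self, now):
        self.state = "OPEN"
        self.opened_at = now
        self.failures.clear()
```

A caller checks `allow_request()` before contacting the provider, then reports the outcome with `record_success()` or `record_failure()`.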

Configuration

fallback:
  - anthropic
  - google
  - openai

circuit_breaker:
  failure_threshold: 3      # Open after 3 failures
  recovery_timeout: 300     # Move to HALF_OPEN after 300 seconds (5 minutes)
  half_open_requests: 1     # Test 1 request in half-open

Monitoring & Debugging

Enable Debug Logging

# Detailed logs
TOKENPAK_LOG_LEVEL=DEBUG tokenpak serve

# Write to file
tokenpak serve --log-file /tmp/tokenpak.log

Check Proxy Status

# Health check endpoint
curl http://127.0.0.1:8766/health

# Circuit breaker / degradation state
curl http://127.0.0.1:8766/circuit-breakers
curl http://127.0.0.1:8766/degradation

Inspect Recent Requests

# Stats for the most recent request
curl http://127.0.0.1:8766/stats/last

# Full pipeline trace of the last request
curl http://127.0.0.1:8766/trace/last

# All stored pipeline traces
curl http://127.0.0.1:8766/traces

# Export the request ledger as CSV
curl http://127.0.0.1:8766/v1/export/csv

Test Proxy Connectivity

from tokenpak.sdk import AnthropicAdapter

adapter = AnthropicAdapter(base_url="http://127.0.0.1:8766", api_key="sk-ant-...")

try:
    response = adapter.call({
        "model": "claude-opus-4-8",
        "max_tokens": 10,
        "messages": [{"role": "user", "content": "test"}],
    })
    print("✅ Reached upstream via the proxy")
except Exception as e:
    print(f"❌ Request failed: {e}")

Error Types Reference

Proxy HTTP Error Types

The proxy returns errors as a JSON object {"error": {"type": ..., "message": ...}}. Common types:

| HTTP Status | error.type | Cause |
| --- | --- | --- |
| 400 | bad_request | Malformed request body |
| 401 | unauthorized | Missing or invalid X-TokenPak-Key |
| 403 | forbidden | Operation not allowed from this IP |
| 404 | not_found | Unknown endpoint path |
| 429 | rate_limit_exceeded | Too many requests from this IP |
| 500 | internal_error | Proxy-side error |
| 503 | circuit_open | Upstream provider circuit breaker open |
| 503 | upstream_unreachable | Cannot reach upstream provider |
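A client that talks to the proxy over raw HTTP can read the error type out of the JSON body and decide whether a retry makes sense. The sketch below assumes the body shape shown above; which types count as retryable is a judgment call made here for illustration, not something TokenPak defines.

```python
import json

# Assumption: these types describe transient conditions worth retrying.
RETRYABLE_TYPES = {"rate_limit_exceeded", "circuit_open", "upstream_unreachable"}


def parse_proxy_error(body):
    """Return (type, message) from a proxy error body, or (None, None) if it does not match."""
    try:
        err = json.loads(body)["error"]
        return err["type"], err["message"]
    except (ValueError, KeyError, TypeError):
        return None, None


def is_retryable(body):
    """True if the error type in the body is one of the assumed transient types."""
    err_type, _ = parse_proxy_error(body)
    return err_type in RETRYABLE_TYPES
```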

Core Exception Classes (proxy / core)

Raised internally by the proxy and core library. Base class: TokenPakError.

| Exception | Cause |
| --- | --- |
| AuthenticationError / InvalidAPIKeyError / MissingAPIKeyError | API key invalid or absent |
| RateLimitError | Upstream rate limit hit |
| UpstreamError | Upstream provider returned an error |
| CircuitOpenError | Provider circuit breaker is open |
| SpendGuardBlocked | Spend guard blocked the request |
| ProxyError | Generic proxy-side failure |
| ConfigError / ConfigValidationError | Invalid configuration |
| CacheError | Cache subsystem failure |
| NetworkConnectionError / ProviderConnectionError | Network/connection failure |
| PortInUseError | Configured port already in use |

SDK Adapter Exceptions

Raised by the tokenpak.sdk adapters. Base class: TokenPakAdapterError (import from tokenpak.sdk.base).

| Exception | Cause |
| --- | --- |
| TokenPakTimeoutError | Proxy did not respond within timeout_s |
| TokenPakConfigError | Missing required fields / bad config |
| TokenPakAuthError | 401 or 403 from the proxy |

Network Errors

| Error | Cause | Solution |
| --- | --- | --- |
| ConnectionRefusedError | Proxy not running | Start tokenpak serve |
| ConnectionError | Network unreachable | Check internet connection |
| SSLError | Certificate validation failed | Check CA certificates |

Best Practices

1. Always Use Fallback Chains

provider: anthropic
fallback:
  - google
  - openai

2. Wrap Requests in try/except

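import logging

from tokenpak.sdk.base import (
    TokenPakAdapterError,
    TokenPakTimeoutError,
    TokenPakAuthError,
)

logger = logging.getLogger(__name__)

# `adapter` and `request` are set up as in the earlier examples.
# Catch the specific subclasses first; TokenPakAdapterError is the base class.
try:
    response = adapter.call(request)
except TokenPakTimeoutError:
    # Handle timeout
    pass
except TokenPakAuthError:
    # Handle auth error
    pass
except TokenPakAdapterError as e:
    # Handle other adapter errors (e.status_code carries the HTTP status)
    logger.error("Adapter error: %s", e)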

3. Implement Exponential Backoff

The proxy retries upstream failures automatically, but for custom client-side retries:

import time
from tokenpak.sdk.base import TokenPakAdapterError

def call_with_backoff(fn, max_attempts=3):
    """Call fn(), retrying on HTTP 429 with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TokenPakAdapterError as e:
            # Re-raise non-429 errors, and the 429 itself once attempts run out
            if e.status_code != 429 or attempt == max_attempts - 1:
                raise
            wait = 2 ** attempt  # 1, 2, 4, ... seconds
            print(f"Attempt {attempt + 1} rate-limited. Waiting {wait}s...")
            time.sleep(wait)

4. Preview Compression Before Sending

# Dry-run compression on a file to estimate token savings
tokenpak preview prompt.txt

5. Set Timeouts

from tokenpak.sdk import AnthropicAdapter

adapter = AnthropicAdapter(
    base_url="http://127.0.0.1:8766",
    api_key="sk-ant-...",
    timeout_s=30.0,  # 30 second timeout
)

Getting Help

  • Question? Check this guide or the FAQ
  • Bug? Open an issue on GitHub

Next Steps