Error Handling & Troubleshooting¶
TokenPak provides normalized error handling across all providers, automatic retries, and fallback chains.
Common Errors & Solutions¶
1. Connection Refused (Proxy Not Running)¶
Error Message:
ConnectionRefusedError: [Errno 111] Connection refused
Failed to connect to http://127.0.0.1:8766
Cause: The TokenPak proxy server is not running.
Solution:
# Start the proxy
tokenpak serve
# (in another terminal)
python your_script.py
Prevention: Keep the proxy running in a background process or systemd service.
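A script can also confirm that the proxy port accepts connections before it sends any requests. The following is a minimal sketch using only the Python standard library. The helper name `proxy_is_up` is hypothetical and not part of TokenPak; the host and port are taken from the error message above.

```python
import socket


def proxy_is_up(host: str = "127.0.0.1", port: int = 8766, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises an OSError subclass (for example
        # ConnectionRefusedError or a timeout) when the connection fails.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if not proxy_is_up():
        print("Proxy is not reachable; start it with `tokenpak serve`.")
```

This only checks that a TCP listener exists on the port. It does not verify that the listener is TokenPak or that the proxy is healthy.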
2. Authentication Failed (Invalid API Key)¶
Error Message:
AuthenticationError: Invalid API key for provider: anthropic
Check your ANTHROPIC_API_KEY environment variable
Cause: Missing or incorrect API key.
Solution:
# Check if key is set
echo $ANTHROPIC_API_KEY
# Set the key
export ANTHROPIC_API_KEY="sk-ant-..."
# Restart the proxy
tokenpak serve
Prevention:
- Use a .env file (see Installation)
- Check the key format: Anthropic keys start with sk-ant-, OpenAI keys with sk-, and Google keys with AIza
- Rotate expired keys immediately
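The key-format check can be automated before the proxy is started. The sketch below is an illustration only: the `KEY_PREFIXES` mapping and the `key_looks_valid` helper are hypothetical names, and the provider-to-prefix pairing is an assumption based on the prefixes listed above (Google keys are assumed to begin with `AIza`).

```python
import os
from typing import Optional

# Assumed provider-to-prefix mapping, derived from the prefixes listed above.
KEY_PREFIXES = {
    "anthropic": "sk-ant-",
    "openai": "sk-",
    "google": "AIza",
}


def key_looks_valid(provider: str, key: Optional[str]) -> bool:
    """Cheap format check only; it cannot tell whether the key is active."""
    prefix = KEY_PREFIXES.get(provider)
    if prefix is None or not key:
        return False
    return key.startswith(prefix)


if __name__ == "__main__":
    if not key_looks_valid("anthropic", os.environ.get("ANTHROPIC_API_KEY")):
        print("ANTHROPIC_API_KEY is missing or has an unexpected prefix")
```

Because `sk-ant-` also begins with `sk-`, this check cannot tell an Anthropic key from an OpenAI key when validating the OpenAI slot. It catches empty or obviously wrong values, nothing more.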
3. Rate Limit Exceeded¶
Error Message:
RateLimitError: Rate limit exceeded (429)
Retry-After: 60
Cause: Too many requests to the provider in a short time.
Solution (Automatic): TokenPak automatically retries with exponential backoff:
Attempt 1: Wait 1 second, retry
Attempt 2: Wait 2 seconds, retry
Attempt 3: Wait 4 seconds, retry
Attempt 4: Wait 8 seconds, retry
(Circuit breaker opens, switch to fallback provider)
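The schedule above doubles the wait on each attempt. As a rough sketch of how such a schedule can be computed (the function name and the optional cap are illustrative assumptions, not TokenPak internals):

```python
from typing import List, Optional


def backoff_delays(max_retries: int = 4, base_s: float = 1.0,
                   cap_s: Optional[float] = None) -> List[float]:
    """Delays in seconds before each retry: base_s * 2**i for i = 0..max_retries-1."""
    delays = [base_s * (2 ** i) for i in range(max_retries)]
    if cap_s is not None:
        # Hypothetical cap so that long retry chains do not wait unboundedly.
        delays = [min(d, cap_s) for d in delays]
    return delays


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0]
```

With the defaults, the total wait across four retries is 15 seconds, which is worth keeping in mind when choosing client-side timeouts.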
Solution (Manual):
When a request is rate-limited, the proxy returns a 429 with a rate_limit_exceeded error body. With the TokenPak SDK adapter, this surfaces as a TokenPakAdapterError carrying status_code == 429:
import time
from tokenpak.sdk import AnthropicAdapter
from tokenpak.sdk.base import TokenPakAdapterError

adapter = AnthropicAdapter(base_url="http://127.0.0.1:8766", api_key="sk-ant-...")
request = {"model": "claude-opus-4-8", "max_tokens": 100,
           "messages": [{"role": "user", "content": "Hello"}]}

try:
    response = adapter.call(request)
except TokenPakAdapterError as e:
    if e.status_code == 429:
        # Rate limited: wait, then retry once
        print("Rate limited. Waiting 60s...")
        time.sleep(60)
        response = adapter.call(request)
    else:
        raise
Prevention:
- Implement request batching (fewer, larger requests)
- Use fallback chains for load balancing
- Monitor your request frequency
4. Model Not Found¶
Cause: Using a model name that the upstream provider doesn't support. The proxy forwards the request and the upstream provider rejects it — the error is passed back to the client as a 4xx with the provider's message.
Solution: Use a valid model name for the target provider. The proxy lists the models it knows about at GET /v1/models:
curl http://127.0.0.1:8766/v1/models
Common Model Names:
| Provider | Models |
|---|---|
| Anthropic | claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5 |
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo |
| Google | gemini-1.5-pro, gemini-1.5-flash |
Prevention: Hardcode model names, or validate them against a fixed list; don't pass user input to the proxy as a model name directly.
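One way to avoid forwarding arbitrary model names is to check them against a fixed allow-list before building the request. The sketch below uses the model names from the table above; the `ALLOWED_MODELS` mapping and the `validate_model` helper are hypothetical and not part of the TokenPak SDK.

```python
# Allow-list built from the "Common Model Names" table; extend as needed.
ALLOWED_MODELS = {
    "anthropic": {"claude-opus-4-8", "claude-sonnet-4-6", "claude-haiku-4-5"},
    "openai": {"gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"},
    "google": {"gemini-1.5-pro", "gemini-1.5-flash"},
}


def validate_model(provider: str, model: str) -> str:
    """Return model unchanged if it is allow-listed, otherwise raise ValueError."""
    allowed = ALLOWED_MODELS.get(provider)
    if allowed is None:
        raise ValueError(f"Unknown provider: {provider!r}")
    if model not in allowed:
        raise ValueError(
            f"Model {model!r} is not allowed for {provider}; "
            f"choose one of {sorted(allowed)}"
        )
    return model
```

Rejecting the name client-side gives a clearer error than waiting for the upstream provider's 4xx response.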
5. Provider Timeout¶
Cause: The upstream provider took too long to respond. The TokenPak SDK adapter raises TokenPakTimeoutError when the proxy does not respond within timeout_s.
Solution (Manual):
from tokenpak.sdk import AnthropicAdapter
from tokenpak.sdk.base import TokenPakTimeoutError

adapter = AnthropicAdapter(
    base_url="http://127.0.0.1:8766",
    api_key="sk-ant-...",
    timeout_s=60.0,  # 60 second timeout
)
request = {"model": "claude-opus-4-8", "max_tokens": 100,
           "messages": [{"role": "user", "content": "Hello"}]}

try:
    response = adapter.call(request)
except TokenPakTimeoutError:
    print("Proxy/upstream timed out; retry or use a fallback")
Prevention:
- Set reasonable timeouts (timeout_s)
- Configure fallback chains in config.yaml
- Monitor provider status
6. Request Too Large¶
Cause: Request exceeds the target model's context window. The upstream provider rejects oversized requests; the error is passed back through the proxy.
Solution:
# Option 1: Reduce message size — keep only relevant context
short_context = "Summary of relevant context only..."
# Option 2: Let the proxy compress context automatically
# (compression is enabled by default; tune via config.yaml / env vars)
# Option 3: Split into multiple smaller requests
Prevention:
- Enable compression (on by default — see config.yaml)
- Use vault context injection selectively
- Preview compression savings on a file with tokenpak preview <file>
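Option 3 (splitting into smaller requests) can be sketched as follows. This is a rough illustration rather than TokenPak functionality: the `split_into_chunks` helper is hypothetical, and it budgets by character count, which is only a loose proxy for tokens.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on blank lines so that every chunk is at most max_chars long."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single paragraph longer than the budget is hard-split.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request. Splitting on paragraph boundaries keeps related sentences together where the budget allows.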
7. Invalid Configuration¶
Error Message:
ConfigError: Invalid config.yaml syntax at line 5:
compression.enabled must be a boolean, got 'yes'
Cause: Malformed YAML or invalid option.
Solution:
# Wrong
compression:
  enabled: yes   # ❌ Should be true/false
# Right
compression:
  enabled: true  # ✅
Validation:
# Validate config before starting
tokenpak config validate
# Shows all errors
Prevention:
- Use a YAML validator: https://yamllint.com/
- Check indentation (spaces, not tabs)
- Refer to the Installation guide for examples
Fallback Chains & Circuit Breaker¶
TokenPak automatically switches providers when the primary fails.
How It Works¶
provider: anthropic
fallback:
- google # Try if Anthropic fails
- openai # Try if Google fails
Request flow:
1. Try Anthropic
├─ Success? ✅ Return response
├─ Timeout? → Try Google
├─ Rate limit? → Wait then retry
└─ Permanent error? → Try Google
2. Try Google
├─ Success? ✅ Return response
└─ Fail? → Try OpenAI
3. Try OpenAI
├─ Success? ✅ Return response
└─ Fail? → Return error to client
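The request flow above can be approximated in a few lines. This is a simplified sketch, not the proxy's implementation: `call_with_fallback`, `AllProvidersFailed`, and the `send` callable are hypothetical names, and the "rate limit, wait then retry" branch is left out for brevity.

```python
from typing import Callable, Dict, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""

    def __init__(self, errors: Dict[str, Exception]):
        super().__init__(f"All providers failed: {list(errors)}")
        self.errors = errors


def call_with_fallback(providers: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each provider in order and return the first successful response."""
    errors: Dict[str, Exception] = {}
    for name in providers:
        try:
            return send(name)
        except Exception as exc:  # timeout or permanent error: move on
            errors[name] = exc
    raise AllProvidersFailed(errors)
```

Collecting the per-provider errors makes the final failure easier to diagnose than reporting only the last one.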
Circuit Breaker¶
When a provider fails repeatedly, TokenPak opens the circuit breaker to prevent cascading failures:
State: CLOSED (normal operation)
└─ 3 failures in 60 seconds → OPEN
State: OPEN (provider is down)
└─ Skip to fallback provider
└─ After 300 seconds → HALF_OPEN
State: HALF_OPEN (testing recovery)
└─ Try 1 request
├─ Success? → CLOSED
└─ Fail? → OPEN (restart 300s timer)
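The state machine above can be sketched as a small class. This is an illustration under stated assumptions, not TokenPak's internal code: the class and method names are hypothetical, the thresholds mirror the numbers in the diagram, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, window_s: float = 60.0,
                 recovery_timeout_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_s = window_s
        self.recovery_timeout_s = recovery_timeout_s
        self._clock = clock
        self._failures: Deque[float] = deque()  # timestamps of recent failures
        self._opened_at: Optional[float] = None
        self.state = "CLOSED"

    def allow_request(self) -> bool:
        if self.state == "CLOSED":
            return True
        if self.state == "OPEN" and self._clock() - self._opened_at >= self.recovery_timeout_s:
            self.state = "HALF_OPEN"
            return True  # the single probe request
        return False  # still cooling down, or a probe is already in flight

    def record_success(self) -> None:
        self._failures.clear()
        self.state = "CLOSED"

    def record_failure(self) -> None:
        now = self._clock()
        if self.state == "HALF_OPEN":
            self._trip(now)  # probe failed: reopen and restart the timer
            return
        self._failures.append(now)
        # Drop failures that fell out of the sliding window.
        while self._failures and now - self._failures[0] > self.window_s:
            self._failures.popleft()
        if len(self._failures) >= self.failure_threshold:
            self._trip(now)

    def _trip(self, now: float) -> None:
        self.state = "OPEN"
        self._opened_at = now
        self._failures.clear()
```

A caller would check `allow_request()` before contacting a provider, skip to the fallback when it returns False, and report the outcome with `record_success()` or `record_failure()`.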
Configuration¶
fallback:
  - anthropic
  - google
  - openai
circuit_breaker:
  failure_threshold: 3    # Open after 3 failures
  recovery_timeout: 300   # Seconds to stay OPEN before moving to HALF_OPEN (5 minutes)
  half_open_requests: 1   # Test 1 request in half-open
Monitoring & Debugging¶
Enable Debug Logging¶
# Detailed logs
TOKENPAK_LOG_LEVEL=DEBUG tokenpak serve
# Write to file
tokenpak serve --log-file /tmp/tokenpak.log
Check Proxy Status¶
# Health check endpoint
curl http://127.0.0.1:8766/health
# Circuit breaker / degradation state
curl http://127.0.0.1:8766/circuit-breakers
curl http://127.0.0.1:8766/degradation
Inspect Recent Requests¶
# Stats for the most recent request
curl http://127.0.0.1:8766/stats/last
# Full pipeline trace of the last request
curl http://127.0.0.1:8766/trace/last
# All stored pipeline traces
curl http://127.0.0.1:8766/traces
# Export the request ledger as CSV
curl http://127.0.0.1:8766/v1/export/csv
Test Proxy Connectivity¶
from tokenpak.sdk import AnthropicAdapter

adapter = AnthropicAdapter(base_url="http://127.0.0.1:8766", api_key="sk-ant-...")

try:
    response = adapter.call({
        "model": "claude-opus-4-8",
        "max_tokens": 10,
        "messages": [{"role": "user", "content": "test"}],
    })
    print("✅ Reached upstream via the proxy")
except Exception as e:
    print(f"❌ Request failed: {e}")
Error Types Reference¶
Proxy HTTP Error Types¶
The proxy returns errors as a JSON object {"error": {"type": ..., "message": ...}}. Common types:
| HTTP Status | error.type | Cause |
|---|---|---|
| 400 | bad_request | Malformed request body |
| 401 | unauthorized | Missing or invalid X-TokenPak-Key |
| 403 | forbidden | Operation not allowed from this IP |
| 404 | not_found | Unknown endpoint path |
| 429 | rate_limit_exceeded | Too many requests from this IP |
| 500 | internal_error | Proxy-side error |
| 503 | circuit_open | Upstream provider circuit breaker open |
| 503 | upstream_unreachable | Cannot reach upstream provider |
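A client that talks to the proxy over plain HTTP can read the `error.type` field to decide what to do next. The sketch below parses the documented error shape. The helper names are hypothetical, and the choice of which types count as retryable is an assumption made for illustration, not documented proxy behaviour.

```python
import json
from typing import Optional, Tuple

# Assumption: only rate limiting and 503-class conditions are worth retrying.
RETRYABLE_TYPES = {"rate_limit_exceeded", "circuit_open", "upstream_unreachable"}


def parse_proxy_error(body: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (type, message) from an error body, or (None, None) if malformed."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None, None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None, None
    return error.get("type"), error.get("message")


def is_retryable(body: str) -> bool:
    """True if the error type is one a client could reasonably retry."""
    error_type, _ = parse_proxy_error(body)
    return error_type in RETRYABLE_TYPES
```

Errors such as `bad_request` or `unauthorized` need a change to the request or credentials, so retrying them unchanged would not help.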
Core Exception Classes (proxy / core)¶
Raised internally by the proxy and core library. Base class: TokenPakError.
| Exception | Cause |
|---|---|
| AuthenticationError / InvalidAPIKeyError / MissingAPIKeyError | API key invalid or absent |
| RateLimitError | Upstream rate limit hit |
| UpstreamError | Upstream provider returned an error |
| CircuitOpenError | Provider circuit breaker is open |
| SpendGuardBlocked | Spend guard blocked the request |
| ProxyError | Generic proxy-side failure |
| ConfigError / ConfigValidationError | Invalid configuration |
| CacheError | Cache subsystem failure |
| NetworkConnectionError / ProviderConnectionError | Network/connection failure |
| PortInUseError | Configured port already in use |
SDK Adapter Exceptions¶
Raised by the tokenpak.sdk adapters. Base class: TokenPakAdapterError (import from tokenpak.sdk.base).
| Exception | Cause |
|---|---|
| TokenPakTimeoutError | Proxy did not respond within timeout_s |
| TokenPakConfigError | Missing required fields / bad config |
| TokenPakAuthError | 401 or 403 from the proxy |
Network Errors¶
| Error | Cause | Solution |
|---|---|---|
| ConnectionRefusedError | Proxy not running | Start tokenpak serve |
| ConnectionError | Network unreachable | Check internet connection |
| SSLError | Certificate validation failed | Check CA certificates |
Best Practices¶
1. Always Use Fallback Chains¶
provider: anthropic
fallback:
- google
- openai
2. Wrap Requests in Try/Except¶
import logging

from tokenpak.sdk.base import (
    TokenPakAdapterError,
    TokenPakTimeoutError,
    TokenPakAuthError,
)

logger = logging.getLogger(__name__)

# `adapter` and `request` are created as in the earlier examples
try:
    response = adapter.call(request)
except TokenPakTimeoutError:
    # Handle timeout
    pass
except TokenPakAuthError:
    # Handle auth error
    pass
except TokenPakAdapterError as e:
    # Handle other adapter errors (e.status_code carries the HTTP status)
    logger.error(f"Adapter error: {e}")
3. Implement Exponential Backoff¶
The proxy retries upstream failures automatically, but for custom client-side retries:
import time
from tokenpak.sdk.base import TokenPakAdapterError

def call_with_backoff(fn, max_attempts=3):
    for attempt in range(max_attempts):
        try:
            return fn()
        except TokenPakAdapterError as e:
            if e.status_code != 429:
                raise
            wait = 2 ** attempt  # 1, 2, 4 seconds
            print(f"Attempt {attempt + 1} rate-limited. Waiting {wait}s...")
            time.sleep(wait)
    raise Exception("All attempts failed")
4. Preview Compression Before Sending¶
# Dry-run compression on a file to estimate token savings
tokenpak preview prompt.txt
5. Set Timeouts¶
from tokenpak.sdk import AnthropicAdapter
adapter = AnthropicAdapter(
    base_url="http://127.0.0.1:8766",
    api_key="sk-ant-...",
    timeout_s=30.0,  # 30 second timeout
)
Getting Help¶
Next Steps¶
- Monitoring: See Observability Guide
- Performance: Check Feature Matrix for optimization tips
- Adapters: See Adapter Reference for provider-specific notes