Skip to content

Feature Matrix

All features are FREE and open source under the Apache 2.0 license.


Core Features

Category Feature Status Notes
Core Routing Multiple provider adapters Built-in adapters: Anthropic, OpenAI, Google, xAI/Grok, Ollama
Fallback chains Auto-failover to backup providers
Circuit breaker Recovers from rate limits
Token Management Token counting (all providers) Unified across Anthropic, OpenAI, Google
Cost tracking Basic tracking + reporting
Compression Deduplication Remove repeated content
Document compression Summarize long docs
Instruction table Compress repetitive instructions
Error Handling Normalized error messages Consistent across providers
Automatic retries Exponential backoff, configurable
Error telemetry Log error types and frequency
Observability Request/response logging JSON logs, searchable
Token usage reports CSV export, JSON export
Agentic Error normalization Convert errors to agent-readable format
Streaming support Handle streaming + non-streaming
Vault Integration Document indexing Index local files (.md, .txt, .pdf)
Semantic search Search vault by meaning
Auto-injection Automatically add relevant docs to context
Symbol extraction Extract functions, classes, variables
AST parsing Parse code structure
Chunk optimization Smart chunking for injection
Watcher mode Live re-index on file changes
CLI serve command Start the proxy
preview command Dry-run compression on a file (shows token savings)
benchmark command Test compression on a document
validate command Validate a TokenPak JSON file against the schema
report command Generate usage reports

Feature Details

Core Proxy

Provider routing, adapters, tool schema handling, fallback chains, circuit breaker, streaming, passthrough.

TokenPak's primary usage model is to run the local proxy and point your existing provider SDK or tool at it via a base URL — no application code changes required:

# Start the proxy (default: http://127.0.0.1:8766)
tokenpak serve
# Use the standard Anthropic SDK, routed through TokenPak
import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:8766",
    api_key="sk-ant-...",
)
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

Token & Cost Tracking

TokenPak tracks per-request token usage and savings server-side. Inspect them over HTTP or via the CLI:

curl http://127.0.0.1:8766/stats/last    # most recent request
tokenpak stats                           # registry totals
tokenpak savings                         # tokens and cost saved

Compression

Deduplication, doc compression, instruction table, budget tracking, fidelity tiers

Automatically applied. Semantic equivalence guaranteed.

Error Handling

Normalized errors, automatic retries with exponential backoff, and circuit breaking are applied transparently by the proxy. When an upstream provider fails repeatedly, the proxy opens a circuit breaker (visible at GET /circuit-breakers) and surfaces a consistent JSON error to the client. See the Error Handling Guide.

Vault Features

Indexing, search, auto-injection, symbol extraction, AST parsing, chunking, watcher, SQLite backend

vault:
  enabled: true
  root: "~/my-vault"
  auto_inject: true  # Automatically add relevant docs

Installation & Usage

pip install tokenpak
tokenpak serve

Configuration Reference

Basic config.yaml

proxy:
  port: 8766
  host: 127.0.0.1

provider: anthropic
fallback:
  - google
  - openai

compression:
  enabled: true

telemetry:
  enabled: true
  log_file: /tmp/tokenpak.log

vault:
  enabled: true
  root: ~/my-vault

Feature Roadmap

These items are on the roadmap and are not part of the current OSS beta surface — they are directions, not commitments. For exactly what ships in the beta today, see the feature overview.

  • Multi-turn conversation history management
  • Vision / multimodal support
  • Advanced batch processing

TokenPak already ships deterministic Prompt Packing today (see the feature overview); it interoperates with provider-side prompt caching rather than replacing it.


Support & Licensing

License: Apache 2.0. Use however you like.

See README for more information.