Skip to content

Feature Matrix — FREE vs PRO

All core proxy features are FREE and open source. Advanced features are in the optional PRO package.


Quick Reference

Category Feature FREE PRO Notes
Core Routing Multiple provider adapters 5 adapters built-in
Fallback chains Auto-failover to backup providers
Circuit breaker Recovers from rate limits
Custom routing rules Cost-based, latency-based logic
Token Management Token counting (all providers) Unified across Anthropic, OpenAI, Google
Cost tracking Basic tracking + reporting
Cost alerts Notifications when exceeding budget
Budget enforcement Auto-reject requests exceeding limit
Compression Deduplication Remove repeated content
Document compression Summarize long docs
Instruction table Compress repetitive instructions
Dictionary compression Custom compression dictionaries
Code/log extraction Smart code and log parsing
Alias compression Shorten variable names, function calls
Error Handling Normalized error messages Consistent across providers
Automatic retries Exponential backoff, configurable
Retry orchestration Advanced retry strategies, decision trees
Error telemetry Log error types and frequency
Observability Request/response logging JSON logs, searchable
Cost dashboard (web UI) Real-time cost, savings, usage charts
Token usage reports CSV export, JSON export
Live request explorer Debug individual requests
Agentic Error normalization Convert errors to agent-readable format
Streaming support Handle streaming + non-streaming
Function call translation Convert between provider formats
Tool/function use optimization Smart function routing
Vault Integration Document indexing Index local files (.md, .txt, .pdf)
Semantic search Search vault by meaning
Auto-injection Automatically add relevant docs to context
Symbol extraction Extract functions, classes, variables
AST parsing Parse code structure
Chunk optimization Smart chunking for injection
Watcher mode Live re-index on file changes
Framework Integration LangChain adapter Drop-in replacement for LangChain
CrewAI adapter CrewAI integration
AutoGen adapter Microsoft AutoGen support
Ollama integration Local LLM support
CLI serve command Start the proxy
count command Count tokens in a file
compress command Test compression on a document
validate command Check config and connectivity
dashboard command Launch web dashboard
report command Generate usage reports

FREE Features in Detail

Core Proxy

R1-R12 — Provider routing, adapters, tool schema handling, fallback chains, circuit breaker, streaming, passthrough

from tokenpak import Client

# Works out-of-the-box
client = Client(api_key="...", model="claude-opus-4-6")
response = client.messages.create(...)

Token Counting

T2 — Accurate token counts across all providers

tokens = client.count_tokens(
    model="claude-opus-4-6",
    messages=[{"role": "user", "content": "..."}]
)

Basic Compression

C1, C2, C4, C8, C9, C10, C11, C13 — Deduplication, doc compression, instruction table, budget tracking, fidelity tiers

Automatically applied. Semantic equivalence guaranteed.

Error Handling

A3, A5 — Normalized errors, automatic retries with exponential backoff

# Retries automatically with circuit breaker
response = client.messages.create(...)
# If provider fails, falls back to next in chain

Vault Features

V1-V8 — Indexing, search, auto-injection, symbol extraction, AST parsing, chunking, watcher, SQLite backend

vault:
  enabled: true
  root: "~/my-vault"
  auto_inject: true  # Automatically add relevant docs

PRO Features in Detail (Upgrade Path)

Advanced Compression

  • Dictionary compression (C7) — Custom compression tables for your domain
  • Code extraction (C5) — Smart parsing and compression of code blocks
  • Log extraction (C6) — Compress repetitive logs
  • Alias compression (C8) — Shorten variable names intelligently

Cost Management

  • Budget enforcement — Reject requests exceeding limit
  • Cost alerts — Email/webhook when approaching budget
  • Advanced routing — Route by cost, latency, capability

Dashboard

  • Real-time cost tracking
  • Token usage charts
  • Request history explorer
  • Model comparison (cost vs latency)
  • Savings calculator

Framework Adapters

  • LangChain integration (drop-in replacement)
  • CrewAI support
  • Microsoft AutoGen
  • Ollama (local LLM)
  • Custom adapter builder

Advanced Agentic

  • Function call optimization
  • Retry orchestration (conditional retries based on error type)
  • Capability-aware routing (route by function support)

Choosing FREE vs PRO

Use FREE if you:

  • ✅ Building a prototype or MVP
  • ✅ Want open source with no licensing headaches
  • ✅ Need multi-provider routing (core feature)
  • ✅ Want to understand how it works
  • ✅ Running on a budget

Upgrade to PRO if you:

  • 🚀 Running in production with high volume
  • 🚀 Need advanced compression (30-60% more savings)
  • 🚀 Want budget enforcement (cost control)
  • 🚀 Need dashboard (real-time monitoring)
  • 🚀 Using frameworks (LangChain, CrewAI)
  • 🚀 Want priority support

Installation & Usage

FREE Package

pip install tokenpak
tokenpak serve

PRO Package

pip install tokenpak-pro
# (includes all FREE features + advanced ones)
tokenpak serve --pro

Both can run side-by-side. PRO seamlessly upgrades FREE.


Configuration Reference

Basic config.yaml (FREE)

proxy:
  port: 8000

provider: anthropic
fallback:
  - google
  - openai

compression:
  enabled: true

telemetry:
  enabled: true
  log_file: /tmp/tokenpak.log

vault:
  enabled: true
  root: ~/my-vault

Advanced config.yaml (PRO)

Adds: - routing.strategy (cost, latency, capability) - budget.limit (enforce spending cap) - alerts.webhook (cost alerts) - compression.custom_dictionaries (domain-specific) - framework.langchain (enable LangChain integration)


Feature Roadmap

Coming Soon (FREE)

  • Multi-turn conversation history management
  • Prompt caching integration
  • Vision/multimodal support

Coming Soon (PRO)

  • Fine-tuning pipeline integration
  • Advanced batch processing
  • Team/workspace management

Support & Licensing

FREE — MIT License. Use however you like.

PRO — Commercial license. Support options available.

See README for more information.