Feature Matrix — FREE vs PRO¶
All core proxy features are FREE and open source. Advanced features are in the optional PRO package.
Quick Reference¶
| Category | Feature | FREE | PRO | Notes |
|---|---|---|---|---|
| Core Routing | Multiple provider adapters | ✅ | ✅ | 5 adapters built-in |
| Fallback chains | ✅ | ✅ | Auto-failover to backup providers | |
| Circuit breaker | ✅ | ✅ | Recovers from rate limits | |
| Custom routing rules | ❌ | ✅ | Cost-based, latency-based logic | |
| Token Management | Token counting (all providers) | ✅ | ✅ | Unified across Anthropic, OpenAI, Google |
| Cost tracking | ✅ | ✅ | Basic tracking + reporting | |
| Cost alerts | ❌ | ✅ | Notifications when exceeding budget | |
| Budget enforcement | ❌ | ✅ | Auto-reject requests exceeding limit | |
| Compression | Deduplication | ✅ | ✅ | Remove repeated content |
| Document compression | ✅ | ✅ | Summarize long docs | |
| Instruction table | ✅ | ✅ | Compress repetitive instructions | |
| Dictionary compression | ❌ | ✅ | Custom compression dictionaries | |
| Code/log extraction | ❌ | ✅ | Smart code and log parsing | |
| Alias compression | ❌ | ✅ | Shorten variable names, function calls | |
| Error Handling | Normalized error messages | ✅ | ✅ | Consistent across providers |
| Automatic retries | ✅ | ✅ | Exponential backoff, configurable | |
| Retry orchestration | ❌ | ✅ | Advanced retry strategies, decision trees | |
| Error telemetry | ✅ | ✅ | Log error types and frequency | |
| Observability | Request/response logging | ✅ | ✅ | JSON logs, searchable |
| Cost dashboard (web UI) | ❌ | ✅ | Real-time cost, savings, usage charts | |
| Token usage reports | ✅ | ✅ | CSV export, JSON export | |
| Live request explorer | ❌ | ✅ | Debug individual requests | |
| Agentic | Error normalization | ✅ | ✅ | Convert errors to agent-readable format |
| Streaming support | ✅ | ✅ | Handle streaming + non-streaming | |
| Function call translation | ❌ | ✅ | Convert between provider formats | |
| Tool/function use optimization | ❌ | ✅ | Smart function routing | |
| Vault Integration | Document indexing | ✅ | ✅ | Index local files (.md, .txt, .pdf) |
| Semantic search | ✅ | ✅ | Search vault by meaning | |
| Auto-injection | ✅ | ✅ | Automatically add relevant docs to context | |
| Symbol extraction | ✅ | ✅ | Extract functions, classes, variables | |
| AST parsing | ✅ | ✅ | Parse code structure | |
| Chunk optimization | ✅ | ✅ | Smart chunking for injection | |
| Watcher mode | ✅ | ✅ | Live re-index on file changes | |
| Framework Integration | LangChain adapter | ❌ | ✅ | Drop-in replacement for LangChain |
| CrewAI adapter | ❌ | ✅ | CrewAI integration | |
| AutoGen adapter | ❌ | ✅ | Microsoft AutoGen support | |
| Ollama integration | ❌ | ✅ | Local LLM support | |
| CLI | serve command |
✅ | ✅ | Start the proxy |
count command |
✅ | ✅ | Count tokens in a file | |
compress command |
✅ | ✅ | Test compression on a document | |
validate command |
✅ | ✅ | Check config and connectivity | |
dashboard command |
❌ | ✅ | Launch web dashboard | |
report command |
✅ | ✅ | Generate usage reports |
FREE Features in Detail¶
Core Proxy¶
R1-R12 — Provider routing, adapters, tool schema handling, fallback chains, circuit breaker, streaming, passthrough
from tokenpak import Client
# Works out-of-the-box
client = Client(api_key="...", model="claude-opus-4-6")
response = client.messages.create(...)
Token Counting¶
T2 — Accurate token counts across all providers
tokens = client.count_tokens(
model="claude-opus-4-6",
messages=[{"role": "user", "content": "..."}]
)
Basic Compression¶
C1, C2, C4, C8, C9, C10, C11, C13 — Deduplication, doc compression, instruction table, budget tracking, fidelity tiers
Automatically applied. Semantic equivalence guaranteed.
Error Handling¶
A3, A5 — Normalized errors, automatic retries with exponential backoff
# Retries automatically with circuit breaker
response = client.messages.create(...)
# If provider fails, falls back to next in chain
Vault Features¶
V1-V8 — Indexing, search, auto-injection, symbol extraction, AST parsing, chunking, watcher, SQLite backend
vault:
enabled: true
root: "~/my-vault"
auto_inject: true # Automatically add relevant docs
PRO Features in Detail (Upgrade Path)¶
Advanced Compression¶
- Dictionary compression (C7) — Custom compression tables for your domain
- Code extraction (C5) — Smart parsing and compression of code blocks
- Log extraction (C6) — Compress repetitive logs
- Alias compression (C8) — Shorten variable names intelligently
Cost Management¶
- Budget enforcement — Reject requests exceeding limit
- Cost alerts — Email/webhook when approaching budget
- Advanced routing — Route by cost, latency, capability
Dashboard¶
- Real-time cost tracking
- Token usage charts
- Request history explorer
- Model comparison (cost vs latency)
- Savings calculator
Framework Adapters¶
- LangChain integration (drop-in replacement)
- CrewAI support
- Microsoft AutoGen
- Ollama (local LLM)
- Custom adapter builder
Advanced Agentic¶
- Function call optimization
- Retry orchestration (conditional retries based on error type)
- Capability-aware routing (route by function support)
Choosing FREE vs PRO¶
Use FREE if you:¶
- ✅ Building a prototype or MVP
- ✅ Want open source with no licensing headaches
- ✅ Need multi-provider routing (core feature)
- ✅ Want to understand how it works
- ✅ Running on a budget
Upgrade to PRO if you:¶
- 🚀 Running in production with high volume
- 🚀 Need advanced compression (30-60% more savings)
- 🚀 Want budget enforcement (cost control)
- 🚀 Need dashboard (real-time monitoring)
- 🚀 Using frameworks (LangChain, CrewAI)
- 🚀 Want priority support
Installation & Usage¶
FREE Package¶
pip install tokenpak
tokenpak serve
PRO Package¶
pip install tokenpak-pro
# (includes all FREE features + advanced ones)
tokenpak serve --pro
Both can run side-by-side. PRO seamlessly upgrades FREE.
Configuration Reference¶
Basic config.yaml (FREE)¶
proxy:
port: 8000
provider: anthropic
fallback:
- google
- openai
compression:
enabled: true
telemetry:
enabled: true
log_file: /tmp/tokenpak.log
vault:
enabled: true
root: ~/my-vault
Advanced config.yaml (PRO)¶
Adds:
- routing.strategy (cost, latency, capability)
- budget.limit (enforce spending cap)
- alerts.webhook (cost alerts)
- compression.custom_dictionaries (domain-specific)
- framework.langchain (enable LangChain integration)
Feature Roadmap¶
Coming Soon (FREE)¶
- Multi-turn conversation history management
- Prompt caching integration
- Vision/multimodal support
Coming Soon (PRO)¶
- Fine-tuning pipeline integration
- Advanced batch processing
- Team/workspace management
Support & Licensing¶
FREE — MIT License. Use however you like.
PRO — Commercial license. Support options available.
See README for more information.