Skip to content

TokenPak Documentation Index

Complete documentation for TokenPak — the open-source LLM proxy for context compression, cost tracking, and intelligent routing.


Document Description
README Project overview, installation, quick start
Getting Started 5-minute setup guide
CLI Reference All commands and flags
API Reference Python API for programmatic use

🚀 Getting Started

  1. Installation — pip, source, Docker
  2. Start the Proxytokenpak serve
  3. Connect Your Client — Claude Code, OpenAI, etc.
  4. Verify It Workstokenpak status

📖 Core Documentation

Architecture & Design

Document What It Covers
ARCHITECTURE.md System design, compression pipeline, block registry
Compression Deep Dive How compression works, modes, recipes
Cache System LRU cache, vault registry, change detection

Operations & Deployment

Document What It Covers
DEPLOYMENT.md Production deployment, systemd, Docker, scaling
Troubleshooting Symptom-first problem solving (12 categories)
Error Codes Full error code reference (TP-Exxx)
Telemetry Cost tracking, privacy model, data retention
FAQ Common questions and troubleshooting

Guides

Guide What You'll Learn
Proxy Setup Multi-provider routing, SSL, authentication
Recipe Development Custom compression recipes
Telemetry Dashboard Cost reports, export, alerts
Team Server Shared instance for teams

🔧 Reference

CLI Commands

# Core
tokenpak serve           # Start proxy
tokenpak status          # Health check
tokenpak cost            # Cost report
tokenpak savings         # Token savings

# Compression
tokenpak compress        # Dry-run compression
tokenpak demo            # Live demo
tokenpak trace           # Debug pipeline

# Vault
tokenpak index           # Index directory
tokenpak vault search    # Semantic search
tokenpak calibrate       # Auto-tune performance

# Routing
tokenpak route add       # Add routing rule
tokenpak route list      # List rules

See CLI Reference for complete documentation.

Python API

from tokenpak import (
    TelemetryCollector,    # Usage tracking
    CacheManager,          # Token cache
    CompressionEngine,     # Compression base class
    HeuristicEngine,       # Rule-based compression
    Budgeter,              # Token budget allocation
    BlockRegistry,         # Content-addressed storage
)

See API Reference for full class documentation.


🤝 Contributing

Document What It Covers
CONTRIBUTING.md Development setup, testing, PR process
Recipe SDK Building custom compression recipes

Quick Dev Setup

git clone https://github.com/tokenpak/tokenpak
cd tokenpak
pip install -e ".[dev]"
pytest

📊 Performance

Metric Value
Token reduction 43–84%
Indexing throughput 2,700+ files/sec
Search latency ~23ms
Cold start < 100ms

See ARCHITECTURE.md for benchmarks.



📄 License

TokenPak is released under the MIT License.