TokenPak Testing Guide¶

Overview¶

TokenPak uses pytest for unit testing, with structured test organization and coverage tracking. This guide covers how to run, write, and maintain tests — especially Phase 2 coverage additions.

Quick Start¶

Run all tests¶

pytest tests/ -v

Run Phase 2 tests (validation + error + cache)¶

pytest tests/test_validation*.py tests/test_error*.py tests/test_cache*.py -v

Run with coverage¶

pytest tests/ --cov=tokenpak --cov-report=html
open htmlcov/index.html

Run a single test file¶

pytest tests/test_validation.py -v

Run tests matching a pattern¶

pytest -k "test_schema" -v

Test Organization¶

tests/
├── test_validation.py          # Phase 2: Input validation, schemas
├── test_error_handling.py       # Phase 2: Error recovery, logging
├── test_cache.py                # Phase 2: Cache behavior, TTL, concurrency
├── test_adapters.py             # Core: Adapter implementations
├── test_routing.py              # Core: Provider routing logic
├── test_compression.py          # Core: Token compression
├── test_integration.py          # Integration tests (slow)
└── conftest.py                  # Shared fixtures

Test Markers¶

Tests are marked with @pytest.mark to enable selective running:

Marker	Use Case	Run Command
`unit`	Fast unit tests	`pytest -m unit`
`integration`	Requires external services	`pytest -m integration`
`slow`	>1 second per test	`pytest -m "not slow"`
`phase2`	Phase 2 coverage additions	`pytest -m phase2`
`xfail`	Known failures (expected)	Displayed in report

Run unit tests only¶

pytest -m "not integration and not slow" -v

Run Phase 2 tests with coverage¶

pytest -m phase2 \
  --cov=tokenpak.validation \
  --cov=tokenpak.core.error_handling \
  --cov=tokenpak.cache \
  --cov-report=term-missing

Phase 2 Test Scope¶

Phase 2 adds comprehensive testing for three critical modules:

Validation (`tests/test_validation.py`)¶

Tests input validation, schema enforcement, and type checking.

Coverage areas: - Valid config parsing - Invalid/missing required fields - Type coercion (str → int, bool, etc.) - Default value handling - Nested object validation - Array/list validation

Example:

@pytest.mark.phase2
def test_schema_validation_missing_required_field():
    """Invalid config missing 'provider' field should raise ValidationError."""
    config = {"model": "gpt-4"}  # Missing required 'provider'
    with pytest.raises(ValidationError):
        validate_config(config)

Error Handling (`tests/test_error_handling.py`)¶

Tests error recovery, logging, and context propagation.

Coverage areas: - API errors (timeout, auth failure, rate limit) - Fallback logic (retry, alternative provider) - Error logging and context - Circuit breaker behavior - Graceful degradation

Example:

@pytest.mark.phase2
def test_retry_on_timeout():
    """Timeout should trigger automatic retry."""
    with mock.patch('requests.post', side_effect=Timeout()):
        result = call_with_retry(max_retries=2)
        assert mock.patch.call_count == 2

Cache Management (`tests/test_cache.py`)¶

Tests cache hit/miss behavior, TTL expiry, and concurrent access.

Coverage areas: - Cache hit (key found, fresh) - Cache miss (key not found or expired) - TTL expiry handling - Cache invalidation - LRU eviction - Concurrent access / race conditions - Size limits

Example:

@pytest.mark.phase2
def test_cache_ttl_expiry():
    """Item should expire after TTL."""
    cache = LRUCache(ttl_seconds=1)
    cache.set("key", "value")
    assert cache.get("key") == "value"

    time.sleep(1.1)  # Wait past TTL
    assert cache.get("key") is None

Writing Tests¶

Test Structure¶

import pytest
from unittest import mock
from tokenpak.validation import validate_config
from tokenpak.core.error_handling import ValidationError

@pytest.mark.phase2
def test_meaningful_name_describes_behavior():
    """
    Docstring: 1-2 sentences explaining what is being tested and why.
    """
    # Arrange: Set up test data and mocks
    config = {"provider": "anthropic", "model": "claude-3-sonnet"}

    # Act: Execute the code under test
    result = validate_config(config)

    # Assert: Verify expected behavior
    assert result.is_valid
    assert result.provider == "anthropic"

Fixtures (Reusable Setup)¶

Define shared test data in conftest.py:

@pytest.fixture
def valid_config():
    """A minimal valid TokenPak config."""
    return {
        "provider": "anthropic",
        "model": "claude-3-sonnet",
        "max_tokens": 4096,
    }

@pytest.fixture
def mock_api_response():
    """Mock a successful API response."""
    return {
        "id": "msg_123",
        "content": "Hello, world!",
        "usage": {"input_tokens": 10, "output_tokens": 5}
    }

Use in tests:

def test_with_fixture(valid_config, mock_api_response):
    result = process_config(valid_config)
    assert result is not None

Mocking External Dependencies¶

from unittest import mock

@pytest.mark.phase2
def test_with_mock():
    """Test in isolation by mocking external calls."""
    with mock.patch('tokenpak.sdk.anthropic.call_api') as mock_call:
        mock_call.return_value = {"content": "Mocked response"}

        result = adapter.get_completion("Hello")

        assert result == "Mocked response"
        mock_call.assert_called_once()

Coverage Reports¶

Understand the HTML Report¶

pytest tests/ --cov=tokenpak --cov-report=html
open htmlcov/index.html

Red lines = Not covered (no test path) Yellow lines = Partially covered (some branches not executed) Green lines = Fully covered

Coverage by Module¶

pytest tests/ --cov=tokenpak --cov-report=term-missing

Example output:

Name                                   Stmts   Miss Cover   Missing
------------------------------------------------------------------------
tokenpak/__init__.py                      5      0   100%
tokenpak/sdk/base.py                      60      8    87%   120-125,134
tokenpak/validation/__init__.py          45      3    93%   78,102,115
tokenpak/cache/__init__.py               55     12    78%   45-60,88-92
------------------------------------------------------------------------
TOTAL                                   165     23    86%

Missing column shows line numbers not executed by any test.

Find Untested Code Paths¶

Look for high "Miss" counts in critical modules
Read the "Missing" line numbers
Add tests to cover those paths
Re-run coverage to verify improvement

Continuous Integration¶

GitHub Actions¶

Coverage is checked automatically on every PR:

Phase 2 tests run on Python 3.11
Tier-1 gate enforced — fails if validation+error+cache < 50%
Coverage report uploaded to Codecov
Delta comment posted on PR (coverage change)

On PR merge, coverage baseline is updated for next comparison.

Pre-commit Hook¶

Local coverage check before every commit:

git commit  # Automatically runs coverage-check.sh

Warns if coverage < 45%
Fails if coverage < 40%
Override: git commit --no-verify

Common Issues & Fixes¶

Coverage not updating?¶

Check 1: Are you running the right test file?

pytest tests/test_validation.py -v  # Should run the tests you added

Check 2: Is the code path being executed?

pytest tests/test_validation.py --cov=tokenpak.validation --cov-report=term-missing
# Check "Missing" lines — add tests for those

Check 3: Is coverage being measured correctly?

# Regenerate coverage DB
rm -rf .coverage htmlcov/
pytest tests/ --cov=tokenpak --cov-report=html

Test is flaky (fails randomly)?¶

Look for: Time-dependent code, random data, external API calls
Fix with: Mocks, fixtures, freezegun for time, @pytest.mark.flaky
Never: Use @pytest.mark.skip to hide flakiness

Test runs too slowly?¶

Mark as @pytest.mark.slow
Run fast tests with: pytest -m "not slow"
Run slow tests separately in CI or nightly

Best Practices¶

✅ Do: - Write tests before implementing features (TDD) - Use descriptive test names (test_cache_hit_returns_fresh_value) - Mock external API calls - Test edge cases (empty inputs, null values, exceptions) - Use fixtures for reusable setup - Keep tests focused (one behavior per test)

❌ Don't: - Test implementation details (test behavior, not code) - Skip flaky tests (fix them instead) - Disable coverage for convenience - Write overly complex tests (refactor into smaller pieces) - Ignore test failures (fix the bug or mark as xfail)

Troubleshooting Test Collection¶

Slow or hanging collection¶

If pytest --collect-only is unusually slow or appears to hang, the most common cause is a fixture or module-level import that does expensive work at import time (network calls, large model loads, or reading a missing file path).

Run pytest --collect-only --trace-config to identify slow plugin/fixture loads.
Prefer package imports (from tokenpak.core import ...) over loading a module from a hardcoded filesystem path — path-based loading breaks when the working directory or layout changes.
If a test references a file that may be absent, guard it with pytest.importorskip or a fixture that skips cleanly rather than failing at collection time.

TokenPak Testing Guide¶

Overview¶

Quick Start¶

Run all tests¶

Run Phase 2 tests (validation + error + cache)¶

Run with coverage¶

Run a single test file¶

Run tests matching a pattern¶

Test Organization¶

Test Markers¶

Run unit tests only¶

Run Phase 2 tests with coverage¶

Phase 2 Test Scope¶

Validation (tests/test_validation.py)¶

Error Handling (tests/test_error_handling.py)¶

Cache Management (tests/test_cache.py)¶

Writing Tests¶

Test Structure¶

Fixtures (Reusable Setup)¶

Mocking External Dependencies¶

Coverage Reports¶

Understand the HTML Report¶

Coverage by Module¶

Find Untested Code Paths¶

Continuous Integration¶

GitHub Actions¶

Pre-commit Hook¶

Common Issues & Fixes¶

Coverage not updating?¶

Test is flaky (fails randomly)?¶

Test runs too slowly?¶

Best Practices¶

Further Reading¶

Troubleshooting Test Collection¶

Slow or hanging collection¶

Validation (`tests/test_validation.py`)¶

Error Handling (`tests/test_error_handling.py`)¶

Cache Management (`tests/test_cache.py`)¶