
API Calls, Streaming & Model Management

API Client Construction

Multi-Provider Support

// src/services/api/client.ts
if (CLAUDE_CODE_USE_BEDROCK) return new AnthropicBedrock(...)
if (CLAUDE_CODE_USE_FOUNDRY) return new AnthropicFoundry(...)
if (CLAUDE_CODE_USE_VERTEX) return new AnthropicVertex(...)
return new Anthropic(...) // first-party API (default)

Authentication varies by provider: an API key or OAuth for the first-party API, AWS credentials for Bedrock, and GCP credentials for Vertex.
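The selection order in the excerpt above can be sketched as a pure function. This is a minimal illustration, not the code in client.ts: it assumes the three flags are read from environment variables and that "1" or "true" enables them, and the names Provider, Env, isTruthy, and selectProvider are hypothetical.

```typescript
// Hypothetical sketch: Provider, Env, isTruthy and selectProvider are
// illustrative names, not identifiers from src/services/api/client.ts.
type Provider = "bedrock" | "foundry" | "vertex" | "firstParty";
type Env = Record<string, string | undefined>;

// Assumption: a flag counts as enabled when it is "1" or "true".
function isTruthy(value: string | undefined): boolean {
  return value === "1" || value?.toLowerCase() === "true";
}

// Checks the flags in the same order as the excerpt; the first match wins,
// and the first-party API is the default when no flag is set.
function selectProvider(env: Env): Provider {
  if (isTruthy(env["CLAUDE_CODE_USE_BEDROCK"])) return "bedrock";
  if (isTruthy(env["CLAUDE_CODE_USE_FOUNDRY"])) return "foundry";
  if (isTruthy(env["CLAUDE_CODE_USE_VERTEX"])) return "vertex";
  return "firstParty";
}
```

Keeping the decision separate from client construction makes the precedence easy to test: if both the Bedrock and Vertex flags are set, this sketch returns "bedrock" because that check comes first.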

Streaming

Core Flow

The client consumes the raw event stream rather than BetaMessageStream, which avoids re-parsing partial JSON each time a tool-input delta arrives. Events handled: message_start, content_block_start/delta/stop, message_delta (final usage, stop_reason, cost), message_stop.
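One way to avoid repeated partial JSON parsing is to buffer tool-input fragments as plain strings and parse once when the block closes. The sketch below assumes simplified local event shapes modeled on the event names listed above; StreamEvent, Block, and assemble are hypothetical names, and the real handler in claude.ts may be structured differently.

```typescript
// Simplified local event shapes (an assumption for this sketch, not SDK types).
type Delta =
  | { type: "text_delta"; text: string }
  | { type: "input_json_delta"; partial_json: string };

type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number; content_block: { type: "text" | "tool_use" } }
  | { type: "content_block_delta"; index: number; delta: Delta }
  | { type: "content_block_stop"; index: number }
  | { type: "message_delta"; delta: { stop_reason: string | null } }
  | { type: "message_stop" };

interface Block {
  kind: "text" | "tool_use";
  buffer: string;  // raw text, or concatenated JSON fragments
  input?: unknown; // parsed tool input, set only when the block stops
}

function assemble(events: Iterable<StreamEvent>): { blocks: Block[]; stopReason: string | null } {
  const blocks: Block[] = [];
  let stopReason: string | null = null;
  for (const event of events) {
    switch (event.type) {
      case "content_block_start":
        blocks[event.index] = { kind: event.content_block.type, buffer: "" };
        break;
      case "content_block_delta": {
        const block = blocks[event.index];
        if (!block) break;
        const delta = event.delta;
        // Append only; no JSON.parse on partial fragments.
        block.buffer += delta.type === "text_delta" ? delta.text : delta.partial_json;
        break;
      }
      case "content_block_stop": {
        const block = blocks[event.index];
        // Parse the accumulated tool input exactly once.
        if (block && block.kind === "tool_use") {
          block.input = JSON.parse(block.buffer || "{}");
        }
        break;
      }
      case "message_delta":
        stopReason = event.delta.stop_reason;
        break;
      default:
        break; // message_start and message_stop carry nothing this sketch needs
    }
  }
  return { blocks, stopReason };
}
```

The trade-off is that tool input is not inspectable until content_block_stop, in exchange for a single parse per tool call instead of one per delta.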

Retry & Fault Tolerance

  • withRetry: Exponential backoff, rate limit (429) and server error (5xx) handling, fallback model support
  • Idle Watchdog: Aborts if no chunks arrive within timeout (CLAUDE_STREAM_IDLE_TIMEOUT_MS)
  • Non-streaming Fallback: Falls back to create() without stream: true on streaming failure
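The first bullet above (backoff, retryable statuses, fallback model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of withRetry.ts: the option names, the doubling schedule with a cap, the absence of jitter, and the rule "try the fallback model once after the primary is exhausted" are all hypothetical choices.

```typescript
// Hypothetical sketch; names and policy are assumptions, not withRetry.ts.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Rate limits (429) and server errors (5xx) are treated as retryable.
function isRetryableStatus(status: number | undefined): boolean {
  return status === 429 || (status !== undefined && status >= 500 && status <= 599);
}

// Exponential backoff: base * 2^attempt, capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetrySketch<T>(
  call: (model: string) => Promise<T>,
  primaryModel: string,
  fallbackModel: string | undefined,
  options: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await call(primaryModel);
    } catch (error) {
      lastError = error;
      // Assumption: errors expose an HTTP status on a `status` property.
      const status = (error as { status?: number } | null)?.status;
      if (!isRetryableStatus(status)) throw error;
      if (attempt < options.maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs));
      }
    }
  }
  // Primary exhausted: one attempt on the fallback model, if configured.
  if (fallbackModel !== undefined) return call(fallbackModel);
  throw lastError;
}
```

With baseDelayMs of 500 and maxDelayMs of 8000, the waits between attempts are 500, 1000, 2000, 4000, and then 8000 ms for every later attempt. A production version would usually add jitter and honor any retry hint the server returns.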

Model Selection

Resolution priority, highest first: session override -> --model CLI arg -> ANTHROPIC_MODEL env -> settings -> default model. The same logical model maps to a different API string for each provider via ALL_MODEL_CONFIGS.
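The priority chain and the per-provider mapping can be sketched as two small lookups. Everything here is illustrative: ModelSources, resolveModel, toApiModelId, and EXAMPLE_MODEL_CONFIGS are hypothetical names, the model strings are placeholders rather than real identifiers, and the actual shape of ALL_MODEL_CONFIGS is not shown in this chapter.

```typescript
// Hypothetical sketch of the resolution order described above.
interface ModelSources {
  sessionOverride?: string; // set during the session
  cliArg?: string;          // value of --model
  envVar?: string;          // value of ANTHROPIC_MODEL
  settings?: string;        // value from the settings file
}

// Returns the first non-empty source in priority order, else the default.
function resolveModel(sources: ModelSources, defaultModel: string): string {
  const candidates = [sources.sessionOverride, sources.cliArg, sources.envVar, sources.settings];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") return candidate;
  }
  return defaultModel;
}

type Provider = "firstParty" | "bedrock" | "vertex" | "foundry";

// Placeholder data only: these strings are NOT real model identifiers.
const EXAMPLE_MODEL_CONFIGS: Record<string, Record<Provider, string>> = {
  "example-model": {
    firstParty: "example-model-v1",
    bedrock: "vendor.example-model-v1:0",
    vertex: "example-model@v1",
    foundry: "example-model-v1-foundry",
  },
};

// Maps a logical model name to the provider-specific API string.
// Assumption: unknown names pass through unchanged.
function toApiModelId(logicalModel: string, provider: Provider): string {
  return EXAMPLE_MODEL_CONFIGS[logicalModel]?.[provider] ?? logicalModel;
}
```

For example, resolveModel({ cliArg: "b", settings: "d" }, "z") returns "b", because the CLI argument outranks settings and no session override is set.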

Cost Tracking

// src/cost-tracker.ts
addToTotalSessionCost(cost, usage, model)
// 1. Accumulate per-model ModelUsage (input/output/cache tokens)
// 2. Update global cost state
// 3. Increment OpenTelemetry counters
// 4. Log analytics
// 5. Recursively process advisor sub-usage

formatTotalCost() // CLI-style cost summary

Session costs are persisted to project config and restored on session resume.
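Steps 1 and 2 of the outline above, plus persist-and-restore, can be sketched with a small accumulator. This is a simplified illustration, not cost-tracker.ts: the Usage fields, SessionSnapshot, and SessionCostSketch are hypothetical, and the OpenTelemetry, analytics, and advisor sub-usage steps are deliberately left out.

```typescript
// Hypothetical sketch; field and class names are assumptions.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

interface ModelUsage extends Usage {
  costUSD: number;
}

// Plain-object form suitable for writing to a config file.
interface SessionSnapshot {
  totalCostUSD: number;
  perModel: Record<string, ModelUsage>;
}

class SessionCostSketch {
  private totalCostUSD: number;
  private readonly perModel: Map<string, ModelUsage>;

  // Passing a snapshot restores a previous session's totals.
  constructor(snapshot?: SessionSnapshot) {
    this.totalCostUSD = snapshot ? snapshot.totalCostUSD : 0;
    const entries: [string, ModelUsage][] = snapshot ? Object.entries(snapshot.perModel) : [];
    this.perModel = new Map(entries);
  }

  // Step 1: accumulate per-model usage. Step 2: update the session total.
  add(costUSD: number, usage: Usage, model: string): void {
    const current = this.perModel.get(model) ??
      { inputTokens: 0, outputTokens: 0, cacheReadTokens: 0, cacheWriteTokens: 0, costUSD: 0 };
    this.perModel.set(model, {
      inputTokens: current.inputTokens + usage.inputTokens,
      outputTokens: current.outputTokens + usage.outputTokens,
      cacheReadTokens: current.cacheReadTokens + usage.cacheReadTokens,
      cacheWriteTokens: current.cacheWriteTokens + usage.cacheWriteTokens,
      costUSD: current.costUSD + costUSD,
    });
    this.totalCostUSD += costUSD;
  }

  total(): number {
    return this.totalCostUSD;
  }

  usageFor(model: string): ModelUsage | undefined {
    return this.perModel.get(model);
  }

  // Copies state into a plain object so it can be persisted and reloaded.
  snapshot(): SessionSnapshot {
    const perModel: Record<string, ModelUsage> = {};
    this.perModel.forEach((usage, model) => {
      perModel[model] = { ...usage };
    });
    return { totalCostUSD: this.totalCostUSD, perModel };
  }
}
```

Resuming a session then amounts to constructing a new tracker from the stored snapshot, so later calls to add continue from the restored totals rather than from zero.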

Key Source Files

File                             Responsibility
src/services/api/claude.ts       Streaming/non-streaming API call core
src/services/api/client.ts       Anthropic SDK client construction
src/services/api/withRetry.ts    Retry logic
src/utils/model/model.ts         Model selection logic
src/utils/model/configs.ts       Model config mappings
src/cost-tracker.ts              Cost tracking

Next

Go to 13-config-settings.md to learn about the configuration system.

Hands-on Experiment

This chapter has a corresponding Python experiment:

Lab 12 — Streaming API

Covers: SSE streaming, JSON fragment assembly, retry, idle timeout

cd experiments && python -m exp_12_streaming_api.main --mock