
Format Conversion and Smart Forwarding

PipeLLM Gateway’s core functionality is protocol conversion and smart forwarding. You keep using the official SDKs (Anthropic, OpenAI, Google, etc.); we automatically convert each request to the target platform’s protocol and convert the response back to the format your SDK expects. The biggest benefit: there is no new API to learn. You use the SDKs you already know, and we handle all protocol differences.

🔄 How It Works

Your Code (Anthropic SDK)
    ↓
[PipeLLM Gateway]
    ↓ (Protocol Conversion)
AWS Bedrock / Google Vertex / Azure
    ↓ (Protocol Conversion)
[PipeLLM Gateway]
    ↓
Your Code (Still Anthropic Format)
Key points:
  • Use native SDK (e.g., Anthropic)
  • Send standard protocol request
  • We convert to target platform format
  • Target platform processes request
  • We convert response back to your SDK format
  • Your code unchanged

📊 Supported Protocol Conversions

1. Anthropic SDK ↔ Platforms

Use Anthropic’s official SDK; we handle the protocol conversion. Example: Anthropic SDK → Bedrock
# Your code (standard Anthropic SDK)
from anthropic import Anthropic

client = Anthropic(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1"  # Point to our gateway
)

# Standard Anthropic API call
response = client.messages.create(
    model="claude-3-sonnet",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)

print(response.content[0].text)  # Use directly
Actual conversion:
Your SDK         Target Platform       Conversion
Anthropic SDK    AWS Bedrock           Anthropic → Bedrock protocol
Anthropic SDK    Google Vertex         Anthropic → Vertex protocol
Anthropic SDK    Azure                 Anthropic → Azure protocol
Anthropic SDK    Anthropic Official    Direct passthrough
Your response:
{
  "id": "msg-123",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-3-sonnet",
  "role": "assistant",
  "usage": {"input_tokens": 15, "output_tokens": 25}
}

2. OpenAI SDK ↔ Platforms

Use OpenAI’s official SDK. Example: OpenAI SDK → Azure
# Your code (standard OpenAI SDK)
import openai

client = openai.OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1"  # Point to our gateway
)

# Standard OpenAI API call
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100
)

print(response.choices[0].message.content)  # Use directly
Actual conversion:
Your SDK      Target Platform    Conversion
OpenAI SDK    Azure OpenAI       OpenAI → Azure protocol
OpenAI SDK    AWS Bedrock        OpenAI → Bedrock protocol
OpenAI SDK    Google Vertex      OpenAI → Vertex protocol
OpenAI SDK    OpenAI Official    Direct passthrough

3. Google SDK ↔ Platforms

Use Google’s native libraries or the raw Gemini API format. Example: Gemini API → Vertex AI
# Standard Gemini API
import requests

headers = {
    "Authorization": f"Bearer your-pipellm-key",
    "Content-Type": "application/json"
}

data = {
    "model": "gemini-pro",
    "contents": [{"role": "user", "parts": [{"text": "Hello"}]}]
}

response = requests.post(
    "https://api.pipellm.com/v1/chat/completions",
    headers=headers,
    json=data
)

result = response.json()
# The response comes back in the same Gemini format your request used
print(result["candidates"][0]["content"]["parts"][0]["text"])
Actual conversion:
Your SDK      Target Platform    Conversion
Gemini SDK    Google Vertex      Gemini → Vertex protocol
Gemini SDK    AWS Bedrock        Gemini → Bedrock protocol
Gemini SDK    Other Platforms    Gemini → Platform protocol

🎯 Smart Forwarding Strategy

1. Auto Load Balancing

The gateway chooses the best provider based on the following signals (see the sketch after this list):
  • Availability: Real-time health monitoring
  • Latency: Select fastest response
  • Cost: Best value under quality guarantee
  • Quota: Avoid single provider overload
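A minimal sketch of how such weighted selection could work; the provider names, stats, and weights below are hypothetical illustrations, not the gateway's actual internals:
# Hypothetical weighted provider selection; all numbers are illustrative.
providers = {
    "openai":    {"healthy": True,  "latency_ms": 420, "cost_per_1k": 0.030, "quota_left": 0.8},
    "anthropic": {"healthy": True,  "latency_ms": 380, "cost_per_1k": 0.015, "quota_left": 0.5},
    "aws":       {"healthy": False, "latency_ms": 510, "cost_per_1k": 0.012, "quota_left": 0.9},
}

def score(stats):
    # Lower latency and cost score higher; more remaining quota scores higher.
    return 1000 / stats["latency_ms"] + 0.1 / stats["cost_per_1k"] + stats["quota_left"]

def pick_provider(providers):
    # Availability filter first, then rank the survivors by score.
    healthy = {name: s for name, s in providers.items() if s["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy provider available")
    return max(healthy, key=lambda name: score(healthy[name]))

print(pick_provider(providers))  # "anthropic" with these numbers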

2. Failover

If the primary provider is unavailable:
User Request
    ↓
Check Provider A Status
    ↓ (If A unavailable)
Auto Switch to Provider B
    ↓
Return Result
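Conceptually, the failover chain is a loop over providers in priority order. A runnable sketch of that idea (call_provider and the error type are hypothetical stand-ins, not gateway APIs):
class ProviderUnavailableError(Exception):
    pass

def call_provider(provider, request):
    # Stand-in: pretend Provider A is down and Provider B answers.
    if provider == "provider_a":
        raise ProviderUnavailableError(f"{provider} is unavailable")
    return {"provider": provider, "output": "Hello!"}

def send_with_failover(request, providers):
    last_error = None
    for provider in providers:
        try:
            return call_provider(provider, request)
        except ProviderUnavailableError as err:
            last_error = err  # remember the failure, try the next provider
    raise last_error

print(send_with_failover({"messages": []}, ["provider_a", "provider_b"]))
# -> {'provider': 'provider_b', 'output': 'Hello!'}
The gateway runs this logic for you; you never write it yourself.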

3. Model Mapping

We maintain a detailed model mapping:
Your Request Model    OpenAI             Anthropic            Gemini          AWS Bedrock
gpt-4                 ✅ GPT-4
gpt-3.5-turbo         ✅ GPT-3.5 Turbo
claude-3-sonnet                          ✅ Claude 3 Sonnet                   ✅ Claude 3 Sonnet
gemini-pro                                                    ✅ Gemini Pro
auto                  Smart selection (see logic below)
Smart selection logic (sketched in code after this list):
  • If specific model requested, use that model
  • If auto or generic name, choose best based on current status
  • If primary provider quota exhausted, auto switch to backup
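A minimal sketch of what such a mapping can look like in code; the routes and fallback order below are illustrative assumptions, not the gateway's real table:
# Illustrative model-to-provider routes with a fallback order.
MODEL_ROUTES = {
    "gpt-4":           ["openai", "azure"],
    "claude-3-sonnet": ["anthropic", "aws"],
    "gemini-pro":      ["google"],
}

def route(model, quota_exhausted=frozenset()):
    if model == "auto":
        # Generic name: consider every provider that still has quota.
        candidates = [p for routes in MODEL_ROUTES.values() for p in routes]
    else:
        candidates = MODEL_ROUTES.get(model, [])
    for provider in candidates:
        if provider not in quota_exhausted:
            return provider
    raise LookupError(f"no provider available for {model!r}")

print(route("claude-3-sonnet"))                 # anthropic (primary)
print(route("claude-3-sonnet", {"anthropic"}))  # aws (backup)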

⚙️ Advanced Configuration

1. Preferred Provider

Specify preferred provider via request header:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Preferred-Provider: anthropic" \
  -d '{"model": "auto", "messages": [...]}'
Available providers:
  • openai - OpenAI
  • anthropic - Anthropic Claude
  • google - Google Gemini
  • azure - Azure OpenAI
  • aws - AWS Bedrock
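If you call the gateway through an SDK instead of curl, you can attach the same header per request. For example, with the OpenAI Python SDK (the header name comes from the docs above; extra_headers is standard SDK functionality):
from openai import OpenAI

client = OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1",
)

# Attach the routing header to a single request.
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Preferred-Provider": "anthropic"},
)
print(response.choices[0].message.content)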

2. Force Provider

Bypass smart routing and always use the specified provider:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Force-Provider: openai" \
  -d '{"model": "auto", "messages": [...]}'

3. Disable Format Conversion

Send a provider's native request format straight through, with conversion disabled:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Raw-Format: anthropic" \
  -d '{"model": "claude-3-sonnet", "messages": [...]}'

⚡ Performance Optimization

1. Near-Zero Conversion Overhead

Format conversion has minimal overhead:
  • Request conversion: < 1ms
  • Response conversion: < 1ms
  • Total latency increase: < 2ms

2. Smart Caching

Automatic caching across providers:
  • Cross-provider caching (format independent)
  • Smart cache key generation
  • Auto cache refresh
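"Format independent" means the cache key is derived from a normalized request rather than from the SDK wire format. A minimal sketch of that idea (the gateway's real normalization rules are internal):
import hashlib
import json

def cache_key(model, messages, **params):
    # Canonicalize the request so the same prompt maps to the same key
    # no matter which SDK format it arrived in.
    canonical = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = cache_key("gpt-4", [{"role": "user", "content": "Hello"}], max_tokens=100)
print(key[:16])  # identical normalized requests yield identical keys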

3. Connection Reuse

  • Long-lived connections
  • Connection pool management
  • Concurrent request optimization
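The gateway pools connections on its side; on your side, you get a similar benefit by creating one client or session and reusing it instead of reconnecting per call. A sketch using requests:
import requests

# One Session = one connection pool; reuse it across calls instead of
# opening a new TCP/TLS connection every time.
session = requests.Session()
session.headers.update({"Authorization": "Bearer your-pipellm-key"})

for prompt in ["Hello", "How are you?"]:
    resp = session.post(
        "https://api.pipellm.com/v1/chat/completions",
        json={"model": "auto", "messages": [{"role": "user", "content": prompt}]},
    )
    resp.raise_for_status()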

🛠️ Developer Tools

1. Debug Mode

Enable debug mode to view conversion details:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Debug: true" \
  -d '{"model": "auto", "messages": [...]}'
Response includes:
{
  "_debug": {
    "original_request": {...},
    "converted_request": {...},
    "provider": "openai",
    "conversion_time_ms": 1.2,
    "cache_hit": false
  },
  "choices": [...]
}
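The same debug view from Python, using requests and the field names shown above:
import requests

resp = requests.post(
    "https://api.pipellm.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
        "X-Debug": "true",
    },
    json={"model": "auto", "messages": [{"role": "user", "content": "Hello"}]},
)
debug = resp.json().get("_debug", {})
print(debug.get("provider"), debug.get("conversion_time_ms"))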

2. Performance Monitoring

Monitor via dashboard:
  • Provider usage statistics
  • Conversion time analysis
  • Cache hit rate
  • Failover counts

📝 Usage Guidelines

1. Best Practices

Recommended:
  • Use the standard OpenAI format
  • Let the gateway auto-select the provider
  • Use caching appropriately
  • Implement retry logic (see the sketch after this list)
Not recommended:
  • Switching providers frequently
  • Disabling smart routing (unless necessary)
  • Ignoring error handling
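A minimal retry sketch with exponential backoff, using the OpenAI SDK against the gateway (openai.APIError is the SDK's base API error type):
import time
import openai

client = openai.OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1",
)

def create_with_retry(max_attempts=3, **kwargs):
    # Back off 1s, 2s, 4s, ... between attempts.
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.APIError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)

response = create_with_retry(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
The OpenAI SDK also has a built-in max_retries client option if you prefer not to hand-roll this.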

2. Migration Guide

Step 1: Point your existing client at the gateway (a two-line change)
# Original code
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://api.openai.com/v1/",
)

# Modify to
client = OpenAI(
    api_key="YOUR_GATEWAY_KEY",
    base_url="https://api.pipellm.com/v1/",
)
Step 2: Test basic functionality
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "test"}]
)
print(response.choices[0].message.content)
Step 3: Optimize gradually
  • Adjust model selection based on needs
  • Enable batch requests
  • Configure monitoring alerts

🤝 Support

If you encounter format conversion issues:
  1. Enable debug mode to get detailed information
  2. Check request logs to confirm conversion
  3. Contact support with debug information

Tip: In most cases you don’t need to think about format conversion details at all. Our gateway handles everything automatically!