
Format Conversion and Smart Forwarding

PipeLLM Gateway’s core functionality is protocol conversion and smart forwarding. You keep using the official SDKs (Anthropic, OpenAI, Google, etc.); we automatically convert each request to the target platform’s protocol and convert the response back to the format your SDK expects. The biggest benefit: there is no new API to learn. You use the SDKs you already know, and we handle all protocol differences.

🔄 How It Works

Your Code (Anthropic SDK)
    ↓
[PipeLLM Gateway]
    ↓ (Protocol Conversion)
AWS Bedrock / Google Vertex / Azure
    ↓ (Protocol Conversion)
[PipeLLM Gateway]
    ↓
Your Code (Still Anthropic Format)
Key points:
  • Use native SDK (e.g., Anthropic)
  • Send standard protocol request
  • We convert to target platform format
  • Target platform processes request
  • We convert response back to your SDK format
  • Your code unchanged

📊 Supported Protocol Conversions

1. Anthropic SDK ↔ Platforms

Use Anthropic’s official SDK; we handle the protocol conversion. Example: Anthropic SDK → Bedrock
# Your code (standard Anthropic SDK)
from anthropic import Anthropic

client = Anthropic(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1"  # Point to our gateway
)

# Standard Anthropic API call
response = client.messages.create(
    model="claude-3-sonnet",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)

print(response.content[0].text)  # Use directly
Actual conversion:
Your SDK         Target Platform       Conversion
Anthropic SDK    AWS Bedrock           Anthropic → Bedrock protocol
Anthropic SDK    Google Vertex         Anthropic → Vertex protocol
Anthropic SDK    Azure                 Anthropic → Azure protocol
Anthropic SDK    Anthropic Official    Direct passthrough
Your response:
{
  "id": "msg-123",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-3-sonnet",
  "role": "assistant",
  "usage": {"input_tokens": 15, "output_tokens": 25}
}

2. OpenAI SDK ↔ Platforms

Use OpenAI’s official SDK. Example: OpenAI SDK → Azure
# Your code (standard OpenAI SDK)
import openai

client = openai.OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1"  # Point to our gateway
)

# Standard OpenAI API call
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100
)

print(response.choices[0].message.content)  # Use directly
Actual conversion:
Your SDK      Target Platform    Conversion
OpenAI SDK    Azure OpenAI       OpenAI → Azure protocol
OpenAI SDK    AWS Bedrock        OpenAI → Bedrock protocol
OpenAI SDK    Google Vertex      OpenAI → Vertex protocol
OpenAI SDK    OpenAI Official    Direct passthrough

3. Google SDK ↔ Platforms

Use Google’s native libraries or the raw Gemini API format. Example: Gemini API → Vertex AI
# Standard Gemini API
import requests

headers = {
    "Authorization": f"Bearer your-pipellm-key",
    "Content-Type": "application/json"
}

data = {
    "model": "gemini-pro",
    "contents": [{"role": "user", "parts": [{"text": "Hello"}]}]
}

response = requests.post(
    "https://api.pipellm.com/v1/chat/completions",
    headers=headers,
    json=data
)

result = response.json()
# The response comes back in the same Gemini format your request used
print(result["candidates"][0]["content"]["parts"][0]["text"])
Actual conversion:
Your SDK      Target Platform    Conversion
Gemini SDK    Google Vertex      Gemini → Vertex protocol
Gemini SDK    AWS Bedrock        Gemini → Bedrock protocol
Gemini SDK    Other Platforms    Gemini → Platform protocol

🎯 Smart Forwarding Strategy

1. Auto Load Balancing

The gateway chooses the best provider based on the following signals (see the sketch after this list):
  • Availability: Real-time health monitoring
  • Latency: Select fastest response
  • Cost: Best value under quality guarantee
  • Quota: Avoid single provider overload
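A minimal sketch of how such weighted selection could work; the provider names, stats, and weights below are hypothetical illustrations, not the gateway's actual internals:
# Hypothetical weighted provider selection; all numbers are illustrative.
providers = {
    "openai":    {"healthy": True,  "latency_ms": 420, "cost_per_1k": 0.030, "quota_left": 0.8},
    "anthropic": {"healthy": True,  "latency_ms": 380, "cost_per_1k": 0.015, "quota_left": 0.5},
    "aws":       {"healthy": False, "latency_ms": 510, "cost_per_1k": 0.012, "quota_left": 0.9},
}

def score(stats):
    # Lower latency and cost score higher; more remaining quota scores higher.
    return 1000 / stats["latency_ms"] + 0.1 / stats["cost_per_1k"] + stats["quota_left"]

def pick_provider(providers):
    # Availability filter first, then rank the survivors by score.
    healthy = {name: s for name, s in providers.items() if s["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy provider available")
    return max(healthy, key=lambda name: score(healthy[name]))

print(pick_provider(providers))  # "anthropic" with these numbers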

2. Failover

If the primary provider is unavailable:
User Request
    ↓
Check Provider A Status
    ↓ (If A unavailable)
Auto Switch to Provider B
    ↓
Return Result
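Conceptually, the failover chain is a loop over providers in priority order. A runnable sketch of that idea (call_provider and the error type are hypothetical stand-ins, not gateway APIs):
class ProviderUnavailableError(Exception):
    pass

def call_provider(provider, request):
    # Stand-in: pretend Provider A is down and Provider B answers.
    if provider == "provider_a":
        raise ProviderUnavailableError(f"{provider} is unavailable")
    return {"provider": provider, "output": "Hello!"}

def send_with_failover(request, providers):
    last_error = None
    for provider in providers:
        try:
            return call_provider(provider, request)
        except ProviderUnavailableError as err:
            last_error = err  # remember the failure, try the next provider
    raise last_error

print(send_with_failover({"messages": []}, ["provider_a", "provider_b"]))
# -> {'provider': 'provider_b', 'output': 'Hello!'}
The gateway runs this logic for you; you never write it yourself.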

3. Model Mapping

We maintain a detailed model mapping:
Your Request Model    OpenAI             Anthropic            Gemini          AWS Bedrock
gpt-4                 ✅ GPT-4
gpt-3.5-turbo         ✅ GPT-3.5 Turbo
claude-3-sonnet                          ✅ Claude 3 Sonnet                   ✅ Claude 3 Sonnet
gemini-pro                                                    ✅ Gemini Pro
auto                  Smart selection (see logic below)
Smart selection logic (sketched in code after this list):
  • If specific model requested, use that model
  • If auto or generic name, choose best based on current status
  • If primary provider quota exhausted, auto switch to backup
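A minimal sketch of what such a mapping can look like in code; the routes and fallback order below are illustrative assumptions, not the gateway's real table:
# Illustrative model-to-provider routes with a fallback order.
MODEL_ROUTES = {
    "gpt-4":           ["openai", "azure"],
    "claude-3-sonnet": ["anthropic", "aws"],
    "gemini-pro":      ["google"],
}

def route(model, quota_exhausted=frozenset()):
    if model == "auto":
        # Generic name: consider every provider that still has quota.
        candidates = [p for routes in MODEL_ROUTES.values() for p in routes]
    else:
        candidates = MODEL_ROUTES.get(model, [])
    for provider in candidates:
        if provider not in quota_exhausted:
            return provider
    raise LookupError(f"no provider available for {model!r}")

print(route("claude-3-sonnet"))                 # anthropic (primary)
print(route("claude-3-sonnet", {"anthropic"}))  # aws (backup)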

⚙️ Advanced Configuration

1. Preferred Provider

Specify preferred provider via request header:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Preferred-Provider: anthropic" \
  -d '{"model": "auto", "messages": [...]}'
Available providers:
  • openai - OpenAI
  • anthropic - Anthropic Claude
  • google - Google Gemini
  • azure - Azure OpenAI
  • aws - AWS Bedrock
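If you call the gateway through an SDK instead of curl, you can attach the same header per request. For example, with the OpenAI Python SDK (the header name comes from the docs above; extra_headers is standard SDK functionality):
from openai import OpenAI

client = OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1",
)

# Attach the routing header to a single request.
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Preferred-Provider": "anthropic"},
)
print(response.choices[0].message.content)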

2. Force Provider

Bypass smart routing and always use the specified provider:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Force-Provider: openai" \
  -d '{"model": "auto", "messages": [...]}'

3. Disable Format Conversion

Send a provider's native request format straight through, with conversion disabled:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Raw-Format: anthropic" \
  -d '{"model": "claude-3-sonnet", "messages": [...]}'

⚡ Performance Optimization

1. Near-Zero Conversion Overhead

Format conversion has minimal overhead:
  • Request conversion: < 1ms
  • Response conversion: < 1ms
  • Total latency increase: < 2ms

2. Smart Caching

Automatic caching across providers:
  • Cross-provider caching (format independent)
  • Smart cache key generation
  • Auto cache refresh
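"Format independent" means the cache key is derived from a normalized request rather than from the SDK wire format. A minimal sketch of that idea (the gateway's real normalization rules are internal):
import hashlib
import json

def cache_key(model, messages, **params):
    # Canonicalize the request so the same prompt maps to the same key
    # no matter which SDK format it arrived in.
    canonical = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = cache_key("gpt-4", [{"role": "user", "content": "Hello"}], max_tokens=100)
print(key[:16])  # identical normalized requests yield identical keys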

3. Connection Reuse

  • Long-lived connections
  • Connection pool management
  • Concurrent request optimization
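The gateway pools connections on its side; on your side, you get a similar benefit by creating one client or session and reusing it instead of reconnecting per call. A sketch using requests:
import requests

# One Session = one connection pool; reuse it across calls instead of
# opening a new TCP/TLS connection every time.
session = requests.Session()
session.headers.update({"Authorization": "Bearer your-pipellm-key"})

for prompt in ["Hello", "How are you?"]:
    resp = session.post(
        "https://api.pipellm.com/v1/chat/completions",
        json={"model": "auto", "messages": [{"role": "user", "content": prompt}]},
    )
    resp.raise_for_status()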

🛠️ Developer Tools

1. Debug Mode

Enable debug mode to view conversion details:
curl https://api.pipellm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Debug: true" \
  -d '{"model": "auto", "messages": [...]}'
Response includes:
{
  "_debug": {
    "original_request": {...},
    "converted_request": {...},
    "provider": "openai",
    "conversion_time_ms": 1.2,
    "cache_hit": false
  },
  "choices": [...]
}
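The same debug view from Python, using requests and the field names shown above:
import requests

resp = requests.post(
    "https://api.pipellm.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
        "X-Debug": "true",
    },
    json={"model": "auto", "messages": [{"role": "user", "content": "Hello"}]},
)
debug = resp.json().get("_debug", {})
print(debug.get("provider"), debug.get("conversion_time_ms"))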

2. Performance Monitoring

Monitor via dashboard:
  • Provider usage statistics
  • Conversion time analysis
  • Cache hit rate
  • Failover counts

📝 Usage Guidelines

1. Best Practices

Recommended:
  • Use the standard OpenAI format
  • Let the gateway auto-select the provider
  • Use caching appropriately
  • Implement retry logic (see the sketch after this list)
Not recommended:
  • Switching providers frequently
  • Disabling smart routing (unless necessary)
  • Ignoring error handling
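A minimal retry sketch with exponential backoff, using the OpenAI SDK against the gateway (openai.APIError is the SDK's base API error type):
import time
import openai

client = openai.OpenAI(
    api_key="your-pipellm-key",
    base_url="https://api.pipellm.com/v1",
)

def create_with_retry(max_attempts=3, **kwargs):
    # Back off 1s, 2s, 4s, ... between attempts.
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.APIError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)

response = create_with_retry(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
The OpenAI SDK also has a built-in max_retries client option if you prefer not to hand-roll this.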

2. Migration Guide

Step 1: Point your existing client at the gateway (a two-line change)
# Original code
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://api.openai.com/v1/",
)

# Modify to
client = OpenAI(
    api_key="YOUR_GATEWAY_KEY",
    base_url="https://api.pipellm.com/v1/",
)
Step 2: Test basic functionality
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "test"}]
)
print(response.choices[0].message.content)
Step 3: Optimize gradually
  • Adjust model selection based on needs
  • Enable batch requests
  • Configure monitoring alerts

🤝 Support

If you encounter format conversion issues:
  1. Enable debug mode to get detailed information
  2. Check request logs to confirm conversion
  3. Contact support with debug information

Tip: In most cases you don’t need to think about format conversion details at all. Our gateway handles everything automatically!