LLM Provider Guide
Complete guide to configuring, managing, and using Large Language Model providers in Fiberwise
Overview
Fiberwise provides a unified interface for working with multiple Large Language Model (LLM) providers. The LLM provider system abstracts away the differences between AI services, allowing developers to switch between providers or use several providers simultaneously.
Key Features:
- Multi-Provider Support: OpenAI, Anthropic, Google AI, Ollama, Hugging Face, OpenRouter, Cloudflare Workers AI, and custom providers
- Unified Interface: Consistent API regardless of underlying provider
- Response Standardization: Normalized responses across all providers
- Structured Output: JSON schema-based structured generation
- Configuration Management: Centralized provider configuration and credentials
- Fallback Support: Graceful handling when providers are unavailable
- Cost Optimization: Free tier options and intelligent model selection
- Edge Computing: Global deployment with Cloudflare Workers AI
Architecture
System Components
┌─────────────────────┐
│ Agent Layer         │ ← Agents use LLMProviderService
├─────────────────────┤
│ LLMProviderService  │ ← Main service interface
├─────────────────────┤
│ LLMServiceFactory   │ ← Creates provider-specific services
├─────────────────────┤
│ Provider Services   │ ← All supported providers
│ - OpenAIService     │
│ - AnthropicService  │
│ - GoogleAIService   │
│ - OllamaService     │
│ - HuggingFaceService│
│ - OpenRouterService │
│ - CloudflareService │
├─────────────────────┤
│ Database Layer      │ ← Provider configurations and credentials
│ - llm_providers     │
│ - provider_configs  │
└─────────────────────┘
Key Files
Component | File Path | Purpose |
---|---|---|
LLMProviderService | fiberwise-common/services/llm_provider_service.py | Main service interface and response standardization |
LLMServiceFactory | fiberwise-core-web/worker/llm_service.py | Factory for creating provider-specific services |
ProviderService | fiberwise-common/services/provider_service.py | Configuration management and database operations |
CLI Management | fiberwise/cli/account.py | Command-line provider configuration |
Supported Providers
OpenAI
Configuration
{
"provider_type": "openai",
"api_endpoint": "https://api.openai.com/v1",
"configuration": {
"api_key": "your-openai-api-key",
"default_model": "gpt-4",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- gpt-4: Most capable model
- gpt-4-turbo: Faster GPT-4 variant
- gpt-3.5-turbo: Fast and cost-effective
- gpt-4o: Multimodal capabilities
Anthropic
Configuration
{
"provider_type": "anthropic",
"api_endpoint": "https://api.anthropic.com/v1",
"configuration": {
"api_key": "your-anthropic-api-key",
"default_model": "claude-3-5-sonnet-20241022",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- claude-3-5-sonnet-20241022: Latest Claude 3.5 Sonnet
- claude-3-opus-20240229: Most capable Claude model
- claude-3-sonnet-20240229: Balanced performance
- claude-3-haiku-20240307: Fast and lightweight
Google AI
Configuration
{
"provider_type": "google",
"api_endpoint": "https://generativelanguage.googleapis.com/v1",
"configuration": {
"api_key": "your-google-ai-api-key",
"default_model": "gemini-1.5-pro",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- gemini-1.5-pro: Most capable Gemini model
- gemini-1.5-flash: Fast and efficient
- gemini-pro: Standard Gemini model
Ollama (Local)
Configuration
{
"provider_type": "ollama",
"api_endpoint": "http://localhost:11434/api",
"configuration": {
"default_model": "llama2",
"temperature": 0.7
}
}
Popular Models
- llama2: Meta's Llama 2 model
- mistral: Mistral 7B model
- codellama: Code-specialized Llama
- phi: Microsoft's Phi model
Hugging Face
Configuration
{
"provider_type": "huggingface",
"api_endpoint": "https://api-inference.huggingface.co",
"configuration": {
"api_key": "your-huggingface-api-key",
"default_model": "meta-llama/Llama-2-7b-chat-hf",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- meta-llama/Llama-2-7b-chat-hf: Llama 2 Chat model
- microsoft/DialoGPT-large: Conversational AI
- google/flan-t5-large: Instruction-tuned T5
- bigscience/bloom-7b1: Multilingual model
- sentence-transformers/all-MiniLM-L6-v2: Embeddings
Free Tier Available: Access 200,000+ models with generous free usage limits. Perfect for experimentation and development.
OpenRouter
Configuration
{
"provider_type": "openrouter",
"api_endpoint": "https://openrouter.ai/api/v1",
"configuration": {
"api_key": "your-openrouter-api-key",
"default_model": "meta-llama/llama-3.1-8b-instruct:free",
"site_url": "https://yourapp.com",
"app_name": "Your App Name",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- meta-llama/llama-3.1-8b-instruct:free: Free Llama 3.1
- anthropic/claude-3.5-sonnet: Latest Claude model
- openai/gpt-4o: GPT-4 Omni via OpenRouter
- google/gemini-pro: Gemini Pro via OpenRouter
- mistralai/mistral-7b-instruct: Cost-effective Mistral
Multi-Provider Access: 100+ models from 15+ providers through a unified API. Often 10-50% cheaper than direct provider APIs with intelligent routing and fallbacks.
Cloudflare Workers AI
Configuration
{
"provider_type": "cloudflare",
"api_endpoint": "https://api.cloudflare.com/client/v4",
"configuration": {
"api_key": "your-cloudflare-api-key",
"account_id": "your-cloudflare-account-id",
"default_model": "@cf/meta/llama-3.1-8b-instruct",
"temperature": 0.7,
"max_tokens": 2048
}
}
Popular Models
- @cf/meta/llama-3.1-8b-instruct: Llama 3.1 8B on edge
- @cf/microsoft/phi-2: Microsoft Phi-2 model
- @cf/mistral/mistral-7b-instruct-v0.1: Mistral 7B
- @cf/qwen/qwen1.5-7b-chat-awq: Qwen chat model
- @cf/baai/bge-base-en-v1.5: Text embeddings
Edge Computing: Models run on Cloudflare's global edge network for ultra-low latency. Generous free tier with 100,000 requests per day.
Configuration
Database Schema
-- Main provider configurations
CREATE TABLE llm_providers (
provider_id TEXT PRIMARY KEY,
name TEXT NOT NULL,
provider_type TEXT NOT NULL,
api_endpoint TEXT,
is_active BOOLEAN DEFAULT TRUE,
configuration TEXT NOT NULL, -- JSON configuration
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Provider defaults and capabilities
CREATE TABLE llm_provider_defaults (
default_id TEXT PRIMARY KEY,
provider_type TEXT NOT NULL UNIQUE,
default_name TEXT NOT NULL,
default_api_endpoint TEXT,
default_configuration TEXT DEFAULT '{}',
form_schema TEXT DEFAULT '{}',
supports_streaming INTEGER DEFAULT 0,
supports_functions INTEGER DEFAULT 0,
supports_tools INTEGER DEFAULT 0,
supports_vision INTEGER DEFAULT 0
);
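To make the relationship between a table row and its JSON configuration concrete, the sketch below creates the llm_providers table in an in-memory SQLite database, stores one provider, and reads it back. SQLite, the placeholder API key, and the use of an integer for is_active are assumptions made for illustration; the database backend Fiberwise actually uses may differ.

```python
import json
import sqlite3

# In-memory SQLite is an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    """
    CREATE TABLE llm_providers (
        provider_id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        provider_type TEXT NOT NULL,
        api_endpoint TEXT,
        is_active INTEGER DEFAULT 1,      -- SQLite stores booleans as integers
        configuration TEXT NOT NULL,      -- JSON configuration
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# The configuration column holds the provider settings serialized as JSON.
config = {"api_key": "sk-placeholder", "default_model": "gpt-4", "temperature": 0.7}
conn.execute(
    "INSERT INTO llm_providers "
    "(provider_id, name, provider_type, api_endpoint, configuration) "
    "VALUES (?, ?, ?, ?, ?)",
    (
        "my-openai-provider",
        "My OpenAI Configuration",
        "openai",
        "https://api.openai.com/v1",
        json.dumps(config),
    ),
)

# Read the row back and decode the JSON column into a dictionary.
row = conn.execute(
    "SELECT provider_id, provider_type, configuration "
    "FROM llm_providers WHERE provider_id = ? AND is_active = 1",
    ("my-openai-provider",),
).fetchone()
provider = dict(row)
provider["configuration"] = json.loads(provider["configuration"])
print(provider["configuration"]["default_model"])
```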
Configuration Structure
{
"provider_id": "my-openai-provider",
"name": "My OpenAI Configuration",
"provider_type": "openai",
"api_endpoint": "https://api.openai.com/v1",
"is_active": true,
"configuration": {
"api_key": "sk-...",
"default_model": "gpt-4",
"temperature": 0.7,
"max_tokens": 2048,
"custom_settings": {
"timeout": 30,
"retry_attempts": 3
}
}
}
Provider Service
Core Methods
get_provider_by_id()
async def get_provider_by_id(self, provider_id: str) -> Optional[Dict[str, Any]]:
    """
    Get provider details from database by provider_id

    Returns:
        Provider details including configuration and credentials
    """
    query = """
        SELECT provider_id, name, provider_type, api_endpoint, configuration
        FROM llm_providers
        WHERE provider_id = ? AND is_active = true
    """
    provider = await self.db.fetchrow(query, provider_id)
    if provider:
        provider_dict = dict(provider)
        # Parse JSON configuration
        if isinstance(provider_dict.get('configuration'), str):
            provider_dict['configuration'] = json.loads(provider_dict['configuration'])
        return provider_dict
    return None
execute_llm_request()
async def execute_llm_request(
    self,
    provider_id: str,
    prompt: str,
    model_id: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    output_schema: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Execute an LLM request with the specified provider

    Args:
        provider_id: The ID of the provider to use
        prompt: The text prompt to send
        model_id: Optional model ID (overrides provider default)
        temperature: Optional temperature setting
        max_tokens: Optional max tokens setting
        output_schema: Optional schema for structured output
    Returns:
        Standardized response with model output
    """
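The docstring states that model_id overrides the provider default. The sketch below shows one way such precedence could be resolved. The helper resolve_request_params is hypothetical and not part of Fiberwise, and applying the same fallback to temperature and max_tokens is an assumption.

```python
from typing import Any, Dict, Optional


def resolve_request_params(
    provider_config: Dict[str, Any],
    model_id: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: per-request values win over provider defaults."""
    params = {
        "model": model_id if model_id is not None
        else provider_config.get("default_model"),
        "temperature": temperature if temperature is not None
        else provider_config.get("temperature"),
        "max_tokens": max_tokens if max_tokens is not None
        else provider_config.get("max_tokens"),
    }
    # Drop settings that neither the request nor the provider specifies.
    return {key: value for key, value in params.items() if value is not None}


provider_config = {"default_model": "gpt-4", "temperature": 0.7, "max_tokens": 2048}
print(resolve_request_params(provider_config, model_id="gpt-4o", temperature=0))
```

Comparing against None rather than relying on truthiness keeps an explicit temperature of 0 from being replaced by the provider default.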
generate_structured_output()
async def generate_structured_output(
    self,
    prompt: str,
    schema: Dict[str, Any],
    provider_id: str = "default-openai",
    **kwargs
) -> Dict[str, Any]:
    """
    Generate structured output using JSON schema

    Args:
        prompt: The text prompt to send to the model
        schema: JSON schema definition for expected output
        provider_id: The ID of the provider to use
        **kwargs: Additional parameters
    Returns:
        Dictionary containing structured data if successful
    """
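Models often wrap JSON in surrounding prose, so structured generation usually needs a parsing step. The sketch below illustrates one possible approach: it extracts the outermost JSON object from the model text and checks that every property named in the schema is present. The helper parse_structured_text is hypothetical, and this is not necessarily how Fiberwise handles it internally; the "status" and "data" keys mirror the response fields used in the agent examples in this guide.

```python
import json
from typing import Any, Dict


def parse_structured_text(text: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: pull a JSON object out of model text and check its keys."""
    # Locate the outermost braces; models often add prose around the JSON.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return {"status": "failed", "error": "No JSON object found in model output"}
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        return {"status": "failed", "error": f"Invalid JSON: {exc}"}
    # Only checks that declared properties exist; it does not validate types.
    missing = [key for key in schema.get("properties", {}) if key not in data]
    if missing:
        return {"status": "failed", "error": f"Missing keys: {missing}"}
    return {"status": "completed", "data": data}


schema = {"properties": {"sentiment": {"type": "string"}, "confidence": {"type": "number"}}}
result = parse_structured_text(
    'Here is the analysis: {"sentiment": "positive", "confidence": 0.9}', schema
)
print(result["status"], result.get("data"))
```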
LLM Service Layer
Factory Pattern
from typing import Optional

class LLMServiceFactory:
    """Factory for creating LLM services based on provider type"""

    @staticmethod
    def create_service(
        provider_type: str,
        api_key: Optional[str] = None,
        api_endpoint: Optional[str] = None,
        account_id: Optional[str] = None,  # Cloudflare account ID; only used for "cloudflare"
    ):
        """Create an LLM service based on provider type"""
        if provider_type == "openai":
            return OpenAIService(api_key, api_endpoint or "https://api.openai.com/v1")
        elif provider_type == "anthropic":
            return AnthropicService(api_key, api_endpoint or "https://api.anthropic.com/v1")
        elif provider_type == "google":
            return GoogleAIService(api_key, api_endpoint or "https://generativelanguage.googleapis.com/v1")
        elif provider_type == "ollama":
            # Ollama runs locally and needs no API key
            return OllamaService(api_endpoint or "http://localhost:11434/api")
        elif provider_type == "huggingface":
            return HuggingFaceService(api_key, api_endpoint or "https://api-inference.huggingface.co")
        elif provider_type == "openrouter":
            return OpenRouterService(api_key, api_endpoint or "https://openrouter.ai/api/v1")
        elif provider_type == "cloudflare":
            return CloudflareWorkersAIService(api_key, account_id, api_endpoint or "https://api.cloudflare.com/client/v4")
        else:
            raise ValueError(f"Unsupported provider type: {provider_type}")
Base Service Interface
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseLLMService(ABC):
    """Base class for LLM service providers"""

    @abstractmethod
    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        """Generate a completion from the LLM provider"""
        pass

    @abstractmethod
    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        """Generate embeddings from the LLM provider"""
        pass
Response Standardization
@staticmethod
def standardize_response(raw_response, provider_type, model, output_schema=None):
    """
    Standardize responses from different providers into consistent format

    Handles:
    - OpenAI: choices[0].message.content
    - Anthropic: content[0].text
    - Google: candidates[0].content.parts[0].text
    - Ollama: response
    - Hugging Face: [0].generated_text or generated_text
    - OpenRouter: choices[0].message.content (OpenAI-compatible)
    - Cloudflare: result.response

    Returns standardized format:
    {
        "text": "Generated text",
        "model": "model-name",
        "provider": "provider-type",
        "finish_reason": "stop",
        "structured_data": {...}  # If schema provided
    }
    """
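The response paths listed in the docstring can be sketched as a single extraction function. This is a minimal illustration that assumes each raw response has already been parsed from JSON into Python dictionaries and lists. The function name extract_text is hypothetical, and the real standardize_response may handle additional fields and error cases.

```python
from typing import Any


def extract_text(raw: Any, provider_type: str) -> str:
    """Hypothetical helper: read generated text using the documented paths."""
    if provider_type in ("openai", "openrouter"):
        # OpenRouter responses are OpenAI-compatible.
        return raw["choices"][0]["message"]["content"]
    if provider_type == "anthropic":
        return raw["content"][0]["text"]
    if provider_type == "google":
        return raw["candidates"][0]["content"]["parts"][0]["text"]
    if provider_type == "ollama":
        return raw["response"]
    if provider_type == "huggingface":
        # Hugging Face may return a list of results or a single object.
        item = raw[0] if isinstance(raw, list) else raw
        return item["generated_text"]
    if provider_type == "cloudflare":
        return raw["result"]["response"]
    raise ValueError(f"Unsupported provider type: {provider_type}")


anthropic_raw = {"content": [{"type": "text", "text": "Hello from Claude"}]}
print(extract_text(anthropic_raw, "anthropic"))
```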
Agent Integration
Using LLM Provider in Agents
from typing import Any, Dict

from fiberwise import FIberwise, BaseAgent
from worker.llm_provider_service import LLMProviderService

class ChatAgent(BaseAgent):
    async def run_agent(
        self,
        input_data: Dict[str, Any],
        fiber: FIberwise,
        llm_provider_service: LLMProviderService
    ) -> Dict[str, Any]:
        """Agent with LLM provider service dependency injection"""
        user_message = input_data.get('message', '')
        # Basic completion
        response = await llm_provider_service.execute_llm_request(
            provider_id="default-openai",
            prompt=f"User says: {user_message}. Please respond helpfully.",
            temperature=0.7,
            max_tokens=1000
        )
        if response.get("status") == "completed":
            return {
                "status": "success",
                "text": response["text"],
                "model": response["model"],
                "provider": response["provider"]
            }
        else:
            return {
                "status": "error",
                "error": response.get("error", "Unknown error")
            }
Structured Output Example
class AnalysisAgent(BaseAgent):
    async def run_agent(self, input_data, fiber, llm_provider_service):
        text_to_analyze = input_data.get('text', '')
        # Define schema for structured output
        analysis_schema = {
            "type": "object",  # JSON Schema type for a keyed structure
            "properties": {
                "sentiment": {"type": "string"},
                "confidence": {"type": "number"},
                "key_points": {"type": "array", "items": {"type": "string"}},
                "summary": {"type": "string"}
            }
        }
        # Generate structured analysis
        result = await llm_provider_service.generate_structured_output(
            prompt=f"Analyze this text and provide sentiment, confidence, key points, and summary: {text_to_analyze}",
            schema=analysis_schema,
            provider_id="claude-3-sonnet"
        )
        if result.get("status") == "completed":
            return {
                "status": "success",
                "analysis": result["data"],
                "raw_text": result.get("text", "")
            }
        else:
            return {"status": "error", "error": result.get("error")}
CLI Management
Adding Providers
# Add OpenAI provider
fiber account add-provider \
--provider openai \
--api-key "your-openai-api-key" \
--model gpt-4 \
--set-default
# Add Anthropic provider
fiber account add-provider \
--provider anthropic \
--api-key "your-anthropic-api-key" \
--model claude-3-5-sonnet-20241022
# Add Google AI provider
fiber account add-provider \
--provider google \
--api-key "your-google-ai-api-key" \
--model gemini-1.5-pro
# Add Ollama (local) provider
fiber account add-provider \
--provider ollama \
--api-endpoint "http://localhost:11434/api" \
--model llama2
# Add Hugging Face provider
fiber account add-provider \
--provider huggingface \
--api-key "your-huggingface-api-key" \
--model "meta-llama/Llama-2-7b-chat-hf"
# Add OpenRouter provider (with free model)
fiber account add-provider \
--provider openrouter \
--api-key "your-openrouter-api-key" \
--model "meta-llama/llama-3.1-8b-instruct:free" \
--site-url "https://yourapp.com" \
--app-name "Your App Name"
# Add Cloudflare Workers AI provider
fiber account add-provider \
--provider cloudflare \
--api-key "your-cloudflare-api-key" \
--account-id "your-account-id" \
--model "@cf/meta/llama-3.1-8b-instruct"
Managing Providers
# List all providers
fiber account list-providers
# List providers by type
fiber account provider list --type openai
# Set default provider
fiber account provider default "My OpenAI Provider"
# Import providers from app
fiber account import-providers --app-id app-123xyz
API Usage
Direct Service Usage
from fiberwise_common.services.llm_provider_service import LLMProviderService
from fiberwise_common.database.factory import get_database_provider
# Initialize service (settings is your application's settings object)
db_provider = get_database_provider(settings)
llm_service = LLMProviderService(db_provider)
# Simple completion (await must run inside an async function)
response = await llm_service.execute_llm_request(
    provider_id="my-openai-provider",
    prompt="What is the capital of France?",
    temperature=0.3
)
print(response["text"])  # e.g. "The capital of France is Paris."
Batch Processing
async def process_multiple_prompts(prompts, provider_id="default-openai"):
    results = []
    for prompt in prompts:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt=prompt,
            temperature=0.7,
            max_tokens=500
        )
        if response.get("status") == "completed":
            results.append({
                "prompt": prompt,
                "response": response["text"],
                "model": response["model"]
            })
        else:
            results.append({
                "prompt": prompt,
                "error": response.get("error")
            })
    return results
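The loop above sends prompts one at a time. When the provider tolerates parallel requests, a concurrent variant can reduce total latency. The sketch below is one possible approach using asyncio.gather with a semaphore to cap in-flight requests. The function name process_prompts_concurrently and the max_concurrency parameter are hypothetical, and the service is passed in explicitly; a suitable concurrency limit depends on the provider's rate limits.

```python
import asyncio
from typing import Any, Dict, List


async def process_prompts_concurrently(
    llm_service: Any,
    prompts: List[str],
    provider_id: str = "default-openai",
    max_concurrency: int = 5,
) -> List[Dict[str, Any]]:
    """Run prompts in parallel, capped at max_concurrency in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> Dict[str, Any]:
        # The semaphore blocks here once max_concurrency requests are active.
        async with semaphore:
            response = await llm_service.execute_llm_request(
                provider_id=provider_id, prompt=prompt
            )
        if response.get("status") == "completed":
            return {"prompt": prompt, "response": response["text"]}
        return {"prompt": prompt, "error": response.get("error")}

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(run_one(p) for p in prompts))
```

Inside an async function this could be called as `results = await process_prompts_concurrently(llm_service, prompts)`.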
Advanced Features
Custom Providers
# Adding custom OpenAI-compatible provider
{
"provider_type": "custom-openai",
"api_endpoint": "https://api.your-custom-provider.com/v1",
"configuration": {
"api_key": "your-custom-api-key",
"default_model": "custom-model-name",
"temperature": 0.7,
"max_tokens": 2048
}
}
Fallback Providers
import logging

logger = logging.getLogger(__name__)

# Assumes llm_service was initialized as shown under Direct Service Usage
async def robust_llm_request(prompt, primary_provider, fallback_provider):
    """Try primary provider, fallback to secondary if it fails"""
    try:
        response = await llm_service.execute_llm_request(
            provider_id=primary_provider,
            prompt=prompt
        )
        if response.get("status") == "completed":
            return response
        # A non-completed status also triggers the fallback below
        logger.warning(f"Primary provider returned an error: {response.get('error')}")
    except Exception as e:
        logger.warning(f"Primary provider failed: {e}")
    # Fallback to secondary provider
    try:
        response = await llm_service.execute_llm_request(
            provider_id=fallback_provider,
            prompt=prompt
        )
        response["used_fallback"] = True
        return response
    except Exception as e:
        logger.error(f"Both providers failed: {e}")
        return {"status": "failed", "error": "All providers unavailable"}
Response Caching
import hashlib
import json
import time

class CachedLLMService:
    def __init__(self, llm_service, cache_ttl=3600):
        self.llm_service = llm_service
        self.cache = {}  # cache_key -> (response, timestamp)
        self.cache_ttl = cache_ttl  # seconds

    def _get_cache_key(self, provider_id, prompt, **kwargs):
        """Generate cache key from request parameters"""
        key_data = f"{provider_id}:{prompt}:{json.dumps(kwargs, sort_keys=True)}"
        return hashlib.md5(key_data.encode()).hexdigest()

    async def execute_llm_request(self, provider_id, prompt, **kwargs):
        cache_key = self._get_cache_key(provider_id, prompt, **kwargs)
        # Check cache
        if cache_key in self.cache:
            cached_response, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                # Return a copy so the stored entry is not modified
                return {**cached_response, "from_cache": True}
            # Entry has expired; remove it
            del self.cache[cache_key]
        # Make request
        response = await self.llm_service.execute_llm_request(
            provider_id, prompt, **kwargs
        )
        # Cache successful responses
        if response.get("status") == "completed":
            self.cache[cache_key] = (response, time.time())
        return response
Provider Comparison & Use Cases
Cost-Effectiveness
Provider | Free Tier | Cost Level | Best For |
---|---|---|---|
Hugging Face | Generous free limits | Free to Low | Experimentation, research, development |
OpenRouter | Free models available | Free to Medium | Multi-model access, cost optimization |
Cloudflare Workers AI | 100k requests/day | Free to Low | Edge computing, low latency, global scale |
Ollama | Completely free | Free (self-hosted) | Privacy, offline usage, full control |
OpenAI | Pay-per-use | Medium to High | Highest quality, latest features |
Anthropic | Pay-per-use | Medium to High | Safety, reasoning, analysis |
Google AI | Limited free tier | Low to Medium | Fast inference, good value |
Development Workflow
# Recommended workflow for new projects:
# 1. Start with free providers for prototyping
fiber account add-provider --provider huggingface --api-key "hf_xxx" --model "meta-llama/Llama-2-7b-chat-hf"
fiber account add-provider --provider openrouter --api-key "sk-or-xxx" --model "meta-llama/llama-3.1-8b-instruct:free"
# 2. Add edge computing for production
fiber account add-provider --provider cloudflare --api-key "xxx" --account-id "xxx" --model "@cf/meta/llama-3.1-8b-instruct"
# 3. Add premium providers for quality-critical features
fiber account add-provider --provider openai --api-key "sk-xxx" --model "gpt-4"
fiber account add-provider --provider anthropic --api-key "sk-ant-xxx" --model "claude-3-5-sonnet-20241022"
Use Case Recommendations
Rapid Prototyping
Best Providers: Hugging Face, OpenRouter (free models)
- No cost barrier to experimentation
- Wide variety of models to test
- Quick setup and iteration
Global Production Apps
Best Providers: Cloudflare Workers AI, OpenRouter
- Ultra-low latency via edge computing
- Automatic failover and load balancing
- Global availability and scaling
Cost-Sensitive Applications
Best Providers: OpenRouter, Hugging Face, Cloudflare
- Competitive pricing and free tiers
- Intelligent model selection for cost optimization
- Volume discounts and efficient routing
Privacy & Security
Best Providers: Ollama (local), Cloudflare (edge)
- With a local Ollama deployment, data never leaves your infrastructure
- Full control over model and data
- Compliance with strict privacy requirements
Premium Quality Applications
Best Providers: OpenAI, Anthropic
- State-of-the-art model capabilities
- Latest features and improvements
- Reliable performance for critical tasks
Troubleshooting
Common Issues
Provider Not Found
{"status": "failed", "error": "Provider my-provider not found or not active"}
Solutions:
- Check provider ID spelling
- Verify provider is active in database
- List providers: fiber account list-providers
API Key Invalid
{"status": "failed", "error": "Authentication failed"}
Solutions:
- Verify API key is correct and active
- Check API key has required permissions
- Ensure no extra spaces in API key
Model Not Available
{"status": "failed", "error": "Model gpt-5 not found"}
Solutions:
- Check model name spelling
- Verify model is available for your API key
- Use provider's model listing API
Rate Limit Exceeded
{"status": "failed", "error": "Rate limit exceeded"}
Solutions:
- Implement exponential backoff
- Use multiple API keys
- Switch to different provider
- Upgrade API plan
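The first suggestion, exponential backoff, could be sketched as follows. The helper with_backoff is hypothetical. Detecting a rate limit by looking for "rate limit" in the error string is an assumption based on the sample error above; a real integration might check an error code or HTTP status instead.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict


async def with_backoff(
    make_request: Callable[[], Awaitable[Dict[str, Any]]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
) -> Dict[str, Any]:
    """Retry a request, doubling the wait after each rate-limited attempt."""
    response: Dict[str, Any] = {"status": "failed", "error": "No attempts made"}
    for attempt in range(max_attempts):
        response = await make_request()
        # Assumption: rate limiting is reported through the error message text.
        rate_limited = "rate limit" in str(response.get("error", "")).lower()
        if response.get("status") == "completed" or not rate_limited:
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay * 2^attempt seconds, plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
    return response
```

Inside an async function this could be used as `await with_backoff(lambda: llm_service.execute_llm_request(provider_id="my-provider", prompt="test"))`. The jitter spreads out retries so that several clients hitting the same limit do not retry in lockstep.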
Debugging Tips
# Enable debug logging
import logging
logging.getLogger('fiberwise_common.services.llm_provider_service').setLevel(logging.DEBUG)
# Check raw response
response = await llm_service.execute_llm_request(
    provider_id="my-provider",
    prompt="test"
)
print("Raw response:", response.get("_raw_response"))
# Test provider connection
provider = await llm_service.get_provider_by_id("my-provider")
if provider:
    print("Provider config:", provider["configuration"])
else:
    print("Provider not found")
Health Checks
async def health_check_provider(provider_id):
    """Test if a provider is working correctly"""
    try:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt="Say 'OK' if you can respond",
            max_tokens=10,
            temperature=0
        )
        if response.get("status") == "completed":
            return {
                "provider_id": provider_id,
                "status": "healthy",
                "response_time": response.get("response_time"),
                "model": response.get("model")
            }
        else:
            return {
                "provider_id": provider_id,
                "status": "unhealthy",
                "error": response.get("error")
            }
    except Exception as e:
        return {
            "provider_id": provider_id,
            "status": "error",
            "error": str(e)
        }