LLM Provider Guide

Complete guide to configuring, managing, and using Large Language Model providers in Fiberwise

Overview
Architecture
Supported Providers
Configuration
Provider Service
LLM Service Layer
Agent Integration
CLI Management
API Usage
Advanced Features
Provider Comparison & Use Cases
Troubleshooting

Overview

Fiberwise provides a unified interface for working with multiple Large Language Model (LLM) providers. The LLM provider system abstracts away the differences between AI services, so developers can switch between providers, or use several at once, without changing calling code.

Key Features:

Multi-Provider Support: OpenAI, Anthropic, Google AI, Ollama, Hugging Face, OpenRouter, Cloudflare Workers AI, and custom providers
Unified Interface: Consistent API regardless of underlying provider
Response Standardization: Normalized responses across all providers
Structured Output: JSON schema-based structured generation
Configuration Management: Centralized provider configuration and credentials
Fallback Support: Graceful handling when providers are unavailable
Cost Optimization: Free tier options and intelligent model selection
Edge Computing: Global deployment with Cloudflare Workers AI

Architecture

System Components

┌─────────────────────┐
│    Agent Layer      │  ← Agents use LLMProviderService
├─────────────────────┤
│ LLMProviderService  │  ← Main service interface
├─────────────────────┤
│ LLMServiceFactory   │  ← Creates provider-specific services
├─────────────────────┤
│ Provider Services   │  ← All supported providers
│ - OpenAIService     │
│ - AnthropicService  │
│ - GoogleAIService   │
│ - OllamaService     │
│ - HuggingFaceService│
│ - OpenRouterService │
│ - CloudflareService │
├─────────────────────┤
│ Database Layer      │  ← Provider configurations and credentials
│ - llm_providers     │
│ - provider_configs  │
└─────────────────────┘

Key Files

Component	File Path	Purpose
LLMProviderService	`fiberwise-common/services/llm_provider_service.py`	Main service interface and response standardization
LLMServiceFactory	`fiberwise-core-web/worker/llm_service.py`	Factory for creating provider-specific services
ProviderService	`fiberwise-common/services/provider_service.py`	Configuration management and database operations
CLI Management	`fiberwise/cli/account.py`	Command-line provider configuration

Supported Providers

OpenAI

Configuration

{
  "provider_type": "openai",
  "api_endpoint": "https://api.openai.com/v1",
  "configuration": {
    "api_key": "your-openai-api-key",
    "default_model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

gpt-4 - Most capable model
gpt-4-turbo - Faster GPT-4 variant
gpt-3.5-turbo - Fast and cost-effective
gpt-4o - Multimodal capabilities

Anthropic

Configuration

{
  "provider_type": "anthropic",
  "api_endpoint": "https://api.anthropic.com/v1",
  "configuration": {
    "api_key": "your-anthropic-api-key",
    "default_model": "claude-3-5-sonnet-20241022",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

claude-3-5-sonnet-20241022 - Latest Claude 3.5 Sonnet
claude-3-opus-20240229 - Most capable Claude model
claude-3-sonnet-20240229 - Balanced performance
claude-3-haiku-20240307 - Fast and lightweight

Google AI

Configuration

{
  "provider_type": "google",
  "api_endpoint": "https://generativelanguage.googleapis.com/v1",
  "configuration": {
    "api_key": "your-google-ai-api-key",
    "default_model": "gemini-1.5-pro",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

gemini-1.5-pro - Most capable Gemini model
gemini-1.5-flash - Fast and efficient
gemini-pro - Standard Gemini model

Ollama (Local)

Configuration

{
  "provider_type": "ollama",
  "api_endpoint": "http://localhost:11434/api",
  "configuration": {
    "default_model": "llama2",
    "temperature": 0.7
  }
}

Popular Models

llama2 - Meta's Llama 2 model
mistral - Mistral 7B model
codellama - Code-specialized Llama
phi - Microsoft's Phi model

Hugging Face 🤗

Configuration

{
  "provider_type": "huggingface",
  "api_endpoint": "https://api-inference.huggingface.co",
  "configuration": {
    "api_key": "your-huggingface-api-key",
    "default_model": "meta-llama/Llama-2-7b-chat-hf",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

meta-llama/Llama-2-7b-chat-hf - Llama 2 Chat model
microsoft/DialoGPT-large - Conversational AI
google/flan-t5-large - Instruction-tuned T5
bigscience/bloom-7b1 - Multilingual model
sentence-transformers/all-MiniLM-L6-v2 - Embeddings

Free Tier Available: Access 200,000+ models with generous free usage limits. Perfect for experimentation and development.

OpenRouter

Configuration

{
  "provider_type": "openrouter",
  "api_endpoint": "https://openrouter.ai/api/v1",
  "configuration": {
    "api_key": "your-openrouter-api-key",
    "default_model": "meta-llama/llama-3.1-8b-instruct:free",
    "site_url": "https://yourapp.com",
    "app_name": "Your App Name",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

meta-llama/llama-3.1-8b-instruct:free - Free Llama 3.1
anthropic/claude-3.5-sonnet - Latest Claude model
openai/gpt-4o - GPT-4 Omni via OpenRouter
google/gemini-pro - Gemini Pro via OpenRouter
mistralai/mistral-7b-instruct - Cost-effective Mistral

Multi-Provider Access: 100+ models from 15+ providers through a unified API. Often 10-50% cheaper than direct provider APIs with intelligent routing and fallbacks.

Cloudflare Workers AI

Configuration

{
  "provider_type": "cloudflare",
  "api_endpoint": "https://api.cloudflare.com/client/v4",
  "configuration": {
    "api_key": "your-cloudflare-api-key",
    "account_id": "your-cloudflare-account-id",
    "default_model": "@cf/meta/llama-3.1-8b-instruct",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

@cf/meta/llama-3.1-8b-instruct - Llama 3.1 8B on edge
@cf/microsoft/phi-2 - Microsoft Phi-2 model
@cf/mistral/mistral-7b-instruct-v0.1 - Mistral 7B
@cf/qwen/qwen1.5-7b-chat-awq - Qwen chat model
@cf/baai/bge-base-en-v1.5 - Text embeddings

Edge Computing: Models run on Cloudflare's global edge network for ultra-low latency. Generous free tier with 100,000 requests per day.

Configuration

Database Schema

-- Main provider configurations
CREATE TABLE llm_providers (
    provider_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    provider_type TEXT NOT NULL,
    api_endpoint TEXT,
    is_active BOOLEAN DEFAULT TRUE,
    configuration TEXT NOT NULL, -- JSON configuration
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Provider defaults and capabilities
CREATE TABLE llm_provider_defaults (
    default_id TEXT PRIMARY KEY,
    provider_type TEXT NOT NULL UNIQUE,
    default_name TEXT NOT NULL,
    default_api_endpoint TEXT,
    default_configuration TEXT DEFAULT '{}',
    form_schema TEXT DEFAULT '{}',
    supports_streaming INTEGER DEFAULT 0,
    supports_functions INTEGER DEFAULT 0,
    supports_tools INTEGER DEFAULT 0,
    supports_vision INTEGER DEFAULT 0
);

Configuration Structure

{
  "provider_id": "my-openai-provider",
  "name": "My OpenAI Configuration",
  "provider_type": "openai",
  "api_endpoint": "https://api.openai.com/v1",
  "is_active": true,
  "configuration": {
    "api_key": "sk-...",
    "default_model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 2048,
    "custom_settings": {
      "timeout": 30,
      "retry_attempts": 3
    }
  }
}
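As a minimal sketch of how the configuration structure above might map onto the llm_providers table, the following uses Python's built-in sqlite3 module with an in-memory database. The helpers save_provider and load_provider are hypothetical names for illustration and are not part of Fiberwise; the real ProviderService may store and read rows differently. The TRUE keyword in the schema assumes SQLite 3.23 or newer.

```python
import json
import sqlite3
from typing import Any, Dict, Optional

# Schema copied from the "Database Schema" section
SCHEMA = """
CREATE TABLE llm_providers (
    provider_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    provider_type TEXT NOT NULL,
    api_endpoint TEXT,
    is_active BOOLEAN DEFAULT TRUE,
    configuration TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def save_provider(conn: sqlite3.Connection, provider: Dict[str, Any]) -> None:
    """Hypothetical helper: store a provider, serializing `configuration` to JSON text."""
    conn.execute(
        "INSERT INTO llm_providers "
        "(provider_id, name, provider_type, api_endpoint, is_active, configuration) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            provider["provider_id"],
            provider["name"],
            provider["provider_type"],
            provider.get("api_endpoint"),
            int(provider.get("is_active", True)),  # SQLite stores booleans as 0/1
            json.dumps(provider["configuration"]),
        ),
    )


def load_provider(conn: sqlite3.Connection, provider_id: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: fetch an active provider and parse its JSON configuration."""
    row = conn.execute(
        "SELECT provider_id, name, provider_type, api_endpoint, configuration "
        "FROM llm_providers WHERE provider_id = ? AND is_active = 1",
        (provider_id,),
    ).fetchone()
    if row is None:
        return None
    provider = dict(row)
    provider["configuration"] = json.loads(provider["configuration"])
    return provider


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be converted to dicts by column name
conn.execute(SCHEMA)

save_provider(conn, {
    "provider_id": "my-openai-provider",
    "name": "My OpenAI Configuration",
    "provider_type": "openai",
    "api_endpoint": "https://api.openai.com/v1",
    "is_active": True,
    "configuration": {"api_key": "sk-...", "default_model": "gpt-4",
                      "temperature": 0.7, "max_tokens": 2048},
})
loaded = load_provider(conn, "my-openai-provider")
```

Because the configuration column is plain TEXT, the nested settings round-trip through json.dumps and json.loads, which mirrors the JSON parsing step shown in get_provider_by_id().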

Provider Service

Core Methods

get_provider_by_id()

async def get_provider_by_id(self, provider_id: str) -> Optional[Dict[str, Any]]:
    """
    Get provider details from database by provider_id

    Returns:
        Provider details including configuration and credentials
    """
    query = """
        SELECT provider_id, name, provider_type, api_endpoint, configuration
        FROM llm_providers
        WHERE provider_id = ? AND is_active = true
    """

    provider = await self.db.fetchrow(query, provider_id)

    if provider:
        provider_dict = dict(provider)
        # Parse JSON configuration
        if isinstance(provider_dict.get('configuration'), str):
            provider_dict['configuration'] = json.loads(provider_dict['configuration'])
        return provider_dict

    return None

execute_llm_request()

async def execute_llm_request(
    self,
    provider_id: str,
    prompt: str,
    model_id: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    output_schema: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Execute an LLM request with the specified provider

    Args:
        provider_id: The ID of the provider to use
        prompt: The text prompt to send
        model_id: Optional model ID (overrides provider default)
        temperature: Optional temperature setting
        max_tokens: Optional max tokens setting
        output_schema: Optional schema for structured output

    Returns:
        Standardized response with model output
    """

generate_structured_output()

async def generate_structured_output(
    self,
    prompt: str,
    schema: Dict[str, Any],
    provider_id: str = "default-openai",
    **kwargs
) -> Dict[str, Any]:
    """
    Generate structured output using JSON schema

    Args:
        prompt: The text prompt to send to the model
        schema: JSON schema definition for expected output
        provider_id: The ID of the provider to use
        **kwargs: Additional parameters

    Returns:
        Dictionary containing structured data if successful
    """

LLM Service Layer

Factory Pattern

class LLMServiceFactory:
    """Factory for creating LLM services based on provider type"""

    @staticmethod
    def create_service(provider_type: str, api_key: str = None, api_endpoint: str = None, account_id: str = None):
        """Create an LLM service based on provider type"""
        if provider_type == "openai":
            return OpenAIService(api_key, api_endpoint or "https://api.openai.com/v1")
        elif provider_type == "anthropic":
            return AnthropicService(api_key, api_endpoint or "https://api.anthropic.com/v1")
        elif provider_type == "google":
            return GoogleAIService(api_key, api_endpoint or "https://generativelanguage.googleapis.com/v1")
        elif provider_type == "ollama":
            return OllamaService(api_endpoint or "http://localhost:11434/api")
        elif provider_type == "huggingface":
            return HuggingFaceService(api_key, api_endpoint or "https://api-inference.huggingface.co")
        elif provider_type == "openrouter":
            return OpenRouterService(api_key, api_endpoint or "https://openrouter.ai/api/v1")
        elif provider_type == "cloudflare":
            return CloudflareWorkersAIService(api_key, account_id, api_endpoint or "https://api.cloudflare.com/client/v4")
        else:
            raise ValueError(f"Unsupported provider type: {provider_type}")
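As a design note, the same dispatch can be written as a lookup table instead of an if/elif chain, which keeps each provider's default endpoint next to its class. The sketch below is an alternative illustration, not the Fiberwise implementation: StubService stands in for the real service classes, and only three providers are listed to keep it short.

```python
from typing import Dict, Optional, Tuple, Type


class StubService:
    """Stand-in for a provider service such as OpenAIService (illustration only)."""

    def __init__(self, api_key: Optional[str], api_endpoint: str):
        self.api_key = api_key
        self.api_endpoint = api_endpoint


# provider_type -> (service class, default endpoint); endpoints copied from the factory above
_REGISTRY: Dict[str, Tuple[Type[StubService], str]] = {
    "openai": (StubService, "https://api.openai.com/v1"),
    "anthropic": (StubService, "https://api.anthropic.com/v1"),
    "huggingface": (StubService, "https://api-inference.huggingface.co"),
}


def create_service(
    provider_type: str,
    api_key: Optional[str] = None,
    api_endpoint: Optional[str] = None,
) -> StubService:
    """Look up the service class and fall back to its default endpoint."""
    try:
        service_cls, default_endpoint = _REGISTRY[provider_type]
    except KeyError:
        raise ValueError(f"Unsupported provider type: {provider_type}") from None
    return service_cls(api_key, api_endpoint or default_endpoint)


service = create_service("openai", api_key="your-openai-api-key")
```

Providers with a different constructor signature, such as Ollama (no API key) or Cloudflare (extra account ID), would need a small adapter function in the table rather than the bare class.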

Base Service Interface

from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseLLMService(ABC):
    """Base class for LLM service providers"""

    @abstractmethod
    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        """Generate a completion from the LLM provider"""
        pass

    @abstractmethod
    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        """Generate embeddings from the LLM provider"""
        pass
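To show what implementing this interface looks like, the sketch below repeats the base class so it runs on its own and adds a toy subclass. EchoLLMService is a hypothetical in-memory provider, useful mainly for tests; it is not one of the Fiberwise provider services and makes no network calls.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseLLMService(ABC):
    """Base class for LLM service providers (repeated from above)."""

    @abstractmethod
    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        """Generate a completion from the LLM provider"""

    @abstractmethod
    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        """Generate embeddings from the LLM provider"""


class EchoLLMService(BaseLLMService):
    """Hypothetical provider that echoes the prompt back."""

    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        return {
            "text": f"echo: {prompt}",
            "model": model,
            "provider": "echo",
            "finish_reason": "stop",
        }

    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        # Toy two-dimensional "embedding": character count and word count
        return [float(len(text)), float(len(text.split()))]


result = asyncio.run(EchoLLMService().generate_completion("hello", model="echo-1"))
```

Because both methods are marked abstract, instantiating BaseLLMService directly, or a subclass that omits one of them, raises TypeError.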

Response Standardization

@staticmethod
def standardize_response(raw_response, provider_type, model, output_schema=None):
    """
    Standardize responses from different providers into consistent format

    Handles:
    - OpenAI: choices[0].message.content
    - Anthropic: content[0].text
    - Google: candidates[0].content.parts[0].text
    - Ollama: response
    - Hugging Face: [0].generated_text or generated_text
    - OpenRouter: choices[0].message.content (OpenAI-compatible)
    - Cloudflare: result.response

    Returns standardized format:
    {
        "text": "Generated text",
        "model": "model-name",
        "provider": "provider-type",
        "finish_reason": "stop",
        "structured_data": {...}  # If schema provided
    }
    """

Agent Integration

Using LLM Provider in Agents

from typing import Any, Dict

from fiberwise import FIberwise, BaseAgent
from worker.llm_provider_service import LLMProviderService

class ChatAgent(BaseAgent):
    async def run_agent(
        self,
        input_data: Dict[str, Any],
        fiber: FIberwise,
        llm_provider_service: LLMProviderService
    ) -> Dict[str, Any]:
        """Agent with LLM provider service dependency injection"""

        user_message = input_data.get('message', '')

        # Basic completion
        response = await llm_provider_service.execute_llm_request(
            provider_id="default-openai",
            prompt=f"User says: {user_message}. Please respond helpfully.",
            temperature=0.7,
            max_tokens=1000
        )

        if response.get("status") == "completed":
            return {
                "status": "success",
                "text": response["text"],
                "model": response["model"],
                "provider": response["provider"]
            }
        else:
            return {
                "status": "error",
                "error": response.get("error", "Unknown error")
            }

Structured Output Example

class AnalysisAgent(BaseAgent):
    async def run_agent(self, input_data, fiber, llm_provider_service):
        text_to_analyze = input_data.get('text', '')

        # Define schema for structured output
        analysis_schema = {
            "type": "json",
            "properties": {
                "sentiment": {"type": "string"},
                "confidence": {"type": "number"},
                "key_points": {"type": "array", "items": {"type": "string"}},
                "summary": {"type": "string"}
            }
        }

        # Generate structured analysis
        result = await llm_provider_service.generate_structured_output(
            prompt=f"Analyze this text and provide sentiment, confidence, key points, and summary: {text_to_analyze}",
            schema=analysis_schema,
            provider_id="claude-3-sonnet"
        )

        if result.get("status") == "completed":
            return {
                "status": "success",
                "analysis": result["data"],
                "raw_text": result.get("text", "")
            }
        else:
            return {"status": "error", "error": result.get("error")}

CLI Management

Adding Providers

# Add OpenAI provider
fiber account add-provider \
  --provider openai \
  --api-key "your-openai-api-key" \
  --model gpt-4 \
  --set-default

# Add Anthropic provider
fiber account add-provider \
  --provider anthropic \
  --api-key "your-anthropic-api-key" \
  --model claude-3-5-sonnet-20241022

# Add Google AI provider
fiber account add-provider \
  --provider google \
  --api-key "your-google-ai-api-key" \
  --model gemini-1.5-pro

# Add Ollama (local) provider
fiber account add-provider \
  --provider ollama \
  --api-endpoint "http://localhost:11434/api" \
  --model llama2

# Add Hugging Face provider
fiber account add-provider \
  --provider huggingface \
  --api-key "your-huggingface-api-key" \
  --model "meta-llama/Llama-2-7b-chat-hf"

# Add OpenRouter provider (with free model)
fiber account add-provider \
  --provider openrouter \
  --api-key "your-openrouter-api-key" \
  --model "meta-llama/llama-3.1-8b-instruct:free" \
  --site-url "https://yourapp.com" \
  --app-name "Your App Name"

# Add Cloudflare Workers AI provider
fiber account add-provider \
  --provider cloudflare \
  --api-key "your-cloudflare-api-key" \
  --account-id "your-account-id" \
  --model "@cf/meta/llama-3.1-8b-instruct"

Managing Providers

# List all providers
fiber account list-providers

# List providers by type
fiber account provider list --type openai

# Set default provider
fiber account provider default "My OpenAI Provider"

# Import providers from app
fiber account import-providers --app-id app-123xyz

API Usage

Direct Service Usage

from fiberwise_common.services.llm_provider_service import LLMProviderService
from fiberwise_common.database.factory import get_database_provider

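# Initialize service (`settings` is your application's settings object;
# the `await` calls below must run inside an async function)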
db_provider = get_database_provider(settings)
llm_service = LLMProviderService(db_provider)

# Simple completion
response = await llm_service.execute_llm_request(
    provider_id="my-openai-provider",
    prompt="What is the capital of France?",
    temperature=0.3
)

print(response["text"])  # "The capital of France is Paris."

Batch Processing

async def process_multiple_prompts(prompts, provider_id="default-openai"):
    results = []

    for prompt in prompts:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt=prompt,
            temperature=0.7,
            max_tokens=500
        )

        if response.get("status") == "completed":
            results.append({
                "prompt": prompt,
                "response": response["text"],
                "model": response["model"]
            })
        else:
            results.append({
                "prompt": prompt,
                "error": response.get("error")
            })

    return results
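The loop above sends prompts one at a time. When prompts are independent, they can be sent concurrently with a cap on in-flight requests. The sketch below is one way to do that, under stated assumptions: process_prompts_concurrently and fake_execute are hypothetical names, and execute stands for any coroutine function with the same keyword arguments and response shape as execute_llm_request.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ExecuteFn = Callable[..., Awaitable[Dict[str, Any]]]


async def process_prompts_concurrently(
    execute: ExecuteFn,
    prompts: List[str],
    provider_id: str = "default-openai",
    max_concurrency: int = 5,
) -> List[Dict[str, Any]]:
    """Run prompts concurrently, limiting the number of simultaneous requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> Dict[str, Any]:
        async with semaphore:
            try:
                response = await execute(provider_id=provider_id, prompt=prompt)
            except Exception as exc:
                # One failed request should not cancel the rest of the batch
                return {"prompt": prompt, "error": str(exc)}
        if response.get("status") == "completed":
            return {"prompt": prompt, "response": response["text"], "model": response["model"]}
        return {"prompt": prompt, "error": response.get("error")}

    # gather returns results in the same order as the input prompts
    return list(await asyncio.gather(*(run_one(p) for p in prompts)))


async def fake_execute(provider_id: str, prompt: str) -> Dict[str, Any]:
    """Stand-in for execute_llm_request so the sketch runs without a provider."""
    await asyncio.sleep(0)
    if not prompt:
        return {"status": "failed", "error": "Empty prompt"}
    return {"status": "completed", "text": prompt.upper(), "model": "fake-model"}


results = asyncio.run(
    process_prompts_concurrently(fake_execute, ["a", "", "b"], max_concurrency=2)
)
```

With a real service, llm_service.execute_llm_request would be passed in place of fake_execute. Keep max_concurrency low enough to stay within the provider's rate limits.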

Advanced Features

Custom Providers

A custom OpenAI-compatible provider can be added with a configuration such as:

{
  "provider_type": "custom-openai",
  "api_endpoint": "https://api.your-custom-provider.com/v1",
  "configuration": {
    "api_key": "your-custom-api-key",
    "default_model": "custom-model-name",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Fallback Providers

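import logging

logger = logging.getLogger(__name__)

# Assumes `llm_service` is an initialized LLMProviderService (see Direct Service Usage)
async def robust_llm_request(prompt, primary_provider, fallback_provider):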
    """Try primary provider, fallback to secondary if it fails"""

    try:
        response = await llm_service.execute_llm_request(
            provider_id=primary_provider,
            prompt=prompt
        )

        if response.get("status") == "completed":
            return response

    except Exception as e:
        logger.warning(f"Primary provider failed: {e}")

    # Fallback to secondary provider
    try:
        response = await llm_service.execute_llm_request(
            provider_id=fallback_provider,
            prompt=prompt
        )

        response["used_fallback"] = True
        return response

    except Exception as e:
        logger.error(f"Both providers failed: {e}")
        return {"status": "failed", "error": "All providers unavailable"}

Response Caching

import hashlib
import json
import time

class CachedLLMService:
    def __init__(self, llm_service, cache_ttl=3600):
        self.llm_service = llm_service
        self.cache = {}
        self.cache_ttl = cache_ttl

    def _get_cache_key(self, provider_id, prompt, **kwargs):
        """Generate cache key from request parameters"""
        key_data = f"{provider_id}:{prompt}:{json.dumps(kwargs, sort_keys=True)}"
        return hashlib.md5(key_data.encode()).hexdigest()

    async def execute_llm_request(self, provider_id, prompt, **kwargs):
        cache_key = self._get_cache_key(provider_id, prompt, **kwargs)

        # Check cache
        if cache_key in self.cache:
            cached_response, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                cached_response["from_cache"] = True
                return cached_response

        # Make request
        response = await self.llm_service.execute_llm_request(
            provider_id, prompt, **kwargs
        )

        # Cache successful responses
        if response.get("status") == "completed":
            self.cache[cache_key] = (response, time.time())

        return response

Provider Comparison & Use Cases

Cost-Effectiveness

Provider	Free Tier	Cost Level	Best For
Hugging Face	✅ Generous free limits	Free → Low	Experimentation, research, development
OpenRouter	✅ Free models available	Free → Medium	Multi-model access, cost optimization
Cloudflare Workers AI	✅ 100k requests/day	Free → Low	Edge computing, low latency, global scale
Ollama	✅ Completely free	Free (self-hosted)	Privacy, offline usage, full control
OpenAI	❌ Pay-per-use	Medium → High	Highest quality, latest features
Anthropic	❌ Pay-per-use	Medium → High	Safety, reasoning, analysis
Google AI	❌ Limited free tier	Low → Medium	Fast inference, good value

Development Workflow

# Recommended workflow for new projects:

# 1. Start with free providers for prototyping
fiber account add-provider --provider huggingface --api-key "hf_xxx" --model "meta-llama/Llama-2-7b-chat-hf"
fiber account add-provider --provider openrouter --api-key "sk-or-xxx" --model "meta-llama/llama-3.1-8b-instruct:free"

# 2. Add edge computing for production
fiber account add-provider --provider cloudflare --api-key "xxx" --account-id "xxx" --model "@cf/meta/llama-3.1-8b-instruct"

# 3. Add premium providers for quality-critical features
fiber account add-provider --provider openai --api-key "sk-xxx" --model "gpt-4"
fiber account add-provider --provider anthropic --api-key "sk-ant-xxx" --model "claude-3-5-sonnet-20241022"

Use Case Recommendations

🚀 Rapid Prototyping

Best Providers: Hugging Face, OpenRouter (free models)

No cost barrier to experimentation
Wide variety of models to test
Quick setup and iteration

🌍 Global Production Apps

Best Providers: Cloudflare Workers AI, OpenRouter

Ultra-low latency via edge computing
Automatic failover and load balancing
Global availability and scaling

💰 Cost-Sensitive Applications

Best Providers: OpenRouter, Hugging Face, Cloudflare

Competitive pricing and free tiers
Intelligent model selection for cost optimization
Volume discounts and efficient routing

🔒 Privacy & Security

Best Providers: Ollama (local), Cloudflare (edge)

Data never leaves your infrastructure when models run locally with Ollama
Full control over model and data
Compliance with strict privacy requirements

🎯 Premium Quality Applications

Best Providers: OpenAI, Anthropic

State-of-the-art model capabilities
Latest features and improvements
Reliable performance for critical tasks

Troubleshooting

Common Issues

Provider Not Found

{"status": "failed", "error": "Provider my-provider not found or not active"}

Solutions:

Check provider ID spelling
Verify provider is active in database
List providers: fiber account list-providers

API Key Invalid

{"status": "failed", "error": "Authentication failed"}

Solutions:

Verify API key is correct and active
Check API key has required permissions
Ensure no extra spaces in API key

Model Not Available

{"status": "failed", "error": "Model gpt-5 not found"}

Solutions:

Check model name spelling
Verify model is available for your API key
Use provider's model listing API

Rate Limit Exceeded

{"status": "failed", "error": "Rate limit exceeded"}

Solutions:

Implement exponential backoff
Use multiple API keys
Switch to different provider
Upgrade API plan
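The first suggestion, exponential backoff, can be sketched as a small wrapper. This is a minimal illustration under stated assumptions: request_with_backoff is a hypothetical helper, execute stands for any coroutine function shaped like execute_llm_request, and rate limiting is detected by looking for "rate limit" in the error string, which matches the example error above but may not match every provider.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict

ExecuteFn = Callable[..., Awaitable[Dict[str, Any]]]


async def request_with_backoff(
    execute: ExecuteFn,
    *,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[Any]] = asyncio.sleep,
    **request: Any,
) -> Dict[str, Any]:
    """Retry rate-limited requests, doubling the wait after each attempt."""
    response: Dict[str, Any] = {"status": "failed", "error": "No attempts made"}
    for attempt in range(max_attempts):
        response = await execute(**request)
        error = str(response.get("error", ""))
        # Return on success, or on errors that retrying will not fix
        if response.get("status") == "completed" or "rate limit" not in error.lower():
            return response
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep
            await sleep(delay + random.uniform(0, delay / 2))
    return response
```

A call might look like `await request_with_backoff(llm_service.execute_llm_request, provider_id="my-provider", prompt="test")`. The sleep parameter exists so tests can substitute a fake and avoid real waiting.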

Debugging Tips

# Enable debug logging
import logging
logging.getLogger('fiberwise_common.services.llm_provider_service').setLevel(logging.DEBUG)

# Check raw response
response = await llm_service.execute_llm_request(
    provider_id="my-provider",
    prompt="test"
)
print("Raw response:", response.get("_raw_response"))

# Test provider connection
provider = await llm_service.get_provider_by_id("my-provider")
if provider:
    print("Provider config:", provider["configuration"])
else:
    print("Provider not found")

Health Checks

async def health_check_provider(provider_id):
    """Test if a provider is working correctly"""

    try:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt="Say 'OK' if you can respond",
            max_tokens=10,
            temperature=0
        )

        if response.get("status") == "completed":
            return {
                "provider_id": provider_id,
                "status": "healthy",
                "response_time": response.get("response_time"),
                "model": response.get("model")
            }
        else:
            return {
                "provider_id": provider_id,
                "status": "unhealthy",
                "error": response.get("error")
            }

    except Exception as e:
        return {
            "provider_id": provider_id,
            "status": "error",
            "error": str(e)
        }
