LLM Provider Guide

Complete guide to configuring, managing, and using Large Language Model providers in Fiberwise

Overview

Fiberwise provides a unified interface for working with multiple Large Language Model (LLM) providers. The LLM provider system abstracts away the differences between AI services, so developers can switch between providers or use several providers at the same time.

Key Features:
  • Multi-Provider Support: OpenAI, Anthropic, Google AI, Ollama, Hugging Face, OpenRouter, Cloudflare Workers AI, and custom providers
  • Unified Interface: Consistent API regardless of underlying provider
  • Response Standardization: Normalized responses across all providers
  • Structured Output: JSON schema-based structured generation
  • Configuration Management: Centralized provider configuration and credentials
  • Fallback Support: Graceful handling when providers are unavailable
  • Cost Optimization: Free tier options and intelligent model selection
  • Edge Computing: Global deployment with Cloudflare Workers AI

Architecture

System Components

┌─────────────────────┐
│    Agent Layer      │  ← Agents use LLMProviderService
├─────────────────────┤
│ LLMProviderService  │  ← Main service interface
├─────────────────────┤
│ LLMServiceFactory   │  ← Creates provider-specific services
├─────────────────────┤
│ Provider Services   │  ← All supported providers
│ - OpenAIService     │
│ - AnthropicService  │
│ - GoogleAIService   │
│ - OllamaService     │
│ - HuggingFaceService│
│ - OpenRouterService │
│ - CloudflareService │
├─────────────────────┤
│ Database Layer      │  ← Provider configurations and credentials
│ - llm_providers     │
│ - provider_configs  │
└─────────────────────┘

Key Files

| Component | File Path | Purpose |
| --- | --- | --- |
| LLMProviderService | fiberwise-common/services/llm_provider_service.py | Main service interface and response standardization |
| LLMServiceFactory | fiberwise-core-web/worker/llm_service.py | Factory for creating provider-specific services |
| ProviderService | fiberwise-common/services/provider_service.py | Configuration management and database operations |
| CLI Management | fiberwise/cli/account.py | Command-line provider configuration |

Supported Providers

OpenAI

Configuration

{
  "provider_type": "openai",
  "api_endpoint": "https://api.openai.com/v1",
  "configuration": {
    "api_key": "your-openai-api-key",
    "default_model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • gpt-4 - Most capable model
  • gpt-4-turbo - Faster GPT-4 variant
  • gpt-3.5-turbo - Fast and cost-effective
  • gpt-4o - Multimodal capabilities

Anthropic

Configuration

{
  "provider_type": "anthropic",
  "api_endpoint": "https://api.anthropic.com/v1",
  "configuration": {
    "api_key": "your-anthropic-api-key",
    "default_model": "claude-3-5-sonnet-20241022",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • claude-3-5-sonnet-20241022 - Latest Claude 3.5 Sonnet
  • claude-3-opus-20240229 - Most capable Claude model
  • claude-3-sonnet-20240229 - Balanced performance
  • claude-3-haiku-20240307 - Fast and lightweight

Google AI

Configuration

{
  "provider_type": "google",
  "api_endpoint": "https://generativelanguage.googleapis.com/v1",
  "configuration": {
    "api_key": "your-google-ai-api-key",
    "default_model": "gemini-1.5-pro",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • gemini-1.5-pro - Most capable Gemini model
  • gemini-1.5-flash - Fast and efficient
  • gemini-pro - Standard Gemini model

Ollama (Local)

Configuration

{
  "provider_type": "ollama",
  "api_endpoint": "http://localhost:11434/api",
  "configuration": {
    "default_model": "llama2",
    "temperature": 0.7
  }
}

Popular Models

  • llama2 - Meta's Llama 2 model
  • mistral - Mistral 7B model
  • codellama - Code-specialized Llama
  • phi - Microsoft's Phi model

Hugging Face 🤗

Configuration

{
  "provider_type": "huggingface",
  "api_endpoint": "https://api-inference.huggingface.co",
  "configuration": {
    "api_key": "your-huggingface-api-key",
    "default_model": "meta-llama/Llama-2-7b-chat-hf",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • meta-llama/Llama-2-7b-chat-hf - Llama 2 Chat model
  • microsoft/DialoGPT-large - Conversational AI
  • google/flan-t5-large - Instruction-tuned T5
  • bigscience/bloom-7b1 - Multilingual model
  • sentence-transformers/all-MiniLM-L6-v2 - Embeddings

Free Tier Available: Access 200,000+ models with generous free usage limits. Perfect for experimentation and development.

OpenRouter

Configuration

{
  "provider_type": "openrouter",
  "api_endpoint": "https://openrouter.ai/api/v1",
  "configuration": {
    "api_key": "your-openrouter-api-key",
    "default_model": "meta-llama/llama-3.1-8b-instruct:free",
    "site_url": "https://yourapp.com",
    "app_name": "Your App Name",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • meta-llama/llama-3.1-8b-instruct:free - Free Llama 3.1
  • anthropic/claude-3.5-sonnet - Latest Claude model
  • openai/gpt-4o - GPT-4 Omni via OpenRouter
  • google/gemini-pro - Gemini Pro via OpenRouter
  • mistralai/mistral-7b-instruct - Cost-effective Mistral

Multi-Provider Access: 100+ models from 15+ providers through a unified API. Often 10-50% cheaper than direct provider APIs with intelligent routing and fallbacks.

Cloudflare Workers AI

Configuration

{
  "provider_type": "cloudflare",
  "api_endpoint": "https://api.cloudflare.com/client/v4",
  "configuration": {
    "api_key": "your-cloudflare-api-key",
    "account_id": "your-cloudflare-account-id",
    "default_model": "@cf/meta/llama-3.1-8b-instruct",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Popular Models

  • @cf/meta/llama-3.1-8b-instruct - Llama 3.1 8B on edge
  • @cf/microsoft/phi-2 - Microsoft Phi-2 model
  • @cf/mistral/mistral-7b-instruct-v0.1 - Mistral 7B
  • @cf/qwen/qwen1.5-7b-chat-awq - Qwen chat model
  • @cf/baai/bge-base-en-v1.5 - Text embeddings

Edge Computing: Models run on Cloudflare's global edge network for ultra-low latency. Generous free tier with 100,000 requests per day.

Configuration

Database Schema

-- Main provider configurations
CREATE TABLE llm_providers (
    provider_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    provider_type TEXT NOT NULL,
    api_endpoint TEXT,
    is_active BOOLEAN DEFAULT TRUE,
    configuration TEXT NOT NULL, -- JSON configuration
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Provider defaults and capabilities
CREATE TABLE llm_provider_defaults (
    default_id TEXT PRIMARY KEY,
    provider_type TEXT NOT NULL UNIQUE,
    default_name TEXT NOT NULL,
    default_api_endpoint TEXT,
    default_configuration TEXT DEFAULT '{}',
    form_schema TEXT DEFAULT '{}',
    supports_streaming INTEGER DEFAULT 0,
    supports_functions INTEGER DEFAULT 0,
    supports_tools INTEGER DEFAULT 0,
    supports_vision INTEGER DEFAULT 0
);
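
To see how the llm_providers table and its JSON configuration column fit together, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The helper names insert_provider and load_active_provider are illustrative assumptions, not part of Fiberwise, and the API key is a placeholder.

```python
import json
import sqlite3

# Same columns as the llm_providers table defined above.
SCHEMA = """
CREATE TABLE llm_providers (
    provider_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    provider_type TEXT NOT NULL,
    api_endpoint TEXT,
    is_active BOOLEAN DEFAULT TRUE,
    configuration TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def insert_provider(conn, provider_id, name, provider_type, api_endpoint, configuration):
    """Hypothetical helper: store a provider, serializing the config dict to JSON text."""
    conn.execute(
        "INSERT INTO llm_providers "
        "(provider_id, name, provider_type, api_endpoint, configuration) "
        "VALUES (?, ?, ?, ?, ?)",
        (provider_id, name, provider_type, api_endpoint, json.dumps(configuration)),
    )


def load_active_provider(conn, provider_id):
    """Hypothetical helper: return an active provider as a dict, or None if absent."""
    row = conn.execute(
        "SELECT provider_id, name, provider_type, api_endpoint, configuration "
        "FROM llm_providers WHERE provider_id = ? AND is_active = 1",
        (provider_id,),
    ).fetchone()
    if row is None:
        return None
    provider = dict(row)
    # The configuration column holds JSON text, so decode it before use.
    provider["configuration"] = json.loads(provider["configuration"])
    return provider


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows dict(row)
conn.executescript(SCHEMA)
insert_provider(
    conn,
    "my-openai-provider",
    "My OpenAI Configuration",
    "openai",
    "https://api.openai.com/v1",
    {"api_key": "sk-placeholder", "default_model": "gpt-4"},
)
provider = load_active_provider(conn, "my-openai-provider")
```

Storing the configuration as JSON text keeps the table schema stable while each provider type carries its own settings.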

Configuration Structure

{
  "provider_id": "my-openai-provider",
  "name": "My OpenAI Configuration",
  "provider_type": "openai",
  "api_endpoint": "https://api.openai.com/v1",
  "is_active": true,
  "configuration": {
    "api_key": "sk-...",
    "default_model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 2048,
    "custom_settings": {
      "timeout": 30,
      "retry_attempts": 3
    }
  }
}
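
Before storing a record like the one above, it can help to check its shape. The following is a sketch only: validate_provider_record is a hypothetical helper, and its rules (required fields taken from the NOT NULL columns, and api_key being optional for ollama because the Ollama example omits it) are assumptions inferred from this guide rather than documented Fiberwise behavior.

```python
from typing import Any, Dict, List

# Assumption: these mirror the NOT NULL columns of llm_providers.
REQUIRED_FIELDS = ("provider_id", "name", "provider_type", "configuration")
# Assumption: the Ollama example in this guide has no api_key.
KEYLESS_PROVIDER_TYPES = {"ollama"}


def validate_provider_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing required field: {field}")

    config = record.get("configuration")
    if config is not None and not isinstance(config, dict):
        problems.append("configuration must be a JSON object")
        return problems
    config = config or {}

    if record.get("provider_type") not in KEYLESS_PROVIDER_TYPES and not config.get("api_key"):
        problems.append("configuration.api_key is required for this provider type")

    temperature = config.get("temperature")
    if temperature is not None and not isinstance(temperature, (int, float)):
        problems.append("configuration.temperature must be a number")

    max_tokens = config.get("max_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens <= 0):
        problems.append("configuration.max_tokens must be a positive integer")
    return problems


valid_record = {
    "provider_id": "my-openai-provider",
    "name": "My OpenAI Configuration",
    "provider_type": "openai",
    "configuration": {"api_key": "sk-placeholder", "default_model": "gpt-4", "max_tokens": 2048},
}
issues = validate_provider_record(valid_record)
```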

Provider Service

Core Methods

get_provider_by_id()

async def get_provider_by_id(self, provider_id: str) -> Optional[Dict[str, Any]]:
    """
    Get provider details from database by provider_id

    Returns:
        Provider details including configuration and credentials
    """
    query = """
        SELECT provider_id, name, provider_type, api_endpoint, configuration
        FROM llm_providers
        WHERE provider_id = ? AND is_active = true
    """

    provider = await self.db.fetchrow(query, provider_id)

    if provider:
        provider_dict = dict(provider)
        # Parse JSON configuration
        if isinstance(provider_dict.get('configuration'), str):
            provider_dict['configuration'] = json.loads(provider_dict['configuration'])
        return provider_dict

    return None

execute_llm_request()

async def execute_llm_request(
    self,
    provider_id: str,
    prompt: str,
    model_id: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    output_schema: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Execute an LLM request with the specified provider

    Args:
        provider_id: The ID of the provider to use
        prompt: The text prompt to send
        model_id: Optional model ID (overrides provider default)
        temperature: Optional temperature setting
        max_tokens: Optional max tokens setting
        output_schema: Optional schema for structured output

    Returns:
        Standardized response with model output
    """
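
The docstring states that model_id overrides the provider default. One plausible way to resolve per-request arguments against the stored configuration is sketched below. resolve_request_params is a hypothetical helper, and applying the same override rule to temperature and max_tokens is an assumption; the actual service may resolve them differently.

```python
from typing import Any, Dict, Optional


def resolve_request_params(
    configuration: Dict[str, Any],
    model_id: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: explicit arguments win, otherwise use provider defaults."""

    def pick(explicit, config_key):
        # Compare against None so that valid falsy values such as temperature=0 are kept.
        return explicit if explicit is not None else configuration.get(config_key)

    return {
        "model": pick(model_id, "default_model"),
        "temperature": pick(temperature, "temperature"),
        "max_tokens": pick(max_tokens, "max_tokens"),
    }


provider_configuration = {"default_model": "gpt-4", "temperature": 0.7, "max_tokens": 2048}
params = resolve_request_params(provider_configuration, temperature=0)
```

Checking for None rather than truthiness matters here: a caller who passes temperature=0 expects deterministic output, not the provider default.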

generate_structured_output()

async def generate_structured_output(
    self,
    prompt: str,
    schema: Dict[str, Any],
    provider_id: str = "default-openai",
    **kwargs
) -> Dict[str, Any]:
    """
    Generate structured output using JSON schema

    Args:
        prompt: The text prompt to send to the model
        schema: JSON schema definition for expected output
        provider_id: The ID of the provider to use
        **kwargs: Additional parameters

    Returns:
        Dictionary containing structured data if successful
    """
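
Models often wrap JSON in extra prose, so structured generation usually needs a parsing step. The sketch below shows one way that step might look. parse_structured_text is a hypothetical helper, the status, data, and error keys mirror the response shape used in the agent examples in this guide, and the required check is a simplification rather than full JSON Schema validation.

```python
import json
from typing import Any, Dict


def parse_structured_text(text: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: pull a JSON object out of model output and check required keys."""
    candidate = text.strip()
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span when the model added surrounding prose.
        start, end = candidate.find("{"), candidate.rfind("}")
        if start == -1 or end <= start:
            return {"status": "failed", "error": "No JSON object found in model output"}
        try:
            data = json.loads(candidate[start:end + 1])
        except json.JSONDecodeError as exc:
            return {"status": "failed", "error": f"Invalid JSON in model output: {exc}"}

    if not isinstance(data, dict):
        return {"status": "failed", "error": "Expected a JSON object"}

    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        return {"status": "failed", "error": f"Missing required keys: {missing}"}
    return {"status": "completed", "data": data}


example_schema = {
    "type": "object",
    "properties": {"sentiment": {"type": "string"}, "confidence": {"type": "number"}},
    "required": ["sentiment"],
}
parsed = parse_structured_text(
    'Sure! {"sentiment": "positive", "confidence": 0.9} Hope this helps.',
    example_schema,
)
```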

LLM Service Layer

Factory Pattern

class LLMServiceFactory:
    """Factory for creating LLM services based on provider type"""

    @staticmethod
    def create_service(provider_type: str, api_key: str = None, api_endpoint: str = None, account_id: str = None):
        """Create an LLM service based on provider type"""
        if provider_type == "openai":
            return OpenAIService(api_key, api_endpoint or "https://api.openai.com/v1")
        elif provider_type == "anthropic":
            return AnthropicService(api_key, api_endpoint or "https://api.anthropic.com/v1")
        elif provider_type == "google":
            return GoogleAIService(api_key, api_endpoint or "https://generativelanguage.googleapis.com/v1")
        elif provider_type == "ollama":
            # Ollama runs locally and takes no API key
            return OllamaService(api_endpoint or "http://localhost:11434/api")
        elif provider_type == "huggingface":
            return HuggingFaceService(api_key, api_endpoint or "https://api-inference.huggingface.co")
        elif provider_type == "openrouter":
            return OpenRouterService(api_key, api_endpoint or "https://openrouter.ai/api/v1")
        elif provider_type == "cloudflare":
            # account_id comes from the provider configuration (see the Cloudflare Workers AI example)
            return CloudflareWorkersAIService(api_key, account_id, api_endpoint or "https://api.cloudflare.com/client/v4")
        else:
            raise ValueError(f"Unsupported provider type: {provider_type}")

Base Service Interface

from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseLLMService(ABC):
    """Base class for LLM service providers"""

    @abstractmethod
    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        """Generate a completion from the LLM provider"""
        pass

    @abstractmethod
    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        """Generate embeddings from the LLM provider"""
        pass
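
A small stand-in provider can be useful for exercising the interface in tests without network calls. The sketch below repeats the base class so it runs on its own; EchoService and its hash-based embedding are illustrative assumptions, not a Fiberwise provider.

```python
import asyncio
import hashlib
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseLLMService(ABC):
    """Base class for LLM service providers (repeated from this guide)."""

    @abstractmethod
    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        ...

    @abstractmethod
    async def generate_embedding(self, text: str, model: str, **kwargs) -> List[float]:
        ...


class EchoService(BaseLLMService):
    """Hypothetical test double: echoes the prompt instead of calling a model."""

    async def generate_completion(self, prompt: str, model: str, **kwargs) -> Dict[str, Any]:
        return {"text": prompt, "model": model, "provider": "echo", "finish_reason": "stop"}

    async def generate_embedding(
        self, text: str, model: str, dimensions: int = 4, **kwargs
    ) -> List[float]:
        # Deterministic pseudo-embedding derived from a SHA-256 digest of the text.
        # It carries no semantic meaning; it only gives tests a stable vector.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255 for byte in digest[:dimensions]]


completion = asyncio.run(EchoService().generate_completion("hello", "echo-1"))
embedding = asyncio.run(EchoService().generate_embedding("hello", "echo-1"))
```

Because both abstract methods are implemented, EchoService can be instantiated; omitting either one would raise TypeError at construction time.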

Response Standardization

@staticmethod
def standardize_response(raw_response, provider_type, model, output_schema=None):
    """
    Standardize responses from different providers into consistent format

    Handles:
    - OpenAI: choices[0].message.content
    - Anthropic: content[0].text
    - Google: candidates[0].content.parts[0].text
    - Ollama: response
    - Hugging Face: [0].generated_text or generated_text
    - OpenRouter: choices[0].message.content (OpenAI-compatible)
    - Cloudflare: result.response

    Returns standardized format:
    {
        "text": "Generated text",
        "model": "model-name",
        "provider": "provider-type",
        "finish_reason": "stop",
        "structured_data": {...}  # If schema provided
    }
    """
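
The response paths listed in the docstring can be sketched as a small dispatch function. This is not the actual standardize_response implementation: extract_text and standardize_minimal are hypothetical names, the sketch fills only the text, model, and provider fields, and it assumes each raw response has already been parsed into Python dicts and lists.

```python
from typing import Any, Dict


def extract_text(raw_response: Any, provider_type: str) -> str:
    """Hypothetical helper: read the generated text using the paths listed above."""
    if provider_type in ("openai", "openrouter"):
        # OpenRouter is listed above as OpenAI-compatible.
        return raw_response["choices"][0]["message"]["content"]
    if provider_type == "anthropic":
        return raw_response["content"][0]["text"]
    if provider_type == "google":
        return raw_response["candidates"][0]["content"]["parts"][0]["text"]
    if provider_type == "ollama":
        return raw_response["response"]
    if provider_type == "huggingface":
        # Either a list of results or a single result object.
        item = raw_response[0] if isinstance(raw_response, list) else raw_response
        return item["generated_text"]
    if provider_type == "cloudflare":
        return raw_response["result"]["response"]
    raise ValueError(f"Unsupported provider type: {provider_type}")


def standardize_minimal(raw_response: Any, provider_type: str, model: str) -> Dict[str, Any]:
    """Hypothetical helper: build a reduced version of the standardized format."""
    return {
        "text": extract_text(raw_response, provider_type),
        "model": model,
        "provider": provider_type,
    }


standardized = standardize_minimal(
    {"content": [{"type": "text", "text": "Hello!"}]},
    "anthropic",
    "claude-3-5-sonnet-20241022",
)
```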

Agent Integration

Using LLM Provider in Agents

from typing import Any, Dict

from fiberwise import FIberwise, BaseAgent
from worker.llm_provider_service import LLMProviderService

class ChatAgent(BaseAgent):
    async def run_agent(
        self,
        input_data: Dict[str, Any],
        fiber: FIberwise,
        llm_provider_service: LLMProviderService
    ) -> Dict[str, Any]:
        """Agent with LLM provider service dependency injection"""

        user_message = input_data.get('message', '')

        # Basic completion
        response = await llm_provider_service.execute_llm_request(
            provider_id="default-openai",
            prompt=f"User says: {user_message}. Please respond helpfully.",
            temperature=0.7,
            max_tokens=1000
        )

        if response.get("status") == "completed":
            return {
                "status": "success",
                "text": response["text"],
                "model": response["model"],
                "provider": response["provider"]
            }
        else:
            return {
                "status": "error",
                "error": response.get("error", "Unknown error")
            }

Structured Output Example

class AnalysisAgent(BaseAgent):
    async def run_agent(self, input_data, fiber, llm_provider_service):
        text_to_analyze = input_data.get('text', '')

        # Define schema for structured output
        analysis_schema = {
            "type": "object",
            "properties": {
                "sentiment": {"type": "string"},
                "confidence": {"type": "number"},
                "key_points": {"type": "array", "items": {"type": "string"}},
                "summary": {"type": "string"}
            }
        }

        # Generate structured analysis
        result = await llm_provider_service.generate_structured_output(
            prompt=f"Analyze this text and provide sentiment, confidence, key points, and summary: {text_to_analyze}",
            schema=analysis_schema,
            provider_id="claude-3-sonnet"
        )

        if result.get("status") == "completed":
            return {
                "status": "success",
                "analysis": result["data"],
                "raw_text": result.get("text", "")
            }
        else:
            return {"status": "error", "error": result.get("error")}

CLI Management

Adding Providers

# Add OpenAI provider
fiber account add-provider \
  --provider openai \
  --api-key "your-openai-api-key" \
  --model gpt-4 \
  --set-default

# Add Anthropic provider
fiber account add-provider \
  --provider anthropic \
  --api-key "your-anthropic-api-key" \
  --model claude-3-5-sonnet-20241022

# Add Google AI provider
fiber account add-provider \
  --provider google \
  --api-key "your-google-ai-api-key" \
  --model gemini-1.5-pro

# Add Ollama (local) provider
fiber account add-provider \
  --provider ollama \
  --api-endpoint "http://localhost:11434/api" \
  --model llama2

# Add Hugging Face provider
fiber account add-provider \
  --provider huggingface \
  --api-key "your-huggingface-api-key" \
  --model "meta-llama/Llama-2-7b-chat-hf"

# Add OpenRouter provider (with free model)
fiber account add-provider \
  --provider openrouter \
  --api-key "your-openrouter-api-key" \
  --model "meta-llama/llama-3.1-8b-instruct:free" \
  --site-url "https://yourapp.com" \
  --app-name "Your App Name"

# Add Cloudflare Workers AI provider
fiber account add-provider \
  --provider cloudflare \
  --api-key "your-cloudflare-api-key" \
  --account-id "your-account-id" \
  --model "@cf/meta/llama-3.1-8b-instruct"

Managing Providers

# List all providers
fiber account list-providers

# List providers by type
fiber account provider list --type openai

# Set default provider
fiber account provider default "My OpenAI Provider"

# Import providers from app
fiber account import-providers --app-id app-123xyz

API Usage

Direct Service Usage

from fiberwise_common.services.llm_provider_service import LLMProviderService
from fiberwise_common.database.factory import get_database_provider

# Initialize service
db_provider = get_database_provider(settings)
llm_service = LLMProviderService(db_provider)

# Simple completion
response = await llm_service.execute_llm_request(
    provider_id="my-openai-provider",
    prompt="What is the capital of France?",
    temperature=0.3
)

print(response["text"])  # "The capital of France is Paris."

Batch Processing

async def process_multiple_prompts(prompts, provider_id="default-openai"):
    results = []

    for prompt in prompts:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt=prompt,
            temperature=0.7,
            max_tokens=500
        )

        if response.get("status") == "completed":
            results.append({
                "prompt": prompt,
                "response": response["text"],
                "model": response["model"]
            })
        else:
            results.append({
                "prompt": prompt,
                "error": response.get("error")
            })

    return results
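
The loop above sends prompts one at a time. When the provider's rate limits allow it, the requests could instead run concurrently. The sketch below uses asyncio.gather with a semaphore to cap the number of in-flight requests; process_prompts_concurrently, max_concurrency, and the FakeLLMService stand-in are assumptions made for illustration.

```python
import asyncio
from typing import Any, Dict, List


async def process_prompts_concurrently(
    llm_service,
    prompts: List[str],
    provider_id: str = "default-openai",
    max_concurrency: int = 5,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: run requests concurrently, results in prompt order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> Dict[str, Any]:
        async with semaphore:  # at most max_concurrency requests in flight
            try:
                response = await llm_service.execute_llm_request(
                    provider_id=provider_id,
                    prompt=prompt,
                    temperature=0.7,
                    max_tokens=500,
                )
            except Exception as exc:
                # One failing prompt should not cancel the rest of the batch.
                return {"prompt": prompt, "error": str(exc)}
        if response.get("status") == "completed":
            return {"prompt": prompt, "response": response["text"], "model": response["model"]}
        return {"prompt": prompt, "error": response.get("error")}

    # gather preserves the order of its arguments in the returned list.
    return await asyncio.gather(*(run_one(prompt) for prompt in prompts))


class FakeLLMService:
    """Stand-in used only to exercise the sketch; not part of Fiberwise."""

    async def execute_llm_request(self, provider_id, prompt, **kwargs):
        await asyncio.sleep(0)
        return {"status": "completed", "text": prompt.upper(), "model": "fake-model"}


batch_results = asyncio.run(
    process_prompts_concurrently(FakeLLMService(), ["alpha", "beta", "gamma"], max_concurrency=2)
)
```

A low max_concurrency is the safer starting point, since a burst of parallel requests is a common way to trigger provider rate limits.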

Advanced Features

Custom Providers

Adding a custom OpenAI-compatible provider:
{
  "provider_type": "custom-openai",
  "api_endpoint": "https://api.your-custom-provider.com/v1",
  "configuration": {
    "api_key": "your-custom-api-key",
    "default_model": "custom-model-name",
    "temperature": 0.7,
    "max_tokens": 2048
  }
}

Fallback Providers

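import logging

logger = logging.getLogger(__name__)

# Assumes llm_service was initialized as shown in Direct Service Usage
async def robust_llm_request(prompt, primary_provider, fallback_provider):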
    """Try primary provider, fallback to secondary if it fails"""

    try:
        response = await llm_service.execute_llm_request(
            provider_id=primary_provider,
            prompt=prompt
        )

        if response.get("status") == "completed":
            return response

    except Exception as e:
        logger.warning(f"Primary provider failed: {e}")

    # Fallback to secondary provider
    try:
        response = await llm_service.execute_llm_request(
            provider_id=fallback_provider,
            prompt=prompt
        )

        response["used_fallback"] = True
        return response

    except Exception as e:
        logger.error(f"Both providers failed: {e}")
        return {"status": "failed", "error": "All providers unavailable"}

Response Caching

import hashlib
import json
import time

class CachedLLMService:
    def __init__(self, llm_service, cache_ttl=3600):
        self.llm_service = llm_service
        self.cache = {}
        self.cache_ttl = cache_ttl

    def _get_cache_key(self, provider_id, prompt, **kwargs):
        """Generate cache key from request parameters"""
        key_data = f"{provider_id}:{prompt}:{json.dumps(kwargs, sort_keys=True)}"
        return hashlib.md5(key_data.encode()).hexdigest()

    async def execute_llm_request(self, provider_id, prompt, **kwargs):
        cache_key = self._get_cache_key(provider_id, prompt, **kwargs)

        # Check cache
        if cache_key in self.cache:
            cached_response, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                cached_response["from_cache"] = True
                return cached_response

        # Make request
        response = await self.llm_service.execute_llm_request(
            provider_id, prompt, **kwargs
        )

        # Cache successful responses
        if response.get("status") == "completed":
            self.cache[cache_key] = (response, time.time())

        return response

Provider Comparison & Use Cases

Cost-Effectiveness

| Provider | Free Tier | Cost Level | Best For |
| --- | --- | --- | --- |
| Hugging Face | ✅ Generous free limits | Free → Low | Experimentation, research, development |
| OpenRouter | ✅ Free models available | Free → Medium | Multi-model access, cost optimization |
| Cloudflare Workers AI | ✅ 100k requests/day | Free → Low | Edge computing, low latency, global scale |
| Ollama | ✅ Completely free | Free (self-hosted) | Privacy, offline usage, full control |
| OpenAI | ❌ Pay-per-use | Medium → High | Highest quality, latest features |
| Anthropic | ❌ Pay-per-use | Medium → High | Safety, reasoning, analysis |
| Google AI | ❌ Limited free tier | Low → Medium | Fast inference, good value |

Development Workflow

# Recommended workflow for new projects:

# 1. Start with free providers for prototyping
fiber account add-provider --provider huggingface --api-key "hf_xxx" --model "meta-llama/Llama-2-7b-chat-hf"
fiber account add-provider --provider openrouter --api-key "sk-or-xxx" --model "meta-llama/llama-3.1-8b-instruct:free"

# 2. Add edge computing for production
fiber account add-provider --provider cloudflare --api-key "xxx" --account-id "xxx" --model "@cf/meta/llama-3.1-8b-instruct"

# 3. Add premium providers for quality-critical features
fiber account add-provider --provider openai --api-key "sk-xxx" --model "gpt-4"
fiber account add-provider --provider anthropic --api-key "sk-ant-xxx" --model "claude-3-5-sonnet-20241022"

Use Case Recommendations

🚀 Rapid Prototyping

Best Providers: Hugging Face, OpenRouter (free models)

  • No cost barrier to experimentation
  • Wide variety of models to test
  • Quick setup and iteration

🌍 Global Production Apps

Best Providers: Cloudflare Workers AI, OpenRouter

  • Ultra-low latency via edge computing
  • Automatic failover and load balancing
  • Global availability and scaling

💰 Cost-Sensitive Applications

Best Providers: OpenRouter, Hugging Face, Cloudflare

  • Competitive pricing and free tiers
  • Intelligent model selection for cost optimization
  • Volume discounts and efficient routing

🔒 Privacy & Security

Best Providers: Ollama (local), Cloudflare (edge)

  • With self-hosted Ollama, data never leaves your infrastructure
  • Full control over model and data
  • Compliance with strict privacy requirements

🎯 Premium Quality Applications

Best Providers: OpenAI, Anthropic

  • State-of-the-art model capabilities
  • Latest features and improvements
  • Reliable performance for critical tasks

Troubleshooting

Common Issues

Provider Not Found

{"status": "failed", "error": "Provider my-provider not found or not active"}

Solutions:

  • Check provider ID spelling
  • Verify provider is active in database
  • List providers: fiber account list-providers

API Key Invalid

{"status": "failed", "error": "Authentication failed"}

Solutions:

  • Verify API key is correct and active
  • Check API key has required permissions
  • Ensure no extra spaces in API key

Model Not Available

{"status": "failed", "error": "Model gpt-5 not found"}

Solutions:

  • Check model name spelling
  • Verify model is available for your API key
  • Use provider's model listing API

Rate Limit Exceeded

{"status": "failed", "error": "Rate limit exceeded"}

Solutions:

  • Implement exponential backoff
  • Use multiple API keys
  • Switch to different provider
  • Upgrade API plan
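
The first suggestion, exponential backoff, might be sketched as follows. request_with_backoff is a hypothetical wrapper, and detecting rate limiting by matching the text "rate limit" in the error field is an assumption based on the error example above; a real integration may expose a more reliable signal.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict


async def request_with_backoff(
    send_request: Callable[[], Awaitable[Dict[str, Any]]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Dict[str, Any]:
    """Hypothetical wrapper: retry rate-limited requests with growing, jittered delays."""
    response: Dict[str, Any] = {"status": "failed", "error": "No attempts made"}
    for attempt in range(max_attempts):
        response = await send_request()
        error_text = str(response.get("error", "")).lower()
        # Return immediately on success or on any error that is not a rate limit.
        if response.get("status") == "completed" or "rate limit" not in error_text:
            return response
        if attempt < max_attempts - 1:
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            await sleep(delay + random.uniform(0, delay / 2))
    return response


# Demonstration with a stand-in request that is rate limited twice, then succeeds.
attempt_counter = {"count": 0}
recorded_delays = []


async def flaky_request() -> Dict[str, Any]:
    attempt_counter["count"] += 1
    if attempt_counter["count"] < 3:
        return {"status": "failed", "error": "Rate limit exceeded"}
    return {"status": "completed", "text": "ok"}


async def record_sleep(seconds: float) -> None:
    recorded_delays.append(seconds)  # record instead of actually waiting


final_response = asyncio.run(request_with_backoff(flaky_request, sleep=record_sleep))
```

In real use, send_request would be a small closure around llm_service.execute_llm_request with the desired provider and prompt, and the default asyncio.sleep would be left in place.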

Debugging Tips

# Enable debug logging
import logging
logging.getLogger('fiberwise_common.services.llm_provider_service').setLevel(logging.DEBUG)

# Check raw response
response = await llm_service.execute_llm_request(
    provider_id="my-provider",
    prompt="test"
)
print("Raw response:", response.get("_raw_response"))

# Test provider connection
provider = await llm_service.get_provider_by_id("my-provider")
if provider:
    print("Provider config:", provider["configuration"])
else:
    print("Provider not found")

Health Checks

async def health_check_provider(provider_id):
    """Test if a provider is working correctly"""

    try:
        response = await llm_service.execute_llm_request(
            provider_id=provider_id,
            prompt="Say 'OK' if you can respond",
            max_tokens=10,
            temperature=0
        )

        if response.get("status") == "completed":
            return {
                "provider_id": provider_id,
                "status": "healthy",
                "response_time": response.get("response_time"),
                "model": response.get("model")
            }
        else:
            return {
                "provider_id": provider_id,
                "status": "unhealthy",
                "error": response.get("error")
            }

    except Exception as e:
        return {
            "provider_id": provider_id,
            "status": "error",
            "error": str(e)
        }