
[Feature] Add huggingface as a provider #430

@akshaydeo


Overview

Add Hugging Face as a native provider in Bifrost to enable access to models from the Hugging Face Hub via the Inference API. This will allow users to leverage a wide range of open-source models for chat completions, text generation, embeddings, and other AI tasks.

Description

Hugging Face hosts thousands of open-source models that users may want to access through Bifrost. Adding native support will:

  • Enable access to popular models like meta-llama/Llama-3.1-8B-Instruct, mistralai/Mistral-7B-Instruct-v0.2, google/gemma-2-9b-it, etc.
  • Provide a unified interface for Hugging Face models alongside other providers
  • Support both free tier and paid Inference API endpoints
  • Enable embeddings from models like sentence-transformers/all-MiniLM-L6-v2

Implementation Requirements

Core Functionality

The Hugging Face provider should implement the Provider interface (an illustrative sketch follows this list) with support for:

  1. Chat Completions (Primary)

    • Support for /chat/completions endpoint (OpenAI-compatible format)
    • Both sync and streaming modes
    • Model selection via model ID (e.g., meta-llama/Llama-3.1-8B-Instruct)
  2. Text Completions (if supported by the model)

    • Support for /completions endpoint
    • Both sync and streaming modes
  3. Embeddings

    • Support for /embeddings endpoint
    • Access to embedding models from Hugging Face Hub
  4. List Models

    • Ability to list available models (may need to query the Hugging Face Hub API or use a predefined list)
  5. Responses API

    • Support for the Responses API (can be implemented on top of chat completions internally)
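
To make the required surface area concrete, here is an illustrative Go sketch. The interface, types, and method names below are assumptions standing in for Bifrost's real schemas in core/schemas; the actual signatures should follow whatever the existing Provider interface defines.

```go
package huggingface

import "context"

// Illustrative stand-ins for Bifrost's real request/response schemas
// (the actual types live in core/schemas and will differ).
type Message struct {
	Role    string
	Content string
}

type ChatRequest struct {
	Model    string // e.g. "meta-llama/Llama-3.1-8B-Instruct"
	Messages []Message
	Stream   bool
}

type ChatResponse struct {
	Content string
}

type EmbeddingRequest struct {
	Model string // e.g. "sentence-transformers/all-MiniLM-L6-v2"
	Input []string
}

type EmbeddingResponse struct {
	Vectors [][]float32
}

type Model struct {
	ID string
}

// Provider mirrors the capability list above: chat (sync and streaming),
// embeddings, and model listing; the Responses API can be layered on top
// of ChatCompletion, as noted in item 5.
type Provider interface {
	ChatCompletion(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
	ChatCompletionStream(ctx context.Context, req *ChatRequest) (<-chan *ChatResponse, error)
	Embedding(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
	ListModels(ctx context.Context) ([]Model, error)
}
```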

Technical Implementation

  1. Provider Registration

    • Add HuggingFace constant to ModelProvider type in core/schemas/bifrost.go
    • Register provider in createBaseProvider() function in core/bifrost.go
  2. Provider Structure

    • Create core/providers/huggingface/ directory
    • Implement provider struct following the pattern of existing providers (e.g., openai, anthropic)
    • Support for API key authentication via Authorization: Bearer <token> header
    • Base URL: https://api-inference.huggingface.co (or configurable); see the constructor sketch after this list
  3. API Compatibility

    • The Hugging Face Inference API exposes OpenAI-compatible endpoints
    • May need to handle model-specific differences
    • Support for streaming responses (Server-Sent Events); a parsing sketch follows this list
  4. Configuration

    • Support for the HUGGINGFACE_API_KEY or HF_TOKEN environment variables
    • Network config support (base URL, timeouts, retries)
    • Custom provider config support
  5. Error Handling

    • Proper error parsing and conversion to BifrostError format
    • Handle rate limiting and model-loading states (see the error-conversion sketch after this list)
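
For items 2 and 4, a minimal constructor sketch covering bearer-token auth, the default base URL, and the proposed env-var precedence. The struct shape and the NewProvider signature are hypothetical (NewProvider is what createBaseProvider() would invoke); only the base URL, the header format, and the HUGGINGFACE_API_KEY/HF_TOKEN names come from the requirements above.

```go
package huggingface

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// Default serverless Inference API endpoint from the requirements above;
// a custom base URL (e.g. a dedicated Inference Endpoint) can override it.
const defaultBaseURL = "https://api-inference.huggingface.co"

// HuggingFaceProvider is a hypothetical shape for the provider struct;
// the real one should follow the pattern of the openai/anthropic providers.
type HuggingFaceProvider struct {
	baseURL string
	apiKey  string
	client  *http.Client
}

// NewProvider resolves the API key with the precedence requested above:
// HUGGINGFACE_API_KEY first, then HF_TOKEN as a fallback.
func NewProvider(baseURL string) (*HuggingFaceProvider, error) {
	key := os.Getenv("HUGGINGFACE_API_KEY")
	if key == "" {
		key = os.Getenv("HF_TOKEN")
	}
	if key == "" {
		return nil, fmt.Errorf("huggingface: set HUGGINGFACE_API_KEY or HF_TOKEN")
	}
	if baseURL == "" {
		baseURL = defaultBaseURL
	}
	return &HuggingFaceProvider{
		baseURL: baseURL,
		apiKey:  key,
		client:  &http.Client{Timeout: 60 * time.Second},
	}, nil
}

// newRequest attaches the bearer token required by the Inference API.
func (p *HuggingFaceProvider) newRequest(method, path string, body io.Reader) (*http.Request, error) {
	req, err := http.NewRequest(method, p.baseURL+path, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+p.apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```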
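
For item 3, OpenAI-compatible streaming arrives as Server-Sent Events: data: lines carrying JSON chunks, terminated by data: [DONE]. A sketch of the scanning loop; decoding each chunk into Bifrost's streaming schema is left out.

```go
package huggingface

import (
	"bufio"
	"io"
	"strings"
)

// readSSE scans an OpenAI-style Server-Sent Events body and invokes onChunk
// with each raw JSON payload. The real implementation would decode chunks
// into Bifrost's streaming response type instead of passing raw strings.
func readSSE(body io.Reader, onChunk func(json string) error) error {
	scanner := bufio.NewScanner(body)
	// Allow chunks larger than Scanner's 64 KiB default line limit.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // ignore comments, event names, and blank keep-alives
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" { // OpenAI-compatible end-of-stream marker
			return nil
		}
		if err := onChunk(payload); err != nil {
			return err
		}
	}
	return scanner.Err()
}
```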
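
For item 5, the serverless Inference API reports errors as a JSON body (for example {"error": "Model ... is currently loading", "estimated_time": 20.0}, returned with HTTP 503 while a cold model spins up). A sketch of converting that into a gateway error; BifrostError below is a placeholder for the real type in core/schemas.

```go
package huggingface

import (
	"encoding/json"
	"io"
	"net/http"
)

// hfError matches the error payload the Inference API is known to return,
// e.g. {"error": "Model ... is currently loading", "estimated_time": 20.0}.
type hfError struct {
	Error         string  `json:"error"`
	EstimatedTime float64 `json:"estimated_time"`
}

// BifrostError is a stand-in for the real error type in core/schemas;
// the field names here are assumptions.
type BifrostError struct {
	Provider   string
	StatusCode int
	Message    string
	Retryable  bool
}

// parseError converts a non-2xx Inference API response into the gateway's
// error format, flagging model loading (503) and rate limits (429) as retryable.
func parseError(resp *http.Response) *BifrostError {
	body, _ := io.ReadAll(resp.Body)
	msg := string(body)
	var he hfError
	if json.Unmarshal(body, &he) == nil && he.Error != "" {
		msg = he.Error
	}
	return &BifrostError{
		Provider:   "huggingface",
		StatusCode: resp.StatusCode,
		Message:    msg,
		Retryable:  resp.StatusCode == http.StatusServiceUnavailable || resp.StatusCode == http.StatusTooManyRequests,
	}
}
```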

Files to Create/Modify

New Files:

  • core/providers/huggingface/huggingface.go - Main provider implementation
  • core/providers/huggingface/chat.go - Chat completion handlers
  • core/providers/huggingface/embedding.go - Embedding handlers
  • core/providers/huggingface/models.go - Model listing and types
  • core/providers/huggingface/responses.go - Response parsing
  • core/providers/huggingface/types.go - Type definitions
  • tests/core-providers/huggingface_test.go - Test setup

Files to Modify:

  • core/schemas/bifrost.go - Add HuggingFace ModelProvider constant
  • core/bifrost.go - Register Hugging Face provider in createBaseProvider()
  • framework/configstore/ - Update provider config handling if needed
  • transports/bifrost-http/ - Update HTTP transport if needed
  • ui/ - Add Hugging Face to UI provider list (if applicable)

API Reference

  • Hugging Face Inference API documentation: https://huggingface.co/docs/api-inference

Testing Considerations

  • Test with various model types (chat, text, embeddings); a test skeleton is sketched after this list
  • Test streaming vs non-streaming modes
  • Test error handling (invalid API key, model not found, rate limits)
  • Test with both free and paid Inference API tiers
  • Verify compatibility with existing Bifrost features (plugins, logging, etc.)
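
A possible skeleton for tests/core-providers/huggingface_test.go, skipping when no credentials are present. The request and assertion internals are left as TODOs since they depend on Bifrost's actual test harness.

```go
package huggingface_test

import (
	"os"
	"testing"
)

// TestChatCompletionModels runs one subtest per model from the issue's
// examples; it skips entirely when no Hugging Face credentials are set,
// so CI without secrets stays green.
func TestChatCompletionModels(t *testing.T) {
	if os.Getenv("HUGGINGFACE_API_KEY") == "" && os.Getenv("HF_TOKEN") == "" {
		t.Skip("no Hugging Face credentials; skipping live API tests")
	}
	models := []string{
		"meta-llama/Llama-3.1-8B-Instruct",
		"mistralai/Mistral-7B-Instruct-v0.2",
	}
	for _, model := range models {
		model := model // capture for the parallel subtest
		t.Run(model, func(t *testing.T) {
			t.Parallel()
			// TODO: build a chat request against `model`, run it in both
			// sync and streaming modes, and assert a non-empty response.
		})
	}
}
```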

Optional Enhancements

  • Support for custom inference endpoints (self-hosted Hugging Face endpoints)
  • Support for model-specific parameters
  • Integration with Hugging Face Hub API for dynamic model discovery
  • Support for task-specific endpoints (summarization, translation, etc.)

Acceptance Criteria

  • Hugging Face provider can be initialized and configured
  • Chat completions (sync and stream) work correctly
  • Embeddings endpoint is functional
  • List models returns available models
  • Error handling is comprehensive
  • Provider follows existing Bifrost patterns and conventions
  • Documentation is updated
  • Tests are added for the new provider
