Overview
Add Hugging Face as a native provider in Bifrost to enable access to models from the Hugging Face Hub via the Inference API. This will allow users to leverage a wide range of open-source models for chat completions, text generation, embeddings, and other AI tasks.
Description
Hugging Face hosts thousands of open-source models that users may want to access through Bifrost. Adding native support will:
- Enable access to popular models like meta-llama/Llama-3.1-8B-Instruct, mistralai/Mistral-7B-Instruct-v0.2, google/gemma-2-9b-it, etc.
- Provide a unified interface for Hugging Face models alongside other providers
- Support both free tier and paid Inference API endpoints
- Enable embeddings from models like sentence-transformers/all-MiniLM-L6-v2
Implementation Requirements
Core Functionality
The Hugging Face provider should implement the Provider interface with support for:
- Chat Completions (Primary)
  - Support for the /chat/completions endpoint (OpenAI-compatible format); a request-level sketch follows this list
  - Both sync and streaming modes
  - Model selection via model ID (e.g., meta-llama/Llama-3.1-8B-Instruct)
- Text Completions (if supported by the model)
  - Support for the /completions endpoint
  - Both sync and streaming modes
- Embeddings
  - Support for the /embeddings endpoint
  - Access to embedding models from the Hugging Face Hub
- List Models
  - Ability to list available models (may need to query the Hugging Face Hub API or use a predefined list)
- Responses API
  - Support for the Responses API (can use chat completions internally)
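For the endpoints above, the OpenAI-compatible chat call could look like the following minimal Go sketch. Everything here is illustrative: the ChatMessage/ChatRequest/ChatResponse types are placeholders rather than Bifrost's real schema types, and the /models/<id>/v1/chat/completions path is an assumption based on Hugging Face's OpenAI-compatibility docs that should be verified against the current API.

```go
package huggingface

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// Placeholder wire types; the real provider should map to Bifrost's schemas.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ChatRequest struct {
	Model    string        `json:"model"`
	Messages []ChatMessage `json:"messages"`
	Stream   bool          `json:"stream,omitempty"`
}

type ChatResponse struct {
	Choices []struct {
		Message ChatMessage `json:"message"`
	} `json:"choices"`
}

// ChatCompletion posts an OpenAI-style payload to the Inference API.
// The /models/<id>/v1/chat/completions route is assumed, not confirmed.
func ChatCompletion(ctx context.Context, client *http.Client, baseURL, token, model string, msgs []ChatMessage) (*ChatResponse, error) {
	body, err := json.Marshal(ChatRequest{Model: model, Messages: msgs})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/models/%s/v1/chat/completions", baseURL, model)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("huggingface: unexpected status %d", resp.StatusCode)
	}
	var out ChatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}
```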
Technical Implementation
- Provider Registration
  - Add a HuggingFace constant to the ModelProvider type in core/schemas/bifrost.go
  - Register the provider in the createBaseProvider() function in core/bifrost.go
- Provider Structure (a constructor sketch follows this list)
  - Create the core/providers/huggingface/ directory
  - Implement a provider struct following the pattern of existing providers (e.g., openai, anthropic)
  - Support API key authentication via the Authorization: Bearer <token> header
  - Base URL: https://api-inference.huggingface.co (or configurable)
- API Compatibility
  - The Hugging Face Inference API supports OpenAI-compatible endpoints
  - May need to handle model-specific differences
  - Support for streaming responses (Server-Sent Events)
- Configuration
  - Support for the HUGGINGFACE_API_KEY or HF_TOKEN environment variable
  - Network config support (base URL, timeouts, retries)
  - Custom provider config support
- Error Handling
  - Proper error parsing and conversion to the BifrostError format
  - Handle rate limiting and model loading states
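Tying the registration, structure, and configuration points together, a constructor might look like the sketch below. The Provider struct, the NewProvider signature, and the 60-second default timeout are all assumptions for illustration; the real implementation should follow the constructor pattern of the existing openai and anthropic providers.

```go
package huggingface

import (
	"net/http"
	"os"
	"time"
)

// Provider is a sketch of the struct the issue asks for; the fields are
// illustrative, not Bifrost's actual provider layout.
type Provider struct {
	client  *http.Client
	baseURL string
	token   string
}

// NewProvider resolves the API key from HUGGINGFACE_API_KEY or HF_TOKEN
// (as listed under Configuration) and falls back to the documented base URL.
func NewProvider(baseURL string) *Provider {
	token := os.Getenv("HUGGINGFACE_API_KEY")
	if token == "" {
		token = os.Getenv("HF_TOKEN")
	}
	if baseURL == "" {
		baseURL = "https://api-inference.huggingface.co"
	}
	return &Provider{
		client:  &http.Client{Timeout: 60 * time.Second},
		baseURL: baseURL,
		token:   token,
	}
}
```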
Files to Create/Modify
New Files:
- core/providers/huggingface/huggingface.go - Main provider implementation
- core/providers/huggingface/chat.go - Chat completion handlers
- core/providers/huggingface/embedding.go - Embedding handlers
- core/providers/huggingface/models.go - Model listing and types
- core/providers/huggingface/responses.go - Response parsing
- core/providers/huggingface/types.go - Type definitions (illustrative shapes below)
- tests/core-providers/huggingface_test - Test setup
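As a starting point for types.go, the embedding request/response shapes could look like this. These are illustrative only; the real definitions should mirror the /embeddings wire format and convert to Bifrost's schema types.

```go
package huggingface

// EmbeddingRequest is an assumed OpenAI-style /embeddings payload.
type EmbeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// EmbeddingResponse is the assumed matching response shape.
type EmbeddingResponse struct {
	Data []struct {
		Embedding []float64 `json:"embedding"`
		Index     int       `json:"index"`
	} `json:"data"`
}
```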
Files to Modify:
- core/schemas/bifrost.go - Add the HuggingFace ModelProvider constant
- core/bifrost.go - Register the Hugging Face provider in createBaseProvider()
- framework/configstore/ - Update provider config handling if needed
- transports/bifrost-http/ - Update the HTTP transport if needed
- ui/ - Add Hugging Face to the UI provider list (if applicable)
API Reference
- Hugging Face Inference API: https://huggingface.co/docs/api-inference
- OpenAI-compatible endpoints: https://huggingface.co/docs/api-inference/openai-compatible-endpoints
- Authentication: https://huggingface.co/docs/api-inference/authentication
Testing Considerations
- Test with various model types (chat, text, embeddings); a minimal test sketch follows this list
- Test streaming vs non-streaming modes
- Test error handling (invalid API key, model not found, rate limits)
- Test with both free and paid Inference API tiers
- Verify compatibility with existing Bifrost features (plugins, logging, etc.)
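A minimal live-API test could gate itself on the token so the suite stays green without credentials. This sketch reuses the hypothetical ChatCompletion helper from the earlier sketch; the import path is a guess at the repo layout and must be adjusted.

```go
package huggingface_test

import (
	"context"
	"net/http"
	"os"
	"testing"

	// Assumed import path; adjust to the actual module layout.
	huggingface "github.com/maximhq/bifrost/core/providers/huggingface"
)

func TestChatCompletion(t *testing.T) {
	token := os.Getenv("HF_TOKEN")
	if token == "" {
		t.Skip("HF_TOKEN not set; skipping live Inference API test")
	}
	resp, err := huggingface.ChatCompletion(
		context.Background(), http.DefaultClient,
		"https://api-inference.huggingface.co", token,
		"meta-llama/Llama-3.1-8B-Instruct",
		[]huggingface.ChatMessage{{Role: "user", Content: "Say hello"}},
	)
	if err != nil {
		t.Fatalf("chat completion failed: %v", err)
	}
	if len(resp.Choices) == 0 {
		t.Fatal("expected at least one choice")
	}
}
```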
Optional Enhancements
- Support for custom inference endpoints (self-hosted Hugging Face endpoints)
- Support for model-specific parameters
- Integration with Hugging Face Hub API for dynamic model discovery
- Support for task-specific endpoints (summarization, translation, etc.)
Acceptance Criteria
- Hugging Face provider can be initialized and configured
- Chat completions (sync and stream) work correctly
- Embeddings endpoint is functional
- List models returns available models
- Error handling is comprehensive
- Provider follows existing Bifrost patterns and conventions
- Documentation is updated
- Tests are added for the new provider