feat: huggingface provider added #1006
base: main
Conversation
📝 Walkthrough

Adds HuggingFace as a first-class provider (models, chat, streaming, embeddings, TTS/STT, model listing), implements a full HuggingFace provider with converters, types, network/routing utilities, and tests, adds the corresponding UI entries, and wires HUGGING_FACE_API_KEY into the CI/release workflows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant BifrostCore as Bifrost Core
    participant HFProvider as HuggingFaceProvider
    participant HFAPI as Hugging Face API
    Client->>BifrostCore: Request (chat / embedding / speech / list)
    BifrostCore->>HFProvider: Route request + key/config
    HFProvider->>HFProvider: splitIntoModelProvider / build URL / prepare body
    HFProvider->>HFAPI: HTTP request (sync or SSE stream)
    alt streaming
        HFAPI-->>HFProvider: SSE chunks
        HFProvider->>BifrostCore: Streamed BifrostStream events
        BifrostCore->>Client: Streamed events to client
    else non-stream
        HFAPI-->>HFProvider: JSON response
        HFProvider->>HFProvider: convert -> Bifrost response
        HFProvider->>BifrostCore: BifrostResponse (with latency/raw)
        BifrostCore->>Client: Final response
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings), ✅ Passed checks (3 passed)
🧪 Test Suite Available: This PR can be tested by a repository admin.
135b9b5 to 5baaee2 (Compare)
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1229: Thought content is emitted twice in the stream when delta contains both Content and Thought.

At lines 1207-1213, the code aggregates both `delta.Content` and `delta.Thought` into a single `contentDelta` string and emits it as `OutputTextDelta` at line 1218. Then at lines 1343-1377, if `hasThought` is true, the code separately emits `ReasoningSummaryTextDelta` containing the thought content again. Since these are sequential conditions (not mutually exclusive), when a delta contains both fields, thought appears in:
- The aggregated `OutputTextDelta` event (lines 1207-1213, 1218)
- The separate `ReasoningSummaryTextDelta` event (lines 1365-1377)

Clarify the intended behavior: should thought only appear in the reasoning delta event, or is it correct for it to appear in both? If thought should be separate, remove `delta.Thought` from the aggregation at lines 1207-1213.
🧹 Nitpick comments (6)
docs/features/unified-interface.mdx (1)
88-106: Verify Hugging Face capability matrix matches actual implementation

The Hugging Face row advertises ✅ for Text/Text (stream)/Chat/Chat (stream)/Embeddings/TTS/STT and ❌ for both Responses modes. Please double-check this against the actual HuggingFace provider implementation (especially Responses and any streaming audio paths) so the matrix doesn't drift from reality.
core/providers/huggingface/huggingface_test.go (1)
59-62: Move `client.Shutdown()` inside the subtest or use `t.Cleanup`.

`client.Shutdown()` is called unconditionally after `t.Run`, but if the subtest is run in parallel or skipped, this could cause issues. Consider using `t.Cleanup` for proper resource cleanup.

```diff
+	t.Cleanup(func() {
+		client.Shutdown()
+	})
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
```

core/providers/huggingface/speech.go (1)
27-28: Consider avoiding empty Parameters struct allocation.
`hfRequest.Parameters` is allocated on line 28 even if no parameters are actually mapped. Consider only allocating when there are parameters to set.

```diff
 // Map parameters if present
 if request.Params != nil {
-	hfRequest.Parameters = &HuggingFaceSpeechParameters{}
-
 	// Map generation parameters from ExtraParams if available
 	if request.Params.ExtraParams != nil {
 		genParams := &HuggingFaceTranscriptionGenerationParameters{}
 		// ... parameter mapping ...
-		hfRequest.Parameters.GenerationParameters = genParams
+		hfRequest.Parameters = &HuggingFaceSpeechParameters{
+			GenerationParameters: genParams,
+		}
 	}
 }
```

core/providers/huggingface/chat.go (1)
36-38: Errors from `sonic.Marshal` are silently ignored.

Multiple calls to `sonic.Marshal` discard the error using `_`. While marshalling simple structs rarely fails, silently ignoring errors could mask issues in edge cases (e.g., cyclic references, unusual types). Consider logging or returning an error when marshalling fails for better debuggability:

```go
contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
if err != nil {
	// At minimum, log the error for debugging
	if debug {
		fmt.Printf("[huggingface debug] Failed to marshal content: %v\n", err)
	}
	continue
}
```

Also applies to: 61-62, 189-194
core/providers/huggingface/huggingface.go (1)
51-85: Typo: "aquire" should be "acquire"

The function names use "aquire" instead of the correct spelling "acquire". While this doesn't affect functionality, it's a code quality issue that should be fixed for consistency and readability.

```diff
-func aquireHuggingFaceChatResponse() *HuggingFaceChatResponse {
+func acquireHuggingFaceChatResponse() *HuggingFaceChatResponse {

-func aquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {
+func acquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {

-func aquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {
+func acquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {
```

Don't forget to update the call sites at lines 348, 790, and 861.
core/providers/huggingface/types.go (1)
379-396: Silent error swallowing in UnmarshalJSON

The `UnmarshalJSON` method silently returns `nil` when both boolean and string unmarshaling fail, leaving the struct in an uninitialized state. This could mask malformed JSON input. Consider returning an error when the input doesn't match expected types:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
+
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```

This would require adding `"fmt"` to the imports.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
.github/workflows/pr-tests.yml (1 hunks), .github/workflows/release-pipeline.yml (4 hunks), Makefile (8 hunks), core/bifrost.go (2 hunks), core/internal/testutil/account.go (3 hunks), core/internal/testutil/chat_completion_stream.go (1 hunks), core/internal/testutil/responses_stream.go (1 hunks), core/providers/huggingface/chat.go (1 hunks), core/providers/huggingface/embedding.go (1 hunks), core/providers/huggingface/huggingface.go (1 hunks), core/providers/huggingface/huggingface_test.go (1 hunks), core/providers/huggingface/models.go (1 hunks), core/providers/huggingface/speech.go (1 hunks), core/providers/huggingface/transcription.go (1 hunks), core/providers/huggingface/types copy.go (1 hunks), core/providers/huggingface/types.go (1 hunks), core/providers/huggingface/utils.go (1 hunks), core/schemas/account.go (2 hunks), core/schemas/bifrost.go (3 hunks), core/schemas/mux.go (2 hunks), docs/apis/openapi.json (1 hunks), docs/contributing/adding-a-provider.mdx (1 hunks), docs/features/unified-interface.mdx (1 hunks), transports/config.schema.json (2 hunks), ui/README.md (1 hunks), ui/lib/constants/config.ts (2 hunks), ui/lib/constants/icons.tsx (1 hunks), ui/lib/constants/logs.ts (2 hunks)
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
transports/config.schema.jsoncore/internal/testutil/responses_stream.gocore/schemas/bifrost.gocore/providers/huggingface/embedding.godocs/apis/openapi.jsoncore/schemas/mux.godocs/features/unified-interface.mdxcore/internal/testutil/chat_completion_stream.goui/lib/constants/config.tscore/providers/huggingface/types copy.gocore/providers/huggingface/models.goui/README.mdcore/schemas/account.gocore/bifrost.gocore/providers/huggingface/transcription.goui/lib/constants/icons.tsxui/lib/constants/logs.tsMakefilecore/providers/huggingface/huggingface_test.gocore/providers/huggingface/speech.gocore/internal/testutil/account.godocs/contributing/adding-a-provider.mdxcore/providers/huggingface/utils.gocore/providers/huggingface/chat.gocore/providers/huggingface/huggingface.gocore/providers/huggingface/types.go
🧬 Code graph analysis (10)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
ModelProvider(171-174)
core/providers/huggingface/embedding.go (2)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest(9-16)BifrostEmbeddingResponse(22-28)EmbeddingData(118-122)EmbeddingStruct(124-128)core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest(278-288)HuggingFaceEmbeddingResponse(299-299)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1422-1460)ResponsesStreamResponseTypeOutputTextDelta(1370-1370)core/schemas/utils.go (1)
Ptr(16-18)
core/internal/testutil/chat_completion_stream.go (1)
core/internal/testutil/utils.go (1)
CreateBasicChatMessage(247-254)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(21-23)core/schemas/bifrost.go (13)
ModelProvider(32-32)RequestType(86-86)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)TextCompletionRequest(90-90)TextCompletionStreamRequest(91-91)ResponsesRequest(94-94)ResponsesStreamRequest(95-95)EmbeddingRequest(96-96)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/schemas/models.go (1)
Model(109-129)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
AzureKeyConfig(23-27)VertexKeyConfig(36-42)BedrockKeyConfig(53-60)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(88-120)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (5)
HuggingFaceSpeechRequest(304-310)HuggingFaceSpeechParameters(313-316)HuggingFaceTranscriptionGenerationParameters(342-359)HuggingFaceTranscriptionEarlyStopping(363-366)HuggingFaceSpeechResponse(319-323)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/utils.go (5)
core/providers/huggingface/huggingface.go (1)
HuggingFaceProvider(25-31)core/schemas/models.go (1)
BifrostListModelsRequest(23-34)core/providers/utils/utils.go (5)
GetRequestPath(219-239)MakeRequestWithContext(39-93)HandleProviderAPIError(317-337)CheckAndDecodeBody(423-431)NewBifrostOperationError(449-460)core/providers/huggingface/types.go (3)
HuggingFaceInferenceProviderMappingResponse(25-30)HuggingFaceInferenceProviderMapping(39-42)HuggingFaceHubError(263-266)core/schemas/provider.go (1)
ErrProviderResponseDecode(29-29)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (17)
BifrostChatRequest(12-19)ChatContentBlockTypeText(497-497)ChatContentBlockTypeImage(498-498)ChatAssistantMessage(541-545)ChatToolMessage(536-538)ChatToolChoiceStruct(339-344)BifrostResponseChoice(582-590)ChatMessageRole(415-415)ChatAssistantMessageToolCall(564-570)ChatAssistantMessageToolCallFunction(573-576)ChatNonStreamResponseChoice(605-608)BifrostLogProbs(593-598)ContentLogProb(632-637)LogProb(625-629)BifrostLLMUsage(640-647)ChatStreamResponseChoice(611-613)ChatStreamResponseChoiceDelta(616-622)core/providers/huggingface/types.go (12)
HuggingFaceChatRequest(47-67)HuggingFaceChatMessage(69-77)HuggingFaceContentItem(80-84)HuggingFaceImageRef(86-88)HuggingFaceToolCall(90-94)HuggingFaceFunction(96-100)HuggingFaceResponseFormat(102-105)HuggingFaceStreamOptions(114-116)HuggingFaceTool(118-121)HuggingFaceToolFunction(123-127)HuggingFaceChatResponse(129-136)HuggingFaceChatStreamResponse(190-199)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (29)
.github/workflows/release-pipeline.yml (1)
118-118: LGTM! Consistent HuggingFace API key propagation across release jobs.

The HUGGING_FACE_API_KEY has been properly added to all four release jobs (core, framework, plugins, bifrost-http) that require provider API keys for testing. The implementation is consistent with existing API key patterns and correctly sources from GitHub secrets.
Also applies to: 191-191, 268-268, 357-357
core/schemas/account.go (1)
54-56: LGTM! HuggingFaceKeyConfig follows established patterns.

The new `HuggingFaceKeyConfig` type and its integration into the `Key` struct follow the same pattern as existing provider configurations (Azure, Vertex, Bedrock). The implementation is consistent with:
- Similar `Deployments` map structure used by other providers
- Proper JSON tags with `omitempty`
- Go naming conventions
- Optional field design
Also applies to: 17-17
core/internal/testutil/account.go (1)
96-96: LGTM! HuggingFace test configuration added with conservative settings.

The HuggingFace provider has been properly integrated into the test account setup. Notable configuration choices:
- MaxRetries: 1 - Much lower than other providers (8-10), suggesting less reliability or desire to fail fast
- Timeout: 300 seconds - Higher than most providers (120s), indicating potentially longer response times
- Retry backoff: 2s-30s - Conservative settings for retry attempts
These settings appear intentional for the HuggingFace Inference API characteristics. Ensure these align with production use cases and adjust if needed based on actual performance data.
Also applies to: 259-266, 512-524
.github/workflows/pr-tests.yml (1)
118-118: LGTM! HuggingFace API key added to test environment.

The HUGGING_FACE_API_KEY has been properly added to the PR test workflow, consistent with the release pipeline changes and other provider API key patterns.
ui/README.md (1)
84-84: LGTM! Documentation updated with HuggingFace provider.

The README has been updated to include HuggingFace in the list of supported providers, keeping the documentation in sync with the code changes.
core/internal/testutil/responses_stream.go (1)
693-693: Verify the lifecycle streaming safety threshold of 300 is appropriate.

The response count safety check at line 693 uses a threshold of 300 chunks. However, the file shows different thresholds for different streaming scenarios: tool streaming (100), reasoning streaming (150), lifecycle streaming (300), and basic streaming (500). Before merging, clarify:
- Why lifecycle streaming allows 3x more chunks than tool streaming—is this intentional differentiation or oversight?
- Is 300 chunks a reasonable upper bound based on actual HuggingFace lifecycle streaming behavior?
- Consider adding logging when approaching thresholds to help diagnose unexpected verbosity (see the sketch below).
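Following the last bullet, a minimal illustration of what "log when approaching the threshold" could look like; the function name, the 80% warning point, and the limit values are assumptions for illustration, not code from the test utilities.

```go
package streamcheck

import "log"

// checkChunkCount returns false once the safety limit is hit and logs a
// warning when the stream gets within 80% of it (both values illustrative).
func checkChunkCount(count, limit int) bool {
	if count >= limit {
		log.Printf("stream aborted: %d chunks reached the safety limit of %d", count, limit)
		return false
	}
	if count*5 >= limit*4 { // >= 80% of the limit
		log.Printf("stream nearing safety limit: %d of %d chunks", count, limit)
	}
	return true
}
```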
transports/config.schema.json (1)
135-140: HuggingFace correctly wired into config schema

`providers.huggingface` reuses the generic `provider` schema and the semantic cache `provider` enum now includes `"huggingface"`, both consistent with other providers and with the new `ModelProvider` constant. No issues spotted.

Also applies to: 764-784
core/schemas/bifrost.go (1)
35-83: ModelProvider and provider lists updated consistently for HuggingFace
`HuggingFace` is added as a `ModelProvider` and included in both `SupportedBaseProviders` and `StandardProviders` with the correct `"huggingface"` identifier. This aligns with the rest of the provider plumbing.

ui/lib/constants/config.ts (1)
5-24: UI config: HuggingFace placeholder and key requirement look correct

The `huggingface` model placeholder and `isKeyRequiredByProvider` entry are consistent with other providers and with the expected HF auth model. No issues from a UI/config standpoint.

Also applies to: 26-44
ui/lib/constants/logs.ts (1)
2-20: Logging constants updated to recognize HuggingFace

Adding `"huggingface"` to `KnownProvidersNames` and `ProviderLabels` keeps the provider type and display labels consistent across the UI logging layer. Looks good.

Also applies to: 43-61
docs/apis/openapi.json (1)
3239-3259: OpenAPI ModelProvider enum now includes HuggingFace (and Cerebras)

Extending the `ModelProvider` enum to include `"huggingface"` (and `"cerebras"`) brings the public API spec in line with the backend/provider constants and config schema. The change is additive and backward-compatible.
1327-1328: LGTM!

The HuggingFace provider is correctly integrated into the provider factory, following the established pattern used by other providers that return only a pointer (like OpenAI, Anthropic, Mistral, Gemini).
core/providers/huggingface/embedding.go (2)
22-31: LGTM with a note on unsupported input types.

The conversion correctly handles `Text` and `Texts` input types. The comment on lines 29-30 appropriately documents that embedding/embeddings (int arrays) are not supported by HuggingFace feature extraction.
57-92: LGTM!

The response conversion correctly:
- Validates nil input
- Pre-allocates the slice with proper capacity
- Maps each embedding to the appropriate Bifrost structure
- Documents that HuggingFace doesn't return usage information
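As a generic illustration of the pattern these bullets describe (nil check, slice pre-allocated with capacity, per-item mapping); the types below are simplified stand-ins, not the real Bifrost embedding structs.

```go
package embeddingsketch

// toEmbeddingData mirrors the conversion shape described above using
// placeholder types: validate input, pre-allocate, then map each vector.
func toEmbeddingData(vectors [][]float32) []map[string]any {
	if vectors == nil {
		return nil
	}
	out := make([]map[string]any, 0, len(vectors)) // pre-allocate with capacity
	for i, v := range vectors {
		out = append(out, map[string]any{"index": i, "embedding": v})
	}
	return out
}
```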
core/providers/huggingface/models.go (2)
46-104: LGTM!

The `deriveSupportedMethods` function correctly:
- Normalizes the pipeline string for case-insensitive matching
- Uses a map to deduplicate methods
- Handles both pipeline tags and model tags for flexibility
- Returns a sorted, deduplicated list of supported methods
34-39: Code correctly distinguishes between `model.ID` and `model.ModelID` fields.

The distinction is valid: `model.ModelID` (maps to `"modelId"` in the HuggingFace API response) represents the model's user-facing identifier, while `model.ID` (maps to `"_id"` in the API response) represents the internal HuggingFace identifier. The code appropriately uses `ModelID` for the composite ID and Name, and `ID` for the HuggingFaceID field.
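For reference, the two identifiers map to JSON roughly as below; this struct is a simplified stand-in, not the actual `HuggingFaceModel` definition.

```go
package modelsketch

// hubModel shows the two identifier fields and their JSON tags side by side.
type hubModel struct {
	ID      string `json:"_id"`     // internal Hugging Face identifier
	ModelID string `json:"modelId"` // user-facing identifier, e.g. "openai/whisper-large-v3"
}
```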
94-116: LGTM!

The response conversion correctly maps the HuggingFace response to Bifrost format, with appropriate nil checks and documentation about missing usage/alignment data.
Makefile (2)
14-21: LGTM! Improved tool binary discovery for portable environments.

The logic for detecting Go binary paths (`GOBIN`, `GOPATH`, `DEFAULT_GOBIN`) and constructing tool paths (`AIR_BIN`, `GOTESTSUM_BIN`) is well-structured. This enables robust tool invocation without relying on global `PATH` configuration, which benefits environments like Nix.
65-69: Good safety check for preventing root execution in local development.

The guard against running as root on developer machines while allowing CI environments is appropriate. This prevents global npm install failures on systems like Nix.
core/providers/huggingface/transcription.go (1)
100-139: LGTM! Response conversion is well-implemented.

The `ToBifrostTranscriptionResponse` method properly validates inputs, handles optional timestamp chunks, and correctly maps them to `TranscriptionSegment` structures.
201-315: LGTM! Non-streaming response conversion is comprehensive.

The `ToBifrostChatResponse` method properly handles:
- Nil checks and model validation
- Choice conversion with message, role, content, and tool calls
- Logprobs conversion including nested top logprobs
- Usage information mapping
317-413: LGTM! Streaming response conversion is well-implemented.

The `ToBifrostChatStreamResponse` method correctly converts streaming delta responses, handling roles, content, reasoning (as thought), tool calls, and logprobs appropriately.
83-128: LGTM! URL building with proper encoding and ExtraParams handling.

The `buildModelHubURL` function correctly:
- Applies default and maximum limits
- URL-encodes all query parameters
- Handles various types in `ExtraParams` with appropriate conversions
- Constructs a well-formed URL with the inference provider filter
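A generic sketch of the URL-building pattern these bullets describe (limit clamping plus encoded query parameters). The function name, the default/maximum limits, the query parameter names, and the endpoint path are illustrative assumptions rather than the actual `buildModelHubURL` implementation.

```go
package hubsketch

import (
	"net/url"
	"strconv"
)

// buildHubURL clamps the limit and URL-encodes every query parameter.
func buildHubURL(base, inferenceProvider string, limit int, extra map[string]string) string {
	const defaultLimit, maxLimit = 50, 1000 // illustrative defaults
	if limit <= 0 {
		limit = defaultLimit
	}
	if limit > maxLimit {
		limit = maxLimit
	}

	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("inference_provider", inferenceProvider)
	for k, v := range extra {
		q.Set(k, v) // url.Values handles percent-encoding
	}
	return base + "/api/models?" + q.Encode()
}
```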
core/providers/huggingface/huggingface.go (3)
477-650: LGTM: Stream processing goroutine

The goroutine properly handles:
- Context cancellation checks before processing each chunk
- Deferred cleanup of resources (channel close, response release)
- Scanner buffer sizing for large responses
- Error handling with proper error propagation to the response channel
- Both regular chat streaming and ResponsesStream fallback modes
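For orientation, a minimal sketch of the streaming loop shape described above (per-chunk context check, enlarged scanner buffer, deferred cleanup). It is not the provider's actual goroutine; the channel payload and the SSE `data:` prefix handling are simplified.

```go
package hfstream

import (
	"bufio"
	"context"
	"io"
	"strings"
)

// streamLines forwards SSE data lines until the body ends or ctx is cancelled.
func streamLines(ctx context.Context, body io.Reader, out chan<- string) {
	defer close(out) // deferred cleanup of the output channel

	scanner := bufio.NewScanner(body)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow large chunks

	for scanner.Scan() {
		select {
		case <-ctx.Done(): // stop promptly on cancellation
			return
		default:
		}
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			out <- strings.TrimPrefix(line, "data: ")
		}
	}
}
```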
132-205: LGTM: Request handling with proper resource management

The `completeRequest` function correctly:
- Uses defer for cleanup of fasthttp resources
- Makes a copy of the response body before releasing to avoid use-after-free
- Handles error responses with proper error type extraction
- Supports debug logging controlled by environment variable
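The "copy the body before releasing" point is the easiest one to get wrong, so here is a hedged sketch of that pattern with fasthttp; it is not the actual `completeRequest` code.

```go
package hfclient

import "github.com/valyala/fasthttp"

// doRequest returns a copy of the response body that stays valid after the
// pooled fasthttp objects are released.
func doRequest(client *fasthttp.Client, url string) ([]byte, int, error) {
	req := fasthttp.AcquireRequest()
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseRequest(req)
	defer fasthttp.ReleaseResponse(resp)

	req.SetRequestURI(url)
	if err := client.Do(req, resp); err != nil {
		return nil, 0, err
	}

	// resp.Body() is only valid until ReleaseResponse, so copy it first.
	body := append([]byte(nil), resp.Body()...)
	return body, resp.StatusCode(), nil
}
```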
87-120: LGTM: Provider initialization

Provider initialization correctly handles configuration defaults, client setup with timeouts, pool pre-warming, proxy configuration, and base URL normalization.
core/providers/huggingface/types.go (3)
1-43: LGTM: Model and inference provider types

The model metadata types and inference provider mapping structures are well-defined with appropriate JSON tags for API compatibility.
44-260: LGTM: Chat completion types

Comprehensive type definitions for chat requests/responses with:
- Support for both streaming and non-streaming responses
- Flexible content handling via `json.RawMessage`
- Tool/function calling support
- Logprobs support
- Time info for streaming diagnostics
274-411: LGTM: Embedding, Speech, and Transcription types

Well-structured types for:
- Embedding with flexible input types and encoding format options
- Speech synthesis with generation parameters
- Transcription with timestamp support and generation configuration
- Type aliases for backward compatibility
5baaee2 to 5a72875 (Compare)
Actionable comments posted: 1
♻️ Duplicate comments (12)
docs/contributing/adding-a-provider.mdx (2)
500-527: Variable name inconsistency in documentation example (duplicate comment).

The example declares `hfReq` on line 500 but then references `providerReq` in the parameter mapping section. This inconsistency was flagged in a past review and remains unresolved. Replace all `providerReq` references with `hfReq` for consistency.

Apply this diff to fix the variable names:

```diff
 // Build the request
 hfReq := &HuggingFaceChatRequest{
     Model:    bifrostReq.Model,
     Messages: hfMessages,
 }

 // Map parameters
 if bifrostReq.Params != nil {
     params := bifrostReq.Params

     // Map standard parameters
     if params.Temperature != nil {
-        providerReq.Temperature = params.Temperature
+        hfReq.Temperature = params.Temperature
     }
     if params.MaxTokens != nil {
-        providerReq.MaxTokens = params.MaxTokens
+        hfReq.MaxTokens = params.MaxTokens
     }
     // ... other standard parameters

     // Handle provider-specific ExtraParams
     if params.ExtraParams != nil {
         if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-            providerReq.CustomParam = &customParam
+            hfReq.CustomParam = &customParam
         }
     }
 }

-return providerReq
+return hfReq
```
1405-1427: Incomplete/truncated code example (duplicate comment).

The code example is cut off mid-function at the tool/function calling section, which may confuse contributors trying to implement converters. This was flagged in a past review. Either complete the example with the tool calling logic and final return, or explicitly mark it as abbreviated with a comment like:

```go
	// Tool/Function calling
	if params.Tools != nil && len(params.Tools) > 0 {
		// Convert tools...
	}
	}

	return hfReq
}
```

Consider adding clarity about why the example is truncated and pointing contributors to full working examples in `core/providers/huggingface/` or `core/providers/anthropic/`.

core/providers/huggingface/huggingface_test.go (1)
29-33: TranscriptionModel and SpeechSynthesisModel are swapped.

Based on model capabilities:
- `Kokoro-82M` is a text-to-speech (TTS) model → should be `SpeechSynthesisModel`
- `whisper-large-v3` is a speech-to-text (transcription) model → should be `TranscriptionModel`

Apply this diff to fix the model assignments:

```diff
-	TranscriptionModel:   "fal-ai/hexgrad/Kokoro-82M",
-	SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3",
+	TranscriptionModel:   "fal-ai/openai/whisper-large-v3",
+	SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M",
 	SpeechSynthesisFallbacks: []schemas.Fallback{
-		{Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"},
+		{Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"},
 	},
```
37-63: Type assertions for `int` will fail when values come from JSON unmarshaling.

When `ExtraParams` is populated from JSON (e.g., from HTTP request bodies), numeric values are unmarshaled as `float64`, not `int`. These type assertions will silently fail, and the parameters won't be set.

Handle both `int` and `float64` types for all integer parameters. Example for `max_new_tokens`:

```diff
-	if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
-		genParams.MaxNewTokens = &val
+	if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok {
+		intVal := int(val)
+		genParams.MaxNewTokens = &intVal
+	} else if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
+		genParams.MaxNewTokens = &val
 	}
```

Apply the same pattern to: `max_length`, `min_length`, `min_new_tokens`, `num_beams`, `num_beam_groups`, and `top_k`.

Consider extracting a helper function to reduce code duplication:

```go
func getIntParam(params map[string]any, key string) *int {
	if val, ok := params[key].(float64); ok {
		intVal := int(val)
		return &intVal
	}
	if val, ok := params[key].(int); ok {
		return &val
	}
	return nil
}
```
14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.

```diff
 if request.Input == nil {
-	return nil, fmt.Errorf("speech request input cannot be nil")
+	return nil, fmt.Errorf("transcription request input cannot be nil")
 }
```
38-63: Type assertions for `int` will fail when values come from JSON unmarshaling.

Same issue as in `speech.go` - when `ExtraParams` comes from JSON, numeric values are `float64`, not `int`. These assertions will silently fail.

Handle both `int` and `float64` types. Consider creating a shared helper function in the package to avoid code duplication across `speech.go` and `transcription.go`:

```go
// In a shared utils file
func getIntFromExtra(params map[string]any, key string) *int {
	if val, ok := params[key].(float64); ok {
		intVal := int(val)
		return &intVal
	}
	if val, ok := params[key].(int); ok {
		return &val
	}
	return nil
}
```

Then use it as:

```go
genParams.MaxNewTokens = getIntFromExtra(request.Params.ExtraParams, "max_new_tokens")
```
69-81: Critical: Nil pointer dereference remains unaddressed.

Line 74 still dereferences `tc.Function.Name` without checking for nil, which can cause a panic.

Apply the previously suggested fix:

```diff
 for _, tc := range msg.ChatAssistantMessage.ToolCalls {
+	if tc.Function.Name == nil {
+		continue // Skip tool calls without a function name
+	}
 	hfToolCall := HuggingFaceToolCall{
 		ID:   tc.ID,
 		Type: tc.Type,
 		Function: HuggingFaceFunction{
 			Name:      *tc.Function.Name,
 			Arguments: tc.Function.Arguments,
 		},
 	}
 	hfToolCalls = append(hfToolCalls, hfToolCall)
 }
```
277-293: Critical: Wrong provider constant remains unfixed.

Line 279 still uses `schemas.Gemini` instead of `schemas.HuggingFace` in the operation check, causing incorrect permission validation.

Apply the previously suggested fix:

```diff
-if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
 	return nil, err
 }
```
700-707: Minor: Wrong request type in Embedding error messages remains unfixed.

Lines 701 and 706 use `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` in error messages, which will confuse debugging.

Apply the previously suggested fix:

```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }
```
773-776: Major: Wrong task type check in Speech remains unfixed.

Line 774 checks for `"automatic-speech-recognition"` (transcription task) instead of `"text-to-speech"` (speech generation task). This will cause Speech operations to incorrectly validate against the wrong task type.

Apply the previously suggested fix:

```diff
 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
 	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
```
831-838: Major: Wrong task type and request type in Transcription remain unfixed.

Two issues here:
- Lines 832 and 837 use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` in error messages
- Line 836 checks for the `"text-to-speech"` task instead of `"automatic-speech-recognition"` (the tasks are swapped with the Speech function)

Apply the previously suggested fix:

```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
```
21-23: Critical: Struct definition may not match API response format.

The HuggingFace `/api/models` endpoint returns a JSON array directly, but `HuggingFaceListModelsResponse` expects an object with a `models` field. This mismatch will cause unmarshaling to fail.

Run this script to verify the API response format:

```bash
#!/bin/bash
# Check the actual response format from HuggingFace API
curl -s "https://huggingface.co/api/models?limit=1" | jq -c 'if type == "array" then "ARRAY" else "OBJECT with keys: " + (keys | join(", ")) end'
```

If the API returns `"ARRAY"`, change the response type to a slice or add custom `UnmarshalJSON`:

```diff
-type HuggingFaceListModelsResponse struct {
-	Models []HuggingFaceModel `json:"models"`
-}
+type HuggingFaceListModelsResponse []HuggingFaceModel
```

And update usage in `listModelsByKey` (line ~256 in huggingface.go):

```diff
-response := huggingfaceAPIResponse.ToBifrostListModelsResponse(providerName)
+response := ToBifrostListModelsResponse(huggingfaceAPIResponse, providerName)
```
🧹 Nitpick comments (8)
docs/contributing/adding-a-provider.mdx (1)
43-43: Hyphenate compound adjectives: "OpenAI-compatible" throughout documentation.

For consistency and grammatical correctness, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible":

```diff
-#### Non-OpenAI Compatible Providers
+#### Non-OpenAI-compatible Providers

-#### OpenAI Compatible Providers
+#### OpenAI-compatible Providers

-### OpenAI Compatible Providers
+### OpenAI-compatible Providers

-### For OpenAI Compatible Providers
+### For OpenAI-compatible Providers
```

Apply these changes throughout the document to improve consistency and readability.

Also applies to: 71-71, 629-629, 1469-1469
Makefile (3)
14-21: Binary path variables introduce deterministic tool discovery—good for portability.

The approach of chaining GOBIN → GOPATH/bin → default Go locations is sound and addresses real pain points on systems like Nix. However, Line 21's fallback to `which` may not handle absolute paths reliably:

```makefile
GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))
```

`which` typically expects command names (searched in PATH), not absolute paths. If `which` is called with a full path, it may fail unexpectedly on some systems. Consider using `command -v` or simplifying to just fall back to the bare name:

```diff
-GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))
+GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,gotestsum)
```

The `|| echo gotestsum` ensures the variable is never empty, but the bare name should be sufficient; downstream checks like Line 103 already verify existence.
89-100: install-air logic is sound but the binary availability check is convoluted (Line 97).

The condition:

```makefile
if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then
```

…will work (warns only if the full path doesn't exist), but it's unnecessarily complex. Since `INSTALLED` is a full path, `which` on an absolute path may fail unexpectedly on some shells. Simplify to:

```diff
-	if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+	if [ ! -x "$$INSTALLED" ]; then \
```
102-113: install-gotestsum has the same `which` issue and convoluted logic (Lines 103, 110).

Line 103:

```makefile
if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then
```

If GOTESTSUM_BIN is a full path like `/home/user/go/bin/gotestsum`, `which` on it will fail. Simplify to just check the file:

```diff
-	if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+	if [ -x "$(GOTESTSUM_BIN)" ]; then \
```

Line 110 has the same convoluted conditional as Line 97—apply the same simplification:

```diff
-	if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+	if [ ! -x "$$INSTALLED" ]; then \
```

core/schemas/mux.go (1)
1146-1221: Confirm intent of mixing `delta.Thought` into `output_text.delta` and potential duplication

The new logic now (a) enters the text path when either `hasContent` or `hasThought` is true and (b) builds `contentDelta` by concatenating `delta.Content` and `delta.Thought`, then uses that for `ResponsesStreamResponseTypeOutputTextDelta.Delta`, while still emitting a separate `ResponsesStreamResponseTypeReasoningSummaryTextDelta` based on `delta.Thought`.

This is a behavioral change from "text delta only reflects `Content`" to "text delta reflects `Content` + `Thought`", and also means the same `Thought` tokens appear in both the output-text and reasoning-summary streams. If `Thought` is intended to remain non-user-visible reasoning, this could leak it into the primary text channel and may surprise existing consumers that only look at `output_text.delta`.

Please double-check that:
- Existing UIs/clients that consume `output_text.delta` are meant to see `Thought` text, and
- They can tolerate `Thought` appearing both in `output_text.delta` and `reasoning_summary_text.delta`.

If the goal was only to advance lifecycle (item creation / closing) when thought-only chunks arrive, a narrower change that gates on `hasThought` but keeps `Delta` based solely on `Content` might be safer; a sketch of that refactor follows, to be adapted once the desired semantics are confirmed.
59-63: Consider using defer for client.Shutdown() to ensure cleanup on panic.

If `RunAllComprehensiveTests` panics, `client.Shutdown()` won't be called, potentially leaving resources unreleased.

```diff
+	defer client.Shutdown()
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
```

core/providers/huggingface/chat.go (2)
34-64: Consider handling marshaling errors for content conversion.

Lines 37 and 61 discard marshaling errors when converting content. While the API may reject invalid payloads, explicitly handling these errors would make debugging easier.

Consider logging or returning errors:

```diff
 if msg.Content.ContentStr != nil {
-	contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr)
+	contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
+	if err != nil {
+		// Log warning or return error
+		continue
+	}
 	hfMsg.Content = json.RawMessage(contentJSON)
 }
```

Apply similar handling at line 61 for ContentBlocks.
136-195: Consider handling errors in ResponseFormat and ToolChoice conversions.

Lines 138-144 and 189-193 silently discard marshaling errors when converting `ResponseFormat` and `ToolChoice`. Since these are optional but important parameters, logging failures would aid debugging.

Example for ResponseFormat:

```diff
 if params.ResponseFormat != nil {
 	responseFormatJSON, err := sonic.Marshal(params.ResponseFormat)
-	if err == nil {
+	if err != nil {
+		// Log warning: failed to marshal ResponseFormat
+	} else {
 		var hfResponseFormat HuggingFaceResponseFormat
 		if err := sonic.Unmarshal(responseFormatJSON, &hfResponseFormat); err == nil {
 			hfReq.ResponseFormat = &hfResponseFormat
+		} else {
+			// Log warning: failed to unmarshal ResponseFormat
 		}
 	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)Makefile(8 hunks)core/bifrost.go(2 hunks)core/internal/testutil/account.go(3 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types copy.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(2 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/features/unified-interface.mdx(1 hunks)transports/config.schema.json(2 hunks)ui/README.md(1 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- core/providers/huggingface/utils.go
🚧 Files skipped from review as they are similar to previous changes (14)
- core/providers/huggingface/types copy.go
- ui/README.md
- docs/features/unified-interface.mdx
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- ui/lib/constants/logs.ts
- core/providers/huggingface/embedding.go
- transports/config.schema.json
- ui/lib/constants/config.ts
- .github/workflows/release-pipeline.yml
- core/schemas/bifrost.go
- core/internal/testutil/chat_completion_stream.go
- core/internal/testutil/account.go
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/huggingface/huggingface_test.gocore/providers/huggingface/models.gocore/providers/huggingface/chat.gocore/providers/huggingface/speech.gocore/schemas/mux.goMakefilecore/providers/huggingface/transcription.gocore/bifrost.goui/lib/constants/icons.tsxcore/schemas/account.godocs/contributing/adding-a-provider.mdxcore/providers/huggingface/types.gocore/providers/huggingface/huggingface.go
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
SetupTest(51-60)core/internal/testutil/account.go (2)
ComprehensiveTestConfig(47-64)TestScenarios(22-44)core/schemas/bifrost.go (2)
HuggingFace(51-51)Fallback(131-134)core/schemas/models.go (1)
Model(109-129)core/internal/testutil/tests.go (1)
RunAllComprehensiveTests(15-62)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(21-23)core/schemas/bifrost.go (13)
ModelProvider(32-32)RequestType(86-86)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)TextCompletionRequest(90-90)TextCompletionStreamRequest(91-91)ResponsesRequest(94-94)ResponsesStreamRequest(95-95)EmbeddingRequest(96-96)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/schemas/models.go (2)
BifrostListModelsResponse(36-45)Model(109-129)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (5)
HuggingFaceSpeechRequest(304-310)HuggingFaceSpeechParameters(313-316)HuggingFaceTranscriptionGenerationParameters(342-359)HuggingFaceTranscriptionEarlyStopping(363-366)HuggingFaceSpeechResponse(319-323)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1422-1460)ResponsesStreamResponseTypeOutputTextDelta(1370-1370)core/schemas/utils.go (1)
Ptr(16-18)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(88-120)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
AzureKeyConfig(23-27)VertexKeyConfig(36-42)BedrockKeyConfig(53-60)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (9)
Makefile (2)
155-155: Variable substitutions are correct and enable deterministic test execution.

The use of `$(AIR_BIN)` and `$(GOTESTSUM_BIN)` throughout the dev and test targets is consistent and properly formatted. This decoupling from PATH-based tool discovery is essential for reliable test execution, especially for the new HuggingFace provider tests.
24-29: Color vars, root check, and next installation fallback are sound.
- Color refactor (lines 24–29): Using
printfcentralizes ANSI code definitions—good for maintainability.- Root user guard (lines 67–70): Defensive check prevents npm permission issues on dev machines—appropriate.
- Next installation fallback (lines 75–85): Thoughtful multi-level strategy (local, then npx, then global) handles Nix-like environments well.
Also applies to: 66-70, 75-85
core/schemas/account.go (1)
54-56: LGTM! Consistent with existing provider key config patterns.

The `HuggingFaceKeyConfig` type follows the established pattern used by Azure, Vertex, and Bedrock configurations, with a `Deployments` map for model-to-deployment mapping.

core/bifrost.go (1)
1327-1328: LGTM! Provider integration follows established patterns.

The HuggingFace provider case follows the same pattern as other providers like OpenAI, Anthropic, and Gemini that return a provider instance without an error.
core/providers/huggingface/speech.go (1)
94-116: LGTM! Response conversion is straightforward and correct.

The `ToBifrostSpeechResponse` properly validates inputs and maps the response fields. The comment about missing usage/alignment data is helpful for future maintainers.

core/providers/huggingface/models.go (3)
16-44: LGTM! Model list conversion handles edge cases properly.

The method correctly:
- Skips models with empty `ModelID`
- Skips models without derivable supported methods
- Pre-allocates the result slice with appropriate capacity
- Creates a composite ID with the provider prefix
46-104: LGTM! Comprehensive method derivation from pipeline and tags.

The function properly:
- Normalizes the pipeline tag
- Uses a set to deduplicate methods
- Handles both primary pipeline tags and secondary model tags
- Returns a sorted slice for consistent output
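A generic illustration of the normalize/dedupe/sort shape these bullets describe; it is not the actual `deriveSupportedMethods` code, and the pipeline-to-method mapping below is made up for the example.

```go
package methodsketch

import (
	"sort"
	"strings"
)

// deriveMethods maps a pipeline tag plus model tags to a sorted, de-duplicated
// list of supported request types (mapping values are illustrative only).
func deriveMethods(pipelineTag string, tags []string) []string {
	set := map[string]struct{}{}

	switch strings.ToLower(strings.TrimSpace(pipelineTag)) { // normalize
	case "text-generation", "conversational":
		set["chat/completions"] = struct{}{}
	case "feature-extraction":
		set["embeddings"] = struct{}{}
	}
	for _, tag := range tags { // secondary signals from model tags
		if strings.EqualFold(tag, "automatic-speech-recognition") {
			set["audio/transcriptions"] = struct{}{}
		}
	}

	out := make([]string, 0, len(set))
	for m := range set {
		out = append(out, m)
	}
	sort.Strings(out) // deterministic output
	return out
}
```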
11-14: These constants are used in `core/providers/huggingface/utils.go` (lines 90, 92–93), so no action is needed.

core/providers/huggingface/transcription.go (1)
100-139: LGTM! Response conversion with proper segment mapping.

The method correctly:
- Validates non-nil receiver and non-empty model name
- Maps chunks to segments with proper ID assignment
- Safely handles timestamp arrays with bounds checking before access
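A small sketch of the bounds-checked timestamp mapping the last bullet refers to; the chunk and segment types are simplified stand-ins, not the real HuggingFace or Bifrost structs.

```go
package transcriptsketch

type chunk struct {
	Text      string
	Timestamp []float64 // expected [start, end], but may be shorter or empty
}

type segment struct {
	ID         int
	Text       string
	Start, End float64
}

// toSegments assigns sequential IDs and only indexes timestamps that exist.
func toSegments(chunks []chunk) []segment {
	segs := make([]segment, 0, len(chunks))
	for i, c := range chunks {
		s := segment{ID: i, Text: c.Text}
		if len(c.Timestamp) > 0 { // bounds check before access
			s.Start = c.Timestamp[0]
		}
		if len(c.Timestamp) > 1 {
			s.End = c.Timestamp[1]
		}
		segs = append(segs, s)
	}
	return segs
}
```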
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1220: Remove duplicate delta handling for Thought content.

The code processes `delta.Thought` twice with conflicting approaches:
- Lines 1207-1213: Concatenates `Thought` with `Content` into a single `contentDelta` string emitted as `ResponsesStreamResponseTypeOutputTextDelta`
- Lines 1369-1377: Emits the same `delta.Thought` separately as `ResponsesStreamResponseTypeReasoningSummaryTextDelta`

This means consumers receive the thought content both mixed into regular text deltas and as a separate reasoning delta event. Keep the separate reasoning delta emission (lines 1369-1377) and remove the concatenation of `Thought` into `contentDelta` (lines 1211-1212).
♻️ Duplicate comments (15)
core/providers/huggingface/types.go (1)
21-23: Struct definition doesn't match HuggingFace API response format.

This issue was previously flagged: The HuggingFace `/api/models` endpoint returns a JSON array directly, but `HuggingFaceListModelsResponse` expects an object with a `models` field. This will cause unmarshaling failures. Either change the response type to `[]HuggingFaceModel` or add custom `UnmarshalJSON` logic.

docs/contributing/adding-a-provider.mdx (2)
510-527: Variable name inconsistency in documentation example.

The example code references `providerReq` (e.g., `providerReq.Temperature`), but the variable was named `hfReq` on line 500. This inconsistency could confuse contributors.
1406-1427: Code example is truncated mid-function.The code block is incomplete, cutting off inside the parameter mapping section. This could confuse contributors trying to follow the pattern.
core/providers/huggingface/transcription.go (2)
14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.
38-79: Type assertions for `int` will fail when values come from JSON unmarshalling.

When `ExtraParams` is populated from JSON (e.g., from request bodies), numeric values are unmarshalled as `float64`, not `int`. The type assertions like `.(int)` on lines 38, 41, 44, 47, 50, 53, and 62 will silently fail, causing these parameters to be ignored. Consider handling both `int` and `float64` types for all integer parameters (`max_new_tokens`, `max_length`, `min_length`, `min_new_tokens`, `num_beams`, `num_beam_groups`, `top_k`).

core/providers/huggingface/speech.go (1)
37-77: Type assertions for `int` will fail when values come from JSON unmarshalling.

When `ExtraParams` is populated from JSON, numeric values are unmarshalled as `float64`, not `int`. The type assertions like `.(int)` on lines 37, 40, 43, 46, 49, 52, and 61 will silently fail, causing these parameters to be ignored. Consider handling both `int` and `float64` types for all integer parameters.

core/providers/huggingface/chat.go (1)

69-78: Potential nil pointer dereference when accessing `tc.Function.Name`.

On line 74, `tc.Function.Name` is dereferenced without checking if it's nil. If the tool call has a nil `Name` field, this will cause a panic. Add a nil check before dereferencing or skip tool calls without function names.
core/providers/huggingface/utils.go (3)
136-162: Empty provider and model names when input has no slashes.

When `splitIntoModelProvider` receives a model name with no slashes (t == 0), both `prov` and `model` remain empty strings. Downstream code at lines 313 and 398 in `huggingface.go` formats this as `"" : ""`, resulting in malformed model identifiers. Handle the t == 0 case by setting a default provider (e.g., "hf-inference") and using the input as the model name.
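A hedged sketch of that fallback; the function below is a simplified stand-in for `splitIntoModelProvider`, the "hf-inference" default mirrors the suggestion above, and the single-slash case is an assumption about how plain hub IDs should be treated.

```go
package utilsketch

import "strings"

// splitModelProvider returns (inferenceProvider, model) and falls back to a
// default provider instead of returning empty strings when there is no slash.
func splitModelProvider(input string) (provider, model string) {
	const defaultProvider = "hf-inference" // suggested fallback, not current behaviour

	switch strings.Count(input, "/") {
	case 0:
		return defaultProvider, input // "gpt2" -> ("hf-inference", "gpt2")
	case 1:
		return defaultProvider, input // "org/model" treated as a plain hub model ID
	default:
		parts := strings.SplitN(input, "/", 2) // "fal-ai/org/model" -> ("fal-ai", "org/model")
		return parts[0], parts[1]
	}
}
```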
164-197: Incomplete provider routing in `getInferenceProviderRouteURL` — 13 of 19 defined providers will error.

The function only handles 6 providers (`fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, `scaleway`) while `INFERENCE_PROVIDERS` defines 19 total. Providers like `cerebras`, `cohere`, `groq`, `featherlessAI`, `fireworksAI`, `hyperbolic`, `novita`, `nscale`, `ovhcloud`, `publicai`, `together`, `wavespeed`, and `zaiOrg` will hit the default error case. This affects embedding, speech, and transcription operations. Either expand routing logic for all providers or remove unsupported ones from `INFERENCE_PROVIDERS`.
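If the routing is expanded to cover every entry, a table-driven shape keeps the switch from growing with each provider. The sketch below is illustrative only: the map entries and path templates are placeholders, not the real provider endpoints.

```go
package routesketch

import "fmt"

// routeTemplates maps an inference provider to a URL path template.
// Entries and templates are placeholders; fill one per INFERENCE_PROVIDERS entry.
var routeTemplates = map[string]string{
	"hf-inference": "/models/%s",
	"fal-ai":       "/%s",
	"replicate":    "/v1/models/%s/predictions",
}

func routeURL(baseURL, provider, model string) (string, error) {
	tmpl, ok := routeTemplates[provider]
	if !ok {
		return "", fmt.Errorf("unsupported inference provider: %s", provider)
	}
	return baseURL + fmt.Sprintf(tmpl, model), nil
}
```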
180-191: Copy-paste error: Wrong provider names in error messages.

Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" (line 184) and "scaleway" (line 190).
core/providers/huggingface/huggingface.go (5)
277-281: Bug: Wrong provider constant used in ListModels.

`CheckOperationAllowed` is called with `schemas.Gemini` instead of `schemas.HuggingFace`. This is a copy-paste error and will cause incorrect operation permission checks.
295-301: Bug: Wrong request types in unsupported operation errors.

Both `TextCompletion` and `TextCompletionStream` return errors with `schemas.EmbeddingRequest` instead of the correct request types (`schemas.TextCompletionRequest` and `schemas.TextCompletionStreamRequest`).
700-707: Bug: Wrong request type in Embedding error messages.

The error messages reference `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` when the embedding operation is unsupported (lines 701 and 706).
773-776: Bug: Wrong task type check in Speech.

The Speech function checks for the `"automatic-speech-recognition"` task, but Speech (text-to-speech) should check for `"text-to-speech"`. The task checks appear to be swapped between the Speech and Transcription functions.
831-838: Bug: Wrong task type check and error messages in Transcription.

Two issues:
- The error messages use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` (lines 832, 837)
- Line 836 checks for the `"text-to-speech"` task, but Transcription should check for `"automatic-speech-recognition"`
🧹 Nitpick comments (4)
Makefile (2)
88-100: Extract repeated DEFAULT_GOBIN logic into a shared variable for DRY.

Lines 95 and 108 both replicate the DEFAULT_GOBIN fallback logic:

```makefile
$(if $(strip $(GOBIN)),$(GOBIN)/gotestsum,$(if $(strip $(GOPATH)),$(GOPATH)/bin/gotestsum,...))
```

This can be defined once and reused, improving maintainability and consistency across targets. Consider extracting into a helper pattern (or simply reusing DEFAULT_GOBIN):

```makefile
INSTALLED_AIR_PATH := $(DEFAULT_GOBIN)/air
INSTALLED_GOTESTSUM_PATH := $(DEFAULT_GOBIN)/gotestsum
```

Then reference these in the informational messages instead of computing the logic twice.

Also applies to: 102-113
103-103: Minor: Quote GOTESTSUM_BIN in which invocation for robustness.

Line 103 uses `which $(GOTESTSUM_BIN)` without quotes. While binary names rarely contain spaces, shell best practice is to quote variable expansions:

```makefile
which "$(GOTESTSUM_BIN)" > /dev/null 2>&1
```

core/providers/huggingface/types.go (2)
379-396: Consider returning an error for invalid early_stopping values.

The `UnmarshalJSON` method silently ignores invalid values by returning `nil` at line 395. If the JSON value is neither a boolean nor a string, this will leave the field in an indeterminate state. Consider returning an error to surface invalid API responses:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```
66-66: Extra field won't capture unknown fields with `json:"-"` tag.

The `Extra` field at line 66 has the `json:"-"` tag, which excludes it from JSON marshaling/unmarshaling. This means it won't capture unknown additional fields as the comment suggests. If you want to capture unknown fields, remove the `json:"-"` tag or use a different approach. If this field is for internal use only (not from/to JSON), the current implementation is correct but the comment should be clarified.
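For reference, one common "different approach" is a custom unmarshaller that keeps the unrecognized keys; this is a generic sketch with simplified fields, not the actual `HuggingFaceChatRequest` type.

```go
package typesketch

import "encoding/json"

type request struct {
	Model string         `json:"model"`
	Extra map[string]any `json:"-"` // populated manually below
}

// UnmarshalJSON fills the known fields, then stores any remaining keys in Extra.
func (r *request) UnmarshalJSON(data []byte) error {
	type alias request // plain alias avoids recursing into this method
	if err := json.Unmarshal(data, (*alias)(r)); err != nil {
		return err
	}
	var all map[string]any
	if err := json.Unmarshal(data, &all); err != nil {
		return err
	}
	delete(all, "model") // drop keys that map to known fields
	r.Extra = all
	return nil
}
```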
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)Makefile(8 hunks)core/bifrost.go(2 hunks)core/internal/testutil/account.go(3 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types copy.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(2 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/features/unified-interface.mdx(1 hunks)transports/config.schema.json(2 hunks)ui/README.md(1 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- core/providers/huggingface/types copy.go
🚧 Files skipped from review as they are similar to previous changes (12)
- core/internal/testutil/chat_completion_stream.go
- core/providers/huggingface/huggingface_test.go
- ui/lib/constants/config.ts
- core/providers/huggingface/models.go
- ui/lib/constants/logs.ts
- core/schemas/account.go
- core/schemas/bifrost.go
- transports/config.schema.json
- docs/apis/openapi.json
- ui/README.md
- .github/workflows/pr-tests.yml
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/mux.go, docs/features/unified-interface.mdx, core/internal/testutil/account.go, core/providers/huggingface/embedding.go, docs/contributing/adding-a-provider.mdx, core/providers/huggingface/speech.go, Makefile, core/providers/huggingface/transcription.go, core/providers/huggingface/utils.go, ui/lib/constants/icons.tsx, core/providers/huggingface/chat.go, core/bifrost.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
🧬 Code graph analysis (6)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1422-1460)ResponsesStreamResponseTypeOutputTextDelta(1370-1370)core/schemas/utils.go (1)
Ptr(16-18)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/schemas/account.go (1)
Key(8-18)core/schemas/provider.go (4)
ProviderConfig(234-242)NetworkConfig(45-53)DefaultRequestTimeoutInSeconds(15-15)ConcurrencyAndBufferSize(128-131)core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest(9-16)BifrostEmbeddingResponse(22-28)EmbeddingData(118-122)EmbeddingStruct(124-128)core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest(278-288)HuggingFaceEmbeddingResponse(299-299)core/schemas/chatcompletions.go (1)
BifrostLLMUsage(640-647)
core/providers/huggingface/transcription.go (3)
core/schemas/transcriptions.go (2)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest(328-333)HuggingFaceTranscriptionRequestParameters(336-339)HuggingFaceTranscriptionGenerationParameters(342-359)HuggingFaceTranscriptionEarlyStopping(363-366)HuggingFaceTranscriptionResponse(399-402)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/chat.go (3)
core/schemas/chatcompletions.go (4)
BifrostChatRequest(12-19)BifrostChatResponse(26-41)BifrostResponseChoice(582-590)BifrostLLMUsage(640-647)core/providers/huggingface/types.go (11)
HuggingFaceChatRequest(47-67)HuggingFaceChatMessage(69-77)HuggingFaceContentItem(80-84)HuggingFaceImageRef(86-88)HuggingFaceToolCall(90-94)HuggingFaceFunction(96-100)HuggingFaceResponseFormat(102-105)HuggingFaceStreamOptions(114-116)HuggingFaceTool(118-121)HuggingFaceToolFunction(123-127)HuggingFaceChatResponse(129-136)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(88-120)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (21)
Makefile (3)
15-21: Tool-path discovery logic is sound; minor consistency opportunity.
The conditional expansion for DEFAULT_GOBIN and fallback chains for AIR_BIN and GOTESTSUM_BIN are well-structured. However, the logic could be slightly clearer:
- Lines 17–18: `GOBIN ?=` and `GOPATH ?=` query the environment each time; consider memoizing to `$(shell go env GOBIN)` and `$(shell go env GOPATH)` without the conditional assignment if these are truly static per invocation.
- Line 21: The fallback to `which gotestsum` is good, but ensure it's tested on systems where `which` may not be available (rare, but worth verifying).
Overall approach is solid for improving portability and reducing hard-coded tool assumptions.
155-155: Binary invocation refactoring is correct.
The substitution of direct `air` with `$(AIR_BIN)` is clean and allows the Makefile to find air without relying on PATH. Well done.
Also applies to: 163-163
349-349: Test target refactoring to use $(GOTESTSUM_BIN) looks good.
All test invocations now consistently use the computed GOTESTSUM_BIN variable, ensuring tests can run even if gotestsum is not on PATH. Proper fallback logic in variable definition (line 21) makes this robust.
Verify that tests pass when GOTESTSUM_BIN resolves to a full path outside PATH (e.g., via GOBIN). Run a quick smoke test locally to confirm: `make test-core PROVIDER=<any_provider>`.
Also applies to: 401-401, 425-425, 454-454, 548-548
core/internal/testutil/account.go (2)
96-96: LGTM! HuggingFace provider added to configured providers.The addition follows the established pattern and correctly references the HuggingFace constant defined in core/schemas/bifrost.go.
259-266: LGTM! Key retrieval properly configured.
The HuggingFace key configuration follows the same pattern as other providers, retrieving the API key from the `HUGGING_FACE_API_KEY` environment variable.
core/providers/huggingface/types.go (1)
129-408: Comprehensive type definitions for HuggingFace integration.
The type definitions provide good coverage for chat, embeddings, speech, and transcription APIs. The use of pointer fields, `json.RawMessage`, and `map[string]any` provides appropriate flexibility for varying API responses while maintaining type safety where possible.
docs/features/unified-interface.mdx (1)
98-98: Documentation correctly reflects HuggingFace provider capabilities.The provider support matrix entry accurately documents HuggingFace's supported operations and follows the consistent format of other provider entries.
core/bifrost.go (2)
26-26: Import correctly added for HuggingFace provider.The import follows the consistent pattern used for other provider packages.
1327-1328: HuggingFace provider properly integrated into factory.
The provider instantiation follows the same pattern as other providers, correctly calling `NewHuggingFaceProvider` with config and logger parameters. The integration is consistent with the existing provider architecture.
docs/contributing/adding-a-provider.mdx (1)
7-2065: LGTM! Comprehensive provider implementation guide.The documentation provides excellent coverage of both OpenAI-compatible and custom provider patterns, with clear examples, conventions, and checklists. The phase-based workflow and file organization guidelines will help contributors implement providers consistently.
core/providers/huggingface/embedding.go (2)
10-55: LGTM! Solid embedding request converter.The function properly handles nil inputs, correctly splits model/provider, and safely extracts ExtraParams with type assertions. The mapping logic is clear and follows the converter pattern.
58-93: LGTM! Well-implemented response converter.The method correctly handles nil responses with error returns, pre-allocates slices for performance, and properly documents that HuggingFace doesn't return usage information (setting it to zero). The conversion logic is clean and efficient.
core/providers/huggingface/transcription.go (1)
100-139: LGTM! Solid transcription response converter.The method properly validates inputs, safely handles timestamp array access, and cleanly converts HuggingFace chunks to Bifrost segments. The implementation is defensive and correct.
core/providers/huggingface/speech.go (1)
94-116: LGTM! Clean speech response converter.The method properly validates inputs and clearly documents that HuggingFace TTS API doesn't return usage or alignment data. The implementation is straightforward and correct.
core/providers/huggingface/chat.go (2)
318-413: LGTM! Well-structured streaming response converter.The method properly handles nil responses and cleanly converts all streaming delta fields (role, content, reasoning, tool calls, logprobs) to Bifrost format. The implementation is thorough and correct.
86-89: No action needed. The `debug` variable is properly accessible.
The `debug` variable is defined at package level in `huggingface.go` (line 22) and is correctly accessible in `chat.go`. In Go, all files within the same package share a common namespace—no import or additional qualification is required for package-level variables. The code will compile without errors.
Likely an incorrect or invalid review comment.
core/providers/huggingface/utils.go (2)
83-128: LGTM! Well-implemented URL builder.The function properly constructs the model hub URL with pagination, sorting, and provider filtering. The type switch for handling ExtraParams is robust and covers common types.
130-134: LGTM! Solid utility functions.
The `buildModelInferenceProviderURL`, `convertToInferenceProviderMappings`, and `getModelInferenceProviderMapping` functions are well-implemented with proper error handling and clean logic.
Also applies to: 199-251
core/providers/huggingface/huggingface.go (3)
20-126: LGTM! Well-structured provider implementation.The provider struct follows standard patterns with proper field ordering. The use of sync.Pool for response objects is a good performance optimization, and the acquire/release helpers are correctly implemented. The constructor properly initializes all fields and pre-warms the pools.
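For readers unfamiliar with the acquire/release pooling pattern praised above, here is a minimal, self-contained sketch of the general technique around `sync.Pool` with an explicit reset before reuse. The `pooledResponse` type and helper names are illustrative stand-ins, not the PR's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// pooledResponse is a stand-in for a provider response object that gets reused.
type pooledResponse struct {
	ID      string
	Choices []string
}

var responsePool = sync.Pool{
	New: func() any { return &pooledResponse{} },
}

// acquireResponse returns a response object from the pool.
func acquireResponse() *pooledResponse {
	return responsePool.Get().(*pooledResponse)
}

// releaseResponse resets the object before putting it back, so stale
// state never leaks into the next request.
func releaseResponse(r *pooledResponse) {
	if r == nil {
		return
	}
	r.ID = ""
	r.Choices = r.Choices[:0]
	responsePool.Put(r)
}

func main() {
	resp := acquireResponse()
	resp.ID = "chat-123"
	resp.Choices = append(resp.Choices, "hello")
	fmt.Println(resp.ID, resp.Choices)
	releaseResponse(resp)
}
```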
303-653: LGTM! Robust chat completion implementation.Both sync and streaming chat completion methods are well-implemented. The streaming logic properly handles SSE parsing, error cases, and the complex ResponsesStream fallback scenario. The workaround for combined usage/choices chunks (lines 597-626) is clever and well-documented.
655-677: LGTM! Clean delegation to chat completion methods.The Responses and ResponsesStream methods correctly delegate to the corresponding chat completion methods with appropriate context flags and request type overrides.
5a72875 to 00ecc06 (Compare)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1221: `Thought` is now streamed as both user-visible text and as a separate reasoning delta, creating double-streaming and changed semantics for all providers
The code confirms the concerns raised:
- Lines 1146–1148 compute `hasContent` and `hasThought`; line 1149 branches if either is true.
- Lines 1207–1214 concatenate both `*delta.Content` and `*delta.Thought` into `contentDelta`, which becomes the `OutputTextDelta.Delta`.
- Line 1228 sets `state.TextItemHasContent = true` for either content or thought.
- Lines 1369–1380 emit `delta.Thought` separately as `ResponsesStreamResponseTypeReasoningSummaryTextDelta`.
This creates two key problems:
- Double-streaming: Chunks with only `Thought` now emit both an `output_text.delta` and a `reasoning_summary_text.delta`, causing reasoning content to appear in the visible text channel.
- Semantic shift: Chunks with both `Content` and `Thought` now concatenate them into a single visible delta, mixing chain-of-thought into user-facing text instead of keeping reasoning separate.
- Unguarded change: This logic applies uniformly to all providers (no provider-specific conditionals), making this a behavioral change across the entire stack, not isolated to HuggingFace.
If `Thought` is intended as a distinct reasoning stream (per OpenAI's `reasoning` semantics), this mixes concerns. If it's a fallback for providers that emit primary text as `Thought`, this needs explicit provider-level detection to avoid leaking reasoning into normal output for other providers.
Recommend verifying this is intentional across all providers and that clients consuming only `output_text.delta` are prepared to receive reasoning content.
♻️ Duplicate comments (17)
docs/contributing/adding-a-provider.mdx (1)
510-527: Variable name inconsistency persists from previous review.
The code example declares `hfReq` at line 500 but references `providerReq` throughout lines 511–527. This identical issue was flagged in a previous review. While you clarified that examples are for reference, this particular snippet is labeled "Real Example from `core/providers/huggingface/chat.go`" and should be corrected for accuracy.
Apply this diff to align variable names:
```diff
 // Map parameters
 if bifrostReq.Params != nil {
 	params := bifrostReq.Params

 	// Map standard parameters
 	if params.Temperature != nil {
-		providerReq.Temperature = params.Temperature
+		hfReq.Temperature = params.Temperature
 	}
 	if params.MaxTokens != nil {
-		providerReq.MaxTokens = params.MaxTokens
+		hfReq.MaxTokens = params.MaxTokens
 	}
 	// ... other standard parameters

 	// Handle provider-specific ExtraParams
 	if params.ExtraParams != nil {
 		if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-			providerReq.CustomParam = &customParam
+			hfReq.CustomParam = &customParam
 		}
 	}
 }

-return providerReq
+return hfReq
```
core/internal/testutil/account.go (1)
512-524: Align HuggingFace retry policy with other providers
`MaxRetries` is set to `1` for HuggingFace while essentially all other remote providers use 8–10 retries, and the comment “HuggingFace can be variable” argues for more retries, not fewer. This will make tests unnecessarily flaky on transient errors.
Recommend matching the standard 10‑retry policy and updating/removing the comment:
```diff
 case schemas.HuggingFace:
 	return &schemas.ProviderConfig{
 		NetworkConfig: schemas.NetworkConfig{
 			DefaultRequestTimeoutInSeconds: 300,
-			MaxRetries:                     1, // HuggingFace can be variable
+			MaxRetries:                     10, // Align with other variable cloud providers
 			RetryBackoffInitial:            2 * time.Second,
 			RetryBackoffMax:                30 * time.Second,
 		},
```
core/providers/huggingface/huggingface_test.go (1)
29-33: SpeechSynthesisModel and TranscriptionModel appear to be swapped.
`Kokoro-82M` is a text-to-speech model and should be assigned to `SpeechSynthesisModel`, while `whisper-large-v3` is a speech-to-text model and should be assigned to `TranscriptionModel`. The `SpeechSynthesisFallbacks` should also reference the TTS model.
```diff
-	TranscriptionModel:   "fal-ai/hexgrad/Kokoro-82M",
-	SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3",
+	TranscriptionModel:   "fal-ai/openai/whisper-large-v3",
+	SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M",
 	SpeechSynthesisFallbacks: []schemas.Fallback{
-		{Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"},
+		{Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"},
 	},
```
core/providers/huggingface/transcription.go (2)
14-16: Incorrect error message references "speech" instead of "transcription".
The error message says "speech request input cannot be nil" but this is a transcription request converter.
```diff
 if request.Input == nil {
-	return nil, fmt.Errorf("speech request input cannot be nil")
+	return nil, fmt.Errorf("transcription request input cannot be nil")
 }
```
38-63: Type assertions for `int` will fail when values come from JSON unmarshalling.
When `ExtraParams` is populated from JSON, numeric values are unmarshalled as `float64`, not `int`. These type assertions will silently fail, causing parameters to be ignored.
Consider handling both types. Example fix for one parameter:
```diff
-	if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
-		genParams.MaxNewTokens = &val
+	if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
+		genParams.MaxNewTokens = &val
+	} else if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok {
+		intVal := int(val)
+		genParams.MaxNewTokens = &intVal
 	}
```
Apply the same pattern to `max_length`, `min_length`, `min_new_tokens`, `num_beams`, `num_beam_groups`, and `top_k`.
core/providers/huggingface/speech.go (1)
37-62: Type assertions for `int` may fail when values come from JSON.
Same issue as in transcription.go - when `ExtraParams` is populated from JSON unmarshaling, numeric values are typically `float64`, not `int`. These type assertions will silently fail.
Apply the same fix pattern as suggested for transcription.go - handle both `int` and `float64` types for all integer parameters.
core/providers/huggingface/chat.go (2)
69-81: Potential nil pointer dereference when accessing `tc.Function.Name`.
On line 74, `tc.Function.Name` is dereferenced without checking if it's nil. If a tool call has a nil `Name` field, this will cause a panic.
```diff
 for _, tc := range msg.ChatAssistantMessage.ToolCalls {
+	if tc.Function.Name == nil {
+		continue // Skip tool calls without a function name
+	}
 	hfToolCall := HuggingFaceToolCall{
 		ID:   tc.ID,
 		Type: tc.Type,
```
86-89: Undefined variable `debug` will cause compilation error.
The `debug` variable is referenced at lines 86 and 250 but does not appear to be defined in this file. This will cause a compilation failure.
```bash
#!/bin/bash
# Check if debug variable is defined in the huggingface package
rg -n "^var debug\b|^const debug\b|debug\s*:=\s*(true|false)" core/providers/huggingface/
```
Also applies to: 250-256
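Relating back to the `ExtraParams` type-assertion comments above (transcription.go and speech.go): rather than repeating the int/float64 handling for every parameter, a small shared helper can centralize it. This is a minimal sketch assuming `ExtraParams` is a `map[string]any` decoded from JSON; the helper name `intFromExtraParams` is illustrative, not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// intFromExtraParams extracts an integer parameter that may arrive as an
// int (set programmatically), a float64 (decoded from JSON), or a json.Number.
func intFromExtraParams(params map[string]any, key string) (int, bool) {
	raw, ok := params[key]
	if !ok {
		return 0, false
	}
	switch v := raw.(type) {
	case int:
		return v, true
	case float64:
		return int(v), true
	case json.Number:
		if n, err := v.Int64(); err == nil {
			return int(n), true
		}
	}
	return 0, false
}

func main() {
	var params map[string]any
	_ = json.Unmarshal([]byte(`{"max_new_tokens": 128}`), &params)

	if v, ok := intFromExtraParams(params, "max_new_tokens"); ok {
		fmt.Println("max_new_tokens:", v) // prints 128 even though JSON decoded it as float64
	}
}
```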
core/providers/huggingface/utils.go (3)
136-162: Empty provider and model names when input has no slashes.
When `splitIntoModelProvider` receives a model name with no slashes (`t == 0`), both `prov` and `model` remain empty strings. This results in malformed model identifiers downstream. A user passing `"llama-7b"` (without organization prefix) would trigger this issue.
Consider defaulting to `hf-inference` and using the original name as the model:
```diff
+	} else {
+		// No slashes - default to hf-inference with the full name as model
+		prov = hfInference
+		model = bifrostModelName
+		if debug {
+			fmt.Printf("[huggingface debug] splitIntoModelProvider (t==0): prov=%s, model=%s\n", prov, model)
+		}
 	}
```
180-191: Copy-paste error: Wrong provider names in error messages.
Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" and "scaleway" respectively.
```diff
 case "sambanova":
 	if requestType == schemas.EmbeddingRequest {
 		defaultPath = "/sambanova/v1/embeddings"
 	} else {
-		return "", fmt.Errorf("nebius provider only supports embedding requests")
+		return "", fmt.Errorf("sambanova provider only supports embedding requests")
 	}
 case "scaleway":
 	if requestType == schemas.EmbeddingRequest {
 		defaultPath = "/scaleway/v1/embeddings"
 	} else {
-		return "", fmt.Errorf("nebius provider only supports embedding requests")
+		return "", fmt.Errorf("scaleway provider only supports embedding requests")
 	}
```
164-197: Incomplete provider routing — 13 of 19 defined providers will error.
The `getInferenceProviderRouteURL` function only handles 6 providers (`fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, `scaleway`) while `INFERENCE_PROVIDERS` defines 19 total. Providers like `cerebras`, `cohere`, `groq`, `fireworks-ai`, `together`, etc. will hit the default error case.
This affects embedding, speech, and transcription operations for the majority of defined providers.
Either expand the routing logic to handle all providers or document which providers are actually supported for which request types; a table-driven sketch of the first option follows.
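One way to make the routing exhaustive is to drive it from a map keyed by inference provider instead of a growing switch. The sketch below is illustrative only: the provider names, request-type constants, and path templates are assumptions for the example, not the actual Bifrost schema or Hugging Face routes.

```go
package main

import (
	"errors"
	"fmt"
)

type requestType string

const (
	embeddingRequest requestType = "embedding"
	chatRequest      requestType = "chat"
)

// routeTable maps an inference provider to the request types it supports and
// the path to use for each. Paths here are placeholders, not real HF routes.
var routeTable = map[string]map[requestType]string{
	"fal-ai":       {chatRequest: "/fal-ai/v1/chat/completions"},
	"hf-inference": {chatRequest: "/hf-inference/v1/chat/completions", embeddingRequest: "/hf-inference/v1/embeddings"},
	"sambanova":    {embeddingRequest: "/sambanova/v1/embeddings"},
}

// routeURL returns the path for a provider/request-type pair, with distinct
// errors for unknown providers versus unsupported request types.
func routeURL(provider string, rt requestType) (string, error) {
	byType, ok := routeTable[provider]
	if !ok {
		return "", fmt.Errorf("unknown inference provider %q", provider)
	}
	path, ok := byType[rt]
	if !ok {
		return "", errors.New(provider + " does not support " + string(rt) + " requests")
	}
	return path, nil
}

func main() {
	p, err := routeURL("sambanova", embeddingRequest)
	fmt.Println(p, err)
	_, err = routeURL("sambanova", chatRequest)
	fmt.Println(err)
}
```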
core/providers/huggingface/huggingface.go (6)
51-85: Typo: "aquire" should be "acquire".
The pool helper functions are misspelled as `aquireHuggingFaceChatResponse`, `aquireHuggingFaceTranscriptionResponse`, and `aquireHuggingFaceSpeechResponse`. The correct spelling is "acquire". Update the function names and their call sites at lines 348, 790, and 861.
277-281: Bug: Wrong provider constant in ListModels.
Line 279 uses `schemas.Gemini` instead of `schemas.HuggingFace` in the `CheckOperationAllowed` call. This causes incorrect operation permission checks for the HuggingFace provider.
Apply this diff:
```diff
-	if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+	if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
 		return nil, err
 	}
```
295-301: Bug: Wrong request types in unsupported operation errors.
Both `TextCompletion` (line 296) and `TextCompletionStream` (line 300) return errors using `schemas.EmbeddingRequest` instead of the correct request types. This was previously flagged and marked as addressed, but the issue persists in the current code.
Apply this diff:
```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
700-707: Bug: Wrong request type in Embedding error messages.
Lines 701 and 706 use `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` when returning unsupported operation errors in the Embedding method.
Apply this diff:
```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }
```
773-776: Bug: Wrong task type check in Speech.
Line 774 checks for the `"automatic-speech-recognition"` task, but Speech (text-to-speech) should check for `"text-to-speech"`. The task checks appear to be swapped between the Speech and Transcription functions.
Apply this diff:
```diff
 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
 	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
```
831-838: Bug: Wrong task type check and error messages in Transcription.
Two issues in the Transcription function:
- Lines 832 and 837 use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` in error messages
- Line 836 checks for the `"text-to-speech"` task, but Transcription should check for `"automatic-speech-recognition"` (which is what Speech incorrectly checks at line 774, suggesting these are swapped)
Apply this diff:
```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (1)
core/internal/testutil/account.go (1)
530-876: Consider adding a ComprehensiveTestConfig entry for HuggingFace
`GetConfiguredProviders` and `GetConfigForProvider` now support `schemas.HuggingFace`, but `AllProviderConfigs` doesn’t define a `ComprehensiveTestConfig` for it. That means HuggingFace won’t participate in these cross‑provider end‑to‑end scenarios.
Recommend adding a HuggingFace entry here (with a representative chat/embedding model) so the new provider is exercised alongside the others.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- .github/workflows/pr-tests.yml (1 hunks)
- .github/workflows/release-pipeline.yml (4 hunks)
- Makefile (8 hunks)
- core/bifrost.go (2 hunks)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/chat_completion_stream.go (1 hunks)
- core/internal/testutil/responses_stream.go (1 hunks)
- core/providers/huggingface/chat.go (1 hunks)
- core/providers/huggingface/embedding.go (1 hunks)
- core/providers/huggingface/huggingface.go (1 hunks)
- core/providers/huggingface/huggingface_test.go (1 hunks)
- core/providers/huggingface/models.go (1 hunks)
- core/providers/huggingface/speech.go (1 hunks)
- core/providers/huggingface/transcription.go (1 hunks)
- core/providers/huggingface/types copy.go (1 hunks)
- core/providers/huggingface/types.go (1 hunks)
- core/providers/huggingface/utils.go (1 hunks)
- core/schemas/account.go (2 hunks)
- core/schemas/bifrost.go (3 hunks)
- core/schemas/mux.go (2 hunks)
- docs/apis/openapi.json (1 hunks)
- docs/contributing/adding-a-provider.mdx (1 hunks)
- docs/features/unified-interface.mdx (1 hunks)
- transports/config.schema.json (2 hunks)
- ui/README.md (1 hunks)
- ui/lib/constants/config.ts (2 hunks)
- ui/lib/constants/icons.tsx (1 hunks)
- ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/bifrost.go
- core/internal/testutil/chat_completion_stream.go
- core/bifrost.go
- core/schemas/account.go
- docs/features/unified-interface.mdx
- ui/lib/constants/config.ts
- .github/workflows/pr-tests.yml
- .github/workflows/release-pipeline.yml
- core/providers/huggingface/types copy.go
- core/providers/huggingface/embedding.go
- core/providers/huggingface/types.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
transports/config.schema.json, core/providers/huggingface/huggingface_test.go, core/schemas/mux.go, core/internal/testutil/responses_stream.go, core/providers/huggingface/transcription.go, ui/lib/constants/logs.ts, core/providers/huggingface/models.go, core/internal/testutil/account.go, docs/apis/openapi.json, Makefile, core/providers/huggingface/speech.go, ui/lib/constants/icons.tsx, docs/contributing/adding-a-provider.mdx, core/providers/huggingface/chat.go, core/providers/huggingface/utils.go, core/providers/huggingface/huggingface.go
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (4)
core/internal/testutil/setup.go (1)
SetupTest(51-60)core/internal/testutil/account.go (2)
ComprehensiveTestConfig(47-64)TestScenarios(22-44)core/schemas/bifrost.go (2)
HuggingFace(51-51)Fallback(131-134)core/internal/testutil/tests.go (1)
RunAllComprehensiveTests(15-62)
core/providers/huggingface/transcription.go (4)
core/schemas/transcriptions.go (2)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest(328-333)HuggingFaceTranscriptionRequestParameters(336-339)HuggingFaceTranscriptionGenerationParameters(342-359)HuggingFaceTranscriptionEarlyStopping(363-366)HuggingFaceTranscriptionResponse(399-402)core/schemas/models.go (1)
Model(109-129)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(21-23)core/schemas/bifrost.go (13)
ModelProvider(32-32)RequestType(86-86)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)TextCompletionRequest(90-90)TextCompletionStreamRequest(91-91)ResponsesRequest(94-94)ResponsesStreamRequest(95-95)EmbeddingRequest(96-96)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/schemas/models.go (1)
Model(109-129)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/schemas/account.go (1)
Key(8-18)core/schemas/provider.go (4)
ProviderConfig(234-242)NetworkConfig(45-53)DefaultRequestTimeoutInSeconds(15-15)ConcurrencyAndBufferSize(128-131)core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (5)
HuggingFaceSpeechRequest(304-310)HuggingFaceSpeechParameters(313-316)HuggingFaceTranscriptionGenerationParameters(342-359)HuggingFaceTranscriptionEarlyStopping(363-366)HuggingFaceSpeechResponse(319-323)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (7)
core/providers/huggingface/types.go (5)
HuggingFaceChatResponse(129-136)HuggingFaceTranscriptionResponse(399-402)HuggingFaceSpeechResponse(319-323)HuggingFaceResponseError(268-272)HuggingFaceHubError(263-266)core/providers/utils/utils.go (7)
GetRequestPath(219-239)SetExtraHeaders(178-208)HandleProviderAPIError(317-337)CheckAndDecodeBody(423-431)NewBifrostOperationError(449-460)HandleProviderResponse(343-387)ShouldSendBackRawResponse(482-487)core/schemas/bifrost.go (4)
HuggingFace(51-51)EmbeddingRequest(96-96)BifrostStream(321-328)ChatCompletionRequest(92-92)core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest(12-199)core/schemas/mux.go (1)
ChatToResponsesStreamState(966-982)core/schemas/responses.go (1)
BifrostResponsesResponse(45-83)core/providers/huggingface/embedding.go (1)
ToHuggingFaceEmbeddingRequest(10-55)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (18)
core/internal/testutil/responses_stream.go (1)
692-695: Lifecycle stream safety cap increase looks appropriateBumping the lifecycle loop guard to
responseCount > 300keeps the infinite‑loop protection but avoids prematurely truncating providers that emit many small lifecycle events (e.g., more granular Hugging Face streaming), especially given the existing 200s context timeout. This looks like a safe, non‑breaking adjustment in the broader streaming test harness.Makefile (3)
24-29: Color initialization refactor is cleaner and more portable.Replacing hard-coded ANSI escape sequences with
$(shell printf ...)improves readability and portability across different shells. This is a good improvement.
15-21: Binary path resolution logic is sound but relies on Go toolchain consistency.The GOBIN/GOPATH detection and fallback logic correctly identifies where
go installplaces binaries. The pattern of preferring explicitDEFAULT_GOBINthen falling back towhichis robust for most environments. However, ensure downstream targets (e.g.,dev,test) tolerate scenarios where binaries are missing or not inPATH—these targets already callinstall-airandinstall-gotestsum, which is good.Verify that all targets invoking
$(AIR_BIN)or$(GOTESTSUM_BIN)are guarded by correspondinginstall-*targets or handle missing binaries gracefully.
65-86: install-ui improvements handle root check and Next.js discovery well.The root check (lines 67-70) prevents permission issues on developer machines. The multi-stage Next.js discovery (lines 76-85) gracefully handles systems without global npm directories (e.g., Nix). The fallback chain (local → npx → global) is well-reasoned for environment diversity.
transports/config.schema.json (2)
135-140: HuggingFace provider wired into transport config schema correctlyUsing the generic
#/$defs/providerref for"huggingface"is consistent with other HTTP providers and looks good.
764-784: Semantic cache embedding provider enum correctly extended for HuggingFaceIncluding
"huggingface"in thesemanticcachepluginproviderenum keeps the plugin in sync with the newly supported embedding provider.docs/apis/openapi.json (1)
3239-3258: ModelProvider enum now exposes HuggingFace in OpenAPIAdding
"huggingface"to theModelProviderenum keeps the public OpenAPI schema aligned with the backend provider set; change looks correct.core/internal/testutil/account.go (1)
77-99: HuggingFace correctly added to configured providersIncluding
schemas.HuggingFaceinGetConfiguredProviderskeeps the comprehensive test account aligned with the new provider set.ui/lib/constants/logs.ts (2)
2-20: KnownProvidersNames correctly extended with HuggingFaceAdding
"huggingface"here ensures log views and filters recognize the new provider at the type level.
43-61: ProviderLabels updated for HuggingFace display nameThe
huggingface: "HuggingFace"label keeps the logs UI consistent with the rest of the product’s provider naming.core/providers/huggingface/huggingface_test.go (1)
12-63: Test structure looks good overall.The test follows the established pattern: parallel execution, environment variable check for API key, proper setup/teardown with
defer cancel(), and client shutdown. The comprehensive test config enables a reasonable subset of scenarios for the HuggingFace provider.core/providers/huggingface/models.go (2)
16-44: Model ID conversion logic looks correct.The conversion properly handles nil input, pre-allocates the slice, skips models without IDs or supported methods, and constructs composite IDs with the provider prefix. The use of
model.ModelIDfor the display name andmodel.IDfor the HuggingFace reference appears intentional based on the type definitions.
46-104: Method derivation logic is well-structured.The function correctly uses a set for deduplication, handles both pipeline tags and model tags with appropriate fallbacks, and returns a sorted slice for deterministic output. The broad tag matching (e.g.,
strings.Contains(tagLower, "embedding")) is reasonable for discovery purposes.core/providers/huggingface/transcription.go (1)
100-139: Response conversion implementation is correct.The
ToBifrostTranscriptionResponsemethod properly validates inputs, maps the text response, and safely converts chunks to segments with appropriate bounds checking for timestamps.core/providers/huggingface/speech.go (1)
9-24: Request and response conversion logic is well-implemented.The
ToHuggingFaceSpeechRequestproperly validates input, extracts provider/model, and maps the text input. TheToBifrostSpeechResponsecorrectly handles nil cases and maps the audio data with appropriate extra fields.Also applies to: 94-116
core/providers/huggingface/chat.go (1)
201-315: Response conversion implementations are comprehensive.Both
ToBifrostChatResponseandToBifrostChatStreamResponseproperly handle nil checks, convert choices with messages/deltas, map tool calls, logprobs, and usage information. The streaming response correctly maps theReasoningfield toThought.Also applies to: 317-413
core/providers/huggingface/utils.go (2)
83-128: URL construction for model hub API is well-implemented.The
buildModelHubURLfunction properly handles pagination limits with bounds checking, sets appropriate query parameters, and handles various ExtraParams types with type switching.
217-251: Model inference provider mapping retrieval is correctly implemented.Proper use of fasthttp with deferred resource release, appropriate error handling for API responses, and clean conversion to internal mapping structure.
f279893 to 4663a80 (Compare)
Actionable comments posted: 1
♻️ Duplicate comments (10)
core/providers/huggingface/utils.go (1)
168-201: Only a subset of INFERENCE_PROVIDERS are routable in getInferenceProviderRouteURL.The switch handles
fal-ai,hf-inference,nebius,replicate,sambanova, andscaleway, but other entries inINFERENCE_PROVIDERS(e.g.,cerebras,cohere,groq,featherless-ai,fireworks-ai,hyperbolic,novita,nscale,ovhcloud,publicai,together,wavespeed,zai-org) fall through to the default error. That means models mapped to those providers (and surfaced via list-models) will currently be unusable.Either (a) add routing cases for the remaining providers with correct default paths or (b) remove any providers you don’t intend to support yet from
INFERENCE_PROVIDERSso they don’t appear as “supported” in filters and mappings.core/internal/testutil/account.go (1)
259-266: Fix HuggingFace test API key env var name and optionally support HF_TOKEN.This branch reads
HUGGING_FACE_API_KEY, but the PR/docs useHUGGINGFACE_API_KEYand the provider is supposed to supportHF_TOKENas well. As written, CI that exportsHUGGINGFACE_API_KEY/HF_TOKENwill see empty keys in tests.Consider standardizing on
HUGGINGFACE_API_KEYwith anHF_TOKENfallback:- case schemas.HuggingFace: - return []schemas.Key{ - { - Value: os.Getenv("HUGGING_FACE_API_KEY"), - Models: []string{}, - Weight: 1.0, - }, - }, nil + case schemas.HuggingFace: + key := os.Getenv("HUGGINGFACE_API_KEY") + if key == "" { + key = os.Getenv("HF_TOKEN") + } + return []schemas.Key{ + { + Value: key, + Models: []string{}, + Weight: 1.0, + }, + }, nilMakefile (3)
21-21: SimplifyGOTESTSUM_BINfallback for consistency withAIR_BIN.The
shell which gotestsum 2>/dev/null || echo gotestsumfallback adds complexity without benefit. For consistency withAIR_BIN(line 20), use a simple fallback.
103-104: Replacewhichwithcommand -vfor proper full-path handling.The
which $(GOTESTSUM_BIN)fallback doesn't work correctly with full paths. Usecommand -vwhich handles both cases, or align with theAIR_BINpattern using only[ -x ].
97-98: Incorrect use ofwhichwith full path variables.At lines 97-98 (and 110-111),
which $$INSTALLEDis used where$$INSTALLEDis an absolute path. Thewhichcommand searchesPATHby name, not by checking if a full path exists. Replace with a direct existence check:[ ! -x "$$INSTALLED" ].core/providers/huggingface/chat.go (2)
93-96: Undefined variabledebugwill cause compilation error.The variable
debugis referenced but never defined in this file. This will prevent the code from compiling.Either define the debug variable or remove the debug logging:
+// Package-level debug flag (set via build tags or environment) +var debug = false + func ToHuggingFaceChatCompletionRequest(bifrostReq *schemas.BifrostChatRequest) *HuggingFaceChatRequest {Or remove the debug blocks entirely if not needed:
- if debug { - fmt.Printf("[huggingface debug] Added tool_call_id=%s to tool message\n", *msg.ChatToolMessage.ToolCallID) - }
257-263: Same undefineddebugvariable issue.This is the same compilation error as in the request converter -
debugis undefined.core/providers/huggingface/huggingface.go (3)
295-301: Bug: Wrong request type in unsupported operation errors.Both
TextCompletionandTextCompletionStreamuseschemas.EmbeddingRequestinstead of their respective request types. This causes incorrect error messages.Apply this diff:
func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) { - return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey()) } func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) { - return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey()) }
713-716: Bug: Wrong request type in Embedding error message.Line 715 uses
schemas.SpeechRequestinstead ofschemas.EmbeddingRequestwhen the mapping check fails.Apply this diff:
mapping, ok := providerMapping[inferenceProvider] if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" { - return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey()) }
847-854: Bug: Wrong request type in Transcription error messages.Lines 848 and 853 use
schemas.SpeechRequestinstead ofschemas.TranscriptionRequest.Apply this diff:
if providerMapping == nil { - return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey()) } mapping, ok := providerMapping[inferenceProvider] if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" { - return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey()) }
🧹 Nitpick comments (9)
docs/contributing/adding-a-provider.mdx (1)
41-75: Fix hyphenation for compound adjectives.According to English grammar rules, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible" when used as adjectives:
- Line 43: "not OpenAI compatible" → "not OpenAI-compatible"
- Line 71: "#### OpenAI Compatible Providers" → "#### OpenAI-Compatible Providers"
- Line 629: "### OpenAI Compatible Providers" → "### OpenAI-Compatible Providers"
- Line 1475: "### For OpenAI Compatible Providers" → "### For OpenAI-Compatible Providers"
Also update any related section titles and cross-references to maintain consistency throughout the document.
Also applies to: 629-632, 1475-1477
core/providers/huggingface/utils.go (2)
27-82: inferenceProvider constants and provider lists look consistent with HF docs.
INFERENCE_PROVIDERSandPROVIDERS_OR_POLICIESare well-structured; if you want slightly tighter typing, you could append theautoconstant instead of the raw"auto"literal inPROVIDERS_OR_POLICIES, but the current form is functionally fine.
257-296: Int extraction helper is flexible; be aware of truncation semantics.
extractIntFromInterfacehandles most numeric variants (includingjson.Number) and falls back cleanly; just note that all float cases (andjson.NumberviaFloat64) are truncated viaint(...), which is fine if all upstream numeric fields are expected to be integral.core/providers/huggingface/huggingface_test.go (1)
24-57: HuggingFace scenarios omit embedding/audio tests despite models being configured.You’ve set
EmbeddingModel,TranscriptionModel, andSpeechSynthesisModel, but all corresponding scenario flags (Embedding,Transcription,TranscriptionStream,SpeechSynthesis,SpeechSynthesisStream) arefalse, soRunAllComprehensiveTestswon’t actually exercise those paths. Once you’re confident in those flows, consider flipping these booleans totrueso chat, embeddings, and audio are all covered end-to-end.core/providers/huggingface/models.go (1)
46-104: deriveSupportedMethods’ pipeline/tag heuristics look solid and conservative.Normalizing the pipeline tag, aggregating methods via a set, and falling back to tags for embeddings, chat/text, TTS, and ASR is a good balance between coverage and precision, and sorting the final method list keeps responses deterministic.
core/providers/huggingface/embedding.go (1)
10-13: Inconsistent nil handling - returning(nil, nil)may cause silent failures.Returning
(nil, nil)whenbifrostReqis nil differs from other converters in this PR (e.g.,ToBifrostEmbeddingResponsereturns an error for nil input). This could lead to silent failures where the caller doesn't know if the request was successfully converted or if the input was nil.Consider returning an explicit error for consistency:
func ToHuggingFaceEmbeddingRequest(bifrostReq *schemas.BifrostEmbeddingRequest) (*HuggingFaceEmbeddingRequest, error) { if bifrostReq == nil { - return nil, nil + return nil, fmt.Errorf("bifrost embedding request is nil") }Alternatively, if nil-in-nil-out is intentional, document this behavior with a comment.
core/providers/huggingface/speech.go (1)
9-12: Inconsistent nil handling pattern.Similar to
ToHuggingFaceEmbeddingRequest, returning(nil, nil)for nil input may cause silent failures. Consider returning an error or documenting the nil-in-nil-out behavior for consistency across the provider.core/providers/huggingface/transcription.go (1)
124-129: Potential silent data loss whenTimestamphas exactly one element.The condition
len(chunk.Timestamp) >= 2handles empty and full timestamps, but if a chunk has exactly one timestamp element, bothstartandendwill be zero. Consider logging or handling this edge case explicitly if it indicates malformed data.core/providers/huggingface/chat.go (1)
36-38: Silently discarding marshalling errors could hide bugs.Multiple locations discard
sonic.Marshalerrors with_. While this may be acceptable for optional fields, consider logging these errors at debug level to aid troubleshooting.- contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr) + contentJSON, err := sonic.Marshal(*msg.Content.ContentStr) + if err != nil { + // Log error but continue - content will be empty + continue + } hfMsg.Content = json.RawMessage(contentJSON)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)Makefile(8 hunks)core/bifrost.go(2 hunks)core/internal/testutil/account.go(3 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(2 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/features/unified-interface.mdx(1 hunks)transports/config.schema.json(2 hunks)ui/README.md(1 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- core/schemas/bifrost.go
- core/bifrost.go
- ui/lib/constants/logs.ts
- transports/config.schema.json
- .github/workflows/pr-tests.yml
- core/schemas/account.go
- core/schemas/mux.go
- .github/workflows/release-pipeline.yml
- ui/lib/constants/config.ts
- core/internal/testutil/responses_stream.go
- docs/features/unified-interface.mdx
- ui/README.md
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
docs/apis/openapi.jsoncore/providers/huggingface/embedding.gocore/internal/testutil/account.gocore/providers/huggingface/models.goMakefilecore/providers/huggingface/speech.godocs/contributing/adding-a-provider.mdxcore/providers/huggingface/transcription.goui/lib/constants/icons.tsxcore/providers/huggingface/chat.gocore/providers/huggingface/utils.gocore/providers/huggingface/huggingface_test.gocore/providers/huggingface/huggingface.gocore/providers/huggingface/types.go
🧬 Code graph analysis (5)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest(9-16)BifrostEmbeddingResponse(22-28)EmbeddingData(118-122)EmbeddingStruct(124-128)core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest(303-313)HuggingFaceEmbeddingResponse(324-324)core/schemas/chatcompletions.go (1)
BifrostLLMUsage(640-647)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/schemas/account.go (1)
Key(8-18)core/schemas/provider.go (4)
ProviderConfig(234-242)NetworkConfig(45-53)DefaultRequestTimeoutInSeconds(15-15)ConcurrencyAndBufferSize(128-131)core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/providers/huggingface/speech.go (4)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (5)
HuggingFaceSpeechRequest(329-335)HuggingFaceSpeechParameters(338-341)HuggingFaceTranscriptionGenerationParameters(367-384)HuggingFaceTranscriptionEarlyStopping(388-391)HuggingFaceSpeechResponse(344-348)core/schemas/models.go (1)
Model(109-129)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (4)
core/providers/utils/utils.go (11)
GetProviderName(734-741)GetRequestPath(219-239)SetExtraHeaders(178-208)MakeRequestWithContext(39-93)HandleProviderAPIError(317-337)NewBifrostOperationError(449-460)HandleProviderResponse(343-387)ShouldSendBackRawResponse(482-487)NewUnsupportedOperationError(391-402)ProcessAndSendBifrostError(571-601)GetBifrostResponseForStreamResponse(777-805)core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest(12-206)core/providers/huggingface/embedding.go (1)
ToHuggingFaceEmbeddingRequest(10-58)core/providers/huggingface/transcription.go (1)
ToHuggingFaceTranscriptionRequest(9-101)
core/providers/huggingface/types.go (1)
core/schemas/models.go (1)
Model(109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (31)
docs/apis/openapi.json (1)
3238-3258: ModelProvider enum now correctly exposes HuggingFace (and Cerebras) in the public API.The added
"cerebras"and"huggingface"enum values align with the newModelProviderconstants and make these providers addressable via the OpenAPI schema; no issues from a schema/compat perspective.core/providers/huggingface/utils.go (3)
83-128: Model hub URL builder is robust and respects pagination and overrides.The URL construction correctly clamps
limit, sets sensible defaults (full,sort,direction,inference_provider), and then letsExtraParamsoverride as needed; this should work well for listing inference-provider-backed models.
203-255: Provider-mapping fetch and conversion are straightforward and aligned with fasthttp/providerUtils patterns.The mapping conversion guards against nil, preserves tasks and model mappings, and the HTTP helper correctly acquires/releases fasthttp objects, handles non-200 responses via
HandleProviderAPIError, and decodes JSON into the mapping type; this looks solid.
137-166: All call sites ofsplitIntoModelProviderproperly handle the error return value. Every instance checksnameErr != niland either propagates the error or wraps it in an appropriate error type (UnsupportedOperationError). The stricter validation of model name format is safely enforced across the codebase with no gaps in error handling.core/internal/testutil/account.go (2)
77-99: Including HuggingFace in the configured providers set is correct.Adding
schemas.HuggingFacehere keeps the test harness aligned with the new provider and ensures it participates in cross-provider setups.
512-524: HuggingFace ProviderConfig defaults look reasonable.A 300s timeout, 10 retries, and moderate backoff (2s–30s) with standard concurrency/buffer mirror how other “variable” cloud providers are configured; this should be fine as a starting point.
core/providers/huggingface/models.go (1)
16-44: Model listing transformation correctly scopes IDs and filters unsupported models.
ToBifrostListModelsResponsesensibly skips models without IDs or derived methods, prefixes IDs with the Bifrost provider key, and stores the raw Hugging Face ID separately; this gives a clean, provider-scoped surface for/v1/models.core/providers/huggingface/embedding.go (1)
60-96: LGTM!The
ToBifrostEmbeddingResponsemethod correctly converts HuggingFace embeddings to Bifrost format, properly handles nil input with an error, and documents that usage information is unavailable from the HuggingFace API.core/providers/huggingface/speech.go (1)
98-119: LGTM!The response converter properly validates the model name and correctly notes that HuggingFace TTS doesn't return usage or alignment data.
Makefile (1)
66-70: Good addition of root-user guard for local development.Preventing
make install-uifrom running as root on developer machines avoids common permission issues with npm global installs.core/providers/huggingface/transcription.go (1)
38-82: LGTM!The integer parameter extraction correctly uses
extractIntFromInterfaceto handle bothintandfloat64types from JSON unmarshalling, addressing the concern from previous reviews.core/providers/huggingface/chat.go (2)
69-76: LGTM - nil pointer dereference fix is correctly implemented.The code now safely handles a nil
tc.Function.Nameby using a default empty string, preventing potential panics.
324-420: LGTM!The streaming response converter correctly handles delta fields, tool calls, logprobs, and usage conversion. The nil handling returning plain
nil(without error) is appropriate for streaming contexts.core/providers/huggingface/huggingface.go (10)
1-31: LGTM: Package setup and provider struct are well-structured. The debug toggle via environment variable and the provider struct with proper configuration fields follow established patterns from other providers.
33-85: LGTM: Object pooling implementation is correct. The acquire/release pattern with struct reset ensures clean state reuse. The nil checks in release functions prevent panics.
87-120: LGTM: Provider constructor follows established patterns. Pre-warming response pools and proper configuration handling align with other provider implementations.
122-130: LGTM: Helper methods correctly delegate to utility functions.
132-205: LGTM: HTTP request handling is robust. The response body copy at line 192 correctly prevents use-after-free when fasthttp releases its internal buffer. Error response parsing properly extracts HuggingFace-specific error details.
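The copy-before-release idiom that comment refers to, as a hedged one-function sketch (the helper name is made up; the real code performs the copy inside its request helper rather than a separate function):

```go
package huggingface

import "github.com/valyala/fasthttp"

// copyBody returns a detached copy of the response body so it remains valid
// after the fasthttp response (and its internal buffer) goes back to the pool.
func copyBody(resp *fasthttp.Response) []byte {
	body := append([]byte(nil), resp.Body()...)
	fasthttp.ReleaseResponse(resp)
	return body
}
```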
207-293: LGTM: Model listing implementation handles both keyed and keyless modes correctly. The operation check at line 279 correctly uses `schemas.HuggingFace` (the copy-paste issue from Gemini was fixed).
303-375: LGTM: ChatCompletion implementation is well-structured. The model name splitting and reconstruction pattern correctly handles HuggingFace's `modelName:inferenceProvider` format. Response conversion and extra fields population follow established patterns.
377-659: LGTM: Streaming implementation is comprehensive. The SSE parsing, context cancellation handling, and the workaround for combined usage+content chunks (lines 604-645) are well-documented and correctly implemented. Resource cleanup via defers ensures proper release of fasthttp resources.
661-683: LGTM: Responses API correctly adapts ChatCompletion endpoints. The fallback pattern via context flag enables code reuse while maintaining proper response type conversion.
756-830: LGTM: Speech implementation correctly validates task type. The task check at line 787 correctly validates `"text-to-speech"` for Speech operations (the swapped task check from past review was fixed).
core/providers/huggingface/types.go (8)
1-48: LGTM: Model response unmarshaling correctly handles both API formats. The custom `UnmarshalJSON` at lines 30-48 properly handles both the top-level array format (current API) and the object format with a `models` field (backward compatibility). This addresses the past review concern.
50-68: LGTM: Inference provider mapping types are well-structured. The separation between API response types (`HuggingFaceInferenceProviderInfo`) and internal types (`HuggingFaceInferenceProviderMapping`) provides clean abstraction.
69-213: LGTM: Chat types comprehensively cover OpenAI-compatible format. The flexible use of `json.RawMessage` for content and tool_choice allows handling various input formats. Response types include full logprobs and usage support.
214-285: LGTM: Streaming types correctly model delta structures. The streaming response types properly handle incremental content delivery with optional usage and timing information.
286-298: LGTM: Error types distinguish Hub and inference API responses.
299-324: LGTM: Embedding types support flexible input formats. The `interface{}` type for `Inputs` correctly handles both single string and string array inputs required by the feature extraction API.
326-349: LGTM: Speech types appropriately model text-to-speech API.
350-436: LGTM: Transcription types handle complex union types correctly. The custom `MarshalJSON`/`UnmarshalJSON` for `HuggingFaceTranscriptionEarlyStopping` properly handles the boolean or string ("never") union type. The type aliases at lines 435-436 provide convenient access.
Actionable comments posted: 1
♻️ Duplicate comments (4)
Makefile (3)
97-98: Simplify binary availability check: `which` does not work with full paths. The condition `if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then` contains a redundant and confusing `which` call. Since `$$INSTALLED` is an absolute path (e.g., `/home/user/go/bin/air`), `which` will always fail, making the `&&` logic unintuitive. Simply check if the file is executable.

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
- echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \
+ if [ ! -x "$$INSTALLED" ]; then \
+ echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \
```
110-111: Simplify binary availability check: same issue as install-air (line 97). Apply the same simplification to `install-gotestsum`: remove the redundant `which` call and rely solely on the `[ -x ]` check.

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
- echo "$(YELLOW)Note: the installed gotestsum binary may not be on your PATH...$(NC)"; \
+ if [ ! -x "$$INSTALLED" ]; then \
+ echo "$(YELLOW)Note: the installed gotestsum binary may not be on your PATH...$(NC)"; \
```
103-103: Replace `which` with a more portable check or simplify to match the AIR_BIN pattern. Line 103 uses `which $(GOTESTSUM_BIN)` as a fallback after the `[ -x ]` check. Since `GOTESTSUM_BIN` may be a full path, `which` is unreliable. Either replace it with `command -v` (which works for both names and full paths) or simplify to just `[ -x ]` for consistency with the `AIR_BIN` pattern on line 90.

Option 1 (recommended): Simplify to match the `AIR_BIN` pattern:

```diff
- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \
```

Option 2: Use `command -v` for portability:

```diff
- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ] || command -v $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
```
core/providers/huggingface/huggingface.go (1)
295-301: Fix request type in unsupported TextCompletion operations. Both `TextCompletion` and `TextCompletionStream` return an unsupported-operation error but incorrectly tag it as `schemas.EmbeddingRequest` instead of the appropriate text-completion request types. That makes error classification and telemetry misleading. Consider updating to:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (2)
core/providers/huggingface/types.go (1)
405-421: Consider returning an error for invalid EarlyStopping values. The `UnmarshalJSON` method returns `nil` (line 420) when the value is neither a boolean nor a string. This silently ignores invalid values. Consider returning an error for non-null values that don't match expected types to catch malformed API responses. Apply this diff to add error handling:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```
core/providers/huggingface/huggingface.go (1)
685-754: Embedding implementation matches the expected routing and extra-field semantics. The Embedding path:

- Checks operation permissions for HuggingFace/Embedding.
- Builds the HF request via `ToHuggingFaceEmbeddingRequest`.
- Splits the model into `inferenceProvider` and `modelName`, then resolves a `getModelInferenceProviderMapping` entry and validates `ProviderTask == "feature-extraction"`.
- Uses the provider-specific model id and `getInferenceProviderRouteURL` to derive the target URL, then executes `completeRequest`.
- Converts `HuggingFaceEmbeddingResponse` into a `BifrostEmbeddingResponse` and fills `ExtraFields` (provider, model requested, request type, latency, raw response).

This is correct and consistent with how embeddings are handled for other providers.

As a non-blocking improvement, the repeated pattern of splitting the model, looking up `getModelInferenceProviderMapping`, validating `ProviderTask`, and deriving the route URL (here and in `Speech`/`Transcription`) could be factored into a small helper to avoid drift if the mapping rules ever change.
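A hedged sketch of the kind of helper that nitpick suggests; the `targetResolver` name and function-typed fields are illustrative, and the real `splitIntoModelProvider` / `getModelInferenceProviderMapping` / `getInferenceProviderRouteURL` signatures may differ:

```go
package huggingface

import "fmt"

// inferenceMapping and targetResolver are illustrative stand-ins; the real
// provider would keep its existing helpers and only gain a wrapper like resolve.
type inferenceMapping struct {
	ProviderTask    string
	ProviderModelID string
}

type targetResolver struct {
	split  func(model string) (modelName, inferenceProvider string, err error)
	lookup func(modelName, inferenceProvider string) (*inferenceMapping, error)
	route  func(inferenceProvider, providerModelID string) string
}

// resolve bundles the repeated split / mapping-lookup / task-validation /
// URL-derivation steps shared by Embedding, Speech, and Transcription.
func (r targetResolver) resolve(model, wantTask string) (providerModelID, url string, err error) {
	modelName, inferenceProvider, err := r.split(model)
	if err != nil {
		return "", "", err
	}
	m, err := r.lookup(modelName, inferenceProvider)
	if err != nil {
		return "", "", err
	}
	if m.ProviderTask != wantTask {
		return "", "", fmt.Errorf("model %q maps to task %q, expected %q", modelName, m.ProviderTask, wantTask)
	}
	return m.ProviderModelID, r.route(inferenceProvider, m.ProviderModelID), nil
}
```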
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
.github/workflows/pr-tests.yml (1 hunks), .github/workflows/release-pipeline.yml (4 hunks), Makefile (8 hunks), core/bifrost.go (2 hunks), core/internal/testutil/account.go (3 hunks), core/internal/testutil/responses_stream.go (1 hunks), core/providers/huggingface/chat.go (1 hunks), core/providers/huggingface/embedding.go (1 hunks), core/providers/huggingface/huggingface.go (1 hunks), core/providers/huggingface/huggingface_test.go (1 hunks), core/providers/huggingface/models.go (1 hunks), core/providers/huggingface/speech.go (1 hunks), core/providers/huggingface/transcription.go (1 hunks), core/providers/huggingface/types.go (1 hunks), core/providers/huggingface/utils.go (1 hunks), core/schemas/account.go (2 hunks), core/schemas/bifrost.go (3 hunks), core/schemas/mux.go (2 hunks), docs/apis/openapi.json (1 hunks), docs/contributing/adding-a-provider.mdx (1 hunks), docs/features/unified-interface.mdx (1 hunks), transports/config.schema.json (2 hunks), ui/README.md (1 hunks), ui/lib/constants/config.ts (2 hunks), ui/lib/constants/icons.tsx (1 hunks), ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/mux.go
- .github/workflows/release-pipeline.yml
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- core/schemas/account.go
- transports/config.schema.json
- ui/lib/constants/logs.ts
- docs/features/unified-interface.mdx
- core/internal/testutil/responses_stream.go
- core/schemas/bifrost.go
- core/internal/testutil/account.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/bifrost.goui/lib/constants/config.tscore/providers/huggingface/embedding.godocs/contributing/adding-a-provider.mdxMakefilecore/providers/huggingface/chat.gocore/providers/huggingface/models.gocore/providers/huggingface/utils.gocore/providers/huggingface/transcription.goui/lib/constants/icons.tsxcore/providers/huggingface/huggingface.gocore/providers/huggingface/huggingface_test.gocore/providers/huggingface/speech.gocore/providers/huggingface/types.go
🧬 Code graph analysis (9)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(88-120)
core/providers/huggingface/embedding.go (2)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest(9-16)BifrostEmbeddingResponse(22-28)EmbeddingData(118-122)EmbeddingStruct(124-128)core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest(303-313)HuggingFaceEmbeddingResponse(324-324)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (2)
BifrostChatRequest(12-19)LogProb(625-629)core/providers/huggingface/types.go (12)
HuggingFaceChatRequest(72-92)HuggingFaceChatMessage(94-102)HuggingFaceContentItem(105-109)HuggingFaceImageRef(111-113)HuggingFaceToolCall(115-119)HuggingFaceFunction(121-125)HuggingFaceResponseFormat(127-130)HuggingFaceStreamOptions(139-141)HuggingFaceTool(143-146)HuggingFaceToolFunction(148-152)HuggingFaceChatResponse(154-161)HuggingFaceChatStreamResponse(215-224)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(24-26)core/schemas/bifrost.go (13)
ModelProvider(32-32)RequestType(86-86)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)TextCompletionRequest(90-90)TextCompletionStreamRequest(91-91)ResponsesRequest(94-94)ResponsesStreamRequest(95-95)EmbeddingRequest(96-96)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/schemas/models.go (1)
Model(109-129)
core/providers/huggingface/utils.go (2)
core/providers/huggingface/huggingface.go (1)
HuggingFaceProvider(25-31)core/providers/utils/utils.go (5)
GetRequestPath(219-239)MakeRequestWithContext(39-93)HandleProviderAPIError(317-337)CheckAndDecodeBody(423-431)NewBifrostOperationError(449-460)
core/providers/huggingface/transcription.go (3)
core/schemas/transcriptions.go (2)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)core/providers/huggingface/types.go (4)
HuggingFaceTranscriptionRequest(353-358)HuggingFaceTranscriptionRequestParameters(361-364)HuggingFaceTranscriptionGenerationParameters(367-384)HuggingFaceTranscriptionEarlyStopping(388-391)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (5)
core/providers/huggingface/types.go (4)
HuggingFaceChatResponse(154-161)HuggingFaceResponseError(293-297)HuggingFaceHubError(288-291)HuggingFaceEmbeddingResponse(324-324)core/providers/utils/utils.go (5)
SetExtraHeaders(178-208)MakeRequestWithContext(39-93)HandleProviderAPIError(317-337)CheckAndDecodeBody(423-431)NewBifrostOperationError(449-460)core/schemas/bifrost.go (15)
ModelProvider(32-32)HuggingFace(51-51)RequestType(86-86)BifrostError(356-365)ErrorField(374-381)ListModelsRequest(89-89)EmbeddingRequest(96-96)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)ResponsesStreamRequest(95-95)ResponsesRequest(94-94)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest(12-207)core/providers/huggingface/embedding.go (1)
ToHuggingFaceEmbeddingRequest(10-58)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
SetupTest(51-60)core/internal/testutil/account.go (2)
ComprehensiveTestConfig(47-64)TestScenarios(22-44)core/schemas/bifrost.go (2)
HuggingFace(51-51)Fallback(131-134)core/schemas/models.go (1)
Model(109-129)core/internal/testutil/tests.go (1)
RunAllComprehensiveTests(15-62)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (4)
HuggingFaceSpeechRequest(329-335)HuggingFaceSpeechParameters(338-341)HuggingFaceTranscriptionGenerationParameters(367-384)HuggingFaceTranscriptionEarlyStopping(388-391)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (34)
Makefile (5)
15-21: ✓ Well-structured tool path resolution with sensible fallbacks.The new
DEFAULT_GOBINvariable elegantly prioritizesGOBIN→GOPATH/bin→ empty, andAIR_BINandGOTESTSUM_BINconsistently prefer local paths. This avoids hardcoding tool expectations and gracefully handles systems with custom Go installations (e.g., Nix).
24-29: ✓ Dynamic ANSI color codes improve portability and maintainability.Using
shell printfinstead of literal escape sequences is more portable and easier to read. This pattern is cleaner than hardcoded escape codes.
65-70: ✓ Root check and multi-fallback Next.js detection enhance developer experience.Lines 67-70 prevent root-user npm installs on local dev machines, which avoids permission issues on NixOS and similar systems. Lines 76-85 provide three tiers of fallback (local node_modules, npx, global), making the install flow robust across different environments.
Also applies to: 75-85
155-155: ✓ Consistent use of $(AIR_BIN) variable throughout dev target.Both debug and non-debug paths now use the centralized
$(AIR_BIN)variable, ensuring the tool path is resolved consistently.Also applies to: 163-163
349-349: ✓ $(GOTESTSUM_BIN) consistently used across all test targets.All test invocations now reference
$(GOTESTSUM_BIN)instead of the baregotestsumcommand, ensuring they use the resolved tool path. This improves reliability on systems where gotestsum is installed in non-standard locations.Also applies to: 401-401, 425-425, 454-454, 548-548
core/bifrost.go (2)
26-26: LGTM!The import statement is correctly placed in alphabetical order and follows Go conventions.
1327-1328: LGTM!The HuggingFace provider is correctly wired into the factory switch with the proper constructor call and return pattern, consistent with other providers.
core/providers/huggingface/types.go (6)
11-48: LGTM!The custom
UnmarshalJSONimplementation properly handles both the array and object response formats from the HuggingFace API, addressing the earlier review concern.
90-92: Clarify the purpose of the Extra field.The
Extrafield is tagged withjson:"-"which means it won't be marshaled or unmarshaled. If the intent is to capture unknown additional fields from the API response, you'd need customUnmarshalJSONlogic. If it's for application-level metadata only (not from API), the current approach is fine.
288-297: LGTM!The error types properly represent different HuggingFace API error response formats.
303-324: LGTM!The embedding types properly handle flexible input formats and the response structure matches HuggingFace's feature extraction API output.
329-348: LGTM!The speech types follow a consistent pattern with the chat types. Note that the
Extrafields withjson:"-"tags won't capture unknown API fields (same consideration as the chat request Extra field).
435-436: LGTM!The type aliases provide convenient alternative names for shared generation parameters, improving code readability.
ui/lib/constants/config.ts (2)
14-14: LGTM!The HuggingFace model placeholder examples demonstrate both chat and embedding models with proper HuggingFace Hub naming conventions.
34-34: LGTM!Correctly marks HuggingFace as requiring an API key, consistent with other cloud-based inference providers.
core/providers/huggingface/huggingface_test.go (1)
12-63: LGTM! Test configuration is well-structured.The test setup correctly:
- Checks for API key before running
- Uses parallel execution
- Configures comprehensive scenarios
- Enables appropriate features (chat, streaming, tool calls, vision)
- Disables unsupported features (text completion, embedding, speech, transcription)
The model assignments appear correct and the disabled scenarios align with the provider's current capabilities.
core/providers/huggingface/models.go (1)
16-104: LGTM! Model listing conversion is well-implemented.The code demonstrates good practices:
- Proper nil handling and input validation
- Pre-allocated slices for performance
- Skip logic for invalid models (empty ID or no supported methods)
- Comprehensive method derivation from both pipeline tags and model tags
- Deduplication using a map-based approach
- Sorted output for consistency
The logic correctly maps HuggingFace model metadata to Bifrost's schema.
core/providers/huggingface/embedding.go (1)
10-96: LGTM! Embedding conversion logic is solid. The implementation correctly handles:

- Nil input validation
- Model/provider extraction via `splitIntoModelProvider`
- Mapping single text vs. array of texts (see the sketch after this list)
- Provider-specific parameters from `ExtraParams`
- Embedding array construction with proper indexing
- Missing usage information (documented that HF doesn't provide it)

The converters follow the established pattern and handle both request and response transformations cleanly.
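A small sketch of the single-text-versus-array mapping noted above, assuming a trimmed request type with the `interface{}`-typed `Inputs` field described in the types.go review:

```go
package huggingface

// embeddingRequest is a trimmed stand-in; Inputs is interface{} so it can
// carry either a bare string or a []string, matching the feature-extraction
// API's accepted shapes.
type embeddingRequest struct {
	Inputs interface{} `json:"inputs"`
}

// buildEmbeddingInputs sends a single text as a plain string and several
// texts as an array.
func buildEmbeddingInputs(texts []string) embeddingRequest {
	if len(texts) == 1 {
		return embeddingRequest{Inputs: texts[0]}
	}
	return embeddingRequest{Inputs: texts}
}
```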
core/providers/huggingface/transcription.go (1)
9-142: LGTM! Transcription converters handle complex parameter mapping well.The implementation demonstrates:
- Proper input validation with clear error messages
- Correct use of
extractIntFromInterfaceto handle JSON numeric types (addressing past review concerns)- Comprehensive generation parameter mapping (do_sample, max_new_tokens, temperature, etc.)
- Proper handling of the polymorphic
early_stoppingfield (bool or string)- Segment conversion with timestamp preservation
- Clean separation of concerns
Previous issues with error messages and integer type assertions have been addressed.
docs/contributing/adding-a-provider.mdx (1)
1-2070: Documentation is comprehensive and well-structured.The guide provides excellent coverage of:
- Clear distinction between OpenAI-compatible and custom API providers
- Step-by-step implementation phases with proper ordering
- File structure and naming conventions
- Type definitions and validation requirements
- Converter patterns with real examples
- Test setup and CI/CD integration
- UI and configuration updates
The reference code examples serve their purpose of illustrating patterns without needing to be complete implementations (as clarified by the author in past reviews).
core/providers/huggingface/speech.go (1)
9-120: LGTM! Speech synthesis converters are well-implemented.The code follows the same robust patterns as transcription:
- Input validation with clear error messages
- Proper use of
extractIntFromInterfacefor JSON numeric handling- Comprehensive generation parameter mapping
- Handling of polymorphic
early_stoppingfield- Clear documentation that HF TTS doesn't provide usage/alignment data
- Clean response construction with proper metadata
Previous type assertion issues have been resolved.
core/providers/huggingface/chat.go (1)
12-421: LGTM! Chat converters handle complex transformations correctly.The implementation demonstrates excellent handling of:
- Nil pointer protection for tool calls (fixed from past review)
- Message content conversion (string vs. structured blocks)
- Image URL handling for vision models
- Tool call conversion with proper nil checks
- Response format and stream options
- Debug logging (variable properly defined in package scope)
- Logprobs and top_logprobs conversion
- Streaming delta transformation
The code properly handles both non-streaming and streaming responses with appropriate type conversions.
core/providers/huggingface/utils.go (3)
137-166: Good fix for model name validation. The `splitIntoModelProvider` function now properly handles invalid input (no slashes) by returning an error instead of producing empty strings. The debug logging helps trace the parsing logic, and the distinction between single-slash (org/model) and multi-slash (provider/org/model) formats is clear.
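A hedged sketch of the parsing rule described here; the signature and the defaulting behaviour for the single-slash case are assumptions, not the provider's exact implementation:

```go
package huggingface

import (
	"fmt"
	"strings"
)

// splitModelProvider sketches the rule: "org/model" keeps the whole string
// as the model name, "provider/org/model" peels the first segment off as
// the inference provider, and anything without a slash is rejected.
func splitModelProvider(model string) (modelName, inferenceProvider string, err error) {
	parts := strings.Split(model, "/")
	switch {
	case len(parts) < 2:
		return "", "", fmt.Errorf("invalid model %q: expected org/model or provider/org/model", model)
	case len(parts) == 2:
		// No explicit inference provider; the assumed default is applied upstream.
		return model, "", nil
	default:
		return strings.Join(parts[1:], "/"), parts[0], nil
	}
}
```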
168-201: Provider routing correctly limited to supported operations. As clarified in past reviews, only 6 providers (fal-ai, hf-inference, nebius, replicate, sambanova, scaleway) support embedding/speech/transcription operations. The other 13 providers in `INFERENCE_PROVIDERS` are used for chat/text-generation, which follows a different routing pattern. The error messages have been corrected to reference the appropriate provider names.
257-296: Excellent utility function for JSON numeric handling. The `extractIntFromInterface` helper comprehensively handles all numeric types that can result from JSON unmarshaling:

- All signed and unsigned integer types
- Float types (with conversion)
- `json.Number` with fallback parsing

This is used throughout the provider to safely extract integer parameters from `ExtraParams`, addressing the type assertion issues flagged in earlier reviews.
core/providers/huggingface/huggingface.go (9)
33-85: Response pooling helpers look correct and safe to reuse. The sync.Pool setup and acquire/release helpers for `HuggingFaceChatResponse`, `HuggingFaceTranscriptionResponse`, and `HuggingFaceSpeechResponse` correctly reset structs on acquire and only reuse them after the call site is done, which avoids stale state and minimizes allocations. No changes needed here.
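For readers unfamiliar with the pattern, a generic sketch of the acquire-reset/release shape described above, using a stand-in type rather than the real response structs:

```go
package huggingface

import "sync"

// chatResponse stands in for HuggingFaceChatResponse.
type chatResponse struct {
	ID      string
	Choices []string
}

var chatResponsePool = sync.Pool{
	New: func() interface{} { return &chatResponse{} },
}

// acquireChatResponse hands out a zeroed struct so no state leaks between uses.
func acquireChatResponse() *chatResponse {
	resp := chatResponsePool.Get().(*chatResponse)
	*resp = chatResponse{} // reset before reuse
	return resp
}

// releaseChatResponse returns the struct to the pool once the caller is done with it.
func releaseChatResponse(resp *chatResponse) {
	if resp == nil {
		return
	}
	chatResponsePool.Put(resp)
}
```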
87-131: Provider initialization and URL handling are consistent with other providers.
NewHuggingFaceProvider,GetProviderKey, andbuildRequestURLfollow the existing provider patterns: they apply defaults, configure the fasthttp client (including proxy), normalizeBaseURL, honorCustomProviderConfig, and respect per-request path overrides. This looks good and matches the broader design.
132-205: Core HTTP request path is robust and correctly decouples the response body.
completeRequestcleanly centralizes request construction, extra-header injection, auth, context-aware execution, non-200 handling viaHandleProviderAPIError, gzip-aware decoding viaCheckAndDecodeBody, and copies the body before releasing the fasthttp response to avoid use-after-free. The debug logging is also appropriately gated. No functional issues from this implementation.
207-293: Model listing logic is aligned with Bifrost patterns.
listModelsByKeyandListModelscorrectly apply operation-allowed checks, build the model hub URL, attach auth and extra headers, handle HTTP and provider-level errors, decode viaHuggingFaceListModelsResponse, and then delegate fan-out across keys viaHandleMultipleListModelsRequests, including latency and optional raw-response propagation. This looks consistent and complete.
303-375: ChatCompletion request/response wiring looks solid.The ChatCompletion path correctly:
- Checks operation permissions.
- Normalizes the model via
splitIntoModelProviderbefore building the HF request.- Uses
CheckContextAndGetRequestBodywithToHuggingFaceChatCompletionRequestand explicitly disables streaming.- Builds the URL with
buildRequestURL, callscompleteRequest, and uses the pooledHuggingFaceChatResponsefor JSON decoding.- Converts to
BifrostChatResponseand fillsExtraFields(provider, requested model, request type, latency, raw response when enabled).This end-to-end flow is cohesive and matches the intended Bifrost provider contract.
377-659: Streaming chat implementation handles SSE and Responses fallback correctly. The `ChatCompletionStream` implementation is careful and comprehensive: it validates operation permissions, handles the Responses→Chat fallback via a stream state object, builds a streaming POST with `Accept: text/event-stream`, and uses `bufio.Scanner` on `BodyStream()` with an increased buffer. The loop:

- Filters comments/empty lines and `[DONE]`.
- Parses SSE `data:` lines, detects error payloads via `HuggingFaceResponseError`, and sends a structured `BifrostError`.
- Decodes normal chunks into `HuggingFaceChatStreamResponse`, converts to Bifrost stream responses, populates per-chunk metadata (including latency and chunk index), and optionally attaches raw JSON.
- Handles the combined usage+content case for Responses fallback by splitting content/usage into separate events.
- Propagates scanner errors via `ProcessAndSendError` and ensures resources/channels are released.

This is a solid streaming implementation; no blocking issues identified (a minimal sketch of the scanning loop follows below).
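A minimal sketch of that scanning loop, under simplifying assumptions: error-payload detection, the Responses fallback, and the Bifrost channel wiring are omitted, and the chunk type is a trimmed stand-in:

```go
package huggingface

import (
	"bufio"
	"encoding/json"
	"io"
	"strings"
)

// streamChunk is a trimmed stand-in for HuggingFaceChatStreamResponse.
type streamChunk struct {
	ID      string          `json:"id"`
	Choices json.RawMessage `json:"choices"`
}

// scanSSE walks the event stream line by line: blank keep-alives and SSE
// comments are skipped, [DONE] ends the stream, and every data: payload is
// decoded and handed to the callback.
func scanSSE(body io.Reader, onChunk func(streamChunk)) error {
	scanner := bufio.NewScanner(body)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow large SSE lines

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, ":") {
			continue
		}
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		data := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if data == "[DONE]" {
			return nil
		}
		var chunk streamChunk
		if err := json.Unmarshal([]byte(data), &chunk); err != nil {
			return err
		}
		onChunk(chunk)
	}
	return scanner.Err()
}
```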
661-683: Responses and ResponsesStream wrappers are straightforward and correct.These methods reuse the ChatCompletion/ChatCompletionStream implementations and only adapt to the Responses API shape, setting
ExtraFields.RequestType,Provider, andModelRequestedappropriately. This is a good reuse of existing logic and behaves as expected.
756-830: Speech and SpeechStream behaviors are coherent with provider capabilities.
Speechcorrectly reuses the model-mapping pattern (includingProviderTask == "text-to-speech"), builds the HF-specific speech request viaToHuggingFaceSpeechRequest, executescompleteRequest, and converts the pooledHuggingFaceSpeechResponseinto aBifrostSpeechResponsewith fully populatedExtraFields(including latency and optional raw response).SpeechStreamexplicitly returns an unsupported-operation error tagged withschemas.SpeechStreamRequest, which is appropriate if streaming speech isn’t supported yet.
832-906: Transcription and TranscriptionStream correctly target ASR and error types.
Transcription:
- Checks operation permissions for
schemas.TranscriptionRequest.- Splits model/provider and resolves a mapping where
ProviderTask == "automatic-speech-recognition".- Builds the body via
ToHuggingFaceTranscriptionRequest, derives the inference route URL withgetInferenceProviderRouteURL, and callscompleteRequest.- Uses the pooled
HuggingFaceTranscriptionResponseto decode and convert toBifrostTranscriptionResponse, filling provider/model/request-type/latency/RawResponse metadata.
TranscriptionStreamcleanly returns an unsupported-operation error with the correctschemas.TranscriptionStreamRequesttag. This aligns with the intended ASR behavior and prior fixes to the task and error types.
Force-pushed from 4663a80 to 53cb91d
Actionable comments posted: 0
♻️ Duplicate comments (4)
Makefile (3)
97-98: DUPLICATE COMMENT: `which` with a full-path variable, still present despite prior review. This pattern was flagged in a previous review (commit f279893) but remains in the code. While the AND logic with `[ ! -x "$$INSTALLED" ]` does provide fallback protection, the condition `! which $$INSTALLED` will always be true for full paths (since `which` cannot resolve absolute paths). Simplify by removing the redundant `which` check:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```
103-104: DUPLICATE COMMENT: `which` fallback for a full-path variable, unclear intent and inconsistent with the AIR_BIN pattern. This pattern was flagged in a previous review (commit f279893). Although the `[ -x ]` check comes first and will handle full paths correctly (short-circuiting the `which` fallback), the fallback itself is ineffective for full paths and inconsistent with the simpler `[ -x "$(AIR_BIN)" ]` check used for AIR_BIN on line 90. Consider aligning both patterns:

```diff
- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \
```
110-111: DUPLICATE COMMENT: `which` with a full-path variable, same issue as lines 97–98. This pattern was flagged in a previous review (commit f279893) but persists here. The AND logic provides practical protection, but the `which` check is redundant for full paths. Apply the same simplification as line 97:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```
core/providers/huggingface/huggingface.go (1)
381-387: Bug: Incorrect request type in unsupported operation errors. Both `TextCompletion` and `TextCompletionStream` return errors with `schemas.EmbeddingRequest` instead of the correct request types. Apply this diff to fix the error types:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (3)
Makefile (1)
75-85: Next.js fallback chain is reasonable but could be simplified. The logic prefers local `./node_modules/.bin/next`, then `npx`, then global install. However, the condition structure is a bit convoluted:

- Line 77: checks if local next exists
- Line 79: only runs npm install if npx is found (but the intent seems to be to install next if not found)
- Line 83: fallback comment mentions "may fail on Nix"

The logic works, but consider clarifying the intent: should the npm install on line 81 run unconditionally if local next is missing, or should it depend on npx availability? Currently, it only runs `npm install` if `npx` exists, which may not match the intent.
core/schemas/account.go (1)
54-57: Consider adding an `Endpoint` field for self-hosted deployments. The current implementation only includes `Deployments`, which covers model-to-deployment mappings. Per the linked issue, self-hosted endpoints are an optional enhancement. For future extensibility, consider whether an `Endpoint` field (similar to `AzureKeyConfig`) would be beneficial for users deploying models on custom infrastructure or using dedicated inference endpoints. This is non-blocking for the initial implementation.

```diff
 type HuggingFaceKeyConfig struct {
+	Endpoint    string            `json:"endpoint,omitempty"`    // Custom HuggingFace inference endpoint URL
 	Deployments map[string]string `json:"deployments,omitempty"` // Mapping of model identifiers to deployment names
 }
```
60-104: Consider simplifying zero-usage handling. The conversion logic is correct. However, lines 94-100 explicitly create a zero-valued `BifrostLLMUsage` when the HuggingFace response doesn't include usage. You could simplify by leaving `Usage` as `nil` when not provided, since the Bifrost schema allows it to be omitted. If you prefer explicit zero values, the current implementation is fine. Otherwise, you can remove lines 94-100:

```diff
 // Map usage information if available
 if response.Usage != nil {
 	bifrostResponse.Usage = &schemas.BifrostLLMUsage{
 		PromptTokens:     response.Usage.PromptTokens,
 		CompletionTokens: response.Usage.CompletionTokens,
 		TotalTokens:      response.Usage.TotalTokens,
 	}
-} else {
-	// Set empty usage if not provided
-	bifrostResponse.Usage = &schemas.BifrostLLMUsage{
-		PromptTokens:     0,
-		CompletionTokens: 0,
-		TotalTokens:      0,
-	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
.github/workflows/pr-tests.yml (1 hunks), .github/workflows/release-pipeline.yml (4 hunks), Makefile (8 hunks), core/bifrost.go (2 hunks), core/internal/testutil/account.go (3 hunks), core/internal/testutil/responses_stream.go (1 hunks), core/providers/huggingface/chat.go (1 hunks), core/providers/huggingface/embedding.go (1 hunks), core/providers/huggingface/huggingface.go (1 hunks), core/providers/huggingface/huggingface_test.go (1 hunks), core/providers/huggingface/models.go (1 hunks), core/providers/huggingface/speech.go (1 hunks), core/providers/huggingface/transcription.go (1 hunks), core/providers/huggingface/types.go (1 hunks), core/providers/huggingface/utils.go (1 hunks), core/schemas/account.go (2 hunks), core/schemas/bifrost.go (3 hunks), core/schemas/mux.go (2 hunks), docs/apis/openapi.json (1 hunks), docs/contributing/adding-a-provider.mdx (1 hunks), docs/features/unified-interface.mdx (1 hunks), transports/config.schema.json (2 hunks), ui/README.md (1 hunks), ui/lib/constants/config.ts (2 hunks), ui/lib/constants/icons.tsx (1 hunks), ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/mux.go
- core/internal/testutil/account.go
- transports/config.schema.json
- core/providers/huggingface/utils.go
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- core/providers/huggingface/speech.go
- docs/features/unified-interface.mdx
- ui/lib/constants/config.ts
- core/bifrost.go
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/bifrost.gocore/schemas/account.gocore/providers/huggingface/transcription.godocs/contributing/adding-a-provider.mdxMakefilecore/providers/huggingface/huggingface_test.gocore/providers/huggingface/embedding.gocore/providers/huggingface/chat.goui/lib/constants/logs.tscore/providers/huggingface/models.goui/lib/constants/icons.tsxcore/providers/huggingface/huggingface.gocore/providers/huggingface/types.go
🧬 Code graph analysis (6)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
ModelProvider(171-174)
core/providers/huggingface/transcription.go (6)
core/schemas/transcriptions.go (2)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest(376-381)HuggingFaceTranscriptionRequestParameters(384-387)HuggingFaceTranscriptionGenerationParameters(390-407)HuggingFaceTranscriptionEarlyStopping(411-414)HuggingFaceTranscriptionResponse(447-450)ui/components/ui/input.tsx (1)
Input(15-69)core/schemas/models.go (1)
Model(109-129)core/providers/elevenlabs/transcription.go (1)
ToBifrostTranscriptionResponse(100-150)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest(9-16)BifrostEmbeddingResponse(22-28)EmbeddingData(118-122)EmbeddingStruct(124-128)core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest(303-313)HuggingFaceEmbeddingResponse(324-328)core/schemas/chatcompletions.go (1)
BifrostLLMUsage(640-647)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (12)
BifrostChatRequest(12-19)ChatContentBlockTypeText(497-497)ChatContentBlockTypeImage(498-498)ChatAssistantMessage(541-545)ChatToolMessage(536-538)BifrostResponseChoice(582-590)ChatAssistantMessageToolCall(564-570)ChatNonStreamResponseChoice(605-608)BifrostLogProbs(593-598)LogProb(625-629)ChatStreamResponseChoice(611-613)ChatStreamResponseChoiceDelta(616-622)core/providers/huggingface/types.go (12)
HuggingFaceChatRequest(72-92)HuggingFaceChatMessage(94-102)HuggingFaceContentItem(105-109)HuggingFaceImageRef(111-113)HuggingFaceToolCall(115-119)HuggingFaceFunction(121-125)HuggingFaceResponseFormat(127-130)HuggingFaceStreamOptions(139-141)HuggingFaceTool(143-146)HuggingFaceToolFunction(148-152)HuggingFaceChatResponse(154-161)HuggingFaceChatStreamResponse(215-224)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(24-26)core/schemas/bifrost.go (13)
ModelProvider(32-32)RequestType(86-86)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)TextCompletionRequest(90-90)TextCompletionStreamRequest(91-91)ResponsesRequest(94-94)ResponsesStreamRequest(95-95)EmbeddingRequest(96-96)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/schemas/models.go (2)
BifrostListModelsResponse(36-45)Model(109-129)
core/providers/huggingface/types.go (1)
core/schemas/models.go (1)
Model(109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (25)
.github/workflows/release-pipeline.yml (1)
118-118: Environment variable additions are correct and necessary. The `HUGGING_FACE_API_KEY` environment variable is properly added to all four release jobs with correct GitHub Secrets reference syntax. The environment variables are inherited by the release scripts' child processes (specifically `go test`), where HuggingFace tests access the key via `os.Getenv("HUGGING_FACE_API_KEY")`. This is the correct pattern for passing secrets to test suites during the release workflow. The documentation already uses consistent naming (`HUGGING_FACE_API_KEY` with underscore) as shown in `docs/contributing/adding-a-provider.mdx`, so no documentation updates are needed.

Likely an incorrect or invalid review comment.
Makefile (5)
15-21: Binary path resolution and variable initialization look sound.The new variables (GOBIN, GOPATH, DEFAULT_GOBIN, AIR_BIN, GOTESTSUM_BIN) properly detect and prioritize local Go binary paths, with sensible fallbacks. Line 21's GOTESTSUM_BIN simplification (removal of the
shell whichfallback) is an improvement. This approach supports the HuggingFace provider tests by ensuring test tooling is available in predictable locations.
24-29: Color variables using printf are a good improvement.Replacing hardcoded ANSI escape sequences with shell
printfcalls is more portable and avoids raw escape codes in the Makefile. This is a solid enhancement.
65-70: Root check in install-ui is appropriate for local development.The guard against running
makeas root on developer machines (except CI environments) prevents npm permission issues on systems like Nix. This is a pragmatic improvement.
155-155: Using AIR_BIN variable in dev target is correct.The change from direct
airinvocation to$(AIR_BIN)ensures the tool is invoked from the detected/installed path, improving reliability. Good alignment with the binary path resolution strategy.Also applies to: 163-163
349-349: Test targets correctly use GOTESTSUM_BIN variable.All test targets (test, test-core, test-plugins) now use
$(GOTESTSUM_BIN)instead of directgotestsuminvocations. This ensures consistency with the binary path resolution and supports the HuggingFace provider testing workflow. The change is well-applied across all affected targets.Also applies to: 401-401, 425-425, 454-454, 548-548
core/schemas/bifrost.go (3)
35-52: LGTM!The HuggingFace provider constant is correctly added with the lowercase value
"huggingface", consistent with the naming convention used by other providers.
55-62: LGTM!Adding HuggingFace to
SupportedBaseProvidersis appropriate, allowing it to serve as a base provider for custom provider configurations.
65-83: LGTM!HuggingFace is correctly added to
StandardProviders, completing the registration as a built-in provider.ui/lib/constants/logs.ts (2)
2-20: LGTM!The
"huggingface"entry is correctly placed in alphabetical order within theKnownProvidersNamesarray, and theProviderNametype will automatically include it through type derivation.
43-61: LGTM!The
ProviderLabelsentry correctly uses the brand-appropriate capitalization"HuggingFace"for user-facing display.core/schemas/account.go (1)
8-18: LGTM!The
HuggingFaceKeyConfigfield is correctly added to theKeystruct following the established pattern for provider-specific configurations.core/providers/huggingface/models.go (1)
16-104: LGTM! Well-structured model listing and method derivation.The implementation correctly:
- Handles nil inputs and skips invalid models
- Derives supported methods from pipeline tags and model tags with comprehensive coverage
- Pre-allocates slices for performance
- Returns sorted method lists for consistency
core/providers/huggingface/types.go (2)
28-48: LGTM! Flexible JSON unmarshaling. The custom `UnmarshalJSON` correctly handles both the array form `[...]` and object form `{"models": [...]}` returned by different HuggingFace API versions, with proper error handling when neither format matches.
416-444: LGTM! Correct handling of union types. The custom JSON marshaling/unmarshaling for `HuggingFaceTranscriptionEarlyStopping` properly handles both boolean and string ("never") forms, which aligns with HuggingFace API's flexible parameter schema.
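A sketch of the marshal side of that union, assuming the `BoolValue`/`StringValue` pointer-pair layout shown in the earlier review diff; the real type and method live on `HuggingFaceTranscriptionEarlyStopping` and may differ in detail:

```go
package huggingface

import "encoding/json"

// earlyStopping mirrors the bool-or-"never" union: at most one of the two
// pointers is expected to be set.
type earlyStopping struct {
	BoolValue   *bool
	StringValue *string
}

// MarshalJSON emits whichever variant is populated, or null when neither is.
func (e earlyStopping) MarshalJSON() ([]byte, error) {
	switch {
	case e.BoolValue != nil:
		return json.Marshal(*e.BoolValue)
	case e.StringValue != nil:
		return json.Marshal(*e.StringValue)
	default:
		return []byte("null"), nil
	}
}
```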
12-63: LGTM! Comprehensive test configuration.The test correctly:
- Gates on
HUGGING_FACE_API_KEYenvironment variable- Uses appropriate models for each feature (transcription, speech synthesis, embeddings, chat, vision)
- Configures comprehensive test scenarios covering chat, streaming, tool calls, images, embeddings, and more
- Follows testutil patterns with proper setup and cleanup
core/providers/huggingface/transcription.go (2)
9-101: LGTM! Robust parameter handling. The conversion correctly:

- Validates inputs with appropriate error messages
- Uses `extractIntFromInterface` to handle numeric parameters from JSON (which may be `float64` or `int`); see the sketch after this list
- Handles the `early_stopping` union type (bool or string)
- Maps all generation parameters comprehensively
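A sketch of the numeric-coercion pattern referenced in the second bullet; the real `extractIntFromInterface` also covers the remaining signed and unsigned integer widths:

```go
package huggingface

import (
	"encoding/json"
	"fmt"
)

// intFromInterface coerces the numeric shapes JSON unmarshalling can
// produce (float64, json.Number, plain ints) into an int.
func intFromInterface(v interface{}) (int, error) {
	switch n := v.(type) {
	case int:
		return n, nil
	case int64:
		return int(n), nil
	case float64:
		return int(n), nil
	case json.Number:
		i, err := n.Int64()
		if err != nil {
			return 0, err
		}
		return int(i), nil
	default:
		return 0, fmt.Errorf("expected a numeric value, got %T", v)
	}
}
```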
103-142: LGTM! Clean response conversion.The conversion properly:
- Validates non-nil response and model name
- Maps transcription chunks to Bifrost segments with timestamps
- Sets appropriate provider metadata in ExtraFields
core/providers/huggingface/chat.go (2)
12-207: LGTM! Comprehensive chat request conversion.The conversion correctly:
- Handles messages, roles, names, and content (string and structured blocks)
- Safely processes tool calls with nil checks for
Function.Name- Maps all chat parameters including response format, stream options, and tools
- Uses
json.RawMessagefor flexible fields likeToolChoice
209-323: LGTM! Thorough response conversion.The conversion properly:
- Validates inputs and constructs base response fields
- Converts choices, messages, tool calls, and logprobs to Bifrost format
- Maps usage information when available
- Sets appropriate ExtraFields metadata
core/providers/huggingface/huggingface.go (5)
389-461: LGTM! Solid chat completion implementation.The implementation correctly:
- Checks operation permissions
- Splits model identifiers into provider and model components
- Converts requests using the chat converter
- Handles errors and raw responses appropriately
- Sets all required ExtraFields metadata
463-745: LGTM! Robust streaming implementation.The streaming logic properly:
- Handles SSE parsing with proper line-by-line processing
- Checks for context cancellation between chunks
- Parses error responses in the stream
- Converts HuggingFace stream responses to Bifrost format
- Supports fallback to
ResponsesStreamwhen needed with proper state management- Handles combined usage and content chunks by splitting them into separate events
771-840: LGTM! Clean embedding implementation.The implementation correctly:
- Validates operation permissions
- Retrieves provider mapping to get the correct provider-specific model ID
- Validates the task type matches "feature-extraction"
- Converts responses with proper error handling
842-918: LGTM! Complete speech synthesis flow.The implementation correctly:
- Validates permissions and request body
- Retrieves and validates provider mapping for "text-to-speech" task
- Downloads audio from the returned URL
- Converts response with proper metadata
924-993: LGTM! Correct transcription implementation.The implementation correctly:
- Validates permissions and model mapping
- Checks for "automatic-speech-recognition" task
- Converts requests and responses with proper error handling
- Sets appropriate ExtraFields metadata

Summary
Adding Hugging Face inference provider.
Changes
Type of change
Affected areas
How to test
Added a new environment variable for Hugging Face:
HUGGING_FACE_API_KEY=""
Screenshots/Recordings
Breaking changes
Related issues
Closes #430
Checklist
docs/contributing/README.md and followed the guidelines