
Conversation


@qwerty-dvorak commented Dec 5, 2025

Summary

Adds a Hugging Face inference provider.

Changes

  • Added a huggingface package under core/providers, along with huggingface_test
  • Updated the Makefile to resolve tool binaries from local Go paths (GOBIN/GOPATH) instead of relying solely on the global PATH
  • Updated the documentation for adding a provider

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

# Providers
export HUGGING_FACE_API_KEY="" && make test-core PROVIDER=huggingface

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build

Added a new environment variable for Hugging Face:
HUGGING_FACE_API_KEY=""
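
The provider tests are described below as skippable when the key is absent, so the suite should not fail locally without credentials. A minimal sketch of that guard, with illustrative names (the actual wiring in core/providers/huggingface/huggingface_test.go uses the testutil setup and the comprehensive suite):

package huggingface_test

import (
	"os"
	"testing"
)

// Sketch only: skip the provider suite when no API key is configured.
func TestHuggingFaceProviderSketch(t *testing.T) {
	if os.Getenv("HUGGING_FACE_API_KEY") == "" {
		t.Skip("HUGGING_FACE_API_KEY not set; skipping HuggingFace provider tests")
	}
	// ... construct the Bifrost client and run the provider test suite here ...
}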

Screenshots/Recordings

image.png

Breaking changes

  • Yes
  • No

Related issues

Closes #430

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable


coderabbitai bot commented Dec 5, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added HuggingFace as a supported AI provider with chat completions, embeddings, speech synthesis, and speech-to-text capabilities
    • Implemented streaming support for chat and text completion features
    • Extended unified interface to include HuggingFace across all supported request types
  • Chores

    • Updated CI/CD workflows to support HuggingFace API authentication
    • Enhanced build system for improved binary path resolution and tooling management


Walkthrough

Adds HuggingFace as a first-class provider (models, chat, streaming, embeddings, TTS/STT, model listing), implements a full HuggingFace provider with converters, types, network/routing utilities, tests, UI entries, and wires HUGGING_FACE_API_KEY into CI/release workflows.

Changes

Cohort / File(s) Summary
CI / Release Workflows
.github/workflows/pr-tests.yml, .github/workflows/release-pipeline.yml
Added HUGGING_FACE_API_KEY environment variable to PR test and multiple release jobs so the key is available to tests and release scripts.
Make / Tooling
Makefile
Introduced GOBIN/GOPATH/DEFAULT_GOBIN and computed AIR_BIN/GOTESTSUM_BIN; replaced hard-coded ANSI escapes with printf-based colors; improved install and detection logic for UI/air/gotestsum; updated dev/test targets to use new binary paths.
Provider Registration & Schema
core/bifrost.go, core/schemas/bifrost.go, core/schemas/account.go, docs/apis/openapi.json, transports/config.schema.json
Registered HuggingFace model provider constant and added it to supported lists; added per-key HuggingFaceKeyConfig; updated OpenAPI and transport schema enums to include huggingface.
HuggingFace Provider Core
core/providers/huggingface/huggingface.go, core/providers/huggingface/utils.go
New HuggingFaceProvider implementation: HTTP client, request URL construction, completeRequest, model listing, pooling, routing via inference providers, and feature entrypoints (chat, text, embeddings, speech, transcription, streaming, responses and stream fallbacks).
Types & Payloads
core/providers/huggingface/types.go
Large set of request/response/streaming/error/type definitions and JSON handling for chat, embeddings, speech, transcription, model listing, and provider mappings (including custom marshal/unmarshal for mixed-type fields).
Converters / Feature Mappings
core/providers/huggingface/chat.go, core/providers/huggingface/embedding.go, core/providers/huggingface/speech.go, core/providers/huggingface/transcription.go, core/providers/huggingface/models.go
Bidirectional converters between Bifrost and HuggingFace payloads for chat (including streaming/tool-calls), embeddings, speech/TTS, transcription/STT, and model-listing capability derivation.
Tests & Test Utilities
core/providers/huggingface/huggingface_test.go, core/internal/testutil/account.go, core/internal/testutil/responses_stream.go
Added comprehensive provider tests (skippable when key absent), added HuggingFace defaults to test util configs (timeouts, retries, backoff, concurrency), and increased streaming loop iteration safety thresholds.
UI Integration
ui/lib/constants/config.ts, ui/lib/constants/icons.tsx, ui/lib/constants/logs.ts, ui/README.md
Added huggingface model placeholder and required-key flag, new HuggingFace SVG icon component, provider label entry and name list update, and README mention.
Docs / Contributor Guide / Features
docs/contributing/adding-a-provider.mdx, docs/features/unified-interface.mdx
Expanded contributor guide into a multi-step provider blueprint and added HuggingFace row to the provider support matrix with capability flags.
Mux change
core/schemas/mux.go
Adjusted delta handling to combine content and thought into a single contentDelta and added dual content/thought presence checks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant BifrostCore as Bifrost Core
    participant HFProvider as HuggingFaceProvider
    participant HFAPI as Hugging Face API

    Client->>BifrostCore: Request (chat / embedding / speech / list)
    BifrostCore->>HFProvider: Route request + key/config
    HFProvider->>HFProvider: splitIntoModelProvider / build URL / prepare body
    HFProvider->>HFAPI: HTTP request (sync or SSE stream)
    alt streaming
        HFAPI-->>HFProvider: SSE chunks
        HFProvider->>BifrostCore: Streamed BifrostStream events
        BifrostCore->>Client: Streamed events to client
    else non-stream
        HFAPI-->>HFProvider: JSON response
        HFProvider->>HFProvider: convert -> Bifrost response
        HFProvider->>BifrostCore: BifrostResponse (with latency/raw)
        BifrostCore->>Client: Final response
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Files needing close attention:
    • core/providers/huggingface/huggingface.go — streaming/SSE parsing, pooling, error mapping, request/response lifecycle.
    • core/providers/huggingface/types.go — many public types and custom JSON (Marshal/Unmarshal) edge cases.
    • core/providers/huggingface/chat.go — message/tool-call and streaming delta conversion correctness.
    • core/providers/huggingface/utils.go — model provider routing, URL construction, inference provider mapping and numeric extraction.
    • CI changes — ensure secrets are properly scoped and tests skip correctly when key absent.

Possibly related PRs

  • fix: integration test cases #671 — modifies the same .github/workflows/pr-tests.yml; related to CI test gating and environment setup for run-tests (may conflict or complement HUGGING_FACE_API_KEY wiring).

Poem

🐇 I hopped through the hub with a basket of code,
Brought HuggingFace friends down our unified road.
Chats and embeddings, speech in a hurry,
Now Bifrost and HF share a bright, feathery flurry! 🎶

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The Makefile changes introducing GOBIN, GOPATH, and binary path resolution logic appear unrelated to HuggingFace provider implementation and should be separated into a distinct PR. Move Makefile binary path resolution changes to a separate, dedicated PR focused on build tooling improvements to maintain focus on the HuggingFace provider feature.
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: huggingface provider added' clearly summarizes the main change and is directly related to the comprehensive HuggingFace provider implementation in the changeset.
Description check ✅ Passed The pull request description covers all major template sections including Summary, Changes, Type of change, Affected areas, How to test, Breaking changes, Related issues, and a complete Checklist.
Linked Issues check ✅ Passed The changeset comprehensively implements all core requirements from issue #430: Provider interface with chat/text/embeddings support, API key authentication, streaming support, error handling, type definitions, test coverage, and documentation updates.


Author

This stack of pull requests is managed by Graphite. Learn more about stacking.


github-actions bot commented Dec 5, 2025

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1006

@qwerty-dvorak force-pushed the 12-05-feat_huggingface_provider_added branch 2 times, most recently from 135b9b5 to 5baaee2 on December 5, 2025 at 13:52
@qwerty-dvorak marked this pull request as ready for review on December 5, 2025 at 14:05

@coderabbitai bot left a comment


Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)

1146-1229: Thought content is emitted twice in the stream when delta contains both Content and Thought.

At lines 1207-1213, the code aggregates both delta.Content and delta.Thought into a single contentDelta string and emits it as OutputTextDelta at line 1218. Then at lines 1343-1377, if hasThought is true, the code separately emits ReasoningSummaryTextDelta containing the thought content again.

Since these are sequential conditions (not mutually exclusive), when a delta contains both fields, thought appears in:

  1. The aggregated OutputTextDelta event (lines 1207-1213, 1218)
  2. The separate ReasoningSummaryTextDelta event (lines 1365-1377)

Clarify the intended behavior: should thought only appear in the reasoning delta event, or is it correct for it to appear in both? If thought should be separate, remove delta.Thought from the aggregation at lines 1207-1213.

🧹 Nitpick comments (6)
docs/features/unified-interface.mdx (1)

88-106: Verify Hugging Face capability matrix matches actual implementation

The Hugging Face row advertises ✅ for Text/Text (stream)/Chat/Chat (stream)/Embeddings/TTS/STT and ❌ for both Responses modes. Please double‑check this against the actual HuggingFace provider implementation (especially Responses and any streaming audio paths) so the matrix doesn’t drift from reality.

core/providers/huggingface/huggingface_test.go (1)

59-62: Move client.Shutdown() inside the subtest or use t.Cleanup.

client.Shutdown() is called unconditionally after t.Run, but if the subtest is run in parallel or skipped, this could cause issues. Consider using t.Cleanup for proper resource cleanup.

+	t.Cleanup(func() {
+		client.Shutdown()
+	})
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
core/providers/huggingface/speech.go (1)

27-28: Consider avoiding empty Parameters struct allocation.

hfRequest.Parameters is allocated on line 28 even if no parameters are actually mapped. Consider only allocating when there are parameters to set.

 	// Map parameters if present
 	if request.Params != nil {
-		hfRequest.Parameters = &HuggingFaceSpeechParameters{}
-
 		// Map generation parameters from ExtraParams if available
 		if request.Params.ExtraParams != nil {
 			genParams := &HuggingFaceTranscriptionGenerationParameters{}
 
 			// ... parameter mapping ...
 
-			hfRequest.Parameters.GenerationParameters = genParams
+			hfRequest.Parameters = &HuggingFaceSpeechParameters{
+				GenerationParameters: genParams,
+			}
 		}
 	}
core/providers/huggingface/chat.go (1)

36-38: Errors from sonic.Marshal are silently ignored.

Multiple calls to sonic.Marshal discard the error using _. While marshalling simple structs rarely fails, silently ignoring errors could mask issues in edge cases (e.g., cyclic references, unusual types).

Consider logging or returning an error when marshalling fails for better debuggability:

contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
if err != nil {
    // At minimum, log the error for debugging
    if debug {
        fmt.Printf("[huggingface debug] Failed to marshal content: %v\n", err)
    }
    continue
}

Also applies to: 61-62, 189-194

core/providers/huggingface/huggingface.go (1)

51-85: Typo: "aquire" should be "acquire"

The function names use "aquire" instead of the correct spelling "acquire". While this doesn't affect functionality, it's a code quality issue that should be fixed for consistency and readability.

-func aquireHuggingFaceChatResponse() *HuggingFaceChatResponse {
+func acquireHuggingFaceChatResponse() *HuggingFaceChatResponse {
-func aquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {
+func acquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {
-func aquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {
+func acquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {

Don't forget to update the call sites at lines 348, 790, and 861.

core/providers/huggingface/types.go (1)

379-396: Silent error swallowing in UnmarshalJSON

The UnmarshalJSON method silently returns nil when both boolean and string unmarshaling fail, leaving the struct in an uninitialized state. This could mask malformed JSON input.

Consider returning an error when the input doesn't match expected types:

 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
+
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }

This would require adding "fmt" to the imports.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 633a01d and 5baaee2.

📒 Files selected for processing (28)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types copy.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • transports/config.schema.json
  • core/internal/testutil/responses_stream.go
  • core/schemas/bifrost.go
  • core/providers/huggingface/embedding.go
  • docs/apis/openapi.json
  • core/schemas/mux.go
  • docs/features/unified-interface.mdx
  • core/internal/testutil/chat_completion_stream.go
  • ui/lib/constants/config.ts
  • core/providers/huggingface/types copy.go
  • core/providers/huggingface/models.go
  • ui/README.md
  • core/schemas/account.go
  • core/bifrost.go
  • core/providers/huggingface/transcription.go
  • ui/lib/constants/icons.tsx
  • ui/lib/constants/logs.ts
  • Makefile
  • core/providers/huggingface/huggingface_test.go
  • core/providers/huggingface/speech.go
  • core/internal/testutil/account.go
  • docs/contributing/adding-a-provider.mdx
  • core/providers/huggingface/utils.go
  • core/providers/huggingface/chat.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/types.go
🧬 Code graph analysis (10)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
  • ModelProvider (171-174)
core/providers/huggingface/embedding.go (2)
core/schemas/embedding.go (4)
  • BifrostEmbeddingRequest (9-16)
  • BifrostEmbeddingResponse (22-28)
  • EmbeddingData (118-122)
  • EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
  • HuggingFaceEmbeddingRequest (278-288)
  • HuggingFaceEmbeddingResponse (299-299)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
  • BifrostResponsesStreamResponse (1422-1460)
  • ResponsesStreamResponseTypeOutputTextDelta (1370-1370)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/internal/testutil/chat_completion_stream.go (1)
core/internal/testutil/utils.go (1)
  • CreateBasicChatMessage (247-254)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
  • HuggingFaceListModelsResponse (21-23)
core/schemas/bifrost.go (13)
  • ModelProvider (32-32)
  • RequestType (86-86)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • TextCompletionRequest (90-90)
  • TextCompletionStreamRequest (91-91)
  • ResponsesRequest (94-94)
  • ResponsesStreamRequest (95-95)
  • EmbeddingRequest (96-96)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/schemas/models.go (1)
  • Model (109-129)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
  • AzureKeyConfig (23-27)
  • VertexKeyConfig (36-42)
  • BedrockKeyConfig (53-60)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
  • NewHuggingFaceProvider (88-120)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
  • BifrostSpeechRequest (9-16)
  • BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (5)
  • HuggingFaceSpeechRequest (304-310)
  • HuggingFaceSpeechParameters (313-316)
  • HuggingFaceTranscriptionGenerationParameters (342-359)
  • HuggingFaceTranscriptionEarlyStopping (363-366)
  • HuggingFaceSpeechResponse (319-323)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/utils.go (5)
core/providers/huggingface/huggingface.go (1)
  • HuggingFaceProvider (25-31)
core/schemas/models.go (1)
  • BifrostListModelsRequest (23-34)
core/providers/utils/utils.go (5)
  • GetRequestPath (219-239)
  • MakeRequestWithContext (39-93)
  • HandleProviderAPIError (317-337)
  • CheckAndDecodeBody (423-431)
  • NewBifrostOperationError (449-460)
core/providers/huggingface/types.go (3)
  • HuggingFaceInferenceProviderMappingResponse (25-30)
  • HuggingFaceInferenceProviderMapping (39-42)
  • HuggingFaceHubError (263-266)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (17)
  • BifrostChatRequest (12-19)
  • ChatContentBlockTypeText (497-497)
  • ChatContentBlockTypeImage (498-498)
  • ChatAssistantMessage (541-545)
  • ChatToolMessage (536-538)
  • ChatToolChoiceStruct (339-344)
  • BifrostResponseChoice (582-590)
  • ChatMessageRole (415-415)
  • ChatAssistantMessageToolCall (564-570)
  • ChatAssistantMessageToolCallFunction (573-576)
  • ChatNonStreamResponseChoice (605-608)
  • BifrostLogProbs (593-598)
  • ContentLogProb (632-637)
  • LogProb (625-629)
  • BifrostLLMUsage (640-647)
  • ChatStreamResponseChoice (611-613)
  • ChatStreamResponseChoiceDelta (616-622)
core/providers/huggingface/types.go (12)
  • HuggingFaceChatRequest (47-67)
  • HuggingFaceChatMessage (69-77)
  • HuggingFaceContentItem (80-84)
  • HuggingFaceImageRef (86-88)
  • HuggingFaceToolCall (90-94)
  • HuggingFaceFunction (96-100)
  • HuggingFaceResponseFormat (102-105)
  • HuggingFaceStreamOptions (114-116)
  • HuggingFaceTool (118-121)
  • HuggingFaceToolFunction (123-127)
  • HuggingFaceChatResponse (129-136)
  • HuggingFaceChatStreamResponse (190-199)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (29)
.github/workflows/release-pipeline.yml (1)

118-118: LGTM! Consistent HuggingFace API key propagation across release jobs.

The HUGGING_FACE_API_KEY has been properly added to all four release jobs (core, framework, plugins, bifrost-http) that require provider API keys for testing. The implementation is consistent with existing API key patterns and correctly sources from GitHub secrets.

Also applies to: 191-191, 268-268, 357-357

core/schemas/account.go (1)

54-56: LGTM! HuggingFaceKeyConfig follows established patterns.

The new HuggingFaceKeyConfig type and its integration into the Key struct follow the same pattern as existing provider configurations (Azure, Vertex, Bedrock). The implementation is consistent with:

  • Similar Deployments map structure used by other providers
  • Proper JSON tags with omitempty
  • Go naming conventions
  • Optional field design

Also applies to: 17-17
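
For reference, a minimal sketch of the shape described above, assuming it mirrors the Deployments-map pattern of the other key configs (the authoritative definition and JSON tags live in core/schemas/account.go):

// Illustrative sketch only; see core/schemas/account.go for the real type.
type HuggingFaceKeyConfig struct {
	// Deployments maps a model name to a provider-specific deployment,
	// mirroring the Azure/Vertex/Bedrock key config pattern.
	Deployments map[string]string `json:"deployments,omitempty"`
}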

core/internal/testutil/account.go (1)

96-96: LGTM! HuggingFace test configuration added with conservative settings.

The HuggingFace provider has been properly integrated into the test account setup. Notable configuration choices:

  • MaxRetries: 1 - Much lower than other providers (8-10), suggesting less reliability or desire to fail fast
  • Timeout: 300 seconds - Higher than most providers (120s), indicating potentially longer response times
  • Retry backoff: 2s-30s - Conservative settings for retry attempts

These settings appear intentional for the HuggingFace Inference API characteristics. Ensure these align with production use cases and adjust if needed based on actual performance data.

Also applies to: 259-266, 512-524

.github/workflows/pr-tests.yml (1)

118-118: LGTM! HuggingFace API key added to test environment.

The HUGGING_FACE_API_KEY has been properly added to the PR test workflow, consistent with the release pipeline changes and other provider API key patterns.

ui/README.md (1)

84-84: LGTM! Documentation updated with HuggingFace provider.

The README has been updated to include HuggingFace in the list of supported providers, keeping the documentation in sync with the code changes.

core/internal/testutil/responses_stream.go (1)

693-693: Verify the lifecycle streaming safety threshold of 300 is appropriate.

The response count safety check at line 693 uses a threshold of 300 chunks. However, the file shows different thresholds for different streaming scenarios: tool streaming (100), reasoning streaming (150), lifecycle streaming (300), and basic streaming (500). Before merging, clarify:

  1. Why lifecycle streaming allows 3x more chunks than tool streaming—is this intentional differentiation or oversight?
  2. Is 300 chunks a reasonable upper bound based on actual HuggingFace lifecycle streaming behavior?
  3. Consider adding logging when approaching thresholds to help diagnose unexpected verbosity.
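A hedged sketch of the logging suggested in point 3, with hypothetical names (the real loop and the 300-chunk threshold live in core/internal/testutil/responses_stream.go):

// Hypothetical helper; limit would be 300 for the lifecycle streaming case.
// Requires the standard "testing" package.
func drainWithSafetyLimit(t *testing.T, stream <-chan any, limit int) {
	count := 0
	for chunk := range stream {
		count++
		if count == limit*8/10 {
			t.Logf("stream reached %d chunks, approaching safety limit %d", count, limit)
		}
		if count > limit {
			t.Fatalf("stream exceeded safety limit of %d chunks", limit)
		}
		_ = chunk // process the chunk as the existing tests do
	}
}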
transports/config.schema.json (1)

135-140: HuggingFace correctly wired into config schema

providers.huggingface reuses the generic provider schema and the semanticcache provider enum now includes "huggingface", both consistent with other providers and with the new ModelProvider constant. No issues spotted.

Also applies to: 764-784

core/schemas/bifrost.go (1)

35-83: ModelProvider and provider lists updated consistently for HuggingFace

HuggingFace is added as a ModelProvider and included in both SupportedBaseProviders and StandardProviders with the correct "huggingface" identifier. This aligns with the rest of the provider plumbing.

ui/lib/constants/config.ts (1)

5-24: UI config: HuggingFace placeholder and key requirement look correct

The huggingface model placeholder and isKeyRequiredByProvider entry are consistent with other providers and with the expected HF auth model. No issues from a UI/config standpoint.

Also applies to: 26-44

ui/lib/constants/logs.ts (1)

2-20: Logging constants updated to recognize HuggingFace

Adding "huggingface" to KnownProvidersNames and ProviderLabels keeps the provider type and display labels consistent across the UI logging layer. Looks good.

Also applies to: 43-61

docs/apis/openapi.json (1)

3239-3259: OpenAPI ModelProvider enum now includes HuggingFace (and Cerebras)

Extending the ModelProvider enum to include "huggingface" (and "cerebras") brings the public API spec in line with the backend/provider constants and config schema. The change is additive and backward‑compatible.

core/bifrost.go (1)

1327-1328: LGTM!

The HuggingFace provider is correctly integrated into the provider factory, following the established pattern used by other providers that return only a pointer (like OpenAI, Anthropic, Mistral, Gemini).

core/providers/huggingface/embedding.go (2)

22-31: LGTM with a note on unsupported input types.

The conversion correctly handles Text and Texts input types. The comment on lines 29-30 appropriately documents that embedding/embeddings (int arrays) are not supported by HuggingFace feature extraction.


57-92: LGTM!

The response conversion correctly:

  • Validates nil input
  • Pre-allocates the slice with proper capacity
  • Maps each embedding to the appropriate Bifrost structure
  • Documents that HuggingFace doesn't return usage information
core/providers/huggingface/models.go (2)

46-104: LGTM!

The deriveSupportedMethods function correctly:

  • Normalizes the pipeline string for case-insensitive matching
  • Uses a map to deduplicate methods
  • Handles both pipeline tags and model tags for flexibility
  • Returns a sorted, deduplicated list of supported methods
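
A condensed sketch of that derivation; the specific tag-to-method mapping shown here is an assumption, and the real deriveSupportedMethods in core/providers/huggingface/models.go is authoritative and also consults model tags:

// Sketch only; requires "sort" and "strings" plus the schemas package.
func deriveSupportedMethodsSketch(pipelineTag string) []schemas.RequestType {
	seen := map[schemas.RequestType]struct{}{}
	switch strings.ToLower(strings.TrimSpace(pipelineTag)) {
	case "text-generation", "conversational":
		seen[schemas.ChatCompletionRequest] = struct{}{}
		seen[schemas.ChatCompletionStreamRequest] = struct{}{}
	case "feature-extraction", "sentence-similarity":
		seen[schemas.EmbeddingRequest] = struct{}{}
	case "text-to-speech":
		seen[schemas.SpeechRequest] = struct{}{}
	case "automatic-speech-recognition":
		seen[schemas.TranscriptionRequest] = struct{}{}
	}
	methods := make([]schemas.RequestType, 0, len(seen))
	for m := range seen {
		methods = append(methods, m)
	}
	sort.Slice(methods, func(i, j int) bool { return string(methods[i]) < string(methods[j]) })
	return methods
}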

34-39: Code correctly distinguishes between model.ID and model.ModelID fields.

The distinction is valid: model.ModelID (maps to "modelId" in the HuggingFace API response) represents the model's user-facing identifier, while model.ID (maps to "_id" in the API response) represents the internal HuggingFace identifier. The code appropriately uses ModelID for the composite ID and Name, and ID for the HuggingFaceID field.

core/providers/huggingface/speech.go (1)

94-116: LGTM!

The response conversion correctly maps the HuggingFace response to Bifrost format, with appropriate nil checks and documentation about missing usage/alignment data.

Makefile (2)

14-21: LGTM! Improved tool binary discovery for portable environments.

The logic for detecting Go binary paths (GOBIN, GOPATH, DEFAULT_GOBIN) and constructing tool paths (AIR_BIN, GOTESTSUM_BIN) is well-structured. This enables robust tool invocation without relying on global PATH configuration, which benefits environments like Nix.


65-69: Good safety check for preventing root execution in local development.

The guard against running as root on developer machines while allowing CI environments is appropriate. This prevents global npm install failures on systems like Nix.

core/providers/huggingface/transcription.go (1)

100-139: LGTM! Response conversion is well-implemented.

The ToBifrostTranscriptionResponse method properly validates inputs, handles optional timestamp chunks, and correctly maps them to TranscriptionSegment structures.

core/providers/huggingface/chat.go (2)

201-315: LGTM! Non-streaming response conversion is comprehensive.

The ToBifrostChatResponse method properly handles:

  • Nil checks and model validation
  • Choice conversion with message, role, content, and tool calls
  • Logprobs conversion including nested top logprobs
  • Usage information mapping

317-413: LGTM! Streaming response conversion is well-implemented.

The ToBifrostChatStreamResponse method correctly converts streaming delta responses, handling roles, content, reasoning (as thought), tool calls, and logprobs appropriately.

core/providers/huggingface/utils.go (1)

83-128: LGTM! URL building with proper encoding and ExtraParams handling.

The buildModelHubURL function correctly:

  • Applies default and maximum limits
  • URL-encodes all query parameters
  • Handles various types in ExtraParams with appropriate conversions
  • Constructs a well-formed URL with the inference provider filter
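
A simplified sketch of that construction; the default/maximum limits and the query parameter names here are placeholders, and buildModelHubURL in utils.go remains the authoritative implementation:

// Sketch only; requires "net/url", "strconv", and "strings".
func buildModelHubURLSketch(baseURL, inferenceProvider string, limit int) string {
	const defaultLimit, maxLimit = 100, 1000 // placeholder bounds
	if limit <= 0 {
		limit = defaultLimit
	}
	if limit > maxLimit {
		limit = maxLimit
	}
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("inference_provider", inferenceProvider) // provider filter; parameter name illustrative
	return strings.TrimRight(baseURL, "/") + "/api/models?" + q.Encode()
}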
core/providers/huggingface/huggingface.go (3)

477-650: LGTM: Stream processing goroutine

The goroutine properly handles:

  • Context cancellation checks before processing each chunk
  • Deferred cleanup of resources (channel close, response release)
  • Scanner buffer sizing for large responses
  • Error handling with proper error propagation to the response channel
  • Both regular chat streaming and ResponsesStream fallback modes

132-205: LGTM: Request handling with proper resource management

The completeRequest function correctly:

  • Uses defer for cleanup of fasthttp resources
  • Makes a copy of the response body before releasing to avoid use-after-free
  • Handles error responses with proper error type extraction
  • Supports debug logging controlled by environment variable
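
The copy-before-release point above is the key safety detail when pooled fasthttp objects are involved; a minimal sketch of that pattern, not the actual completeRequest signature:

// Sketch only; completeRequest in huggingface.go has a different signature
// and richer error handling around status codes and provider errors.
func doRequestSketch(client *fasthttp.Client, uri string, payload []byte) ([]byte, error) {
	req := fasthttp.AcquireRequest()
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseRequest(req)
	defer fasthttp.ReleaseResponse(resp)

	req.SetRequestURI(uri)
	req.Header.SetMethod(fasthttp.MethodPost)
	req.SetBody(payload)

	if err := client.Do(req, resp); err != nil {
		return nil, err
	}
	// resp.Body() is only valid until ReleaseResponse runs, so copy it
	// before returning to avoid a use-after-free on the pooled buffer.
	body := append([]byte(nil), resp.Body()...)
	return body, nil
}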

87-120: LGTM: Provider initialization

Provider initialization correctly handles configuration defaults, client setup with timeouts, pool pre-warming, proxy configuration, and base URL normalization.

core/providers/huggingface/types.go (3)

1-43: LGTM: Model and inference provider types

The model metadata types and inference provider mapping structures are well-defined with appropriate JSON tags for API compatibility.


44-260: LGTM: Chat completion types

Comprehensive type definitions for chat requests/responses with:

  • Support for both streaming and non-streaming responses
  • Flexible content handling via json.RawMessage
  • Tool/function calling support
  • Logprobs support
  • Time info for streaming diagnostics

274-411: LGTM: Embedding, Speech, and Transcription types

Well-structured types for:

  • Embedding with flexible input types and encoding format options
  • Speech synthesis with generation parameters
  • Transcription with timestamp support and generation configuration
  • Type aliases for backward compatibility

@qwerty-dvorak force-pushed the 12-05-feat_huggingface_provider_added branch from 5baaee2 to 5a72875 on December 5, 2025 at 19:17

@coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (12)
docs/contributing/adding-a-provider.mdx (2)

500-527: Variable name inconsistency in documentation example (duplicate comment).

The example declares hfReq on line 500 but then references providerReq in the parameter mapping section. This inconsistency was flagged in a past review and remains unresolved. Replace all providerReq references with hfReq for consistency.

Apply this diff to fix the variable names:

     // Build the request
     hfReq := &HuggingFaceChatRequest{
         Model:    bifrostReq.Model,
         Messages: hfMessages,
     }
 
     // Map parameters
     if bifrostReq.Params != nil {
         params := bifrostReq.Params
         
         // Map standard parameters
         if params.Temperature != nil {
-            providerReq.Temperature = params.Temperature
+            hfReq.Temperature = params.Temperature
         }
         if params.MaxTokens != nil {
-            providerReq.MaxTokens = params.MaxTokens
+            hfReq.MaxTokens = params.MaxTokens
         }
         // ... other standard parameters
         
         // Handle provider-specific ExtraParams
         if params.ExtraParams != nil {
             if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-                providerReq.CustomParam = &customParam
+                hfReq.CustomParam = &customParam
             }
         }
     }
 
-    return providerReq
+    return hfReq

1405-1427: Incomplete/truncated code example (duplicate comment).

The code example is cut off mid-function at the tool/function calling section, which may confuse contributors trying to implement converters. This was flagged in a past review. Either complete the example with the tool calling logic and final return, or explicitly mark it as abbreviated with a comment like:

           // Tool/Function calling
           if params.Tools != nil && len(params.Tools) > 0 {
               // Convert tools...
           }
       }

       return hfReq
   }

Consider adding clarity about why the example is truncated and pointing contributors to full working examples in core/providers/huggingface/ or core/providers/anthropic/.

core/providers/huggingface/huggingface_test.go (1)

29-33: TranscriptionModel and SpeechSynthesisModel are swapped.

Based on model capabilities:

  • Kokoro-82M is a text-to-speech (TTS) model → should be SpeechSynthesisModel
  • whisper-large-v3 is a speech-to-text (transcription) model → should be TranscriptionModel

Apply this diff to fix the model assignments:

-		TranscriptionModel:   "fal-ai/hexgrad/Kokoro-82M",
-		SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3",
+		TranscriptionModel:   "fal-ai/openai/whisper-large-v3",
+		SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M",
 		SpeechSynthesisFallbacks: []schemas.Fallback{
-			{Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"},
+			{Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"},
 		},
core/providers/huggingface/speech.go (1)

37-63: Type assertions for int will fail when values come from JSON unmarshaling.

When ExtraParams is populated from JSON (e.g., from HTTP request bodies), numeric values are unmarshaled as float64, not int. These type assertions will silently fail, and the parameters won't be set.

Handle both int and float64 types for all integer parameters. Example for max_new_tokens:

-			if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
-				genParams.MaxNewTokens = &val
+			if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok {
+				intVal := int(val)
+				genParams.MaxNewTokens = &intVal
+			} else if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
+				genParams.MaxNewTokens = &val
 			}

Apply the same pattern to: max_length, min_length, min_new_tokens, num_beams, num_beam_groups, and top_k.

Consider extracting a helper function to reduce code duplication:

func getIntParam(params map[string]any, key string) *int {
    if val, ok := params[key].(float64); ok {
        intVal := int(val)
        return &intVal
    }
    if val, ok := params[key].(int); ok {
        return &val
    }
    return nil
}
core/providers/huggingface/transcription.go (2)

14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.

 	if request.Input == nil {
-		return nil, fmt.Errorf("speech request input cannot be nil")
+		return nil, fmt.Errorf("transcription request input cannot be nil")
 	}

38-63: Type assertions for int will fail when values come from JSON unmarshaling.

Same issue as in speech.go - when ExtraParams comes from JSON, numeric values are float64, not int. These assertions will silently fail.

Handle both int and float64 types. Consider creating a shared helper function in the package to avoid code duplication across speech.go and transcription.go:

// In a shared utils file
func getIntFromExtra(params map[string]any, key string) *int {
    if val, ok := params[key].(float64); ok {
        intVal := int(val)
        return &intVal
    }
    if val, ok := params[key].(int); ok {
        return &val
    }
    return nil
}

Then use it as:

genParams.MaxNewTokens = getIntFromExtra(request.Params.ExtraParams, "max_new_tokens")
core/providers/huggingface/chat.go (1)

69-81: Critical: Nil pointer dereference remains unaddressed.

Line 74 still dereferences tc.Function.Name without checking for nil, which can cause a panic.

Apply the previously suggested fix:

 for _, tc := range msg.ChatAssistantMessage.ToolCalls {
+	if tc.Function.Name == nil {
+		continue // Skip tool calls without a function name
+	}
 	hfToolCall := HuggingFaceToolCall{
 		ID:   tc.ID,
 		Type: tc.Type,
 		Function: HuggingFaceFunction{
 			Name:      *tc.Function.Name,
 			Arguments: tc.Function.Arguments,
 		},
 	}
 	hfToolCalls = append(hfToolCalls, hfToolCall)
 }
core/providers/huggingface/huggingface.go (4)

277-293: Critical: Wrong provider constant remains unfixed.

Line 279 still uses schemas.Gemini instead of schemas.HuggingFace in the operation check, causing incorrect permission validation.

Apply the previously suggested fix:

-if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
 	return nil, err
 }

700-707: Minor: Wrong request type in Embedding error messages remains unfixed.

Lines 701 and 706 use schemas.SpeechRequest instead of schemas.EmbeddingRequest in error messages, which will confuse debugging.

Apply the previously suggested fix:

 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

773-776: Major: Wrong task type check in Speech remains unfixed.

Line 774 checks for "automatic-speech-recognition" (transcription task) instead of "text-to-speech" (speech generation task). This will cause Speech operations to incorrectly validate against the wrong task type.

Apply the previously suggested fix:

 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
 	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }

831-838: Major: Wrong task type and request type in Transcription remain unfixed.

Two issues here:

  1. Lines 832 and 837 use schemas.SpeechRequest instead of schemas.TranscriptionRequest in error messages
  2. Line 836 checks for "text-to-speech" task instead of "automatic-speech-recognition" (the tasks are swapped with the Speech function)

Apply the previously suggested fix:

 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
core/providers/huggingface/types.go (1)

21-23: Critical: Struct definition may not match API response format.

The HuggingFace /api/models endpoint returns a JSON array directly, but HuggingFaceListModelsResponse expects an object with a models field. This mismatch will cause unmarshaling to fail.

Run this script to verify the API response format:

#!/bin/bash
# Check the actual response format from HuggingFace API
curl -s "https://huggingface.co/api/models?limit=1" | jq -c 'if type == "array" then "ARRAY" else "OBJECT with keys: " + (keys | join(", ")) end'

If the API returns "ARRAY", change the response type to a slice or add custom UnmarshalJSON:

-type HuggingFaceListModelsResponse struct {
-	Models []HuggingFaceModel `json:"models"`
-}
+type HuggingFaceListModelsResponse []HuggingFaceModel

And update usage in listModelsByKey (line ~256 in huggingface.go):

-response := huggingfaceAPIResponse.ToBifrostListModelsResponse(providerName)
+response := ToBifrostListModelsResponse(huggingfaceAPIResponse, providerName)
🧹 Nitpick comments (8)
docs/contributing/adding-a-provider.mdx (1)

43-43: Hyphenate compound adjectives: "OpenAI-compatible" throughout documentation.

For consistency and grammatical correctness, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible":

- #### Non-OpenAI Compatible Providers
+ #### Non-OpenAI-compatible Providers

- #### OpenAI Compatible Providers
+ #### OpenAI-compatible Providers

- ### OpenAI Compatible Providers
+ ### OpenAI-compatible Providers

- ### For OpenAI Compatible Providers
+ ### For OpenAI-compatible Providers

Apply these changes throughout the document to improve consistency and readability.

Also applies to: 71-71, 629-629, 1469-1469

Makefile (3)

14-21: Binary path variables introduce deterministic tool discovery—good for portability.

The approach of chaining GOBIN → GOPATH/bin → default Go locations is sound and addresses real pain points on systems like Nix. However, Line 21's fallback to which may not handle absolute paths reliably:

GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))

which typically expects command names (searched in PATH), not absolute paths. If which is called with a full path, it may fail unexpectedly on some systems. Consider using command -v or simplifying to just fallback to the bare name:

- GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))
+ GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,gotestsum)

The || echo gotestsum ensures the variable is never empty, but the bare name should be sufficient; downstream checks like Line 103 already verify existence.


89-100: install-air logic is sound but the binary availability check is convoluted (Line 97).

The condition:

if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then

…will work (warns only if the full path doesn't exist), but it's unnecessarily complex. Since INSTALLED is a full path, which on an absolute path may fail unexpectedly on some shells. Simplify to:

- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \

102-113: install-gotestsum has the same which issue and convoluted logic (Lines 103, 110).

Line 103:

if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then

If GOTESTSUM_BIN is a full path like /home/user/go/bin/gotestsum, which on it will fail. Simplify to just check the file:

- if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ if [ -x "$(GOTESTSUM_BIN)" ]; then \

Line 110 has the same convoluted conditional as Line 97—apply the same simplification:

- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
core/schemas/mux.go (1)

1146-1221: Confirm intent of mixing delta.Thought into output_text.delta and potential duplication

The new logic now (a) enters the text path when either hasContent or hasThought is true and (b) builds contentDelta by concatenating delta.Content and delta.Thought, then uses that for ResponsesStreamResponseTypeOutputTextDelta.Delta, while still emitting a separate ResponsesStreamResponseTypeReasoningSummaryTextDelta based on delta.Thought.

This is a behavioral change from “text delta only reflects Content” to “text delta reflects Content + Thought”, and also means the same Thought tokens appear in both the output-text and reasoning-summary streams. If Thought is intended to remain non-user-visible reasoning, this could leak it into the primary text channel and may surprise existing consumers that only look at output_text.delta.

Please double‑check that:

  • Existing UIs/clients that consume output_text.delta are meant to see Thought text, and
  • They can tolerate Thought appearing both in output_text.delta and reasoning_summary_text.delta.

If the goal was only to advance lifecycle (item creation / closing) when thought‑only chunks arrive, a narrower change that gates on hasThought but keeps Delta based solely on Content might be safer. Happy to sketch that refactor if you confirm the desired semantics.

core/providers/huggingface/huggingface_test.go (1)

59-63: Consider using defer for client.Shutdown() to ensure cleanup on panic.

If RunAllComprehensiveTests panics, client.Shutdown() won't be called, potentially leaving resources unreleased.

+	defer client.Shutdown()
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
core/providers/huggingface/chat.go (2)

34-64: Consider handling marshaling errors for content conversion.

Lines 37 and 61 discard marshaling errors when converting content. While the API may reject invalid payloads, explicitly handling these errors would make debugging easier.

Consider logging or returning errors:

 if msg.Content.ContentStr != nil {
-	contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr)
+	contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
+	if err != nil {
+		// Log warning or return error
+		continue
+	}
 	hfMsg.Content = json.RawMessage(contentJSON)
 }

Apply similar handling at line 61 for ContentBlocks.


136-195: Consider handling errors in ResponseFormat and ToolChoice conversions.

Lines 138-144 and 189-193 silently discard marshaling errors when converting ResponseFormat and ToolChoice. Since these are optional but important parameters, logging failures would aid debugging.

Example for ResponseFormat:

 if params.ResponseFormat != nil {
 	responseFormatJSON, err := sonic.Marshal(params.ResponseFormat)
-	if err == nil {
+	if err != nil {
+		// Log warning: failed to marshal ResponseFormat
+	} else {
 		var hfResponseFormat HuggingFaceResponseFormat
 		if err := sonic.Unmarshal(responseFormatJSON, &hfResponseFormat); err == nil {
 			hfReq.ResponseFormat = &hfResponseFormat
+		} else {
+			// Log warning: failed to unmarshal ResponseFormat
 		}
 	}
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5baaee2 and 5a72875.

📒 Files selected for processing (28)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types copy.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • core/providers/huggingface/utils.go
🚧 Files skipped from review as they are similar to previous changes (14)
  • core/providers/huggingface/types copy.go
  • ui/README.md
  • docs/features/unified-interface.mdx
  • .github/workflows/pr-tests.yml
  • docs/apis/openapi.json
  • ui/lib/constants/logs.ts
  • core/providers/huggingface/embedding.go
  • transports/config.schema.json
  • ui/lib/constants/config.ts
  • .github/workflows/release-pipeline.yml
  • core/schemas/bifrost.go
  • core/internal/testutil/chat_completion_stream.go
  • core/internal/testutil/account.go
  • core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/huggingface/huggingface_test.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/chat.go
  • core/providers/huggingface/speech.go
  • core/schemas/mux.go
  • Makefile
  • core/providers/huggingface/transcription.go
  • core/bifrost.go
  • ui/lib/constants/icons.tsx
  • core/schemas/account.go
  • docs/contributing/adding-a-provider.mdx
  • core/providers/huggingface/types.go
  • core/providers/huggingface/huggingface.go
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
  • SetupTest (51-60)
core/internal/testutil/account.go (2)
  • ComprehensiveTestConfig (47-64)
  • TestScenarios (22-44)
core/schemas/bifrost.go (2)
  • HuggingFace (51-51)
  • Fallback (131-134)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/tests.go (1)
  • RunAllComprehensiveTests (15-62)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
  • HuggingFaceListModelsResponse (21-23)
core/schemas/bifrost.go (13)
  • ModelProvider (32-32)
  • RequestType (86-86)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • TextCompletionRequest (90-90)
  • TextCompletionStreamRequest (91-91)
  • ResponsesRequest (94-94)
  • ResponsesStreamRequest (95-95)
  • EmbeddingRequest (96-96)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/schemas/models.go (2)
  • BifrostListModelsResponse (36-45)
  • Model (109-129)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
  • BifrostSpeechRequest (9-16)
  • BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (5)
  • HuggingFaceSpeechRequest (304-310)
  • HuggingFaceSpeechParameters (313-316)
  • HuggingFaceTranscriptionGenerationParameters (342-359)
  • HuggingFaceTranscriptionEarlyStopping (363-366)
  • HuggingFaceSpeechResponse (319-323)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
  • BifrostResponsesStreamResponse (1422-1460)
  • ResponsesStreamResponseTypeOutputTextDelta (1370-1370)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
  • NewHuggingFaceProvider (88-120)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
  • AzureKeyConfig (23-27)
  • VertexKeyConfig (36-42)
  • BedrockKeyConfig (53-60)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (9)
Makefile (2)

155-155: Variable substitutions are correct and enable deterministic test execution.

The use of $(AIR_BIN) and $(GOTESTSUM_BIN) throughout the dev and test targets is consistent and properly formatted. This decoupling from PATH-based tool discovery is essential for reliable test execution, especially for the new HuggingFace provider tests.

Also applies to: 163-163, 349-349, 401-401, 425-425, 454-454, 548-548


24-29: Color vars, root check, and next installation fallback are sound.

  • Color refactor (lines 24–29): Using printf centralizes ANSI code definitions—good for maintainability.
  • Root user guard (lines 67–70): Defensive check prevents npm permission issues on dev machines—appropriate.
  • Next installation fallback (lines 75–85): Thoughtful multi-level strategy (local, then npx, then global) handles Nix-like environments well.

Also applies to: 66-70, 75-85

core/schemas/account.go (1)

54-56: LGTM! Consistent with existing provider key config patterns.

The HuggingFaceKeyConfig type follows the established pattern used by Azure, Vertex, and Bedrock configurations, with a Deployments map for model-to-deployment mapping.

core/bifrost.go (1)

1327-1328: LGTM! Provider integration follows established patterns.

The HuggingFace provider case follows the same pattern as other providers like OpenAI, Anthropic, and Gemini that return a provider instance without an error.

core/providers/huggingface/speech.go (1)

94-116: LGTM! Response conversion is straightforward and correct.

The ToBifrostSpeechResponse properly validates inputs and maps the response fields. The comment about missing usage/alignment data is helpful for future maintainers.

core/providers/huggingface/models.go (3)

16-44: LGTM! Model list conversion handles edge cases properly.

The method correctly:

  • Skips models with empty ModelID
  • Skips models without derivable supported methods
  • Pre-allocates the result slice with appropriate capacity
  • Creates a composite ID with the provider prefix

46-104: LGTM! Comprehensive method derivation from pipeline and tags.

The function properly:

  • Normalizes the pipeline tag
  • Uses a set to deduplicate methods
  • Handles both primary pipeline tags and secondary model tags
  • Returns a sorted slice for consistent output

11-14: These constants are used in core/providers/huggingface/utils.go (lines 90, 92–93), so no action is needed.

core/providers/huggingface/transcription.go (1)

100-139: LGTM! Response conversion with proper segment mapping.

The method correctly:

  • Validates non-nil receiver and non-empty model name
  • Maps chunks to segments with proper ID assignment
  • Safely handles timestamp arrays with bounds checking before access

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)

1146-1220: Remove duplicate delta handling for Thought content.

The code processes delta.Thought twice with conflicting approaches:

  1. Lines 1207-1213: Concatenates Thought with Content into a single contentDelta string emitted as ResponsesStreamResponseTypeOutputTextDelta
  2. Lines 1369-1377: Emits the same delta.Thought separately as ResponsesStreamResponseTypeReasoningSummaryTextDelta

This means consumers receive the thought content both mixed into regular text deltas and as a separate reasoning delta event. Keep the separate reasoning delta emission (lines 1369-1377) and remove the concatenation of Thought into contentDelta (lines 1211-1212).

♻️ Duplicate comments (15)
core/providers/huggingface/types.go (1)

21-23: Struct definition doesn't match HuggingFace API response format.

This issue was previously flagged: The HuggingFace /api/models endpoint returns a JSON array directly, but HuggingFaceListModelsResponse expects an object with a models field. This will cause unmarshaling failures. Either change the response type to []HuggingFaceModel or add custom UnmarshalJSON logic.
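
A minimal sketch of the custom-unmarshal option, assuming the wrapper type keeps a Models []HuggingFaceModel field as the comment above implies and that encoding/json is already imported in the file; field names here are an assumption, not the actual types.go definition:

func (r *HuggingFaceListModelsResponse) UnmarshalJSON(data []byte) error {
	// The /api/models endpoint returns a bare JSON array, so try that shape first.
	var models []HuggingFaceModel
	if err := json.Unmarshal(data, &models); err == nil {
		r.Models = models
		return nil
	}
	// Fall back to an object form in case the payload is ever wrapped.
	type alias HuggingFaceListModelsResponse // alias type avoids recursing into this method
	var wrapped alias
	if err := json.Unmarshal(data, &wrapped); err != nil {
		return err
	}
	*r = HuggingFaceListModelsResponse(wrapped)
	return nil
}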

docs/contributing/adding-a-provider.mdx (2)

510-527: Variable name inconsistency in documentation example.

The example code references providerReq (e.g., providerReq.Temperature), but the variable was named hfReq on line 500. This inconsistency could confuse contributors.


1406-1427: Code example is truncated mid-function.

The code block is incomplete, cutting off inside the parameter mapping section. This could confuse contributors trying to follow the pattern.

core/providers/huggingface/transcription.go (2)

14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.


38-79: Type assertions for int will fail when values come from JSON unmarshalling.

When ExtraParams is populated from JSON (e.g., from request bodies), numeric values are unmarshalled as float64, not int. The type assertions like .(int) on lines 38, 41, 44, 47, 50, 53, and 62 will silently fail, causing these parameters to be ignored.

Consider handling both int and float64 types for all integer parameters (max_new_tokens, max_length, min_length, min_new_tokens, num_beams, num_beam_groups, top_k).
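
A small helper along these lines keeps the handling in one place; the helper name is hypothetical, and ExtraParams is assumed to be a map[string]any as the surrounding converters suggest:

// intFromExtraParams accepts both the int that in-process callers pass and the
// float64 that encoding/json produces for numeric values. Hypothetical helper,
// not part of the current PR.
func intFromExtraParams(params map[string]any, key string) (int, bool) {
	switch n := params[key].(type) {
	case int:
		return n, true
	case int64:
		return int(n), true
	case float64:
		return int(n), true
	default:
		return 0, false
	}
}

// Example call site:
// if v, ok := intFromExtraParams(request.Params.ExtraParams, "max_new_tokens"); ok {
//     genParams.MaxNewTokens = &v
// }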

core/providers/huggingface/speech.go (1)

37-77: Type assertions for int will fail when values come from JSON unmarshalling.

When ExtraParams is populated from JSON, numeric values are unmarshalled as float64, not int. The type assertions like .(int) on lines 37, 40, 43, 46, 49, 52, and 61 will silently fail, causing these parameters to be ignored.

Consider handling both int and float64 types for all integer parameters.

core/providers/huggingface/chat.go (1)

69-78: Potential nil pointer dereference when accessing tc.Function.Name.

On line 74, tc.Function.Name is dereferenced without checking if it's nil. If the tool call has a nil Name field, this will cause a panic.

Add a nil check before dereferencing or skip tool calls without function names.

core/providers/huggingface/utils.go (3)

136-162: Empty provider and model names when input has no slashes.

When splitIntoModelProvider receives a model name with no slashes (t == 0), both prov and model remain empty strings. Downstream code at lines 313 and 398 in huggingface.go formats this as "" : "", resulting in malformed model identifiers.

Handle the t == 0 case by setting a default provider (e.g., "hf-inference") and using the input as the model name.


164-197: Incomplete provider routing in getInferenceProviderRouteURL — 13 of 19 defined providers will error.

The function only handles 6 providers (fal-ai, hf-inference, nebius, replicate, sambanova, scaleway) while INFERENCE_PROVIDERS defines 19 total. Providers like cerebras, cohere, groq, featherlessAI, fireworksAI, hyperbolic, novita, nscale, ovhcloud, publicai, together, wavespeed, and zaiOrg will hit the default error case.

This affects embedding, speech, and transcription operations. Either expand routing logic for all providers or remove unsupported ones from INFERENCE_PROVIDERS.


180-191: Copy-paste error: Wrong provider names in error messages.

Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" (line 184) and "scaleway" (line 190).

core/providers/huggingface/huggingface.go (5)

277-281: Bug: Wrong provider constant used in ListModels.

CheckOperationAllowed is called with schemas.Gemini instead of schemas.HuggingFace. This is a copy-paste error and will cause incorrect operation permission checks.


295-301: Bug: Wrong request types in unsupported operation errors.

Both TextCompletion and TextCompletionStream return errors with schemas.EmbeddingRequest instead of the correct request types (schemas.TextCompletionRequest and schemas.TextCompletionStreamRequest).


700-707: Bug: Wrong request type in Embedding error messages.

The error messages reference schemas.SpeechRequest instead of schemas.EmbeddingRequest when the embedding operation is unsupported (lines 701 and 706).


773-776: Bug: Wrong task type check in Speech.

The Speech function checks for "automatic-speech-recognition" task, but Speech (text-to-speech) should check for "text-to-speech". The task checks appear to be swapped between Speech and Transcription functions.


831-838: Bug: Wrong task type check and error messages in Transcription.

Two issues:

  1. The error messages use schemas.SpeechRequest instead of schemas.TranscriptionRequest (lines 832, 837)
  2. Line 836 checks for "text-to-speech" task, but Transcription should check for "automatic-speech-recognition"
🧹 Nitpick comments (4)
Makefile (2)

88-100: Extract repeated DEFAULT_GOBIN logic into a shared variable for DRY.

Lines 95 and 108 both replicate the DEFAULT_GOBIN fallback logic:

$(if $(strip $(GOBIN)),$(GOBIN)/gotestsum,$(if $(strip $(GOPATH)),$(GOPATH)/bin/gotestsum,...))

This can be defined once and reused, improving maintainability and consistency across targets.

Consider extracting into a helper pattern (or simply reusing DEFAULT_GOBIN):

INSTALLED_AIR_PATH := $(DEFAULT_GOBIN)/air
INSTALLED_GOTESTSUM_PATH := $(DEFAULT_GOBIN)/gotestsum

Then reference these in the informational messages instead of computing the logic twice.

Also applies to: 102-113


103-103: Minor: Quote GOTESTSUM_BIN in which invocation for robustness.

Line 103 uses which $(GOTESTSUM_BIN) without quotes. While binary names rarely contain spaces, shell best practice is to quote variable expansions:

which "$(GOTESTSUM_BIN)" > /dev/null 2>&1
core/providers/huggingface/types.go (2)

379-396: Consider returning an error for invalid early_stopping values.

The UnmarshalJSON method silently ignores invalid values by returning nil at line 395. If the JSON value is neither a boolean nor a string, this will leave the field in an indeterminate state. Consider returning an error to surface invalid API responses:

 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}
 
 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}
 
-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }

66-66: Extra field won't capture unknown fields with json:"-" tag.

The Extra field at line 66 has the json:"-" tag, which excludes it from JSON marshaling/unmarshaling. This means it won't capture unknown additional fields as the comment suggests. If you want to capture unknown fields, remove the json:"-" tag or use a different approach. If this field is for internal use only (not from/to JSON), the current implementation is correct but the comment should be clarified.
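
If capturing unknown fields is the goal, one common pattern is to unmarshal twice and strip the known keys from a raw map. The sketch below uses a placeholder type and field names (not the actual struct) and assumes encoding/json is imported:

// chatRequestExample is illustrative only; substitute the real struct and its known keys.
type chatRequestExample struct {
	Model string         `json:"model"`
	Extra map[string]any `json:"-"` // filled manually below, not by encoding/json
}

func (c *chatRequestExample) UnmarshalJSON(data []byte) error {
	type alias chatRequestExample // alias type avoids recursing into this method
	var known alias
	if err := json.Unmarshal(data, &known); err != nil {
		return err
	}
	raw := map[string]any{}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	delete(raw, "model") // drop every key that is already mapped to a struct field
	known.Extra = raw
	*c = chatRequestExample(known)
	return nil
}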

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5baaee2 and 5a72875.

📒 Files selected for processing (28)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types copy.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • core/providers/huggingface/types copy.go
🚧 Files skipped from review as they are similar to previous changes (12)
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/huggingface/huggingface_test.go
  • ui/lib/constants/config.ts
  • core/providers/huggingface/models.go
  • ui/lib/constants/logs.ts
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • transports/config.schema.json
  • docs/apis/openapi.json
  • ui/README.md
  • .github/workflows/pr-tests.yml
  • core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/mux.go
  • docs/features/unified-interface.mdx
  • core/internal/testutil/account.go
  • core/providers/huggingface/embedding.go
  • docs/contributing/adding-a-provider.mdx
  • core/providers/huggingface/speech.go
  • Makefile
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • ui/lib/constants/icons.tsx
  • core/providers/huggingface/chat.go
  • core/bifrost.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/types.go
🧬 Code graph analysis (6)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
  • BifrostResponsesStreamResponse (1422-1460)
  • ResponsesStreamResponseTypeOutputTextDelta (1370-1370)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/schemas/account.go (1)
  • Key (8-18)
core/schemas/provider.go (4)
  • ProviderConfig (234-242)
  • NetworkConfig (45-53)
  • DefaultRequestTimeoutInSeconds (15-15)
  • ConcurrencyAndBufferSize (128-131)
core/internal/testutil/cross_provider_scenarios.go (1)
  • ProviderConfig (45-53)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
  • BifrostEmbeddingRequest (9-16)
  • BifrostEmbeddingResponse (22-28)
  • EmbeddingData (118-122)
  • EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
  • HuggingFaceEmbeddingRequest (278-288)
  • HuggingFaceEmbeddingResponse (299-299)
core/schemas/chatcompletions.go (1)
  • BifrostLLMUsage (640-647)
core/providers/huggingface/transcription.go (3)
core/schemas/transcriptions.go (2)
  • BifrostTranscriptionRequest (3-10)
  • BifrostTranscriptionResponse (16-26)
core/providers/huggingface/types.go (5)
  • HuggingFaceTranscriptionRequest (328-333)
  • HuggingFaceTranscriptionRequestParameters (336-339)
  • HuggingFaceTranscriptionGenerationParameters (342-359)
  • HuggingFaceTranscriptionEarlyStopping (363-366)
  • HuggingFaceTranscriptionResponse (399-402)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/chat.go (3)
core/schemas/chatcompletions.go (4)
  • BifrostChatRequest (12-19)
  • BifrostChatResponse (26-41)
  • BifrostResponseChoice (582-590)
  • BifrostLLMUsage (640-647)
core/providers/huggingface/types.go (11)
  • HuggingFaceChatRequest (47-67)
  • HuggingFaceChatMessage (69-77)
  • HuggingFaceContentItem (80-84)
  • HuggingFaceImageRef (86-88)
  • HuggingFaceToolCall (90-94)
  • HuggingFaceFunction (96-100)
  • HuggingFaceResponseFormat (102-105)
  • HuggingFaceStreamOptions (114-116)
  • HuggingFaceTool (118-121)
  • HuggingFaceToolFunction (123-127)
  • HuggingFaceChatResponse (129-136)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
  • NewHuggingFaceProvider (88-120)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (21)
Makefile (3)

15-21: Tool-path discovery logic is sound; minor consistency opportunity.

The conditional expansion for DEFAULT_GOBIN and fallback chains for AIR_BIN and GOTESTSUM_BIN are well-structured. However, the logic could be slightly clearer:

  • Lines 17–18: GOBIN ?= and GOPATH ?= query the environment each time; consider memoizing to $(shell go env GOBIN) and $(shell go env GOPATH) without the conditional assignment if these are truly static per invocation.
  • Line 21: The fallback to which gotestsum is good, but ensure it's tested on systems where which may not be available (rare, but worth verifying).

Overall approach is solid for improving portability and reducing hard-coded tool assumptions.


155-155: Binary invocation refactoring is correct.

The substitution of direct air with $(AIR_BIN) is clean and allows the Makefile to find air without relying on PATH. Well done.

Also applies to: 163-163


349-349: Test target refactoring to use $(GOTESTSUM_BIN) looks good.

All test invocations now consistently use the computed GOTESTSUM_BIN variable, ensuring tests can run even if gotestsum is not on PATH. Proper fallback logic in variable definition (line 21) makes this robust.

Verify that tests pass when GOTESTSUM_BIN resolves to a full path outside PATH (e.g., via GOBIN). Run a quick smoke test locally to confirm: make test-core PROVIDER=<any_provider>.

Also applies to: 401-401, 425-425, 454-454, 548-548

core/internal/testutil/account.go (2)

96-96: LGTM! HuggingFace provider added to configured providers.

The addition follows the established pattern and correctly references the HuggingFace constant defined in core/schemas/bifrost.go.


259-266: LGTM! Key retrieval properly configured.

The HuggingFace key configuration follows the same pattern as other providers, retrieving the API key from the HUGGING_FACE_API_KEY environment variable.

core/providers/huggingface/types.go (1)

129-408: Comprehensive type definitions for HuggingFace integration.

The type definitions provide good coverage for chat, embeddings, speech, and transcription APIs. The use of pointer fields, json.RawMessage, and map[string]any provides appropriate flexibility for varying API responses while maintaining type safety where possible.

docs/features/unified-interface.mdx (1)

98-98: Documentation correctly reflects HuggingFace provider capabilities.

The provider support matrix entry accurately documents HuggingFace's supported operations and follows the consistent format of other provider entries.

core/bifrost.go (2)

26-26: Import correctly added for HuggingFace provider.

The import follows the consistent pattern used for other provider packages.


1327-1328: HuggingFace provider properly integrated into factory.

The provider instantiation follows the same pattern as other providers, correctly calling NewHuggingFaceProvider with config and logger parameters. The integration is consistent with the existing provider architecture.

docs/contributing/adding-a-provider.mdx (1)

7-2065: LGTM! Comprehensive provider implementation guide.

The documentation provides excellent coverage of both OpenAI-compatible and custom provider patterns, with clear examples, conventions, and checklists. The phase-based workflow and file organization guidelines will help contributors implement providers consistently.

core/providers/huggingface/embedding.go (2)

10-55: LGTM! Solid embedding request converter.

The function properly handles nil inputs, correctly splits model/provider, and safely extracts ExtraParams with type assertions. The mapping logic is clear and follows the converter pattern.


58-93: LGTM! Well-implemented response converter.

The method correctly handles nil responses with error returns, pre-allocates slices for performance, and properly documents that HuggingFace doesn't return usage information (setting it to zero). The conversion logic is clean and efficient.

core/providers/huggingface/transcription.go (1)

100-139: LGTM! Solid transcription response converter.

The method properly validates inputs, safely handles timestamp array access, and cleanly converts HuggingFace chunks to Bifrost segments. The implementation is defensive and correct.

core/providers/huggingface/speech.go (1)

94-116: LGTM! Clean speech response converter.

The method properly validates inputs and clearly documents that HuggingFace TTS API doesn't return usage or alignment data. The implementation is straightforward and correct.

core/providers/huggingface/chat.go (2)

318-413: LGTM! Well-structured streaming response converter.

The method properly handles nil responses and cleanly converts all streaming delta fields (role, content, reasoning, tool calls, logprobs) to Bifrost format. The implementation is thorough and correct.


86-89: No action needed. The debug variable is properly accessible.

The debug variable is defined at package level in huggingface.go (line 22) and is correctly accessible in chat.go. In Go, all files within the same package share a common namespace—no import or additional qualification is required for package-level variables. The code will compile without errors.

Likely an incorrect or invalid review comment.

core/providers/huggingface/utils.go (2)

83-128: LGTM! Well-implemented URL builder.

The function properly constructs the model hub URL with pagination, sorting, and provider filtering. The type switch for handling ExtraParams is robust and covers common types.


130-134: LGTM! Solid utility functions.

The buildModelInferenceProviderURL, convertToInferenceProviderMappings, and getModelInferenceProviderMapping functions are well-implemented with proper error handling and clean logic.

Also applies to: 199-251

core/providers/huggingface/huggingface.go (3)

20-126: LGTM! Well-structured provider implementation.

The provider struct follows standard patterns with proper field ordering. The use of sync.Pool for response objects is a good performance optimization, and the acquire/release helpers are correctly implemented. The constructor properly initializes all fields and pre-warms the pools.


303-653: LGTM! Robust chat completion implementation.

Both sync and streaming chat completion methods are well-implemented. The streaming logic properly handles SSE parsing, error cases, and the complex ResponsesStream fallback scenario. The workaround for combined usage/choices chunks (lines 597-626) is clever and well-documented.


655-677: LGTM! Clean delegation to chat completion methods.

The Responses and ResponsesStream methods correctly delegate to the corresponding chat completion methods with appropriate context flags and request type overrides.

@qwerty-dvorak qwerty-dvorak force-pushed the 12-05-feat_huggingface_provider_added branch from 5a72875 to 00ecc06 on December 5, 2025 at 20:15
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)

1146-1221: Thought is now streamed as both user-visible text and as a separate reasoning delta, creating double-streaming and changed semantics for all providers

The code confirms the concerns raised:

  • Lines 1146–1148 compute hasContent and hasThought; line 1149 branches if either is true.
  • Lines 1207–1214 concatenate both *delta.Content and *delta.Thought into contentDelta, which becomes the OutputTextDelta.Delta.
  • Line 1228 sets state.TextItemHasContent = true for either content or thought.
  • Lines 1369–1380 emit delta.Thought separately as ResponsesStreamResponseTypeReasoningSummaryTextDelta.

This creates three key problems:

  1. Double-streaming: Chunks with only Thought now emit both an output_text.delta and a reasoning_summary_text.delta, causing reasoning content to appear in the visible text channel.

  2. Semantic shift: Chunks with both Content and Thought now concatenate them into a single visible delta, mixing chain-of-thought into user-facing text instead of keeping reasoning separate.

  3. Unguarded change: This logic applies uniformly to all providers (no provider-specific conditionals), making this a behavioral change across the entire stack, not isolated to HuggingFace.

If Thought is intended as a distinct reasoning stream (per OpenAI's reasoning semantics), this mixes concerns. If it's a fallback for providers that emit primary text as Thought, this needs explicit provider-level detection to avoid leaking reasoning into normal output for other providers.

Recommend verifying this is intentional across all providers and that clients consuming only output_text.delta are prepared to receive reasoning content.

♻️ Duplicate comments (17)
docs/contributing/adding-a-provider.mdx (1)

510-527: Variable name inconsistency persists from previous review.

The code example declares hfReq at line 500 but references providerReq throughout lines 511–527. This identical issue was flagged in a previous review. While you clarified that examples are for reference, this particular snippet is labeled "Real Example from core/providers/huggingface/chat.go" and should be corrected for accuracy.

Apply this diff to align variable names:

     // Map parameters
     if bifrostReq.Params != nil {
         params := bifrostReq.Params
         
         // Map standard parameters
         if params.Temperature != nil {
-            providerReq.Temperature = params.Temperature
+            hfReq.Temperature = params.Temperature
         }
         if params.MaxTokens != nil {
-            providerReq.MaxTokens = params.MaxTokens
+            hfReq.MaxTokens = params.MaxTokens
         }
         // ... other standard parameters
         
         // Handle provider-specific ExtraParams
         if params.ExtraParams != nil {
             if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-                providerReq.CustomParam = &customParam
+                hfReq.CustomParam = &customParam
             }
         }
     }
 
-    return providerReq
+    return hfReq
core/internal/testutil/account.go (1)

512-524: Align HuggingFace retry policy with other providers

MaxRetries is set to 1 for HuggingFace while essentially all other remote providers use 8–10 retries, and the comment “HuggingFace can be variable” argues for more retries, not fewer. This will make tests unnecessarily flaky on transient errors.

Recommend matching the standard 10‑retry policy and updating/removing the comment:

	case schemas.HuggingFace:
		return &schemas.ProviderConfig{
			NetworkConfig: schemas.NetworkConfig{
				DefaultRequestTimeoutInSeconds: 300,
-				MaxRetries:                     1, // HuggingFace can be variable
+				MaxRetries:                     10, // Align with other variable cloud providers
				RetryBackoffInitial:            2 * time.Second,
				RetryBackoffMax:                30 * time.Second,
			},
core/providers/huggingface/huggingface_test.go (1)

29-33: SpeechSynthesisModel and TranscriptionModel appear to be swapped.

Kokoro-82M is a text-to-speech model and should be assigned to SpeechSynthesisModel, while whisper-large-v3 is a speech-to-text model and should be assigned to TranscriptionModel. The SpeechSynthesisFallbacks should also reference the TTS model.

-		TranscriptionModel:   "fal-ai/hexgrad/Kokoro-82M",
-		SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3",
+		TranscriptionModel:   "fal-ai/openai/whisper-large-v3",
+		SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M",
 		SpeechSynthesisFallbacks: []schemas.Fallback{
-			{Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"},
+			{Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"},
 		},
core/providers/huggingface/transcription.go (2)

14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.

 	if request.Input == nil {
-		return nil, fmt.Errorf("speech request input cannot be nil")
+		return nil, fmt.Errorf("transcription request input cannot be nil")
 	}

38-63: Type assertions for int will fail when values come from JSON unmarshalling.

When ExtraParams is populated from JSON, numeric values are unmarshalled as float64, not int. These type assertions will silently fail, causing parameters to be ignored.

Consider handling both types. Example fix for one parameter:

-			if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
-				genParams.MaxNewTokens = &val
+			if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
+				genParams.MaxNewTokens = &val
+			} else if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok {
+				intVal := int(val)
+				genParams.MaxNewTokens = &intVal
 			}

Apply the same pattern to max_length, min_length, min_new_tokens, num_beams, num_beam_groups, and top_k.

core/providers/huggingface/speech.go (1)

37-62: Type assertions for int may fail when values come from JSON.

Same issue as in transcription.go - when ExtraParams is populated from JSON unmarshaling, numeric values are typically float64, not int. These type assertions will silently fail.

Apply the same fix pattern as suggested for transcription.go - handle both int and float64 types for all integer parameters.

core/providers/huggingface/chat.go (2)

69-81: Potential nil pointer dereference when accessing tc.Function.Name.

On line 74, tc.Function.Name is dereferenced without checking if it's nil. If a tool call has a nil Name field, this will cause a panic.

 			for _, tc := range msg.ChatAssistantMessage.ToolCalls {
+				if tc.Function.Name == nil {
+					continue // Skip tool calls without a function name
+				}
 				hfToolCall := HuggingFaceToolCall{
 					ID:   tc.ID,
 					Type: tc.Type,

86-89: Undefined variable debug will cause compilation error.

The debug variable is referenced at lines 86 and 250 but does not appear to be defined in this file. This will cause a compilation failure.

#!/bin/bash
# Check if debug variable is defined in the huggingface package
rg -n "^var debug\b|^const debug\b|debug\s*:=\s*(true|false)" core/providers/huggingface/

Also applies to: 250-256

core/providers/huggingface/utils.go (3)

136-162: Empty provider and model names when input has no slashes.

When splitIntoModelProvider receives a model name with no slashes (t == 0), both prov and model remain empty strings. This results in malformed model identifiers downstream. A user passing "llama-7b" (without organization prefix) would trigger this issue.

Consider defaulting to hf-inference and using the original name as the model:

+	} else {
+		// No slashes - default to hf-inference with the full name as model
+		prov = hfInference
+		model = bifrostModelName
+		if debug {
+			fmt.Printf("[huggingface debug] splitIntoModelProvider (t==0): prov=%s, model=%s\n", prov, model)
+		}
 	}

180-191: Copy-paste error: Wrong provider names in error messages.

Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" and "scaleway" respectively.

 	case "sambanova":
 		if requestType == schemas.EmbeddingRequest {
 			defaultPath = "/sambanova/v1/embeddings"
 		} else {
-			return "", fmt.Errorf("nebius provider only supports embedding requests")
+			return "", fmt.Errorf("sambanova provider only supports embedding requests")
 		}
 	case "scaleway":
 		if requestType == schemas.EmbeddingRequest {
 			defaultPath = "/scaleway/v1/embeddings"
 		} else {
-			return "", fmt.Errorf("nebius provider only supports embedding requests")
+			return "", fmt.Errorf("scaleway provider only supports embedding requests")
 		}

164-197: Incomplete provider routing — 13 of 19 defined providers will error.

The getInferenceProviderRouteURL function only handles 6 providers (fal-ai, hf-inference, nebius, replicate, sambanova, scaleway) while INFERENCE_PROVIDERS defines 19 total. Providers like cerebras, cohere, groq, fireworks-ai, together, etc. will hit the default error case.

This affects embedding, speech, and transcription operations for the majority of defined providers.

Either expand the routing logic to handle all providers or document which providers are actually supported for which request types.

core/providers/huggingface/huggingface.go (6)

51-85: Typo: "aquire" should be "acquire".

The pool helper functions are misspelled as aquireHuggingFaceChatResponse, aquireHuggingFaceTranscriptionResponse, and aquireHuggingFaceSpeechResponse. The correct spelling is "acquire". Update the function names and their call sites at lines 348, 790, and 861.


277-281: Bug: Wrong provider constant in ListModels.

Line 279 uses schemas.Gemini instead of schemas.HuggingFace in the CheckOperationAllowed call. This causes incorrect operation permission checks for the HuggingFace provider.

Apply this diff:

-	if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+	if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
 		return nil, err
 	}

295-301: Bug: Wrong request types in unsupported operation errors.

Both TextCompletion (line 296) and TextCompletionStream (line 300) return errors using schemas.EmbeddingRequest instead of the correct request types. This was previously flagged and marked as addressed, but the issue persists in the current code.

Apply this diff:

 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }

700-707: Bug: Wrong request type in Embedding error messages.

Lines 701 and 706 use schemas.SpeechRequest instead of schemas.EmbeddingRequest when returning unsupported operation errors in the Embedding method.

Apply this diff:

 	if providerMapping == nil {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 	}

 	mapping, ok := providerMapping[inferenceProvider]
 	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 	}

773-776: Bug: Wrong task type check in Speech.

Line 774 checks for "automatic-speech-recognition" task, but Speech (text-to-speech) should check for "text-to-speech". The task checks appear to be swapped between Speech and Transcription functions.

Apply this diff:

 	mapping, ok := providerMapping[inferenceProvider]
-	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
 		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 	}

831-838: Bug: Wrong task type check and error messages in Transcription.

Two issues in the Transcription function:

  1. Lines 832 and 837 use schemas.SpeechRequest instead of schemas.TranscriptionRequest in error messages
  2. Line 836 checks for "text-to-speech" task, but Transcription should check for "automatic-speech-recognition" (which is what Speech incorrectly checks at line 774, suggesting these are swapped)

Apply this diff:

 	if providerMapping == nil {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 	}

 	mapping, ok := providerMapping[inferenceProvider]
-	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 	}
🧹 Nitpick comments (1)
core/internal/testutil/account.go (1)

530-876: Consider adding a ComprehensiveTestConfig entry for HuggingFace

GetConfiguredProviders and GetConfigForProvider now support schemas.HuggingFace, but AllProviderConfigs doesn’t define a ComprehensiveTestConfig for it. That means HuggingFace won’t participate in these cross‑provider end‑to‑end scenarios.

Recommend adding a HuggingFace entry here (with a representative chat/embedding model) so the new provider is exercised alongside the others.
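
A hypothetical shape for such an entry, with placeholder model strings; the ChatModel and Scenarios field names are guesses based on the ComprehensiveTestConfig/TestScenarios references elsewhere in this review and would need to match the real struct definitions:

{
	Provider:       schemas.HuggingFace,
	ChatModel:      "cerebras/<org>/<chat-model>",            // placeholder model ID
	EmbeddingModel: "hf-inference/<org>/<embedding-model>",   // placeholder model ID
	Scenarios: TestScenarios{
		ChatCompletion: true,
		Embedding:      true,
	},
},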

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a72875 and 00ecc06.

📒 Files selected for processing (28)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types copy.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • ui/README.md
  • core/schemas/bifrost.go
  • core/internal/testutil/chat_completion_stream.go
  • core/bifrost.go
  • core/schemas/account.go
  • docs/features/unified-interface.mdx
  • ui/lib/constants/config.ts
  • .github/workflows/pr-tests.yml
  • .github/workflows/release-pipeline.yml
  • core/providers/huggingface/types copy.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/types.go
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • transports/config.schema.json
  • core/providers/huggingface/huggingface_test.go
  • core/schemas/mux.go
  • core/internal/testutil/responses_stream.go
  • core/providers/huggingface/transcription.go
  • ui/lib/constants/logs.ts
  • core/providers/huggingface/models.go
  • core/internal/testutil/account.go
  • docs/apis/openapi.json
  • Makefile
  • core/providers/huggingface/speech.go
  • ui/lib/constants/icons.tsx
  • docs/contributing/adding-a-provider.mdx
  • core/providers/huggingface/chat.go
  • core/providers/huggingface/utils.go
  • core/providers/huggingface/huggingface.go
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (4)
core/internal/testutil/setup.go (1)
  • SetupTest (51-60)
core/internal/testutil/account.go (2)
  • ComprehensiveTestConfig (47-64)
  • TestScenarios (22-44)
core/schemas/bifrost.go (2)
  • HuggingFace (51-51)
  • Fallback (131-134)
core/internal/testutil/tests.go (1)
  • RunAllComprehensiveTests (15-62)
core/providers/huggingface/transcription.go (4)
core/schemas/transcriptions.go (2)
  • BifrostTranscriptionRequest (3-10)
  • BifrostTranscriptionResponse (16-26)
core/providers/huggingface/types.go (5)
  • HuggingFaceTranscriptionRequest (328-333)
  • HuggingFaceTranscriptionRequestParameters (336-339)
  • HuggingFaceTranscriptionGenerationParameters (342-359)
  • HuggingFaceTranscriptionEarlyStopping (363-366)
  • HuggingFaceTranscriptionResponse (399-402)
core/schemas/models.go (1)
  • Model (109-129)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
  • HuggingFaceListModelsResponse (21-23)
core/schemas/bifrost.go (13)
  • ModelProvider (32-32)
  • RequestType (86-86)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • TextCompletionRequest (90-90)
  • TextCompletionStreamRequest (91-91)
  • ResponsesRequest (94-94)
  • ResponsesStreamRequest (95-95)
  • EmbeddingRequest (96-96)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/schemas/account.go (1)
  • Key (8-18)
core/schemas/provider.go (4)
  • ProviderConfig (234-242)
  • NetworkConfig (45-53)
  • DefaultRequestTimeoutInSeconds (15-15)
  • ConcurrencyAndBufferSize (128-131)
core/internal/testutil/cross_provider_scenarios.go (1)
  • ProviderConfig (45-53)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
  • BifrostSpeechRequest (9-16)
  • BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (5)
  • HuggingFaceSpeechRequest (304-310)
  • HuggingFaceSpeechParameters (313-316)
  • HuggingFaceTranscriptionGenerationParameters (342-359)
  • HuggingFaceTranscriptionEarlyStopping (363-366)
  • HuggingFaceSpeechResponse (319-323)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (7)
core/providers/huggingface/types.go (5)
  • HuggingFaceChatResponse (129-136)
  • HuggingFaceTranscriptionResponse (399-402)
  • HuggingFaceSpeechResponse (319-323)
  • HuggingFaceResponseError (268-272)
  • HuggingFaceHubError (263-266)
core/providers/utils/utils.go (7)
  • GetRequestPath (219-239)
  • SetExtraHeaders (178-208)
  • HandleProviderAPIError (317-337)
  • CheckAndDecodeBody (423-431)
  • NewBifrostOperationError (449-460)
  • HandleProviderResponse (343-387)
  • ShouldSendBackRawResponse (482-487)
core/schemas/bifrost.go (4)
  • HuggingFace (51-51)
  • EmbeddingRequest (96-96)
  • BifrostStream (321-328)
  • ChatCompletionRequest (92-92)
core/providers/huggingface/chat.go (1)
  • ToHuggingFaceChatCompletionRequest (12-199)
core/schemas/mux.go (1)
  • ChatToResponsesStreamState (966-982)
core/schemas/responses.go (1)
  • BifrostResponsesResponse (45-83)
core/providers/huggingface/embedding.go (1)
  • ToHuggingFaceEmbeddingRequest (10-55)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (18)
core/internal/testutil/responses_stream.go (1)

692-695: Lifecycle stream safety cap increase looks appropriate

Bumping the lifecycle loop guard to responseCount > 300 keeps the infinite‑loop protection but avoids prematurely truncating providers that emit many small lifecycle events (e.g., more granular Hugging Face streaming), especially given the existing 200s context timeout. This looks like a safe, non‑breaking adjustment in the broader streaming test harness.

Makefile (3)

24-29: Color initialization refactor is cleaner and more portable.

Replacing hard-coded ANSI escape sequences with $(shell printf ...) improves readability and portability across different shells. This is a good improvement.


15-21: Binary path resolution logic is sound but relies on Go toolchain consistency.

The GOBIN/GOPATH detection and fallback logic correctly identifies where go install places binaries. The pattern of preferring explicit DEFAULT_GOBIN then falling back to which is robust for most environments. However, ensure downstream targets (e.g., dev, test) tolerate scenarios where binaries are missing or not in PATH—these targets already call install-air and install-gotestsum, which is good.

Verify that all targets invoking $(AIR_BIN) or $(GOTESTSUM_BIN) are guarded by corresponding install-* targets or handle missing binaries gracefully.


65-86: install-ui improvements handle root check and Next.js discovery well.

The root check (lines 67-70) prevents permission issues on developer machines. The multi-stage Next.js discovery (lines 76-85) gracefully handles systems without global npm directories (e.g., Nix). The fallback chain (local → npx → global) is well-reasoned for environment diversity.

transports/config.schema.json (2)

135-140: HuggingFace provider wired into transport config schema correctly

Using the generic #/$defs/provider ref for "huggingface" is consistent with other HTTP providers and looks good.


764-784: Semantic cache embedding provider enum correctly extended for HuggingFace

Including "huggingface" in the semanticcache plugin provider enum keeps the plugin in sync with the newly supported embedding provider.

docs/apis/openapi.json (1)

3239-3258: ModelProvider enum now exposes HuggingFace in OpenAPI

Adding "huggingface" to the ModelProvider enum keeps the public OpenAPI schema aligned with the backend provider set; change looks correct.

core/internal/testutil/account.go (1)

77-99: HuggingFace correctly added to configured providers

Including schemas.HuggingFace in GetConfiguredProviders keeps the comprehensive test account aligned with the new provider set.

ui/lib/constants/logs.ts (2)

2-20: KnownProvidersNames correctly extended with HuggingFace

Adding "huggingface" here ensures log views and filters recognize the new provider at the type level.


43-61: ProviderLabels updated for HuggingFace display name

The huggingface: "HuggingFace" label keeps the logs UI consistent with the rest of the product’s provider naming.

core/providers/huggingface/huggingface_test.go (1)

12-63: Test structure looks good overall.

The test follows the established pattern: parallel execution, environment variable check for API key, proper setup/teardown with defer cancel(), and client shutdown. The comprehensive test config enables a reasonable subset of scenarios for the HuggingFace provider.

core/providers/huggingface/models.go (2)

16-44: Model ID conversion logic looks correct.

The conversion properly handles nil input, pre-allocates the slice, skips models without IDs or supported methods, and constructs composite IDs with the provider prefix. The use of model.ModelID for the display name and model.ID for the HuggingFace reference appears intentional based on the type definitions.


46-104: Method derivation logic is well-structured.

The function correctly uses a set for deduplication, handles both pipeline tags and model tags with appropriate fallbacks, and returns a sorted slice for deterministic output. The broad tag matching (e.g., strings.Contains(tagLower, "embedding")) is reasonable for discovery purposes.

core/providers/huggingface/transcription.go (1)

100-139: Response conversion implementation is correct.

The ToBifrostTranscriptionResponse method properly validates inputs, maps the text response, and safely converts chunks to segments with appropriate bounds checking for timestamps.

core/providers/huggingface/speech.go (1)

9-24: Request and response conversion logic is well-implemented.

The ToHuggingFaceSpeechRequest properly validates input, extracts provider/model, and maps the text input. The ToBifrostSpeechResponse correctly handles nil cases and maps the audio data with appropriate extra fields.

Also applies to: 94-116

core/providers/huggingface/chat.go (1)

201-315: Response conversion implementations are comprehensive.

Both ToBifrostChatResponse and ToBifrostChatStreamResponse properly handle nil checks, convert choices with messages/deltas, map tool calls, logprobs, and usage information. The streaming response correctly maps the Reasoning field to Thought.

Also applies to: 317-413

core/providers/huggingface/utils.go (2)

83-128: URL construction for model hub API is well-implemented.

The buildModelHubURL function properly handles pagination limits with bounds checking, sets appropriate query parameters, and handles various ExtraParams types with type switching.


217-251: Model inference provider mapping retrieval is correctly implemented.

Proper use of fasthttp with deferred resource release, appropriate error handling for API responses, and clean conversion to internal mapping structure.

@qwerty-dvorak qwerty-dvorak force-pushed the 12-05-feat_huggingface_provider_added branch 2 times, most recently from f279893 to 4663a80 on December 5, 2025 at 21:20
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (10)
core/providers/huggingface/utils.go (1)

168-201: Only a subset of INFERENCE_PROVIDERS are routable in getInferenceProviderRouteURL.

The switch handles fal-ai, hf-inference, nebius, replicate, sambanova, and scaleway, but other entries in INFERENCE_PROVIDERS (e.g., cerebras, cohere, groq, featherless-ai, fireworks-ai, hyperbolic, novita, nscale, ovhcloud, publicai, together, wavespeed, zai-org) fall through to the default error. That means models mapped to those providers (and surfaced via list-models) will currently be unusable.

Either (a) add routing cases for the remaining providers with correct default paths or (b) remove any providers you don’t intend to support yet from INFERENCE_PROVIDERS so they don’t appear as “supported” in filters and mappings.

core/internal/testutil/account.go (1)

259-266: Fix HuggingFace test API key env var name and optionally support HF_TOKEN.

This branch reads HUGGING_FACE_API_KEY, but the PR/docs use HUGGINGFACE_API_KEY and the provider is supposed to support HF_TOKEN as well. As written, CI that exports HUGGINGFACE_API_KEY/HF_TOKEN will see empty keys in tests.

Consider standardizing on HUGGINGFACE_API_KEY with an HF_TOKEN fallback:

-	case schemas.HuggingFace:
-		return []schemas.Key{
-			{
-				Value:  os.Getenv("HUGGING_FACE_API_KEY"),
-				Models: []string{},
-				Weight: 1.0,
-			},
-		}, nil
+	case schemas.HuggingFace:
+		key := os.Getenv("HUGGINGFACE_API_KEY")
+		if key == "" {
+			key = os.Getenv("HF_TOKEN")
+		}
+		return []schemas.Key{
+			{
+				Value:  key,
+				Models: []string{},
+				Weight: 1.0,
+			},
+		}, nil
Makefile (3)

21-21: Simplify GOTESTSUM_BIN fallback for consistency with AIR_BIN.

The shell which gotestsum 2>/dev/null || echo gotestsum fallback adds complexity without benefit. For consistency with AIR_BIN (line 20), use a simple fallback.


103-104: Replace which with command -v for proper full-path handling.

The which $(GOTESTSUM_BIN) fallback doesn't work correctly with full paths. Use command -v, which handles both cases, or align with the AIR_BIN pattern using only [ -x ].


97-98: Incorrect use of which with full path variables.

At lines 97-98 (and 110-111), which $$INSTALLED is used where $$INSTALLED is an absolute path. The which command searches PATH by name, not by checking if a full path exists. Replace with a direct existence check: [ ! -x "$$INSTALLED" ].

core/providers/huggingface/chat.go (2)

93-96: Undefined variable debug will cause compilation error.

The variable debug is referenced but never defined in this file. This will prevent the code from compiling.

Either define the debug variable or remove the debug logging:

+// Package-level debug flag (set via build tags or environment)
+var debug = false
+
 func ToHuggingFaceChatCompletionRequest(bifrostReq *schemas.BifrostChatRequest) *HuggingFaceChatRequest {

Or remove the debug blocks entirely if not needed:

-		if debug {
-			fmt.Printf("[huggingface debug] Added tool_call_id=%s to tool message\n", *msg.ChatToolMessage.ToolCallID)
-		}

257-263: Same undefined debug variable issue.

This is the same compilation error as in the request converter - debug is undefined.

core/providers/huggingface/huggingface.go (3)

295-301: Bug: Wrong request type in unsupported operation errors.

Both TextCompletion and TextCompletionStream use schemas.EmbeddingRequest instead of their respective request types. This causes incorrect error messages.

Apply this diff:

 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }

713-716: Bug: Wrong request type in Embedding error message.

Line 715 uses schemas.SpeechRequest instead of schemas.EmbeddingRequest when the mapping check fails.

Apply this diff:

 	mapping, ok := providerMapping[inferenceProvider]
 	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 	}

847-854: Bug: Wrong request type in Transcription error messages.

Lines 848 and 853 use schemas.SpeechRequest instead of schemas.TranscriptionRequest.

Apply this diff:

 	if providerMapping == nil {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 	}

 	mapping, ok := providerMapping[inferenceProvider]
 	if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
-		return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+		return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 	}
🧹 Nitpick comments (9)
docs/contributing/adding-a-provider.mdx (1)

41-75: Fix hyphenation for compound adjectives.

According to English grammar rules, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible" when used as adjectives:

  • Line 43: "not OpenAI compatible" → "not OpenAI-compatible"
  • Line 71: "#### OpenAI Compatible Providers" → "#### OpenAI-Compatible Providers"
  • Line 629: "### OpenAI Compatible Providers" → "### OpenAI-Compatible Providers"
  • Line 1475: "### For OpenAI Compatible Providers" → "### For OpenAI-Compatible Providers"

Also update any related section titles and cross-references to maintain consistency throughout the document.

Also applies to: 629-632, 1475-1477

core/providers/huggingface/utils.go (2)

27-82: inferenceProvider constants and provider lists look consistent with HF docs.

INFERENCE_PROVIDERS and PROVIDERS_OR_POLICIES are well-structured; if you want slightly tighter typing, you could append the auto constant instead of the raw "auto" literal in PROVIDERS_OR_POLICIES, but the current form is functionally fine.


257-296: Int extraction helper is flexible; be aware of truncation semantics.

extractIntFromInterface handles most numeric variants (including json.Number) and falls back cleanly; just note that all float cases (and json.Number via Float64) are truncated via int(...), which is fine if all upstream numeric fields are expected to be integral.
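
For illustration, a minimal sketch of the shape such a helper can take (the name extractIntFromInterface is from the PR; the body and exact set of cases below are assumptions of this sketch, not the PR's code):

package sketch

import "encoding/json"

// extractIntSketch converts the numeric shapes JSON unmarshalling can produce
// (float64, json.Number, native ints) into an int, reporting whether the
// conversion succeeded. Floats are truncated toward zero via int(...), the
// truncation behaviour noted above.
func extractIntSketch(v interface{}) (int, bool) {
	switch n := v.(type) {
	case int:
		return n, true
	case int64:
		return int(n), true
	case uint64:
		return int(n), true
	case float32:
		return int(n), true
	case float64:
		return int(n), true
	case json.Number:
		if i, err := n.Int64(); err == nil {
			return int(i), true
		}
		if f, err := n.Float64(); err == nil {
			return int(f), true
		}
	}
	return 0, false
}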

core/providers/huggingface/huggingface_test.go (1)

24-57: HuggingFace scenarios omit embedding/audio tests despite models being configured.

You’ve set EmbeddingModel, TranscriptionModel, and SpeechSynthesisModel, but all corresponding scenario flags (Embedding, Transcription, TranscriptionStream, SpeechSynthesis, SpeechSynthesisStream) are false, so RunAllComprehensiveTests won’t actually exercise those paths. Once you’re confident in those flows, consider flipping these booleans to true so chat, embeddings, and audio are all covered end-to-end.

core/providers/huggingface/models.go (1)

46-104: deriveSupportedMethods’ pipeline/tag heuristics look solid and conservative.

Normalizing the pipeline tag, aggregating methods via a set, and falling back to tags for embeddings, chat/text, TTS, and ASR is a good balance between coverage and precision, and sorting the final method list keeps responses deterministic.
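
As a rough illustration of that set-plus-sort pattern, here is a sketch; the tag-to-method table is invented for the example and does not mirror the PR's exact mapping:

package sketch

import (
	"sort"
	"strings"
)

// deriveMethodsSketch: normalize the pipeline tag, collect supported methods
// in a set to dedupe, fall back to raw tags when the pipeline tag is not
// conclusive, and return a sorted slice so responses stay deterministic.
func deriveMethodsSketch(pipelineTag string, tags []string) []string {
	methods := map[string]struct{}{}
	add := func(ms ...string) {
		for _, m := range ms {
			methods[m] = struct{}{}
		}
	}

	switch strings.ToLower(strings.TrimSpace(pipelineTag)) {
	case "text-generation", "conversational":
		add("chat", "chat_stream")
	case "feature-extraction", "sentence-similarity":
		add("embedding")
	case "text-to-speech":
		add("speech")
	case "automatic-speech-recognition":
		add("transcription")
	}

	if len(methods) == 0 { // fallback on raw tags
		for _, t := range tags {
			if strings.Contains(strings.ToLower(t), "embedding") {
				add("embedding")
			}
		}
	}

	out := make([]string, 0, len(methods))
	for m := range methods {
		out = append(out, m)
	}
	sort.Strings(out)
	return out
}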

core/providers/huggingface/embedding.go (1)

10-13: Inconsistent nil handling - returning (nil, nil) may cause silent failures.

Returning (nil, nil) when bifrostReq is nil differs from other converters in this PR (e.g., ToBifrostEmbeddingResponse returns an error for nil input). This could lead to silent failures where the caller doesn't know if the request was successfully converted or if the input was nil.

Consider returning an explicit error for consistency:

 func ToHuggingFaceEmbeddingRequest(bifrostReq *schemas.BifrostEmbeddingRequest) (*HuggingFaceEmbeddingRequest, error) {
 	if bifrostReq == nil {
-		return nil, nil
+		return nil, fmt.Errorf("bifrost embedding request is nil")
 	}

Alternatively, if nil-in-nil-out is intentional, document this behavior with a comment.

core/providers/huggingface/speech.go (1)

9-12: Inconsistent nil handling pattern.

Similar to ToHuggingFaceEmbeddingRequest, returning (nil, nil) for nil input may cause silent failures. Consider returning an error or documenting the nil-in-nil-out behavior for consistency across the provider.

core/providers/huggingface/transcription.go (1)

124-129: Potential silent data loss when Timestamp has exactly one element.

The condition len(chunk.Timestamp) >= 2 handles empty and full timestamps, but if a chunk has exactly one timestamp element, both start and end will be zero. Consider logging or handling this edge case explicitly if it indicates malformed data.
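
A small sketch of the explicit guard suggested here, assuming the timestamps arrive as a []float64 (the field shape is an assumption of this sketch):

package sketch

// timestampBounds returns the start/end pair for a chunk and flags the
// malformed one-element case so the caller can log it instead of silently
// emitting zeros for both values.
func timestampBounds(ts []float64) (start, end float64, malformed bool) {
	switch {
	case len(ts) >= 2:
		return ts[0], ts[1], false
	case len(ts) == 1:
		return ts[0], 0, true // only a start time was returned
	default:
		return 0, 0, false
	}
}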

core/providers/huggingface/chat.go (1)

36-38: Silently discarding marshalling errors could hide bugs.

Multiple locations discard sonic.Marshal errors with _. While this may be acceptable for optional fields, consider logging these errors at debug level to aid troubleshooting.

-			contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr)
+			contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
+			if err != nil {
+				// Log error but continue - content will be empty
+				continue
+			}
 			hfMsg.Content = json.RawMessage(contentJSON)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00ecc06 and f279893.

📒 Files selected for processing (26)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • core/schemas/bifrost.go
  • core/bifrost.go
  • ui/lib/constants/logs.ts
  • transports/config.schema.json
  • .github/workflows/pr-tests.yml
  • core/schemas/account.go
  • core/schemas/mux.go
  • .github/workflows/release-pipeline.yml
  • ui/lib/constants/config.ts
  • core/internal/testutil/responses_stream.go
  • docs/features/unified-interface.mdx
  • ui/README.md
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • docs/apis/openapi.json
  • core/providers/huggingface/embedding.go
  • core/internal/testutil/account.go
  • core/providers/huggingface/models.go
  • Makefile
  • core/providers/huggingface/speech.go
  • docs/contributing/adding-a-provider.mdx
  • core/providers/huggingface/transcription.go
  • ui/lib/constants/icons.tsx
  • core/providers/huggingface/chat.go
  • core/providers/huggingface/utils.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/types.go
🧬 Code graph analysis (5)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
  • BifrostEmbeddingRequest (9-16)
  • BifrostEmbeddingResponse (22-28)
  • EmbeddingData (118-122)
  • EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
  • HuggingFaceEmbeddingRequest (303-313)
  • HuggingFaceEmbeddingResponse (324-324)
core/schemas/chatcompletions.go (1)
  • BifrostLLMUsage (640-647)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/schemas/account.go (1)
  • Key (8-18)
core/schemas/provider.go (4)
  • ProviderConfig (234-242)
  • NetworkConfig (45-53)
  • DefaultRequestTimeoutInSeconds (15-15)
  • ConcurrencyAndBufferSize (128-131)
core/internal/testutil/cross_provider_scenarios.go (1)
  • ProviderConfig (45-53)
core/providers/huggingface/speech.go (4)
core/schemas/speech.go (2)
  • BifrostSpeechRequest (9-16)
  • BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (5)
  • HuggingFaceSpeechRequest (329-335)
  • HuggingFaceSpeechParameters (338-341)
  • HuggingFaceTranscriptionGenerationParameters (367-384)
  • HuggingFaceTranscriptionEarlyStopping (388-391)
  • HuggingFaceSpeechResponse (344-348)
core/schemas/models.go (1)
  • Model (109-129)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (4)
core/providers/utils/utils.go (11)
  • GetProviderName (734-741)
  • GetRequestPath (219-239)
  • SetExtraHeaders (178-208)
  • MakeRequestWithContext (39-93)
  • HandleProviderAPIError (317-337)
  • NewBifrostOperationError (449-460)
  • HandleProviderResponse (343-387)
  • ShouldSendBackRawResponse (482-487)
  • NewUnsupportedOperationError (391-402)
  • ProcessAndSendBifrostError (571-601)
  • GetBifrostResponseForStreamResponse (777-805)
core/providers/huggingface/chat.go (1)
  • ToHuggingFaceChatCompletionRequest (12-206)
core/providers/huggingface/embedding.go (1)
  • ToHuggingFaceEmbeddingRequest (10-58)
core/providers/huggingface/transcription.go (1)
  • ToHuggingFaceTranscriptionRequest (9-101)
core/providers/huggingface/types.go (1)
core/schemas/models.go (1)
  • Model (109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (31)
docs/apis/openapi.json (1)

3238-3258: ModelProvider enum now correctly exposes HuggingFace (and Cerebras) in the public API.

The added "cerebras" and "huggingface" enum values align with the new ModelProvider constants and make these providers addressable via the OpenAPI schema; no issues from a schema/compat perspective.

core/providers/huggingface/utils.go (3)

83-128: Model hub URL builder is robust and respects pagination and overrides.

The URL construction correctly clamps limit, sets sensible defaults (full, sort, direction, inference_provider), and then lets ExtraParams override as needed; this should work well for listing inference-provider-backed models.
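
A sketch of that clamp-defaults-then-override flow; the endpoint, parameter names, and default values below are assumptions chosen to match the description, not copied from the PR:

package sketch

import (
	"net/url"
	"strconv"
)

// buildHubURLSketch clamps the page size, sets sensible defaults, and lets
// caller-supplied extra parameters override them.
func buildHubURLSketch(limit int, extra map[string]string) string {
	if limit <= 0 || limit > 1000 {
		limit = 100
	}
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("full", "true")
	q.Set("sort", "downloads")
	q.Set("direction", "-1")
	q.Set("inference_provider", "all")
	for k, v := range extra { // ExtraParams win over the defaults
		q.Set(k, v)
	}
	return "https://huggingface.co/api/models?" + q.Encode()
}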


203-255: Provider-mapping fetch and conversion are straightforward and aligned with fasthttp/providerUtils patterns.

The mapping conversion guards against nil, preserves tasks and model mappings, and the HTTP helper correctly acquires/releases fasthttp objects, handles non-200 responses via HandleProviderAPIError, and decodes JSON into the mapping type; this looks solid.
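
For readers unfamiliar with the fasthttp pattern being described, a compact sketch (plain encoding/json and fmt stand in for the repo's providerUtils helpers):

package sketch

import (
	"encoding/json"
	"fmt"

	"github.com/valyala/fasthttp"
)

// fetchJSONSketch: acquire pooled request/response objects, release them on
// return, treat non-200 as an error, and decode the body before release.
func fetchJSONSketch(client *fasthttp.Client, url, token string, out interface{}) error {
	req := fasthttp.AcquireRequest()
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseRequest(req)
	defer fasthttp.ReleaseResponse(resp)

	req.SetRequestURI(url)
	req.Header.SetMethod(fasthttp.MethodGet)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	if err := client.Do(req, resp); err != nil {
		return err
	}
	if resp.StatusCode() != fasthttp.StatusOK {
		return fmt.Errorf("request failed with status %d", resp.StatusCode())
	}
	// Decode before the deferred release; the body buffer is reused afterwards.
	return json.Unmarshal(resp.Body(), out)
}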


137-166: All call sites of splitIntoModelProvider properly handle the error return value. Every instance checks nameErr != nil and either propagates the error or wraps it in an appropriate error type (UnsupportedOperationError). The stricter validation of model name format is safely enforced across the codebase with no gaps in error handling.

core/internal/testutil/account.go (2)

77-99: Including HuggingFace in the configured providers set is correct.

Adding schemas.HuggingFace here keeps the test harness aligned with the new provider and ensures it participates in cross-provider setups.


512-524: HuggingFace ProviderConfig defaults look reasonable.

A 300s timeout, 10 retries, and moderate backoff (2s–30s) with standard concurrency/buffer mirror how other “variable” cloud providers are configured; this should be fine as a starting point.

core/providers/huggingface/models.go (1)

16-44: Model listing transformation correctly scopes IDs and filters unsupported models.

ToBifrostListModelsResponse sensibly skips models without IDs or derived methods, prefixes IDs with the Bifrost provider key, and stores the raw Hugging Face ID separately; this gives a clean, provider-scoped surface for /v1/models.

core/providers/huggingface/embedding.go (1)

60-96: LGTM!

The ToBifrostEmbeddingResponse method correctly converts HuggingFace embeddings to Bifrost format, properly handles nil input with an error, and documents that usage information is unavailable from the HuggingFace API.

core/providers/huggingface/speech.go (1)

98-119: LGTM!

The response converter properly validates the model name and correctly notes that HuggingFace TTS doesn't return usage or alignment data.

Makefile (1)

66-70: Good addition of root-user guard for local development.

Preventing make install-ui from running as root on developer machines avoids common permission issues with npm global installs.

core/providers/huggingface/transcription.go (1)

38-82: LGTM!

The integer parameter extraction correctly uses extractIntFromInterface to handle both int and float64 types from JSON unmarshalling, addressing the concern from previous reviews.

core/providers/huggingface/chat.go (2)

69-76: LGTM - nil pointer dereference fix is correctly implemented.

The code now safely handles a nil tc.Function.Name by using a default empty string, preventing potential panics.


324-420: LGTM!

The streaming response converter correctly handles delta fields, tool calls, logprobs, and usage conversion. The nil handling returning plain nil (without error) is appropriate for streaming contexts.

core/providers/huggingface/huggingface.go (10)

1-31: LGTM: Package setup and provider struct are well-structured.

The debug toggle via environment variable and the provider struct with proper configuration fields follow established patterns from other providers.


33-85: LGTM: Object pooling implementation is correct.

The acquire/release pattern with struct reset ensures clean state reuse. The nil checks in release functions prevent panics.
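
The pattern in question, reduced to a sketch with a stand-in struct (not the PR's types):

package sketch

import "sync"

type pooledChatResponse struct {
	ID      string
	Choices []string
}

var chatResponsePool = sync.Pool{
	New: func() interface{} { return &pooledChatResponse{} },
}

// acquire resets the struct so no state from a previous request leaks through.
func acquireChatResponse() *pooledChatResponse {
	resp := chatResponsePool.Get().(*pooledChatResponse)
	*resp = pooledChatResponse{}
	return resp
}

// release is nil-safe, mirroring the guard called out above.
func releaseChatResponse(resp *pooledChatResponse) {
	if resp == nil {
		return
	}
	chatResponsePool.Put(resp)
}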


87-120: LGTM: Provider constructor follows established patterns.

Pre-warming response pools and proper configuration handling align with other provider implementations.


122-130: LGTM: Helper methods correctly delegate to utility functions.


132-205: LGTM: HTTP request handling is robust.

The response body copy at line 192 correctly prevents use-after-free when fasthttp releases its internal buffer. Error response parsing properly extracts HuggingFace-specific error details.
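
The copy-before-release step being praised here, shown as a standalone sketch (the helper name is not the PR's):

package sketch

import "github.com/valyala/fasthttp"

// doAndCopySketch duplicates the body before the deferred ReleaseResponse
// runs, because fasthttp reuses the response's internal buffer afterwards.
func doAndCopySketch(client *fasthttp.Client, req *fasthttp.Request) ([]byte, int, error) {
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseResponse(resp)

	if err := client.Do(req, resp); err != nil {
		return nil, 0, err
	}
	body := append([]byte(nil), resp.Body()...) // detach from the pooled buffer
	return body, resp.StatusCode(), nil
}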


207-293: LGTM: Model listing implementation handles both keyed and keyless modes correctly.

The operation check at line 279 correctly uses schemas.HuggingFace (the copy-paste issue from Gemini was fixed).


303-375: LGTM: ChatCompletion implementation is well-structured.

The model name splitting and reconstruction pattern correctly handles HuggingFace's modelName:inferenceProvider format. Response conversion and extra fields population follow established patterns.


377-659: LGTM: Streaming implementation is comprehensive.

The SSE parsing, context cancellation handling, and the workaround for combined usage+content chunks (lines 604-645) are well-documented and correctly implemented. Resource cleanup via defers ensures proper release of fasthttp resources.
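
For context, the core of such an SSE loop looks roughly like this sketch (the chunk type and error handling are simplified stand-ins):

package sketch

import (
	"bufio"
	"bytes"
	"encoding/json"
	"io"
)

// scanSSESketch: enlarge the scanner buffer, skip comments and blank lines,
// stop on [DONE], and decode each "data:" payload.
func scanSSESketch(body io.Reader, onChunk func(map[string]interface{})) error {
	scanner := bufio.NewScanner(body)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

	for scanner.Scan() {
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 || line[0] == ':' { // blank line or SSE comment
			continue
		}
		if !bytes.HasPrefix(line, []byte("data:")) {
			continue
		}
		payload := bytes.TrimSpace(bytes.TrimPrefix(line, []byte("data:")))
		if bytes.Equal(payload, []byte("[DONE]")) {
			return nil
		}
		var chunk map[string]interface{}
		if err := json.Unmarshal(payload, &chunk); err != nil {
			continue // a real implementation would surface a structured error
		}
		onChunk(chunk)
	}
	return scanner.Err()
}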


661-683: LGTM: Responses API correctly adapts ChatCompletion endpoints.

The fallback pattern via context flag enables code reuse while maintaining proper response type conversion.


756-830: LGTM: Speech implementation correctly validates task type.

The task check at line 787 correctly validates "text-to-speech" for Speech operations (the swapped task check from past review was fixed).

core/providers/huggingface/types.go (8)

1-48: LGTM: Model response unmarshaling correctly handles both API formats.

The custom UnmarshalJSON at lines 30-48 properly handles both the top-level array format (current API) and the object format with a models field (backward compatibility). This addresses the past review concern.
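
The array-or-object tolerance described here, as a minimal sketch with stand-in field names:

package sketch

import (
	"bytes"
	"encoding/json"
)

type modelSketch struct {
	ID string `json:"id"`
}

type listModelsSketch struct {
	Models []modelSketch
}

// UnmarshalJSON accepts either a bare array of models (current API) or an
// object wrapping them in a "models" field (older shape).
func (r *listModelsSketch) UnmarshalJSON(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) > 0 && trimmed[0] == '[' {
		return json.Unmarshal(trimmed, &r.Models)
	}
	var wrapper struct {
		Models []modelSketch `json:"models"`
	}
	if err := json.Unmarshal(trimmed, &wrapper); err != nil {
		return err
	}
	r.Models = wrapper.Models
	return nil
}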


50-68: LGTM: Inference provider mapping types are well-structured.

The separation between API response types (HuggingFaceInferenceProviderInfo) and internal types (HuggingFaceInferenceProviderMapping) provides clean abstraction.


69-213: LGTM: Chat types comprehensively cover OpenAI-compatible format.

The flexible use of json.RawMessage for content and tool_choice allows handling various input formats. Response types include full logprobs and usage support.


214-285: LGTM: Streaming types correctly model delta structures.

The streaming response types properly handle incremental content delivery with optional usage and timing information.


286-298: LGTM: Error types distinguish Hub and inference API responses.


299-324: LGTM: Embedding types support flexible input formats.

The interface{} type for Inputs correctly handles both single string and string array inputs required by the feature extraction API.


326-349: LGTM: Speech types appropriately model text-to-speech API.


350-436: LGTM: Transcription types handle complex union types correctly.

The custom MarshalJSON/UnmarshalJSON for HuggingFaceTranscriptionEarlyStopping properly handles the boolean or string ("never") union type. The type aliases at lines 435-436 provide convenient access.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
Makefile (3)

97-98: Simplify binary availability check—which does not work with full paths.

The condition if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then contains a redundant and confusing which call. Since $$INSTALLED is an absolute path (e.g., /home/user/go/bin/air), which will always fail—making the && logic unintuitive. Simply check if the file is executable.

-		if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
-			echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \
+		if [ ! -x "$$INSTALLED" ]; then \
+			echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \

110-111: Simplify binary availability check—same issue as install-air (line 97).

Apply the same simplification to install-gotestsum: remove the redundant which call and rely solely on the [ -x ] check.

-		if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
-			echo "$(YELLOW)Note: the installed gotestsum binary may not be on your PATH...$(NC)"; \
+		if [ ! -x "$$INSTALLED" ]; then \
+			echo "$(YELLOW)Note: the installed gotestsum binary may not be on your PATH...$(NC)"; \

103-103: Replace which with a more portable check or simplify to match AIR_BIN pattern.

Line 103 uses which $(GOTESTSUM_BIN) as a fallback after the [ -x ] check. Since GOTESTSUM_BIN may be a full path, the which lookup is unreliable. Either replace it with command -v (which works for both names and full paths) or simplify to just [ -x ] for consistency with the AIR_BIN pattern on line 90.

Option 1 (recommended): Simplify to match the AIR_BIN pattern:

- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \

Option 2: Use command -v for portability:

- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ] || command -v $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
core/providers/huggingface/huggingface.go (1)

295-301: Fix request type in unsupported TextCompletion operations.

Both TextCompletion and TextCompletionStream return an unsupported-operation error but incorrectly tag it as schemas.EmbeddingRequest instead of the appropriate text-completion request types. That makes error classification and telemetry misleading.

Consider updating to:

 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
🧹 Nitpick comments (2)
core/providers/huggingface/types.go (1)

405-421: Consider returning an error for invalid EarlyStopping values.

The UnmarshalJSON method returns nil (line 420) when the value is neither a boolean nor a string. This silently ignores invalid values. Consider returning an error for non-null values that don't match expected types to catch malformed API responses.

Apply this diff to add error handling:

 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
+
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}
 
 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}
 
-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
core/providers/huggingface/huggingface.go (1)

685-754: Embedding implementation matches the expected routing and extra-field semantics.

The Embedding path:

  • Checks operation permissions for HuggingFace/Embedding.
  • Builds the HF request via ToHuggingFaceEmbeddingRequest.
  • Splits the model into inferenceProvider and modelName, then resolves a getModelInferenceProviderMapping entry and validates ProviderTask == "feature-extraction".
  • Uses the provider-specific model id and getInferenceProviderRouteURL to derive the target URL, then executes completeRequest.
  • Converts HuggingFaceEmbeddingResponse into a BifrostEmbeddingResponse and fills ExtraFields (provider, model requested, request type, latency, raw response).

This is correct and consistent with how embeddings are handled for other providers.

As a non-blocking improvement, the repeated pattern of splitting the model, looking up getModelInferenceProviderMapping, validating ProviderTask, and deriving the route URL (here and in Speech/Transcription) could be factored into a small helper to avoid drift if the mapping rules ever change.
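
One possible shape for that helper, sketched with stand-in types since the mapping struct's exact fields and signatures live in the PR:

package sketch

import "fmt"

type routeMappingSketch struct {
	ProviderTask         string
	ProviderModelMapping string
	RouteURL             string
}

// resolveRouteSketch centralizes the lookup-and-validate step shared by
// Embedding, Speech, and Transcription: find the mapping for the requested
// inference provider, confirm it serves the wanted task, and hand back the
// route URL plus the provider-specific model id.
func resolveRouteSketch(mappings map[string]routeMappingSketch, inferenceProvider, wantTask string) (routeURL, providerModel string, err error) {
	m, ok := mappings[inferenceProvider]
	if !ok || m.ProviderModelMapping == "" || m.ProviderTask != wantTask {
		return "", "", fmt.Errorf("no %q route for inference provider %q", wantTask, inferenceProvider)
	}
	return m.RouteURL, m.ProviderModelMapping, nil
}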

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00ecc06 and 4663a80.

📒 Files selected for processing (26)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • ui/README.md
  • core/schemas/mux.go
  • .github/workflows/release-pipeline.yml
  • .github/workflows/pr-tests.yml
  • docs/apis/openapi.json
  • core/schemas/account.go
  • transports/config.schema.json
  • ui/lib/constants/logs.ts
  • docs/features/unified-interface.mdx
  • core/internal/testutil/responses_stream.go
  • core/schemas/bifrost.go
  • core/internal/testutil/account.go
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/bifrost.go
  • ui/lib/constants/config.ts
  • core/providers/huggingface/embedding.go
  • docs/contributing/adding-a-provider.mdx
  • Makefile
  • core/providers/huggingface/chat.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/utils.go
  • core/providers/huggingface/transcription.go
  • ui/lib/constants/icons.tsx
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/types.go
🧬 Code graph analysis (9)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
  • NewHuggingFaceProvider (88-120)
core/providers/huggingface/embedding.go (2)
core/schemas/embedding.go (4)
  • BifrostEmbeddingRequest (9-16)
  • BifrostEmbeddingResponse (22-28)
  • EmbeddingData (118-122)
  • EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
  • HuggingFaceEmbeddingRequest (303-313)
  • HuggingFaceEmbeddingResponse (324-324)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (2)
  • BifrostChatRequest (12-19)
  • LogProb (625-629)
core/providers/huggingface/types.go (12)
  • HuggingFaceChatRequest (72-92)
  • HuggingFaceChatMessage (94-102)
  • HuggingFaceContentItem (105-109)
  • HuggingFaceImageRef (111-113)
  • HuggingFaceToolCall (115-119)
  • HuggingFaceFunction (121-125)
  • HuggingFaceResponseFormat (127-130)
  • HuggingFaceStreamOptions (139-141)
  • HuggingFaceTool (143-146)
  • HuggingFaceToolFunction (148-152)
  • HuggingFaceChatResponse (154-161)
  • HuggingFaceChatStreamResponse (215-224)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
  • HuggingFaceListModelsResponse (24-26)
core/schemas/bifrost.go (13)
  • ModelProvider (32-32)
  • RequestType (86-86)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • TextCompletionRequest (90-90)
  • TextCompletionStreamRequest (91-91)
  • ResponsesRequest (94-94)
  • ResponsesStreamRequest (95-95)
  • EmbeddingRequest (96-96)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/huggingface/utils.go (2)
core/providers/huggingface/huggingface.go (1)
  • HuggingFaceProvider (25-31)
core/providers/utils/utils.go (5)
  • GetRequestPath (219-239)
  • MakeRequestWithContext (39-93)
  • HandleProviderAPIError (317-337)
  • CheckAndDecodeBody (423-431)
  • NewBifrostOperationError (449-460)
core/providers/huggingface/transcription.go (3)
core/schemas/transcriptions.go (2)
  • BifrostTranscriptionRequest (3-10)
  • BifrostTranscriptionResponse (16-26)
core/providers/huggingface/types.go (4)
  • HuggingFaceTranscriptionRequest (353-358)
  • HuggingFaceTranscriptionRequestParameters (361-364)
  • HuggingFaceTranscriptionGenerationParameters (367-384)
  • HuggingFaceTranscriptionEarlyStopping (388-391)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/huggingface.go (5)
core/providers/huggingface/types.go (4)
  • HuggingFaceChatResponse (154-161)
  • HuggingFaceResponseError (293-297)
  • HuggingFaceHubError (288-291)
  • HuggingFaceEmbeddingResponse (324-324)
core/providers/utils/utils.go (5)
  • SetExtraHeaders (178-208)
  • MakeRequestWithContext (39-93)
  • HandleProviderAPIError (317-337)
  • CheckAndDecodeBody (423-431)
  • NewBifrostOperationError (449-460)
core/schemas/bifrost.go (15)
  • ModelProvider (32-32)
  • HuggingFace (51-51)
  • RequestType (86-86)
  • BifrostError (356-365)
  • ErrorField (374-381)
  • ListModelsRequest (89-89)
  • EmbeddingRequest (96-96)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • ResponsesStreamRequest (95-95)
  • ResponsesRequest (94-94)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/providers/huggingface/chat.go (1)
  • ToHuggingFaceChatCompletionRequest (12-207)
core/providers/huggingface/embedding.go (1)
  • ToHuggingFaceEmbeddingRequest (10-58)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
  • SetupTest (51-60)
core/internal/testutil/account.go (2)
  • ComprehensiveTestConfig (47-64)
  • TestScenarios (22-44)
core/schemas/bifrost.go (2)
  • HuggingFace (51-51)
  • Fallback (131-134)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/tests.go (1)
  • RunAllComprehensiveTests (15-62)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
  • BifrostSpeechRequest (9-16)
  • BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (4)
  • HuggingFaceSpeechRequest (329-335)
  • HuggingFaceSpeechParameters (338-341)
  • HuggingFaceTranscriptionGenerationParameters (367-384)
  • HuggingFaceTranscriptionEarlyStopping (388-391)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (34)
Makefile (5)

15-21: ✓ Well-structured tool path resolution with sensible fallbacks.

The new DEFAULT_GOBIN variable elegantly prioritizes GOBIN → GOPATH/bin → empty, and AIR_BIN and GOTESTSUM_BIN consistently prefer local paths. This avoids hardcoding tool expectations and gracefully handles systems with custom Go installations (e.g., Nix).


24-29: ✓ Dynamic ANSI color codes improve portability and maintainability.

Using shell printf instead of literal escape sequences is more portable and easier to read. This pattern is cleaner than hardcoded escape codes.


65-70: ✓ Root check and multi-fallback Next.js detection enhance developer experience.

Lines 67-70 prevent root-user npm installs on local dev machines, which avoids permission issues on NixOS and similar systems. Lines 76-85 provide three tiers of fallback (local node_modules, npx, global), making the install flow robust across different environments.

Also applies to: 75-85


155-155: ✓ Consistent use of $(AIR_BIN) variable throughout dev target.

Both debug and non-debug paths now use the centralized $(AIR_BIN) variable, ensuring the tool path is resolved consistently.

Also applies to: 163-163


349-349: ✓ $(GOTESTSUM_BIN) consistently used across all test targets.

All test invocations now reference $(GOTESTSUM_BIN) instead of the bare gotestsum command, ensuring they use the resolved tool path. This improves reliability on systems where gotestsum is installed in non-standard locations.

Also applies to: 401-401, 425-425, 454-454, 548-548

core/bifrost.go (2)

26-26: LGTM!

The import statement is correctly placed in alphabetical order and follows Go conventions.


1327-1328: LGTM!

The HuggingFace provider is correctly wired into the factory switch with the proper constructor call and return pattern, consistent with other providers.

core/providers/huggingface/types.go (6)

11-48: LGTM!

The custom UnmarshalJSON implementation properly handles both the array and object response formats from the HuggingFace API, addressing the earlier review concern.


90-92: Clarify the purpose of the Extra field.

The Extra field is tagged with json:"-" which means it won't be marshaled or unmarshaled. If the intent is to capture unknown additional fields from the API response, you'd need custom UnmarshalJSON logic. If it's for application-level metadata only (not from API), the current approach is fine.


288-297: LGTM!

The error types properly represent different HuggingFace API error response formats.


303-324: LGTM!

The embedding types properly handle flexible input formats and the response structure matches HuggingFace's feature extraction API output.


329-348: LGTM!

The speech types follow a consistent pattern with the chat types. Note that the Extra fields with json:"-" tags won't capture unknown API fields (same consideration as the chat request Extra field).


435-436: LGTM!

The type aliases provide convenient alternative names for shared generation parameters, improving code readability.

ui/lib/constants/config.ts (2)

14-14: LGTM!

The HuggingFace model placeholder examples demonstrate both chat and embedding models with proper HuggingFace Hub naming conventions.


34-34: LGTM!

Correctly marks HuggingFace as requiring an API key, consistent with other cloud-based inference providers.

core/providers/huggingface/huggingface_test.go (1)

12-63: LGTM! Test configuration is well-structured.

The test setup correctly:

  • Checks for API key before running
  • Uses parallel execution
  • Configures comprehensive scenarios
  • Enables appropriate features (chat, streaming, tool calls, vision)
  • Disables unsupported features (text completion, embedding, speech, transcription)

The model assignments appear correct and the disabled scenarios align with the provider's current capabilities.

core/providers/huggingface/models.go (1)

16-104: LGTM! Model listing conversion is well-implemented.

The code demonstrates good practices:

  • Proper nil handling and input validation
  • Pre-allocated slices for performance
  • Skip logic for invalid models (empty ID or no supported methods)
  • Comprehensive method derivation from both pipeline tags and model tags
  • Deduplication using a map-based approach
  • Sorted output for consistency

The logic correctly maps HuggingFace model metadata to Bifrost's schema.

core/providers/huggingface/embedding.go (1)

10-96: LGTM! Embedding conversion logic is solid.

The implementation correctly handles:

  • Nil input validation
  • Model/provider extraction via splitIntoModelProvider
  • Mapping single text vs. array of texts
  • Provider-specific parameters from ExtraParams
  • Embedding array construction with proper indexing
  • Missing usage information (documented that HF doesn't provide it)

The converters follow the established pattern and handle both request and response transformations cleanly.

core/providers/huggingface/transcription.go (1)

9-142: LGTM! Transcription converters handle complex parameter mapping well.

The implementation demonstrates:

  • Proper input validation with clear error messages
  • Correct use of extractIntFromInterface to handle JSON numeric types (addressing past review concerns)
  • Comprehensive generation parameter mapping (do_sample, max_new_tokens, temperature, etc.)
  • Proper handling of the polymorphic early_stopping field (bool or string)
  • Segment conversion with timestamp preservation
  • Clean separation of concerns

Previous issues with error messages and integer type assertions have been addressed.

docs/contributing/adding-a-provider.mdx (1)

1-2070: Documentation is comprehensive and well-structured.

The guide provides excellent coverage of:

  • Clear distinction between OpenAI-compatible and custom API providers
  • Step-by-step implementation phases with proper ordering
  • File structure and naming conventions
  • Type definitions and validation requirements
  • Converter patterns with real examples
  • Test setup and CI/CD integration
  • UI and configuration updates

The reference code examples serve their purpose of illustrating patterns without needing to be complete implementations (as clarified by the author in past reviews).

core/providers/huggingface/speech.go (1)

9-120: LGTM! Speech synthesis converters are well-implemented.

The code follows the same robust patterns as transcription:

  • Input validation with clear error messages
  • Proper use of extractIntFromInterface for JSON numeric handling
  • Comprehensive generation parameter mapping
  • Handling of polymorphic early_stopping field
  • Clear documentation that HF TTS doesn't provide usage/alignment data
  • Clean response construction with proper metadata

Previous type assertion issues have been resolved.

core/providers/huggingface/chat.go (1)

12-421: LGTM! Chat converters handle complex transformations correctly.

The implementation demonstrates excellent handling of:

  • Nil pointer protection for tool calls (fixed from past review)
  • Message content conversion (string vs. structured blocks)
  • Image URL handling for vision models
  • Tool call conversion with proper nil checks
  • Response format and stream options
  • Debug logging (variable properly defined in package scope)
  • Logprobs and top_logprobs conversion
  • Streaming delta transformation

The code properly handles both non-streaming and streaming responses with appropriate type conversions.

core/providers/huggingface/utils.go (3)

137-166: Good fix for model name validation.

The splitIntoModelProvider function now properly handles invalid input (no slashes) by returning an error instead of producing empty strings. The debug logging helps trace the parsing logic, and the distinction between single-slash (org/model) and multi-slash (provider/org/model) formats is clear.
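
For reference, the parsing rules read roughly like this sketch (the default provider value and exact error text are assumptions of this sketch):

package sketch

import (
	"fmt"
	"strings"
)

// splitModelSketch: "org/model" keeps the default provider, a leading segment
// in "provider/org/model" is peeled off as the inference provider, and a name
// with no slash at all is rejected.
func splitModelSketch(model string) (inferenceProvider, modelName string, err error) {
	parts := strings.Split(model, "/")
	switch {
	case len(parts) >= 3:
		return parts[0], strings.Join(parts[1:], "/"), nil
	case len(parts) == 2:
		return "auto", model, nil
	default:
		return "", "", fmt.Errorf("invalid model %q: expected org/model or provider/org/model", model)
	}
}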


168-201: Provider routing correctly limited to supported operations.

As clarified in past reviews, only 6 providers (fal-ai, hf-inference, nebius, replicate, sambanova, scaleway) support embedding/speech/transcription operations. The other 13 providers in INFERENCE_PROVIDERS are used for chat/text-generation, which follows a different routing pattern. The error messages have been corrected to reference the appropriate provider names.


257-296: Excellent utility function for JSON numeric handling.

The extractIntFromInterface helper comprehensively handles all numeric types that can result from JSON unmarshaling:

  • All signed and unsigned integer types
  • Float types (with conversion)
  • json.Number with fallback parsing

This is used throughout the provider to safely extract integer parameters from ExtraParams, addressing the type assertion issues flagged in earlier reviews.

core/providers/huggingface/huggingface.go (9)

33-85: Response pooling helpers look correct and safe to reuse.

The sync.Pool setup and acquire/release helpers for HuggingFaceChatResponse, HuggingFaceTranscriptionResponse, and HuggingFaceSpeechResponse correctly reset structs on acquire and only reuse them after the call site is done, which avoids stale state and minimizes allocations. No changes needed here.


87-131: Provider initialization and URL handling are consistent with other providers.

NewHuggingFaceProvider, GetProviderKey, and buildRequestURL follow the existing provider patterns: they apply defaults, configure the fasthttp client (including proxy), normalize BaseURL, honor CustomProviderConfig, and respect per-request path overrides. This looks good and matches the broader design.


132-205: Core HTTP request path is robust and correctly decouples the response body.

completeRequest cleanly centralizes request construction, extra-header injection, auth, context-aware execution, non-200 handling via HandleProviderAPIError, gzip-aware decoding via CheckAndDecodeBody, and copies the body before releasing the fasthttp response to avoid use-after-free. The debug logging is also appropriately gated. No functional issues from this implementation.


207-293: Model listing logic is aligned with Bifrost patterns.

listModelsByKey and ListModels correctly apply operation-allowed checks, build the model hub URL, attach auth and extra headers, handle HTTP and provider-level errors, decode via HuggingFaceListModelsResponse, and then delegate fan-out across keys via HandleMultipleListModelsRequests, including latency and optional raw-response propagation. This looks consistent and complete.


303-375: ChatCompletion request/response wiring looks solid.

The ChatCompletion path correctly:

  • Checks operation permissions.
  • Normalizes the model via splitIntoModelProvider before building the HF request.
  • Uses CheckContextAndGetRequestBody with ToHuggingFaceChatCompletionRequest and explicitly disables streaming.
  • Builds the URL with buildRequestURL, calls completeRequest, and uses the pooled HuggingFaceChatResponse for JSON decoding.
  • Converts to BifrostChatResponse and fills ExtraFields (provider, requested model, request type, latency, raw response when enabled).

This end-to-end flow is cohesive and matches the intended Bifrost provider contract.


377-659: Streaming chat implementation handles SSE and Responses fallback correctly.

The ChatCompletionStream implementation is careful and comprehensive: it validates operation permissions, handles the Responses→Chat fallback via a stream state object, builds a streaming POST with Accept: text/event-stream, and uses bufio.Scanner on BodyStream() with an increased buffer. The loop:

  • Filters comments/empty lines and [DONE].
  • Parses SSE data: lines, detects error payloads via HuggingFaceResponseError and sends a structured BifrostError.
  • Decodes normal chunks into HuggingFaceChatStreamResponse, converts to Bifrost stream responses, populates per-chunk metadata (including latency and chunk index), and optionally attaches raw JSON.
  • Handles the combined usage+content case for Responses fallback by splitting content/usage into separate events.
  • Propagates scanner errors via ProcessAndSendError and ensures resources/channels are released.

This is a solid streaming implementation; no blocking issues identified.


661-683: Responses and ResponsesStream wrappers are straightforward and correct.

These methods reuse the ChatCompletion/ChatCompletionStream implementations and only adapt to the Responses API shape, setting ExtraFields.RequestType, Provider, and ModelRequested appropriately. This is a good reuse of existing logic and behaves as expected.


756-830: Speech and SpeechStream behaviors are coherent with provider capabilities.

Speech correctly reuses the model-mapping pattern (including ProviderTask == "text-to-speech"), builds the HF-specific speech request via ToHuggingFaceSpeechRequest, executes completeRequest, and converts the pooled HuggingFaceSpeechResponse into a BifrostSpeechResponse with fully populated ExtraFields (including latency and optional raw response). SpeechStream explicitly returns an unsupported-operation error tagged with schemas.SpeechStreamRequest, which is appropriate if streaming speech isn’t supported yet.


832-906: Transcription and TranscriptionStream correctly target ASR and error types.

Transcription:

  • Checks operation permissions for schemas.TranscriptionRequest.
  • Splits model/provider and resolves a mapping where ProviderTask == "automatic-speech-recognition".
  • Builds the body via ToHuggingFaceTranscriptionRequest, derives the inference route URL with getInferenceProviderRouteURL, and calls completeRequest.
  • Uses the pooled HuggingFaceTranscriptionResponse to decode and convert to BifrostTranscriptionResponse, filling provider/model/request-type/latency/RawResponse metadata.

TranscriptionStream cleanly returns an unsupported-operation error with the correct schemas.TranscriptionStreamRequest tag. This aligns with the intended ASR behavior and prior fixes to the task and error types.

@qwerty-dvorak qwerty-dvorak force-pushed the 12-05-feat_huggingface_provider_added branch from 4663a80 to 53cb91d Compare December 5, 2025 23:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (4)
Makefile (3)

97-98: DUPLICATE COMMENT: which with full path variable—still present despite prior review.

This pattern was flagged in a previous review (commit f279893) but remains in the code. While the AND logic with [ ! -x "$$INSTALLED" ] does provide fallback protection, the condition ! which $$INSTALLED will always be true for full paths (since which cannot resolve absolute paths). Simplify by removing the redundant which check:

- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \

103-104: DUPLICATE COMMENT: which fallback for full-path variable—unclear intent and inconsistent with AIR_BIN pattern.

This pattern was flagged in a previous review (commit f279893). Although the [ -x ] check comes first and will handle full paths correctly (short-circuiting the which fallback), the fallback itself is ineffective for full paths and inconsistent with the simpler [ -x "$(AIR_BIN)" ] check used for AIR_BIN on line 90. Consider aligning both patterns:

- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \

110-111: DUPLICATE COMMENT: which with full path variable—same issue as lines 97–98.

This pattern was flagged in a previous review (commit f279893) but persists here. The AND logic provides practical protection, but the which check is redundant for full paths. Apply the same simplification as line 97:

- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
core/providers/huggingface/huggingface.go (1)

381-387: Bug: Incorrect request type in unsupported operation errors.

Both TextCompletion and TextCompletionStream return errors with schemas.EmbeddingRequest instead of the correct request types.

Apply this diff to fix the error types:

 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
🧹 Nitpick comments (3)
Makefile (1)

75-85: Next.js fallback chain is reasonable but could be simplified.

The logic prefers local ./node_modules/.bin/next, then npx, then global install. However, the condition structure is a bit convoluted:

  • Line 77: checks if local next exists
  • Line 79: only runs npm install if npx is found (but the intent seems to be to install next if not found)
  • Line 83: fallback comment mentions "may fail on Nix"

The logic works, but consider clarifying the intent—should the npm install on line 81 run unconditionally if local next is missing, or should it depend on npx availability? Currently, it only runs npm install if npx exists, which may not match the intent.

core/schemas/account.go (1)

54-57: Consider adding an Endpoint field for self-hosted deployments.

The current implementation only includes Deployments, which covers model-to-deployment mappings. Per the linked issue, self-hosted endpoints are an optional enhancement. For future extensibility, consider whether an Endpoint field (similar to AzureKeyConfig) would be beneficial for users deploying models on custom infrastructure or using dedicated inference endpoints.

This is non-blocking for the initial implementation.

 type HuggingFaceKeyConfig struct {
+	Endpoint    string            `json:"endpoint,omitempty"`    // Custom HuggingFace inference endpoint URL
 	Deployments map[string]string `json:"deployments,omitempty"` // Mapping of model identifiers to deployment names
 }
core/providers/huggingface/embedding.go (1)

60-104: Consider simplifying zero-usage handling.

The conversion logic is correct. However, lines 94-100 explicitly create a zero-valued BifrostLLMUsage when the HuggingFace response doesn't include usage. You could simplify by leaving Usage as nil when not provided, since the Bifrost schema allows it to be omitted.

If you prefer explicit zero values, the current implementation is fine. Otherwise, you can remove lines 94-100:

 	// Map usage information if available
 	if response.Usage != nil {
 		bifrostResponse.Usage = &schemas.BifrostLLMUsage{
 			PromptTokens:     response.Usage.PromptTokens,
 			CompletionTokens: response.Usage.CompletionTokens,
 			TotalTokens:      response.Usage.TotalTokens,
 		}
-	} else {
-		// Set empty usage if not provided
-		bifrostResponse.Usage = &schemas.BifrostLLMUsage{
-			PromptTokens:     0,
-			CompletionTokens: 0,
-			TotalTokens:      0,
-		}
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4663a80 and 53cb91d.

📒 Files selected for processing (26)
  • .github/workflows/pr-tests.yml (1 hunks)
  • .github/workflows/release-pipeline.yml (4 hunks)
  • Makefile (8 hunks)
  • core/bifrost.go (2 hunks)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/providers/huggingface/chat.go (1 hunks)
  • core/providers/huggingface/embedding.go (1 hunks)
  • core/providers/huggingface/huggingface.go (1 hunks)
  • core/providers/huggingface/huggingface_test.go (1 hunks)
  • core/providers/huggingface/models.go (1 hunks)
  • core/providers/huggingface/speech.go (1 hunks)
  • core/providers/huggingface/transcription.go (1 hunks)
  • core/providers/huggingface/types.go (1 hunks)
  • core/providers/huggingface/utils.go (1 hunks)
  • core/schemas/account.go (2 hunks)
  • core/schemas/bifrost.go (3 hunks)
  • core/schemas/mux.go (2 hunks)
  • docs/apis/openapi.json (1 hunks)
  • docs/contributing/adding-a-provider.mdx (1 hunks)
  • docs/features/unified-interface.mdx (1 hunks)
  • transports/config.schema.json (2 hunks)
  • ui/README.md (1 hunks)
  • ui/lib/constants/config.ts (2 hunks)
  • ui/lib/constants/icons.tsx (1 hunks)
  • ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • ui/README.md
  • core/schemas/mux.go
  • core/internal/testutil/account.go
  • transports/config.schema.json
  • core/providers/huggingface/utils.go
  • .github/workflows/pr-tests.yml
  • docs/apis/openapi.json
  • core/providers/huggingface/speech.go
  • docs/features/unified-interface.mdx
  • ui/lib/constants/config.ts
  • core/bifrost.go
  • core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
  • ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/bifrost.go
  • core/schemas/account.go
  • core/providers/huggingface/transcription.go
  • docs/contributing/adding-a-provider.mdx
  • Makefile
  • core/providers/huggingface/huggingface_test.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/chat.go
  • ui/lib/constants/logs.ts
  • core/providers/huggingface/models.go
  • ui/lib/constants/icons.tsx
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/types.go
🧬 Code graph analysis (6)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
  • ModelProvider (171-174)
core/providers/huggingface/transcription.go (6)
core/schemas/transcriptions.go (2)
  • BifrostTranscriptionRequest (3-10)
  • BifrostTranscriptionResponse (16-26)
core/providers/huggingface/types.go (5)
  • HuggingFaceTranscriptionRequest (376-381)
  • HuggingFaceTranscriptionRequestParameters (384-387)
  • HuggingFaceTranscriptionGenerationParameters (390-407)
  • HuggingFaceTranscriptionEarlyStopping (411-414)
  • HuggingFaceTranscriptionResponse (447-450)
ui/components/ui/input.tsx (1)
  • Input (15-69)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/elevenlabs/transcription.go (1)
  • ToBifrostTranscriptionResponse (100-150)
core/schemas/bifrost.go (2)
  • BifrostResponseExtraFields (287-296)
  • HuggingFace (51-51)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
  • BifrostEmbeddingRequest (9-16)
  • BifrostEmbeddingResponse (22-28)
  • EmbeddingData (118-122)
  • EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
  • HuggingFaceEmbeddingRequest (303-313)
  • HuggingFaceEmbeddingResponse (324-328)
core/schemas/chatcompletions.go (1)
  • BifrostLLMUsage (640-647)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (12)
  • BifrostChatRequest (12-19)
  • ChatContentBlockTypeText (497-497)
  • ChatContentBlockTypeImage (498-498)
  • ChatAssistantMessage (541-545)
  • ChatToolMessage (536-538)
  • BifrostResponseChoice (582-590)
  • ChatAssistantMessageToolCall (564-570)
  • ChatNonStreamResponseChoice (605-608)
  • BifrostLogProbs (593-598)
  • LogProb (625-629)
  • ChatStreamResponseChoice (611-613)
  • ChatStreamResponseChoiceDelta (616-622)
core/providers/huggingface/types.go (12)
  • HuggingFaceChatRequest (72-92)
  • HuggingFaceChatMessage (94-102)
  • HuggingFaceContentItem (105-109)
  • HuggingFaceImageRef (111-113)
  • HuggingFaceToolCall (115-119)
  • HuggingFaceFunction (121-125)
  • HuggingFaceResponseFormat (127-130)
  • HuggingFaceStreamOptions (139-141)
  • HuggingFaceTool (143-146)
  • HuggingFaceToolFunction (148-152)
  • HuggingFaceChatResponse (154-161)
  • HuggingFaceChatStreamResponse (215-224)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
  • HuggingFaceListModelsResponse (24-26)
core/schemas/bifrost.go (13)
  • ModelProvider (32-32)
  • RequestType (86-86)
  • ChatCompletionRequest (92-92)
  • ChatCompletionStreamRequest (93-93)
  • TextCompletionRequest (90-90)
  • TextCompletionStreamRequest (91-91)
  • ResponsesRequest (94-94)
  • ResponsesStreamRequest (95-95)
  • EmbeddingRequest (96-96)
  • SpeechRequest (97-97)
  • SpeechStreamRequest (98-98)
  • TranscriptionRequest (99-99)
  • TranscriptionStreamRequest (100-100)
core/schemas/models.go (2)
  • BifrostListModelsResponse (36-45)
  • Model (109-129)
core/providers/huggingface/types.go (1)
core/schemas/models.go (1)
  • Model (109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx

[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...

(QB_NEW_EN_HYPHEN)


[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...

(QB_NEW_EN_HYPHEN)


[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...

(QB_NEW_EN_HYPHEN)


[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ...yml` - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...

(GITHUB)

🔇 Additional comments (25)
.github/workflows/release-pipeline.yml (1)

118-118: Environment variable additions are correct and necessary.

The HUGGING_FACE_API_KEY environment variable is properly added to all four release jobs with correct GitHub Secrets reference syntax. The environment variables are inherited by the release scripts' child processes (specifically go test), where HuggingFace tests access the key via os.Getenv("HUGGING_FACE_API_KEY"). This is the correct pattern for passing secrets to test suites during the release workflow.

The documentation already uses consistent naming (HUGGING_FACE_API_KEY with underscore) as shown in docs/contributing/adding-a-provider.mdx, so no documentation updates are needed.

Likely an incorrect or invalid review comment.

Makefile (5)

15-21: Binary path resolution and variable initialization look sound.

The new variables (GOBIN, GOPATH, DEFAULT_GOBIN, AIR_BIN, GOTESTSUM_BIN) properly detect and prioritize local Go binary paths, with sensible fallbacks. Line 21's GOTESTSUM_BIN simplification (removing the shell `which` fallback) is an improvement. This approach supports the HuggingFace provider tests by ensuring test tooling is available in predictable locations.


24-29: Color variables using printf are a good improvement.

Replacing hardcoded ANSI escape sequences with shell printf calls is more portable and avoids raw escape codes in the Makefile. This is a solid enhancement.


65-70: Root check in install-ui is appropriate for local development.

The guard against running make as root on developer machines (except CI environments) prevents npm permission issues on systems like Nix. This is a pragmatic improvement.


155-155: Using AIR_BIN variable in dev target is correct.

The change from direct air invocation to $(AIR_BIN) ensures the tool is invoked from the detected/installed path, improving reliability. Good alignment with the binary path resolution strategy.

Also applies to: 163-163


349-349: Test targets correctly use GOTESTSUM_BIN variable.

All test targets (test, test-core, test-plugins) now use $(GOTESTSUM_BIN) instead of direct gotestsum invocations. This ensures consistency with the binary path resolution and supports the HuggingFace provider testing workflow. The change is well-applied across all affected targets.

Also applies to: 401-401, 425-425, 454-454, 548-548

core/schemas/bifrost.go (3)

35-52: LGTM!

The HuggingFace provider constant is correctly added with the lowercase value "huggingface", consistent with the naming convention used by other providers.


55-62: LGTM!

Adding HuggingFace to SupportedBaseProviders is appropriate, allowing it to serve as a base provider for custom provider configurations.


65-83: LGTM!

HuggingFace is correctly added to StandardProviders, completing the registration as a built-in provider.
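
For reference, a minimal sketch of the shape these registrations take, using the names from the code-graph references above (the actual declarations in core/schemas/bifrost.go may be grouped or ordered differently):

```go
// ModelProvider identifies a provider by its lowercase slug.
type ModelProvider string

// The new constant uses the lowercase value noted in the review.
const HuggingFace ModelProvider = "huggingface"

// HuggingFace is appended to both provider lists (other entries elided).
var (
    SupportedBaseProviders = []ModelProvider{ /* ...existing providers..., */ HuggingFace}
    StandardProviders      = []ModelProvider{ /* ...existing providers..., */ HuggingFace}
)
```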

ui/lib/constants/logs.ts (2)

2-20: LGTM!

The "huggingface" entry is correctly placed in alphabetical order within the KnownProvidersNames array, and the ProviderName type will automatically include it through type derivation.


43-61: LGTM!

The ProviderLabels entry correctly uses the brand-appropriate capitalization "HuggingFace" for user-facing display.

core/schemas/account.go (1)

8-18: LGTM!

The HuggingFaceKeyConfig field is correctly added to the Key struct following the established pattern for provider-specific configurations.
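
A sketch of how such a field typically hangs off the key struct; the JSON tag and the surrounding fields here are assumptions, not the actual definition:

```go
// Key holds per-key provider configuration; only the new field is shown.
type Key struct {
    // ...existing fields (ID, value, other provider configs, etc.)...

    // HuggingFaceKeyConfig carries HuggingFace-specific settings for this key.
    HuggingFaceKeyConfig *HuggingFaceKeyConfig `json:"huggingface_key_config,omitempty"` // tag is assumed
}
```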

core/providers/huggingface/models.go (1)

16-104: LGTM! Well-structured model listing and method derivation.

The implementation correctly:

  • Handles nil inputs and skips invalid models
  • Derives supported methods from pipeline tags and model tags with comprehensive coverage
  • Pre-allocates slices for performance
  • Returns sorted method lists for consistency
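
A rough illustration of the pipeline-tag-to-method derivation described above. The helper name is hypothetical and the provider covers more tags than shown; the request-type constants are the ones listed in the code graph:

```go
// methodsForPipelineTag maps a HuggingFace pipeline tag to the Bifrost
// request types a model can serve. Unknown tags yield no methods.
func methodsForPipelineTag(tag string) []schemas.RequestType {
    switch tag {
    case "text-generation", "image-text-to-text":
        return []schemas.RequestType{
            schemas.ChatCompletionRequest,
            schemas.ChatCompletionStreamRequest,
        }
    case "feature-extraction":
        return []schemas.RequestType{schemas.EmbeddingRequest}
    case "text-to-speech":
        return []schemas.RequestType{schemas.SpeechRequest}
    case "automatic-speech-recognition":
        return []schemas.RequestType{schemas.TranscriptionRequest}
    default:
        return nil
    }
}
```
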
core/providers/huggingface/types.go (2)

28-48: LGTM! Flexible JSON unmarshaling.

The custom UnmarshalJSON correctly handles both the array form [...] and object form {"models": [...]} returned by different HuggingFace API versions, with proper error handling when neither format matches.
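
The dual-form decoding might look roughly like this (the element type and field names are assumptions; the actual error wording will differ):

```go
// UnmarshalJSON accepts either a bare JSON array of models or an object
// wrapping them under a "models" key.
func (r *HuggingFaceListModelsResponse) UnmarshalJSON(data []byte) error {
    var arr []HuggingFaceModel
    if err := json.Unmarshal(data, &arr); err == nil {
        r.Models = arr
        return nil
    }
    var obj struct {
        Models []HuggingFaceModel `json:"models"`
    }
    if err := json.Unmarshal(data, &obj); err != nil {
        return fmt.Errorf("list models response is neither an array nor an object with a models field: %w", err)
    }
    r.Models = obj.Models
    return nil
}
```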


416-444: LGTM! Correct handling of union types.

The custom JSON marshaling/unmarshaling for HuggingFaceTranscriptionEarlyStopping properly handles both boolean and string ("never") forms, which aligns with HuggingFace API's flexible parameter schema.
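
One common way to model that bool-or-"never" union (the field layout below is an assumption, not the provider's actual struct):

```go
// earlyStopping is either a JSON boolean or the literal string "never".
type earlyStopping struct {
    Bool  *bool // set when the value is a boolean
    Never bool  // set when the value is the string "never"
}

func (e earlyStopping) MarshalJSON() ([]byte, error) {
    if e.Never {
        return json.Marshal("never")
    }
    if e.Bool != nil {
        return json.Marshal(*e.Bool)
    }
    return []byte("null"), nil
}

func (e *earlyStopping) UnmarshalJSON(data []byte) error {
    var b bool
    if err := json.Unmarshal(data, &b); err == nil {
        e.Bool = &b
        return nil
    }
    var s string
    if err := json.Unmarshal(data, &s); err == nil && s == "never" {
        e.Never = true
        return nil
    }
    return fmt.Errorf("early_stopping must be a boolean or the string %q", "never")
}
```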

core/providers/huggingface/huggingface_test.go (1)

12-63: LGTM! Comprehensive test configuration.

The test correctly:

  • Gates on HUGGING_FACE_API_KEY environment variable
  • Uses appropriate models for each feature (transcription, speech synthesis, embeddings, chat, vision)
  • Configures comprehensive test scenarios covering chat, streaming, tool calls, images, embeddings, and more
  • Follows testutil patterns with proper setup and cleanup
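
The env-var gating mentioned above usually boils down to a skip guard like this (the test and helper names are illustrative, not the actual testutil API):

```go
func TestHuggingFace(t *testing.T) {
    apiKey := os.Getenv("HUGGING_FACE_API_KEY")
    if apiKey == "" {
        t.Skip("HUGGING_FACE_API_KEY not set; skipping HuggingFace integration tests")
    }
    // ...configure the scenarios (chat, streaming, tool calls, embeddings, ...)
    // and run them against the provider using apiKey...
}
```
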
core/providers/huggingface/transcription.go (2)

9-101: LGTM! Robust parameter handling.

The conversion correctly:

  • Validates inputs with appropriate error messages
  • Uses extractIntFromInterface to handle numeric parameters from JSON (which may be float64 or int)
  • Handles the early_stopping union type (bool or string)
  • Maps all generation parameters comprehensively
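
The float64-or-int handling called out above is the usual JSON-decoding wrinkle in Go; a helper in the spirit of extractIntFromInterface (the behavior shown is an assumption about the actual implementation) looks like:

```go
// extractIntFromInterface converts a decoded JSON number to int.
// encoding/json decodes untyped numbers as float64, so both cases matter.
func extractIntFromInterface(v interface{}) (int, bool) {
    switch n := v.(type) {
    case int:
        return n, true
    case int64:
        return int(n), true
    case float64:
        return int(n), true
    default:
        return 0, false
    }
}
```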

103-142: LGTM! Clean response conversion.

The conversion properly:

  • Validates non-nil response and model name
  • Maps transcription chunks to Bifrost segments with timestamps
  • Sets appropriate provider metadata in ExtraFields
core/providers/huggingface/chat.go (2)

12-207: LGTM! Comprehensive chat request conversion.

The conversion correctly:

  • Handles messages, roles, names, and content (string and structured blocks)
  • Safely processes tool calls with nil checks for Function.Name
  • Maps all chat parameters including response format, stream options, and tools
  • Uses json.RawMessage for flexible fields like ToolChoice
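
For the json.RawMessage pass-through mentioned in the last point, the general pattern is to marshal whatever shape the caller supplied and forward the raw bytes (the function name and input type here are illustrative):

```go
// toolChoiceToRaw forwards either the string form ("auto", "none", "required")
// or the object form ({"type":"function","function":{"name":"..."}}) untouched.
func toolChoiceToRaw(choice interface{}) (json.RawMessage, error) {
    if choice == nil {
        return nil, nil
    }
    b, err := json.Marshal(choice)
    if err != nil {
        return nil, fmt.Errorf("marshal tool_choice: %w", err)
    }
    return json.RawMessage(b), nil
}
```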

209-323: LGTM! Thorough response conversion.

The conversion properly:

  • Validates inputs and constructs base response fields
  • Converts choices, messages, tool calls, and logprobs to Bifrost format
  • Maps usage information when available
  • Sets appropriate ExtraFields metadata
core/providers/huggingface/huggingface.go (5)

389-461: LGTM! Solid chat completion implementation.

The implementation correctly:

  • Checks operation permissions
  • Splits model identifiers into provider and model components
  • Converts requests using the chat converter
  • Handles errors and raw responses appropriately
  • Sets all required ExtraFields metadata

463-745: LGTM! Robust streaming implementation.

The streaming logic properly:

  • Handles SSE parsing with proper line-by-line processing
  • Checks for context cancellation between chunks
  • Parses error responses in the stream
  • Converts HuggingFace stream responses to Bifrost format
  • Supports fallback to ResponsesStream when needed with proper state management
  • Handles combined usage and content chunks by splitting them into separate events
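
The SSE handling described in the first two points typically reads like this. A sketch only: imports such as bufio, context, encoding/json, io, and strings are elided, and the emit callback stands in for Bifrost's actual stream plumbing:

```go
// readSSE scans the response body line by line, honors cancellation between
// chunks, and decodes each "data: ..." payload until the [DONE] sentinel.
func readSSE(ctx context.Context, body io.Reader, emit func(HuggingFaceChatStreamResponse)) error {
    scanner := bufio.NewScanner(body)
    for scanner.Scan() {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
        }
        line := strings.TrimSpace(scanner.Text())
        if !strings.HasPrefix(line, "data: ") {
            continue // skip blank lines and SSE comments
        }
        payload := strings.TrimPrefix(line, "data: ")
        if payload == "[DONE]" {
            return nil
        }
        var chunk HuggingFaceChatStreamResponse
        if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
            continue // the real code also inspects this for inline error objects
        }
        emit(chunk)
    }
    return scanner.Err()
}
```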

771-840: LGTM! Clean embedding implementation.

The implementation correctly:

  • Validates operation permissions
  • Retrieves provider mapping to get the correct provider-specific model ID
  • Validates the task type matches "feature-extraction"
  • Converts responses with proper error handling

842-918: LGTM! Complete speech synthesis flow.

The implementation correctly:

  • Validates permissions and request body
  • Retrieves and validates provider mapping for "text-to-speech" task
  • Downloads audio from the returned URL
  • Converts response with proper metadata
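
The audio-download step is ordinary HTTP plumbing; a minimal sketch (helper name and error wrapping are illustrative, not the provider's code):

```go
// downloadAudio fetches the synthesized audio bytes from the URL returned by
// the provider mapping, propagating the caller's context.
func downloadAudio(ctx context.Context, client *http.Client, url string) ([]byte, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("audio download failed: %s", resp.Status)
    }
    return io.ReadAll(resp.Body)
}
```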

924-993: LGTM! Correct transcription implementation.

The implementation correctly:

  • Validates permissions and model mapping
  • Checks for "automatic-speech-recognition" task
  • Converts requests and responses with proper error handling
  • Sets appropriate ExtraFields metadata

