@xiaochen-zhou
Contributor

@xiaochen-zhou xiaochen-zhou commented Jul 28, 2025

Purpose of this pull request

Reduce embedding precision from double to float. Closes #9611.

Does this PR introduce any user-facing change?

no

How was this patch tested?

Existing tests

Check list

@xiaochen-zhou
Contributor Author

I think we can start by reducing the embedding precision from double to float. The precision loss isn’t specific to Zhipu; it affects almost every model that returns its embeddings as double, for example:

Qianfan:

[image]

OpenAI:

[image]

So, as a quick fix, we can switch to float for now and add a note in the docs to let users know. @Hisoka-X
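
For illustration, here is a minimal Java sketch of the narrowing discussed in this comment. It is not the code from this PR: the class `EmbeddingPrecision` and all of its methods are hypothetical names, and the byte-buffer layout is an assumption, used only to show how writing 8-byte doubles where a reader expects 4-byte floats could make a vector appear to have twice as many dimensions.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; none of these names come from the PR.
public class EmbeddingPrecision {

    // Narrow each double component to float (may round the value).
    public static float[] toFloatVector(List<Double> embedding) {
        float[] out = new float[embedding.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = embedding.get(i).floatValue();
        }
        return out;
    }

    // Pack the vector as 4-byte floats (assumed storage layout).
    public static ByteBuffer packFloats(float[] vector) {
        ByteBuffer buf = ByteBuffer.allocate(Float.BYTES * vector.length);
        for (float f : vector) {
            buf.putFloat(f);
        }
        buf.flip();
        return buf;
    }

    // Pack the vector as 8-byte doubles, i.e. without narrowing.
    public static ByteBuffer packDoubles(List<Double> embedding) {
        ByteBuffer buf = ByteBuffer.allocate(Double.BYTES * embedding.size());
        for (double d : embedding) {
            buf.putDouble(d);
        }
        buf.flip();
        return buf;
    }

    // Number of dimensions a reader sees if it assumes 4-byte floats.
    public static int floatDimension(ByteBuffer buf) {
        return buf.remaining() / Float.BYTES;
    }

    public static void main(String[] args) {
        List<Double> embedding = Arrays.asList(0.1, 0.25, -0.5);
        float[] narrowed = toFloatVector(embedding);

        // 3 doubles occupy 24 bytes, which a float reader counts as 6 values.
        System.out.println("doubles read as floats: " + floatDimension(packDoubles(embedding)));
        // 3 floats occupy 12 bytes, which a float reader counts as 3 values.
        System.out.println("narrowed floats: " + floatDimension(packFloats(narrowed)));
        // 0.1 is not exactly representable as a float, so narrowing rounds it.
        System.out.println("0.1 survives narrowing exactly: " + ((double) narrowed[0] == 0.1));
    }
}
```

The trade-off is the one raised in this thread: narrowing keeps the dimension count correct for a float vector field, at the cost of rounding any component that has no exact float representation.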

@Hisoka-X
Member

Hisoka-X commented Jul 29, 2025

> So, as a quick fix, we can switch to float for now and add a note in the docs to let users know. @Hisoka-X

+1. Next step, we should support double vector type.

@Hisoka-X
Member

Thanks @xiaochen-zhou . Could you add a test case to cover it?

@xiaochen-zhou
Contributor Author

> Thanks @xiaochen-zhou . Could you add a test case to cover it?

OK.

@loupipalien
Contributor

> > So, as a quick fix, we can switch to float for now and add a note in the docs to let users know. @Hisoka-X
>
> +1. Next step, we should support double vector type.

@Hisoka-X @xiaochen-zhou Another question, is there a plan to support multimodal embeddings? https://www.volcengine.com/docs/82379/1523520

@xiaochen-zhou
Contributor Author

> > > So, as a quick fix, we can switch to float for now and add a note in the docs to let users know. @Hisoka-X
> >
> > +1. Next step, we should support double vector type.
>
> @Hisoka-X @xiaochen-zhou Another question, is there a plan to support multimodal embeddings? https://www.volcengine.com/docs/82379/1523520

I think this suggestion is great, and I would be happy to try implementing it. @Hisoka-X

@Hisoka-X
Member

> > > So, as a quick fix, we can switch to float for now and add a note in the docs to let users know. @Hisoka-X
> >
> > +1. Next step, we should support double vector type.
>
> @Hisoka-X @xiaochen-zhou Another question, is there a plan to support multimodal embeddings? https://www.volcengine.com/docs/82379/1523520

+1

Member

@Hisoka-X Hisoka-X left a comment

@corgy-w corgy-w merged commit c1d2172 into apache:dev Jul 31, 2025
5 checks passed
@xiaochen-zhou xiaochen-zhou deleted the embedding-float branch August 3, 2025 08:14
Development

Successfully merging this pull request may close these issues.

[Bug] [connector-elasticsearch] vector field type convert issue cause dimensions double

4 participants