Restore OpenAI embedding encoding format option #5902
KoreaNirsa wants to merge 3 commits into spring-projects:main
Signed-off-by: KoreaNirsa <islandtim@naver.com>
Signed-off-by: KoreaNirsa <islandtim@naver.com>
Thanks for this PR. I think we are going to keep the default as the OpenAI SDK one (Base64) and document that in the upgrade notes. Also, I am tempted to expose this option in a strongly typed form. See my refined implementation at https://github.com/sdeleuze/spring-ai/tree/openai-embedding-encoding-format.
Signed-off-by: KoreaNirsa <islandtim@naver.com>
@sdeleuze Thanks for the feedback and for sharing the refined implementation. I hadn’t considered that approach, but I agree it is more appropriate, both in aligning with the OpenAI SDK default and in using a strongly typed option. I’ve applied the suggested changes. Could you please take another look when you have a chance? Thanks!
Summary
This restores the OpenAI embedding `encodingFormat` option and passes it through to `EmbeddingCreateParams`.

In Spring AI 2.0.0-M5, the OpenAI model implementation migrated to the official OpenAI Java SDK. During that migration, the previous `encodingFormat` option from `OpenAiEmbeddingOptions` was dropped. The OpenAI Java SDK 4.28.0 defaults embedding requests to `base64`.

As a result, even when users configure `spring.ai.openai.embedding.options.encoding-format=float`, the actual request is sent with `encoding_format: base64`. This breaks some OpenAI-compatible embedding APIs, such as IONOS bge-m3, where float array embeddings are expected.

This change covers:

- `encodingFormat` on `OpenAiEmbeddingOptions`
- `float`
- `EmbeddingCreateParams`
- `from(...)`, `merge(...)`, and `from(EmbeddingCreateParams)`
- `base64`, and copy/merge behavior

Fixes gh-5901
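For readers unfamiliar with the two encodings: `float` returns a plain JSON array of numbers, while `base64` returns the same vector packed into a string. The sketch below is only an illustration and assumes the packed form is little-endian 32-bit floats, Base64-encoded. The class and method names are hypothetical and are not part of Spring AI or the OpenAI Java SDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper for illustration only; not a Spring AI or OpenAI SDK class.
final class EmbeddingDecoding {

    // Packs floats as little-endian float32 bytes and Base64-encodes them
    // (assumed wire layout for encoding_format: base64).
    static String encode(float[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float v : values) {
            buffer.putFloat(v);
        }
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    // Reverses encode(): Base64-decodes, then reads little-endian float32 values.
    static float[] decode(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] values = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getFloat();
        }
        return values;
    }

    public static void main(String[] args) {
        float[] original = {0.25f, -1.5f, 3.0f};
        String wire = encode(original);
        System.out.println("base64 form: " + wire);
        System.out.println("float form:  " + Arrays.toString(decode(wire)));
    }
}
```

A server or client that only understands the float array form cannot use the packed string without a decoding step like this, which is why the encoding sent in the request matters for OpenAI-compatible backends.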
Regression Cause
In `v2.0.0-M4`, `OpenAiEmbeddingOptions` exposed `encodingFormat` and mapped it to the `encoding_format` request field.

Evidence
`OpenAiEmbeddingOptions` in `v2.0.0-M4` contains the `encodingFormat` option.

Source
https://raw.githubusercontent.com/spring-projects/spring-ai/v2.0.0-M4/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiEmbeddingOptions.java
In `v2.0.0-M5`, after the migration to the official OpenAI Java SDK, that option is no longer present in `OpenAiEmbeddingOptions`, and `toOpenAiCreateParams(...)` does not call `builder.encodingFormat(...)`.

Evidence

`OpenAiEmbeddingOptions` in `v2.0.0-M5` contains `user` and `dimensions`, but no `encodingFormat`, `getEncodingFormat`, `setEncodingFormat`, or builder `encodingFormat(...)`.

Source
https://raw.githubusercontent.com/spring-projects/spring-ai/v2.0.0-M5/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiEmbeddingOptions.java
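The difference between the two mappings can be sketched roughly as follows. This is a simplified model, not the actual Spring AI or SDK code: `RequestBuilder`, `EncodingFormat`, and both mapping methods are hypothetical stand-ins, and the only behavior assumed is that the builder falls back to `base64` when no value is set.

```java
// All types below are hypothetical stand-ins, not real Spring AI or OpenAI SDK classes.
final class EncodingFormatPassThrough {

    enum EncodingFormat { FLOAT, BASE64 }

    // Stand-in for an SDK request builder that falls back to BASE64 when unset.
    static final class RequestBuilder {
        private EncodingFormat encodingFormat;

        RequestBuilder encodingFormat(EncodingFormat format) {
            this.encodingFormat = format;
            return this;
        }

        // Returns the encoding format the request would be sent with.
        EncodingFormat build() {
            return encodingFormat != null ? encodingFormat : EncodingFormat.BASE64;
        }
    }

    // Models a mapping that never forwards the option: the configured value is ignored.
    static EncodingFormat withoutPassThrough(EncodingFormat configured) {
        return new RequestBuilder().build();
    }

    // Models a mapping that forwards the option only when the user set one.
    static EncodingFormat withPassThrough(EncodingFormat configured) {
        RequestBuilder builder = new RequestBuilder();
        if (configured != null) {
            builder.encodingFormat(configured);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        System.out.println(withoutPassThrough(EncodingFormat.FLOAT));
        System.out.println(withPassThrough(EncodingFormat.FLOAT));
        System.out.println(withPassThrough(null));
    }
}
```

Forwarding only non-null values keeps the SDK default in effect for users who never configure the option.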
Therefore, when Spring AI does not explicitly set the value, the SDK default `base64` is used. That is the regression introduced in M5.

Verification
Internal Unit Test
Spring Boot Reproducer - Before Fix, 2.0.0-M5
Spring Boot Verification - After Fix, 2.0.0-SNAPSHOT