Restore OpenAI embedding encoding format option #5902

Open
KoreaNirsa wants to merge 3 commits into spring-projects:main from KoreaNirsa:fix/GH-5901-openai-embedding-encoding-format

@KoreaNirsa (Contributor) commented Apr 29, 2026

Summary

This restores the OpenAI embedding encodingFormat option and passes it through to EmbeddingCreateParams.

In Spring AI 2.0.0-M5, the OpenAI model implementation migrated to the official OpenAI Java SDK. During that migration, the encodingFormat option previously available on OpenAiEmbeddingOptions was dropped. OpenAI Java SDK 4.28.0 defaults embedding requests to base64.

As a result, even when users configure spring.ai.openai.embedding.options.encoding-format=float, the actual request is sent with encoding_format: base64. This breaks some OpenAI-compatible embedding APIs, such as IONOS bge-m3, where float-array embeddings are expected.
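
To illustrate the difference between the two formats, here is a minimal sketch that converts between a float array and a base64 string. It assumes the base64 payload is a packed array of little-endian 32-bit floats, which is the commonly used layout for base64 embeddings. The class and method names are hypothetical and are not part of Spring AI or the OpenAI Java SDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper for illustration only.
final class EmbeddingEncodingSketch {

    private EmbeddingEncodingSketch() {
    }

    // Decodes a base64 string into floats, assuming packed little-endian float32 values.
    static float[] decodeBase64(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        if (bytes.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("Payload length is not a multiple of 4 bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] values = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getFloat();
        }
        return values;
    }

    // Encodes floats into the same assumed layout (the inverse of decodeBase64).
    static String encodeBase64(float[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float value : values) {
            buffer.putFloat(value);
        }
        return Base64.getEncoder().encodeToString(buffer.array());
    }
}
```

A server that only returns JSON float arrays never produces such a string, and a client that expects base64 cannot parse a float array as one, which is consistent with the mismatch described above.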

This change:

  • Restores encodingFormat on OpenAiEmbeddingOptions
  • Defaults it to float
  • Passes it to EmbeddingCreateParams
  • Preserves it across builder from(...), merge(...), and from(EmbeddingCreateParams)
  • Adds unit coverage for the default value, explicit base64, and copy/merge behavior
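
The copy and merge behavior listed above can be sketched roughly as follows. This is a simplified, hypothetical stand-in and not the actual OpenAiEmbeddingOptions code: the class name, the nullable field, and the resolvedEncodingFormat() helper are assumptions made for illustration.

```java
// Hypothetical stand-in for an options class; not the actual OpenAiEmbeddingOptions.
final class EmbeddingOptionsSketch {

    // Default proposed in this description; treated as an assumption of the sketch.
    static final String DEFAULT_ENCODING_FORMAT = "float";

    private final String model;
    private final String encodingFormat; // null means "not explicitly set"

    private EmbeddingOptionsSketch(String model, String encodingFormat) {
        this.model = model;
        this.encodingFormat = encodingFormat;
    }

    String model() {
        return model;
    }

    String encodingFormat() {
        return encodingFormat;
    }

    // Applies the default only when no explicit value was configured.
    String resolvedEncodingFormat() {
        return encodingFormat != null ? encodingFormat : DEFAULT_ENCODING_FORMAT;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String model;
        private String encodingFormat;

        Builder model(String model) {
            this.model = model;
            return this;
        }

        Builder encodingFormat(String encodingFormat) {
            this.encodingFormat = encodingFormat;
            return this;
        }

        // Copies every field, including encodingFormat, from another instance.
        Builder from(EmbeddingOptionsSketch other) {
            this.model = other.model;
            this.encodingFormat = other.encodingFormat;
            return this;
        }

        // Overrides only the fields that the other instance sets explicitly.
        Builder merge(EmbeddingOptionsSketch other) {
            if (other.model != null) {
                this.model = other.model;
            }
            if (other.encodingFormat != null) {
                this.encodingFormat = other.encodingFormat;
            }
            return this;
        }

        EmbeddingOptionsSketch build() {
            return new EmbeddingOptionsSketch(model, encodingFormat);
        }
    }
}
```

Keeping the field nullable and resolving the default late means that merging an options object that never set the format does not overwrite an explicit base64 choice with the default.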

Fixes gh-5901

Regression Cause

In v2.0.0-M4, OpenAiEmbeddingOptions exposed encodingFormat and mapped it to the encoding_format request field.

Evidence
OpenAiEmbeddingOptions in v2.0.0-M4 contains:

private @JsonProperty("encoding_format") String encodingFormat;

public String getEncodingFormat()
public void setEncodingFormat(String encodingFormat)

public Builder encodingFormat(String encodingFormat)

Source
https://raw.githubusercontent.com/spring-projects/spring-ai/v2.0.0-M4/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiEmbeddingOptions.java

In v2.0.0-M5, after the migration to the official OpenAI Java SDK, that option is no longer present in OpenAiEmbeddingOptions, and toOpenAiCreateParams(...) does not call builder.encodingFormat(...).

Evidence
OpenAiEmbeddingOptions in v2.0.0-M5 contains user and dimensions, but no encodingFormat, getEncodingFormat, setEncodingFormat, or builder encodingFormat(...).

Source
https://raw.githubusercontent.com/spring-projects/spring-ai/v2.0.0-M5/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiEmbeddingOptions.java

Therefore, when Spring AI does not explicitly set the value, the SDK default of base64 is used. That is the regression introduced in M5.
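
The cause described above can be modeled with a small sketch. It is not the real Spring AI or OpenAI SDK code: the SDK is replaced by a hypothetical applySdkDefaults method that fills in base64 when the field is absent (as this description states the SDK does), and the two mapping methods are simplified stand-ins for the M5 and fixed behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the regression; all names here are illustrative.
final class EncodingFormatRegressionSketch {

    // SDK default as stated in this PR description.
    static final String SDK_DEFAULT = "base64";

    private EncodingFormatRegressionSketch() {
    }

    // Stand-in for the SDK: fills in its default when the caller set nothing.
    static Map<String, String> applySdkDefaults(Map<String, String> params) {
        Map<String, String> body = new LinkedHashMap<>(params);
        body.putIfAbsent("encoding_format", SDK_DEFAULT);
        return body;
    }

    // M5-like mapping: the configured encoding format is never passed on.
    static Map<String, String> mapWithoutPassThrough(String model, String encodingFormat) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("model", model);
        return applySdkDefaults(params);
    }

    // Fixed mapping: an explicitly configured value reaches the request.
    static Map<String, String> mapWithPassThrough(String model, String encodingFormat) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("model", model);
        if (encodingFormat != null) {
            params.put("encoding_format", encodingFormat);
        }
        return applySdkDefaults(params);
    }
}
```

With the same configured value of float, the first mapping ends up with base64 in the request and the second with float, which mirrors the before and after behavior reported in this PR.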

Verification

Internal Unit Test

Command
.\mvnw.cmd -pl models/spring-ai-openai -Dtest=OpenAiEmbeddingOptionsTests test

Result
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
BUILD SUCCESS

Spring Boot Reproducer - Before Fix, 2.0.0-M5

Dependency
org.springframework.ai:spring-ai-starter-model-openai:2.0.0-M5

Application property
spring.ai.openai.api-key=test-key
spring.ai.openai.embedding.options.model=bge-m3
spring.ai.openai.embedding.options.encoding-format=float

Result
AssertionError because the request body did not contain "encoding_format":"float".

Spring Boot Verification - After Fix, 2.0.0-SNAPSHOT

Dependency
org.springframework.ai:spring-ai-starter-model-openai:2.0.0-SNAPSHOT

Application property
spring.ai.openai.api-key=test-key
spring.ai.openai.embedding.options.model=bge-m3
spring.ai.openai.embedding.options.encoding-format=float

Signed-off-by: KoreaNirsa <islandtim@naver.com>
@sdeleuze removed the request for review from tzolov April 29, 2026 15:51
@sdeleuze self-assigned this Apr 29, 2026
@sdeleuze (Contributor)

Thanks for this PR. I think we are going to keep the OpenAI SDK default (Base64) and document that in the upgrade notes.

Also I am tempted to expose this option with the strongly typed EmbeddingCreateParams.EncodingFormat enum provided by OpenAI SDK rather than one we artificially introduced at Spring AI level.

See my refined implementation at https://github.com/sdeleuze/spring-ai/tree/openai-embedding-encoding-format.

Signed-off-by: KoreaNirsa <islandtim@naver.com>
@KoreaNirsa (Contributor, Author)

@sdeleuze Thanks for the feedback and for sharing the refined implementation.

I hadn’t considered that approach, but I agree it is more appropriate. Aligning with the OpenAI SDK default and using EmbeddingCreateParams.EncodingFormat directly makes the implementation cleaner.

I’ve applied the suggested changes. Could you please take another look when you have a chance?

Thanks!

Development

Successfully merging this pull request may close these issues.

Spring AI 2.0.0-M5: Embedding fails with OpenAI-compatible APIs (Base64 string vs float[])