
Disable transformation of non-ascii characters on storing to CosmosDB #41144


Open: wants to merge 1 commit into base: main

Conversation

@zzulanas zzulanas commented May 16, 2025

Description

This PR updates the json.dumps() call to pass ensure_ascii=False so that Unicode characters are sent to CosmosDB unescaped, which it does support storing. If this parameter is left at its default (True), any non-ASCII character in a string is converted to an escaped Unicode sequence so that the output stays pure ASCII. That produces transformations like \\ud83c\\udf1f for 🌟, which results in an error in Cosmos when trying to store these values:

Code: InternalServerError
Message: unsupported Unicode escape sequence
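
For reference, the escaping behavior described above can be reproduced with the standard library alone. This is a minimal sketch using only `json`; it does not call Cosmos DB, and the sample document is made up for illustration.

```python
import json

# Illustrative document; the field names are assumptions, not from the SDK.
doc = {"id": "1", "text": "🌟"}

# Default (ensure_ascii=True): every non-ASCII character is escaped.
# Characters outside the Basic Multilingual Plane become a surrogate pair.
escaped = json.dumps(doc)

# ensure_ascii=False: the character is left as-is in the output string.
raw = json.dumps(doc, ensure_ascii=False)

print(escaped)  # {"id": "1", "text": "\ud83c\udf1f"}
print(raw)      # {"id": "1", "text": "🌟"}
```

Both strings are valid JSON and decode to the same Python object, so the difference is only in how the character travels over the wire.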

I'm assuming it's because the escape sequence and the Python interpreter do not play nice. I stumbled upon this issue when looking for answers and decided to try the solution proposed by @lidingshan in my local run, and it seems to work. To avoid an indefinite monkeypatch for my team, I am creating this PR to get this investigated/fixed.

Not sure if this is the best/long term solution for properly ensuring the unicode characters are stored in Cosmos, but it seems that Cosmos supports that storage, so idk why we would try and force the ASCII conversion here.
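
As a rough sketch of the idea (not the SDK's actual code), the following assumes the request body is produced by a single `json.dumps()` call and then encoded as UTF-8. `serialize_body` is a hypothetical helper name used only for this illustration.

```python
import json
from typing import Any


def serialize_body(body: Any) -> bytes:
    """Hypothetical helper: serialize a document without ASCII escaping."""
    # ensure_ascii=False keeps non-ASCII characters literal in the string.
    text = json.dumps(body, ensure_ascii=False)
    # Encode explicitly so the bytes on the wire are UTF-8.
    return text.encode("utf-8")


payload = serialize_body({"id": "1", "text": "🌟 café"})

# No \uXXXX escapes are present, and the payload round-trips unchanged.
assert b"\\u" not in payload
assert json.loads(payload.decode("utf-8"))["text"] == "🌟 café"
```

One thing to keep in mind with this approach: a string that contains a lone surrogate (for example `"\ud83d"` on its own) cannot be encoded as UTF-8, so `encode("utf-8")` raises `UnicodeEncodeError` for such input, whereas the default `ensure_ascii=True` output would have encoded without error.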

If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@zzulanas zzulanas requested a review from a team as a code owner May 16, 2025 19:15
@github-actions github-actions bot added the Community Contribution, Cosmos, and customer-reported labels May 16, 2025

Thank you for your contribution @zzulanas! We will review the pull request and get back to you soon.

@zzulanas
Author

@lmazuel @msyyc @kristapratico hi there, trying to get some attention on this. it's a quick fix and my team needs a fix on this soon.

@kristapratico
Member

@simorenoh can you take a look at this PR? Seems to be related to this open issue: #40373

@zzulanas
Author

Bumping this again

@simorenoh
Member

Hi @zzulanas, thanks for opening this PR and for using our SDK. I see what you're trying to do here and we appreciate the effort, but this PR is still missing the same change for the asynchronous client, as well as tests. We might be able to get to this next week, but if you'd like to get this done this week, please make sure you add both so we can properly review it.

A similar change was made to our Java SDK for the user agent, and it uses the same sort of algorithm that we would base this type of change on. I will leave it here for reference in case you are interested in following up on this: Azure/azure-sdk-for-java#40293

If not, we will follow up next week - thanks!

@simorenoh simorenoh added and then removed the needs-author-feedback label May 20, 2025
3 participants