Open
Description
The Java driver doesn't correctly process strings containing emojis (e.g. 😎). The call fails on the Rust side while parsing the received bytes.
Environment
- TypeDB distribution: Core
- TypeDB version: 3.0.0-alpha-6 and earlier
- Environment: Mac
Use a BDD test (available in our BDD repo as "cannot create database with an emoji" in connection/database):
Background:
Given typedb starts
Given connection opens with default authentication
Given connection is open: true
Given connection has 0 databases
Scenario: cannot create database with an incorrect name
Then connection create database: 😎; fails
The database name's parsing will fail with an error:
thread '<unnamed>' panicked at c/src/memory.rs:109:13:
called `Result::unwrap()` on an `Err` value: Utf8Error { valid_up_to: 0, error_len: Some(1) }
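If we print the received bytes on the Rust side, we see eda0bdedb88e, whereas the expected UTF-8 representation of this emoji is f09f988e. The received bytes are the emoji's two UTF-16 surrogates (U+D83D and U+DE0E), each encoded as its own three-byte sequence. This is what Java's modified UTF-8 produces for characters outside the BMP, and it is not valid standard UTF-8.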
Characters inside the Basic Multilingual Plane (like Chinese characters) are processed correctly. The issue seems to be exclusive to characters outside of the BMP.
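The byte difference can be reproduced in plain Java, without the driver. This is a minimal sketch, assuming the bytes reaching Rust come from JNI's modified UTF-8 (which this issue has not confirmed yet). `EmojiBytes` and its methods are hypothetical helper names, and `DataOutputStream.writeUTF` is used here only because it emits the same modified UTF-8 encoding after a 2-byte length prefix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the driver: compares standard UTF-8
// with Java's modified UTF-8 for the same string.
final class EmojiBytes {
    // Standard UTF-8: supplementary characters become one 4-byte sequence.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8: each UTF-16 surrogate is encoded separately as 3 bytes.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(s);
            out.flush();
            byte[] withLength = buffer.toByteArray();
            // Drop the 2-byte length prefix that writeUTF adds.
            return Arrays.copyOfRange(withLength, 2, withLength.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lowercase hex dump, matching the format used in this issue.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE0E"; // U+1F60E, outside the BMP
        String cjk = "\u4E2D";         // U+4E2D, inside the BMP
        System.out.println(hex(standardUtf8(emoji))); // f09f988e
        System.out.println(hex(modifiedUtf8(emoji))); // eda0bdedb88e
        System.out.println(hex(standardUtf8(cjk)));   // e4b8ad
        System.out.println(hex(modifiedUtf8(cjk)));   // e4b8ad
    }
}
```

The BMP character yields identical bytes in both encodings, while the emoji does not, which is consistent with the behaviour described above.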
We'll need to modify string processing in SWIG for Java, and probably for other languages too. Python at least works correctly; the others will be tested later.