Unicode-safe String manipulation utilities for Rust, operating at the level of extended grapheme clusters (user-perceived characters) as defined in [Unicode Standard Annex #29](https://www.unicode.org/reports/tr29/).

benracine/grapheme-cluster-utils

grapheme-cluster-utils

Unicode-safe String manipulation utilities for Rust, operating at the level of extended grapheme clusters (user-perceived characters) as defined in Unicode Standard Annex #29.

Features

  • Remove, insert, or replace grapheme clusters in a String without breaking multi-codepoint characters (e.g., emoji, flags, accented letters).
  • Built on top of the unicode-segmentation crate.

Why?

Rust's standard String and char APIs operate on bytes and Unicode scalar values, not user-perceived characters: a String is indexed by byte offset, and a char is a single scalar value. One emoji, flag, or accented letter can span several scalar values, so this crate provides helpers that manipulate strings at the grapheme cluster level without accidentally splitting one.
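To make the problem concrete, here is a minimal sketch that uses only the standard library. The helper names `scalar_count` and `remove_scalar_at` are made up for this illustration and are not part of this crate. It shows that the astronaut emoji consists of three scalar values, and that removing a single `char` leaves a dangling joiner and a rocket behind.

```rust
/// Number of Unicode scalar values (`char`s) in `s`.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// Removes the single `char` that starts at byte offset `byte_idx`
/// using `String::remove`, and returns what is left of the string.
fn remove_scalar_at(s: &str, byte_idx: usize) -> String {
    let mut owned = String::from(s);
    owned.remove(byte_idx); // panics if `byte_idx` is not on a char boundary
    owned
}

fn main() {
    // Woman + zero-width joiner + rocket: rendered as one astronaut emoji.
    let astronaut = "\u{1F469}\u{200D}\u{1F680}";
    assert_eq!(scalar_count(astronaut), 3);
    assert_eq!(astronaut.len(), 11); // UTF-8 bytes: 4 + 3 + 4

    // Removing "one character" at byte 3 only removes the woman,
    // leaving the joiner and the rocket in the string.
    let s = format!("hi {astronaut}!");
    let broken = remove_scalar_at(&s, 3);
    assert_eq!(broken, "hi \u{200D}\u{1F680}!");
}
```

The string literals use `\u{...}` escapes so that the zero-width joiner stays visible in the source.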

Examples

use grapheme_cluster_utils::GraphemeClusterUtils;

// Remove the astronaut emoji (grapheme cluster at index 3)
let s = String::from("hi 👩‍🚀!");
let result = s.remove_grapheme_at(3);
assert_eq!(result, "hi !");

// Insert a globe emoji after the space (index 3)
let s = String::from("hi !");
let result = s.insert_grapheme_at(3, "🌍");
assert_eq!(result, "hi 🌍!");

// Replace the astronaut emoji with a globe
let s = String::from("hi 👩‍🚀!");
let result = s.replace_grapheme_at(3, "🌍");
assert_eq!(result, "hi 🌍!");

API

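Trait: GraphemeClusterUtils

Each method takes a zero-based grapheme cluster index n and returns a new String; the receiver is not modified.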

  • fn remove_grapheme_at(&self, n: usize) -> String
  • fn insert_grapheme_at(&self, n: usize, insert: &str) -> String
  • fn replace_grapheme_at(&self, n: usize, replacement: &str) -> String
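The semantics of the three methods can be sketched on a string that has already been split into grapheme clusters. This is an illustration only, not this crate's implementation: the functions `remove_cluster`, `insert_cluster`, and `replace_cluster` are hypothetical, the clusters are split by hand so the sketch needs nothing beyond the standard library (a real implementation would presumably get them from unicode-segmentation), and the out-of-range behavior shown here is an assumption, since the README does not specify it.

```rust
/// Hypothetical helper: rebuilds the string without cluster `n`.
/// Assumption: an out-of-range `n` leaves the text unchanged.
fn remove_cluster(clusters: &[&str], n: usize) -> String {
    clusters
        .iter()
        .enumerate()
        .filter(|(i, _)| *i != n)
        .map(|(_, c)| *c)
        .collect()
}

/// Hypothetical helper: inserts `insert` before cluster `n`.
/// Assumption: an out-of-range `n` is clamped, so the text is appended.
fn insert_cluster(clusters: &[&str], n: usize, insert: &str) -> String {
    let n = n.min(clusters.len());
    let mut out = clusters[..n].concat();
    out.push_str(insert);
    out.push_str(&clusters[n..].concat());
    out
}

/// Hypothetical helper: swaps cluster `n` for `replacement`.
/// Assumption: an out-of-range `n` leaves the text unchanged.
fn replace_cluster(clusters: &[&str], n: usize, replacement: &str) -> String {
    clusters
        .iter()
        .enumerate()
        .map(|(i, c)| if i == n { replacement } else { *c })
        .collect()
}

fn main() {
    let astronaut = "\u{1F469}\u{200D}\u{1F680}"; // one cluster, three chars
    let globe = "\u{1F30D}";

    // "hi <astronaut>!" split by hand into its five grapheme clusters.
    let clusters = ["h", "i", " ", astronaut, "!"];

    assert_eq!(remove_cluster(&clusters, 3), "hi !");
    assert_eq!(insert_cluster(&["h", "i", " ", "!"], 3, globe), "hi \u{1F30D}!");
    assert_eq!(replace_cluster(&clusters, 3, globe), "hi \u{1F30D}!");
}
```

Because every operation works on whole clusters, the astronaut emoji is either kept intact or dropped entirely; it is never split at the joiner.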

Installation

Add to your Cargo.toml:

[dependencies]
grapheme-cluster-utils = "0.1.0"

License

MIT OR Apache-2.0
