Add support for explicit string character code ranges #5996

@josephjunker

🚀 Feature Request

It would be nice to have a way to specify more fine-grained character code ranges for string generation.

Motivation

I want to generate alphanumeric strings, but adding a generator for that exact case feels overly specific. If I could select the allowed characters in a more fine-grained way, I could easily construct whatever specific arbitrary I needed.

Example

// generates a string consisting of single characters from the provided string
const hexGenerator = fc.string({ fromChars: "1234567890ABCDEF" });

// generates a string consisting of characters selected from the union of the below ranges
const alphanumericGenerator = fc.string({
    fromChars: fc.stringBuilders.charRange({
        // numerals
        encoding: "ascii",
        start: 48,
        end: 57
    }).concat(fc.stringBuilders.charRange({
        // uppercase letters
        encoding: "ascii",
        start: 65
        end: 90
    }).concat(fc.stringBuilders.charRange({
        // lowercase letters
        encoding: "ascii",
        start: 97,
        end: 122
    });

fromChars may have wrinkles with multi-codepoint characters. I'm not sure off the top of my head how to handle these, but I feel like this would still be useful even with edge cases around combining characters?
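
To make the wrinkle concrete, here is a small sketch of how one visible character can split differently depending on what "character" means. It uses only built-in string operations; the helper names `codeUnits` and `codePoints` are made up for illustration and are not part of fast-check.

```typescript
// One code point outside the BMP: stored as two UTF-16 code units.
const thumbsUp = "\u{1F44D}";
// "e" followed by a combining acute accent: two code points, one visible character.
const accented = "e\u0301";

// Hypothetical helpers, named only for this illustration.
// Splits on UTF-16 code units, so surrogate pairs are torn apart.
const codeUnits = (s: string): string[] => s.split("");
// Splits on code points, so surrogate pairs stay together,
// but combining sequences are still separated.
const codePoints = (s: string): string[] => Array.from(s);

console.log(codeUnits(thumbsUp).length);
console.log(codePoints(thumbsUp).length);
console.log(codePoints(accented).length);
```

A `fromChars` that splits by code point would handle the first case but would still treat the accent as its own selectable character. Splitting by grapheme cluster (for example with `Intl.Segmenter`, where available) is one possible answer, though that is an assumption rather than something this proposal specifies.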

charRange is verbose but would have fewer edge cases. It feels like this sort of semigroup/builder-y interface could be extended to handle combining characters if one wanted. The interface is verbose so if this existed it might be nice to provide some common combinations out-of-the-box.
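
As a rough sketch of what the builder side could look like, the snippet below models a set of ranges as plain data with a `union` operation, plus a few presets of the "common combinations" kind. Everything here is hypothetical: `CharRange`, `charRange`, `union`, `toChars` and the preset names are not fast-check APIs, and the `encoding` option from the example is left out on the assumption that ranges are plain code points.

```typescript
// Hypothetical sketch: none of these names exist in fast-check.
type CharRange = { start: number; end: number }; // both bounds inclusive

function charRange(start: number, end: number): CharRange[] {
  if (start > end) {
    throw new RangeError("start must not be greater than end");
  }
  return [{ start, end }];
}

// Semigroup-style combination: the result covers every input range.
function union(...sets: CharRange[][]): CharRange[] {
  const out: CharRange[] = [];
  for (const set of sets) {
    for (const range of set) {
      out.push(range);
    }
  }
  return out;
}

// Expand ranges into the distinct characters they cover, in range order.
function toChars(set: CharRange[]): string[] {
  const seen = new Set<string>();
  for (const { start, end } of set) {
    for (let cp = start; cp <= end; cp++) {
      seen.add(String.fromCodePoint(cp));
    }
  }
  return Array.from(seen);
}

// Presets that could ship out of the box.
const digits = charRange(48, 57);
const upper = charRange(65, 90);
const lower = charRange(97, 122);
const alphanumeric = union(digits, upper, lower);

console.log(toChars(alphanumeric).join(""));
```

The expanded list from `toChars` could then be handed to an existing character-level arbitrary such as `fc.constantFrom(...chars)`, with a string arbitrary built on top of it.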

The two could be combined, I think. Maybe a fromAsciiChars function which returns these charRange objects? That would solve the edge cases with fromChars. It feels like it would be possible to generate multi-codepoint characters using this sort of combinator approach but I don't have an answer for exactly what that would look like.
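
A minimal sketch of that combination, assuming `fromAsciiChars` simply rejects anything outside ASCII and collapses the remaining characters into sorted, merged ranges. The function and the `CharRange` shape are hypothetical and only illustrate the idea.

```typescript
// Hypothetical sketch: fromAsciiChars is the name floated above, not a fast-check API.
type CharRange = { start: number; end: number }; // both bounds inclusive

function fromAsciiChars(chars: string): CharRange[] {
  const codes: number[] = [];
  for (let i = 0; i < chars.length; i++) {
    const code = chars.charCodeAt(i);
    // Rejecting non-ASCII input sidesteps the multi-codepoint edge cases entirely.
    if (code > 0x7f) {
      throw new RangeError("Non-ASCII character at index " + i);
    }
    codes.push(code);
  }
  codes.sort((a, b) => a - b);

  // Merge duplicate and adjacent codes into inclusive ranges.
  const ranges: CharRange[] = [];
  for (const code of codes) {
    const last = ranges[ranges.length - 1];
    if (last !== undefined && code <= last.end + 1) {
      last.end = Math.max(last.end, code);
    } else {
      ranges.push({ start: code, end: code });
    }
  }
  return ranges;
}

const hexRanges = fromAsciiChars("1234567890ABCDEF");
console.log(hexRanges);
```

For the hex example this yields two ranges, one for the digits (48 to 57) and one for `A` to `F` (65 to 70), which is the same shape the `charRange` calls produce by hand.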


Side note, I think fast-check is awesome, and if a feature like this was desired I would be happy to try to implement it myself. I didn't want to attempt a drive-by PR for something that may not fit the overall design goals of the library.
