No Control Over Scalar Data Representation in Yams Encoder #459

@ajwitt212

Executive Summary

The current default encoding behavior in Yams, while compliant with the YAML 1.1 specification, causes unintended data type conversions that can break interoperability with systems expecting strict data formats. Specifically, the integers 1 and 0 are implicitly converted to the booleans true and false, and floating-point numbers may be written in scientific notation. This proposal requests explicit configuration options on Yams.Encoder that give developers granular control over these conversions, ensuring data integrity and predictability.

The Problem: Implicit Type Conversion

The Yams encoder adheres to the YAML 1.1 standard, which defines implicit typing rules for scalars. This creates two primary challenges:

Integer-to-Boolean Coercion: The integer 1 is serialized as the boolean true, and 0 as false. With the default settings it is therefore impossible to write the literal integers 1 or 0 to a YAML document, which is a critical issue for configurations, database IDs, or enums where 1 and 0 are valid, distinct integer values.

Uncontrolled Scientific Notation: Very large or very small floating-point numbers are automatically written in scientific notation (e.g., 1.2e-5). Although this representation is compact, many parsers and configuration systems either do not support it or require standard decimal format.

This lack of control forces developers to implement workarounds or accept a YAML output that may not match their application's requirements.
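To make the floating-point case concrete, here is a minimal sketch of the kind of workaround this currently requires: pre-formatting a Double as a plain decimal string before handing it to the encoder. The helper name plainDecimalString and its maxFractionDigits parameter are hypothetical and not part of Yams. It relies only on Foundation's String(format:).

```swift
import Foundation

/// Hypothetical helper (not part of Yams): renders a Double in plain
/// decimal notation instead of scientific notation.
func plainDecimalString(_ value: Double, maxFractionDigits: Int = 15) -> String {
    // Non-finite values have no decimal form; mapping them to YAML's
    // own spellings is left to the caller.
    guard value.isFinite else { return "\(value)" }

    // Always keep at least one fraction digit so the result still reads as a float.
    let digits = max(1, maxFractionDigits)
    var text = String(format: "%.\(digits)f", value)

    // Trim trailing zeros, but stop at "x.0".
    while text.hasSuffix("0") && !text.hasSuffix(".0") {
        text.removeLast()
    }
    return text
}
```

Note that a value formatted this way is a String, so an encoder might quote it in the output. That is part of why an encoder-level option would be preferable to a call-site workaround.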

Proposed Solution: Encoder Configuration Options

To address this, we propose adding clear and explicit options to the Yams.Encoder to manage scalar representation.

A. Control Integer vs. Boolean Encoding

An option should be introduced to disable the implicit conversion of integers to booleans.

Proposed Option: encodeIntegersAsBooleans: Bool (Default: true for backward compatibility)
Behavior:
When true (default), 1 and 0 are encoded as true and false.
When false, all Int values, including 1 and 0, are encoded as their literal integer representation.
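The intended semantics of the option can be sketched in isolation. The following is a self-contained illustration of the behavior described above, not Yams code: ScalarEncodingOptions and encodeIntegerScalar are hypothetical names, and only encodeIntegersAsBooleans comes from this proposal.

```swift
/// Hypothetical stand-in for the proposed encoder option; not part of the Yams API.
struct ScalarEncodingOptions {
    /// Defaults to true to mirror the current behavior described in this issue.
    var encodeIntegersAsBooleans: Bool = true
}

/// Illustrates how an integer scalar would be rendered under the proposed option.
func encodeIntegerScalar(
    _ value: Int,
    options: ScalarEncodingOptions = ScalarEncodingOptions()
) -> String {
    if options.encodeIntegersAsBooleans {
        // Current behavior as reported: 1 and 0 become booleans.
        switch value {
        case 1: return "true"
        case 0: return "false"
        default: break
        }
    }
    // With the option disabled, every Int keeps its literal representation.
    return String(value)
}

// Usage: opting out of the coercion.
let strict = ScalarEncodingOptions(encodeIntegersAsBooleans: false)
print(encodeIntegerScalar(1))                  // true
print(encodeIntegerScalar(1, options: strict)) // 1
```

Keeping the default at true preserves existing output for current users, while setting it to false lets 1 and 0 round-trip as integers.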
