Commit 264aff2

feat(typia): add LLMData derive and native lenient parser (#335)
## Summary

- add a serde-based `LLMData` runtime trait to `typia` with `parse`, `validate`, and `stringify`
- add internal native lenient JSON parsing (markdown code block extraction, junk-prefix skipping, comments, unquoted keys, trailing commas, partial keyword recovery, unclosed recovery, unicode surrogate decoding, depth guard)
- replace `typia-macros` scaffold with real `#[derive(LLMData)]` supporting structs/enums and rejecting unions
- add runtime + compile-fail tests for parser behavior, serde path failures, and derive constraints
- update typia contracts/docs and align AGENTS wording for typia stability contracts

## Validation

- `cargo fmt --all`
- `TRYBUILD=overwrite cargo test -p typia derive_llm_data_rejects_union -- --exact`
- `cargo test`
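
The summary lists trailing commas among the inputs the lenient parser tolerates. The parser source is not shown in this part of the diff, so the following is only a minimal, standard-library sketch of how such a recovery step might look. `strip_trailing_commas` is a hypothetical helper name, not an identifier from `typia`, and the real parser may handle this inline while tokenizing rather than as a separate pass.

```rust
/// Hypothetical helper (not typia's actual implementation): drops commas that
/// are followed only by whitespace and a closing `}` or `]`, leaving commas
/// inside string literals untouched.
fn strip_trailing_commas(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut in_string = false;
    let mut escaped = false;

    for (i, &c) in chars.iter().enumerate() {
        if in_string {
            // Copy string contents verbatim, tracking escapes so that an
            // escaped quote does not end the string.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => {
                in_string = true;
                out.push(c);
            }
            ',' => {
                // Look ahead past whitespace for a closing bracket.
                let next = chars[i + 1..]
                    .iter()
                    .copied()
                    .find(|ch| !ch.is_whitespace());
                if !matches!(next, Some('}') | Some(']')) {
                    out.push(c);
                }
            }
            _ => out.push(c),
        }
    }
    out
}

fn main() {
    let cleaned = strip_trailing_commas("{\"a\": [1, 2,], \"b\": \",}\",}");
    println!("{}", cleaned);
}
```

The sketch only removes commas outside string literals, which is why it tracks quote and escape state instead of using a plain text replacement.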
1 parent cc7144e commit 264aff2

16 files changed

Lines changed: 1270 additions & 65 deletions

AGENTS.md

Lines changed: 3 additions & 3 deletions
@@ -55,10 +55,10 @@
 - `docs/project-thenv.md`: Thenv multi-component project index.
 - `docs/project-public-docs.md`: Public docs app project index.
 - `docs/project-serde-feather.md`: Serde Feather multi-crate project index.
-- `docs/project-typia.md`: Typia scaffold-stage multi-crate project index.
+- `docs/project-typia.md`: Typia multi-crate project index.
 - `docs/project-dexdex.md`: DexDex multi-runtime project index.
-- `docs/crates-typia-core-foundation.md`: Typia core runtime scaffold contract.
-- `docs/crates-typia-macros-foundation.md`: Typia macros scaffold contract.
+- `docs/crates-typia-core-foundation.md`: Typia core runtime LLM data contract.
+- `docs/crates-typia-macros-foundation.md`: Typia macros derive contract.
 - `docs/apps-dexdex-desktop-app-foundation.md`: DexDex app runtime and integration foundation contract.
 - `docs/apps-dexdex-ui-contract.md`: DexDex UI and interaction contract.
 - `docs/apps-dexdex-user-guide-contract.md`: DexDex end-user workflow contract.

Cargo.lock

Lines changed: 24 additions & 0 deletions
Some generated files are not rendered by default.

crates/AGENTS.md

Lines changed: 4 additions & 3 deletions
@@ -11,8 +11,8 @@
 - `crates/nodeup`: Rust-based Node.js version manager.
 - `crates/serde-feather`: Size-first serde runtime-facing core crate.
 - `crates/serde-feather-macros`: Proc-macro companion crate for serde-feather.
-- `crates/typia`: Type-safe JSON schema validation core scaffold crate.
-- `crates/typia-macros`: Proc-macro companion scaffold crate for typia.
+- `crates/typia`: Serde-based LLM JSON runtime crate.
+- `crates/typia-macros`: Proc-macro derive companion crate for typia.
 
 ### Rust Workspace Rules
 
@@ -46,7 +46,8 @@
 ### typia-Specific Rules
 
 - Keep `typia` as the runtime-facing crate and `typia-macros` as the proc-macro companion crate.
-- Keep scaffold-stage API policy explicit: do not treat v0 public identifiers as stable until documented in `docs/project-typia.md` and `docs/crates-typia-core-foundation.md`.
+- Keep stable typia identifiers (`LLMData`, `LlmJsonParseResult`, `LlmJsonParseError`, and `#[derive(LLMData)]`) synchronized with `docs/project-typia.md`, `docs/crates-typia-core-foundation.md`, and `docs/crates-typia-macros-foundation.md`.
+- Keep non-contracted v0 identifiers explicitly documented as unstable until promoted in typia contract docs.
 - Keep future macro/runtime compatibility constraints synchronized with typia project and crate contracts.
 
 ### Multi-Component Contract Sync

crates/typia-macros/Cargo.toml

Lines changed: 5 additions & 1 deletion
@@ -3,10 +3,14 @@ name = "typia-macros"
 version = "0.1.0"
 edition = "2024"
 license = "MIT"
-description = "Proc-macro scaffold crate for typia"
+description = "Proc-macro derive crate for typia"
 publish = false
 
 [lib]
 proc-macro = true
 
 [dependencies]
+proc-macro-crate = "3.3.0"
+proc-macro2 = "1.0.94"
+quote = "1.0.39"
+syn = { version = "2.0.100", features = ["full"] }

crates/typia-macros/src/lib.rs

Lines changed: 44 additions & 12 deletions
@@ -1,18 +1,50 @@
 #![forbid(unsafe_code)]
 
-//! Proc-macro scaffold crate for `typia`.
-//!
-//! Public derive identifiers are intentionally not stabilized yet.
+//! Proc-macro derive implementation for `typia`.
 
 use proc_macro::TokenStream;
+use proc_macro_crate::{FoundCrate, crate_name};
+use proc_macro2::{Span, TokenStream as TokenStream2};
+use quote::quote;
+use syn::{Data, DeriveInput, Ident, parse_macro_input};
 
-/// Internal-only scaffold macro used to keep the proc-macro crate wired into
-/// the workspace.
-///
-/// This identifier is intentionally not part of the stable public contract and
-/// may change.
-#[doc(hidden)]
-#[proc_macro]
-pub fn __typia_scaffold(input: TokenStream) -> TokenStream {
-    input
+#[proc_macro_derive(LLMData)]
+pub fn derive_llm_data(input: TokenStream) -> TokenStream {
+    let input = parse_macro_input!(input as DeriveInput);
+    match expand_llm_data(&input) {
+        Ok(tokens) => tokens.into(),
+        Err(error) => error.into_compile_error().into(),
+    }
+}
+
+fn expand_llm_data(input: &DeriveInput) -> syn::Result<TokenStream2> {
+    match input.data {
+        Data::Struct(_) | Data::Enum(_) => {}
+        Data::Union(_) => {
+            return Err(syn::Error::new_spanned(
+                input,
+                "`LLMData` can only be derived for structs and enums",
+            ));
+        }
+    }
+
+    let typia_path = typia_path();
+    let ident = &input.ident;
+    let generics = &input.generics;
+    let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
+
+    Ok(quote! {
+        impl #impl_generics #typia_path::LLMData for #ident #ty_generics #where_clause {}
+    })
+}
+
+fn typia_path() -> TokenStream2 {
+    match crate_name("typia") {
+        Ok(FoundCrate::Itself) => quote!(crate),
+        Ok(FoundCrate::Name(name)) => {
+            let ident = Ident::new(&name.replace('-', "_"), Span::call_site());
+            quote!(::#ident)
+        }
+        Err(_) => quote!(::typia),
+    }
 }
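
The derive above emits an empty `impl ... LLMData for T {}` and copies the type's own generics and where-clause without adding bounds. That shape only works when every trait method has a default body. The sketch below illustrates the pattern with the standard library alone. `Describe`, `User`, and `Wrapper` are hypothetical stand-ins, and the real `LLMData` trait's bounds and method bodies are not shown in this part of the diff.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for a trait like `LLMData`: every method has a
/// default body, so an implementor needs no items of its own.
trait Describe: Debug {
    fn describe(&self) -> String {
        format!("{:?}", self)
    }
}

#[allow(dead_code)]
#[derive(Debug)]
struct User {
    id: u32,
}

// The bound lives on the type itself, so an impl that merely copies the
// type's generics and where-clause still satisfies the `Debug` supertrait.
#[allow(dead_code)]
#[derive(Debug)]
struct Wrapper<T>
where
    T: Debug,
{
    inner: T,
}

// What a derive of this shape would emit for a plain struct...
impl Describe for User {}

// ...and for a generic one: generics and where-clause carried over verbatim.
impl<T> Describe for Wrapper<T> where T: Debug {}

fn main() {
    println!("{}", User { id: 1 }.describe());
    println!("{}", Wrapper { inner: 2 }.describe());
}
```

One consequence of this design is that a generic type must declare any bounds the trait needs on itself, because the macro does not synthesize them.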

crates/typia/Cargo.toml

Lines changed: 12 additions & 0 deletions
@@ -4,5 +4,17 @@ version = "0.1.0"
 edition = "2024"
 license = "MIT"
 description = "Type-safe JSON schema validation for Rust"
+publish = false
+
+[features]
+default = ["derive"]
+derive = ["dep:typia-macros"]
 
 [dependencies]
+serde = { version = "1.0.219", features = ["derive"] }
+serde_json = "1.0.145"
+serde_path_to_error = "0.1.17"
+typia-macros = { version = "0.1.0", path = "../typia-macros", optional = true }
+
+[dev-dependencies]
+trybuild = "1.0.114"

crates/typia/README.md

Lines changed: 40 additions & 13 deletions
@@ -1,25 +1,52 @@
 # typia
 
-`typia` is a scaffold-stage Rust project for type-safe JSON schema validation.
+`typia` provides serde-based LLM JSON utilities for Rust.
 
-## Goal and Current Status
+## Stable APIs
 
-- Goal: provide a stable runtime foundation for type-safe validation and schema workflows.
-- Current status: scaffold only; the crate does not expose stabilized public APIs yet.
-- API policy: v0 public function/type and macro identifiers are intentionally not frozen at this stage.
+- `LLMData` trait
+  - `parse(input: &str) -> LlmJsonParseResult<Self>`
+  - `validate(value: serde_json::Value) -> Result<Self, serde_json::Error>`
+  - `stringify(&self) -> Result<String, serde_json::Error>`
+- `LlmJsonParseResult<T>`
+- `LlmJsonParseError`
+- `#[derive(LLMData)]` (from `typia-macros`, re-exported by `typia`)
 
-## Component Architecture
+## Example
 
-- `core`: `crates/typia` (active scaffold)
-- `macros`: `crates/typia-macros` (active scaffold)
+```rust
+use typia::{
+    LLMData,
+    serde::{Deserialize, Serialize},
+};
 
-Future macro-generated code must remain compatible with the core runtime contracts.
+#[derive(Debug, Serialize, Deserialize, LLMData)]
+struct User {
+    id: u32,
+    name: String,
+}
 
-## Current Non-Goals
+fn main() {
+    let parsed = User::parse("{id: 1, name: \"alice\",}");
+    println!("{parsed:?}");
+}
+```
+
+## Lenient Parser Behaviors
+
+`LLMData::parse()` uses typia's internal lenient JSON parser before serde validation.
+
+Supported recovery behaviors:
 
-- Freezing public runtime API identifiers before scaffold-stage contracts are finalized.
-- Defining stable derive macro names or expansion schemas before macro interface contracts are stabilized.
-- Providing production-ready validation semantics before core and macro contracts are documented as active.
+- markdown ` ```json ... ``` ` extraction
+- junk prefix skipping before JSON payloads
+- JavaScript comments (`//`, `/* ... */`)
+- unquoted object keys
+- trailing commas
+- partial keywords (`tru`, `fal`, `nul`)
+- unclosed strings/brackets with partial recovery
+- unicode escapes (including surrogate-pair decoding)
+- depth guard (`MAX_DEPTH = 512`)
 
 ## Local Validation
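
The README's recovery list mentions partial keywords such as `tru`, `fal`, and `nul`. As a rough illustration of what that recovery could involve, here is a minimal standard-library sketch. `complete_keyword` and `KEYWORDS` are hypothetical names, not part of typia's API, and the actual parser may apply stricter rules (for example, only completing a keyword at the end of truncated input).

```rust
/// The three JSON keyword literals.
const KEYWORDS: [&str; 3] = ["true", "false", "null"];

/// Hypothetical helper (not typia's actual implementation): completes a
/// truncated JSON keyword. Returns `None` when the token is empty or is not
/// a prefix of `true`, `false`, or `null`.
fn complete_keyword(token: &str) -> Option<&'static str> {
    if token.is_empty() {
        return None;
    }
    // The keywords start with distinct letters, so at most one can match.
    KEYWORDS
        .iter()
        .copied()
        .find(|keyword| keyword.starts_with(token))
}

fn main() {
    for token in ["tru", "fal", "nul", "maybe"] {
        println!("{:?} -> {:?}", token, complete_keyword(token));
    }
}
```

Because `true`, `false`, and `null` begin with different letters, prefix matching is unambiguous even for a single character.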
