
Commit 89b3a10

seayang-nv, binaryaaron, and memadi-nv authored
chore: add field instructions and fixed cross references (#138)
<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
<!-- SPDX-License-Identifier: Apache-2.0 -->

<!-- Thank you for contributing to Safe Synthesizer! -->

# Summary

<!-- Brief description of changes -->

## Pre-Review Checklist

<!-- These checks should be completed before a PR is reviewed, -->
<!-- but you can submit a draft early to indicate that the issue is being worked on. -->

Ensure that the following pass:

- [ ] `make format && make lint` or via prek validation.
- [ ] `make test` passes locally
- [ ] `make test-e2e` passes locally
- [ ] `make test-ci-container` passes locally (recommended)

## Pre-Merge Checklist

<!-- These checks need to be completed before a PR is merged, -->
<!-- but as PRs often change significantly during review, -->
<!-- it's OK for them to be incomplete when review is first requested. -->

- [ ] New or updated tests for any fix or new behavior
- [ ] Updated documentation for new features and behaviors, including docstrings for API docs.

## Other Notes

<!-- Please add the issue number that should be closed when this PR is merged. -->

- Closes #<issue>

---------

Signed-off-by: Sean Yang <seayang@nvidia.com>
Signed-off-by: memadi <memadi@nvidia.com>
Signed-off-by: aagonzales <aagonzales@nvidia.com>
Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>
Co-authored-by: Aaron Gonzales <aagonzales@nvidia.com>
Co-authored-by: Marjan Emadi <memadi@nvidia.com>
1 parent e780c22 commit 89b3a10

1 file changed

Lines changed: 69 additions & 13 deletions

STYLE_GUIDE.md

@@ -459,6 +459,49 @@ Put constructor `Args:` in the class docstring, not on `__init__`. IDEs (Cursor,
 
 Include both when a class has nontrivial constructor parameters AND public attributes worth documenting. For simple classes where the args and attributes are the same fields, `Args:` alone is sufficient.
 
+#### Field-level docstrings for Pydantic models and dataclasses
+
+Pydantic models and dataclasses: document each field with an inline docstring immediately after the field definition. Omit the `Attributes:` section to avoid duplication. Regular classes that set attributes in `__init__` should still use `Attributes:` (as in the Tier 3 example above).
+
+```python
+class RopeScaling(BaseModel):
+    """Rotary Position Embedding (RoPE) scaling configuration.
+
+    Encapsulates the parameters needed to extend a model's context
+    window via RoPE scaling.
+    """
+
+    rope_type: Annotated[
+        Literal["linear", "dynamic", "default", "yarn", "llama3"],
+        Field(description="Type of rope scaling"),
+    ] = "default"
+    """Scaling algorithm (``"linear"``, ``"dynamic"``, ``"default"``, ``"yarn"``, or ``"llama3"``)."""
+
+    factor: Annotated[float, Field(description="Multiplier for rope scaling")] = 1.0
+    """Context-window multiplier, clamped to ``MAX_ROPE_SCALING_FACTOR``."""
+
+    theta: Annotated[float, Field(description="Theta for rope scaling")] = 10000.0
+    """Base frequency for rotary embeddings."""
+```
+
+When a field uses `Field(description=...)`, duplicate the description in the inline docstring.
+
+Fields without defaults, with defaults, and with `Field()` all follow the same pattern:
+
+```python
+model_name: str
+"""HuggingFace model identifier or local path."""
+
+is_adapter: bool = False
+"""Whether an adapter checkpoint is loaded."""
+
+base_max_seq_length: Annotated[
+    int | None,
+    Field(description="Context window before rope scaling"),
+] = None
+"""Context window size before RoPE scaling."""
+```
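Editorial note, not part of the diff: the rule that the docstring sits immediately after the field matters because these string literals are not stored on the class at runtime; source-reading tools pair each literal with the assignment directly above it. The sketch below illustrates that pairing with the standard-library `ast` module. `ModelInfo`, its `tags` field, and `field_docstrings` are hypothetical names, and the walk only approximates what a documentation generator might do.

```python
import ast
import textwrap

# Hypothetical dataclass source, kept as text so it can be parsed without importing anything.
SOURCE = textwrap.dedent('''
    @dataclass
    class ModelInfo:
        """Hypothetical model description used only for this sketch."""

        model_name: str
        """HuggingFace model identifier or local path."""

        is_adapter: bool = False
        """Whether an adapter checkpoint is loaded."""

        tags: list[str] = field(default_factory=list)
        """Free-form labels attached to the model."""
''')


def field_docstrings(source: str, class_name: str) -> dict[str, str]:
    """Map each annotated field to the string literal that directly follows it."""
    tree = ast.parse(source)
    cls = next(
        node
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef) and node.name == class_name
    )
    docs: dict[str, str] = {}
    # Walk adjacent statement pairs: (field definition, candidate docstring).
    for stmt, following in zip(cls.body, cls.body[1:]):
        is_field = isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
        is_doc = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_field and is_doc:
            docs[stmt.target.id] = following.value.value
    return docs


docs = field_docstrings(SOURCE, "ModelInfo")
for name, text in docs.items():
    print(f"{name}: {text}")
```

If a blank statement or comment-only gap is fine but another statement sits between the field and its string, the pairing in this sketch is lost, which is the situation the guideline is meant to prevent.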
+
 #### Before and after
 
 Vague module docstring:
@@ -508,36 +551,37 @@ def teardown(self) -> None:
     """
 ```
 
-Class without Attributes section:
+Pydantic model without field documentation:
 
 ```python
 # Before
 class SafeSynthesizerParameters(Parameters):
     """Main configuration class for the Safe Synthesizer pipeline."""
 
-# After
+# After -- inline field docstrings instead of Attributes: section
 class SafeSynthesizerParameters(Parameters):
     """Main configuration class for the Safe Synthesizer pipeline.
 
     Orchestrates all aspects of synthetic data generation including training,
     generation, privacy, evaluation, and data handling. Provides cross-field
     validation to ensure parameter compatibility.
 
-    Attributes:
-        data: Data parameters (holdout ratio, column config, etc.).
-        replace_pii: PII replacement parameters.
-        training: Training hyperparameters (learning rate, epochs, LoRA config).
-        generation: Generation parameters (temperature, top_p, num_records).
-        privacy: Differential privacy parameters (epsilon, delta).
-        evaluation: Evaluation component toggles and settings.
-        enable_synthesis: Enable synthesizing new data by training a model.
-        enable_replace_pii: Enable replacing PII in the data.
-
     Example:
         config = SafeSynthesizerParameters.from_yaml("config.yaml")
         synthesizer = SafeSynthesizer(config).with_data_source("data.csv")
         synthesizer.run()
     """
+
+    data: DataParameters = Field(default_factory=DataParameters, description="...")
+    """Data parameters (holdout ratio, column config, etc.)."""
+
+    training: TrainingHyperparams = Field(default_factory=TrainingHyperparams, description="...")
+    """Training hyperparameters (learning rate, epochs, LoRA config)."""
+
+    generation: GenerateParameters = Field(default_factory=GenerateParameters, description="...")
+    """Generation parameters (temperature, top_p, num_records)."""
+
+    # ... remaining fields follow the same pattern ...
 ```
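Editorial note, not part of the diff: when an `Attributes:` section is converted to inline docstrings, it is easy to leave a field undocumented. A review aid along the following lines could flag such fields. This is a minimal sketch under stated assumptions: `undocumented_fields` and the `Params` class are hypothetical, and the check only looks at annotated assignments in class bodies.

```python
import ast


def undocumented_fields(source: str) -> list[str]:
    """Return 'Class.field' for annotated class fields with no string literal right after."""
    missing: list[str] = []
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        body = cls.body
        for index, stmt in enumerate(body):
            if not (isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)):
                continue
            # The statement after the field must be a bare string expression.
            nxt = body[index + 1] if index + 1 < len(body) else None
            documented = (
                isinstance(nxt, ast.Expr)
                and isinstance(nxt.value, ast.Constant)
                and isinstance(nxt.value.value, str)
            )
            if not documented:
                missing.append(f"{cls.name}.{stmt.target.id}")
    return missing


# Hypothetical input: one documented field and one that was missed.
EXAMPLE = '''
class Params:
    """Hypothetical configuration class."""

    data: str = "train.csv"
    """Path to the training data."""

    epochs: int = 3
'''

missing = undocumented_fields(EXAMPLE)
print(missing)
```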
 
 Generator with `Yields:` instead of `Returns:`:
@@ -607,7 +651,19 @@ The before/after examples above demonstrate most rules. These additional points
 - Document side effects, thread safety, and idempotency guarantees where applicable
 - Use `Example:` sections with working code for public API methods
 - Complex code deserves proportionally detailed explanation -- err on the side of more context
-- Cross-references in docstrings: use double backticks (` `` `) for inline code, `:meth:`method_name` `, `:class:`ClassName` `, and `:func:`function_name` ` for API cross-links in `MkDocs`/Sphinx
+- Cross-references in docstrings: use double backticks for inline code. For clickable API cross-links, use the `mkdocstrings` autorefs syntax:
+
+  ```
+  [`display`][full.dotted.path]
+  ```
+
+  For example:
+
+  ```
+  [`from_config`][nemo_safe_synthesizer.llm.metadata.ModelMetadata.from_config]
+  ```
+
+  Do not use the Sphinx `:meth:` / `:class:` / `:func:` syntax -- it renders as literal text in MkDocs
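Editorial note, not part of the diff: existing docstrings that still use Sphinx roles could be migrated mechanically. The sketch below rewrites `:meth:`, `:class:`, and `:func:` roles into the `[display][full.dotted.path]` form described above. `to_autorefs` is a hypothetical helper, and it assumes each role target is already a full dotted path; short names such as `:meth:`from_config`` would still need their path filled in by hand.

```python
import re

# Matches :meth:`target`, :class:`target`, and :func:`target`.
SPHINX_ROLE = re.compile(r":(?:meth|class|func):`([^`]+)`")


def to_autorefs(text: str) -> str:
    """Rewrite Sphinx roles as autorefs-style links.

    Assumes the role target is the full dotted path; the display text is
    taken from the last path component.
    """

    def replace(match: re.Match) -> str:
        path = match.group(1).lstrip("~.")  # drop Sphinx's leading "~" or "." shorthand
        display = path.rsplit(".", 1)[-1]
        return f"[`{display}`][{path}]"

    return SPHINX_ROLE.sub(replace, text)


before = "See :meth:`nemo_safe_synthesizer.llm.metadata.ModelMetadata.from_config`."
after = to_autorefs(before)
print(after)
```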
 
 ### Patterns to avoid
 