Grammar rejects reserved keywords (e.g. for) as attribute names — fixed upstream in amplify-education/python-hcl2 #152

@calas

Summary

bc-python-hcl2 (0.4.3 and current master — identical SHAs) fails to parse Terraform files that use HCL reserved keywords as attribute names in block bodies. Minimal repro:

block {
  for = "2m"
}

$ python3 -c "import hcl2; print(hcl2.load(open('min.tf')))"
...
lark.exceptions.UnexpectedToken: Unexpected token Token('FOR', 'for') at line 2, column 3.

Terraform and OpenTofu both accept this: for is a valid attribute of the rule block in the Grafana provider's grafana_rule_group resource (rule { for = "2m" }; see the provider docs). The canonical upstream parser, amplify-education/python-hcl2, has already fixed this class of bug with a one-line grammar change.

Environment

  • bc-python-hcl2: 0.4.3 (latest tag; master is identical: gh api repos/bridgecrewio/python-hcl2/compare/0.4.3...master reports ahead_by: 0, behind_by: 0)
  • Python: 3.12 / 3.13
  • Downstream impact: bridgecrewio/checkov#7526 — Checkov silently skips files with this pattern, leaving them unscanned.
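
Because the downstream failure is silent, it can help to list which files the installed parser rejects. The following is a minimal sketch that relies only on the hcl2.load(file) call used in the repro above. The helper names find_unparseable and parse_with_hcl2 are hypothetical and are not part of either library or of Checkov.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def parse_with_hcl2(path: Path) -> None:
    """Parse one file with whichever hcl2 package is installed."""
    import hcl2  # imported lazily so find_unparseable works with any parser

    with path.open() as handle:
        hcl2.load(handle)


def find_unparseable(
    paths: Iterable[Path],
    parse: Callable[[Path], None] = parse_with_hcl2,
) -> Dict[str, str]:
    """Map each file that fails to parse to a one-line error summary."""
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            parse(path)
        except Exception as exc:  # broad on purpose: any parser error counts
            summary = f"{type(exc).__name__}: {exc}"
            failures[str(path)] = summary.splitlines()[0]
    return failures
```

A possible invocation is find_unparseable(Path("infra").rglob("*.tf")), where "infra" stands in for a real Terraform directory. Any path in the result is a file that a scanner built on the same parser would presumably be unable to read.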

Root cause (grammar-level)

bc-python-hcl2's grammar (hcl2/hcl2.lark)

Block bodies use the plain identifier production for attribute names. When the lexer emits Token('FOR', 'for') for the reserved word, the LALR parser state in attribute position has no rule to accept it — hence UnexpectedToken.

Canonical python-hcl2's grammar (current release 8.1.2)

Line 101 of hcl2/hcl2.lark:

_attribute_name : identifier | keyword | literal_value

This explicitly allows reserved keywords (and even literal values) in attribute-name position. That's the minimal fix: introduce _attribute_name and use it in the place where attributes are bound (currently just identifier).
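
The mechanism can be illustrated with a toy hand-written parser. This is a sketch of the idea only, not the Lark grammar of either project: the keyword set, the token names, and the allow_keyword_names switch are assumptions made for the illustration. With the switch off, the attribute-name position accepts only NAME tokens, mirroring a grammar that uses plain identifier. With it on, keyword tokens are accepted too, mirroring the _attribute_name idea.

```python
import re

# Hypothetical keyword set for this toy; the real grammars define their own.
KEYWORDS = {"for", "in", "if"}
KEYWORD_KINDS = {word.upper() for word in KEYWORDS}

TOKEN_RE = re.compile(
    r'\s*(?:(?P<NAME>[A-Za-z_][A-Za-z0-9_-]*)|(?P<STRING>"[^"]*")|(?P<OP>[={}]))'
)


def tokenize(text):
    """Return (kind, value) pairs; reserved words get their own token kind."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize at offset {pos}")
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "NAME" and value in KEYWORDS:
            kind = value.upper()  # e.g. 'for' becomes a FOR token
        tokens.append((kind, value))
        pos = match.end()
    return tokens


def parse_block(text, allow_keyword_names):
    """Parse `name { key = "value" ... }` into {name: {key: value}}."""
    tokens = tokenize(text)
    if len(tokens) < 3 or tokens[0][0] != "NAME":
        raise SyntaxError("expected: name { ... }")
    if tokens[1][1] != "{" or tokens[-1][1] != "}":
        raise SyntaxError("expected: name { ... }")
    body = {}
    items = tokens[2:-1]
    for start in range(0, len(items), 3):
        triple = items[start:start + 3]
        if len(triple) != 3:
            raise SyntaxError("incomplete attribute")
        (kind, key), (_, equals), (value_kind, value) = triple
        name_ok = kind == "NAME" or (allow_keyword_names and kind in KEYWORD_KINDS)
        if not name_ok:
            raise SyntaxError(f"Unexpected token {kind} ({key!r}) as attribute name")
        if equals != "=" or value_kind != "STRING":
            raise SyntaxError('expected: key = "value"')
        body[key] = value.strip('"')
    return {block_name_of(tokens): body}


def block_name_of(tokens):
    """Return the block's name, i.e. the value of the first token."""
    return tokens[0][1]
```

Calling parse_block on the min.tf text with allow_keyword_names=False raises SyntaxError at the FOR token, while allow_keyword_names=True returns a mapping with a "for" key. The lexer is identical in both cases; only the set of token kinds accepted in attribute-name position changes, which is why the upstream fix can be confined to a single grammar rule.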

Reproduction that the canonical parser already handles

$ pip install python-hcl2==8.1.2
$ python3 -c "import hcl2; print(hcl2.load(open('min.tf')))"
{'block': [{'for': '"2m"', '__is_block__': True}]}

Suggested action

Two paths — either works:

  1. Port the grammar fix: cherry-pick the _attribute_name : identifier | keyword | literal_value rule (plus any adjacent tweaks) from amplify-education/python-hcl2 into bc-python-hcl2's hcl2.lark. Scope is small — one grammar rule and the two or three productions that reference identifier in attribute-name position.

  2. Re-parent onto canonical upstream: drop the fork entirely in favour of amplify-education/python-hcl2 (current: 8.1.2, actively maintained). This is a larger change for consumers but eliminates a class of lag bugs. The two have drifted significantly (bc is at 0.4.x, upstream at 8.x), so this would be a major-version bump downstream regardless.

Option 1 is the lower-risk targeted fix; option 2 resolves "will we need to re-fix the next grammar-level issue" once and for all.

Concrete repro files

min.tf (1 attribute, isolating the keyword):

block {
  for = "2m"
}

realistic.tf (mirrors the Grafana provider usage pattern):

resource "grafana_rule_group" "alerts" {
  rule {
    name      = "High Disk Usage"
    condition = "B"

    for            = "2m"
    no_data_state  = "NoData"
  }
}

Both fail with the same UnexpectedToken trace in bc-python-hcl2; both parse successfully with canonical python-hcl2 ≥ 8.1.2.
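
The two files could also be kept as an in-memory regression check. This is a sketch under stated assumptions: check_fixtures and load_with_installed_hcl2 are hypothetical names, and passing an io.StringIO to hcl2.load assumes that load only needs a readable text file object, as in the repro command.

```python
import io

# The two repro files from this report, embedded as strings.
FIXTURES = {
    "min.tf": 'block {\n  for = "2m"\n}\n',
    "realistic.tf": (
        'resource "grafana_rule_group" "alerts" {\n'
        "  rule {\n"
        '    name      = "High Disk Usage"\n'
        '    condition = "B"\n'
        "\n"
        '    for            = "2m"\n'
        '    no_data_state  = "NoData"\n'
        "  }\n"
        "}\n"
    ),
}


def load_with_installed_hcl2(text):
    """Parse text with whichever hcl2 package is installed."""
    import hcl2  # lazy import so check_fixtures can be given another loader

    return hcl2.load(io.StringIO(text))


def check_fixtures(load=load_with_installed_hcl2):
    """Return {fixture name: None on success, else the exception class name}."""
    results = {}
    for name, text in FIXTURES.items():
        try:
            load(text)
            results[name] = None
        except Exception as exc:  # any parser error marks the fixture as failing
            results[name] = type(exc).__name__
    return results
```

Going by the behaviour described in this report, check_fixtures() should report UnexpectedToken for both entries with bc-python-hcl2 0.4.3 installed, and None for both with python-hcl2 8.1.2.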
