Commit 2cad388

feat(cli): python (#1088)
add python registration support: `t('hello, {name}', name='john', _max_chars=10)`, also supports `declare_static()`/`declare_var()` syntax

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

This PR adds a new `@generaltranslation/python-extractor` package and wires it into the CLI, enabling translation string registration for Python projects using `gt-flask` and `gt-fastapi`. The extractor uses `tree-sitter-python` to parse `.py` files and extract `t()` / `msg()` calls (including `declare_static()` / `declare_var()` compound expressions) into the same `ExtractionResult` format used by the existing React pipeline.

Key changes:
- New `packages/python-extractor` package with a full AST-based extraction pipeline (`extractImports` → `extractCalls` → `parseStringExpression` → `resolveFunctionVariants`).
- CLI framework detection extended to check `pyproject.toml`, `requirements.txt`, and `setup.py` for `gt-flask` / `gt-fastapi` dependencies.
- Shared `postProcess.ts` utilities (`calculateHashes`, `dedupeUpdates`, `linkStaticUpdates`) extracted from the React pipeline so both JS and Python extraction paths share the same deduplication and static-ID linking logic.
- `PythonLibrary` type and `isPythonLibrary` guard added; `translation/parse.ts` routes Python libraries to the new `createPythonInlineUpdates` pipeline.
- **Issue:** In `createPythonInlineUpdates.ts`, `fs.promises.readFile` is called outside the per-file `try/catch`, so file-read errors (permissions, broken symlinks) will abort the entire extraction loop rather than being collected in the `errors` array.
- **Issue:** The module-level `crossFileCache` in `resolveFunctionVariants.ts` keys by `filePath::functionName` but does not include calling-context import aliases; two callers with different `declare_static` aliases for the same helper will share a stale cache entry.
- **Minor:** `linkStaticUpdates` in `postProcess.ts` calls `.sort()` on the mapped hash array without filtering out potential `undefined` values.

<details open><summary><h3>Confidence Score: 3/5</h3></summary>

- Safe to merge after fixing the uncaught file-read error in the Python extraction pipeline; the logic and cross-file cache issues are edge cases but worth addressing.
- The core extraction logic is sound and well-tested, but the `fs.promises.readFile` call outside the try/catch in `createPythonInlineUpdates.ts` is a real bug that can crash the CLI on permission errors or broken symlinks. The cross-file cache issue in `resolveFunctionVariants.ts` is an edge case that would only surface when two calling files use different aliases for `declare_static`/`declare_var`. The sort-on-undefined in `postProcess.ts` is a minor type safety concern. Overall the feature is substantial and correct for the common case, but the file-read bug should be fixed before shipping.
- packages/cli/src/python/parse/createPythonInlineUpdates.ts (uncaught file-read error), packages/python-extractor/src/resolveFunctionVariants.ts (cache key ignores import context)

</details>

<details><summary><h3>Important Files Changed</h3></summary>

| Filename | Overview |
|----------|----------|
| packages/python-extractor/src/index.ts | New entry point for the python-extractor package; orchestrates parsing, import extraction, and call extraction cleanly; cache-clear exports are absent from this file (noted in prior threads). |
| packages/python-extractor/src/extractCalls.ts | Extracts and validates t()/msg() calls from a Python AST; routes compound expressions (f-strings, binary concat, declare_static/declare_var) through parseStringExpression; straightforward and well-covered by tests. |
| packages/python-extractor/src/parseStringExpression.ts | Core recursive parser that converts Python string expressions into StringNode trees; handles f-strings, binary concat, ternary, and declare_static/declare_var calls; uses extractImports for GT-package filtering in extractImportsFromRoot, addressing prior concerns. |
| packages/python-extractor/src/resolveFunctionVariants.ts | Resolves helper function return variants across files; module-level crossFileCache key (filePath::functionName) does not encode calling-context import aliases, which could produce stale results when two callers use different declare_static/declare_var aliases for the same helper file. |
| packages/cli/src/python/parse/createPythonInlineUpdates.ts | Python extraction pipeline for the CLI; fs.promises.readFile is called outside the per-file try/catch, so file-read errors (permissions, broken symlinks) will propagate as uncaught exceptions instead of being collected in the errors array. |
| packages/cli/src/extraction/postProcess.ts | Shared post-processing utilities (hashing, dedup, static-ID linking) extracted from the React pipeline; linkStaticUpdates sorts a potentially undefined hash array without guarding against undefined values. |
| packages/cli/src/fs/determineFramework/matchSetupPyDependency.ts | Parses setup.py install_requires/extras_require blocks for GT dependencies; correctly handles string boundaries and nested brackets; escaped-backslash fix from prior thread is present. |
| packages/cli/src/translation/parse.ts | Routes Python libraries to createPythonInlineUpdates and React/JS libraries to createInlineUpdates using the new isPythonLibrary guard. |

</details>

<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant CLI as CLI (translation/parse.ts)
    participant PY as createPythonInlineUpdates
    participant EXT as extractFromPythonSource
    participant IMP as extractImports
    participant CALLS as extractCalls
    participant PSE as parseStringExpression
    participant RFV as resolveFunctionVariants
    participant RI as resolveImport
    participant POST as postProcess

    CLI->>PY: isPythonLibrary(pkg) → true
    PY->>PY: matchFiles(cwd, patterns)
    loop each .py file
        PY->>EXT: extractFromPythonSource(code, filePath)
        EXT->>IMP: extractImports(rootNode)
        IMP-->>EXT: ImportAlias[]
        EXT->>CALLS: extractCalls(rootNode, imports, filePath)
        CALLS->>PSE: parseStringExpression(firstArg, ctx)
        alt declare_static / declare_var
            PSE->>RFV: resolveFunctionInCurrentFile / resolveFunctionInFile
            RFV->>RI: resolveImportPath(moduleName, filePath)
            RI-->>RFV: resolved file path
            RFV-->>PSE: StringNode
        end
        PSE-->>CALLS: StringNode
        CALLS-->>EXT: RawTranslationCall[]
        EXT-->>PY: ExtractionResult[]
    end
    PY->>POST: calculateHashes(updates)
    PY->>POST: dedupeUpdates(updates)
    PY->>POST: linkStaticUpdates(updates)
    PY-->>CLI: { updates, errors, warnings }
```

</details>

<!-- greptile_failed_comments -->

<details><summary><h3>Comments Outside Diff (1)</h3></summary>

1. `python-extractor-detailed-plan.md`, line 1 ([link](https://github.com/generaltranslation/gt/blob/f98045879fbee352b039b80cbb96472e0e661862/python-extractor-detailed-plan.md#L1))

   This detailed plan document is committed to the repo root and is now stale. Its "Phase 2 (Future)" section lists `declare_static`, `declare_var`, and `msg()` support as not yet implemented, but this PR implements all three features. Committed design documents tend to drift out of sync and can mislead future contributors. Consider removing this file or relocating it to a `docs/` or `.github/` directory if you want to preserve it as a living design reference.

   <details><summary>Prompt To Fix With AI</summary>

   `````markdown
   This is a comment left during a code review.
   Path: python-extractor-detailed-plan.md
   Line: 1

   Comment:
   This detailed plan document is committed to the repo root and is now stale. Its "Phase 2 (Future)" section lists `declare_static`, `declare_var`, and `msg()` support as not yet implemented, but this PR implements all three features. Committed design documents tend to drift out of sync and can mislead future contributors. Consider removing this file or relocating it to a `docs/` or `.github/` directory if you want to preserve it as a living design reference.

   How can I resolve this? If you propose a fix, please make it concise.
   `````

   </details>

</details>

<!-- /greptile_failed_comments -->

<sub>Last reviewed commit: 57ce390</sub>

> Greptile also left **3 inline comments** on this PR.

<!-- /greptile_comment -->
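The file-read issue flagged in the summary can be illustrated with a minimal sketch. This is not the PR's code: `extractAll`, `FileReader`, and `SourceExtractor` are hypothetical names, and the reader and extractor are injected as parameters purely to keep the example self-contained. The only point is that the read sits inside the per-file `try/catch`, so a failure is recorded in `errors` and the loop continues.

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
type FileReader = (filePath: string) => Promise<string>;
type SourceExtractor = (code: string, filePath: string) => string[];

interface ExtractionOutcome {
  updates: string[];
  errors: string[];
}

// Reads and extracts each file independently. Both the read and the
// extraction live inside the same try/catch, so one unreadable file
// (permissions, broken symlink) is recorded and the loop continues.
async function extractAll(
  filePaths: string[],
  readFile: FileReader,
  extract: SourceExtractor
): Promise<ExtractionOutcome> {
  const updates: string[] = [];
  const errors: string[] = [];
  for (const filePath of filePaths) {
    try {
      const code = await readFile(filePath);
      updates.push(...extract(code, filePath));
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${filePath}: ${message}`);
    }
  }
  return { updates, errors };
}
```

In the real pipeline the reader would presumably be `fs.promises.readFile(filePath, 'utf8')`; injecting it here also makes the failure path easy to exercise in a unit test.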
1 parent fb656e2 commit 2cad388

74 files changed, +4867 -2059 lines changed

.changeset/wide-mirrors-invent.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+'@generaltranslation/python-extractor': minor
+'gt': minor
+---
+
+feat: add python support for registration

packages/cli/package.json

Lines changed: 2 additions & 0 deletions
@@ -117,6 +117,7 @@
     "esbuild": "^0.27.2",
     "fast-glob": "^3.3.3",
     "fast-json-stable-stringify": "^2.1.0",
+    "@generaltranslation/python-extractor": "workspace:*",
     "generaltranslation": "workspace:*",
     "gt-remark": "workspace:*",
     "html-entities": "^2.6.0",
@@ -132,6 +133,7 @@
     "remark-parse": "^11.0.0",
     "remark-stringify": "^11.0.0",
     "resolve": "^1.22.10",
+    "smol-toml": "^1.3.1",
     "tsconfig-paths": "^4.2.0",
     "unified": "^11.0.5",
     "unist-util-visit": "^5.0.0",

packages/cli/src/cli/base.ts

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ import {
   getAgentInstructions,
   appendAgentInstructions,
 } from '../setup/agentInstructions.js';
-import { determineLibrary } from '../fs/determineFramework.js';
+import { determineLibrary } from '../fs/determineFramework/index.js';
 import { INLINE_LIBRARIES } from '../types/libraries.js';
 import { handleEnqueue } from './commands/enqueue.js';
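`determineLibrary` now resolves from `determineFramework/index.js`, and per the PR summary the detection also inspects `requirements.txt` for `gt-flask` / `gt-fastapi`. A rough sketch of what such a check could look like follows. It is not the PR's implementation: `findGtPackageInRequirements` and `normalizePackageName` are hypothetical names, and only the two package names come from the PR description. Name normalization follows PEP 503 (lowercase, runs of `-`, `_`, `.` collapsed to `-`).

```typescript
type PythonGtPackage = 'gt-flask' | 'gt-fastapi';

const GT_PYTHON_PACKAGES: PythonGtPackage[] = ['gt-flask', 'gt-fastapi'];

// PEP 503 name normalization: lowercase, runs of '-', '_' or '.' become '-'.
function normalizePackageName(rawName: string): string {
  return rawName.toLowerCase().replace(/[-_.]+/g, '-');
}

// Hypothetical helper: returns the first GT Python package listed in a
// requirements.txt body, or null when none is present.
function findGtPackageInRequirements(content: string): PythonGtPackage | null {
  for (const rawLine of content.split(/\r?\n/)) {
    // Drop comments ('#' at line start or after whitespace), then trim.
    const line = rawLine.replace(/(^|\s)#.*$/, '').trim();
    // Skip blank lines and pip options such as "-r base.txt".
    if (line === '' || line.charAt(0) === '-') continue;
    // The package name is the leading run before extras or version specifiers.
    const match = /^[A-Za-z0-9][A-Za-z0-9._-]*/.exec(line);
    if (!match) continue;
    const normalized = normalizePackageName(match[0]);
    for (const pkg of GT_PYTHON_PACKAGES) {
      if (pkg === normalized) return pkg;
    }
  }
  return null;
}
```

A line such as `gt_flask>=0.1  # i18n` would match as `gt-flask`, while a commented-out `# gt-fastapi` would not.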

packages/cli/src/cli/inline.ts

Lines changed: 5 additions & 1 deletion
@@ -193,10 +193,14 @@ function fallbackToGtReact(library: SupportedLibraries): InlineLibrary {
     Libraries.GT_NEXT,
     Libraries.GT_NODE,
     Libraries.GT_REACT_NATIVE,
+    Libraries.GT_FLASK,
+    Libraries.GT_FASTAPI,
   ].includes(library as Libraries)
     ? (library as
         | typeof Libraries.GT_NEXT
         | typeof Libraries.GT_NODE
-        | typeof Libraries.GT_REACT_NATIVE)
+        | typeof Libraries.GT_REACT_NATIVE
+        | typeof Libraries.GT_FLASK
+        | typeof Libraries.GT_FASTAPI)
     : Libraries.GT_REACT;
 }

packages/cli/src/cli/python.ts

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+import { Command } from 'commander';
+import { SupportedLibraries } from '../types/index.js';
+import { InlineCLI } from './inline.js';
+import { PythonLibrary } from '../types/libraries.js';
+
+/**
+ * CLI tool for managing translations with gt-flask and gt-fastapi
+ */
+export class PythonCLI extends InlineCLI {
+  constructor(
+    command: Command,
+    library: PythonLibrary,
+    additionalModules?: SupportedLibraries[]
+  ) {
+    super(command, library, additionalModules);
+  }
+}
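`PythonCLI` narrows its `library` parameter to `PythonLibrary`, and the PR summary mentions an `isPythonLibrary` guard used for routing. Neither definition is visible in this part of the diff, so the following is only a sketch of how such a guard is commonly written. The string values in `Libraries` and the `pipelineFor` helper are assumptions for illustration.

```typescript
// String values are assumptions; this part of the diff only shows member names.
const Libraries = {
  GT_REACT: 'gt-react',
  GT_NEXT: 'gt-next',
  GT_FLASK: 'gt-flask',
  GT_FASTAPI: 'gt-fastapi',
} as const;

type SupportedLibrary = (typeof Libraries)[keyof typeof Libraries];
type PythonLibrary = typeof Libraries.GT_FLASK | typeof Libraries.GT_FASTAPI;

const PYTHON_LIBRARIES: readonly SupportedLibrary[] = [
  Libraries.GT_FLASK,
  Libraries.GT_FASTAPI,
];

// Type predicate: narrows a SupportedLibrary to PythonLibrary for callers.
function isPythonLibrary(library: SupportedLibrary): library is PythonLibrary {
  return PYTHON_LIBRARIES.indexOf(library) !== -1;
}

// Hypothetical router showing how the guard selects an extraction pipeline.
function pipelineFor(library: SupportedLibrary): 'python' | 'js' {
  return isPythonLibrary(library) ? 'python' : 'js';
}
```

With a guard like this, a call site can pass the narrowed value straight into a constructor that requires `PythonLibrary` without a cast.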

packages/cli/src/config/generateSettings.ts

Lines changed: 13 additions & 2 deletions
@@ -32,6 +32,17 @@ export const DEFAULT_SRC_PATTERNS = [
   'components/**/*.{js,jsx,ts,tsx}',
 ];

+export const DEFAULT_PYTHON_SRC_PATTERNS = ['**/*.py'];
+export const DEFAULT_PYTHON_SRC_EXCLUDES = [
+  'venv/**',
+  '.venv/**',
+  '__pycache__/**',
+  '**/migrations/**',
+  '**/tests/**',
+  '**/test_*.py',
+  '**/*_test.py',
+];
+
 /**
  * Generates settings from any
  * @param flags - The CLI flags to generate settings from
@@ -158,8 +169,8 @@ export async function generateSettings(
   // Add publish if not provided
   mergedOptions.publish = (gtConfig.publish || flags.publish) ?? false;

-  // Populate src if not provided
-  mergedOptions.src = mergedOptions.src || DEFAULT_SRC_PATTERNS;
+  // Don't default src here — each pipeline (JS/Python) has its own defaults.
+  // Only set src if the user explicitly provided it via flags or config.

   // Resolve all glob patterns in the files object
   const compositePatterns = [
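Because `generateSettings` no longer fills in `src`, each pipeline has to pick its own fallback. The consuming code is not shown in this hunk, so the sketch below only illustrates the idea. `resolveSrcPatterns` is a hypothetical helper, and `DEFAULT_SRC_PATTERNS` is a one-entry stand-in because only its last element is visible in the diff.

```typescript
// Stand-in for the JS defaults; only the last entry is visible in the hunk.
const DEFAULT_SRC_PATTERNS = ['components/**/*.{js,jsx,ts,tsx}'];
const DEFAULT_PYTHON_SRC_PATTERNS = ['**/*.py'];

type Pipeline = 'js' | 'python';

// Hypothetical helper: user-provided patterns win; otherwise each pipeline
// supplies its own default instead of sharing one global fallback.
function resolveSrcPatterns(
  userSrc: string[] | undefined,
  pipeline: Pipeline
): string[] {
  if (userSrc && userSrc.length > 0) return userSrc;
  return pipeline === 'python'
    ? DEFAULT_PYTHON_SRC_PATTERNS
    : DEFAULT_SRC_PATTERNS;
}
```

Under the old behavior a Python project with no `src` configured would have been handed the JS/TSX globs and matched no `.py` files, which is presumably why the default moved out of `generateSettings`.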
Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
+import { describe, it, expect } from 'vitest';
+import { mapExtractionResultsToUpdates } from '../mapToUpdates.js';
+import type { ExtractionResult } from '@generaltranslation/python-extractor';
+
+describe('mapExtractionResultsToUpdates', () => {
+  it('maps empty results to empty updates', () => {
+    const updates = mapExtractionResultsToUpdates([]);
+    expect(updates).toEqual([]);
+  });
+
+  it('maps single result with all metadata fields', () => {
+    const results: ExtractionResult[] = [
+      {
+        dataFormat: 'ICU',
+        source: 'Hello, {name}!',
+        metadata: {
+          id: 'greeting',
+          context: 'casual',
+          maxChars: 100,
+          filePaths: ['app.py'],
+          staticId: 'static-1',
+        },
+      },
+    ];
+
+    const updates = mapExtractionResultsToUpdates(results);
+
+    expect(updates).toHaveLength(1);
+    expect(updates[0]).toEqual({
+      dataFormat: 'ICU',
+      source: 'Hello, {name}!',
+      metadata: {
+        id: 'greeting',
+        context: 'casual',
+        maxChars: 100,
+        filePaths: ['app.py'],
+        staticId: 'static-1',
+      },
+    });
+  });
+
+  it('passes through dataFormat correctly', () => {
+    const results: ExtractionResult[] = [
+      {
+        dataFormat: 'JSX',
+        source: '<p>Hello</p>',
+        metadata: {},
+      },
+    ];
+
+    const updates = mapExtractionResultsToUpdates(results);
+    expect(updates[0].dataFormat).toBe('JSX');
+  });
+
+  it('handles missing optional metadata', () => {
+    const results: ExtractionResult[] = [
+      {
+        dataFormat: 'ICU',
+        source: 'Simple string',
+        metadata: {},
+      },
+    ];
+
+    const updates = mapExtractionResultsToUpdates(results);
+
+    expect(updates).toHaveLength(1);
+    expect(updates[0].metadata).toEqual({});
+    expect(updates[0].metadata.id).toBeUndefined();
+    expect(updates[0].metadata.context).toBeUndefined();
+    expect(updates[0].metadata.maxChars).toBeUndefined();
+  });
+
+  it('preserves filePaths array', () => {
+    const results: ExtractionResult[] = [
+      {
+        dataFormat: 'ICU',
+        source: 'Multi-file string',
+        metadata: {
+          filePaths: ['routes/index.py', 'routes/auth.py'],
+        },
+      },
+    ];
+
+    const updates = mapExtractionResultsToUpdates(results);
+    expect(updates[0].metadata.filePaths).toEqual([
+      'routes/index.py',
+      'routes/auth.py',
+    ]);
+  });
+
+  it('maps multiple results', () => {
+    const results: ExtractionResult[] = [
+      {
+        dataFormat: 'ICU',
+        source: 'Hello',
+        metadata: { id: 'hello' },
+      },
+      {
+        dataFormat: 'ICU',
+        source: 'Goodbye',
+        metadata: { id: 'goodbye', context: 'farewell' },
+      },
+    ];
+
+    const updates = mapExtractionResultsToUpdates(results);
+    expect(updates).toHaveLength(2);
+    expect(updates[0].source).toBe('Hello');
+    expect(updates[1].source).toBe('Goodbye');
+    expect(updates[1].metadata.context).toBe('farewell');
+  });
+});
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+import { describe, it, expect } from 'vitest';
+import {
+  calculateHashes,
+  dedupeUpdates,
+  linkStaticUpdates,
+} from '../postProcess.js';
+import type { Updates } from '../../types/index.js';
+
+describe('calculateHashes', () => {
+  it('generates consistent hashes for same input', async () => {
+    const updates1: Updates = [
+      { dataFormat: 'ICU', source: 'hello', metadata: {} },
+    ];
+    const updates2: Updates = [
+      { dataFormat: 'ICU', source: 'hello', metadata: {} },
+    ];
+
+    await calculateHashes(updates1);
+    await calculateHashes(updates2);
+
+    expect(updates1[0].metadata.hash).toBeDefined();
+    expect(updates1[0].metadata.hash).toBe(updates2[0].metadata.hash);
+  });
+
+  it('generates different hashes for different sources', async () => {
+    const updates: Updates = [
+      { dataFormat: 'ICU', source: 'hello', metadata: {} },
+      { dataFormat: 'ICU', source: 'world', metadata: {} },
+    ];
+
+    await calculateHashes(updates);
+
+    expect(updates[0].metadata.hash).not.toBe(updates[1].metadata.hash);
+  });
+});
+
+describe('dedupeUpdates', () => {
+  it('removes duplicates with same hash, merges filePaths', () => {
+    const updates: Updates = [
+      {
+        dataFormat: 'ICU',
+        source: 'hello',
+        metadata: { hash: 'h1', filePaths: ['pathA'] },
+      },
+      {
+        dataFormat: 'ICU',
+        source: 'hello',
+        metadata: { hash: 'h1', filePaths: ['pathB'] },
+      },
+    ];
+
+    dedupeUpdates(updates);
+
+    expect(updates).toHaveLength(1);
+    expect(updates[0].metadata.filePaths).toEqual(['pathA', 'pathB']);
+  });
+
+  it('keeps distinct entries with different hashes', () => {
+    const updates: Updates = [
+      {
+        dataFormat: 'ICU',
+        source: 'hello',
+        metadata: { hash: 'h1', filePaths: ['pathA'] },
+      },
+      {
+        dataFormat: 'ICU',
+        source: 'world',
+        metadata: { hash: 'h2', filePaths: ['pathB'] },
+      },
+    ];
+
+    dedupeUpdates(updates);
+
+    expect(updates).toHaveLength(2);
+  });
+
+  it('handles entries without hashes', () => {
+    const updates: Updates = [
+      { dataFormat: 'ICU', source: 'no-hash', metadata: {} },
+      {
+        dataFormat: 'ICU',
+        source: 'has-hash',
+        metadata: { hash: 'h1', filePaths: ['pathA'] },
+      },
+    ];
+
+    dedupeUpdates(updates);
+
+    expect(updates).toHaveLength(2);
+  });
+});
+
+describe('linkStaticUpdates', () => {
+  it('groups entries by temporary staticId and assigns shared hash', () => {
+    const updates: Updates = [
+      {
+        dataFormat: 'ICU',
+        source: 'variant-a',
+        metadata: { hash: 'ha', staticId: 'temp-static' },
+      },
+      {
+        dataFormat: 'ICU',
+        source: 'variant-b',
+        metadata: { hash: 'hb', staticId: 'temp-static' },
+      },
+    ];
+
+    linkStaticUpdates(updates);
+
+    // Both should now share the same staticId (derived from their hashes)
+    expect(updates[0].metadata.staticId).toBe(updates[1].metadata.staticId);
+    // The staticId should have been replaced (no longer the temporary value)
+    expect(updates[0].metadata.staticId).not.toBe('temp-static');
+  });
+
+  it('does not modify entries without staticId', () => {
+    const updates: Updates = [
+      { dataFormat: 'ICU', source: 'no-static', metadata: { hash: 'h1' } },
+    ];
+
+    linkStaticUpdates(updates);
+
+    expect(updates[0].metadata.staticId).toBeUndefined();
+  });
+});
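The `linkStaticUpdates` tests describe grouping by a temporary `staticId` and replacing it with a value derived from the group's hashes, and the review flags that the real function sorts hashes without filtering `undefined`. `postProcess.ts` itself is not shown here, so the following is only a sketch of the guarded version. `linkStaticUpdatesSketch` and the `static:` join format are assumptions; the actual code presumably hashes the sorted list rather than joining it.

```typescript
interface StaticMetadata {
  hash?: string;
  staticId?: string;
}

interface StaticUpdate {
  dataFormat: string;
  source: string;
  metadata: StaticMetadata;
}

// Sketch only: groups by temporary staticId, then assigns every member a
// shared id derived from the group's defined hashes, sorted for stability.
function linkStaticUpdatesSketch(updates: StaticUpdate[]): void {
  const groups = new Map<string, StaticUpdate[]>();
  for (const update of updates) {
    const tempId = update.metadata.staticId;
    if (tempId === undefined) continue; // entries without staticId are untouched
    const group = groups.get(tempId) ?? [];
    group.push(update);
    groups.set(tempId, group);
  }
  groups.forEach((group) => {
    const hashes = group
      .map((update) => update.metadata.hash)
      // Guard flagged in review: drop undefined before sorting.
      .filter((hash): hash is string => typeof hash === 'string')
      .sort();
    // Assumed format; a real implementation would likely hash this string.
    const sharedId = `static:${hashes.join('+')}`;
    for (const update of group) {
      update.metadata.staticId = sharedId;
    }
  });
}
```

Sorting after the filter keeps the derived id independent of the order in which variants were extracted.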
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+export type { ExtractionResult, ExtractionMetadata } from './types.js';
+export { mapExtractionResultsToUpdates } from './mapToUpdates.js';
+export {
+  calculateHashes,
+  dedupeUpdates,
+  linkStaticUpdates,
+} from './postProcess.js';
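Of the three exported post-processing helpers, `dedupeUpdates` has the most behavior pinned down by the tests: entries sharing a hash collapse into the first occurrence with merged `filePaths`, entries without a hash are kept, and the array is mutated in place. A sketch consistent with those tests is below. `dedupeUpdatesSketch` and its types are hypothetical and may differ from the real `postProcess.ts`.

```typescript
interface DedupeMetadata {
  hash?: string;
  filePaths?: string[];
}

interface DedupeUpdate {
  dataFormat: string;
  source: string;
  metadata: DedupeMetadata;
}

// Sketch only: keeps the first entry per hash and folds later duplicates'
// file paths into it. Entries without a hash are always kept.
function dedupeUpdatesSketch(updates: DedupeUpdate[]): void {
  const firstByHash = new Map<string, DedupeUpdate>();
  const kept: DedupeUpdate[] = [];
  for (const update of updates) {
    const hash = update.metadata.hash;
    const existing = hash === undefined ? undefined : firstByHash.get(hash);
    if (existing === undefined) {
      if (hash !== undefined) firstByHash.set(hash, update);
      kept.push(update);
      continue;
    }
    // Duplicate: merge its file paths into the first occurrence, skipping repeats.
    const merged = existing.metadata.filePaths ?? [];
    for (const filePath of update.metadata.filePaths ?? []) {
      if (merged.indexOf(filePath) === -1) merged.push(filePath);
    }
    existing.metadata.filePaths = merged;
  }
  // Mutate in place, since the tests inspect `updates` after the call.
  updates.splice(0, updates.length, ...kept);
}
```

Mutating the input rather than returning a new array matches how the tests call `dedupeUpdates(updates)` and then assert on `updates` directly.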
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+import type { ExtractionResult } from '@generaltranslation/python-extractor';
+import type { Updates } from '../types/index.js';
+
+/**
+ * Maps ExtractionResult[] to Updates[] format used by the CLI pipeline
+ */
+export function mapExtractionResultsToUpdates(
+  results: ExtractionResult[]
+): Updates {
+  return results.map((result) => ({
+    dataFormat: result.dataFormat,
+    source: result.source,
+    metadata: {
+      ...(result.metadata.id && { id: result.metadata.id }),
+      ...(result.metadata.context && { context: result.metadata.context }),
+      ...(result.metadata.maxChars != null && {
+        maxChars: result.metadata.maxChars,
+      }),
+      ...(result.metadata.filePaths && {
+        filePaths: result.metadata.filePaths,
+      }),
+      ...(result.metadata.staticId && { staticId: result.metadata.staticId }),
+    },
+  }));
+}
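One detail of the mapping above is easy to miss: `maxChars` is guarded with `!= null`, so a value of `0` survives, while `id`, `context`, and `staticId` use a truthiness check, so empty strings are dropped. The standalone sketch below reproduces just that conditional-spread pattern to show the difference. `pickMetadata` and `SketchMetadata` are hypothetical names, not part of the PR.

```typescript
interface SketchMetadata {
  id?: string;
  maxChars?: number;
}

// Same conditional-spread pattern as mapExtractionResultsToUpdates, reduced
// to two fields: a truthiness guard for id and a null check for maxChars.
function pickMetadata(meta: SketchMetadata): SketchMetadata {
  return {
    ...(meta.id && { id: meta.id }),
    ...(meta.maxChars != null && { maxChars: meta.maxChars }),
  };
}

// An empty id is omitted, but a zero maxChars is preserved.
const picked = pickMetadata({ id: '', maxChars: 0 });
console.log(JSON.stringify(picked)); // {"maxChars":0}
```

Whether an empty `id` should be dropped is a design choice; the existing tests only cover the fully populated and fully empty cases.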
