Commit 79d756a

feat(fs-bq-import-collection): add transformFunction option (#2251)
1 parent f8f496e commit 79d756a

7 files changed: +67 -8 lines changed

firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md

+32
@@ -139,3 +139,35 @@ This helps you quickly identify problematic documents and take action accordingl
 To retry the failed imports, you can use the output file to manually inspect or reprocess the documents. For example, you could create a script that reads the failed paths and reattempts the import.
 
 > **Note:** If the specified file already exists, it will be **cleared** before writing new failed batch paths.
+
+### Using a Transform Function
+
+You can optionally provide a transform function URL (`--transform-function-url` or `-f`) that will transform document data before it's written to BigQuery. The transform function should receive document data and return transformed data. The payload will contain the following:
+
+```
+{
+  data: [{
+    insertId: int;
+    json: {
+      timestamp: int;
+      event_id: int;
+      document_name: string;
+      document_id: int;
+      operation: ChangeType;
+      data: string;
+    },
+  }]
+}
+```
+
+The response should be identical in structure.
+
+Example usage of the script with the transform function option:
+
+```shell
+npx @firebaseextensions/fs-bq-import-collection --non-interactive \
+  -P <PROJECT_ID> \
+  -s <COLLECTION_PATH> \
+  -d <DATASET_ID> \
+  -f https://us-west1-my-project.cloudfunctions.net/transformFunction
+```
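
A transform function matching the payload described in the guide could be sketched as follows. This is a minimal illustration, not the extension's reference implementation: the `Row` and `TransformPayload` type names, the `transformPayload` function, and the removal of an `email` field are all hypothetical. The field types are loosened to `string | number` because the guide's `int` annotations are informal, and the sketch assumes `json.data` holds a JSON-encoded document, as the `data: string` field suggests.

```typescript
// Hypothetical types mirroring the payload shape shown in the guide.
interface Row {
  insertId: string | number;
  json: {
    timestamp: string | number;
    event_id: string | number;
    document_name: string;
    document_id: string | number;
    operation: string; // `ChangeType` in the guide
    data: string; // assumed to be a JSON-encoded document
  };
}

interface TransformPayload {
  data: Row[];
}

// Pure transform: returns a payload with the same structure as its input,
// with a hypothetical `email` field removed from every document.
function transformPayload(payload: TransformPayload): TransformPayload {
  return {
    data: payload.data.map((row) => {
      const doc = JSON.parse(row.json.data) as Record<string, unknown>;
      delete doc["email"];
      // Copy the row so the incoming payload is not mutated.
      return { ...row, json: { ...row.json, data: JSON.stringify(doc) } };
    }),
  };
}
```

Deployed behind an HTTP endpoint, a handler would parse the request body, call `transformPayload`, and respond with the result as JSON. The exact request and response envelope used by the import script is not shown in this diff, so it is worth confirming before relying on this shape.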

firestore-bigquery-export/scripts/import/package-lock.json

+4-8
Some generated files are not rendered by default.

firestore-bigquery-export/scripts/import/src/config.ts

+24
@@ -170,6 +170,21 @@ const questions = [
     type: "confirm",
     default: false,
   },
+  {
+    message: "What's the URL of your transform function? (Optional)",
+    name: "transformFunctionUrl",
+    type: "input",
+    default: "",
+    validate: (value) => {
+      if (!value) return true;
+      try {
+        new URL(value);
+        return true;
+      } catch {
+        return "Please enter a valid URL or leave empty";
+      }
+    },
+  },
   {
     message: "Would you like to use a local firestore emulator?",
     name: "useEmulator",
@@ -213,6 +228,15 @@ export async function parseConfig(): Promise<CliConfig | CliConfigError> {
   if (program.datasetLocation === undefined) {
     errors.push("DatasetLocation is not specified.");
   }
+
+  if (program.transformFunctionUrl) {
+    try {
+      new URL(program.transformFunctionUrl);
+    } catch {
+      errors.push("Transform function URL is invalid");
+    }
+  }
+
   if (!validateBatchSize(program.batchSize)) {
     errors.push("Invalid batch size.");
   }
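
Both validation paths in this file (the interactive prompt and the non-interactive flag check) rely on `new URL(value)` throwing for malformed input. Because `new URL` also accepts non-HTTP schemes such as `ftp:` or `mailto:`, a stricter check might look like the sketch below. The `isValidHttpUrl` helper is hypothetical and not part of this commit.

```typescript
// Hypothetical helper, not part of this commit: accepts only strings that
// parse as URLs and use the http or https scheme.
function isValidHttpUrl(value: string): boolean {
  try {
    const url = new URL(value);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    // `new URL` throws a TypeError when the string cannot be parsed.
    return false;
  }
}
```

A single helper like this could be shared by the prompt's `validate` callback and by `parseConfig`, so the two checks cannot drift apart.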

firestore-bigquery-export/scripts/import/src/index.ts

+1
@@ -81,6 +81,7 @@ const run = async (): Promise<number> => {
     wildcardIds: queryCollectionGroup,
     useNewSnapshotQuerySyntax,
     bqProjectId: bigQueryProjectId,
+    transformFunction: config.transformFunctionUrl,
   });
 
   await initializeDataSink(dataSink, config);

firestore-bigquery-export/scripts/import/src/program.ts

+4
@@ -54,6 +54,10 @@ export const getCLIOptions = () => {
       "-u, --use-new-snapshot-query-syntax [true|false]",
       "Whether to use updated latest snapshot query"
     )
+    .option(
+      "-f, --transform-function-url <transform-function-url>",
+      "URL of function to transform data before export (e.g., https://us-west1-project.cloudfunctions.net/transform)"
+    )
     .option(
       "-e, --use-emulator [true|false]",
       "Whether to use the firestore emulator"

firestore-bigquery-export/scripts/import/src/types.ts

+1
@@ -16,6 +16,7 @@ export interface CliConfig {
   rawChangeLogName: string;
   cursorPositionFile: string;
   failedBatchOutput?: string;
+  transformFunctionUrl?: string;
 }
 
 export interface CliConfigError {

firestore-bigquery-export/scripts/import/src/worker.ts

+1
@@ -73,6 +73,7 @@ async function processDocuments(
     wildcardIds: true,
     skipInit: true,
     useNewSnapshotQuerySyntax: config.useNewSnapshotQuerySyntax,
+    transformFunction: config.transformFunctionUrl,
   });
 
   // Process documents in batches until we've covered the entire partition
