Description
Describe the feature
I propose that the bundling process for NodejsFunctions be refactored so that the AWS CDK toolkit can orchestrate bundling in parallel with synth, and that the toolkit performs this orchestration when the --asset-parallelism option is truthy.
Use Case
Currently, NodejsFunctions bundle synchronously during synth regardless of the --asset-parallelism option for deployments. For heavily serverless projects with dozens of Lambdas or more, this significantly slows synth and deployment. The proposed feature would reduce deployment times and, alongside options like hotswap and watch, increase developers' iteration velocity in heavily serverless Node.js projects.
Proposed Solution
The following is intended for illustration. It is neither elegant nor, frankly, a "good" solution to this problem.
The Bundling class of the module can expose an interface that allows the toolkit to command parallel bundling:
// @aws-cdk/aws-lambda-nodejs/lib/bundling.ts@Bundling
// Set by the toolkit to defer bundling until `bundleInParallel` is called.
// (A plain public static field: a private field and a setter cannot share one name.)
public static parallelBundling: boolean = false;

// Deferred bundling operations, keyed by asset output directory.
private static staged: { [outputDir: string]: { tempFile: string, inputDir: string, tryBundling: () => boolean } } = {};

public static bundleInParallel = async (): Promise<void> => {
  const bundlings: Array<Promise<void>> = [];
  for (const [outputDir, { tryBundling, inputDir, tempFile }] of Object.entries(Bundling.staged)) {
    bundlings.push((async () => {
      // Need to swap the directories back for staging
      fs.renameSync(outputDir, inputDir);
      // Need to remove the temporary file that prevents AssetStaging from throwing
      fs.rmSync(tempFile);
      const success = tryBundling();
      if (!success) {
        const assetDir = path.basename(outputDir);
        throw new Error(`Failed to bundle NodejsFunction for ${assetDir}.`);
      }
      // Okay to return the directories to their final locations
      fs.renameSync(inputDir, outputDir);
    })());
  }
  const results = await Promise.allSettled(bundlings);
  const errors: PromiseRejectedResult[] = [];
  for (const result of results) {
    if (result.status === "rejected") {
      errors.push(result);
    }
  }
  if (errors.length) {
    // Report every failure before aborting
    errors.forEach(({ reason }) => console.error(reason));
    throw new Error("One or more assets failed to bundle");
  }
};
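One caveat with the illustration above: `tryBundling` returns a plain boolean, so it is synchronous (the existing local bundling goes through a blocking `exec` helper, which I assume wraps `spawnSync`). Wrapping a synchronous call in an async function does not make the calls overlap; each one blocks the event loop until it finishes. Real parallelism would need a non-blocking process spawn. The following is a minimal sketch of that idea using Node's `child_process.spawn`. The helper names `runCommand` and `runAll` are hypothetical and do not exist in the CDK.

```typescript
import { spawn } from 'child_process';

// Hypothetical helper: runs a command without blocking the event loop.
function runCommand(command: string, args: string[]): Promise<void> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: 'ignore' });
    // Emitted when the process could not be started at all
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) {
        resolve();
      } else {
        reject(new Error(`${command} exited with code ${code}`));
      }
    });
  });
}

// Hypothetical helper: starts every command at once, waits for all of them,
// and returns the list of failures instead of stopping at the first one.
async function runAll(commands: Array<[string, string[]]>): Promise<Error[]> {
  const outcomes = await Promise.all(
    commands.map(([command, args]) =>
      runCommand(command, args).then(
        () => undefined,
        (err: unknown) => (err instanceof Error ? err : new Error(String(err))),
      ),
    ),
  );
  return outcomes.filter((outcome): outcome is Error => outcome instanceof Error);
}
```

With something like this, each staged entry would hold the bundling command rather than a synchronous closure, and the toolkit would await `runAll` while synth continues.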
CDK Toolkit must set Bundling.parallelBundling = true to force this mode of operation. It can then call bundleInParallel to create a promise that will resolve once all bundling operations return successfully. For example:
// aws-cdk/lib/cdk-toolkit.ts@CdkToolkit.deploy
public async deploy(options: DeployOptions) {
  if (options.watch) {
    return this.watch(options);
  }

  // NEW: tell the Bundling class to defer bundling when asset parallelism is enabled
  lambdaNodeJs.Bundling.parallelBundling = Boolean(options.assetParallelism);

  if (options.notificationArns) {
    options.notificationArns.map( arn => {
      if (!validateSnsTopicArn(arn)) {
        throw new Error(`Notification arn ${arn} is not a valid arn for an SNS topic`);
      }
    });
  }

  const startSynthTime = new Date().getTime();
  // NEW: run synth and the deferred bundling side by side
  const [stackCollection] = await Promise.all([
    this.selectStacksForDeploy(options.selector, options.exclusively, options.cacheCloudAssembly),
    options.assetParallelism ? lambdaNodeJs.Bundling.bundleInParallel() : Promise.resolve(),
  ]);
  const elapsedSynthTime = new Date().getTime() - startSynthTime;
  print('\n✨ Synthesis time: %ss\n', formatTime(elapsedSynthTime));
  // ...
}
You may have noticed that bundleInParallel relies on a hash map of cached bundling functions stored in the static property staged. This map would be populated when the @aws-cdk/AssetStaging class invokes the tryBundle function returned by Bundling.getLocalBundlingProvider. When invoked by AssetStaging, tryBundle captures the arguments passed to it in a new closure and caches the resulting function in staged. Example:
// @aws-cdk/aws-lambda-nodejs/lib/bundling.ts@Bundling
private getLocalBundlingProvider(): cdk.ILocalBundling {
  const osPlatform = os.platform();
  const createLocalCommand = (outputDir: string, esbuild: PackageInstallation, tsc?: PackageInstallation) => this.createBundlingCommand({
    inputDir: this.projectRoot,
    outputDir,
    esbuildRunner: esbuild.isLocal ? this.packageManager.runBinCommand('esbuild') : 'esbuild',
    tscRunner: tsc && (tsc.isLocal ? this.packageManager.runBinCommand('tsc') : 'tsc'),
    osPlatform,
  });
  const environment = this.props.environment ?? {};
  const cwd = this.projectRoot;

  // NEW: we've pulled the `tryBundle` definition out of the return and added an options argument
  const tryBundle = (outputDir: string, options?: { staging: boolean }): boolean => {
    // NEW: when invoked by AssetStaging, i.e. without `options.staging === false`, cache input to a new closure
    // (parentheses are required: `&&` and `??` cannot be mixed without them)
    const parallelize = Bundling.parallelBundling && (options?.staging ?? true);
    if (parallelize) {
      const tempFile = path.join(outputDir, '.staged');
      // Required, as AssetStaging asserts `outputDir` is not empty after bundling
      fs.appendFileSync(tempFile, 'staged for parallel build');
      // Caching done here
      Bundling.staged[outputDir] = {
        tryBundling: () => tryBundle(outputDir, { staging: false }),
        inputDir: this.projectRoot,
        tempFile,
      };
      // Allow synth to continue uninterrupted
      return true;
    }

    // Bundle as normal if either `Bundling.parallelBundling === false` or `options.staging === false`
    if (!Bundling.esbuildInstallation) {
      process.stderr.write('esbuild cannot run locally. Switching to Docker bundling.\n');
      return false;
    }

    if (!Bundling.esbuildInstallation.version.startsWith(`${ESBUILD_MAJOR_VERSION}.`)) {
      throw new Error(`Expected esbuild version ${ESBUILD_MAJOR_VERSION}.x but got ${Bundling.esbuildInstallation.version}`);
    }

    const localCommand = createLocalCommand(outputDir, Bundling.esbuildInstallation, Bundling.tscInstallation);

    exec(
      osPlatform === 'win32' ? 'cmd' : 'bash',
      [
        osPlatform === 'win32' ? '/c' : '-c',
        localCommand,
      ],
      {
        env: { ...process.env, ...environment },
        stdio: [ // show output
          'ignore', // ignore stdin
          process.stderr, // redirect stdout to stderr
          'inherit', // inherit stderr
        ],
        cwd,
        windowsVerbatimArguments: osPlatform === 'win32',
      },
    );

    return true;
  };

  return {
    tryBundle,
  };
}
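The "stage now, bundle later" mechanism can be isolated from the CDK entirely, which may make it easier to discuss. The following is a minimal sketch and not the module's code: `DeferredBundler`, `makeTryBundle`, and `runStaged` are hypothetical names, and the real esbuild invocation is replaced by a callback.

```typescript
// Hypothetical, CDK-free sketch of deferring work captured during staging.
type BundleFn = (outputDir: string) => boolean;

class DeferredBundler {
  // When true, calls made during staging are cached instead of executed.
  public static parallelBundling = false;

  // Deferred bundling closures, keyed by output directory.
  private static staged = new Map<string, () => boolean>();

  // Wraps a real bundling function so that staging-time calls can be deferred.
  public static makeTryBundle(bundle: BundleFn) {
    const tryBundle = (outputDir: string, options?: { staging: boolean }): boolean => {
      const parallelize = DeferredBundler.parallelBundling && (options?.staging ?? true);
      if (parallelize) {
        // Capture the arguments in a closure that bypasses deferral when replayed
        DeferredBundler.staged.set(outputDir, () => tryBundle(outputDir, { staging: false }));
        return true; // report success so the caller can continue
      }
      return bundle(outputDir);
    };
    return tryBundle;
  }

  // Runs every deferred closure and returns the output directories that failed.
  public static runStaged(): string[] {
    const failed: string[] = [];
    DeferredBundler.staged.forEach((run, outputDir) => {
      if (!run()) {
        failed.push(outputDir);
      }
    });
    DeferredBundler.staged.clear();
    return failed;
  }
}
```

In this sketch, a call made while `parallelBundling` is true records the closure and returns immediately; the actual work only happens when `runStaged` is called, which is the role `bundleInParallel` plays in the proposal.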
Other Information
I may be able to implement this feature and open a PR, but would like feedback on alternatives to the above approach. Specifically, is there a preferred way to integrate optional behavior like this with the toolkit? I assume that coupling a top-level CLI command so directly to one specific construct is a non-starter. However, I'd be happy to be wrong about that as well.
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change
CDK version used
2.67.0
Environment details (OS name and version, etc.)
N/A