📥 Fix embedding raw code with remove-input flag #1937

Merged
fwkoch merged 8 commits into main from agoose77/fix-embed-code
Jun 27, 2025

Conversation

agoose77
Contributor

Fixes #1932

changeset-bot bot commented Mar 31, 2025

🦋 Changeset detected

Latest commit: 819191e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
| Name | Type |
| --- | --- |
| myst-cli | Patch |
| mystmd | Patch |
| myst-migrate | Patch |

@agoose77 agoose77 requested a review from fwkoch March 31, 2025 08:54
@agoose77 agoose77 changed the title 📥 Support embedding raw code with remove-input flag 📥 Fix embedding raw code with remove-input flag Mar 31, 2025
Collaborator

@choldgraf choldgraf left a comment

I added some brief documentation to clarify this for future reference.

This looks fine to merge to me, assuming the logic you're following to detect a notebook cell is correct. My one suggestion would be to add a test to make sure that this is behaving as expected, but I don't think it's strictly required if you are confident that this fixes the bug.

Comment on lines 52 to 57
return !(
n.type === 'code' &&
parent?.type === 'block' &&
parent?.kind === NotebookCell.code
);
Contributor Author

The previous logic had a special case for code nodes that were tagged as outputs. Now we want to be even stricter: only code nodes that are children of code cells and are not tagged as outputs should be removed.
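
For future readers, here is a minimal sketch of the stricter rule described above, combining the conditions quoted in this thread. It is not the actual embed transform: `SketchNode`, `NOTEBOOK_CODE_KIND`, `isNotebookInput`, and `removeInput` are hypothetical names, and the string used for the code-cell kind is an assumed stand-in for `NotebookCell.code`.

```typescript
// Hypothetical minimal node shape; the real GenericNode type carries more fields.
type SketchNode = {
  type: string;
  kind?: string;
  value?: string;
  data?: { type?: string };
  children?: SketchNode[];
};

// Assumed stand-in for NotebookCell.code; the real enum value is not shown in this thread.
const NOTEBOOK_CODE_KIND = 'notebook-code';

// True when a node is notebook input: a code node that sits directly inside
// a code-cell block and has not been tagged as an output.
function isNotebookInput(n: SketchNode, parent?: SketchNode): boolean {
  return (
    n.type === 'code' &&
    n.data?.type !== 'output' &&
    parent?.type === 'block' &&
    parent?.kind === NOTEBOOK_CODE_KIND
  );
}

// Parent-aware filter: returns a copy of the tree without notebook input nodes.
function removeInput(node: SketchNode, parent?: SketchNode): SketchNode | undefined {
  if (isNotebookInput(node, parent)) return undefined;
  if (!node.children) return { ...node };
  const children = node.children
    .map((child) => removeInput(child, node))
    .filter((child): child is SketchNode => child !== undefined);
  return { ...node, children };
}

const tree: SketchNode = {
  type: 'root',
  children: [
    {
      // A notebook code cell: its input should be dropped, its output kept.
      type: 'block',
      kind: NOTEBOOK_CODE_KIND,
      children: [
        { type: 'code', value: 'print(1)' },
        { type: 'output', children: [] },
      ],
    },
    {
      // A plain block holding code (e.g. from a directive): this code should survive.
      type: 'block',
      children: [{ type: 'code', value: 'echo hi' }],
    },
  ],
};

const result = removeInput(tree);
```

The second block illustrates the case from the linked issue: code that does not come from a notebook cell is no longer treated as input.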

// should be removed
return !(
n.type === 'code' &&
n.data?.type !== 'output' &&
Contributor Author

@agoose77 agoose77 Apr 4, 2025

@fwkoch I don't think I need this test for n.data.type actually, because reduceOutputs is always called after this transform runs. Shouldn't this mean we'll never see reduced outputs here?

Collaborator

I think you are correct that reduceOutputs always comes after this. However, I think we should just leave this check in place. At worst, it's just a no-op, but, at best, it would allow tagging nodes as an output in contexts other than the standard Jupyter Notebook processing (e.g. a custom transform plugin).

@@ -35,13 +36,28 @@ function mutateEmbedNode(
const { url, dataUrl, targetFile, sourceFile } = opts ?? {};
if (targetNode && node['remove-output']) {
targetNode = filter(targetNode, (n: GenericNode) => {
// After reduction, 'output' nodes may be replaced by their children which are then tagged as outputs
return n.type !== 'output' && n.data?.type !== 'output';
Contributor Author

See lower comment about reduceOutputs
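
A rough sketch of why the `remove-output` predicate checks both `n.type` and `n.data?.type`: it has to match an `output` node as well as a child that was tagged as an output after reduction. The node shape, `filterTree`, and the example trees are hypothetical and only illustrate the two-part check quoted in the hunk above.

```typescript
// Hypothetical minimal node shape; the real GenericNode type carries more fields.
type SketchNode = {
  type: string;
  value?: string;
  data?: { type?: string };
  children?: SketchNode[];
};

// Keep a node unless it is an output node or has been tagged as one.
function keepWhenRemovingOutput(n: SketchNode): boolean {
  return n.type !== 'output' && n.data?.type !== 'output';
}

// Simple recursive filter standing in for the tree filter used by the transform.
function filterTree(node: SketchNode, keep: (n: SketchNode) => boolean): SketchNode {
  if (!node.children) return { ...node };
  return {
    ...node,
    children: node.children.filter(keep).map((child) => filterTree(child, keep)),
  };
}

// Before reduction: the output is still wrapped in an 'output' node.
const beforeReduction: SketchNode = {
  type: 'block',
  children: [
    { type: 'code', value: 'plot()' },
    { type: 'output', children: [{ type: 'image' }] },
  ],
};

// After reduction (assumed shape): the child replaces the wrapper and is tagged as an output.
const afterReduction: SketchNode = {
  type: 'block',
  children: [
    { type: 'code', value: 'plot()' },
    { type: 'image', data: { type: 'output' } },
  ],
};

const strippedBefore = filterTree(beforeReduction, keepWhenRemovingOutput);
const strippedAfter = filterTree(afterReduction, keepWhenRemovingOutput);
```

In both cases only the `code` node is left, so the predicate gives the same result whether or not reduction has already run.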

Comment on lines +22 to +28
"exports": [
{
"format": "md",
"filename": "index.md",
"url": "/index-52e2bc751b3bb2cb9ed25191a3951cf9.md"
}
]
Contributor Author

Auxiliary test changes

@agoose77 agoose77 force-pushed the agoose77/fix-embed-code branch from 69f24df to f8cb8ec Compare April 4, 2025 13:39
Contributor Author

agoose77 commented Apr 9, 2025

@fwkoch I am not 100% clear on the data.input tagging: where do we set this, and is it something we care about in the embed transform? Or does it happen later? The same goes for the data.output tag!

When you get a moment, could you take this over the line and add in-line comments to explain for future reference?

@choldgraf choldgraf added the bug Something isn't working label May 22, 2025
@agoose77 agoose77 moved this from In Progress to Blocked on review in Jupyter Book and MyST Team Priorities May 22, 2025
@fwkoch fwkoch force-pushed the agoose77/fix-embed-code branch from f8cb8ec to 819191e Compare June 27, 2025 18:51
Collaborator

@fwkoch fwkoch left a comment

Ok, finally took the time to understand the purpose of this PR and what it is fixing. The scope is actually quite small, simple, and safe; it just makes remove-input a bit stricter so it's actually "remove input" rather than "remove code." Easy to reproduce and see the improvement.

@agoose77 - Regarding your question about data.input, I don't see this anywhere, either in this PR or in reduceOutputs or anything, so I don't think we need to worry about it? We could add it in as a way for users to tag specific, non-code nodes as input? But that's a separate feature. Regarding output tags - I made a comment, but I think it doesn't hurt to just leave as-is.

Also, I noticed an edge case: if you use ![](...) to embed a notebook code cell that has not been executed (i.e. just input, no output), the input will still be removed, leaving nothing. I think this is fine, though, especially with the sentence added to the docs explicitly stating that notebook input will be removed.
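
The edge case can be illustrated with a small sketch. It is an assumption-laden toy rather than the real transform: `SketchNode`, `NOTEBOOK_CODE_KIND`, and `stripInput` are hypothetical, and the kind string is an assumed stand-in for `NotebookCell.code`.

```typescript
// Hypothetical minimal node shape; the real GenericNode type carries more fields.
type SketchNode = {
  type: string;
  kind?: string;
  value?: string;
  data?: { type?: string };
  children?: SketchNode[];
};

// Assumed stand-in for NotebookCell.code.
const NOTEBOOK_CODE_KIND = 'notebook-code';

// Drop input code from a single cell block, mirroring the stricter remove-input rule.
function stripInput(cell: SketchNode): SketchNode {
  const isCodeCell = cell.type === 'block' && cell.kind === NOTEBOOK_CODE_KIND;
  const children = (cell.children ?? []).filter(
    (n) => !(isCodeCell && n.type === 'code' && n.data?.type !== 'output'),
  );
  return { ...cell, children };
}

// An unexecuted cell has only input, so nothing is left once the input is removed.
const unexecuted: SketchNode = {
  type: 'block',
  kind: NOTEBOOK_CODE_KIND,
  children: [{ type: 'code', value: 'x = 1' }],
};

// An executed cell keeps its output.
const executed: SketchNode = {
  type: 'block',
  kind: NOTEBOOK_CODE_KIND,
  children: [
    { type: 'code', value: 'x = 1' },
    { type: 'output', children: [] },
  ],
};

const emptied = stripInput(unexecuted);
const kept = stripInput(executed);
```

Here `emptied` ends up with no children, which matches the "leaving nothing" behaviour described above.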

Sorry this review took 3 months 😬

@@ -60,7 +60,9 @@ For example, the following references the admonitions list in [](admonitions.md)

### The `![](#embed)` short-hand

-The embedding markdown shorthand lets you quickly embed content using the Markdown image syntax (see more about [images](./figures.md)).
+The embedding shorthand lets you embed content using the Markdown image syntax (see more about [images](./figures.md)).
+This **removes the input cell** if you are [embedding from a Jupyter notebook](./reuse-jupyter-outputs.md).
Collaborator

👍

@fwkoch fwkoch merged commit 7b844a9 into main Jun 27, 2025
7 checks passed
@fwkoch fwkoch deleted the agoose77/fix-embed-code branch June 27, 2025 19:04
Labels
bug Something isn't working
Projects
Development

Successfully merging this pull request may close these issues.

Embed mechanism loses code-blocks from directives
3 participants