Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node.js image to safely embed user content #9

Merged
merged 3 commits into from
Sep 2, 2023
Merged

Node.js image to safely embed user content #9

merged 3 commits into from
Sep 2, 2023

Conversation

Murderlon
Copy link
Member

Initial checklist

  • I read the support docs
  • I read the contributing guide
  • I agree to follow the code of conduct
  • I searched issues and couldn’t find anything (or linked relevant results below)
  • If applicable, I’ve added docs and tests

📑 👉 Read the rendered RFC

@Murderlon Murderlon requested a review from a team July 6, 2023 19:22
@wooorm
Copy link
Member

wooorm commented Jul 17, 2023

Scope: rehypejs/rehype-github (or new repository)

Think this should be a new repo. I can ping the camomile folks again if this is going somewhere. They were mostly open to it, just AFK a lot. @unifiedjs/camomile could work (first) too.

compatible with unified plugins

and a unified plugin to rewrite image URLs to use this proxy if necessary.

Only creating a unified plugin compatible with go-camo.

Think https://github.com/rehypejs/rehype-github/tree/main/packages/image is all that is needed? Which exists (unpublished?).
While this exists, good to look through it, to see if that’s all still the secure mechanism.

to safely embed user content on the web.

Images particularly, right? Which formats?

However, there is no Node.js equivalent

Good to mention the deprecated https://github.com/zendesk/camo.
And that GH still has a maintained fork privately, internally.

It’s in the ecosystem’s and the web’s best interest to create awareness and solutions for security problems when authoring user content.

🔥

A bare bones Node.js server (no framework).

Good to mention it will work with fastify, express, etc.

funds are used to kick start substantial work which otherwise may not have happened.

Yes.
I do think it is important that exciting new work is more fun than maintaining things. And that funds should also go to maintenance. But that’s a bit off topic!


Can you estimate how much work (without plugin, as it exists?)
I’m definitely up for 500 for this, that sounds super reasonable.
1k likely too, but depends a bit on how much work it would be. What the supported files would be?

I’d also say other additions, like resizing images, is worth investigating?

@wooorm
Copy link
Member

wooorm commented Jul 17, 2023

@ChristianMurphy @remcohaszing what do you think about this?

@ChristianMurphy
Copy link
Member

Unified has put a lot of effort in providing similar markdown support as GitHub. The latest project, rehype-github, takes this even further in order to support processing user content for safe use on the web, just as GitHub does.

I appreciate and support initiatives that increase feature compatibility with CM and GFM/GH 👍

There is a missing piece in safely authoring user content: images.

An HTTPS page that includes content fetched using cleartext HTTP is called a mixed content page. Pages like this are only partially encrypted, leaving the unencrypted content accessible to sniffers and man-in-the-middle attackers. — MDN

Developers aware of this problem, who are using Go or are willing to run a Go server, can use the well maintained go-camo to solve this. However, there is no Node.js equivalent, nor is there a plug-and-play solution for processing markdown which takes this problem into account.

Case in point: a new Node.js HTTP proxy to route images through SSL, and a unified plugin to rewrite image URLs to use this proxy if necessary.

I also appreciate the aim of increasing security.
I'd add some nuance that this doesn't eliminate the risk of MITM attacks or sniffers, that can still happen between the source server and the proxy server.
But generally, it should ensure more hops between the source and the browser are encrypted.

A bare bones Node.js server (no framework). The reason for this is that we can create a handle function which can be integrated in any Node.js framework or even a front-end framework like Next.js. This is what I’ve worked on for @tus/server, as you can see in the examples.

Makes sense

A new plugin in [rehypejs/rehype-github][] to rewrite image URLs to use this proxy if necessary.

How would this work?
Would the server/proxy provide a stable hash/encoding that the rehype plugin could solve offline? (if so would the plugin and the server need a shared versioning scheme?)
Or would this work online, with the rehype plugin making network calls to the proxy, looking up what the URL should be?
Or something else entirely?

Good to mention it will work with fastify, express, etc.

Would integrations with common webservers be included in the documentation this would generate?

Out of scope (for now): metrics endpoint for usage data, ...

Metrics/analytics feels like something express, fastify, etc would provide? Would we want it to be in scope ever?

Out of scope (for now): ... filtering rules.

Agreed this is out of scope currently.
It could be interesting but filtering, especially if it goes into moderation filtering, is both important and a massive/complex problem.

I’d also say other additions, like resizing images, is worth investigating?

Resizing and thumbnail generation for images could also be worth exploring. 👍
Though that can be rather complex, and could risk running well past the estimates in this RFC.

Unified has put a lot of effort in providing similar markdown support as GitHub.

With the goal appearing to be matching GitHub, would this include a new Recipe/Guide on https://unifiedjs.com/learn/ showing how to get the most GH like unified + remark + rehype + camomile setup?

With implementation, tests, and docs, this could take around three full days (optimistically)

This estimate does feel optimistic/underestimated to me just with what is listed.
Is code review and discussion time also included in the estimate?

In short: between $500 and $1,000 from Open Collective
It’s more an attempt at a fair flat fee.

If you feel comfortable with the estimate and the flat rate, I'm fine with this.
There is a balance to be struck between not overextending the limited open collective budget, and not overextending you and your time.
It feels like there is some risk of the latter, but you would be a better judge of that.


Overall I support the idea.
My concerns/considerations mostly center on how we would balance budget and scope for this effort.

@Murderlon
Copy link
Member Author

Murderlon commented Aug 3, 2023

Think https://github.com/rehypejs/rehype-github/tree/main/packages/image is all that is needed? Which exists (unpublished?).
While this exists, good to look through it, to see if that’s all still the secure mechanism.

That's great! I forgot about this one again.

Images particularly, right? Which formats?

I think we can do most (all?) image mime types?

Good to mention it will work with fastify, express, etc.

I think it states so clearly already: "The reason for this is that we can create a handle function which can be integrated in any Node.js framework or even a front-end framework like Next.js."

I’d also say other additions, like resizing images, is worth investigating?

I think it's definitely interesting, but leaving it out for MVP

How would this work?
Would the server/proxy provide a stable hash/encoding that the rehype plugin could solve offline? (if so would the plugin and the server need a shared versioning scheme?)
Or would this work online, with the rehype plugin making network calls to the proxy, looking up what the URL should be?
Or something else entirely?
A client requests a page from the web app.

  1. The original URL in the content is parsed.
  2. An HMAC signature of the url is generated.
  3. The url and hmac are encoded.
  4. The encoded url and hmac are placed into the expected format, creating the signed url.
  5. The signed url replaces the original image URL.
  6. The web app returns the content to the client.
  7. The client requests the signed url from camo(mile).
  8. camo(mile) validates the HMAC, decodes the URL, then requests the content from the origin server and streams it to the client.

Would integrations with common webservers be included in the documentation this would generate?

I'm not sure what you mean with generate, but I would write docs for all popular server frameworks yes.

With the goal appearing to be matching GitHub, would this include a new Recipe/Guide on https://unifiedjs.com/learn/ showing how to get the most GH like unified + remark + rehype + camomile setup?

That's a good idea, yes!


About the time concerns, the 3 full days I wrote was just for the server, I didn't write that clearly. I think with inspiration on existing solutions that should be doable. Code review will be more async and perhaps over a longer period of time, but that is fine and I didn't take that into account for the estimate. Those small iterations I'll gladly do out of estimate/budget.

@wooorm
Copy link
Member

wooorm commented Aug 3, 2023

Perhaps important to note that these are completely async from each other:

rehype plugin:

  1. The original URL in the content is parsed.
  2. An HMAC signature of the url is generated.
  3. The url and hmac are encoded.
  4. The encoded url and hmac are placed into the expected format, creating the signed url.
  5. The signed url replaces the original image URL.
  6. The web app returns the content to the client.

server:

  1. The client requests the signed url from camo(mile).
  2. camo(mile) validates the HMAC, decodes the URL, then requests the content from the origin server and streams it to the client.

Looks like we all think this is a good thing to have, and are willing to fund it.
a) when could you work on this, @Murderlon?
b) I’d recommend guessing a bit longer, a day or two, for the estimate
c) how do y’all feel about money?
d) depending on a), I’ll ping the chamomile owners again
e) when all is ready, I’ll take some time in october to update rehype-github-com and release it

@remcohaszing
Copy link
Member

lgtm

Re resizing: In my experience resizing images is pretty simple using sharp. Although this does require a native Node binding. I understand this doesn’t have to be part of the MVP.

c) how do y’all feel about money?

Seems fair.

@wooorm
Copy link
Member

wooorm commented Aug 17, 2023

@Murderlon What do you think about some of my Qs above? Particularly timeframe I was just wondering about

@Murderlon
Copy link
Member Author

a) when could you work on this, @Murderlon?

I could do the week of 4-8 September.

b) I’d recommend guessing a bit longer, a day or two, for the estimate

I can make that week available for it.

c) how do y’all feel about money?

No strong opinions on this. I gave a range and I'd say it's up to the team to decide where it lands. We can also revisit that after it's done.

e) when all is ready, I’ll take some time in october to update rehype-github-com and release it

👍

@wooorm
Copy link
Member

wooorm commented Aug 25, 2023

OK, how about this: you suggested between 500 and 1000, for 3 days as a price/time you can and are interested in doing this.
I think it’s pretty impossible to put an exact number to how much this is worth in OSS.
So I’d say, meet in the middle of what you asked and that’s good: $750.
I think it’s a reasonable price per hour for OSS.

Christian and I also thought that it might take a bit longer, and you mentioned that you could do a whole week.
So how about, if you find that you’re taking 4 or 5 days, we extend it to $1k total?
How does that sound?

Even if it isn’t exactly finished I imagine it’s in a good enough place for me or others (and perhaps you in evening hours and such) to get it to a release. I’m not worried about that.

Aside: Also, it’s good to mention that we’re not getting much money in OC (incl GH sponsors) anymore. About 1k per month. At the current burn rate, it’s empty after December. I calculate that I need October to get the whole majors thing done. And it’s nice to maybe have some time in nov/dec for other improvements/new things. But I’ll be looking for other gigs for nov/later. If someone else has similar ideas to this RFC, do open a similar one!

TL;DR: 👍 750 for 3 days. 👍 1k for 4/5 days. 4-8 September. I’ll be available in DMs! I’d just say open source early and often and I’m sure folks will be happy to review.

If anyone has objections, voice them in the next 72 hours, or it’s accepted!

@ChristianMurphy ChristianMurphy requested a review from a team August 25, 2023 20:41
@Murderlon
Copy link
Member Author

Sounds good!

Small update on the week I'll work on it: I'd like to do 11-15 September instead of 4-8 so I can be on the safe side with wrapping something up for Transloadit which needs to be out sooner rather than later.

@wooorm
Copy link
Member

wooorm commented Aug 29, 2023

@Murderlon can you update the PR with what’s accepted, so that it can be merged?

@Murderlon
Copy link
Member Author

@Murderlon can you update the PR with what’s accepted, so that it can be merged?

I made some small changes. I didn't put in the dates as that's irrelevant, and the written amount of €500-€1000 also corresponds to the decision.

@wooorm wooorm changed the title Node.js image proxy & unified plugin to safely embed user content Node.js image to safely embed user content Sep 2, 2023
@wooorm wooorm merged commit 20c9310 into main Sep 2, 2023
@wooorm wooorm deleted the image-proxy branch September 2, 2023 09:41
@wooorm
Copy link
Member

wooorm commented Sep 2, 2023

Cool 😎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants