Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable use outside of node.js environments #186

Closed
wants to merge 2 commits into from

Conversation

ianthetechie
Copy link


Here's the reason for this change 🚀

I'm experimenting with running parts of the stack in different environments (ex: WebAssembly components) where filesystem access via the node.js APIs is either not possible or undesirable.

If this isn't in line with Pelias' goals, that's probably fine and I'll just maintain a fork, but it seems like a reasonable change that'd be useful to others, so here's a PR with the changes :)


Here's what actually got changed 👏

  • Removes all fs and path imports.
  • Introduces scripts which output JavaScript modules rather than JSON. The rationale for this is that it's already going to get loaded into JS objects anyways. This moves a few steps to pre-processing. Some transformation (into the inverted index) still happens at runtime.
  • Possibly controversial: I removed the regexes from the "file names" and just spell them out. I also just lazily chomp off the .txt for now to get the key. I can clean up / change this. It's just in a pragmatically working with minimal changes sort of state for now.
  • **Custom loaders ⚠️ **: I may be missing something, but it looks like these are never used? Was this something from the past that just never got removed?

Here's how others can test the changes 👀

  • Unit tests are unchanged and are passing locally.
  • I ran the format fixer, but pre-commit hooks just stall for me FWIW 🤔

NOTE: I have not tested for any other second order effects like memory usage due to node require caching. (And I'm not 100% sure what I'd look for here to be honest; open to any feedback as you guys are much more into the node ecosystem than I.)

Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thank you for using the project and your contribution.
I think we cannot merge this, it making the workflow less easy to maintain. We use txt to get a better view when the data is updated (see #132 for example, which was a attempt to update WOF data before the unification)

For each data sources (libpostal and WOF) we have 3 levels:

  1. Source data you can find in resources/{libpostal,whosonfirst}
  2. Overrided data by pelias you can find in resources/pelias/{libpostal,whosonfirst} (see Add pelias specific resources for libpostal & WOF #15)
  3. Overrided custom data by us and git ignored (that's why it's empty and you think it's not used) you can find in resources/custom/{libpostal,whosonfirst} (see Add custom resources directory which is git ignored #27)

I think if you want the same behaviour, it should not be in the git repository but can be generated before publishing on npm? 🤔

@missinglink
Copy link
Member

Hi @ianthetechie, thanks for opening the PR, unfortunately I agree with Jones.

Its not clear what benefit would be gained by this change, if you are able to demonstrate a significant improvement in performance or memory usage by running it in WASM then we might consider adopting some of those changes but running the code outside of nodejs isn't a goal of the project.

@missinglink
Copy link
Member

I just wanted to add that while 'everything is a file' resource method is incompatible with bundlers, it allows users to take a stock docker image and 'bind mount' files inside the container, this is a convenient method of allowing users to customize/extend Pelias without having to build & maintain their own docker images.

@ianthetechie
Copy link
Author

Understood; no worries.

@missinglink
Copy link
Member

Have a look at https://github.com/isp-nexus/mailwoman, Teffen reached out to me about modernizing this codebase & converting to ESM modules, I said it would be better as a fork, it might be easier for you to convert to WASM.

I'd actually love to have all the Pelias code be more modern (such as my newer modules like https://github.com/missinglink/s2js) but it just seems like a lot of busy work for not a lot of gain, instead I think we'll just slowly chip away at making it more modern with ES6 features and not worry about it being CJS.

@ianthetechie
Copy link
Author

Yeah makes sense.

While performance is a consideration, better interop and modern tooling was the main motivation for this. And a hard dependency on Node APIs was the thing preventing it.

For example:

  • Bundling to allow use in a browser. Or React Native or whatever that's still JS but not Node.
  • Embedding V8 in non JS native projects to bring in JS modules.
  • In the "extreme," WASM components. They have clearly defined imports and exports and really unlock cross language interoperability in some exciting new ways! I think it's possible to shim the node APIs with WASI preloads. It's just a lot easier with "data as code modules."

I'll probably write up a blog post on some of these ideas later 😂 it's still early days for a lot of the tooling in the WASM space.

Anyways, for now it's easy to maintain a minimal fork to make my integrations easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants