Skip to content

Introduces O(lg(n)) file retrieval and several API changes #49

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sgg
Copy link
Contributor

@sgg sgg commented Feb 24, 2020

Summary

  • Bumps version from 0.5.1-alpha.0 to 0.6.0-alpha.0 due to backwards incompatible API changes.
  • File retrieval can now be done in log time rather than linear time.
  • Nodes of the filesystem tree are now represented with a DirEntry sum-type over File and Dir to simplify the logic

Implementing sub-linear retrieval while File and Dir types were separate added a lot of undue complexity to both the macro generation and runtime lookup code so I opted to introduce DirEntry across the board instead.

I did my best to minimize the backwards incompatible changes wherever possible and to provide helper functionality to aid ergonomics and ease the migration.

I am also happy to write up a Migration Guide if we feel that's prudent!

Additions

  • Adds a file_name attribute to include_dir::Dir and include_dir::File
  • std::convert::TryFrom implementations for DirEntry to Option<File>,
    Option<Dir>, as well as the corresponding reference types to make conversion to the inner types easy.
  • DirEntry::is_dir and DirEntry::is_file allow one to check if a DirEntry is a particular variant.
  • Logarithmic file-retrieval is accomplished by sorting the filesystem entries at compile time and then performing a
    binary search on retrieval.

API Changes

  • Dir::files now returns an iterator over files (impl Iterator<Item=&File<'_>>) instead of a slice of owned Files
  • Dir::dirs now returns an iterator over dirs (impl Iterator<Item=&Dir<'_>>) instead of a slice of owned Dir
  • include_dir::globs::Globs now takes and returns &DirEntry references to avoid copies.

Copying

There were a handful of places where we were performing copies where we now return references as copying large
amounts of data could potentially be expensive. DirEntry, File, and Diir still implement Copy so callers
are free to make copies where it makes sense for their use case, however copying will not be done within
include_dir's business logic.

Misc

  • Updates code formatting as needed
  • Updates tests and documentation accordingly
  • Updates some macro generation error messages to be more descriptive

**Summary**

* Bumps version from `0.5.1-alpha.0` to `0.6.0-alpha.0` due to backwards incompatible API changes.
* File retrieval can now be done in log time rather than linear time.
* Nodes of the filesystem tree are now represented with a `DirEntry` sum-type over `File` and `Dir` to simplify the logic

Implementing sub-linear retrieval while `File` and `Dir` types were separate added a lot of undue complexity to
both the macro generation and runtime lookup code so I opted to introduce `DirEntry` across the board instead.
I did my best to minimize the backwards incompatible changes wherever possible and to provide helper functionality
to aid ergonomics and ease the migration.

I am also happy to write up a Migration Guide if we feel that's prudent!

**Additions**

* Adds a `file_name` attribute to `include_dir::Dir` and `include_dir::File`
* `std::convert::From` implementations for `DirEntry` to `Option<File>`,
  `Option<Dir>`, as well as the corresponding reference types to make conversion to the inner types easy.
* `DirEntry::is_dir` and `DirEntry::is_file` allow one to check if a DirEntry is a particular variant.
* Logarithmic file-retrieval is accomplished by sorting the filesystem entries at compile time and then performing a
  binary search on retrieval.

**API Changes**

* `Dir::files` now returns an iterator over files (`impl Iterator<Item=&File<'_>>`) instead of a slice of owned `Files`
* `Dir::dirs` now returns an iterator over dirs (`impl Iterator<Item=&Dir<'_>>`) instead of a slice of owned `Dir`
* `include_dir::globs::Globs` now takes and returns `&DirEntry` references to avoid copies.

**Copying**

There were a handful of places where we were performing copies where we now return references as copying large
amounts of data could potentially be expensive. `DirEntry`, `File`, and `Diir` still _implement_ `Copy` so callers
are free to make copies where it makes sense for their use case, however copying will not be done _within_
`include_dir`'s business logic.

**Misc**

* Updates code formatting as needed
* Updates tests and documentation accordingly
* Updates some macro generation error messages to be more descriptive
@sgg sgg requested a review from Michael-F-Bryan February 24, 2020 21:11
Copy link
Owner

@Michael-F-Bryan Michael-F-Bryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a couple questions and small changes we might want to make but, other than the next point, I'm okay with this PR.

The big point of contention I have is around changing the include_dir!() macro's return type to DirEntry. My thoughts are:

  • The DirEntry type is intended more as an implementation detail that gets exposed when traversing the directory tree. For example the std analogue, std::fs::DirEntry, barely gets used outside of std::fs::read_dir().
  • You use a DirEntry to represent something which is either a Dir or a File, but the only thing you can create with the include_dir!() macro is a directory tree. If I wanted to store just a file I'd use std::include_bytes!().
  • This crate is called include_dir, not include_dir_entry 😜


let dir = Dir::from_disk(&path, &path).expect("Couldn't load the directory");
let entry = DirEntry::from_disk(&path, &path)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on... why are we changing the return type of include_dir!() from a Dir to a DirEntry?

The main point of the include_dir!() macro is to embed a directory into your binary. I would also consider DirEntry as an implementation detail that gets exposed when traversing the directory tree. It's the include_dir analogue of std::fs::DirEntry.

@sgg
Copy link
Contributor Author

sgg commented Feb 25, 2020

Thanks for the prompt review @Michael-F-Bryan!

I'll go through the feedback and make some changes. I spend most of my time writing against nightly and I straight up forgot TryFrom and const fn were stabilized! 😆

This crate is called include_dir, not include_dir_entry 😜

Hahaha, fair point. So the one tricky thing here is that if I use Dir as the entrypoint the traversal algo gets a bit messier (this is why I introduced DirEntry).

  1. Would it be alright if I introduce new type corresponds to the overall in-mem filessytem tree?
  2. If yes, any preference for a name? Could call it DirTree, Filesystem, Fs, IncludedDir, etc?

@sgg
Copy link
Contributor Author

sgg commented Feb 28, 2020

@Michael-F-Bryan please take a look whenever you get a chance. I think the only open question is regarding the entrypoint.

@sgg sgg changed the title Changes version to 0.6.0 and adds O(lg(n)) file retrieval Introduces O(lg(n)) file retrieval and several API changes Feb 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants