When the very first command to a REPL is "^D" (EOF), all subsequent REPL calls are ignored until control is given back to a REPL at a lower depth. This makes it possible to quit jaq when running something like `recurse(.) | repl`.
Whoops, this breaks compilation with Rust 1.65: I now get lots of errors. I can address some of these errors by:

    -fn fold_run<'a, D: DataT, T: Clone + 'a>(
    +fn fold_run<'a, V: 'a, D: DataT<V<'a> = V>, T: Clone + 'a>(
    -fn fold_update<'a, D: DataT>(
    +fn fold_update<'a, V: 'a, D: DataT<V<'a> = V>>(

Now there are only a few remaining. Any ideas?

EDIT: I found that the errors above persist up to and including Rust 1.68 and disappear starting from Rust 1.69.
A little note: We currently have many type signatures of the following form:

    pub fn funs<D: DataT>() -> impl Iterator<Item = Filter<Native<D>>>
    where
        for<'a> D::V<'a>: ValT,
    { ... }

This is a bit clunky. With associated type bounds, we could write this more compactly as follows, but that would increase MSRV to 1.79:

    pub trait DataTx: for<'a> DataT<V<'a>: ValT> {}
    impl<T: DataT> DataTx for T where for<'a> T::V<'a>: ValT {}
    pub fn funs<D: DataTx>() -> impl Iterator<Item = Filter<Native<D>>> { ... }

It's probably not a big deal for now, the whole added
This branch could actually do with 1.69 everywhere, but because we will require 1.70 for CBOR support eventually (due to `ciborium_ll` depending on `half`), we do it right away.
path/1 and value types with lifetimes
path/1 and data with lifetimes
This PR makes two large changes to jaq's core:

- Support for the `path(f)` filter. This is a user-facing enhancement which should not impact the execution of previously available filters.
- Short-lived value types and arbitrary data for native filters.

This PR changes the API of `jaq-core` in a backwards-incompatible way; therefore, it will be part of jaq 3.0.

path/1

The support for `path/1` makes several things possible in jaq; in particular, path-based updates à la `jq`! However, the execution of `f |= g` will keep using non-path-based updates due to their greater performance and resistance to iterator invalidation problems.

Still, it is possible to define `def update(f; g): reduce path(f) as $p (.; getpath($p) |= g);` and have `update(f; g)` in jaq do mostly the same thing as `f |= g` in jq. Be aware, however, that this does not attempt to work around iterator invalidation issues the same way as `jq` does; to avoid these issues, I advise you to use jaq's `f |= g`, which is more robust.

Design
Before implementing this, I made a small experiment to roughly estimate the performance overhead of changing jaq such that `path(...)` execution and normal execution would share the same code. For that, I modified the `range` function (jaq/jaq-std/src/lib.rs, line 391 in 8c5131b) and measured each variant with:

    $ hyperfine "target/release/jaq -n '[range(10000000)] | length'"

`Box::new(range(Ok(from), to, by))` (original):

    Time (mean ± σ): 418.0 ms ± 1.4 ms [User: 381.0 ms, System: 34.3 ms]
    Range (min … max): 415.7 ms … 420.4 ms 10 runs

`Box::new(range(Ok(from), to, by).map(|v| (v, None as Option<String>)).map(|(v, s)| v))`:

    Time (mean ± σ): 469.1 ms ± 4.7 ms [User: 432.0 ms, System: 34.5 ms]
    Range (min … max): 460.3 ms … 475.9 ms 10 runs

`Box::new(range(Ok(from), to, by).map(|v| (v, Some(String::new()))).map(|(v, s)| v))`:

    Time (mean ± σ): 472.4 ms ± 3.0 ms [User: 434.6 ms, System: 35.0 ms]
    Range (min … max): 469.2 ms … 476.5 ms 10 runs
In a nutshell, calculating paths everywhere, even when you do not need them, costs about 12% of performance overhead (469 ms versus 418 ms). That is too much overhead for me to accept.
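To make the shape of this experiment concrete, here is a minimal sketch that does not use jaq at all. It assumes a plain integer range in place of jaq's `range`, and the names `plain`, `with_dummy_path` and `overhead_percent` are made up for illustration. The second function mimics the modification above by pairing every value with an unused "path" slot and dropping it again. The last function reproduces the overhead arithmetic.

```rust
/// Baseline: a boxed iterator over plain values.
fn plain(n: u64) -> Box<dyn Iterator<Item = u64>> {
    Box::new(0..n)
}

/// Variant: attach a dummy "path" slot to every value, then drop it again.
/// This mirrors the extra `.map(...).map(...)` pair from the experiment.
fn with_dummy_path(n: u64) -> Box<dyn Iterator<Item = u64>> {
    Box::new((0..n).map(|v| (v, None::<String>)).map(|(v, _path)| v))
}

/// Relative overhead of `variant_ms` over `baseline_ms`, in percent.
fn overhead_percent(baseline_ms: f64, variant_ms: f64) -> f64 {
    (variant_ms / baseline_ms - 1.0) * 100.0
}
```

Both iterators yield the same values; only the work per element differs. With the mean times reported above, `overhead_percent(418.0, 469.1)` is roughly 12.2 and `overhead_percent(418.0, 472.4)` is roughly 13.0. Note that a compiler may optimise this toy version differently from the real `range` filter, so it illustrates the setup rather than reproducing the measurement.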
Therefore, I opted for a design where executing `path(f)` uses code different from the normal execution code (which would be called when just executing `f` on its own). That entails some code duplication, but it's not that bad. I tried to share as much code as possible between the path and normal execution code, even if that required giving some helper functions even more complex function signatures. As a side effect, this makes these functions more difficult to misuse, even if it also makes them harder to understand.

Compatibility
There is currently a small difference between the `path(f)` semantics of `jaq` and `jq`: In jaq, if a subexpression of `f` is executed and returns a non-path value, then an error is thrown, whereas in jq, an error is only thrown if such a non-path value is actually returned from `f`.

To see the difference: `jq -n 'path([] | empty)'` returns no output, whereas `jaq -n 'path([] | empty)'` returns an error.

This difference could be eliminated, but it would cost some performance and memory, because instead of returning a path `RcList<V>` (paths are implemented as linked lists), we would need to return an `Option<RcList<V>>` everywhere. That would also make the implementation a bit more awkward and the API more complex.
Given that I suppose that few people use such paths, I do not think it is worth the effort. But if you are concerned, feel free to speak up.
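The following toy model sketches both the design (separate code for normal and path execution) and the compatibility difference. It is not jaq's implementation: the `Val` and `Filter` types, the function names, the error message, and the use of `Vec<usize>` instead of `RcList<V>` are all assumptions made for illustration. `run_paths` fails as soon as a non-path subexpression is executed, as described for jaq, while `run_paths_lazy` carries an `Option<Path>` and leaves the check to whoever consumes the output, as described for jq.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Val {
    Num(i64),
    Arr(Vec<Val>),
}

/// Toy filter language: `.`, `.[]`, `empty`, a literal, and `l | r`.
enum Filter {
    Id,
    Iter,
    Empty,
    Lit(Val),
    Pipe(Box<Filter>, Box<Filter>),
}

/// Stand-in for a path (assumption: a plain vector of array indices).
type Path = Vec<usize>;

/// Normal execution: only values flow through the filter.
/// Simplification: `.[]` on a number yields nothing instead of an error.
fn run(f: &Filter, v: Val) -> Vec<Val> {
    match f {
        Filter::Id => vec![v],
        Filter::Iter => match v {
            Val::Arr(a) => a,
            Val::Num(_) => vec![],
        },
        Filter::Empty => vec![],
        Filter::Lit(x) => vec![x.clone()],
        Filter::Pipe(l, r) => run(l, v).into_iter().flat_map(|x| run(r, x)).collect(),
    }
}

/// Path execution, eager: executing a non-path subexpression fails at once.
fn run_paths(f: &Filter, (v, p): (Val, Path)) -> Result<Vec<(Val, Path)>, String> {
    match f {
        Filter::Id => Ok(vec![(v, p)]),
        Filter::Iter => Ok(match v {
            Val::Arr(a) => a
                .into_iter()
                .enumerate()
                .map(|(i, x)| {
                    let mut q = p.clone();
                    q.push(i);
                    (x, q)
                })
                .collect(),
            Val::Num(_) => vec![],
        }),
        Filter::Empty => Ok(vec![]),
        Filter::Lit(_) => Err("not a path expression".to_string()),
        Filter::Pipe(l, r) => {
            let mut out = Vec::new();
            for vp in run_paths(l, (v, p))? {
                out.extend(run_paths(r, vp)?);
            }
            Ok(out)
        }
    }
}

/// Path execution, lazy: a value may carry no path (`None`).
/// Whether that is an error is decided only for values that are output.
/// How `.[]` should treat a missing path is not covered here; it is propagated.
fn run_paths_lazy(f: &Filter, (v, p): (Val, Option<Path>)) -> Vec<(Val, Option<Path>)> {
    match f {
        Filter::Id => vec![(v, p)],
        Filter::Iter => match v {
            Val::Arr(a) => a
                .into_iter()
                .enumerate()
                .map(|(i, x)| {
                    let q = p.clone().map(|mut q| {
                        q.push(i);
                        q
                    });
                    (x, q)
                })
                .collect(),
            Val::Num(_) => vec![],
        },
        Filter::Empty => vec![],
        Filter::Lit(x) => vec![(x.clone(), None)],
        Filter::Pipe(l, r) => run_paths_lazy(l, (v, p))
            .into_iter()
            .flat_map(|vp| run_paths_lazy(r, vp))
            .collect(),
    }
}
```

For the filter `[] | empty`, modelled as `Pipe(Lit(Arr(vec![])), Empty)`, the eager variant returns an error because the literal is executed, whereas the lazy variant returns no outputs and therefore never encounters a value without a path. The price of the lazy variant is visible in the types: every intermediate result carries an `Option`.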
Short-lived value types & arbitrary data for native filters
The new `DataT` trait makes it possible to use value types that have a lifetime unknown at the time of filter compilation. For example, if you wanted to treat a value type `Val<'a>` where `'a` is the lifetime of data that was loaded after filter compilation, you were out of luck. Now, this is supported.

Furthermore, the `DataT` trait also allows passing arbitrary data to native filters. Previously, the `inputs` filter enjoyed some special treatment, because it was the only native filter that could obtain some kind of "global" data. At the same time, this also implied that even if one did not want to provide an implementation of `inputs`, it was still necessary to pass the data necessary for `inputs` when executing a filter. This was cumbersome and felt unclean.

The new machinery generalises the mechanism previously available for `inputs`. This makes the core of jaq completely unaware of side effects and makes it possible to realise variations of jaq that are completely pure! It is also possible to go the other direction, namely integration of more complex side effects than previously possible. For example, this paves the path towards resolving #144.

The `DataT` trait uses GATs, which are available from Rust 1.65. However, early Rust versions supporting this feature were quite limited in their type inference, as I had to find out the hard way. Therefore, I increased MSRV to 1.69, which is the first version that can compile the code without serious adaptations.
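As a rough illustration of the GAT idea, the sketch below builds a "filter" before any data exists and later runs it on values that borrow from data loaded afterwards. The traits here are heavily simplified stand-ins that only share their names with jaq's `DataT` and `ValT`; the method `size`, the types `Borrowed` and `MinSize`, and the function `demo` are hypothetical and not part of jaq's API.

```rust
/// Simplified stand-in for `DataT`: selects a value type for every lifetime `'a`.
trait DataT {
    type V<'a>;
}

/// Simplified stand-in for `ValT` with a single hypothetical method.
trait ValT {
    fn size(&self) -> usize;
}

/// Marker type whose values are string slices borrowing from loaded text.
struct Borrowed;

impl DataT for Borrowed {
    type V<'a> = &'a str;
}

impl<'a> ValT for &'a str {
    fn size(&self) -> usize {
        self.len()
    }
}

/// A "compiled filter" that does not mention any value lifetime.
struct MinSize(usize);

impl MinSize {
    /// Keep only values whose size is at least the threshold.
    /// The lifetime `'a` is chosen by the caller, after `MinSize` was built.
    fn run<'a, D: DataT>(&self, vals: Vec<D::V<'a>>) -> Vec<D::V<'a>>
    where
        for<'b> D::V<'b>: ValT,
    {
        vals.into_iter().filter(|v| v.size() >= self.0).collect()
    }
}

/// Build the filter first, load data afterwards, then run on borrowed values.
fn demo() -> Vec<String> {
    let filter = MinSize(3); // "compilation" happens before any data exists
    let loaded = String::from("a bcd ef ghij"); // data loaded afterwards
    let vals: Vec<&str> = loaded.split(' ').collect();
    filter
        .run::<Borrowed>(vals)
        .into_iter()
        .map(String::from)
        .collect()
}
```

The `for<'b> D::V<'b>: ValT` clause has the same shape as the bound quoted in the comments of this PR: it states that the value type implements `ValT` for every lifetime, which is what lets `MinSize` stay free of lifetime parameters.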