-
Notifications
You must be signed in to change notification settings - Fork 378
Add short-circuiting left and right fold #224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This article about making function composition stack safe could be related (the stack issue might be "fixed"). What would variation using thunks looks like? |
#### `shortFoldl` | ||
|
||
```hs | ||
shortFoldl :: Foldable f => f a ~> ((b, a) -> {done :: Boolean, value :: b}) ~> b -> b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider using []
/[x]
rather than {done: false}
/{done: true, value: x}
? I find it a more elegant way to fake the Maybe a
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I did indeed. I can see the appeal. But I choose an object for two reasons:
- Array are more expensive to create. I.e. it would lower performance.
- In Jabz I use an Either to signal when to stop. That is even nicer than a tuple IMO. But fortunately, with the current specification an Either implementation can include a
done
property which it istrue
forleft
andfalse
forright
. Then the specification will be directly compatible with that Either implementation. With thatfind
could be implemented like this:
export function find<A>(f: (a: A) => boolean, t: Foldable<A>): Maybe<A> {
return t.shortFoldl((acc, a) => f(a) ? left(just(a)) : right(acc), nothing);
}
I think this PR is awesome. On the details of how to represent the alternative to the thunk, I'd much prefer an explicit function representation. I love @safareli's implementation & article Disclaimer: I've not looked at the diff yet, only read the PR description |
Thank you for the link. I'll take a look at it 😄 The thunk based versions that I experimented with can be found here. There are two variants. One uses a plain function as a thunk ( In either case I think a version using thunks would perform worse because a new function would have to be created on each invocation to the folding function. But I'm not sure. When I tried to benchmark it I blew the stack 😅 |
How does this differ from |
### Foldable | ||
|
||
1. `u.reduce` is equivalent to `u.reduce((acc, x) => acc.concat([x]), []).reduce` | ||
1. `u` is equivalent to `u.foldl((acc, x) => acc.concat([x]), [])` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. This doesn't work. I'll have to think about how to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be: u.foldl
is equivalent to u.foldl((acc, x) => acc.concat([x]), []).foldl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u.foldl((acc, x) => acc.concat([x]), [])
evaluates to an array so the last foldl
should be reduce
.
This should be equivalent to the old law:
u.foldl
is equivalent to u.foldl((acc, x) => acc.concat([x]), []).reduce
.
Not having to remain with in the HKT seems the most obvious difference to me. That is, the ability to fold to Integer instead of |
I don't fully understand ChainRec. But I think this is very different. My understand is that Examples of things that are foldable but cannot implement |
In JS we have well defined Iteration protocol, which could be used to write efficient My question is: The "gap" of iterating over some structure is already "filled" in JS and why should FL redefine it (and also merge it to The main difference I see between If we were taking to add short circuiting support to Foldable structures in some strict and statically typed environment like purescript, then I think something like this should be done instead of "upgrading" Foldable. Update: |
I'm sorry, I phrased my question really poorly. What I meant to ask is: why don't we make this a separate algebra rather than changing |
@safareli By that logic, we should just remove As I mentioned in the original description of the PR this is about bridging the gap and making our Foldable as powerful as the Iterator protocol.
Since
IMO we should ideally have many more. Haskell's Foldable typeclass has 16 methods. As it currently stands Fantasy Lands Foldable specification has very little utility. All functions that can be derived from it will perform as poorly as they would on a cons-list. For most data structures that is far below optimal and thus the derived functions are prohibitively expensive. I explain this in greater detail in my blog post here. |
I must not be understanding your proposal. I still don't see how this differs from Every As I understand the proposal, it would mean data types like The proposal sounds great for recursive data types (whether infinite or not). I definitely think we should add this stuff. But I don't think it's worth the burden to every data type. My hesitation comes from having defined the interface for |
I think you do 😄 But I think we're looking at this a bit differently. I will try to do a better job at explaining my point of view.
Then I think that
Because if we have both a In my opinion, there is a very big difference between "
Since @foldable
class MyFoldable {
constructor(...) { ... }
foldr(initial, f) { ... }
} This gets us the best of both worlds:
Again, a foldable that only includes
I think that problem is confined to PureScript. Does it not support default method implementations like Haskell? Default method implementations ensure that type classes can include additional methods with no downsides. The deriving mechanism in Jabz achieves the same thing and it can easily be adapted to Fantasy Land. |
@paldepind @joneshf why not change |
I'm sorry. I don't want to be a blocker here. I just have remorse over what happened with If others think it's a good addition, let's add it! |
what? |
|
I don't think that's the real issue. The real issue is that PureScript does not support default method implementations. If it had that feature the problem would be solved. Foldable has a lot of useful derived methods. But if implementations are not given the opportunity to implement these themselves in an efficient way they are pretty useless in practice. Haskell's Foldable typeclass has 16 methods and it's no problem since they support default methods implementations. In JavaScript we can get the same thing with mixins. |
In any case, like you said, this isn't PureScript. So, we don't have the same problems as they do. |
That's a good idea, but it is a superclass of Traversable so we need to have it.
Why do we need to make Foldable more powerful, when the power we want already exists in Iterators? you can just reuse bunch of iterator based functions, instead of redefining them. "This isn't PureScript", this is javascript and Iterators are pretty common and I don't think we we should reinvent what already is present in javascript. |
This PR primarily adds a short-circuiting left and right fold to the Foldable specification. The changes proposed are adapted from my experiments in Jabz.
The problem
Fantasy Land currently specifies only a strict left fold. This has the following disadvantages:
It finds the first element in a Foldable that satisfies the predicate. But, it will always traverse the entire Foldable. Ideally, we would like the function to stop as soon as a matching element is found.
The solution
This PR adds two additional methods to the Foldable specification. The first is named
shortFoldl
and the secondshortFoldr
. They are added along with two laws and a description of their behavior. The functions are similar to normal folds with the difference that the folding function must return an object of the type:The
value
property is what would typically be returned only and thedone
property indicates whether or not the fold should stop. Given this the abovefind
function could be implemented like this:This implementation of
find
short-circuits as soon as a satisfying value is found. Furthermore, it will also work for infinite data-structures.shortFoldr
is added so that functions likefindLast
can also be implemented with optimal performance.Other changes
This PR also adds a strict right fold to the Foldable specification. The new 2. law ensures that this right fold behaves symmetric to the left fold. The law is a bit tricky I think. But I've convinced myself that it is correct with the following concrete example
About the names
Since this is a breaking change anyway I have decided to rename
reduce
tofoldl
. If renamingreduce
is controversial I suggest this alternative naming:But I prefer the current naming for the following reasons:
foldl
is explicit about the direction of the fold whilereduce
is not.foldl
andfoldr
has a nice symmetry that reflects the actual symmetry of the functions.Why not use thunks
In Jabz I also experimented with versions of
foldr
andfoldl
made lazy by using thunks. However forcing a thunk must be in the form of a function invocation and this resulted in the lazyfoldr
andfoldl
not being stack safe (i.e. they would overflow the stack for large data-structures).Final remarks
I'd love to hear what you all think of these changes.