Open
Description
Hi,
we have a scenario that need to use the map/reduce function in spark.net,
For example, we want to call
public IEnumerable<object[]> MapCallback(IEnumerable<Row> input)
{
// do something with `IEnumerable<Row> input`
}
df.Rdd.MapPartitions(MapCallback, true)
The thing is, we need this IEnumerable input so that we can do some operation on the row level.
In Mobius, we can access all the Rdd-related APIs, but according to this issue seems all the Rdd-related APIs are no longer accessible.
So we have the following questions:
- Is there any API in current Spark.Net that can implement the function that exactly equivalent to Rdd.Map, Rdd.Reduce and other mapreduce related function? Note that we need to deal with a IEnumerable with arbitrary number of elements in one row, i.e., we may not know how many elements (columns) in a row until runtime.
- If the answer of 1 is false, can we just download the source code, change the visibility of Rdd-related APIs to public, and build a private bits to use?
- Any other related suggestions will be really appreciated.
Looking forward to your answer! Thanks a lot!
Metadata
Metadata
Assignees
Labels
No labels