Create DataFrames From Java Collections #88

Open
@dgunning

Java developers need an easy way to create DataFrames from in-memory Java Collections. This will make Morpheus much more suitable for generic Java development.

I am proposing a class called ListSource or CollectionSource that would be able to create a DataFrame from a List of Lists. For example, let's say you read a table from a Word document:

    XWPFDocument document = new XWPFDocument(stream);
    XWPFTable table = document.getTables().get(0);

and you convert the table to a list of Iterables (or Lists):

 List<Iterable<XWPFTableCell>> tableData =
                    table.getRows().stream()
                    .map( XWPFTableRow::getTableCells).collect(Collectors.toList());

You could then create a DataFrame as follows:

  DataFrame<Integer,String> data = new ListSource<XWPFTableCell>()
           .read(options ->{
                options.setData( tableData );
                options.setConverter( XWPFTableCell::getText );
            });

Generally, a lot of data in Java can be converted to Lists of Lists, so this feature would make Morpheus much more widely applicable.
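To make the role of the converter concrete, here is a minimal sketch in plain Java of the conversion step such a source might perform before building the frame. It is not Morpheus code: the class `CollectionTableSketch` and the method `convert` are hypothetical names, and the sketch stops at a List of String rows rather than constructing a DataFrame.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, NOT part of the Morpheus API: applies a caller-supplied
// converter to every cell of a List of Iterables and returns rows of Strings.
class CollectionTableSketch {

    static <T> List<List<String>> convert(List<? extends Iterable<T>> data,
                                          Function<? super T, String> converter) {
        List<List<String>> rows = new ArrayList<>();
        for (Iterable<T> source : data) {
            List<String> row = new ArrayList<>();
            for (T cell : source) {
                row.add(converter.apply(cell)); // e.g. XWPFTableCell::getText
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<Integer>> data = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4));
        // prints [[v1, v2], [v3, v4]]
        System.out.println(convert(data, i -> "v" + i));
    }
}
```

Accepting `List<? extends Iterable<T>>` means both a `List<List<T>>` and the `List<Iterable<XWPFTableCell>>` from the example above could be passed without copying.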

Note that the current Morpheus API allows the following:

        final Array<String> columns = Array.ofIterable( rows.get(0).getTableCells().stream()
          .map( XWPFTableCell::getText ).collect(toList()));

        return DataFrame.ofObjects(
                Range.of(1, rows.size()).toArray(),
                columns,
                value -> rows.get( value.rowOrdinal()+1).getTableCells().get(value.colOrdinal()).getText());

but that was trickier to get right because of the long method chains and the +1 offset needed to skip the header row (row 0) when looking up each value.
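One way to avoid that offset, independent of Morpheus, is to split the header from the body once so that later lookups use the row ordinal directly. The sketch below assumes the table has already been converted to Strings and has at least a header row; `HeaderSplitSketch` and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of the Morpheus API: separates the header row
// from the data rows so that callers never add 1 to a row index.
class HeaderSplitSketch {

    // The first row holds the column names (assumes rows is non-empty).
    static List<String> header(List<List<String>> rows) {
        return rows.get(0);
    }

    // A view of every row after the header; index 0 is the first data row.
    static List<List<String>> body(List<List<String>> rows) {
        return rows.subList(1, rows.size());
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("name", "qty"),
                Arrays.asList("apple", "3"),
                Arrays.asList("pear", "5"));
        List<List<String>> data = body(rows);
        // Row ordinal 0, column ordinal 1: prints 3
        System.out.println(data.get(0).get(1));
    }
}
```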
