DataVec: End-to-end examples of transforms #60

Open
@crockpotveggies

This issue has been migrated from deeplearning4j/deeplearning4j#5213

Original author @AlexDBlack


https://github.com/deeplearning4j/DataVec/issues/355

Issue Description

As a total newcomer to DataVec transform processes, I found it very difficult to work out how to build a full transform pipeline. Apart from @tomthetrainer's easy-to-follow video, I was lost on how to work with CSV data properly.

This issue is to address the need for an example that does the following:

  1. Works with a complex CSV containing different column types (categorical, integers, doubles, strings).
  2. Shows how to use custom conditions and transforms to replace null values.
  3. Shows how to do advanced transformations, including running custom code on string values (for example, the CSV contains human-entered text and we want to classify it before passing it further down the pipeline).
  4. Shows how to save the result to a Hadoop map file, then load it into an MLP for training.
  5. Shows how to define schemas for complex CSVs (what happens when you have 5,000 columns!?).
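To make items 1, 2, and 5 more concrete, here is a minimal, library-independent sketch in plain Java of what such an example would need to do. It deliberately does not use the DataVec API. The class name `CsvTransformSketch`, the three-column layout (color, count, price), the category values, and the default values are all hypothetical, chosen only for illustration. It also assumes the CSV has no quoted fields or embedded commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvTransformSketch {

    // Hypothetical category values for the first column.
    static final List<String> COLORS = Arrays.asList("red", "green", "blue");

    // Split one CSV line. The -1 limit keeps trailing empty fields,
    // so a missing last value still yields a column.
    static String[] parseLine(String line) {
        return line.split(",", -1);
    }

    // Replace a missing (null, empty, or blank) value with a default.
    static String replaceMissing(String value, String defaultValue) {
        return (value == null || value.trim().isEmpty()) ? defaultValue : value.trim();
    }

    // Convert a row of (color: categorical, count: integer, price: double)
    // into numeric features: category index, count, price.
    static double[] transform(String[] fields) {
        String color = replaceMissing(fields[0], "red");
        int count = Integer.parseInt(replaceMissing(fields[1], "0"));
        double price = Double.parseDouble(replaceMissing(fields[2], "0.0"));
        int colorIndex = COLORS.indexOf(color);
        if (colorIndex < 0) {
            throw new IllegalArgumentException("Unknown category: " + color);
        }
        return new double[] { colorIndex, count, price };
    }

    // For very wide CSVs, generate column names in a loop
    // instead of typing thousands of them by hand.
    static List<String> columnNames(String prefix, int n) {
        List<String> names = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            names.add(prefix + i);
        }
        return names;
    }

    public static void main(String[] args) {
        // The count column is empty here and falls back to the default "0".
        double[] row = transform(parseLine("green,,2.5"));
        System.out.println(Arrays.toString(row));                   // prints [1.0, 0.0, 2.5]
        System.out.println(columnNames("feature_", 5000).size());   // prints 5000
    }
}
```

A real example would express the same steps through DataVec's schema and transform process builders. The point of the sketch is only to show the shape of the work: typed columns, a rule for missing values, and a loop for generating wide schemas.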

Version Information

Please indicate relevant versions, including, if relevant:

Master, current, etc.

Contributing

I'm very happy to contribute here. First, I'd like to find out whether there is anything else this example should cover before continuing.
