Closed
Description
I found this behavior surprising and undocumented. Take the following malformed CSV (the first two data rows have fewer fields than the header):
sepal_length,sepal_width,petal_length,petal_width,label,id,test_train
5.8,4,1.2,0.2,15
4.8,3,1.4,
4.3,3,1.1,0.1,Iris-setosa,14,train
Using the mappify function will produce the following:
{:sepal_length "5.8", :sepal_width "4", :petal_length "1.2", :petal_width "0.2", :label "15"}
{:sepal_length "4.8", :sepal_width "3", :petal_length "1.4", :petal_width ""}
{:sepal_length "4.3", :sepal_width "3", :petal_length "1.1", :petal_width "0.1", :label "Iris-setosa", :id "14", :test_train "train"}
As you can see, some rows are shorter than others: the keys for the missing columns are absent from the mappified results altogether. I was expecting all rows to have the same keys, with nil values wherever something is missing from the CSV.
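To make the expectation concrete, here is a minimal sketch of the padding behavior I have in mind. It assumes the rows are already parsed into vectors of strings with the header as the first row. The name `mappify-padded` is hypothetical and not part of the library, and this is not necessarily how the actual fix is implemented.

```clojure
;; Hypothetical helper, not part of the library: builds one map per data row
;; and fills in nil for every header column the row does not reach.
(defn mappify-padded
  [rows]
  (let [header (mapv keyword (first rows))
        ;; Template map with every header key set to nil.
        blank  (zipmap header (repeat nil))]
    (map (fn [row]
           ;; zipmap stops at the shorter of header and row, so short rows
           ;; only override the leading keys; the rest keep their nil value.
           (merge blank (zipmap header row)))
         (rest rows))))

;; The parsed rows from the CSV above.
(def sample-rows
  [["sepal_length" "sepal_width" "petal_length" "petal_width" "label" "id" "test_train"]
   ["5.8" "4" "1.2" "0.2" "15"]
   ["4.8" "3" "1.4" ""]
   ["4.3" "3" "1.1" "0.1" "Iris-setosa" "14" "train"]])

(def padded (mappify-padded sample-rows))
```

With this sketch every map in `padded` carries all seven keys, and the short rows get nil for `:id` and `:test_train` (and for `:label` in the second row).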
Bear in mind that using {:structs true} will produce the expected results:
{:sepal_length "5.8", :sepal_width "4", :petal_length "1.2", :petal_width "0.2", :label "15", :id nil, :test_train nil}
{:sepal_length "4.8", :sepal_width "3", :petal_length "1.4", :petal_width "", :label nil, :id nil, :test_train nil}
{:sepal_length "4.3", :sepal_width "3", :petal_length "1.1", :petal_width "0.1", :label "Iris-setosa", :id "14", :test_train "train"}
I have some other issues using structs but I will probably open another issue when I can get a reproducible environment.
I'll open a pull request with the fix I have made for this.