
Commit d35cf2d

docs update and getting ready for release
1 parent c210d56 commit d35cf2d


6 files changed: +73 -4 lines changed


README.md

-1
@@ -340,7 +340,6 @@ Head over to the tutorials and notebooks listed above for more examples.
 * Basic Data manipulation and analysis operations:
     - Different kinds of join operations
     - Dataframe/vector merge (left, right, inner, outer)
-    - Creation of correlation, covariance matrices
     - Verification of data in a vector
     - DF concat
 * Option to express a DataFrame as an NMatrix or MDArray so as to use more efficient storage techniques.

lib/daru.rb

+1 -1
@@ -3,6 +3,7 @@ def jruby?
 end
 
 require 'csv'
+require 'matrix'
 require 'securerandom'
 
 require 'daru/index.rb'
@@ -12,4 +13,3 @@ def jruby?
 require 'daru/monkeys.rb'
 
 require 'daru/core/group_by.rb'
-

lib/daru/accessors/nmatrix_wrapper.rb

+5 -1
@@ -1,4 +1,8 @@
-require 'nmatrix' unless jruby?
+begin
+  require 'nmatrix' unless jruby?
+rescue LoadError => e
+  puts "Please install the nmatrix gem for fast and efficient data storage."
+end
 
 module Daru
   module Accessors
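
The hunk above turns `nmatrix` into an optional dependency by rescuing `LoadError`. A minimal plain-Ruby sketch of that pattern follows. The helper `optional_require` and its `hint:` parameter are hypothetical names used only for illustration and are not part of Daru. Unlike the commit, which prints with `puts`, this sketch sends the hint to standard error with `warn`.

```ruby
# Hypothetical helper (not Daru API): try to load an optional library and
# report whether it is available instead of letting LoadError propagate.
def optional_require(name, hint: nil)
  require name
  true
rescue LoadError
  # Assumption: the hint goes to stderr; the commit itself uses puts.
  warn(hint) if hint
  false
end

# 'json' ships with Ruby, so it should load; the second name is made up.
HAS_JSON = optional_require('json')
HAS_MISSING = optional_require('surely_not_installed_gem_xyz')
```

Returning a boolean lets calling code branch on availability, for example to choose a slower pure-Ruby storage backend when the fast one is absent.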

lib/daru/core/group_by.rb

+8
@@ -30,6 +30,14 @@ def size
         Daru::Vector.new(values, index: index, name: :size)
       end
 
+      def first
+        head(1)
+      end
+
+      def last
+        tail(1)
+      end
+
       def head quantity=5
         select_groups_from :first, quantity
       end
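
The new `first` and `last` methods simply delegate to `head(1)` and `tail(1)`. The sketch below illustrates that delegation with a hypothetical `MiniGroups` class built on a plain Hash. It is not `Daru::Core::GroupBy`, and the default of 5 for `tail` is an assumption copied from the `head` signature shown in the diff.

```ruby
# Hypothetical stand-in for a grouped collection; not Daru::Core::GroupBy.
class MiniGroups
  # groups: { group_key => [rows...] }
  def initialize(groups)
    @groups = groups
  end

  # First `quantity` rows of every group.
  def head(quantity = 5)
    @groups.transform_values { |rows| rows.first(quantity) }
  end

  # Last `quantity` rows of every group (default is an assumption).
  def tail(quantity = 5)
    @groups.transform_values { |rows| rows.last(quantity) }
  end

  # Same delegation as in the commit: one row from each end.
  def first
    head(1)
  end

  def last
    tail(1)
  end
end

GROUPS = MiniGroups.new('foo' => [1, 2, 3], 'bar' => [4, 5])
```

With this sketch, `GROUPS.first` keeps one leading row per group and `GROUPS.last` keeps one trailing row per group.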

lib/daru/dataframe.rb

+41 -1
@@ -513,13 +513,53 @@ def sort vector_order, opts={}
 
     # Pivots a data frame on specified vectors and applies an aggregate function
     # to quickly generate a summary.
+    #
+    # == Options
+    #
+    # +:index+ - Keys to group by on the pivot table row index. Pass vector names
+    # contained in an Array.
+    #
+    # +:vectors+ - Keys to group by on the pivot table column index. Pass vector
+    # names contained in an Array.
+    #
+    # +:agg+ - Function to aggregate the grouped values. Default to *:mean*. Can
+    # use any of the statistics functions applicable on Vectors that can be found in
+    # the Daru::Statistics::Vector module.
+    #
+    # +:values+ - Columns to aggregate. Will consider all numeric columns not
+    # specified in *:index* or *:vectors*. Optional.
+    #
+    # == Usage
+    #
+    #   df = Daru::DataFrame.new({
+    #     a: ['foo' , 'foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
+    #     b: ['one' , 'one', 'one', 'two', 'two', 'one', 'one', 'two', 'two'],
+    #     c: ['small','large','large','small','small','large','small','large','small'],
+    #     d: [1,2,2,3,3,4,5,6,7],
+    #     e: [2,4,4,6,6,8,10,12,14]
+    #   })
+    #   df.pivot_table(index: [:a], vectors: [:b], agg: :sum, values: :e)
+    #
+    #   #=>
+    #   # #<Daru::DataFrame:88342020 @name = 08cdaf4e-b154-4186-9084-e76dd191b2c9 @size = 2>
+    #   #            [:e, :one] [:e, :two]
+    #   #     [:bar]         18         26
+    #   #     [:foo]         10         12
     def pivot_table opts={}
       raise ArgumentError, "Specify grouping index" if !opts[:index] or opts[:index].empty?
 
       index = opts[:index]
       vectors = opts[:vectors] || []
       aggregate_function = opts[:agg] || :mean
-      values = opts[:values] ? [opts[:values]] : ((@vectors.to_a - (index | vectors)) & numeric_vectors)
+      values =
+        if opts[:values].is_a?(Symbol)
+          [opts[:values]]
+        elsif opts[:values].is_a?(Array)
+          opts[:values]
+        else # nil
+          (@vectors.to_a - (index | vectors)) & numeric_vectors
+        end
+
       raise IndexError, "No numeric vectors to aggregate" if values.empty?
 
       grouped = group_by(index)
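
The hunk above documents `pivot_table` and lets `:values` be a Symbol, an Array, or omitted. The following plain-Ruby sketch mirrors those three branches and reproduces the sums from the usage comment without Daru. The names `normalize_values` and `pivot_sum` are hypothetical, the sketch assumes String group keys, and only a sum aggregate is shown, whereas the real method defaults to `:mean`.

```ruby
# Plain-Ruby sketch; none of these names are Daru API.
ROWS = {
  a: %w[foo foo foo foo foo bar bar bar bar],
  b: %w[one one one two two one one two two],
  c: %w[small large large small small large small large small],
  d: [1, 2, 2, 3, 3, 4, 5, 6, 7],
  e: [2, 4, 4, 6, 6, 8, 10, 12, 14]
}.freeze

# Mirrors the Symbol / Array / nil branches added in this commit.
def normalize_values(values, all_columns, index, vectors, numeric)
  case values
  when Symbol then [values]
  when Array  then values
  else (all_columns - (index | vectors)) & numeric
  end
end

# Hypothetical helper: sums each value column per (index key, vector key).
# Assumes the grouping columns hold Strings so they can become Symbols.
def pivot_sum(columns, index:, vectors: [], values: nil)
  numeric = columns.select { |_, col| col.all? { |v| v.is_a?(Numeric) } }.keys
  targets = normalize_values(values, columns.keys, index, vectors, numeric)
  raise IndexError, 'No numeric vectors to aggregate' if targets.empty?

  result = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  columns.values.first.size.times do |i|
    row_key = index.map { |name| columns[name][i].to_sym }
    col_suffix = vectors.map { |name| columns[name][i].to_sym }
    targets.each do |value_name|
      result[row_key][[value_name, *col_suffix]] += columns[value_name][i]
    end
  end
  result
end

PIVOT = pivot_sum(ROWS, index: [:a], vectors: [:b], values: :e)
```

Here `PIVOT[[:bar]][[:e, :one]]` is 18 and `PIVOT[[:foo]][[:e, :two]]` is 12, matching the table in the usage comment. Passing `values: [:d, :e]` or omitting `values:` exercises the Array and nil branches.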

spec/fixtures/sales-funnel.csv

+18
@@ -0,0 +1,18 @@
+Account,Name,Rep,Manager,Product,Quantity,Price,Status
+714466,Trantow-Barrows,Craig Booker,Debra Henley,CPU,1,30000,presented
+714466,Trantow-Barrows,Craig Booker,Debra Henley,Software,1,10000,presented
+714466,Trantow-Barrows,Craig Booker,Debra Henley,Maintenance,2,5000,pending
+737550,"Fritsch, Russel and Anderson",Craig Booker,Debra Henley,CPU,1,35000,declined
+146832,Kiehn-Spinka,Daniel Hilton,Debra Henley,CPU,2,65000,won
+218895,Kulas Inc,Daniel Hilton,Debra Henley,CPU,2,40000,pending
+218895,Kulas Inc,Daniel Hilton,Debra Henley,Software,1,10000,presented
+412290,Jerde-Hilpert,John Smith,Debra Henley,Maintenance,2,5000,pending
+740150,Barton LLC,John Smith,Debra Henley,CPU,1,35000,declined
+141962,Herman LLC,Cedric Moss,Fred Anderson,CPU,2,65000,won
+163416,Purdy-Kunde,Cedric Moss,Fred Anderson,CPU,1,30000,presented
+239344,Stokes LLC,Cedric Moss,Fred Anderson,Maintenance,1,5000,pending
+239344,Stokes LLC,Cedric Moss,Fred Anderson,Software,1,10000,presented
+307599,"Kassulke, Ondricka and Metz",Wendy Yule,Fred Anderson,Maintenance,3,7000,won
+688981,Keeling LLC,Wendy Yule,Fred Anderson,CPU,5,100000,won
+729833,Koepp Ltd,Wendy Yule,Fred Anderson,CPU,2,65000,declined
+729833,Koepp Ltd,Wendy Yule,Fred Anderson,Monitor,2,5000,presented
