Commit 8dbe724

finish removing outliers function
1 parent adf82a7 commit 8dbe724

1 file changed: +7 -1 lines changed

utilities/__init__.py

@@ -69,10 +69,16 @@ def interpolate_missing_data(data, real, discrete):
 
     return data
 
-def remove_outliers(data):
+def remove_outliers(data, real):
     """Remove outliers from data and return as a pandas data frame."""
 
+    # get field mean and std for real-valued fields
+    mean = data.describe().iloc[1, :]
+    std = data.describe().iloc[2, :]
+
     # remove outliers
+    for (real, mean, std) in zip(real, mean, std):
+        data = data[data[real] < 3*std + mean]
 
     return data
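A few things in the hunk above may be worth a second look. `data.describe()` reports statistics for every numeric column and the rows are picked by position, so `zip(real, mean, std)` pairs each name with the right statistics only if `real` lists all numeric columns in column order. The loop variables also rebind `real`, `mean`, and `std`, and the filter drops only values above the mean. The following is a minimal sketch of one alternative, not the author's implementation. It assumes `real` is a list of column names, and `n_std` is a hypothetical parameter introduced here for illustration.

```python
import pandas as pd


def remove_outliers(data, real, n_std=3.0):
    """Return the rows of `data` with no outliers in the real-valued fields.

    Assumptions for this sketch: `real` is a list of column names, and
    `n_std` (hypothetical, not in the original commit) is the cutoff in
    standard deviations.
    """
    # Compute statistics by label so each value lines up with its own column.
    mean = data[real].mean()
    std = data[real].std()

    # Absolute distance from the column mean covers both tails.
    deviation = (data[real] - mean).abs()
    within = deviation.le(n_std * std, axis="columns")

    # Keep missing values; a comparison against NaN would otherwise drop the row.
    keep = within | data[real].isna()
    return data[keep.all(axis=1)]
```

Computing the mask once from the original data means every field is judged against the same statistics, rather than against rows that earlier loop iterations have already filtered.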
