'''
Comparison of ECDFs
ECDFs also allow you to compare two or more distributions (though plots get cluttered if you
have too many). Here, you will plot ECDFs for the petal lengths of all three iris species.
You already wrote a function to generate ECDFs so you can put it to good use!
To overlay all three ECDFs on the same plot, you can use plt.plot() three times, once for each
ECDF. Remember to include marker='.' and linestyle='none' as arguments inside plt.plot().
Instructions
- Compute ECDFs for each of the three species using your ecdf() function. The variables
  setosa_petal_length, versicolor_petal_length, and virginica_petal_length are all in your namespace.
  Unpack the ECDFs into x_set, y_set, x_vers, y_vers and x_virg, y_virg, respectively.
- Plot all three ECDFs on the same plot as dots. To do this, you will need three plt.plot() commands.
  Assign the result of each to _.
- Specify 2% margins.
- A legend and axis labels have been added for you, so hit 'Submit Answer' to see all the ECDFs!
'''
import numpy as np
import matplotlib.pyplot as plt
setosa_petal_length = np.array([
    1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5,
    1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5, 1.7, 1.5,
    1. , 1.7, 1.9, 1.6, 1.6, 1.5, 1.4, 1.6, 1.6, 1.5, 1.5,
    1.4, 1.5, 1.2, 1.3, 1.5, 1.3, 1.5, 1.3, 1.3, 1.3, 1.6,
    1.9, 1.4, 1.6, 1.4, 1.5, 1.4])
versicolor_petal_length = np.array([
    4.7, 4.5, 4.9, 4. , 4.6, 4.5, 4.7, 3.3, 4.6, 3.9, 3.5,
    4.2, 4. , 4.7, 3.6, 4.4, 4.5, 4.1, 4.5, 3.9, 4.8, 4. ,
    4.9, 4.7, 4.3, 4.4, 4.8, 5. , 4.5, 3.5, 3.8, 3.7, 3.9,
    5.1, 4.5, 4.5, 4.7, 4.4, 4.1, 4. , 4.4, 4.6, 4. , 3.3,
    4.2, 4.2, 4.2, 4.3, 3. , 4.1])
virginica_petal_length = np.array([
    6. , 5.1, 5.9, 5.6, 5.8, 6.6, 4.5, 6.3, 5.8, 6.1, 5.1,
    5.3, 5.5, 5. , 5.1, 5.3, 5.5, 6.7, 6.9, 5. , 5.7, 4.9,
    6.7, 4.9, 5.7, 6. , 4.8, 4.9, 5.6, 5.8, 6.1, 6.4, 5.6,
    5.1, 5.6, 6.1, 5.6, 5.5, 4.8, 5.4, 5.6, 5.1, 5.1, 5.9,
    5.7, 5.2, 5. , 5.2, 5.4, 5.1])
def ecdf(data):
    """Compute ECDF for a one-dimensional array of measurements."""
    # Number of data points: n
    n = len(data)
    # x-data for the ECDF: x
    x = np.sort(data)
    # y-data for the ECDF: y
    y = np.arange(1, n+1) / n
    return x, y
# Compute ECDFs
x_set, y_set = ecdf(setosa_petal_length)
x_vers, y_vers = ecdf(versicolor_petal_length)
x_virg, y_virg = ecdf(virginica_petal_length)
# Plot all ECDFs on the same plot
_ = plt.plot(x_set, y_set, marker='.', linestyle='none')
_ = plt.plot(x_vers, y_vers, marker='.', linestyle='none')
_ = plt.plot(x_virg, y_virg, marker='.', linestyle='none')
# Make nice margins
plt.margins(0.02)
# Annotate the plot
plt.legend(('setosa', 'versicolor', 'virginica'), loc='lower right')
_ = plt.xlabel('petal length (cm)')
_ = plt.ylabel('ECDF')
# Display the plot
plt.show()
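# The overlaid plot compares the distributions visually. As a rough numeric
# companion, the sketch below reads an ECDF at a single value, i.e. the fraction
# of measurements at or below it. The helper name ecdf_at and the two toy
# samples are assumptions made for illustration; they are not part of the
# exercise, and the samples are not the full iris arrays.

```python
import numpy as np


def ecdf_at(data, value):
    """Fraction of measurements in `data` that are <= `value` (hypothetical helper)."""
    x = np.sort(data)
    # With side='right', searchsorted returns how many sorted values are <= value
    return np.searchsorted(x, value, side='right') / len(x)


# Toy samples for illustration only, not the full iris arrays
sample_a = np.array([1.4, 1.3, 1.5, 1.7])
sample_b = np.array([4.7, 4.5, 4.9, 4.0])

# Evaluate both ECDFs at the same petal length (cm)
frac_a = ecdf_at(sample_a, 1.5)
frac_b = ecdf_at(sample_b, 1.5)
print(frac_a, frac_b)
```

# Evaluating every species at the same petal length gives one number per
# species, which can be easier to report than a reading taken off the plot.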