'''
Bringing it all together (2)
100xp
You've just generalized the Twitter language analysis from the previous chapter to include a
default argument for the column name. You're now going to generalize this function one step
further by allowing the user to pass it a flexible argument: in this case, as many column names
as the user would like!
Once again, for your convenience, pandas has been imported as pd and the 'tweets.csv' file has
been imported into the DataFrame tweets_df. Parts of the code from your previous work are also
provided.
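
As a minimal sketch of how a flexible argument behaves (the function name show_args is
hypothetical and not part of the exercise), Python collects any extra positional arguments
into a tuple:

```python
def show_args(first, *args):
    """Return the extra positional arguments exactly as Python collected them."""
    return args

print(show_args('df'))                    # ()
print(show_args('df', 'lang'))            # ('lang',)
print(show_args('df', 'lang', 'source'))  # ('lang', 'source')
```

Because args is an ordinary tuple, a for loop can iterate over it directly.
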
Instructions
-Complete the function header by supplying the parameter for the dataframe df and the flexible
argument *args.
-Complete the for loop within the function definition so that the loop iterates over the tuple args.
-Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the
result to result1.
-Call count_entries() by passing the tweets_df DataFrame and the column names 'lang' and 'source'.
Assign the result to result2.
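
The tallying described in the instructions can also be sketched with collections.Counter from
the standard library. This is an alternative technique, not the exercise's expected solution.
The function name count_entries_sketch and the sample data are hypothetical, and a plain dict
of lists stands in for the DataFrame:

```python
from collections import Counter

def count_entries_sketch(table, *args):
    """Tally every entry found in the named columns of table."""
    counts = Counter()
    for col_name in args:
        # Counter.update adds one to the tally of each item in the iterable
        counts.update(table[col_name])
    return dict(counts)

# Hypothetical sample data standing in for tweets_df
sample = {'lang': ['en', 'en', 'et'], 'source': ['web', 'app', 'web']}
print(count_entries_sketch(sample, 'lang'))            # {'en': 2, 'et': 1}
print(count_entries_sketch(sample, 'lang', 'source'))  # {'en': 2, 'et': 1, 'web': 2, 'app': 1}
```

Note that passing several column names merges all their entries into a single dictionary,
which is also how count_entries() combines 'lang' and 'source' for result2.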
'''
# Import pandas
import pandas as pd
# Import Twitter data as DataFrame: tweets_df
tweets_df = pd.read_csv('tweets.csv')
# Define count_entries()
def count_entries(df, *args):
    """Return a dictionary with counts of
    occurrences as value for each key."""

    # Initialize an empty dictionary: cols_count
    cols_count = {}

    # Iterate over column names in args
    for col_name in args:

        # Extract column from DataFrame: col
        col = df[col_name]

        # Iterate over the entries in the column
        for entry in col:

            # If entry is in cols_count, add 1
            if entry in cols_count:
                cols_count[entry] += 1

            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1

    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
# Call count_entries(): result2
result2 = count_entries(tweets_df, 'lang', 'source')
# Print result1 and result2
print(result1)
print(result2)