'''
Bringing it all together (1)
100xp
Recall the Bringing it all together exercise in the previous chapter, where you did a
simple Twitter analysis by developing a function that counts how many tweets are in
certain languages. The output of your function was a dictionary that had the languages
as the keys and the counts of tweets in each language as the values.
In this exercise, we will generalize the Twitter language analysis that you did in the
previous chapter. You will do that by adding a parameter for the column name, with a default value.
For your convenience, pandas has been imported as pd and the 'tweets.csv' file has been loaded
into the DataFrame tweets_df. Parts of the code from your previous work are also provided.
Instructions
- Complete the function header by supplying the parameter for a DataFrame, df, and the parameter
col_name, with a default value of 'lang', for the DataFrame column name.
- Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'.
Assign the result to result1. Note that since 'lang' is the default value of the col_name
parameter, you don't have to specify it here.
- Call count_entries() by passing the tweets_df DataFrame and the column name 'source'.
Assign the result to result2.
'''
# Import pandas
import pandas as pd
# Import Twitter data as DataFrame: tweets_df
tweets_df = pd.read_csv('tweets.csv')
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Extract column from DataFrame: col
    col = df[col_name]
    # Iterate over the column in DataFrame
    for entry in col:
        # If entry is in cols_count, add 1
        if entry in cols_count:
            cols_count[entry] += 1
        # Else add the entry to cols_count, set the value to 1
        else:
            cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
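
# A minimal alternative sketch, not part of the original exercise: the
# standard library's collections.Counter can do the same tallying as the
# loop in count_entries(). The name count_entries_counter is hypothetical,
# chosen here only for illustration.
from collections import Counter


def count_entries_counter(df, col_name='lang'):
    """Return a dictionary mapping each entry in df[col_name] to its count."""
    # Counter accepts any iterable; dict() turns it into a plain dictionary
    return dict(Counter(df[col_name]))
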
# Call count_entries(): result1
result1 = count_entries(tweets_df, col_name='lang')
# Call count_entries(): result2
result2 = count_entries(tweets_df, col_name='source')
# Print result1 and result2
print(result1)
print(result2)
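
# A possible cross-check, not part of the original exercise: pandas'
# Series.value_counts() tallies a column in the same way as the manual loop
# in count_entries(). The DataFrame sample_df below is hypothetical sample
# data, used only so that this sketch runs without 'tweets.csv'.
import pandas as pd

sample_df = pd.DataFrame({'lang': ['en', 'en', 'et', 'und', 'en']})

# value_counts() drops missing values (NaN) by default;
# pass dropna=False to include them in the tally
sample_counts = sample_df['lang'].value_counts().to_dict()
print(sample_counts)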