-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathTask solution
More file actions
185 lines (135 loc) · 7.12 KB
/
Task solution
File metadata and controls
185 lines (135 loc) · 7.12 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
# Main questions solutions:
## Q1:What is the difference between mean, median and mode?
mean, median, and mode are the measures of central tendency most widely used in the field of descriptive statistics.
i. Mean : commonly known as the average value is defined as sum of all the values divided by the total number of values.
ii. Median : is used when numerical data is sorted in some order and it is the midpoint of the set, making it more resistant to outliers than the mean.
iii. Mode : on the other hand, highlights the value that is most frequently observed in the set of data.
# ---------------------------------------------------------------------------
## Q2: What is the difference between left and right skewness of the data?
Skewness is a way to describe the symmetry of a distribution.
i. A distribution is a *Right skewed* if it has a "Tail" on right side of the distribution. "Skewness is +ve"
ii. A distribution is a *Left skewed* if it has a "Tail" on left side of the distribution. "Skewness is -ve"
iii. skewness = 0 if the distribution is symmetric.
# ---------------------------------------------------------------------------
## Q3: What is the difference between list and tuples in Python?
List : Tuple
List items enclosed in [] : Tuple items enclosed in (), and can remove them also.
list are ordered, can use idex to access elements. : Tuple are ordered, can use index to access elements.
List are mutable, can use edit like add, replace or delete : Tuple cannot be edited
both can have any type of data and both items aren't unique.
# ---------------------------------------------------------------------------
## Q4: How can you generate random numbers in Python?
import random
rand1 = random.random() # this a random floating number from 0 to 1
print(rand1)
rand2 = random.randint(1, 10) # this is a random intger number from 1 to 10
print(rand2)
# ---------------------------------------------------------------------------
## Q5: What is break, continue and pass in Python?
i. break: when a certain condition is true, and you want to end the loop now, "break" exit the loop.
ii. continue: when you want to skip the rest of the code and go to the next iteration, "continue" skip the rest of the code.
iii. pass: do nothing, it's a placeholder make the code looks correct.
# ---------------------------------------------------------------------------
## Q6: How can you reverse all the items in a list in Python using just one line of code?
listname.reverse()
# ---------------------------------------------------------------------------
## Q7: Given a list of integers numbers = [3, 7, 2, 8, 1], write a Python code snippet to find and print the maximum value in the list.
num = [3, 7, 2, 8, 1]
print(max(num))
--> another solution using function <--
a = [3, 7, 2, 8, 1]
def sort(mylist):
for i in range(len(mylist)):
for j in range(0, len(mylist) - 1):
if mylist[j] > mylist[j+1]:
mylist[j], mylist[j+1] = mylist[j+1], mylist[j]
return mylist
sort(a)
print(a[4])
# ---------------------------------------------------------------------------
## Q8: What is an outlier in a dataset, and how would you explain it to someone who is new to statistics?
Outliers are a set of data points that are significantly different from the rest of the data in the group.
The difference is either much larger or much smaller than the others. This disrupts the data distribution and causes errors in data collection.
For example, in a dataset of monthly sales numbers, if the revenue for one month is much higher than the rest, identifying those high sales would be considered unusual.
Therefore, outliers distort results and indicate special or rare occurrences. Hence, they must be identified and removed.
# ---------------------------------------------------------------------------
## figure 1:
1. median =
dataset = {4, 1, 2, 1, 3, 2, 1, 1}
after sorting = {1, 1, 1, 1, 2, 2, 3, 4}
middle integers = 1, 2
mean = (1+2)/2 = 3/2 = 1.5 = median #
2. standard deviation =
mean = (4+1+2+1+3+2+1+1)/8 = 15
s = sqrt(sum(x-mean)^2/n-1)
s = sqrt(1387/7) = 14.076
# ---------------------------------------------------------------------------
## figure 2:
ans = a.The mean is greater than the median because "the distribution is right skewed and the mean is affected by the higher value in the tails and the median is less sensitive"
# ---------------------------------------------------------------------------
## PS problems:
i. codeforces problem A:
a = int(input())
b = int(input())
years = 0
while b >= a:
years = years + 1
b = b * 2
a = a * 3
print(years)
------------------------------------
ii. codeforces problem B:
s = input().lower().strip()
word = "hello"
x=0
for a in s:
if a == word[x]:
x = x + 1
if x == len(word):
break
if x == len(word):
print("Yes")
else:
print("No")
# ---------------------------------------------------------------------------
## Bonus:
1. Codeforces problem:
i = int(input())
for x in range(i):
n = int(input())
a = list(map(int, input().split()))
maxVal = max(a)
operation = 0
while min(a) < maxVal:
pos = [y for y in range(n) if a[y] < maxVal]
for y in pos:
a[y] = a[y] + 1
operation = operation + 1
print(operation)
-------------------
# from previouse sol. i got this another sol. :
i = int(input())
for x in range(i):
n = int(input())
a = list(map(int, input().split()))
operation = max(a) - min(a)
print(operation)
-----------------------------------------------------
2. What is the difference between machine learning and traditional programming?
machine learning : traditional programing
- it is a subset of AI, focus on learning from data : - rule based code is written by the developers depending on the problem.
to develop an algorithm can be used to prediction.
- learning from historical data abd then make prediction : - it hasn't self learning, it's rule based and deterministic.
on new data.
- find patterns in large dataset might be difficult for : - it is totally dependant on developers's intelligence.
humans to discover.
-----------------------------------------------------
3. State some examples of machine learning models?
a) Supervised Models:
Supervised learning is the study of algorithms that use labeled data in which each data instance has a known category or value to which it belongs.
This results in the model to discover the relationship between the input features and the target outcome.
b) Unsupervised Models
Unsupervised learning involves a difficult task of working with data which is not provided with pre-defined categories or label.
c) Semi-Supervised Model
Supervised learning is such a kind of learning with labeled data that unsupervised learning, on the other hand, solves the task where there is no labeled data.
Semi-supervised learning fills the gap between the two. It reveals the strengths of both approaches by training using data sets labeled along with unlabeled one.