100-Days-of-Data-Science

Welcome to my 100-Days-of-Data-Science Repository

Day 1 - Variables and Strings in Python

  • 1️⃣ Creating Variables and checking data type with type() function.
  • 2️⃣ Learnt about two important data types: Integer and String
  • 3️⃣ Changing data types
  • 4️⃣ String as Array
  • 5️⃣ Stripping, slicing, splitting, concatenating, replacing, formatting, and indexing in strings
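
A minimal sketch of the Day 1 operations (the variable names and values are illustrative, not taken from the notebook):

```python
# Creating variables and checking their data types
age = 25                      # int
name = "Data Science"         # str
print(type(age), type(name))  # <class 'int'> <class 'str'>

# Changing data types (casting)
age_text = str(age)           # int -> str
number = int("100")           # str -> int

# Strings behave like arrays of characters
print(name[0])                # 'D'   (indexing)
print(name[0:4])              # 'Data' (slicing)

# Stripping, splitting, replacing, concatenating and formatting
raw = "  hello world  "
print(raw.strip())                      # 'hello world'
print(raw.split())                      # ['hello', 'world']
print(name.replace("Science", "Days"))  # 'Data Days'
greeting = "Day " + str(1)              # concatenation
print(f"{greeting}: {name}")            # f-string formatting
```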

Day 2 - Conditionals and Loops

  • 1️⃣ Conditionals🔗:
    • Utilizing the if statement to execute code based on a condition.
    • Employing elif to add more conditions to the decision-making process.
    • Falling back on the else statement for actions when no conditions are met.
  • 2️⃣ Loops🔄:
    • For loops to iterate through sequences like lists, tuples, and more.
    • range() function for controlled iterations.
    • Embracing the efficiency of while loops for repeated execution until a condition becomes false.
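
A small sketch of the same control-flow building blocks (the scores and sequences are made up for illustration):

```python
# if / elif / else
score = 72
if score >= 90:
    grade = "A"
elif score >= 60:
    grade = "B"
else:
    grade = "F"
print(grade)  # B

# for loop over a sequence, and range() for controlled iteration
for fruit in ["apple", "banana", "cherry"]:
    print(fruit)
for i in range(3):   # 0, 1, 2
    print(i)

# while loop repeats until its condition becomes false
count = 3
while count > 0:
    print(count)
    count -= 1
```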

Day 3 - Python's Fundamental Data Structures

  • 1️⃣ Lists: mutable data structure, changeable, ordered sequence of elements inside square brackets[ ].
    • Explored the versatile world of lists
    • dynamic arrays that can hold a mix of data types.
    • Used functions like len(), sum(), max(), min(), sorted(), append(), insert(), extend(), remove(), etc. to manipulate elements.
    • Performed slicing techniques.
  • 2️⃣ Tuples: Dived into the immutable realm of tuples
    • Written inside parentheses ( ).
    • Similar to lists, but their values cannot be changed after creation and they do not support item reassignment.
    • Perfect for protecting important data!
  • 3️⃣ Dictionaries: A dictionary is a changeable collection indexed by keys (insertion-ordered since Python 3.7).
    • In python, dictionaries are written with curly brackets, and they have keys and values.
    • Elements are in the form of key-value pairs.
    • Provides fast data retrieval.
    • Learned to add, modify, and access elements using keys.
    • Also learned about nested dictionaries.
  • 4️⃣ Sets: Set is an unordered collection data type that is iterable, mutable and has no duplicate elements.
    • Python's set class represents the mathematical notion of a set.
    • This is based on a data structure known as hash table.
    • Explored the world of sets, unordered collections of unique elements.
    • Utilized functions like add(), clear(), remove(), issubset(), etc.
    • Set operations like union, difference and intersection.
    • Eliminates duplicates.
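
A compact sketch touching each of the four structures (the values are illustrative):

```python
# Lists: mutable, ordered
nums = [3, 1, 4, 1, 5]
nums.append(9)
nums.extend([2, 6])
nums.remove(1)                 # removes only the first 1
print(len(nums), sum(nums), max(nums), min(nums), sorted(nums))
print(nums[1:4])               # slicing

# Tuples: immutable
point = (2, 3)
# point[0] = 5  -> TypeError: tuples do not support item assignment

# Dictionaries: key-value pairs, fast lookup by key
person = {"name": "Alice", "age": 25}
person["country"] = "Nepal"    # add
person["age"] = 26             # modify
print(person["name"])          # access by key
nested = {"day3": {"topic": "data structures"}}

# Sets: unordered, unique elements (hash-table based)
s = {1, 2, 2, 3}
s.add(4)
print(s)                       # duplicates eliminated: {1, 2, 3, 4}
print({1, 2}.issubset(s))      # True
print(s.union({5}), s.intersection({2, 3}), s.difference({1}))
```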

Day 4 - Functions in Python

  • 1️⃣ Defining Functions: Functions are defined using the def keyword, followed by a name and parentheses. They can take parameters, which are placeholders for data we'll use inside the function.
  • 2️⃣ Function Calls: We call functions by using their names followed by parentheses. We can pass arguments to functions, which are the actual data we want to operate on.
  • 3️⃣ Return Statements: Functions can return values using the return statement. This allows us to capture the output of a function and use it elsewhere in our code.
  • 4️⃣ Types of Arguments: Functions can have 🔹 Default Arguments 🔹 Positional Arguments 🔹 Keyword Arguments 🔹 Arbitrary Arguments
  • 5️⃣ Variable Scope: I learned about the concept of variable scope. Variables defined inside a function have local scope, while those defined outside have global scope. It's essential to understand how scope affects variable accessibility.
  • 6️⃣ Lambda Functions: I explored lambda functions, which are concise, anonymous functions often used for simple operations. They're created using the lambda keyword.
  • 7️⃣ Recursion: I delved into recursive functions, where a function calls itself. It's a powerful technique that can be used to solve complex problems.
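
A short sketch of the function concepts above (function names and data are illustrative):

```python
# Defining a function with a default argument
def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"

print(greet("Ada"))                      # positional argument + default
print(greet(name="Ada", greeting="Hi"))  # keyword arguments

# Arbitrary arguments: *args collects extra positionals, **kwargs extra keywords
def summarize(*args, **kwargs):
    return sum(args), kwargs

print(summarize(1, 2, 3, unit="items"))  # (6, {'unit': 'items'})

# Lambda: a small anonymous function
square = lambda x: x ** 2
print(square(4))  # 16

# Recursion: a function calling itself
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

print(factorial(5))  # 120
```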

Day 5 - Object Oriented Concepts

  • 🔷 Object-Oriented Programming is a programming paradigm that organizes code into reusable, self-contained objects. These objects are instances of classes, which are essentially blueprints for creating objects.
  • 🔷 Python is a versatile and popular programming language that supports OOP principles.
  • 1️⃣ Classes and Objects: I gained insights into the essence of classes as blueprints for creating objects. This paradigm allows me to encapsulate data and behavior within elegant structures.
  • 2️⃣ Attributes and Methods: The foundation of OOP, attributes hold data while methods define actions. This modular approach enhances code organization and readability.
  • 3️⃣ Constructors: With the __init__ method, I unlocked the ability to initialize objects and set their initial attributes right upon creation.
  • 4️⃣ Destructors: Managing resource cleanup is crucial, and the __del__ method lets me take care of that when objects are no longer needed.
  • 5️⃣ __str__ Method: This method lets me customize how an object is represented as a string, enhancing clarity and debugging.
  • 6️⃣ Inheritance: The concept of creating specialized classes, derived from existing ones, fascinated me. It promotes code reuse and empowers me to build hierarchies of related classes.
  • 7️⃣ Access Modifiers: I explored the world of access modifiers, including public, private, and protected member access. These modifiers offer a structured way to control interaction with class attributes and methods.
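
A minimal class hierarchy illustrating the points above (the Animal/Dog example is a stand-in, not the notebook's code):

```python
class Animal:
    def __init__(self, name):          # constructor
        self.name = name               # public attribute
        self._mood = "calm"            # "protected" by convention
        self.__id = 42                 # name-mangled ("private")

    def speak(self):                   # method
        return f"{self.name} makes a sound"

    def __str__(self):                 # custom string representation
        return f"Animal({self.name})"

    def __del__(self):                 # destructor, called on cleanup
        pass


class Dog(Animal):                     # inheritance
    def speak(self):
        return f"{self.name} barks"


d = Dog("Rex")
print(d)           # Animal(Rex)
print(d.speak())   # Rex barks
```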

Day 6 - More Object Oriented Concepts

  • 1️⃣ Method Overriding: Method overriding is the ability of a subclass to provide a specific implementation for a method that is already defined in its superclass.
  • 2️⃣ Method Resolution Order(MRO): Method Resolution Order determines the sequence in which classes are searched for a requested method. It follows the C3 linearization algorithm.
  • 3️⃣ Polymorphism: Polymorphism allows objects of different classes to be treated as objects of a common base class, leading to code that can work with objects of multiple types. It means the same function name (but different signatures) being used for different types.
  • 4️⃣ Duck Typing: Duck typing is a concept where the type or class of an object is determined by its behavior (methods and properties) rather than its explicit type.
  • 5️⃣ Encapsulation: It describes the idea of wrapping data and the methods that work on data within one unit.
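
A small sketch of overriding, polymorphism, duck typing, encapsulation, and MRO (the Shape example is illustrative):

```python
class Shape:
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.__side = side             # encapsulated attribute

    def area(self):                    # method overriding
        return self.__side ** 2


class Circle(Shape):
    def __init__(self, r):
        self.__r = r

    def area(self):
        return 3.14159 * self.__r ** 2


# Polymorphism: the same call works on objects of different classes
for shape in [Square(2), Circle(1)]:
    print(shape.area())

# Duck typing: anything with an .area() method is acceptable
class Plot:                            # not a Shape at all
    def area(self):
        return 100

def total_area(things):
    return sum(t.area() for t in things)

print(total_area([Square(2), Plot()]))

# Method Resolution Order for a class
print(Square.__mro__)
```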

Day 7 - Exception Handling and File Handling

  • 1️⃣ Exception Handling: An exception is an event that occurs during the execution of a program that disrupts the normal flow of instructions. It can be caused by various reasons, such as invalid input, resource unavailability, or unexpected behavior.
    • Try-Except Block: The primary mechanism to handle exceptions in Python is the try-except block. It allows us to wrap a block of code that might raise exceptions, and then specify how to handle those exceptions.
    • Else Clause: The else clause can be added after all except clauses and will execute if no exception occurs within the try block.
    • Finally Clause: The finally clause is used to specify code that will be executed no matter what, whether an exception occurred or not.
    • Raising Exceptions: You can explicitly raise exceptions using the raise statement. This is useful when you want to signal an error condition yourself.
    • Custom Exceptions: We can create our own custom exception classes by inheriting from existing exception classes. This can help us categorize and handle specific errors in our code.
  • 2️⃣ File Handling
    • Opening and Closing Files:
      • open() function: Used to open a file, it returns a file object.
      • close() method: Used to close the file when you're done with it.
    • File Modes:
      • 'r': Read mode (default).
      • 'w': Write mode, truncating the file if it already exists.
      • 'a': Append mode, adding data to the end of the file if it exists.
      • 'x': Exclusive creation mode, creating a new file if it doesn't exist.
      • 'b': Binary mode, used for working with binary files.
      • 't': Text mode (default), used for working with text files.
    • Reading from Files:
      • read(): Reads the entire content of the file.
      • readline(): Reads a single line from the file.
      • readlines(): Reads all lines into a list.
    • Writing to Files:
      • write(): Writes a string to the file.
      • writelines(): Writes a list of strings to the file.
    • Context Managers (with Statement): A safer way to handle files using the with statement. It automatically handles opening and closing of the file.
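
A minimal sketch combining the exception-handling and file-handling pieces (the file name notes.txt is hypothetical):

```python
# Exception handling: try / except / else / finally, plus raising
def safe_divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError as err:
        print("Cannot divide by zero:", err)
        raise ValueError("b must be non-zero") from err   # raising an exception
    else:
        print("No exception occurred")
        return result
    finally:
        print("This always runs")

print(safe_divide(10, 2))

# A custom exception class
class DataError(Exception):
    pass

# File handling with a context manager (the file is closed automatically)
with open("notes.txt", "w") as f:        # 'w' truncates / creates the file
    f.write("Day 7\n")
    f.writelines(["line 2\n", "line 3\n"])

with open("notes.txt", "r") as f:
    print(f.read())                      # whole file as one string
```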

Day 8 - NumPy

  • Advantages of NumPy over Python Lists: NumPy offers lightning-fast array operations and efficient memory usage, making it a go-to choice for numerical computations and data manipulation.
  • Key Functions and Attributes: I got hands-on with np.array() to create arrays, and explored attributes like array.shape and array.size for shape and size information. The dtype attribute helped me manage data types effortlessly.
  • Array Manipulation: I learned to manipulate arrays with functions like transpose, np.empty, np.ones, and np.arange(). Reshaping arrays using array.reshape(), array.flatten(), and array.ravel() proved incredibly useful for data restructuring.
  • Array Slicing and Indexing: NumPy's array slicing capabilities blew my mind! From extracting specific elements to working with slices, I've gained a solid understanding of accessing and manipulating array data.
  • Mathematical Operations: I explored functions like np.std(), np.log(), and basic arithmetic operations (add, subtract, multiply, divide) on arrays.
  • Trigonometric and String Operations: The library surprised me with its versatility. I experimented with trigonometric functions and even string operations using NumPy arrays.
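
A quick NumPy sketch of the attributes and functions mentioned above (the arrays are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape, a.size, a.dtype)          # (2, 3) 6 int64 (dtype is platform dependent)

# Creation and reshaping
z = np.arange(6).reshape(2, 3)
print(np.ones((2, 2)), np.empty(3))
print(z.flatten(), z.ravel(), a.T)       # flatten, ravel, transpose

# Slicing and indexing
print(a[0, 1], a[:, 1], a[1, :2])

# Mathematical operations (element-wise)
print(a + 10, a * 2, np.log(a), np.std(a))

# Trigonometric and string helpers
print(np.sin(np.array([0, np.pi / 2])))
print(np.char.upper(np.array(["numpy", "rocks"])))
```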

Day 9 - Pandas

  • Pandas is a powerful Python library for working with data sets. It provides data manipulation, analysis, and cleaning tools and is widely used in data science, machine learning, and other data-related tasks. I learnt about:
    • Constructing DataFrames
    • The concept of CSV data and working with it
    • Indexing by column name, by row index, and by [row index, column index], plus slicing
    • Finding null values
    • Several functions like head(), tail(), describe(), info(), isnull(), value_counts(), unique(), and so on.
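
A minimal Pandas sketch of these operations on a tiny, made-up DataFrame (a real CSV would be loaded with pd.read_csv instead):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara"],
    "age":  [23, None, 31],
    "city": ["KTM", "NYC", "NYC"],
})

print(df.head())            # first rows
df.info()                   # dtypes and non-null counts
print(df.describe())        # summary statistics of numeric columns
print(df.isnull().sum())    # null values per column

# Indexing: by column name, by row label, and by [row, column] position
print(df["name"])           # column
print(df.loc[0])            # row by label
print(df.iloc[1, 2])        # [row index, column index] -> 'NYC'
print(df[0:2])              # slicing rows

print(df["city"].value_counts(), df["city"].unique())
```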

Day 10 - More Pandas

  • With the read_json() function, I've been able to effortlessly load JSON data into Pandas dataframes, providing me with a structured, tabular format to work with. The to_json() method lets me seamlessly convert dataframes back into JSON format, making data exchange a breeze.
  • One tool that has truly dazzled me is the json_normalize() function. It's like having a decoder for nested JSON structures! This function flattens nested JSON data into a neat and tidy dataframe.
  • Also ventured into Pandas' read_html() function. With this gem, I can effortlessly extract HTML tables from web pages and transform them into Pandas dataframes. This is an invaluable skill for converting messy web data into structured, analyzable formats. Plus, the to_html() method allows me to reverse the process, seamlessly converting dataframes back into HTML tables.
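
A small sketch of these JSON/HTML round trips (the nested records are made up; read_html is shown commented out because it needs a live page and an HTML parser such as lxml):

```python
from io import StringIO
import pandas as pd

# Flattening nested JSON into a tidy DataFrame
records = [
    {"id": 1, "user": {"name": "Ann", "city": "KTM"}},
    {"id": 2, "user": {"name": "Bob", "city": "NYC"}},
]
flat = pd.json_normalize(records)
print(flat)                            # columns: id, user.name, user.city

# Round-tripping between DataFrames and JSON
as_json = flat.to_json(orient="records")
back = pd.read_json(StringIO(as_json))

# HTML tables: read_html returns a list of DataFrames
# tables = pd.read_html("https://example.com/some_table.html")
html_text = flat.to_html(index=False)  # and back to an HTML table
print(html_text[:60])
```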

Day 11 - Handling Null Values

  • 1️⃣ Deletion or Removal:
    • Listwise Deletion: Removing entire rows that contain at least one null value. This approach is suitable when the null values are relatively rare and not systematically distributed. We use .dropna(axis=0) to drop such rows.
    • Column Dropping: If a column contains a high proportion of null values and is not crucial for the analysis, you can drop the column altogether with .dropna(axis=1).
  • 2️⃣ Imputation: Imputation involves replacing null values with estimated or inferred values. This helps to retain data points and maintain the structure of the dataset.
    • Mean/Median Imputation: Replace null values with the mean or median of the non-null values in the same column. This method is suitable for numerical data.
    • Mode Imputation: Replace null values with the mode (most frequent value) of the non-null values in the column. This method is suitable for categorical data.
    • Interpolation: For time-series or sequential data, null values can be estimated using interpolation methods like linear interpolation.
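
A minimal sketch of these strategies on a made-up DataFrame (column names and values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "age":    [25, None, 31, None],
    "income": [40000, 52000, None, 61000],
    "city":   ["KTM", None, "NYC", "NYC"],
})

# Deletion
rows_dropped = df.dropna(axis=0)      # listwise deletion: drop rows with any null
cols_dropped = df.dropna(axis=1)      # drop columns containing nulls

# Imputation
df["age"] = df["age"].fillna(df["age"].mean())              # mean (numeric)
df["income"] = df["income"].fillna(df["income"].median())   # median (numeric)
df["city"] = df["city"].fillna(df["city"].mode()[0])        # mode (categorical)

# Interpolation for sequential / time-series data
series = pd.Series([1.0, None, 3.0, None, 5.0])
print(series.interpolate())           # linear interpolation fills the gaps
```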

Day 12 - Matplotlib Library

  • label, title, xlabel, ylabel, xticks, yticks, plt.show()
  • linewidth, marker, markersize, and markeredgecolor, shorthand notation, saving the graph
  • Bargraph
  • Piechart
  • Histograms
  • Box and Whisker Plots
  • Visualized the gas_prices.csv and fifa_data.csv data using Matplotlib
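
A minimal Matplotlib sketch covering the line-plot options above; the inline year/price values are illustrative stand-ins for the gas_prices.csv data:

```python
import matplotlib.pyplot as plt

years = [2018, 2019, 2020, 2021]
price = [2.7, 2.6, 2.2, 3.0]

plt.plot(years, price, label="USA", linewidth=2,
         marker="o", markersize=6, markeredgecolor="black")
plt.title("Gas Prices")
plt.xlabel("Year")
plt.ylabel("USD per gallon")
plt.xticks(years)
plt.legend()
plt.savefig("gas_prices.png")   # saving the graph
plt.show()

# Other chart types follow the same pattern:
# plt.bar(...), plt.pie(...), plt.hist(...), plt.boxplot(...)
```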

Day 13 - Seaborn Library

  • 1️⃣ Line Plot : Explored trends and relationships between variables over time or categories. Customized colors, markers, and labels for clear insights.

  • 2️⃣ Bar Plot : Learned the art of comparing categorical data and showcasing frequency distributions. Leveraged hues and stacked bars to enhance understanding.

  • 3️⃣ Pair Plot : Discovered the power of visualizing pairwise relationships among multiple numerical variables.

  • 4️⃣ Dist Plot : Combined histograms and kernel density estimates for insightful data distribution visualization. Customized colors, line styles, and plot types to convey information effectively.

  • 5️⃣ Scatter Plot : Visualized relationships between two numerical variables with the elegance of scatter plots. Used colors, markers, and sizes to enhance visual representation.

  • 6️⃣ Heatmap : Embraced the world of 2D matrix data visualization using colors to highlight correlations, relationships, and patterns. A fantastic tool for understanding large datasets.

Day 14 - Linear Algebra

Followed Course "Linear Algebra for Machine Learning and Data Science" by DeepLearning.AI : Link to course

  • System of Sentences, System of Information
  • System of Equations
  • Linear and Non Linear Equations
  • System of Linear Equations as Lines
  • Linear Dependence and Linear Independence
  • Relation of Determinant and Singularity

Day 15 - Linear Algebra

  • Solve a system of linear equations using the elimination method.
  • Use a matrix to represent a system of linear equations and solve it using matrix row reduction.
  • Solve a system of linear equations by calculating the matrix in the row echelon form.
  • Calculate the rank of a system of linear equations and use the rank to determine the number of solutions of the system.
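
Pen-and-paper elimination and row reduction don't map one-to-one onto a single library call, but NumPy can answer the same questions about a small, illustrative system:

```python
import numpy as np

# System:  2x +  y = 5
#           x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

print(np.linalg.solve(A, b))        # [1. 3.] -> x = 1, y = 3
print(np.linalg.matrix_rank(A))     # rank 2 -> exactly one solution
print(np.linalg.det(A))             # non-zero determinant -> non-singular system
```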

Day 16 - Linear Algebra

  • Perform common operations on vectors like sum, difference, and dot product.
  • Multiply matrices and vectors.
  • Represent a system of linear equations as a linear transformation on a vector.
  • Calculate the inverse of a matrix, if it exists.
  • Neural Networks and Matrices

Day 17 - Linear Algebra

  • Interpret the determinant of a matrix as an area and calculate determinant of an inverse of a matrix and a product of matrices.
  • Determine the bases and span of vectors.
  • Find eigenbases for a special type of linear transformations commonly used in machine learning.
  • Calculate the eigenvalues and eigenvectors of a linear transformation (matrix).
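
A quick numerical check of these ideas with NumPy on an illustrative 2×2 matrix:

```python
import numpy as np

M = np.array([[2.0, 0.0],
              [1.0, 3.0]])

# Determinant as an area-scaling factor; det(M^-1) = 1/det(M), det(AB) = det(A)det(B)
print(np.linalg.det(M))                 # 6.0
print(np.linalg.det(np.linalg.inv(M)))  # ~0.1667 = 1/6
print(np.linalg.det(M @ M))             # 36.0

# Eigenvalues and eigenvectors of the linear transformation M
values, vectors = np.linalg.eig(M)
print(values)    # 2.0 and 3.0 (a triangular matrix's eigenvalues are its diagonal entries)
print(vectors)   # columns are the corresponding eigenvectors
```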

Day 18 - Calculus

Followed Course "Calculus for Machine Learning and Data Science" by DeepLearning.AI : Link to course

  • Machine Learning motivation
  • Motivation to derivatives
  • Derivatives and Tangents
  • Slopes, maxima and minima
  • Concept of Derivatives
  • Approximation of Derivatives
  • Derivatives and their notation
  • and some common derivatives

Day 19 - Calculus

  • Derivative of trigonometric functions
  • Meaning of the Exponential(e)
  • The derivatives of e^x and log x
  • Existence of the derivative

Day 20 - Calculus

  • 🔶 Properties of Derivatives:
    • Multiplication By scalars
    • The Sum rule
    • The Product Rule
    • The Chain Rule
  • 🔶 Optimization
    • Introduction to Optimization
    • Optimization of squared loss: one powerline problem, the two powerline problem, and the three powerline problem.

Day 21 - Calculus

  • 1️⃣ Optimization of Log Loss
  • 2️⃣ Tangent Planes
  • 3️⃣ Partial Derivatives
  • 4️⃣ Gradients
  • 5️⃣ Gradients and maxima/minima
  • 6️⃣ Optimization with gradients

Day 22 - Calculus

  • 1️⃣ Optimization using Gradient Descent in One Variable
  • 2️⃣ Optimization using Gradient Descent in Two Variables
  • 3️⃣ Optimization using Gradient Descent - Least Square
  • 4️⃣ Optimization using Gradient Descent - Least Square with Multiple Observations
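
A minimal gradient-descent sketch for the least-squares case in two variables (slope and intercept); the data points and learning rate are illustrative:

```python
import numpy as np

# Data for a least-squares line y ≈ m*x + b
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.1, 6.2, 7.9])

m, b = 0.0, 0.0
lr = 0.01                                   # learning rate
for _ in range(5000):
    y_hat = m * x + b
    error = y_hat - y
    # Gradients of the mean squared loss L = mean((y_hat - y)^2)
    dm = 2 * np.mean(error * x)
    db = 2 * np.mean(error)
    m -= lr * dm                            # step against the gradient
    b -= lr * db

print(m, b)   # ≈ 1.95 and ≈ 0.2, the least-squares slope and intercept
```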

Day 23 - Calculus

  • Regression with a perceptron
  • Regression with Perceptron - Loss Function
  • Regression with Perceptron - Gradient Descent

Day 24

  • 1️⃣ Classification with Perceptron
  • 2️⃣ Classification with Perceptron - The Sigmoid Function
  • 3️⃣ Classification with Perceptron - Gradient Descent

Day 25 - Ongoing

    1. Classification with a Neural Network
    2. Classification with a Neural Network - Minimizing log-loss
    3. Gradient Descent and Backpropagation
    4. Newton's Method
    5. An example of Newton's Method
    6. The second derivative and its concept
    7. The Hessian and concavity
    8. Newton's Method for two variables

Day 26 - Probability and Statistics

    1. Probability (P): At its core, probability quantifies the likelihood of an event occurring. It's often represented as a value between 0 (impossible) and 1 (certain). P(A) = Number of favorable outcomes / Total possible outcomes
    2. Complement of Probability (P'): Sometimes, it's easier to calculate the probability of an event not happening. The complement of probability is a simple way to do that. Formula: P'(A) = 1 - P(A)
    3. Joint Probability of Independent Events: When two events A and B are independent, the probability of both happening together is the product of their individual probabilities. Formula: P(A and B) = P(A) * P(B)
    4. Sum of Probabilities of Disjoint (Mutually Exclusive) Events: For events that cannot happen simultaneously, their combined probability is simply the sum of their individual probabilities. Formula: P(A or B) = P(A) + P(B)
    5. Union of Events A and B (A ∪ B): The probability of either event A or event B (or both) occurring is represented by the union of events. Formula: P(A ∪ B) = P(A) + P(B) - P(A and B)
    6. Independence: Events A and B are considered independent if the occurrence (or non-occurrence) of one doesn't affect the other's probability. In this case, the joint probability simplifies to the product P(A and B) = P(A) * P(B).

Day 27

  • 1️⃣ Probability Distributions (Discrete):
    1. Binomial Distribution
    2. Binomial Coefficient
    3. Bernoulli Distribution
  • 2️⃣ Probability Distributions (Continuous):
    1. Probability Density Function
    2. Cumulative Distribution Function
    3. Uniform Distribution
    4. Normal Distribution
    5. Chi-Squared Distribution

Day 28

Day 29

Day 30 - Linear Regression

  • Stock Price Prediction using Simple Linear Regression
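
The stock-price notebook itself isn't reproduced here; the sketch below shows the same Simple Linear Regression workflow with scikit-learn on synthetic "day vs. price" data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Synthetic data standing in for the stock dataset
rng = np.random.default_rng(0)
days = np.arange(100).reshape(-1, 1)
prices = 50 + 0.8 * days.ravel() + rng.normal(0, 2, 100)

X_train, X_test, y_train, y_test = train_test_split(
    days, prices, test_size=0.2, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(model.coef_, model.intercept_, r2_score(y_test, pred))
```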

Day 31 - Logistic Regression

  • Classification into Iris Versicolor and Iris Virginica
  • Trained the Logistic Regression model with the best hyperparameters on the training data.

Day 32 - Decision Tree Classifier

  1. Decision Trees can be used for both Classification and Regression problems.
  2. Splitting Decisions: Gini Impurity and Entropy
  3. Information Gain: Decision Trees aim to choose the feature that provides the most valuable information to differentiate between classes or make predictions.
  4. Pure vs. Impure Split:
    • Pure split : all the data in a node belongs to a single class.
    • Impure splits : mixed classes that need further branching.
  5. Pruning Strategies: post-pruning and pre-pruning.
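
A minimal scikit-learn sketch of a Decision Tree Classifier; the Iris dataset and the max_depth value are illustrative choices, with criterion selecting Gini impurity (or "entropy") and max_depth acting as simple pre-pruning:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion chooses the splitting measure; max_depth limits tree growth (pre-pruning)
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))   # accuracy on held-out data
```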

Day 33 - Decision Tree Regressor

  • Decision Tree Regressor is a model for predicting continuous values.
  • It uses a tree-like structure to make predictions based on feature splits.
  • It can be prone to overfitting but can be pruned to control complexity.
  • Decision Trees are interpretable, handle non-linearity, and are robust to outliers.
  • They are used in various applications, and can be enhanced through ensemble methods like Random Forests.

Day 34 - NLP Text Summarizer Project

Day 35 - Deployment

  • Deployed the text Summarizer Project using Flask

Day 36 - Random Forest Algorithm

  • Ensemble Learning: Combines multiple decision trees to improve prediction accuracy and reduce overfitting.
  • Random Sampling: Uses bootstrapping to create diverse subsets of the training data and random feature selection, which enhances the robustness of individual trees.
  • Classification and Regression: Random Forest is versatile and can be used for both classification and regression tasks.
  • Feature Importance: Can determine the importance of features, helping to identify which variables have the most influence on predictions. This is valuable for feature selection and data analysis.
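
A minimal Random Forest sketch with scikit-learn; the Iris dataset is just a stand-in to show the bootstrapped ensemble, random feature selection per split, and feature importances:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of bootstrapped trees with random feature selection at each split
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

print(forest.score(X_test, y_test))
print(forest.feature_importances_)   # which features drive the predictions
```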
