DATA SCIENCE

Data-Science: What is, Basics & Process

Path for Data Science

Programming

  • Python programming
  • SQL programming
  • Excel
  • Comfort with the terminal, version control with Git, and GitHub

Calculus
Linear Algebra
Probability and Statistics

  • Accessing database, CSV, and JSON data
  • Data cleaning and transformations using pandas
  • Visualization
  • Dashboards
  • Content-Based and Collaborative Filtering
  • Evaluation of recommendation systems (DCG, nDCG)
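
As a rough illustration of the DCG and nDCG metrics named in the last item, here is a minimal Python sketch. It assumes graded relevance scores listed in ranked order and uses the linear-gain form of DCG; another common variant uses `2**rel - 1` as the gain. The function names `dcg` and `ndcg` are illustrative and do not come from any particular library.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain over the first k graded relevances."""
    if k is not None:
        relevances = relevances[:k]
    # Position i (0-based) is discounted by log2(i + 2),
    # so the top-ranked item has a discount of exactly 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # By convention, return 0.0 when no item has positive relevance.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    ranked = [3, 2, 3, 0, 1, 2]  # hypothetical relevance grades in ranked order
    print(round(ndcg(ranked, k=5), 4))
```

A perfectly ordered ranking gets an nDCG of 1.0, and any ranking that places less relevant items ahead of more relevant ones scores lower.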

Natural Language Processing

  • Time Series Analysis
  • Text Analytics
  • Lexical processing
  • Syntactic Processing
  • Semantic Processing
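
Lexical processing usually begins with tokenization, stop-word removal, and term counting. The sketch below shows those steps using only the Python standard library. The stop-word set and the function names `tokenize` and `term_frequencies` are illustrative assumptions; in practice a library such as NLTK or spaCy would typically be used.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(text):
    """Count tokens after dropping stop words."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return Counter(tokens)


if __name__ == "__main__":
    sample = "The cat sat in the hat, and the cat is happy."
    print(term_frequencies(sample).most_common(3))
```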

Fundamentals of Big Data Engineering

  • Hadoop and MapReduce Programming
  • NoSQL Databases and Apache HBase
  • Hive Tutorial
  • Analytics using PySpark