Advanced Data Mining:

  1. HW1: Evaluating bounds of Euclidean and Manhattan distances for n d-dimensional arrays.
  2. HW2: Text clustering using k-means with eigenvalue decomposition.
  3. HW3: Regularized linear models (Lasso and Ridge implementations) and a spectral clustering implementation using NetworkX.
  4. HW4: Implementation of the Flajolet-Martin algorithm, which estimates the number of distinct elements from the maximum count of trailing zeros in hashed values; Bloom filter for spam detection; MapReduce for Locality-Sensitive Hashing (LSH); Jaccard similarity between k-shingles.
  5. Project: Network analysis of Amazon's co-purchased products (Stanford Network Analysis Library).
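
As a rough illustration of two of the HW4 topics, the sketch below computes Jaccard similarity between character k-shingles and a single-hash Flajolet-Martin estimate. It is not the code from the homework: the function names, the use of MD5 truncated to 32 bits, and the choice of character-level (rather than word-level) shingles are all assumptions made for this example.

```python
import hashlib


def shingles(text: str, k: int) -> set:
    """Return the set of all length-k character substrings of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; taken as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def trailing_zeros(x: int) -> int:
    """Number of trailing zero bits in x (0 is treated as having none here)."""
    if x == 0:
        return 0
    # x & -x isolates the lowest set bit; its position is the zero count.
    return (x & -x).bit_length() - 1


def fm_estimate(items) -> int:
    """Single-hash Flajolet-Martin estimate: 2**R, with R the largest
    trailing-zero count seen among the hashed items."""
    max_r = 0
    for item in items:
        # Assumption: first 4 bytes of MD5 serve as a 32-bit hash.
        digest = hashlib.md5(str(item).encode()).digest()
        h = int.from_bytes(digest[:4], "big")
        max_r = max(max_r, trailing_zeros(h))
    return 2 ** max_r


if __name__ == "__main__":
    a = shingles("abcd", 2)
    b = shingles("bcde", 2)
    print(jaccard(a, b))
    print(fm_estimate(range(1000)))
```

A single hash function gives a high-variance estimate that is always a power of two; practical variants combine many hash functions, for example by taking the median of group averages.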