Columbia MS Courses Home

The following table lists the courses I took during my Computer Engineering MS degree at Columbia University and links to the repositories for assignments, projects, and other course materials.

Semester	Course	Repositories	Summary
Fall 2024	Natural Language Processing	Assignments Repo	Assignment 1: Built a trigram language model in Python for text classification. Assignment 2: Developed a neural network for dependency parsing using PyTorch. Assignment 3: Implemented a conditioned LSTM language model for image captioning. Assignment 4: Created a Semantic Role Labeling system utilizing BERT.
Fall 2024	Introduction to Databases	Projects Repo	Project 1: Designed a database system for a job board application (modeling companies, job postings, skills, and requirements), implemented it on a Google Cloud PostgreSQL server, and built a Web Application to interact with it. Project 2: Expanded the Project 1 database schema with advanced PostgreSQL features including text/array attributes, composite data types, and PL/pgSQL triggers/functions.
Fall 2024	System-on-Chip Platforms	Assignments Repo	Homework 2: Implemented a fully connected neural network layer (fc_layer) using SystemC as part of a deep convolutional neural network (DWARF7) for image classification. Homework 4: Designed a SystemC convolutional neural network accelerator with AXI4 interfaces and synthesized it to an RTL implementation via High-Level Synthesis (HLS) using Catapult.
Fall 2024	Heterogeneous Computing for Signal and Data Processing	Assignments Repo Project Repo	Homework 1: Compared host-to-device memory allocation methods for elementwise vector operations using PyCUDA and PyOpenCL, profiling the performance with NVIDIA Nsight. Homework 2: Wrote modular device functions to implement a sin(x) Taylor series approximation using built-in math libraries in PyCUDA and PyOpenCL. Homework 3: Implemented 2D convolution (correlation) on GPUs, progressively optimizing performance using shared and constant memory locality techniques. Homework 4: Implemented naive and work-efficient parallel prefix scans, and built a basic CNN forward pass pipeline (Convolution, ReLU, Flatten, Fully Connected) using PyCUDA and PyOpenCL. Project: Investigated GPU parallelization performance of the normalized cross-correlation (NCC) algorithm for template matching ("Where's Waldo?") across various precision formats (FP32, FP16, FP8, BF16, TF32) on NVIDIA Ampere and Hopper architectures.
Spring 2025	Private Systems	Assignments Repo Project Repo	Homework 1: Explored unintended memorization in neural networks by evaluating the Secret Sharer framework. Homework 2: Investigated privacy accounting and quality control using the Sage Differentially Private ML platform. Homework 3: Built an encrypted database over Google BigQuery supporting computation on encrypted data using non-deterministic (AES-GCM), deterministic (AES-SIV), and partially homomorphic (Paillier) encryption schemes. Project: Investigated the NYC Tuition Assistance Program dataset for quasi-identifiers that could lead to privacy leaks, developing a custom Python CLI tool (`find-quasi-ids`) to conduct privacy-preserving data analysis.
Spring 2025	Applied ML in the Cloud	Assignments Repo Final Project Repo	Homework 1: Compared cloud VM (IaaS), data warehouse (PaaS), and conversational AI (SaaS) offerings across IBM Cloud, AWS, and Google Cloud Platform. Homework 2: Developed an Infrastructure-as-Code (IaC) Python application using Pulumi to automate the discovery and provisioning of scarce GPU instances across multiple GCP regions. Homework 3: Created a deep learning workflow on GCP's Kubernetes Engine (GKE) with Kubeflow, training a PyTorch LeNet-5 model and serving it via KServe with the NVIDIA Triton Inference Server. Project 1: Profiled and compared theoretical vs. measured workload characteristics (FLOPs, memory, arithmetic intensity) for CNNs (LeNet-5, VGG16) running on a Google Cloud T4 GPU in a VM versus a containerized environment. Project 2: Optimized a ResNet50 model using TensorRT (Pruning, Sparsity, Quantization) and deployed the variants to distinct Google Cloud Vertex AI endpoints, building a Streamlit web app to benchmark parallel inference latency and accuracy.
Spring 2025	Embedded Scalable Platforms	Project Repo	Project: Developed a new frontend for the `hls4ml` framework to support Google Flax's NNX API, enabling the generation of high-level synthesizable C++ code for hardware-accelerated machine learning on FPGAs and ASICs. The implementation addressed architectural differences in Flax and was assessed by comparing Power, Performance, and Area (PPA) against the existing TensorFlow/Keras frontend.
Spring 2025	Large-Scale Stream Processing	Assignments Repo	Homework 1: Processed HTTP server logs using PySpark RDD and DataFrame APIs to compute total bytes served, filter top-K IPs, and analyze requests within specified time windows and subnet masks. Homework 2: Implemented and evaluated streaming optimizations (Operator Reordering, Load Shedding, and Redundancy Elimination) using PySpark Structured Streaming and DStream APIs on a denormalized Formula 1 dataset. Project: Designed a real-time analytics dashboard to visualize live Formula 1 race telemetry and detect strategy decisions, built using Apache Beam deployed on GCP Dataflow with a Streamlit frontend.
Fall 2025	Parallel Functional Programming	Assignments Repo Project Repo	Assignments: Completed programming exercises in Haskell covering parallelization and functional programming concepts, utilizing GHCi and Stack. Project: Developed a Haskell-based 4-player Blokus game engine and parallelized the multi-agent MaxN search algorithm to evaluate game states, optimizing execution by using Haskell's `Control.Parallel.Strategies` with depth-budgeting and multi-core work-stealing strategies.
Fall 2025	Artificial Intelligence-of-Things	Labs Repo Project Repo	Labs: Incrementally built a smart watch using an Adafruit HUZZAH32 ESP32 microcontroller with MicroPython. Features added included sensor data processing, I2C/SPI bus communication for display, voice assistant capabilities using an LLM (Whisper), and Human Activity Recognition (HAR) using cloud services. Project: Developed an end-to-end AI-powered medical device error triage system. The system simulates medical device logs, integrates real-time environmental data from an ESP32 sensor node, and leverages a Large Language Model (LLM) enriched with FDA MAUDE reports to generate diagnostic reports and root-cause analyses.
Fall 2025	Computer Networks	Assignments Repo	Homework 1: Analyzed HTTP packets using Wireshark, calculated end-to-end delays using `ping`, and traced network routes with `traceroute`. Homework 2: Developed a caching HTTP Proxy Server from scratch using Python's `socket` and `selectors` libraries, and analyzed HTTP interactions (including HTTP Conditional GETs and Authentication) with Wireshark. Homework 3: Implemented a recursive DNS Resolver in Python and analyzed DNS resolution paths using `nslookup` and `dig`. Also analyzed HTTP/2 performance vs HTTP/1.1 and evaluated DASH adaptive video streaming. Homework 4: Implemented a TCP-Lite reliable data transfer protocol over UDP in Python and analyzed TCP connection establishments, bulk data transfers, and TCP Reno/Tahoe congestion control algorithms. Homework 5: Explored IPv4/IPv6 packet headers, IP fragmentation, NAT translation, BGP routing paths using looking glass servers, and traced high-speed Internet2 connections.
Fall 2025	Malware Analysis & Reverse Engineering	Assignments Repo	Homework 1: Performed basic static and dynamic analysis on Windows executables and DLLs using tools like PEiD, CFF Explorer, strings, Process Monitor, FakeNet, and Wireshark to identify packed files, keyloggers, and ransomware. Homework 2: Analyzed x86 assembly code and reverse engineered programs using IDA Pro to identify C constructs (loops, conditionals), analyze network-based malware, and defuse a command-line C "bomb". Homework 3: Conducted advanced static and dynamic analysis using IDA Pro and debuggers to identify complex code constructs (switch statements, loops) in malware loaders, reverse engineered a game's registration key, and patched a Windows executable (Solitaire) to alter its behavior.
Spring 2026	Hardware Security	Assignments Repo	Homework 1: Acted as a red team to inject subtle hardware Trojans into a Verilog 2-stage pipeline microcontroller (e.g., homoglyph opcodes, state machine deadlocks, pipeline timing bypasses), and acted as a blue team to analyze and detect hardware vulnerabilities inserted into a Verilog IEEE 754 double-to-float converter.
Spring 2026	Computer Hardware Design	Assignments Repo	Labs 1-4: Completed initial labs focusing on Verilog and hardware design basics. Project 1: Designed and tested various priority selectors (arbiters) in SystemVerilog, including 4-bit and 8-bit selectors using basic assign statements, if-else logic, and hierarchical module designs, as well as a 4-bit rotating priority selector. Project 2: Currently in progress.

Getting Started

Create a GitHub repository for a course's assignments and/or project following the naming convention <course-code>-<course-name>-<[assignments/project]>, e.g. comsw4705-nlp-assignments.
Add an entry for the course, with a link to the new repo, to the table above.
Run the Pulumi Python ms-courses-home app locally to configure specific settings of the newly created repo.
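
The naming convention above can be sketched as a small Python helper. This is only an illustration, not part of the ms-courses-home app: the function names `course_code_slug`, `name_slug`, and `repo_name` are hypothetical, and the input format "COMS W4705" for the course code is an assumption inferred from the `comsw4705-nlp-assignments` example.

```python
import re

# The two repository kinds allowed by the naming convention.
VALID_KINDS = ("assignments", "project")


def course_code_slug(course_code: str) -> str:
    """Lowercase the course code and drop everything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]", "", course_code.lower())


def name_slug(course_name: str) -> str:
    """Lowercase the course name and join its words with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", course_name.lower()).strip("-")


def repo_name(course_code: str, course_name: str, kind: str) -> str:
    """Build <course-code>-<course-name>-<assignments|project>."""
    if kind not in VALID_KINDS:
        raise ValueError(f"kind must be one of {VALID_KINDS}, got {kind!r}")
    return f"{course_code_slug(course_code)}-{name_slug(course_name)}-{kind}"


print(repo_name("COMS W4705", "NLP", "assignments"))  # comsw4705-nlp-assignments
```

Rejecting any kind other than `assignments` or `project` keeps new repositories aligned with the two repository types used throughout this README.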

Template Metadata

Useful metadata to have handy for homeworks and projects.

**Author:** Pablo Ordorica Wiener (UNI: po2311)

**Course:** <Course Number> <Course Name>

- **Semester:** <Fall/Spring> <yyyy>
- **Instructor:** <Instructor Full Name> (UNI: <xxxx>)

- **TA:** <TA Full Name> (UNI: <xxxx>)

Cloud Infrastructure Strategy for Assignments and Projects

Some of the assignments and projects require cloud computing, and this section explains my approach to managing cloud resources for those courses.

As shown in the table above, there are two types of repositories:

Individual assignments: One single repository per course containing all individual homework.
Team projects: Separate repository per project (especially for team collaborations).

Feature	👤 Individual Assignments	👥 Team Projects
GCP Project	Shared GCP Project	New Dedicated GCP Project
Pulumi Project	Unique name per Assignments Repo	Unique name per Project
Pulumi Stack	main (local deployment)	Depends on the project.
State Backend	Pulumi Cloud (Personal Org)	GCP Bucket Storage
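
The comparison above can also be read as a small lookup, sketched here in Python. This is an illustration under stated assumptions rather than code from any of the repos: `InfraSettings`, `individual_settings`, and `team_settings` are hypothetical names, the project and bucket values are placeholders, and `https://api.pulumi.com` is assumed to be the Pulumi Cloud backend that `pulumi login` targets by default.

```python
from dataclasses import dataclass

# Assumption: the default Pulumi Cloud backend URL.
PULUMI_CLOUD_URL = "https://api.pulumi.com"


@dataclass(frozen=True)
class InfraSettings:
    gcp_project: str      # GCP Project that resources are created in
    pulumi_project: str   # Pulumi Project name (unique per repo)
    backend_url: str      # where the Pulumi state is stored


def individual_settings(repo: str, shared_gcp_project: str) -> InfraSettings:
    """Individual assignments: shared GCP Project, state in Pulumi Cloud."""
    return InfraSettings(shared_gcp_project, repo, PULUMI_CLOUD_URL)


def team_settings(repo: str, team_gcp_project: str, team_bucket: str) -> InfraSettings:
    """Team projects: dedicated GCP Project, state in a bucket in that project."""
    return InfraSettings(team_gcp_project, repo, f"gs://{team_bucket}")


# Placeholder values, for illustration only.
print(individual_settings("comsw4705-nlp-assignments", "my-shared-project"))
print(team_settings("example-team-project", "team-gcp-project-id", "team-state-bucket"))
```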

Workflow for New Repositories

1. Individual Assignments

All individual assignments share a single GCP Project to avoid overhead, but use separate Pulumi Projects (one per assignments repo) to keep state isolated.

Run this in the root of your assignment repo:

# Initialize a new Pulumi Python project using 'uv' as the package manager
# --force is used because the directory already exists (the repo root)
pulumi new python --name <course-code>-<assignment-name> --toolchain uv --force

After initialization, add my shared cloud infrastructure library:

uv add "git+https://github.com/pablordoricaw/my-cloud-lib.git@v0.2.0#subdirectory=pulumi"

2. Team Projects

For group projects, I use:

A dedicated GCP Project created for the team. This ensures that team usage is not billed against my personal credits and lets teammates have IAM access.
A GCP Storage bucket inside the team's project as the state backend. This lets all teammates read and write state without needing access to my personal Pulumi Cloud organization.

Prerequisites:

A new GCP Project created in the Google Cloud Console.
The "Editor" IAM role granted to all team members on that GCP Project.
A storage bucket to use as the backend for the IaC state.

Run this in the root of the project repo:

# 1. Authenticate to the Team's State Bucket (Teammates must do this too)
#    Ensure you have 'Storage Object Admin' on the bucket.
gcloud auth application-default login
pulumi login gs://<team-project-bucket-name>

# 2. Initialize the project (same as individual)
pulumi new python --name <project-name> --description "Course Code Team Project"

# 3. Configure the Stack to use the Team's GCP Project
pulumi config set gcp:project <team-gcp-project-id>

Add my shared infrastructure library if needed:

uv add "git+https://github.com/pablordoricaw/my-cloud-lib.git@v0.2.0#subdirectory=pulumi"
