A focused, continually improving collection of clear, rigorous explanations of algorithms I use to analyze complex systems, especially in biological and medical contexts where data are noisy, high-dimensional, and full of edge cases.
I care about mathematics. You will see it: definitions first, assumptions stated, and trade-offs made explicit. But everything is written to be approachable: you should not need to fight the notation to understand the idea. This collection is a continual work in progress, so check back for updates.
This is not a tutorial farm, a code dump, or a catalog of buzzwords. It's a place to understand how an algorithm works, why it's appropriate, and when to move on to a neighboring method.
- Start with context. Each entry begins with a short statement of what the method is good at and the assumptions it quietly relies on.
- Scan the formulation. A compact, precise formulation follows (objective, loss, constraints). No heavy derivations unless they matter for usage.
- Check "When to prefer / avoid." Practical decision criteria so you can move quickly.
- Look sideways. Every method lists a few close neighbors (e.g., PCA ↔ ICA; K-means ↔ GMM/EM; Lasso ↔ Ridge/Elastic-Net).
- Apply with discipline. Metrics and diagnostics are included so results don't become anecdotes.
Section | Purpose | What you'll see |
---|---|---|
Intent | One-sentence "what problem does this solve?" | Clear problem statement |
Formulation | The core mathematics without ceremony | Objective/loss, constraints, variables |
Assumptions | The part that breaks silently if ignored | IID, linearity, separability, smoothness, stationarity, etc. |
When to prefer | Practical conditions where it excels | Data size/shape, noise regime, feature types |
When to avoid | Failure modes and edge cases | Multicollinearity, non-convexity traps, class imbalance, etc. |
Neighbor methods | Closely related alternatives | Swap-ins worth testing |
Diagnostics | How to know it worked | Residual checks, calibration, stability, uncertainty |
Biological applications | Where this has bite | Imaging, genomics, EHR, epidemiology, physiology |
Data shape | Often a good starting point | If that stalls, try |
---|---|---|
Tabular, small–medium | Regularized GLMs; GBDT (XGB/LGBM/CatBoost) | Nonlinear SVM; simple MLP |
Sequences / longitudinal | Transformers (baseline); temporal CNNs | RNN/LSTM/GRU; HMM/Kalman for structure |
Spatial grids / images | CNNs; U-Net for segmentation | Vision Transformers; diffusion for generation |
Graphs / molecular | Message-passing GNNs | Graph Transformers; spectral methods |
Very high-dimensional, low labels | Contrastive pretraining; masked modeling | Autoencoders/VAEs; self-distillation |
Uncertainty is critical | Bayesian GLMs; calibrated ensembles | Bayesian deep learning; conformal prediction |
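The last row lists conformal prediction as a fallback when uncertainty matters. A minimal sketch of split conformal intervals for regression follows, assuming a toy point predictor and made-up calibration data; `conformal_quantile`, `predict`, and the numbers are hypothetical and only illustrate the mechanics.

```python
import math

def conformal_quantile(residuals, alpha):
    """Split-conformal radius for miscoverage level alpha.

    Takes the k-th smallest absolute residual with
    k = ceil((n + 1) * (1 - alpha)); returns inf when k > n.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(abs(r) for r in residuals)[k - 1]

def predict(x):
    # Hypothetical point predictor standing in for any fitted model
    return 2.0 * x

# Made-up calibration set, held out from model fitting
calib_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
calib_y = [0.3, 1.8, 4.4, 5.9, 8.5, 9.6, 12.2, 14.1, 15.7]
residuals = [y - predict(x) for x, y in zip(calib_x, calib_y)]

radius = conformal_quantile(residuals, alpha=0.2)
x_new = 10.0
interval = (predict(x_new) - radius, predict(x_new) + radius)
print(interval)
```

Under exchangeability of calibration and test points, the interval covers a new observation with probability at least 1 - alpha, whatever the quality of the underlying predictor; a poor predictor simply yields wide intervals.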
I enjoy the structure and honesty of mathematics. In applied work, especially in biology and medicine, results improve when the assumptions are explicit, the algorithms are chosen for the data (not for fashion), and the limits are respected. These notes are written to make that process fast, transparent, and repeatable.
You will see a mix of:
- Core methods (GLMs, trees/ensembles, SVMs, PCA, clustering)
- Advanced learning (Transformers, diffusion, contrastive/self-supervised learning, GNNs)
- Frontier topics (flows, neural ODEs, causal estimation, federated/multitask learning)
- Biological ML where signal is subtle and mechanisms matter (U-Net families, AlphaFold-style structure models, EHR sequence models)
Use this structure to keep entries consistent and quick to scan.
This section collects core, advanced, and frontier methods I study and use. Entries focus on:
- Used for (primary purpose)
- When (practical decision criteria)
- Similar (adjacent methods to consider)
The intent is clarity: rigorous enough for research, direct enough for practice.
Method | Used for | When | Similar |
---|---|---|---|
Linear Regression | Linear relationship to continuous target | Interpretable baseline; trend testing | Ridge, Lasso |
Logistic Regression | Binary classification with calibrated probs | Probabilities + interpretability; baseline | Probit; Softmax Regression (multiclass) |
Decision Trees | Rule-based classification/regression | Nonlinear patterns; mixed feature types | Random Forest; Gradient Boosted Trees |
Random Forest | Ensemble of trees (bagging) | Robust tabular performance; resistant to overfitting | ExtraTrees; GBM |
Gradient Boosting (XGB/LGBM/CatBoost) | Boosted trees for strong tabular accuracy | State-of-the-art on many tabular tasks | AdaBoost; Random Forest |
K-Nearest Neighbors | Instance-based classification/regression | Simple nonparametric baseline; low-dim data | KDE; RBF-kernel SVM |
Support Vector Machines | Max-margin classification/regression | Medium-sized data; robustness to outliers | Logistic (linear); NNs (nonlinear) |
Naïve Bayes | Generative classification with independence | Text; very high-dimensional sparse features | Logistic Regression; LDA |
PCA | Orthogonal dimensionality reduction | Compression; de-correlation; visualization | SVD; ICA |
K-Means | Hard-partition clustering | Fast baseline clustering | GMM (soft clusters); DBSCAN |
Expectation–Maximization | Latent-variable MLE (e.g., GMM) | Overlapping distributions; soft assignments | K-Means; Variational Inference |
Apriori / FP-Growth | Association rule mining | Frequent itemsets; basket analysis | Eclat |
Dynamic Programming | Optimal substructure optimization | Overlapping subproblems | Greedy (approximate) |
Gradient Descent | Continuous optimization | Differentiable models; large-scale training | SGD; Adam; RMSProp |
Neural Networks (MLP) | Flexible nonlinear mapping | Complex patterns; large data | CNN; RNN |
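To make two rows of this table concrete, here is a minimal sketch of batch gradient descent fitting a one-variable linear regression. The function name `fit_line_gd`, the learning rate, the step count, and the toy data are assumptions chosen for illustration, not a recommended configuration.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated exactly from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gd(xs, ys)
print(round(w, 4), round(b, 4))
```

The step size must stay below 2 divided by the largest eigenvalue of the loss Hessian for the iteration to converge; for this toy data that bound is roughly 0.15, so 0.05 is comfortably inside it.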
Method | Used for | When | Similar |
---|---|---|---|
CNNs | Spatial representation learning | Vision; local structure | ViTs; Graph Convolutions |
RNN / LSTM / GRU | Sequence modeling with memory | Time series; language; speech | Transformers; Temporal CNNs |
Transformers | Attention-based sequence modeling | Language; multimodal; long context | RNNs; Attentional CNNs |
Autoencoders | Compression; anomaly detection | Representation learning | PCA; VAE |
Variational Autoencoders | Probabilistic generative modeling | Latent structure + generation | GANs; Normalizing Flows |
GANs | Adversarial generative modeling | Realistic synthesis; augmentation | VAEs; Diffusion |
Diffusion Models | Score-based generation | Diversity + stability | GANs; Score Matching |
Reinforcement Learning (Q-Learning) | Value-based decision policies | Discrete actions; tabular/compact states | Policy Gradient; DQN |
Policy Gradient / Actor–Critic | Direct policy optimization | Continuous/high-dim actions | REINFORCE; PPO |
K-Means++ / Advanced Clustering | Improved initialization | Reduce bad local minima | Spectral; GMM; DBSCAN |
DBSCAN | Density-based clustering with noise | Arbitrary shapes; outliers | OPTICS; HDBSCAN |
Spectral Clustering | Graph-Laplacian embeddings | Manifold/complex geometry | GNNs; Laplacian Eigenmaps |
HMMs | Probabilistic sequence models | Hidden state dynamics | Kalman Filters; CRF |
Kalman Filters | State estimation with noise | Real-time tracking | Particle Filters; HMM |
Graph Neural Networks | Learning on graphs | Relational structure > features | CNN (grids); Graph Transformers |
MCMC | Sampling complex posteriors | Bayesian inference | Variational Inference; HMC |
GBDT (XGB/LGBM/CatBoost) | Top performance on tabular data | Accuracy with moderate compute | Random Forest; AdaBoost |
Recommenders (MF: SVD/ALS) | Collaborative filtering | Sparse userโitem matrices | NCF; Graph-based Recsys |
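As one worked instance from this table, the Kalman filter row can be sketched in its simplest scalar form. This assumes a random-walk state observed directly with Gaussian noise; `kalman_1d` and its parameter names are hypothetical, and real applications usually need the matrix form.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed with noise.

    Returns the sequence of state estimates and the final estimate variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: state carries over, uncertainty grows by the process variance
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates, p

# Made-up noisy readings of a quantity near 5
readings = [5.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 5.1]
estimates, final_var = kalman_1d(readings, process_var=1e-4, meas_var=0.04)
print(estimates[-1], final_var)
```

With `process_var=0` the filter reduces to a recursive Bayesian mean, which is a convenient sanity check before adding dynamics.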
Method | Used for | When | Similar |
---|---|---|---|
Normalizing Flows | Exact-likelihood generative modeling | Need density + sampling | VAE; Diffusion |
Diffusion Transformers | Diffusion + Transformer backbones | Scaled multimodal generation | DDPM; GANs |
Neural ODEs | Continuous-time dynamics | Physics/biology/finance signals | RNNs; SDEs |
Graph Transformers / Message Passing | Expressive graph learning | Complex relational structure | Spectral GNNs |
Neural Tangent Kernel | Infinite-width NN theory | Generalization & convergence study | Kernels; GPs |
Meta-Learning (MAML, ProtoNets) | Rapid adaptation | Few-shot; transfer | Bayesian Opt; Fine-tuning |
Bayesian Deep Learning | Uncertainty-aware deep models | High-stakes decisions | MCMC; VI |
Causal Inference (DoWhy, EconML) | Estimating causal effects | Policy/health interventions | IV; Propensity Scores |
Federated Learning (FedAvg, FedProx) | Privacy-preserving distributed training | Decentralized sensitive data | Distributed SGD; DP |
Contrastive Learning (SimCLR, CLIP) | Self-supervised representations | Limited labels; large raw data | Autoencoders; Distillation |
Energy-Based Models | Unnormalized density modeling | Intractable partition functions | Boltzmann Machines |
RL: PPO / SAC / DDPG | Scalable policy optimization | Continuous/high-dim control | REINFORCE; Q-Learning |
Multi-Agent RL | Interacting agents | Markets; autonomy; swarms | Game Theory; Single-agent RL |
Mixture-of-Experts / Sparse Transformers | Efficient scaling | Conditional computation | Standard Transformers; LoRA |
Quantum ML (VQE, QAOA) | Quantum optimization/chemistry | NISQ-era research | Classical Variational Methods |
Neurosymbolic AI | Neural perception + symbolic reasoning | Tasks needing both pattern and logic | Knowledge Graphs |
Masked Self-Supervision (BERT, MAE) | Representation pretraining | Large unlabeled corpora | Contrastive; Autoencoders |
Prompting / Few-Shot Adaptation | LLM task transfer without updates | Generalization to unseen tasks | Meta-Learning; Instruction Tuning |
Curriculum Learning | Staged difficulty schedules | Unstable/complex training | RL Shaping; Augmentation |
Neural Architecture Search | Automated model design | Edge constraints; task specificity | Bayesian/Hyperparameter Opt |
Area | Model | Used for | Similar |
---|---|---|---|
Neuro / Brain Imaging | U-Net, V-Net, nnU-Net; BrainAGE; GLM (SPM/FSL) | Segmentation; age prediction; activation modeling | SegNet; DeepLab |
Radiology | Radiomics+ML; DeepMedic; CheXNet | Quantitative features; lesion segmentation; X-ray Dx | ResNet/EfficientNet variants |
Genomics | DeepSEA; AlphaFold; SpliceAI; DeepCpG/EpiDeep | Variant effect; protein structure; splicing; epigenetics | Basset; Basenji; RoseTTAFold |
Cardiology | ECGNet/DeepECG; EchoNet | Arrhythmia classification; EF estimation | 1D CNNs; video CNNs |
Pathology | HoVer-Net; CLAM (MIL); tile-based classifiers | Nucleus segmentation; WSI classification | Mask R-CNN; MIL variants |
Population & EHR | RETAIN; DeepPatient; BEHRT | Longitudinal risk; multi-outcome prediction | RNNs; Transformers for EHR |
Epidemiology | Compartmental (SIR/SEIR/SEIRD); ABM | Spread modeling; intervention simulation | System dynamics |
Multimodal Medical AI | MedCLIP; BioViL; Bio/ClinicalBERT | Image–text alignment; biomedical NLP | CLIP; BERT |
- Clinical Core: U-Net, RETAIN, SIR (established workhorses).
- Research-Grade: AlphaFold, DeepSEA, SpliceAI (molecular scale).
- Practice-Changing: CheXNet, EchoNet, CLAM (real clinical impact).
- Emerging Frontier: MedCLIP, BEHRT, BioBERT (multimodal and longitudinal).
- Core Python
- Data Handling
- Visualization
- Machine Learning / AI
- Math, Statistics, SciPy
- NLP / Text
- Utilities & Workflow
- Data I/O
- Visualization Add-ons
- Advanced Data & Big Data
- Deep Learning & GPU
- Advanced AI / Transformers / LLMs
- Advanced Visualization & Dashboards
- Statistics, Bayesian, Probabilistic
- Optimization & Math
- Graphs, Knowledge, Advanced Data
- Advanced NLP / Text
- Advanced Utilities & Parallelism
- Computer Vision & Image/Video
- Geospatial & Maps
- Ultra / Rare Imports (HPC, Research, Frontier)
Library | Role | Notes |
---|---|---|
os, sys, Path | Filesystem, environment, paths | Portable path handling via pathlib |
re, json, csv | Regex, serialization, CSV I/O | Use jsonlines for large JSONL |
math, random, time, datetime | Math, RNG, timing | dt alias for concise timestamps |
Counter, defaultdict | Counting, default dicts | Efficient tallies and grouping |
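A short sketch of how the pieces in this table combine in practice, using only the standard library. The file names are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from pathlib import Path

# Hypothetical file names, for illustration only
files = ["scan_01.nii", "scan_02.nii", "notes.txt", "labels.csv", "meta.csv"]

# Tally file extensions with Counter
ext_counts = Counter(Path(name).suffix for name in files)

# Group file stems by extension with defaultdict (no key-existence checks needed)
by_ext = defaultdict(list)
for name in files:
    p = Path(name)
    by_ext[p.suffix].append(p.stem)

print(ext_counts)
print(dict(by_ext))
```

`Path.suffix` and `Path.stem` work on pure path strings, so this runs without touching the filesystem.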
Library | Role | Notes |
---|---|---|
numpy | Arrays, vectorized math | Foundation for most stacks |
pandas | Tabular data | Wide ecosystem; groupby, time series |
pyarrow | Columnar memory, parquet | High-perf interchange with pandas |
polars | Fast DataFrame (Rust engine) | Laziness, speed on medium/large data |
Library | Role | Notes |
---|---|---|
matplotlib, seaborn | Static plotting | Seaborn for statistical charts |
plotly.express, graph_objects | Interactive plots | Browser-ready, tooltips, zoom |
altair | Declarative grammar | Readable specs; Vega-Lite backend |
Library | Role | Notes |
---|---|---|
scikit-learn | Classical ML, metrics, preprocessing | Baselines, pipelines, grid search |
XGBoost, LightGBM, CatBoost | Gradient boosting | SOTA tabular; categorical support (CatBoost) |
PyTorch | Deep learning | Define-by-run, custom training loops |
TensorFlow / Keras | Deep learning | High-level layers, production tooling |
transformers | LLMs, transfer learning | Tokenizers, pipelines, model zoo |
Library | Role | Notes |
---|---|---|
scipy (stats, signal, optimize, integrate) | Scientific routines | Tests, filters, solvers |
statistics | Built-in descriptive stats | Lightweight helpers |
sympy | Symbolic math | Derivations, simplifications |
Library | Role | Notes |
---|---|---|
nltk, spacy | Tokenization, parsing | spaCy for pipelines; NLTK utilities |
gensim | Word2Vec, LDA | Topic modeling and embeddings |
wordcloud | Visual summaries | Exploratory visuals |
Library | Role | Notes |
---|---|---|
tqdm | Progress bars | Notebook-friendly via tqdm.notebook |
logging, warnings | Diagnostics | Set handlers, suppress noise selectively |
joblib, pickle | Model I/O | Persist artifacts; mind security |
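The note "set handlers, suppress noise selectively" can be sketched with the standard library alone. The logger name `pipeline` and the function `noisy_step` are hypothetical stand-ins for real pipeline code.

```python
import logging
import warnings

# Configure one named logger with an explicit handler and format
logger = logging.getLogger("pipeline")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

def noisy_step():
    # Stand-in for a library call that emits a deprecation warning
    warnings.warn("deprecated option", DeprecationWarning)
    return 42

with warnings.catch_warnings():
    # Silence only DeprecationWarning, and only inside this block
    warnings.simplefilter("ignore", DeprecationWarning)
    result = noisy_step()

logger.info("step finished with result=%s", result)
```

Scoping the filter with `catch_warnings` keeps the suppression local, so the same warning still surfaces elsewhere in the program.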
Library | Role | Notes |
---|---|---|
csv, sqlite3 | Flat files, local DB | Good for lightweight pipelines |
h5py | HDF5 storage | Large arrays, hierarchical datasets |
requests | HTTP APIs | Timeouts, retries, backoff |
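As an example of the lightweight pipeline the first row describes, this sketch parses CSV text and loads it into an in-memory SQLite database. The column names and values are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would come from a file
raw = "patient_id,age\np1,34\np2,58\np3,41\n"
rows = list(csv.DictReader(io.StringIO(raw)))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id TEXT PRIMARY KEY, age INTEGER)")
# Parameterized inserts avoid building SQL strings by hand
conn.executemany(
    "INSERT INTO patients (patient_id, age) VALUES (?, ?)",
    [(r["patient_id"], int(r["age"])) for r in rows],
)
conn.commit()

mean_age = conn.execute("SELECT AVG(age) FROM patients").fetchone()[0]
older = [
    pid
    for (pid,) in conn.execute(
        "SELECT patient_id FROM patients WHERE age > ? ORDER BY patient_id", (40,)
    )
]
conn.close()
print(mean_age, older)
```

Swapping `":memory:"` for a file path persists the database to disk with no other changes.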
Library | Role | Notes |
---|---|---|
networkx | Graphs/networks | Topology, centrality measures |
geopandas, folium | Geospatial viz | Interactive maps and overlays |
Library | Role | Notes |
---|---|---|
dask.dataframe | Out-of-core pandas | Parallelize wide workflows |
vaex, modin | Lazy or distributed DataFrame | Scale on single machine or cluster |
pyspark | Spark API | Cluster compute for very large data |
Library | Role | Notes |
---|---|---|
torch, nn, optim, F | Core training | Custom loops, modules |
torch.distributed, TensorBoard | Multi-GPU, logging | DDP for scale-out |
tensorflow, keras | DL stacks | High-level layers and fit loops |
jax, jnp, flax, optax | JIT DL, functional NN | Fast grad, pure functions |
Library | Role | Notes |
---|---|---|
transformers | LLMs, pipelines | Text, vision, audio models |
peft, bitsandbytes | Efficient finetuning, quantization | LoRA, 8-bit/4-bit training |
accelerate, sentence_transformers | Distributed, embeddings | Multi-GPU orchestration, retrieval |
Library | Role | Notes |
---|---|---|
bokeh, holoviews, hvplot | Interactive viz stacks | Linked brushing, high-level APIs |
panel, dash, streamlit | Dashboards/apps | From notebook to app quickly |
pyvis, pyvista | Networks, 3D | Explorable graphs and volumes |
Library | Role | Notes |
---|---|---|
pymc, arviz | Bayesian inference, diagnostics | Priors, posteriors, PPC |
statsmodels | Regression, time series | GLM, ARIMA families |
lifelines, prophet | Survival, forecasting | Kaplan–Meier; components/trends |
Library | Role | Notes |
---|---|---|
cvxpy, pulp, ortools | Convex, LP/MIP, routing | Solvers and modeling |
numba | JIT acceleration | Speed up Python loops |
sympy | Symbolic math | Closed forms, derivations |
Library | Role | Notes |
---|---|---|
networkx, neo4j | Graph analysis, DB | Topology + graph stores |
dgl, torch_geometric, stellargraph | Graph ML | Message passing, link prediction |
Library | Role | Notes |
---|---|---|
stanza, flair | NLP pipelines, embeddings | Strong pretrained components |
yake, textblob | Keywords, sentiment | Lightweight tasks |
gensim LdaModel | Topic modeling | Classical LDA workflow |
Library | Role | Notes |
---|---|---|
ray, joblib | Distributed, parallel pipelines | Scale compute across cores/nodes |
ThreadPoolExecutor, ProcessPoolExecutor | Concurrency APIs | IO vs CPU bound tasks |
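The "IO vs CPU bound" distinction in the last row can be illustrated with a thread pool on a simulated I/O task. `fake_io` is a hypothetical stand-in for a network or disk call, and the worker count is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    # Stand-in for an I/O-bound call (hypothetical); sleeping releases the GIL
    time.sleep(0.05)
    return task_id * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map preserves input order even though tasks finish concurrently
    results = list(pool.map(fake_io, range(8)))
elapsed = time.perf_counter() - start

print(results, round(elapsed, 2))
```

For CPU-bound work, `ProcessPoolExecutor` offers the same interface, but the submitted function and its arguments must be picklable, and on platforms that spawn worker processes the pool should be created under an `if __name__ == "__main__":` guard.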
Library | Role | Notes |
---|---|---|
opencv | Image/video processing | Transforms, codecs, tracking |
mediapipe | Pose/gesture | Prebuilt inference graphs |
albumentations, skimage | Augmentation, analysis | Training-ready pipelines |
imageio, tifffile | I/O, large images | Microscopy, GeoTIFFs |
Library | Role | Notes |
---|---|---|
geopandas, shapely | Geo tables, geometry ops | Buffers, intersections |
rasterio, cartopy | Rasters, cartography | CRS management |
folium, contextily | Interactive maps, basemaps | Tiles and layers |
Area | Examples | Use |
---|---|---|
HPC & GPU Kernels | triton, mpi4py, pycuda, pyopencl, numexpr | Custom kernels, multi-node, speed |
Large-Scale Training | deepspeed, fairscale, megatron | Sharded models, parallelism |
Probabilistic Programming | pyro, edward2, gpytorch | Bayesian deep learning, GPs |
Causal ML | dowhy, econml, causalinference | Effects, policy evaluation |
Science & Bio | biopython, deepchem, mdtraj, openmm | Genomics, chemistry, MD |
Quantum | qiskit, cirq, pennylane, qutip | VQA, simulation |
Advanced Viz | datashader, mayavi, k3d, fastplotlib | Huge data, 3D interactive |
Privacy & Federated | opacus, tensorflow_privacy, syft | Differential privacy, FL |
Infra & MLOps | prefect, dagster, kedro, mlflow, hydra, feast | Pipelines, tracking, configs |