Mikhail Poma | +36203263091 | pomamikhail@gmail.com | linkedin.com/in/mikhail-poma | Budapest, Hungary

[education] 2014-2020 Moscow State Institute (specialist degree), Faculty of Psychology, Department of Psychophysiology, area of studies: computational neuroscience

└- [project] [volunteering] 2015 Worked on an fMRI scanner with a team of psychologists; helped design psychological tests and conduct and process MRI studies

└- [project] Graduation Project: EEG Signal Classification for Concealed Information Detection (P300 Paradigm)
    <description> Engineered a machine learning pipeline to classify electroencephalographic (EEG) recordings for the detection of intentionally concealed information, leveraging the P300 cognitive evoked potential. Adapted and applied modern Brain-Computer Interface (BCI) algorithms, specifically those utilizing Riemannian geometry developed by A. Barachant, to a novel psychophysiological domain. Developed the complete data processing and evaluation workflow in Python utilizing mne, pyriemann, and scikit-learn. The technical pipeline involved frequency filtering (1-20 Hz), epoch extraction, and mitigating inherently low signal-to-noise ratios through targeted epoch averaging. For feature extraction and classification, I computed covariance matrices from the EEG channels, projected them onto a Riemannian manifold to reduce multi-modal noise, and rigorously evaluated Minimum Distance to Mean (MDM) and Logistic Regression classifiers against a Common Spatial Patterns (CSP) baseline. </description>
    <result> Achieved a peak classification accuracy of 0.76 using Logistic Regression on the Riemannian manifold with only 3 to 4 epoch averages, significantly outperforming chance levels and classical Euclidean metric approaches.</result>
    <result> Successfully demonstrated the viability of utilizing Riemannian classifiers for cognitive lie detection tasks, providing a noise-robust algorithmic foundation for central nervous system-based evaluation systems.</result>

[school] 2020 NeuroMatch Academy - summer school in Computational Neuroscience

[project] 2020 Telegram bot development for Telegram channel administration and custom sales

[course] 2021 Open Data Science "Machine Learning on Knowledge Graphs" course by senior Google researcher Mikhail Galkin

[project] 2021 Migration from a DEVONthink ontology to a Wikibase (Wikidata-like) instance
    <result> Designed a knowledge base system and constructed pipelines for migration from tabular databases to a graph database </result>

[executive education program] [Fellowship Program] 2021-2022 Nebius Academy Data Science School (ex-Yandex)

[course] Mathematics for Machine Learning, Imperial College London

[work] [data scientist] 2022 Antidotehealth Medical startup

└- [project] telemedicine bot search engine optimisation
    <result> Migrated search engine from a rule-based system to a modern NLP stack </result>
    <result> Improved telemedicine search engine to mitigate 76% of error cases. </result>

[work] [data scientist] 2022-2023 Sogo Services

└- [project] coordinates, position and shape estimation for coronary stents
    <description> Developed a system to determine the position and shape of a medical device on intraoperative medical images. My primary objective was to automate device detection, identify its edges, and estimate its shape and spatial orientation, tasks previously performed visually by the surgeon. As the problem resided at the intersection of 2D and 3D computer vision, initial experiments with classical computer vision techniques proved ineffective due to high visual variability, even on controlled laboratory images. To overcome this, I engineered a multi-stage machine learning pipeline. The solution utilized YOLOv5 for initial object detection, followed by precise segmentation of the device and its edges. I also designed a custom loss function to calculate the most probable spatial alignment of the object. The resulting system surpassed human-level efficiency and accuracy. It is currently deployed as a surgical assistant tool, pending extensive regulatory approval before it can be authorized to make autonomous surgical decisions. </description>
    <result> Engineered and deployed a surgeon-assistant tool that is actively used by surgeons. Proposed and prototyped new ML solutions by closely collaborating with developers and medical professionals on the client side, conducting hypothesis validation and PoC experiments. Designed a full pipeline of an ML algorithm lifecycle for a healthcare startup </result>

└- [project] signal processing analysis and PoC for intra-aortic pressure signal data
    <result> The ML algorithm lifecycle design led to the startup's acquisition for $300 million. </result>

[course] 3D computer vision course

[course] Computer vision engineering course covering CV engineering and deployment best practices

[independent researcher] [data scientist] 2024 - 2025 Kaggle competitor

└- [project] BirdCLEF bird songs classification competition

└- [project] NASA e-nose signal classification challenge

└- [project] RSNA 2024 Lumbar Spine 3D Degenerative Classification competition
    <description> Secured a Bronze Medal (103rd place globally) in the 2024 RSNA Lumbar Spine 3D Degenerative Classification Kaggle competition by developing a computer vision solution to detect lumbar nerve impingement from MRI scans. Collaborating closely with a teammate, I took charge of validating and debugging a complex multi-stage model architecture designed to process sequences of 2D slices into cohesive 3D anatomical representations. The pipeline I systematically tested and refined consisted of the segmentation of target vertebrae on vertical slices, algorithmic cross-referencing to match horizontal slices to the detected landmarks, and the precise classification of impingement zones. Once the baseline framework was fully validated, we strategically divided our research efforts to rapidly iterate and test independent hypotheses in parallel. My core technical contributions included significantly improving the public algorithm for spatial mapping between horizontal and vertical slices, alongside heavily optimizing a YOLO-based detection baseline. By successfully merging our parallel modeling efforts and insights, we engineered a highly accurate ensemble solution that resulted in our top-tier global finish. </description>

└- [project] Prediction of Human Gut Biotransformation Pathways competition

└- [project] Medical Sound Classification Challenge

└- [project] Stanford RNA 3D Folding competition
    <description> Secured a Silver Medal (29th place globally) in the 2025 Stanford RNA 3D Folding Kaggle competition, focusing on the prediction of complex three-dimensional RNA structures. To tackle this highly specialized challenge, I strategically partnered with a bioinformatics expert, forming a cross-functional team that perfectly balanced deep biological domain knowledge with advanced machine learning capabilities. As the primary ML engineer, I was responsible for adapting, deploying, and rigorously evaluating state-of-the-art neural network architectures from leading research laboratories within the constrained Kaggle environment. My technical contributions included extensive experimentation with cutting-edge models, notably engineering a LoRA fine-tuning pipeline for an open-source implementation of AlphaFold3. Through rigorous validation, I determined that architectures incorporating cross-species biological data yielded significantly superior predictive performance. This data-driven insight allowed us to strategically pivot away from the AlphaFold3 approach, optimizing our final ensemble to achieve a top-tier global ranking. </description>

└- [project] OpenAI to Z challenge: finding pre-Columbian villages on satellite images with validation in textual sources
    <description> Engineered a visual assessment system utilizing multi-channel satellite imagery to estimate the probability of pre-Columbian settlements, successfully discovering the two most relevant uncharted sites during an OpenAI-sponsored Kaggle competition. The overall objective required leveraging diverse open-source data (including LiDAR, multispectral satellite imagery, and historical texts) alongside LLMs to identify hidden archaeological traces beneath the Brazilian Amazon canopy. Acting as the sole technical specialist on a cross-functional team, I established the technical methodology and analytical pipeline. We strategically targeted a high-risk area designated for dam construction to uncover endangered archaeological sites before their potential destruction. The implemented solution involved biological zoning to isolate flora historically utilized by pre-Columbian populations, effectively filtering out post-expansion species. Furthermore, I conducted multispectral satellite image analysis to detect potential geoglyphs, and collaborated closely with a domain expert to optimize a prompt for assessing settlement probability. The project culminated in a comprehensive public report detailing our discovered settlements and methodological approach (https://kaggle.com/competitions/openai-to-z-challenge/writeups/lost-city-of-z). </description>

└- [project] Great Daxinzhuang Pottery Puzzle Challenge
    <description> Competed in a digital archaeology challenge organized by a Chinese archaeological institute, tasked with virtually reconstructing original vases from a dataset of over 17,000 ancient pottery shards excavated from a Shang Dynasty burial site. Following an extensive review of existing literature and algorithmic solutions, which proved inapplicable to the dataset's unique constraints, I engineered a novel reconstruction pipeline from scratch. The deployed solution utilized an advanced clustering approach that integrated classical computer vision features, texture analysis, and Self-Supervised Learning (SSL) embeddings to accurately group related fragments. Delivered a comprehensive technical presentation detailing the feature extraction methodology and clustering approach for the archaeological community (https://youtu.be/3NY9hE8rRmQ?si=VjU4CoAr6XWe21Nn). </description>

[course] [blockchain] Polkadot Academy PBA-X-3

[project] Medical ontology subgraph extraction
    <description> Engineered a highly optimized knowledge graph extraction pipeline to isolate a specialized subgraph from a large-scale medical ontology (Foundational Model of Anatomy) tailored for specific clinical applications. Utilizing Python, rdflib, and the high-performance oxigraph engine, I designed a Breadth-First Search (BFS) traversal algorithm governed by strict inclusion/exclusion parameters up to a specified depth. A core technical challenge involved architecting complex, custom SPARQL queries to correctly navigate OWL restrictions, parse specific semantic relations, and systematically filter out complex blank nodes that disrupt standard hierarchical querying. Following the automated extraction, the integrity, semantic consistency, and structural validity of the resulting ontology subset were rigorously verified using Protégé. </description>
    <result> Achieved high-performance graph traversal and query execution by integrating the Rust-based oxigraph engine, significantly reducing processing time for deep subgraph extraction. </result>
    <result> Successfully flattened complex hierarchical anatomical relationships into a clean, tabular format, directly enabling medical professionals to utilize the data for specialized downstream clinical tasks. </result>

[course] Rust programming language

[course] 2026 LangGraph DeepResearch course

[course] 2026 Nebius AI Performance Engineer (ongoing)

[languages] English - advanced; Russian - native