pbanddata.com
Data Scientist
a little bit of everything and some data
About
I am a data scientist based in Nashville building pipelines, models, and products that actually get used.
MS in Data Science from Vanderbilt University. IBM Venture Fellow. I work at the intersection of healthcare data, machine learning, and applied AI.
Work and Ideas
Work across healthcare AI, financial modeling, fraud detection, EV analytics, and more.
Maternal health risk triage is slow, resource-intensive, and inaccessible in low-connectivity environments. I designed a dual-model agentic pipeline that pairs an on-device Phi-4-mini for real-time inference with a Claude constitutional AI layer for safety review. The system runs offline-first, flags high-risk pregnancies, and escalates edge cases for human review. Placed 3rd at the Vanderbilt AI Showcase, Spring 2026.
Gig platforms like Uber and TaskRabbit were designed for Western markets. In Egypt, people negotiate. I built Shoghl, a bilingual Arabic and English task marketplace where taskers submit competing offers with personal notes and customers counter or accept, closing over WhatsApp the way Egyptians actually do business. Includes a diaspora toggle for Egypt vs. the US. Commission is set at 10% to keep more earnings in the hands of the people doing the work.
Built with Maggie Tu. A distributed fraud detection pipeline on GCP Dataproc that processes 590,540 labeled credit card transactions across 434 features. The dataset is heavily imbalanced (3.5% fraud), so the pipeline uses class weighting to keep models from learning to predict "not fraud" for everything. The full Spark MLlib pipeline uses broadcast joins on the identity table to avoid a shuffle, drops columns with more than 50% nulls, imputes numeric features with the median, encodes categoricals with StringIndexer and OneHotEncoder, and writes outputs to GCS as Parquet partitioned by ProductCD. It also includes a cache experiment that compares training time with and without Spark cache() to demonstrate the cost of DAG recomputation. The GBT model achieved 0.89 AUC-ROC and 97.6% recall on the fraud class specifically, rather than a weighted average, which would hide poor fraud detection.
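To make the point about weighted averages concrete, here is a minimal pure-Python sketch. The labels are made up for illustration (1,000 transactions with the same 3.5% fraud rate), not the project's data, and the helper names are hypothetical. The class-weight formula is the common "balanced" heuristic, n / (k * class_count); the write-up does not say which weighting scheme the pipeline actually used, so treat that part as an assumption.

```python
def per_class_recall(y_true, y_pred, label):
    """Recall for one class: true positives / actual members of that class."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    actual = sum(1 for t in y_true if t == label)
    return hits / actual if actual else 0.0


def weighted_recall(y_true, y_pred):
    """Support-weighted average of per-class recall."""
    n = len(y_true)
    return sum(
        per_class_recall(y_true, y_pred, c) * y_true.count(c) / n
        for c in set(y_true)
    )


def balanced_class_weights(y_true):
    """Assumed 'balanced' heuristic: n / (num_classes * class_count)."""
    n = len(y_true)
    classes = sorted(set(y_true))
    return {c: n / (len(classes) * y_true.count(c)) for c in classes}


# Illustrative labels only: 35 fraud (1) and 965 legitimate (0) transactions.
y_true = [1] * 35 + [0] * 965

# A degenerate model that predicts "not fraud" for everything.
all_legit = [0] * len(y_true)

print("weighted recall:", round(weighted_recall(y_true, all_legit), 3))
print("fraud recall:   ", per_class_recall(y_true, all_legit, 1))
print("class weights:  ", balanced_class_weights(y_true))
```

On this toy data the do-nothing model scores a weighted recall of 0.965 while catching no fraud at all, which is why the fraud-class number is the one worth reporting. The balanced weights give each fraud row roughly 14 times the weight of a typical row, so misclassifying it costs the model correspondingly more.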
Only 2.5% of US Nissan Ariya drivers were using the MyNISSAN app to charge their vehicles. Working as a five-person Vanderbilt data science team, we engineered a Jump Rate anomaly detection metric on VIN-level charging data to identify when drivers were charging outside the app. We applied Tukey HSD statistical testing to validate behavioral differences across three user personas: consistent, inconsistent, and infrequent. We built an interactive geospatial map using Geopy and Folium to visualize nationwide charging location patterns, and delivered segment-specific retention and marketing recommendations directly to the American CFO of Nissan, working in two-week agile sprints.
Hospitals are legally required to publish their prices. Almost no one knows how to find them, compare them, or act on them. Nofi is a mobile-first interface that surfaces real procedure costs across nearby providers and lets users swipe to compare. The goal is to make price transparency actually usable rather than a PDF buried on a hospital website. The problem is structural. The solution is behavioral design.
Over 100,000 people in the US are waiting for an organ transplant and 17 die every day. The bottleneck is not supply, it is trust and awareness. Zoe, from the Greek word for life, is a community-powered donor network that connects donors, recipients, and advocates through verified relationships rather than cold institutional registries. The design prioritizes trust signals and personal story over clinical form-filling.
Student loan dashboards tell you how much you owe. They do not tell you how to feel about it. Arbor, named for the tree Zacchaeus climbed before his life changed, reframes debt repayment around moments of reckoning and recovery. Built with a real-time payoff planner comparing avalanche and snowball strategies, a PSLF tracker, and an HYSA arbitrage calculator that tells you when saving beats paying down debt. Designed to connect to StudentAid.gov and Plaid so every number reflects your actual situation.
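For readers unfamiliar with the two payoff strategies, here is a rough Python sketch of how such a comparison could work. It is not Arbor's actual code: the `Loan` fields, the monthly compounding, the sample loans, and the `saving_beats_paying` rule (after-tax HYSA yield versus loan APR) are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass
class Loan:
    name: str
    balance: float
    apr: float      # annual rate as a fraction, e.g. 0.07 for 7%
    minimum: float  # minimum monthly payment


# Avalanche targets the highest rate first; snowball the smallest balance.
PRIORITY = {
    "avalanche": lambda loan: -loan.apr,
    "snowball": lambda loan: loan.balance,
}


def simulate(loans, monthly_budget, strategy):
    """Return (months_to_payoff, total_interest) for a fixed monthly budget."""
    key = PRIORITY[strategy]
    active = [replace(loan) for loan in loans]  # copy so inputs stay untouched
    months, total_interest = 0, 0.0
    while any(loan.balance > 0.005 for loan in active):
        months += 1
        if months > 1200:
            raise RuntimeError("budget too small to ever pay off the loans")
        # Accrue one month of interest (assumed simple monthly compounding).
        for loan in active:
            if loan.balance > 0:
                interest = loan.balance * loan.apr / 12
                loan.balance += interest
                total_interest += interest
        remaining = monthly_budget
        # Pay every minimum first.
        for loan in active:
            if loan.balance > 0:
                pay = min(loan.minimum, loan.balance, remaining)
                loan.balance -= pay
                remaining -= pay
        # Send whatever is left to loans in strategy order.
        for loan in sorted((l for l in active if l.balance > 0), key=key):
            pay = min(loan.balance, remaining)
            loan.balance -= pay
            remaining -= pay
            if remaining <= 0:
                break
    return months, round(total_interest, 2)


def saving_beats_paying(hysa_apy, loan_apr, tax_rate=0.0):
    """Assumed rule of thumb: save only if after-tax yield exceeds the loan APR."""
    return hysa_apy * (1 - tax_rate) > loan_apr


# Hypothetical loans, chosen so the two strategies pick different targets.
loans = [
    Loan("grad loan", 10000.0, 0.07, 100.0),
    Loan("undergrad loan", 3000.0, 0.04, 50.0),
]

for strategy in ("avalanche", "snowball"):
    print(strategy, simulate(loans, 500.0, strategy))
```

With the same budget, avalanche pays less total interest, while snowball clears the first loan sooner, which is the trade-off a planner like this would surface.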
Women travel differently than platforms are built to serve them. TARI, from the ancient Egyptian word for rise, is a travel platform designed around how women research, plan, and share experiences on the road. On pause while other priorities take the front seat.
Visit travelwithtari.com
Most recycling programs fail because the feedback loop between action and reward is too long. Keko is a gamified cup recycling pilot launching in Nashville that closes that loop with bin deployment, neighborhood pickup, and punch card rewards. Target launch partners are The Well and 8th and Roast.
Experience
Feb 2025 to Present
HCA Healthcare
Nashville, TN
Jun 2024 to Dec 2024
HCA Healthcare
Nashville, TN
May 2022 to May 2024
Diana Health
Nashville, TN
Contact
mary.a.morkos@gmail.com · Nashville, TN
Most people scroll past this. You did not.
That is exactly the kind of person I want to work with. Someone who notices the things others miss.
I am Mary. I build things that matter. If you are reading this, maybe we should talk.