pb&data
peanut butter & data
a little bit of everything

pbanddata.com

Mary Morkos

Data Scientist

a little bit of everything and some data

About

I turn messy data
into decisions

I am a data scientist based in Nashville building pipelines, models, and products that actually get used.

MS in Data Science from Vanderbilt University. IBM Venture Fellow. My work sits at the intersection of healthcare data, machine learning, and applied AI.

$whoami // data scientist + builder
>Mary Morkos, Nashville TN
$education
>MS Data Science · Vanderbilt · 2026
>BS Data Science · Belmont · 2024
$recognition
>IBM Venture Fellow · AI Showcase 3rd · Spring 2026
$
356K
users impacted through Power BI licensing model
$409K
in monthly vendor invoices surfaced and validated
97.6%
recall on distributed fraud detection model
900K+
learner records owned and analyzed
8
products built, shipped, or in active development

Work and Ideas

Things I have
built

Work across healthcare AI, financial modeling, fraud detection, EV analytics, and more.

01 Built · IBM Venture Fellow

NurtureNet 2.0

Maternal health risk triage is slow, resource-intensive, and inaccessible in low-connectivity environments. I designed a dual-model agentic pipeline that pairs an on-device Phi-4-mini for real-time inference with a Claude constitutional AI layer for safety review. The system runs offline-first, flags high-risk pregnancies, and escalates edge cases for human review. Placed 3rd at the Vanderbilt AI Showcase, Spring 2026.
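A minimal sketch of the routing idea described above, not the production code. The on-device model and the safety reviewer are passed in as plain callables, and every name here (`Assessment`, `triage`, `toy_model`, `toy_review`, the 0.8 confidence threshold, the `score` and `needs_review` fields) is a hypothetical placeholder rather than part of NurtureNet itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assessment:
    risk: str          # "low" or "high"
    confidence: float  # 0.0 to 1.0


def triage(features: dict,
           on_device_model: Callable[[dict], Assessment],
           safety_review: Callable[[dict, Assessment], bool],
           min_confidence: float = 0.8) -> str:
    """Route one case to 'routine', 'flag_high_risk', or 'escalate_to_human'."""
    assessment = on_device_model(features)
    # Edge cases (low confidence, or the reviewer rejects the output)
    # go to a human instead of being decided automatically.
    if assessment.confidence < min_confidence:
        return "escalate_to_human"
    if not safety_review(features, assessment):
        return "escalate_to_human"
    return "flag_high_risk" if assessment.risk == "high" else "routine"


# Toy stand-ins so the sketch runs offline; a real system would call the
# local model and the review layer here (both hypothetical in this sketch).
def toy_model(features: dict) -> Assessment:
    score = features["score"]  # hypothetical pre-computed risk score in [0, 1]
    risk = "high" if score >= 0.5 else "low"
    return Assessment(risk=risk, confidence=abs(score - 0.5) * 2)


def toy_review(features: dict, assessment: Assessment) -> bool:
    # Approve unless the case was explicitly marked for review.
    return not features.get("needs_review", False)


if __name__ == "__main__":
    for case in ({"score": 0.95}, {"score": 0.05}, {"score": 0.55}):
        print(case, "->", triage(case, toy_model, toy_review))
```

The design choice worth noting is that escalation is checked before the risk label is used, so an uncertain or rejected assessment never reaches the automatic path.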

🥉
Vanderbilt AI Showcase · Spring 2026
02 Built

Shoghl شغل

Gig platforms like Uber and TaskRabbit were designed for Western markets. In Egypt, people negotiate. I built Shoghl, a bilingual Arabic and English task marketplace where taskers submit competing offers with personal notes and customers counter or accept, closing over WhatsApp the way Egyptians actually do business. Includes a diaspora toggle for Egypt vs. the US. Commission is set at 10% to keep more earnings in the hands of the people doing the work.

شغل
03 Built

Fraud Detection at Scale

Built with Maggie Tu. A distributed fraud detection pipeline on GCP Dataproc that processes 590,540 labeled credit card transactions across 434 features. The dataset is heavily imbalanced (3.5% fraud), so the pipeline uses class weighting to keep models from learning to predict "not fraud" for everything. The full Spark MLlib pipeline uses broadcast joins on the identity table to avoid a shuffle, drops columns with over 50% nulls, imputes numeric features with the median, encodes categoricals with StringIndexer and OneHotEncoder, and writes outputs to GCS as Parquet partitioned by ProductCD. It also includes a cache experiment that compares training time with and without Spark cache() to demonstrate the cost of DAG recomputation. The gradient-boosted trees (GBT) model achieved 0.89 AUC-ROC and 97.6% recall on the fraud class specifically, not a weighted average, which would hide poor fraud detection.
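To illustrate the class weighting mentioned above, here is a small pure-Python sketch. It uses the common "balanced" formula, weight = n_total / (n_classes × n_class), which is an assumption: the project does not state which weighting scheme it used. The toy label list simply mirrors the 3.5% fraud rate.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_total / (n_classes * n_class)."""
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {cls: n_total / (n_classes * n) for cls, n in counts.items()}


# Toy data: 1,000 rows with 3.5% fraud (label 1), matching the stated ratio.
labels = [1] * 35 + [0] * 965
weights = balanced_class_weights(labels)

# One weight per row; each class now contributes the same total weight.
row_weights = [weights[y] for y in labels]
fraud_total = sum(w for w, y in zip(row_weights, labels) if y == 1)
legit_total = sum(w for w, y in zip(row_weights, labels) if y == 0)

if __name__ == "__main__":
    print(weights)
    print(fraud_total, legit_total)
```

In Spark MLlib, a per-row weight column like this is typically handed to an estimator through its `weightCol` parameter; whether the pipeline did it that way is not stated here.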

97.6%
recall at 590K transactions
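Why report recall on the fraud class rather than a weighted average? A rough illustration with made-up labels (not project data): a model that always predicts "not fraud" on a 3.5% fraud dataset catches no fraud at all, yet its weighted recall still looks strong.

```python
def recall_for_class(y_true, y_pred, cls):
    """Fraction of rows whose true label is `cls` that were predicted as `cls`."""
    actual = sum(1 for t in y_true if t == cls)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    return hits / actual if actual else 0.0


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged by each class's share of the data."""
    n = len(y_true)
    total = 0.0
    for cls in set(y_true):
        support = sum(1 for t in y_true if t == cls)
        total += recall_for_class(y_true, y_pred, cls) * support / n
    return total


# Illustrative labels only: 35 fraud rows out of 1,000.
y_true = [1] * 35 + [0] * 965
y_pred = [0] * 1000  # a degenerate model that never predicts fraud

fraud_recall = recall_for_class(y_true, y_pred, 1)
overall = weighted_recall(y_true, y_pred)

if __name__ == "__main__":
    print(f"fraud-class recall: {fraud_recall:.3f}")
    print(f"weighted recall:    {overall:.3f}")
```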
04 Built · Vanderbilt x Nissan

Nissan EV Charging Behavior

Only 2.5% of US Nissan Ariya drivers were using the MyNISSAN app to charge their vehicles. Our five-person Vanderbilt data science team engineered a Jump Rate anomaly detection metric on VIN-level charging data to identify when drivers were charging outside the app. We applied Tukey HSD statistical testing to validate behavioral differences across three user personas: consistent, inconsistent, and infrequent. We also built an interactive geospatial map using Geopy and Folium to visualize nationwide charging location patterns. Working in two-week agile sprints, we delivered segment-specific retention and marketing recommendations directly to the American CFO of Nissan.

2.5%
MyNISSAN app adoption rate
05 Concept

Nofi

Hospitals are legally required to publish their prices. Almost no one knows how to find them, compare them, or act on them. Nofi is a mobile-first interface that surfaces real procedure costs across nearby providers and lets users swipe to compare. The goal is to make price transparency actually usable rather than a PDF buried on a hospital website. The problem is structural. The solution is behavioral design.

swipe
to compare.
real hospital prices
06 Concept

Zoe

Over 100,000 people in the US are waiting for an organ transplant and 17 die every day. The bottleneck is not supply; it is trust and awareness. Zoe, from the Greek word for life, is a community-powered donor network that connects donors, recipients, and advocates through verified relationships rather than cold institutional registries. The design prioritizes trust signals and personal story over clinical form-filling.

ζωή
Greek for life
07 Built

Arbor

Student loan dashboards tell you how much you owe. They do not tell you how to feel about it. Arbor, named for the tree Zacchaeus climbed before his life changed, reframes debt repayment around moments of reckoning and recovery. Built with a real-time payoff planner comparing avalanche and snowball strategies, a PSLF tracker, and an HYSA arbitrage calculator that tells you when saving beats paying down debt. Designed to connect to StudentAid.gov and Plaid so every number reflects your actual situation.
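A simplified sketch of the two comparisons mentioned above, not Arbor's actual code. It assumes monthly compounding, no minimum payments (the whole budget goes to the priority loan first), and a plain rate comparison for the savings question. The loan figures, function names, and `tax_rate` parameter are all made up for illustration.

```python
def simulate_payoff(loans, monthly_budget, strategy):
    """loans: list of (balance, annual_rate). Returns (months, total_interest)."""
    if strategy not in ("avalanche", "snowball"):
        raise ValueError("strategy must be 'avalanche' or 'snowball'")
    balances = [float(b) for b, _ in loans]
    rates = [r for _, r in loans]
    months, total_interest = 0, 0.0
    while any(b > 0 for b in balances):
        months += 1
        if months > 1200:
            raise ValueError("budget too small to ever pay off these loans")
        # Accrue one month of interest on every open loan.
        for i, b in enumerate(balances):
            if b > 0:
                interest = b * rates[i] / 12
                balances[i] += interest
                total_interest += interest
        # Avalanche: highest rate first. Snowball: smallest balance first.
        open_loans = [i for i, b in enumerate(balances) if b > 0]
        if strategy == "avalanche":
            open_loans.sort(key=lambda i: -rates[i])
        else:
            open_loans.sort(key=lambda i: balances[i])
        remaining = monthly_budget
        for i in open_loans:
            payment = min(balances[i], remaining)
            balances[i] -= payment
            remaining -= payment
            if remaining <= 0:
                break
    return months, round(total_interest, 2)


def saving_beats_paydown(loan_rate, hysa_apy, tax_rate=0.0):
    """True when after-tax savings yield exceeds the loan's interest rate."""
    return hysa_apy * (1 - tax_rate) > loan_rate


# Hypothetical loans: (balance, annual interest rate).
loans = [(5000, 0.04), (12000, 0.07), (3000, 0.05)]
avalanche = simulate_payoff(loans, 600, "avalanche")
snowball = simulate_payoff(loans, 600, "snowball")

if __name__ == "__main__":
    print("avalanche (months, interest):", avalanche)
    print("snowball  (months, interest):", snowball)
    print("save instead of paying 4% loan?", saving_beats_paydown(0.04, 0.045))
```

Under these assumptions avalanche never pays more interest than snowball; snowball's appeal is clearing individual loans sooner, which is a behavioral benefit rather than a financial one.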

Arbor
Latin for tree
08 On pause

TARI

Women travel differently than platforms are built to serve them. TARI, from the ancient Egyptian word for rise, is a travel platform designed around how women research, plan, and share experiences on the road. On pause while other priorities take the front seat.

Visit travelwithtari.com
TARI
09 In progress

Keko

Most recycling programs fail because the feedback loop between action and reward is too long. Keko is a gamified cup recycling pilot launching in Nashville that closes that loop with bin deployment, neighborhood pickup, and punch card rewards. Target launch partners are The Well and 8th and Roast.

Keko

Experience

Where I have
worked

Feb 2025 to Present

HCA Healthcare

Nashville, TN

Associate Data Analyst

  • Designing and deploying 400+ Power Automate flows to replace an unstable Excel-to-Power BI pipeline, pulling Microsoft Forms data directly into Power BI for accurate, real-time reporting.
  • Coordinating UAT for Insights, a new analytics platform within HealthStream, managing 10 enterprise administrators to evaluate functionality, capture enhancement requests, and document gaps before full rollout.
  • Built a user activity and licensing cost dashboard in Power BI across a 356,000-user base, enabling finance and leadership to validate monthly vendor invoices and identify cost optimization opportunities.
  • Scoping an upcoming data integration project to pull and normalize content from the team SharePoint site for downstream reporting.
Power BI · Power Automate · Microsoft Forms · HealthStream · UAT · SharePoint

Jun 2024 to Dec 2024

HCA Healthcare

Nashville, TN

Business Solution Analyst I

  • Served as a business analysis consultant across three concurrent enterprise technology initiatives, bridging stakeholder requirements and technical delivery teams to define and validate solution requirements.
  • Supported enterprise readiness assessment for a Looker rollout by gathering cross-functional input, documenting adoption requirements, and aligning stakeholders ahead of deployment.
  • Led a telecom vendor cost analysis by extracting and reconciling historical invoices and usage data to evaluate whether a proposed internet service provider switch would reduce costs at enterprise scale.
  • Contributed to an internal idea management platform by documenting workflow requirements and submitting structured process documentation through GitHub to support development.
Business Analysis · Looker · Requirements Management · Vendor Analysis · GitHub

May 2022 to May 2024

Diana Health

Nashville, TN

Community Brand Ambassador

  • Built and maintained relationships with 50+ community organizations across Southeast Nashville to expand brand reach and drive awareness of women's healthcare services.
  • Analyzed community feedback data to identify an underrepresented demographic trend and used findings to inform outreach strategy and program design.
  • Planned and executed community events targeting identified service access gaps for underserved populations.
Community Outreach · Data Analysis · Maternal Health

Contact

Get in touch

mary.a.morkos@gmail.com · Nashville, TN

Email me · LinkedIn · GitHub
you found it.

Most people scroll past this. You did not.

That is exactly the kind of person I want to work with. Someone who notices the things others miss.

I am Mary. I build things that matter. If you are reading this, maybe we should talk.