A transparent analytics product path from data contracts to budget decisions.
The project is structured as a reusable Python package plus a Streamlit analyst interface, with clear boundaries between source validation, model diagnostics, evidence calibration, and commercial planning.
Current Workflow
Source Contracts
Weekly schema, connector templates, CSV validation, and source diagnostics.
Modeling Layer
Baseline econometrics, MMM transformations, holdouts, uncertainty, and Bayesian priors.
Evidence Layer
Lift-test uploads, quality scoring, approved-only calibration, and experiment-informed priors.
Planning Layer
Response curves, profit-aware scenarios, constrained optimization, and executive summaries.
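As a rough illustration of the Source Contracts step, the sketch below validates a weekly CSV export using only the standard library. The column names (`week_start`, `channel`, `spend`, `revenue`), the Monday-start rule, and the function name `validate_weekly_csv` are assumptions made for this example; the project's actual weekly schema and validation API may differ.

```python
import csv
import io
from datetime import date

# Hypothetical required columns; the real weekly schema may differ.
REQUIRED_COLUMNS = ("week_start", "channel", "spend", "revenue")


def validate_weekly_csv(text: str) -> list[str]:
    """Return human-readable problems found in a weekly CSV export."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Assumed rule: each row is keyed by an ISO date that falls on a Monday.
        try:
            week = date.fromisoformat(row["week_start"])
            if week.weekday() != 0:
                problems.append(f"line {line_no}: week_start is not a Monday")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: week_start is not an ISO date")

        # Numeric columns must parse as numbers and be non-negative.
        for column in ("spend", "revenue"):
            try:
                value = float(row[column])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {column} is not numeric")
                continue
            if value < 0:
                problems.append(f"line {line_no}: {column} is negative")
    return problems


sample = (
    "week_start,channel,spend,revenue\n"
    "2024-01-01,search,100.0,250.0\n"
    "2024-01-09,social,-5,80\n"
)
issues = validate_weekly_csv(sample)
```

Returning a list of problems instead of raising on the first failure lets an analyst interface show every issue in an upload at once.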
Reusable Code Structure
| Module | Responsibility |
|---|---|
| data/connectors.py | Connector templates and validation for commerce, analytics, paid media, CRM, affiliate, influencer, display, and external-control exports. |
| data/assembly.py | Connector-to-weekly assembly into the MMM-ready schema. |
| data/diagnostics.py | Source coverage and data-quality checks for assembled connector data. |
| analytics.py | Dashboard KPIs, channel summaries, promotion summaries, and readiness checks. |
| modeling.py | Baseline econometric model and holdout diagnostics. |
| mmm.py | Adstock, saturation, MMM foundation model, contribution, ROI, and response curves. |
| uncertainty.py and bayesian.py | Coefficient simulation, posterior intervals, priors, and predictive diagnostics. |
| calibration.py | Lift-test templates, evidence governance, and experiment calibration. |
| budget.py | Budget scenarios, gross-margin planning, and constrained allocation optimization. |
| governance.py | Recommendation readiness gates for model fit, profit impact, spend movement, history, and evidence. |
| reporting.py | Deterministic executive summaries, caveats, downloadable reports, and machine-readable run manifests. |
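
To make the adstock and saturation responsibilities listed for mmm.py more concrete, here is a minimal sketch of two transformations commonly used in MMM work: geometric adstock (carryover) and a Hill-type saturation curve. The source does not state which functional forms the package implements, so the function names, parameters, and formulas below are illustrative assumptions rather than the module's actual API.

```python
def geometric_adstock(spend: list[float], decay: float) -> list[float]:
    """Carry a share of each week's effect into later weeks.

    adstock[t] = spend[t] + decay * adstock[t - 1], with 0 <= decay < 1.
    """
    carried = 0.0
    adstocked = []
    for weekly_spend in spend:
        carried = weekly_spend + decay * carried
        adstocked.append(carried)
    return adstocked


def hill_saturation(x: float, half_saturation: float, shape: float = 1.0) -> float:
    """Map adstocked spend onto a 0-1 response with diminishing returns.

    Returns 0.5 when x equals half_saturation (which must be positive).
    """
    return x**shape / (x**shape + half_saturation**shape)


# Hypothetical weekly spend for one channel: a single burst, then silence.
weekly_spend = [100.0, 0.0, 0.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
response = [hill_saturation(value, half_saturation=50.0) for value in adstocked]
```

With these inputs the burst decays over the following weeks instead of vanishing, and the saturated response passes through 0.5 in the week where adstocked spend equals the assumed half-saturation point of 50.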
Deployment Shape
GitHub Pages
Static portfolio site served from the docs/ folder.
Streamlit Community Cloud
Interactive dashboard entrypoint at streamlit_app.py.
GitHub Actions
Automated Ruff linting and Pytest checks on push and pull request.
Production Roadmap
A production version would add authentication, role-based access, storage policy, audit logs, warehouse connectors, model tracking, dependency scanning, and signed release provenance before handling private company data.
Future Product Architecture
Warehouse Imports
BigQuery, Snowflake, Postgres, or governed CSV ingestion.
Feature Pipeline
dbt marts, Dagster orchestration, validation, and data versioning.
Model Jobs
Scheduled MMM runs, calibration records, model registry, and artifacts.
Web App and API
FastAPI, Next.js, RBAC, scenario exports, and audit logs.