Selected work

Analysis, modeling, and research code

Each project lists the question it addresses, the method, and the result.

513 contributions in the last year

Languages

Python 78%
TeX 12%
TypeScript 8%
JavaScript 1%
Stata <1%

Applied analysis & modeling

Data work aimed at a decision: choosing the right measure, building the analysis, and being clear about what it does and doesn't support — plus the search, ranking, and tool-calling code that goes with it.

01 Case study · Operational analysis

CitiBike Demand, Risk, and Net Flow

Station- and trip-level analysis of demand, net flow, and collision-adjusted risk across New York's bike-share network.

Problem: What can CitiBike trip data and NYPD crash records, together, tell an operator and an insurer?
Approach: Three analyses on 2023–2025 trips: demand and usage patterns; a per-trip crash-risk measure by station and time (NYPD crashes over trip exposure, empirical-Bayes smoothed); and a net-flow imbalance classifier for rebalancing.
Result: An interpretable per-trip risk measure an insurer could use as a rating input; demand patterns showing seasonality and per-station stagnation; and stable net-flow patterns that a classifier can predict to guide rebalancing.
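As a rough illustration of the smoothing step in the approach above, the sketch below shrinks each station's raw crashes-per-trip rate toward the pooled network rate, so that low-traffic stations are not flagged on a handful of events. It is a minimal sketch, not the project's code: the function name `eb_smoothed_rates`, the `prior_strength` parameter, and the toy numbers are assumptions, and the project may estimate its prior differently.

```python
def eb_smoothed_rates(crashes, trips, prior_strength=10_000.0):
    """Shrink per-station crash rates toward the pooled rate.

    crashes, trips: dicts keyed by station id (hypothetical inputs).
    prior_strength: assumed weight of the prior, expressed in trips.
    """
    total_trips = sum(trips.values())
    if total_trips <= 0:
        raise ValueError("need at least one trip to estimate a pooled rate")
    pooled_rate = sum(crashes.values()) / total_trips

    smoothed = {}
    for station, n_trips in trips.items():
        # Gamma-Poisson style posterior mean: pseudo-crashes and
        # pseudo-trips from the prior are added to the observed counts.
        numerator = crashes.get(station, 0) + prior_strength * pooled_rate
        smoothed[station] = numerator / (n_trips + prior_strength)
    return smoothed


# Toy data: station A has few trips, station B has many.
toy_crashes = {"A": 1, "B": 50}
toy_trips = {"A": 100, "B": 100_000}
for station, rate in eb_smoothed_rates(toy_crashes, toy_trips).items():
    raw = toy_crashes[station] / toy_trips[station]
    print(f"{station}: raw={raw:.6f} smoothed={rate:.6f}")
```

The low-exposure station moves most of the way to the pooled rate, while the high-exposure station barely changes, which is the behavior an exposure-adjusted measure needs.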

Python · pandas · feature engineering · risk analysis · prediction

Station-level crash risk (legend: lower to higher)

Each dot is a station; amber flags the highest exposure-adjusted per-trip risk.

02 GitHub project · Retrieval system

RAG Search Engine

A hybrid movie-search engine combining lexical and semantic retrieval, reranking, image search, and RAG answers.

Problem: Plain search fails when a query is vague or visual, or reads more like a description than a title.
Approach: Combined BM25, embeddings, reciprocal rank fusion, CLIP, reranking, and caching, with a debug trace through the pipeline.
Result: An evaluated pipeline: RRF search scored on precision, recall, and F1 against a golden set, with a debug mode that traces a query through each stage.
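To make the fusion and scoring steps concrete, here is a minimal sketch of reciprocal rank fusion followed by precision, recall, and F1 against a set of relevant ids. The function names, the toy rankings, and the golden set are assumptions for illustration, not the repository's code; `k=60` is the commonly used default constant for RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order.
    return sorted(scores, key=scores.get, reverse=True)


def precision_recall_f1(retrieved, relevant):
    """Score a retrieved list against a set of relevant ids."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: one lexical ranking and one semantic ranking (hypothetical ids).
lexical = ["a", "b", "c"]
semantic = ["b", "c", "a"]
fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
print(precision_recall_f1(fused[:2], {"a", "d"}))
```

Because RRF uses only ranks, it can combine BM25 scores and embedding similarities without putting them on a common scale.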

Python · RAG · BM25 · embeddings · CLIP · evaluation

Search pipeline

Hybrid retrieval, fused and reranked, then answered with citations.

03 GitHub project · AI tooling

Minimal Coding Agent

A small Gemini-powered coding agent that reads files, runs scripts, and edits code through explicit tools.

Problem: How much can a coding agent do with only a few explicit tools and no framework?
Approach: Implemented file inspection, script execution, and edits behind explicit, inspectable tool calls.
Result: A working agent that reads, runs, and edits code in a loop, with every tool call explicit.
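The explicit-tools idea can be sketched without any framework: a small registry of plain functions, a dispatcher that turns a tool call into a result string, and a loop that feeds results back. This is an assumed shape, not the project's code. The tool names, the call format (`{"name": ..., "args": ...}`), and `run_agent` are hypothetical, and `next_call` stands in for the model, so no Gemini API calls appear here.

```python
import subprocess
import sys
from pathlib import Path


def read_file(path: str) -> str:
    return Path(path).read_text()


def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} characters to {path}"


def run_script(path: str) -> str:
    # Run with the current interpreter and return combined output.
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr


# Every capability the agent has is listed here, so it is easy to inspect.
TOOLS = {"read_file": read_file, "write_file": write_file, "run_script": run_script}


def dispatch(call: dict) -> str:
    """Execute one tool call and always return a string for the model."""
    name = call.get("name")
    tool = TOOLS.get(name)
    if tool is None:
        return f"error: unknown tool {name!r}"
    try:
        return tool(**call.get("args", {}))
    except Exception as exc:  # report failures instead of crashing the loop
        return f"error: {type(exc).__name__}: {exc}"


def run_agent(next_call, max_steps=10):
    """Loop: ask for the next tool call, run it, record the result.

    next_call(history) is a placeholder for the model; it returns a call
    dict, or None when it is done.
    """
    history = []
    for _ in range(max_steps):
        call = next_call(history)
        if call is None:
            break
        history.append((call, dispatch(call)))
    return history
```

Returning errors as strings keeps the loop alive, so the caller driving `next_call` can see a failed call and try something else.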

Python · tool-calling · Gemini API · CLI

Agent run

An example run — the agent finds a failing test and fixes it.

04 In development · Python package

choicekit

A scikit-learn-compatible package for discrete choice modeling on wide-form data, in development.

Problem: Discrete-choice packages expect long-form data and their own fit loops, so they sit outside scikit-learn's cross-validation, pipelines, and tuning.
Approach: Estimators inherit from scikit-learn's BaseEstimator and read wide-form X, y — one row per choice situation, {alt}_{feature} columns — so they drop straight into GridSearchCV, cross-validation, and pipelines.
Result: Intended interface: a ConditionalLogitClassifier you tune with GridSearchCV like any sklearn model. Early-stage and not yet released.
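Since the package is unreleased, the sketch below does not use choicekit at all. It only illustrates the wide-form convention described above, one row per choice situation with `{alt}_{feature}` columns, and how conditional-logit choice probabilities follow from it. The function names, the column names, and the coefficients are assumptions for illustration, and the code assumes no alternative name is a prefix of another.

```python
import math


def split_wide_row(row, alternatives):
    """Group {alt}_{feature} columns into one feature dict per alternative."""
    by_alt = {alt: {} for alt in alternatives}
    for column, value in row.items():
        for alt in alternatives:
            prefix = alt + "_"
            if column.startswith(prefix):
                by_alt[alt][column[len(prefix):]] = value
                break
    return by_alt


def choice_probabilities(row, alternatives, coefs):
    """Conditional logit: softmax over linear utilities with shared coefficients."""
    by_alt = split_wide_row(row, alternatives)
    utilities = {
        alt: sum(coefs[feature] * x for feature, x in feats.items())
        for alt, feats in by_alt.items()
    }
    top = max(utilities.values())  # subtract the max for numerical stability
    weights = {alt: math.exp(u - top) for alt, u in utilities.items()}
    total = sum(weights.values())
    return {alt: w / total for alt, w in weights.items()}


# One choice situation in wide form (toy values).
row = {"bus_cost": 2.0, "bus_time": 30.0, "car_cost": 5.0, "car_time": 20.0}
coefs = {"cost": -0.5, "time": -0.1}  # hypothetical coefficients
print(choice_probabilities(row, ["bus", "car"], coefs))
```

Keeping one row per choice situation is what lets a standard cross-validation splitter work on rows without separating the alternatives of a single decision.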

Python · scikit-learn · discrete choice · conditional logit

sklearn-native usage

Intended interface: a choicekit estimator tuned with GridSearchCV on wide-form X, y — one row per choice situation.

Research & replication

Research code: published replication packages and methods work in economics, tested and reproducible.

01 Published · GEB 2024 · Replication package

Efficiency Wages with Motivated Agents

Replication data and code for a published paper on wages, motivation, and effort.

Problem: Do wage incentives and prosocial motivation reinforce each other, or work through separate channels?
Approach: Built reproducible analyses around experimental data and documented the paper's online appendix.
Result: A peer-reviewed empirical project with a complete replication package.

Python · Stata · experimental data · replication

Chosen effort by wage

Mean chosen effort by offered wage, with 95% confidence bands.

02 Manuscript + replication package · Inference & simulation

The Informativeness of Frequency-Report Scoring Rules

Manuscript, simulations, and replication code for belief elicitation from frequency reports.

Problem: A count report is observable; the belief behind it is not. How much does the report actually pin down?
Approach: Characterized the identified set of beliefs behind each report under three scoring rules, then checked the bounds with simulation.
Result: No rule dominates: squared-distance gives the sharpest bounds when beliefs concentrate on a few categories, frequency-guessing when they're spread out, and Manhattan distance rarely wins but holds up across regimes.
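For readers unfamiliar with the distance-based rules named above, here is a minimal sketch of the two distances as penalties on a count report. It assumes each rule compares the reported counts with the realized counts of a sample, which the text here does not spell out. The function names and numbers are illustrative, and the frequency-guessing rule is omitted because its definition is not given.

```python
def _check_lengths(report, realized):
    if len(report) != len(realized):
        raise ValueError("report and realized counts must cover the same categories")


def squared_distance(report, realized):
    """Sum of squared differences between reported and realized counts."""
    _check_lengths(report, realized)
    return sum((r - x) ** 2 for r, x in zip(report, realized))


def manhattan_distance(report, realized):
    """Sum of absolute differences between reported and realized counts."""
    _check_lengths(report, realized)
    return sum(abs(r - x) for r, x in zip(report, realized))


# Two reports scored against the same realized counts (toy numbers).
realized = [4, 4, 2]
for report in ([5, 3, 2], [7, 1, 2]):
    print(report, squared_distance(report, realized), manhattan_distance(report, realized))
```

The squared rule penalizes one large miss more heavily than several small ones, while the Manhattan rule weights every misplaced unit equally.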

Python · simulation · pytest · scoring rules

Sharpest-bound win share

Share of cases where each scoring rule gives the sharpest belief bounds, by belief concentration α.

03 GitHub project · Model comparison

Economic Theories and Machine Learning

Code comparing economic theories against machine-learning benchmarks on behavioral data.

Problem: When a theory predicts behavior, how much of the predictable variation does it actually capture?
Approach: Benchmarked theory-based specifications against machine-learning models and examined the remaining predictive gap.
Result: Theory specifications scored on out-of-sample prediction and compared against the machine-learning benchmark.
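One common way to express "how much of the predictable variation a theory captures" is a completeness ratio: the error reduction the theory achieves over a naive baseline, divided by the reduction the machine-learning benchmark achieves. The sketch below assumes that definition and uses made-up error values. The function name and arguments are hypothetical, and the repository may compute the measure differently.

```python
def completeness(naive_error, theory_error, benchmark_error):
    """Share of the achievable error reduction that the theory attains.

    All three arguments are out-of-sample prediction errors (lower is better):
    a naive baseline, the theory-based model, and the ML benchmark.
    """
    reducible = naive_error - benchmark_error
    if reducible <= 0:
        raise ValueError("benchmark must beat the naive baseline")
    return (naive_error - theory_error) / reducible


# Toy errors: the theory closes most of the gap to the benchmark.
print(round(completeness(naive_error=1.0, theory_error=0.6, benchmark_error=0.5), 3))
```

A value near 1 means the theory predicts about as well as the benchmark; a value near 0 means it does little better than the naive baseline.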

Python · machine learning · model evaluation

Completeness by model heterogeneity

Completeness climbs as the model captures more individual heterogeneity — a single type is far from enough, and even a handful of latent types don't exhaust it.

Also on GitHub

Smaller builds from learning new tools — a notes site, a static-site generator, and a few games and CLIs.

Personal Knowledge System · Static Site Generator · BookBot · Asteroids · Maze Solver