Projects
Free tier includes selected archive projects. Upgrade for monthly drops and solutions.
Log Anomaly Detector
Train an Isolation Forest on server logs to detect anomalous requests — unusual latency, error spikes, and suspicious traffic patterns.
Rate Limiter Service
Implement a thread-safe sliding-window rate limiter in pure Python. Pass all unit tests and benchmark to under 10µs average latency.
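A minimal sketch of one possible approach, assuming a single shared limit (no per-client keys) and a sliding log of timestamps. The class name `SlidingWindowLimiter` and the injectable `clock` parameter are illustrative and not part of the project spec.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window`-second span (sliding log)."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time (assumption)
        self._hits = deque()
        self._lock = threading.Lock()

    def allow(self):
        now = self.clock()
        with self._lock:
            # Drop timestamps that have slid out of the window.
            while self._hits and now - self._hits[0] >= self.window:
                self._hits.popleft()
            if len(self._hits) < self.limit:
                self._hits.append(now)
                return True
            return False
```

The lock guards both the pruning and the append, so concurrent callers cannot both claim the last free slot.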
Sales ETL Pipeline
Ingest three regional CSVs with inconsistent formats, normalize them to a canonical schema, deduplicate, load into SQLite, and generate a markdown report.
A/B Test Analyzer
Analyze an A/B experiment on a checkout flow redesign: run two-sample t-tests and chi-square tests for multiple metrics, compute confidence intervals, check for statistical significance, calculate minimum detectable effect, and identify Simpson's paradox in subgroup analysis.
API Load Tester & Reporter
Build an API load testing simulator that models concurrent user behavior, simulates response time distributions, detects performance degradation under load, and generates a detailed percentile report — without making real HTTP calls.
Customer Churn Predictor
Train a logistic regression classifier to predict customer churn using a telecom dataset, apply feature engineering, evaluate with precision/recall/AUC-ROC, and output the top 10 most predictive features.
Config Diff Tool
Compare two JSON configuration files, detect added/removed/changed keys at any nesting depth, handle type changes, and produce a structured diff report with human-readable output and an exit code of 0 (same) or 1 (different).
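The recursive comparison could look roughly like the sketch below. The function name `diff_config`, the tuple layout, and the dotted path format are assumptions made for illustration; lists are compared as whole values here.

```python
def diff_config(old, new, path=""):
    """Return a list of (kind, path, old_value, new_value) tuples."""
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else str(key)
            if key not in new:
                changes.append(("removed", child, old[key], None))
            elif key not in old:
                changes.append(("added", child, None, new[key]))
            else:
                # Key exists on both sides: descend one level.
                changes.extend(diff_config(old[key], new[key], child))
    elif type(old) is not type(new):
        changes.append(("type_changed", path, old, new))
    elif old != new:
        changes.append(("changed", path, old, new))
    return changes


def exit_code(changes):
    # 0 when the files match, 1 when any difference was found.
    return 1 if changes else 0
```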
CSV Data Cleaner
Load a messy CSV, detect and handle nulls, remove duplicates, fix column types, trim whitespace, and output a cleaned file with a summary report.
DB Migration Generator
Parse two SQLite schema files, diff the table definitions, and generate a safe SQL migration script (ALTER TABLE ADD COLUMN, CREATE TABLE, DROP TABLE, CREATE/DROP INDEX) to upgrade from v1 to v2.
Event Funnel Pipeline
Ingest raw clickstream events, sessionize by user (30-min gap = new session), compute funnel conversion rates across a 4-step user journey, and store the aggregated results in SQLite.
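A rough sketch of the sessionization step only, assuming events arrive as `(user_id, timestamp)` pairs. Treating a gap of exactly 30 minutes as the start of a new session is an assumption; the function name `sessionize` is illustrative.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)


def sessionize(events):
    """Group (user_id, timestamp) events into per-user lists of sessions."""
    sessions = {}
    for user, ts in sorted(events, key=lambda e: (e[0], e[1])):
        user_sessions = sessions.setdefault(user, [])
        # Extend the current session only if the gap is under 30 minutes.
        if user_sessions and ts - user_sessions[-1][-1] < SESSION_GAP:
            user_sessions[-1].append(ts)
        else:
            user_sessions.append([ts])
    return sessions
```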
Event Sourcing System
Build an event sourcing system with an append-only SQLite event store, projections that rebuild state from events, snapshotting for fast recovery, and optimistic concurrency control using event version numbers.
Feature Engineering Pipeline
Build a reusable feature engineering pipeline that handles missing values, encodes categoricals, scales numerics, creates interaction features, and outputs a train/test split ready for modeling.
Transaction Fraud Detector
Build a two-stage fraud detector: first apply deterministic rules (velocity checks, amount thresholds, known risky merchants), then train an Isolation Forest on residual transactions and flag anomalies, combining both signals into a final fraud score.
Student Grade Report
Load a grade CSV, compute each student's weighted final grade (homework 30%, midterm 30%, final 30%, participation 10%), assign letter grades, compute class statistics, and output a formatted markdown report.
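The weighting above can be sketched as follows. The weights come from the description; the letter-grade cutoffs (90/80/70/60) and the function names are assumptions, since the description does not specify them.

```python
WEIGHTS = {"homework": 0.30, "midterm": 0.30, "final": 0.30, "participation": 0.10}


def final_grade(scores):
    """Weighted sum of the four components, each scored 0-100."""
    return sum(scores[part] * weight for part, weight in WEIGHTS.items())


def letter(grade):
    # Cutoffs are hypothetical; substitute the project's own scale.
    for cutoff, mark in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if grade >= cutoff:
            return mark
    return "F"
```

For example, scores of 90, 80, 70, and 100 combine to 27 + 24 + 21 + 10 = 82.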
Social Graph Analytics
Analyze a social network graph: compute degree centrality, betweenness centrality, and PageRank; detect communities using the Girvan-Newman algorithm; find the shortest path between any two users; and identify the top influencers.
JSON Schema Validator
Validate a batch of JSON records against a custom schema definition that specifies required fields, types, constraints (min/max, pattern, enum), and nested objects — then report all violations per record.
Nginx Log Parser
Parse a raw Nginx access log, extract structured fields using regex, compute the top-10 endpoints, error rate, and P95/P99 latency per endpoint, and print a formatted summary report.
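A minimal sketch of the parsing and percentile steps. It assumes the log uses the combined format with a request time in seconds appended as the last field, which depends on how `log_format` is configured; the names `LINE_RE`, `parse_line`, and `percentile` are illustrative.

```python
import math
import re

# Assumed layout: combined log format plus a trailing request time in seconds.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "[^"]*" "[^"]*")?(?: (?P<latency>\d+\.\d+))?'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["latency"] = float(rec["latency"]) if rec["latency"] else None
    return rec


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```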
Distributed Cache Simulator
Implement a distributed cache simulator supporting LRU and LFU eviction policies, per-key TTL, a consistent hashing ring for key distribution across cache nodes, and cache statistics (hit rate, eviction count, memory usage).
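The consistent hashing ring is the least familiar piece here, so a small sketch may help. It assumes SHA-256 as the hash and 100 virtual nodes per cache node; both choices, along with the `HashRing` name, are illustrative.

```python
import bisect
import hashlib


def _hash(key):
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node owns `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]
```

Removing a node only remaps the keys that node owned; keys assigned to the surviving nodes stay where they were.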
ML Model Drift Monitor
Detect data drift and model performance degradation: compute Population Stability Index (PSI) for each feature, run Kolmogorov-Smirnov tests, identify the most drifted features, retrain a reference model and compare performance before/after drift, and generate a drift report with severity scores.
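As a rough illustration of the PSI step, the sketch below bins a reference sample into equal-width buckets and compares bucket proportions with a current sample. Equal-width binning, the `eps` floor for empty buckets, and the function name `psi` are assumptions; the project may expect quantile bins instead.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back if all reference values match

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge buckets.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty buckets do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical samples give a PSI of zero, and the value grows as the two distributions move apart.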
Password Strength Auditor
Audit a list of passwords against NIST SP 800-63B guidance plus additional heuristics: check minimum length, character complexity, detect common passwords, identify repeated patterns, and output a scored report with per-password recommendations.
Collaborative Filtering Engine
Implement user-based collaborative filtering from scratch using numpy: build a user-item matrix, compute cosine similarity between users, predict ratings for unseen items, and evaluate with RMSE on a held-out test split.
In-Memory Search Engine
Build an inverted index over a document corpus, implement TF-IDF scoring, support multi-term queries with AND/OR boolean logic, and rank results by relevance score.
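One way to sketch the index and scoring, assuming whitespace tokenization, lowercase query terms, raw term counts for TF, and `log(N / df)` for IDF. These choices and the names `build_index` and `search` are illustrative.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term_count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, terms, mode="AND"):
    """Return (doc_id, score) pairs ranked by summed TF-IDF."""
    postings = [set(index.get(t, {})) for t in terms]
    if not postings:
        return []
    if mode == "AND":
        matched = set.intersection(*postings)
    else:
        matched = set.union(*postings)
    scores = {}
    for doc_id in matched:
        scores[doc_id] = sum(
            index[t][doc_id] * math.log(n_docs / len(index[t]))
            for t in terms
            if doc_id in index.get(t, {})
        )
    # Highest score first; ties broken by doc_id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```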
Sentiment Classifier
Train a Naive Bayes text classifier on labeled reviews using TF-IDF features, evaluate on a held-out test set, and predict the sentiment of 5 new example sentences.
SQL Query Planner
Parse a subset of SQL (SELECT with WHERE, JOIN, GROUP BY, ORDER BY), build a logical query plan as a tree, apply optimization rules (predicate pushdown, index selection, join reordering), and estimate query cost using table statistics.
Stream Window Aggregator
Process an ordered stream of timestamped events and compute tumbling window aggregations and sliding window aggregations for metrics like count, sum, and average — as if processing a real-time stream.
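The two window types can be sketched as below, assuming events are `(timestamp, value)` pairs with non-negative integer timestamps in seconds and that windows are aligned to multiples of the window size or step. Function names are illustrative.

```python
from collections import defaultdict


def _summarize(buckets):
    return {
        start: {"count": len(v), "sum": sum(v), "avg": sum(v) / len(v)}
        for start, v in sorted(buckets.items())
    }


def tumbling(events, size):
    """Fixed, non-overlapping windows: each event lands in exactly one."""
    buckets = defaultdict(list)
    for ts, value in events:
        buckets[ts // size * size].append(value)
    return _summarize(buckets)


def sliding(events, size, step):
    """Windows of length `size` starting every `step` seconds (may overlap)."""
    buckets = defaultdict(list)
    for ts, value in events:
        start = ts // step * step
        # Walk back over every window [start, start + size) containing ts.
        while start > ts - size and start >= 0:
            buckets[start].append(value)
            start -= step
    return _summarize(buckets)
```

With `step` equal to `size`, the sliding version reduces to the tumbling one.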
Task DAG Runner
Implement a DAG-based task runner that accepts tasks with dependencies, validates the graph for cycles, computes a topological execution order, runs tasks sequentially respecting dependencies, and reports timing and status for each task.
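Cycle validation and ordering can be handled together with Kahn's algorithm, sketched here. The input shape (a dict mapping each task to the set of tasks it depends on) and the name `topo_order` are assumptions.

```python
from collections import deque


def topo_order(deps):
    """deps maps task -> set of prerequisite tasks. Raises ValueError on a cycle."""
    tasks = set(deps) | {d for ds in deps.values() for d in ds}
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    dependents = {t: [] for t in tasks}
    for task, ds in remaining.items():
        for d in ds:
            dependents[d].append(task)

    # Start with tasks that have no prerequisites; sort for stable output.
    ready = deque(sorted(t for t, ds in remaining.items() if not ds))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in sorted(dependents[task]):
            remaining[nxt].discard(task)
            if not remaining[nxt]:
                ready.append(nxt)

    # Any task never released must sit on a cycle.
    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order
```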
Time Series Forecaster
Implement simple moving average (SMA) and exponential smoothing (EWM) forecasters from scratch using pandas/numpy, evaluate with MAE and RMSE on a held-out last-6-months test set, and forecast 3 months ahead for each SKU.
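A dependency-free sketch of the two forecasters and the error metrics, using plain lists in place of the pandas/numpy structures the project calls for. Both forecasters produce a flat forecast over the horizon; the default `window` and `alpha` values are assumptions.

```python
import math


def sma_forecast(series, window=3, horizon=3):
    """Flat forecast equal to the mean of the last `window` observations."""
    recent = series[-window:]
    level = sum(recent) / len(recent)
    return [level] * horizon


def ses_forecast(series, alpha=0.5, horizon=3):
    """Simple exponential smoothing; the final level is the forecast."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```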
URL Shortener
Build an in-memory URL shortener that generates short codes, resolves them back to original URLs, tracks hit counts per short URL, and expires links after a configurable TTL.
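A minimal sketch of one possible design, assuming short codes are base-62 encodings of a counter and that expired links are removed lazily on lookup. The `Shortener` class, its method names, and the injectable `clock` are illustrative.

```python
import itertools
import string
import time

ALPHABET = string.digits + string.ascii_letters  # 62 symbols


def encode(n):
    """Base-62 encode a non-negative integer."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))


class Shortener:
    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so tests can control time (assumption)
        self._ids = itertools.count(1)
        self._links = {}  # code -> [url, expires_at, hits]

    def shorten(self, url):
        code = encode(next(self._ids))
        self._links[code] = [url, self.clock() + self.ttl, 0]
        return code

    def resolve(self, code):
        entry = self._links.get(code)
        if entry is None:
            return None
        if self.clock() >= entry[1]:
            del self._links[code]  # lazy expiry on access
            return None
        entry[2] += 1
        return entry[0]

    def hits(self, code):
        entry = self._links.get(code)
        return entry[2] if entry else 0
```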