Projects

Free tier includes selected archive projects. Upgrade for monthly drops and solutions.

ML / AI
New, Intermediate

Log Anomaly Detector

Train an Isolation Forest on server logs to detect anomalous requests — unusual latency, error spikes, and suspicious traffic patterns.

Python, scikit-learn, pandas
🔒 Pro only
Systems
New, Intermediate

Rate Limiter Service

Implement a thread-safe sliding-window rate limiter in pure Python. Pass all unit tests and keep average latency under 10 µs in the benchmark.

Python, stdlib only
🔒 Pro only
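As a rough illustration of the sliding-window idea, here is a minimal sketch. The class name, the `allow()` method, and the injectable `clock` parameter are assumptions made for this example, not the project's required interface, and no latency claim is made for it.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds (hypothetical API)."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock            # injectable so tests can control time
        self._hits = deque()          # timestamps of accepted calls, oldest first
        self._lock = threading.Lock()

    def allow(self):
        with self._lock:
            now = self.clock()
            # Drop timestamps that have slid out of the window.
            while self._hits and now - self._hits[0] >= self.window:
                self._hits.popleft()
            if len(self._hits) < self.limit:
                self._hits.append(now)
                return True
            return False
```

A single lock around the deque keeps the check-and-append step atomic. Whether that is fast enough for the benchmark target is left to measurement.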
Data Engineering
New, Intermediate

Sales ETL Pipeline

Ingest three region CSVs with inconsistent formats, normalize them to a canonical schema, deduplicate, load to SQLite, and generate a markdown report.

Python, pandas, sqlite3
🔒 Pro only
Data Analysis
Advanced

A/B Test Analyzer

Analyze an A/B experiment on a checkout flow redesign: run two-sample t-tests and chi-square tests for multiple metrics, compute confidence intervals, check for statistical significance, calculate minimum detectable effect, and identify Simpson's paradox in subgroup analysis.

Python, scipy, pandas
🔒 Pro only
Systems
Intermediate

API Load Tester & Reporter

Build an API load testing simulator that models concurrent user behavior, simulates response time distributions, detects performance degradation under load, and generates a detailed percentile report — without making real HTTP calls.

Python, stdlib, statistics
🔒 Pro only
ML / AI
Intermediate

Customer Churn Predictor

Train a logistic regression classifier to predict customer churn using a telecom dataset, apply feature engineering, evaluate with precision/recall/AUC-ROC, and output the top 10 most predictive features.

Python, scikit-learn, pandas
🔒 Pro only
Systems
Beginner

Config Diff Tool

Compare two JSON configuration files, detect added/removed/changed keys at any nesting depth, handle type changes, and produce a structured diff report with human-readable output and an exit code of 0 (same) or 1 (different).

Python, stdlib
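One possible shape for the recursive comparison is sketched below. The function name `diff`, the tuple layout, and the dotted path format are assumptions for illustration. Lists are compared as whole values here, which a full solution may want to refine.

```python
def diff(old, new, path=""):
    """Yield (kind, path, old_value, new_value) for two parsed JSON values."""
    if isinstance(old, dict) and isinstance(new, dict):
        # JSON object keys are strings, so sorting them is safe.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else key
            if key not in new:
                yield ("removed", child, old[key], None)
            elif key not in old:
                yield ("added", child, None, new[key])
            else:
                yield from diff(old[key], new[key], child)
    elif type(old) is not type(new):
        # Checked before equality so that True vs 1 counts as a type change.
        yield ("type_changed", path, old, new)
    elif old != new:
        yield ("changed", path, old, new)


old = {"db": {"host": "a", "port": 5432}, "debug": True}
new = {"db": {"host": "b", "port": "5432"}, "verbose": False}
changes = list(diff(old, new))
```

A command-line wrapper could then call `sys.exit(1 if changes else 0)` to produce the exit code described above.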
Data Engineering
Beginner

CSV Data Cleaner

Load a messy CSV, detect and handle nulls, remove duplicates, fix column types, trim whitespace, and output a cleaned file with a summary report.

Python, pandas
🔒 Pro only
Data Engineering
Intermediate

DB Migration Generator

Parse two SQLite schema files, diff the table definitions, and generate a safe SQL migration script (ALTER TABLE ADD COLUMN, CREATE TABLE, DROP TABLE, CREATE/DROP INDEX) to upgrade from v1 to v2.

Python, sqlite3
🔒 Pro only
Data Engineering
Intermediate

Event Funnel Pipeline

Ingest raw clickstream events, sessionize by user (30-min gap = new session), compute funnel conversion rates across a 4-step user journey, and store the aggregated results in SQLite.

Python, pandas, sqlite3
🔒 Pro only
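The 30-minute sessionization rule can be sketched in plain Python before moving it to pandas. The function name and the `(user_id, timestamp)` input shape are assumptions, as is treating a gap of exactly 30 minutes as the start of a new session.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)


def sessionize(events):
    """Number each user's sessions for (user_id, timestamp) pairs.

    Returns (user_id, timestamp, session_no) tuples sorted by user, then time.
    """
    out = []
    last_seen = {}    # user_id -> timestamp of that user's previous event
    session_no = {}   # user_id -> current session counter
    for user, ts in sorted(events):
        prev = last_seen.get(user)
        # Assumption: a gap of 30 minutes or more opens a new session.
        if prev is None or ts - prev >= SESSION_GAP:
            session_no[user] = session_no.get(user, 0) + 1
        last_seen[user] = ts
        out.append((user, ts, session_no[user]))
    return out
```

In pandas the same idea is usually expressed as a per-user timestamp difference compared against the gap, followed by a cumulative sum.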
Systems
Advanced

Event Sourcing System

Build an event sourcing system with an append-only SQLite event store, projections that rebuild state from events, snapshotting for fast recovery, and optimistic concurrency control using event version numbers.

Python, sqlite3, json
🔒 Pro only
ML / AI
Intermediate

Feature Engineering Pipeline

Build a reusable feature engineering pipeline that handles missing values, encodes categoricals, scales numerics, creates interaction features, and outputs a train/test split ready for modeling.

Python, scikit-learn, pandas
🔒 Pro only
ML / AI
Intermediate

Transaction Fraud Detector

Build a two-stage fraud detector: first apply deterministic rules (velocity checks, amount thresholds, known risky merchants), then train an Isolation Forest on residual transactions and flag anomalies, combining both signals into a final fraud score.

Python, scikit-learn, pandas
Data Analysis
Beginner

Student Grade Report

Load a grade CSV, compute each student's weighted final grade (homework 30%, midterm 30%, final 30%, participation 10%), assign letter grades, compute class statistics, and output a formatted markdown report.

Python, pandas
🔒 Pro only
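The weighting above works out as in this small sketch. The weights come from the description. The dictionary keys, function names, and letter-grade cutoffs are assumptions chosen for illustration.

```python
# Weights in percent, taken from the project description.
WEIGHTS = {"homework": 30, "midterm": 30, "final": 30, "participation": 10}


def weighted_grade(scores):
    """Combine component scores (0-100) into one final grade (0-100)."""
    return sum(scores[name] * pct for name, pct in WEIGHTS.items()) / 100


def letter(grade):
    """Map a numeric grade to a letter (cutoffs are an assumed 90/80/70/60 scale)."""
    for cutoff, mark in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if grade >= cutoff:
            return mark
    return "F"


student = {"homework": 90, "midterm": 80, "final": 70, "participation": 100}
final_grade = weighted_grade(student)   # 27 + 24 + 21 + 10 = 82.0
```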
ML / AI
Advanced

Social Graph Analytics

Analyze a social network graph: compute degree centrality, betweenness centrality, and PageRank; detect communities using the Girvan-Newman algorithm; find the shortest path between any two users; and identify the top influencers.

Python, networkx, pandas
🔒 Pro only
Systems
Beginner

JSON Schema Validator

Validate a batch of JSON records against a custom schema definition that specifies required fields, types, constraints (min/max, pattern, enum), and nested objects — then report all violations per record.

Python, stdlib
Data Engineering
Beginner

Nginx Log Parser

Parse a raw Nginx access log, extract structured fields using regex, compute the top-10 endpoints, error rate, and P95/P99 latency per endpoint, and print a formatted summary report.

Python, re
🔒 Pro only
Systems
Advanced

Distributed Cache Simulator

Implement a distributed cache simulator supporting LRU and LFU eviction policies, per-key TTL, a consistent hashing ring for key distribution across cache nodes, and cache statistics (hit rate, eviction count, memory usage).

Python, stdlib, collections
🔒 Pro only
ML / AI
Advanced

ML Model Drift Monitor

Detect data drift and model performance degradation: compute Population Stability Index (PSI) for each feature, run Kolmogorov-Smirnov tests, identify the most drifted features, retrain a reference model and compare performance before/after drift, and generate a drift report with severity scores.

Python, scikit-learn, pandas, scipy
🔒 Pro only
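For orientation, PSI is commonly defined over binned proportions as the sum of (actual - expected) * ln(actual / expected). The sketch below assumes the feature has already been binned into matching proportion lists. The function name and the `eps` floor for empty bins are assumptions for this example.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population Stability Index for two lists of bin proportions."""
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins so the logarithm stays defined (assumed convention).
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


baseline = [0.5, 0.5]
shifted = [0.6, 0.4]
score = psi(baseline, shifted)   # about 0.0405
```

Identical distributions give a PSI of 0, and the value grows as the two distributions diverge.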
Security
Beginner

Password Strength Auditor

Audit a list of passwords against NIST SP 800-63B rules: check minimum length, character complexity, detect common passwords, identify repeated patterns, and output a scored report with per-password recommendations.

Python, re, stdlib
🔒 Pro only
ML / AI
Advanced

Collaborative Filtering Engine

Implement user-based collaborative filtering from scratch using numpy: build a user-item matrix, compute cosine similarity between users, predict ratings for unseen items, and evaluate with RMSE on a held-out test split.

Python, numpy, pandas
🔒 Pro only
Systems
Intermediate

In-Memory Search Engine

Build an inverted index over a document corpus, implement TF-IDF scoring, support multi-term queries with AND/OR boolean logic, and rank results by relevance score.

Python, stdlib, math
🔒 Pro only
ML / AI
Beginner

Sentiment Classifier

Train a Naive Bayes text classifier on labeled reviews using TF-IDF features, evaluate on a held-out test set, and predict the sentiment of 5 new example sentences.

Python, scikit-learn
🔒 Pro only
Systems
Advanced

SQL Query Planner

Parse a subset of SQL (SELECT with WHERE, JOIN, GROUP BY, ORDER BY), build a logical query plan as a tree, apply optimization rules (predicate pushdown, index selection, join reordering), and estimate query cost using table statistics.

Python, stdlib, re
🔒 Pro only
Data Engineering
Advanced

Stream Window Aggregator

Process an ordered stream of timestamped events and compute tumbling-window and sliding-window aggregations (count, sum, and average), as if processing a real-time stream.

Python, stdlib, collections
🔒 Pro only
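A tumbling window assigns each event to exactly one fixed-size, non-overlapping bucket. The sketch below covers only that case and assumes integer timestamps in seconds and `(timestamp, value)` pairs. The function name and output layout are illustrative, and it collects the whole input rather than emitting windows incrementally.

```python
from collections import defaultdict


def tumbling(events, size):
    """Aggregate (timestamp, value) pairs into windows [start, start + size)."""
    buckets = defaultdict(list)
    for ts, value in events:
        start = ts - ts % size   # align the timestamp to its window start
        buckets[start].append(value)
    return {
        start: {"count": len(vals), "sum": sum(vals), "avg": sum(vals) / len(vals)}
        for start, vals in sorted(buckets.items())
    }


events = [(0, 1), (5, 3), (10, 2), (19, 4), (25, 6)]
windows = tumbling(events, size=10)
```

A sliding window differs in that one event can fall into several overlapping windows, so it needs a buffer of recent events rather than a single bucket per event.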
Systems
Intermediate

Task DAG Runner

Implement a DAG-based task runner that accepts tasks with dependencies, validates the graph for cycles, computes a topological execution order, runs tasks sequentially respecting dependencies, and reports timing and status for each task.

Python, stdlib
🔒 Pro only
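Cycle validation and ordering can be handled together with Kahn's algorithm, sketched below. The function name and the `task -> set of prerequisites` input shape are assumptions. Task execution, timing, and status reporting are left out.

```python
from collections import deque


def topo_order(deps):
    """Return an execution order for `deps` (task -> set of prerequisite tasks).

    Raises ValueError if the graph contains a cycle.
    """
    tasks = set(deps) | {p for prereqs in deps.values() for p in prereqs}
    remaining = {t: len(deps.get(t, ())) for t in tasks}   # unmet prerequisites
    dependents = {t: [] for t in tasks}
    for task, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(task)

    # Start with tasks that have no prerequisites; sorting keeps output stable.
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in sorted(dependents[task]):
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    # Any task never released must sit on a cycle.
    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


deps = {"build": {"fetch"}, "test": {"build"}, "deploy": {"build", "test"}}
order = topo_order(deps)
```

The standard library's `graphlib.TopologicalSorter` (Python 3.9+) offers the same ordering and cycle detection if writing it by hand is not the point of the exercise.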
ML / AI
Intermediate

Time Series Forecaster

Implement simple moving average (SMA) and exponential smoothing (EWM) forecasters from scratch using pandas/numpy, evaluate with MAE and RMSE on a held-out last-6-months test set, and forecast 3 months ahead for each SKU.

Python, pandas, numpy
🔒 Pro only
Systems
Beginner

URL Shortener

Build an in-memory URL shortener that generates short codes, resolves them back to original URLs, tracks hit counts per short URL, and expires links after a configurable TTL.

Python, stdlib