Problem
Prices fluctuate constantly, and many "deals" are noise from coupons, bundles, or currency rounding. We need to catch meaningful drops quickly, filter out spammy changes, and avoid duplicate alerts for the same product.
Approach
- Schedule crawls across multiple retailers with resilient parsers and anti-bot handling.
- Normalise product data (SKUs, currencies, units) and deduplicate by canonical IDs (a small sketch follows this list).
- Engineer features (rolling median, % change, z-scores) to stabilise noisy prices.
- Detect significant changes and stream verified alerts to a live dashboard.
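For illustration, a minimal sketch of the normalise-and-deduplicate step. RawListing, the static FX table, and the title-based canonical_id are assumptions for the example, not the project's actual schema; real matching would more likely key on GTIN/EAN.
# Sketch: normalise listings to a common unit price and dedupe by canonical ID (assumed schema)
from dataclasses import dataclass

@dataclass
class RawListing:            # hypothetical output of a retailer parser
    retailer: str
    sku: str
    title: str
    price: float
    currency: str            # e.g. "GBP", "USD"
    pack_units: int          # e.g. 6 for a six-pack

FX_TO_EUR = {"EUR": 1.0, "GBP": 1.17, "USD": 0.92}   # illustrative static rates

def canonical_id(listing: RawListing) -> str:
    # Illustrative key; a real pipeline would prefer GTIN/EAN or a learned matcher.
    return f"{listing.title.lower().strip()}::{listing.pack_units}"

def normalise(listing: RawListing) -> dict:
    unit_price = listing.price * FX_TO_EUR[listing.currency] / listing.pack_units
    return {"id": canonical_id(listing), "retailer": listing.retailer,
            "sku": listing.sku, "unit_price_eur": round(unit_price, 4)}

def deduplicate(listings: list[RawListing]) -> dict[str, dict]:
    # Keep the cheapest offer per canonical product across retailers.
    best: dict[str, dict] = {}
    for raw in listings:
        row = normalise(raw)
        if row["id"] not in best or row["unit_price_eur"] < best[row["id"]]["unit_price_eur"]:
            best[row["id"]] = row
    return best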
Architecture
Flow: Crawler → Parser & Normaliser → Change Detector → Event Bus → Web API → WebSockets → UI
Components are decoupled, so I can swap the detector or queue without touching the UI.
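One way to keep those seams loose is to code against small interfaces rather than concrete services. A minimal sketch with Python Protocols; the Detector and EventBus names and the "price-drops" topic are illustrative, not the actual component boundaries.
# Sketch: swappable detector/queue behind small interfaces (illustrative names)
from typing import Iterable, Protocol

class Detector(Protocol):
    def detect(self, product_id: str, prices: list[float]) -> bool:
        ...   # True when the latest price is a significant drop

class EventBus(Protocol):
    def publish(self, topic: str, event: dict) -> None:
        ...

def run_detection(detector: Detector, bus: EventBus,
                  price_series: Iterable[tuple[str, list[float]]]) -> None:
    # The pipeline only sees the interfaces, so a z-score detector can be swapped
    # for CUSUM, or one queue for another, without the UI or API noticing.
    for product_id, prices in price_series:
        if detector.detect(product_id, prices):
            bus.publish("price-drops", {"product": product_id, "price": prices[-1]})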
Change Detection
I used robust statistical signals (rolling z-scores, minimum drop thresholds) with per-category tuning. Rolling windows smooth transient spikes; a simple cool-down prevents alert flurries.
# Pseudocode: price-change detection (Python / pandas)
X = build_features(prices, windows=[3, 7, 14])  # rolling medians, pct_change, z-scores
delta = X["price_pct_change"]
signals = (X["z_score"].abs() > per_category_threshold) & (delta < -min_drop)
alerts = products[signals]  # keep only the significant negative shifts
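For reference, a rough pandas sketch of what build_features could look like. The column names, min_periods choices, and z-scoring against the longest window are assumptions for the example, not the exact implementation.
# Sketch: rolling features for one product's price history (assumed daily series)
import pandas as pd

def build_features(prices: pd.Series, windows: list[int]) -> pd.DataFrame:
    X = pd.DataFrame({"price": prices})
    X["price_pct_change"] = prices.pct_change()
    for w in windows:
        X[f"median_{w}"] = prices.rolling(w, min_periods=1).median()
    w = max(windows)                                   # z-score against the longest window
    mu = prices.rolling(w, min_periods=2).mean()
    sigma = prices.rolling(w, min_periods=2).std()
    X["z_score"] = (prices - mu) / sigma
    return X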
Frontend
- React UI with compact cards: product tiles, sparkline trends, and an alert drawer.
- WebSocket push keeps the UI under ~150ms behind the event stream on my machine (a minimal push endpoint is sketched after this list).
- Keyboard shortcuts to jump through recent alerts and filter by category/retailer.
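To illustrate the push path, a minimal server-side WebSocket endpoint. FastAPI and the alert_queue fed by an event-bus consumer are assumptions for the sketch; the writeup only specifies Web API → WebSockets → UI.
# Sketch: push verified alerts to a connected UI (FastAPI assumed; single client for brevity)
import asyncio
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
alert_queue = asyncio.Queue()        # hypothetical: filled by the event-bus consumer

@app.websocket("/ws/alerts")
async def stream_alerts(ws: WebSocket) -> None:
    await ws.accept()
    try:
        while True:
            alert = await alert_queue.get()   # block until the next verified alert
            await ws.send_json(alert)         # push straight to the UI, no polling
    except WebSocketDisconnect:
        pass                                  # client closed the connection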
Results
- Precision: 95–97% on labelled price-change events across 12 retailers.
- Latency: ~120ms median crawl → UI (local), <250ms p95.
- Noise handling: Normalised features reduced false positives by ~42% vs raw diffs.
What's next
- Adaptive, time-of-day thresholds and seasonal baselines for retail cycles.
- Multivariate models (e.g., Bayesian change-point, CUSUM) to fuse coupon and stock signals (a textbook CUSUM sketch follows this list).
- Playbooks for auto-actions (watchlists, notify subscribers, export to spreadsheets).
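For context, a textbook sketch of the one-sided CUSUM drop detector mentioned above; it is not code from the project, and drift and threshold are illustrative parameters.
# Sketch: one-sided (lower) CUSUM over standardised prices; flags sustained drops
def cusum_drops(prices: list[float], drift: float = 0.5, threshold: float = 4.0) -> list[int]:
    if len(prices) < 2:
        return []
    mean = sum(prices) / len(prices)
    std = (sum((p - mean) ** 2 for p in prices) / len(prices)) ** 0.5 or 1.0
    s_low, alarms = 0.0, []
    for i, p in enumerate(prices):
        z = (p - mean) / std                  # standardise against the whole series for simplicity
        s_low = min(0.0, s_low + z + drift)   # accumulate only downward deviations
        if s_low < -threshold:
            alarms.append(i)                  # sustained drop detected at index i
            s_low = 0.0                       # reset after alarming
    return alarms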
See the repo