
Real-Time Search Ranking System

Clickstream-Driven Personalization with Distributed Event Pipeline and Caching Layer

PROJECT OVERVIEW

A real-time search ranking system that tracks user click behavior, streams events through Kafka, and re-ranks search results dynamically using a scoring formula that combines click-through rate and recency decay. Results are cached in Redis for low-latency delivery and persisted in PostgreSQL for ranking history.

PROBLEM

Search ranking based purely on click count lets popular but stale content stay on top indefinitely. Updating rankings synchronously on every click also puts a database write on the request path, which multiplies write load as traffic grows. This system decouples click ingestion from ranking computation with an asynchronous event pipeline, and applies recency decay to keep results fresh.

CORE GOALS

  • Decouple click tracking from ranking computation using Kafka
  • Apply a recency-weighted scoring formula to prevent stale content domination
  • Cache hot query results in Redis for low-latency delivery
  • Persist ranking history in PostgreSQL with atomic upsert operations
  • Containerize the full infrastructure stack for consistent deployment

EVENT PIPELINE

  • Next.js frontend captures click events and sends them to a REST API
  • API route produces click events to a Kafka topic (click-events) asynchronously
  • Kafka consumer reads events and updates PostgreSQL rankings without blocking the request cycle
  • Redis cache is updated after every ranking change with the latest scored result set
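
The pipeline above depends on a click event that the API route can serialize and the consumer can parse. The following is a minimal sketch of such a contract, not the project's actual code: the ClickEvent fields, the encodeClickEvent and decodeClickEvent helpers, and the choice of the query string as message key are all assumptions made for illustration.

```typescript
// Hypothetical event shape; field names are assumptions, not from the project.
interface ClickEvent {
  resultId: string;
  query: string;
  clickedAt: string; // ISO 8601 timestamp
}

// Key/value pair in the shape KafkaJS accepts in producer.send({ topic, messages }).
interface OutgoingMessage {
  key: string;
  value: string;
}

// Serialize a click for the producer. Keying by query (an assumption) routes
// all clicks for one query to the same partition, so they are consumed in order.
function encodeClickEvent(event: ClickEvent): OutgoingMessage {
  return { key: event.query, value: JSON.stringify(event) };
}

// Parse a consumed message value. Returns null for empty or malformed payloads
// so the consumer can skip them instead of crashing.
function decodeClickEvent(value: string | null): ClickEvent | null {
  if (value === null) return null;
  try {
    const parsed = JSON.parse(value);
    if (
      typeof parsed?.resultId === "string" &&
      typeof parsed?.query === "string" &&
      typeof parsed?.clickedAt === "string"
    ) {
      return {
        resultId: parsed.resultId,
        query: parsed.query,
        clickedAt: parsed.clickedAt,
      };
    }
    return null;
  } catch {
    return null;
  }
}
```

With KafkaJS, the API route would pass encodeClickEvent(event) in the messages array of producer.send for the click-events topic, and the consumer would call decodeClickEvent(message.value?.toString() ?? null) inside its eachMessage handler, since KafkaJS delivers the value as a Buffer or null.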

RANKING LOGIC

  • Score combines click count and recency: score = click_count * (1 / (days_since_last_click + 1))
  • At equal click counts, a result clicked more recently scores higher than one whose last click is older
  • Rankings stored using PostgreSQL upsert with conflict resolution on (result_id, query)
  • Score recalculated atomically on every click event
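
To make the decay concrete, here is a small sketch of the scoring formula with a worked comparison. The function name rankingScore and the choice to count whole elapsed days between the last click and the evaluation time are assumptions; the source only gives the formula itself.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// score = click_count * (1 / (days_since_last_click + 1)),
// written as a single division, which is mathematically equivalent.
// Assumption: days are whole days elapsed between the last click and `at`.
function rankingScore(clickCount: number, lastClickAt: Date, at: Date): number {
  const elapsedMs = Math.max(0, at.getTime() - lastClickAt.getTime());
  const daysSinceLastClick = Math.floor(elapsedMs / MS_PER_DAY);
  return clickCount / (daysSinceLastClick + 1);
}

const evaluatedAt = new Date("2024-06-10T12:00:00Z");

// 100 clicks, last click 9 days earlier: 100 / (9 + 1) = 10
const staleScore = rankingScore(100, new Date("2024-06-01T12:00:00Z"), evaluatedAt);

// 20 clicks, last click the same day: 20 / (0 + 1) = 20
const freshScore = rankingScore(20, new Date("2024-06-10T09:00:00Z"), evaluatedAt);

console.log(staleScore, freshScore); // 10 20
```

In this example the result with five times fewer clicks ranks higher because its last click is recent, which is the behavior the recency term is meant to produce.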

CACHING LAYER

  • Redis sits in front of PostgreSQL for all search result reads
  • The query string is the cache key; the value is the full ranked result list serialized as JSON
  • Cache updated automatically by the consumer after every ranking change
  • On a cache miss, the read falls back to PostgreSQL and returns the result from there
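
The read path described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's implementation: RankedResult, CacheReader, RankingLoader, and readRankedResults are hypothetical names, and the cache and database are passed in as small interfaces so the logic can be exercised without live services.

```typescript
// Hypothetical shape of one ranked result; field names are assumptions.
interface RankedResult {
  resultId: string;
  score: number;
}

// Minimal read interface. The ioredis `get(key)` call fits this shape:
// it resolves to the stored string, or null when the key is absent.
interface CacheReader {
  get(key: string): Promise<string | null>;
}

// Stand-in for a PostgreSQL query returning results ordered by score.
type RankingLoader = (query: string) => Promise<RankedResult[]>;

// Read path: try the cache first, fall back to the database on a miss.
async function readRankedResults(
  query: string,
  cache: CacheReader,
  loadFromDb: RankingLoader,
): Promise<{ results: RankedResult[]; source: "cache" | "db" }> {
  const cached = await cache.get(query); // the query string is the cache key
  if (cached !== null) {
    return { results: JSON.parse(cached) as RankedResult[], source: "cache" };
  }
  // Cache miss: serve directly from the database. In this design the
  // consumer, not the read path, refreshes the cache after a ranking change.
  return { results: await loadFromDb(query), source: "db" };
}
```

Whether a miss should also write the loaded list back to Redis is not stated in the description; the sketch leaves the cache untouched on a miss and relies on the consumer to populate it.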

TECHNICAL ARCHITECTURE

  • Frontend: Next.js 15, TypeScript
  • Event Queue: Apache Kafka (KafkaJS)
  • Cache: Redis (ioredis)
  • Database: PostgreSQL 15
  • Infrastructure: Docker Compose
  • Consumer: Node.js background worker (tsx)