
Analytics Service 📊

The Analytics Service is responsible for collecting, processing, and exposing business metrics in ShopVerse using an event-driven analytics pipeline.

It provides near real-time insights without impacting core transactional services.


🎯 Responsibilities

The Analytics Service handles:

  • Consuming domain events from Kafka
  • Processing events using stream processing
  • Storing analytics data in ClickHouse
  • Exposing aggregated metrics via APIs
  • Supporting business dashboards and reporting

🧠 Why a Separate Analytics Service?

Analytics workloads are:

  • Read-heavy
  • Aggregation-focused
  • Computationally expensive

Separating analytics ensures:

  • No performance impact on transactional services
  • Independent scaling
  • Technology freedom (OLAP databases)
  • Clean separation between operations and insights

๐Ÿ—๏ธ High-Level Architectureโ€‹


📡 Event Ingestion Flow


📦 Events Consumed

| Event Type | Source Service | Purpose |
|---|---|---|
| USER_REGISTERED | Auth Service | User growth |
| ORDER_PLACED | Order Service | Order metrics |
| PAYMENT_SUCCESS | Payment Service | Revenue tracking |
| PRODUCT_CREATED | Product Service | Catalog insights |

All events follow a standard analytics schema.


🧩 Analytics Event Model

{
  "eventType": "PAYMENT_SUCCESS",
  "service": "PAYMENT_SERVICE",
  "userEmail": "user@example.com",
  "entityId": "ORD123",
  "amount": 2499.00,
  "timestamp": "2026-01-21T11:45:00",
  "metadata": {
    "paymentMode": "UPI"
  }
}
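As an illustration of how a consumer might validate this payload, here is a minimal Python sketch. The class name `AnalyticsEvent` mirrors the component listed later in this page, but the Python field names, the choice of required fields (`eventType`, `service`, `timestamp`), and the `parse_event` helper are assumptions made for this example, not the service's actual implementation.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class AnalyticsEvent:
    """In-memory form of the standard analytics event (illustrative only)."""
    event_type: str
    service: str
    timestamp: datetime
    user_email: Optional[str] = None
    entity_id: Optional[str] = None
    amount: Optional[Decimal] = None
    metadata: dict = field(default_factory=dict)


def parse_event(raw: str) -> AnalyticsEvent:
    """Parse a JSON payload; raise ValueError if an assumed-required field is missing."""
    # parse_float=Decimal keeps monetary amounts exact instead of using binary floats.
    data = json.loads(raw, parse_float=Decimal)
    for required in ("eventType", "service", "timestamp"):  # assumption: these three are mandatory
        if required not in data:
            raise ValueError(f"missing required field: {required}")
    amount = data.get("amount")
    return AnalyticsEvent(
        event_type=data["eventType"],
        service=data["service"],
        timestamp=datetime.fromisoformat(data["timestamp"]),
        user_email=data.get("userEmail"),
        entity_id=data.get("entityId"),
        amount=Decimal(str(amount)) if amount is not None else None,
        metadata=data.get("metadata") or {},
    )
```

Events that carry no monetary value, such as USER_REGISTERED, would simply leave `amount` unset under this sketch.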

๐Ÿ—„๏ธ Data Storage (ClickHouse)โ€‹

ClickHouse is used because it provides:

  • Columnar storage
  • Extremely fast aggregations
  • High write throughput
  • Ideal for OLAP workloads

Example Metrics Stored:

  • Total users
  • Orders per day
  • Revenue per day
  • Revenue by payment method
  • Service-level event counts
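To make these metrics concrete, the sketch below computes three of them in plain Python over a list of event dictionaries shaped like the analytics event model. In the real service these would presumably be aggregation queries against ClickHouse; the function names and the `"UNKNOWN"` fallback for a missing payment mode are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal


def revenue_per_day(events):
    """Sum PAYMENT_SUCCESS amounts per calendar day (ISO date string -> Decimal)."""
    totals = defaultdict(Decimal)  # Decimal() starts each day at 0
    for e in events:
        if e["eventType"] == "PAYMENT_SUCCESS":
            day = datetime.fromisoformat(e["timestamp"]).date().isoformat()
            totals[day] += Decimal(str(e["amount"]))
    return dict(totals)


def revenue_by_payment_mode(events):
    """Sum PAYMENT_SUCCESS amounts per metadata.paymentMode value."""
    totals = defaultdict(Decimal)
    for e in events:
        if e["eventType"] == "PAYMENT_SUCCESS":
            # Assumption: events without a payment mode are grouped under "UNKNOWN".
            mode = e.get("metadata", {}).get("paymentMode", "UNKNOWN")
            totals[mode] += Decimal(str(e["amount"]))
    return dict(totals)


def event_counts_by_service(events):
    """Count events per originating service."""
    counts = defaultdict(int)
    for e in events:
        counts[e["service"]] += 1
    return dict(counts)
```

Each function corresponds to a single group-by over the event stream, which is the kind of query a columnar OLAP store is designed to answer quickly.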

๐ŸŒ Analytics APIsโ€‹

| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /api/analytics/dashboard | ADMIN | Get dashboard metrics |
| GET | /api/analytics/revenue | ADMIN | Revenue trends |
| GET | /api/analytics/orders | ADMIN | Order statistics |

These APIs are read-only.


๐Ÿ” Security Modelโ€‹

  • All APIs are accessed through the API Gateway
  • Only the ADMIN role is allowed
  • Identity is verified using these request headers:
    • X-User-Email
    • X-User-Role

Analytics data is never exposed to customers.
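The header check described above can be sketched as follows. This is a logic-only illustration in Python, not the service's `GatewayHeaderAuthenticationFilter`; the function names, the exact-match comparison against `"ADMIN"`, and the 403 response for rejected requests are assumptions.

```python
def is_authorized(headers):
    """Return True only when the gateway headers identify an ADMIN user."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    email = (normalized.get("x-user-email") or "").strip()
    role = (normalized.get("x-user-role") or "").strip()
    # Assumption: an identity (email) must be present and the role must be exactly "ADMIN".
    return bool(email) and role == "ADMIN"


def handle_request(headers):
    """Map the authorization decision to an HTTP status code (403 is an assumed choice)."""
    return 200 if is_authorized(headers) else 403
```

A check like this is only meaningful if the service is reachable solely through the gateway, since the headers themselves are trusted as-is.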


🔄 Stream Processing Strategy

Apache Flink handles:

  • Event ingestion
  • Transformation
  • Aggregation
  • Windowing (daily, hourly metrics)

Benefits:

  • Near real-time analytics
  • Fault tolerance
  • Exactly-once semantics for job state (end-to-end guarantees also depend on the sink)
  • Horizontal scalability
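The windowing step can be illustrated with a small tumbling-window sketch in plain Python. It is not Flink code; it only shows how each event timestamp maps to a fixed-size, non-overlapping window (hourly or daily) and how counts accumulate per window. The function names are hypothetical, and windows are aligned to the Unix epoch as an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def window_start(ts: datetime, size: timedelta) -> datetime:
    """Return the start of the tumbling window of the given size that contains ts."""
    epoch = datetime(1970, 1, 1)  # assumption: windows are aligned to the epoch
    elapsed = ts - epoch
    # Integer-divide the elapsed time by the window size, then scale back up.
    return epoch + (elapsed // size) * size


def count_per_window(events, size):
    """Count events per (eventType, window start) pair."""
    counts = Counter()
    for e in events:
        ts = datetime.fromisoformat(e["timestamp"])
        counts[(e["eventType"], window_start(ts, size))] += 1
    return counts
```

With `size=timedelta(hours=1)`, events at 11:05 and 11:55 fall into the same 11:00 window, while an event at exactly 12:00 starts the next one.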

๐Ÿ›ก๏ธ Reliability & Fault Toleranceโ€‹

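  • Kafka retains events, so they can be re-read after a consumer failure
  • Flink checkpoints its state and resumes from the last checkpoint
  • ClickHouse persists stored data durably
  • Queries are idempotent
  • Together, these prevent data loss during restarts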
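The interaction between checkpoints and replay can be sketched with a toy simulation. When a job restarts from its last checkpoint, events processed after that checkpoint are read again, so the example pairs replay with a sink that overwrites by key instead of appending. The `Job` and `IdempotentSink` classes, the checkpoint interval, and the `(eventType, entityId)` deduplication key are all assumptions for illustration, not the service's design.

```python
class IdempotentSink:
    """Stores rows by key, so a replayed write overwrites instead of duplicating."""

    def __init__(self):
        self.rows = {}
        self.writes = 0  # total write calls, including replays

    def write(self, key, row):
        self.writes += 1
        self.rows[key] = row


class Job:
    """Reads a log from the last checkpointed offset and checkpoints every N events."""

    def __init__(self, log, sink, checkpoint_every=2):
        self.log = log
        self.sink = sink
        self.checkpoint_every = checkpoint_every
        self.checkpoint = 0  # next offset to read after a restart

    def run(self, fail_at=None):
        for offset in range(self.checkpoint, len(self.log)):
            if offset == fail_at:
                raise RuntimeError(f"simulated crash at offset {offset}")
            event = self.log[offset]
            # Assumed dedup key: event type plus entity id.
            self.sink.write((event["eventType"], event["entityId"]), event)
            processed = offset + 1
            if processed % self.checkpoint_every == 0:
                self.checkpoint = processed
        self.checkpoint = len(self.log)
```

If the job crashes at offset 3 with a checkpoint at offset 2, the restart re-reads offset 2. The sink receives that event twice but still ends up with exactly one row per event.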

โš ๏ธ Failure Scenariosโ€‹

โŒ Kafka Lagโ€‹

  • Analytics are delayed until the consumer catches up
  • Core services are unaffected
  • If the job fails, it restarts from the last checkpoint

โŒ ClickHouse Downโ€‹

  • Events are buffered until ClickHouse recovers
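One way to picture this buffering is the in-process sketch below. This page does not say where events are actually buffered (they may simply remain in Kafka while the job waits), so the `BufferingWriter` class, its `max_buffered` limit, and the use of `ConnectionError` to signal an unavailable store are purely illustrative assumptions.

```python
from collections import deque


class BufferingWriter:
    """Holds events in memory while the store is unavailable; flushes in order on recovery."""

    def __init__(self, store_write, max_buffered=10_000):
        # store_write: callable that raises ConnectionError while the store is down (assumed contract)
        self._store_write = store_write
        self._buffer = deque()
        self._max_buffered = max_buffered

    def submit(self, event):
        """Queue an event and try to flush everything that is pending."""
        if len(self._buffer) >= self._max_buffered:
            raise OverflowError("buffer full; apply backpressure upstream")
        self._buffer.append(event)
        self.flush()

    def flush(self):
        """Write buffered events oldest-first; stop at the first failure."""
        while self._buffer:
            try:
                self._store_write(self._buffer[0])
            except ConnectionError:
                return False  # store still down; keep the event for the next attempt
            self._buffer.popleft()
        return True

    @property
    def pending(self):
        return len(self._buffer)
```

An event is removed from the buffer only after its write succeeds, so ordering is preserved and nothing is dropped while the store is down. The bounded buffer makes the failure explicit once memory would otherwise grow without limit.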

โš™๏ธ Key Componentsโ€‹

  • AnalyticsConsumer – Kafka consumer / Flink job
  • AnalyticsEvent – Standard event model
  • AnalyticsQueryService – ClickHouse queries
  • AnalyticsController – REST APIs
  • JdbcTemplate – ClickHouse integration
  • GatewayHeaderAuthenticationFilter – Security

📈 Scalability Considerations

  • Kafka partitions enable parallel ingestion
  • Flink scales horizontally
  • ClickHouse supports distributed clusters
  • Analytics Service APIs are stateless

🧪 Testing Strategy

  • Unit tests for query logic
  • Mock Kafka events for Flink jobs
  • Validation of aggregations
  • Security tests for admin-only access

📌 Summary

The Analytics Service provides:

  • Near real-time business insights
  • Zero impact on core workflows
  • Scalable stream processing
  • Production-grade analytics architecture

It turns ShopVerse operational data into actionable intelligence.