How Rise Streams 200B Events a Day into an Open, AI-Ready Data Lake with Eon

“At our scale, data infrastructure can quickly become a competitive bottleneck. Eon allows us to focus on deriving value from data rather than constantly operating and optimizing the systems underneath it.”

—Chen Shalit, CEO, Rise

About Rise

Rise is a global AdTech platform serving major digital publishers. Its real-time bidding pipeline processes roughly 200 billion events (or more than a petabyte of data) per day, across 2 billion users. Every event feeds models that decide bidding strategy, segmentation, and attribution. At this scale, the competitive edge comes from how quickly and how cleanly data reaches those models.

The Challenge

Big-data volumes, hard performance bars, and unforgiving economics

Models, rule-based segmentations, and aggregations run continuously over Rise's raw event stream, driving real-time bidding decisions, advertiser-facing reports, and the optimization workflows the business runs on. At more than 200 billion events and over a petabyte of data a day, arriving at 40,000 files a minute, accuracy and freshness are the foundation for every downstream decision.
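
For a sense of what these figures imply per second, the back-of-the-envelope arithmetic below converts the daily totals quoted above into rates. It assumes a decimal petabyte (10^15 bytes) and treats the "more than" figures as exact, so the results are rough lower bounds rather than measured values.

```python
# Back-of-the-envelope rates derived from the figures quoted above.
# Assumptions: 1 PB = 10**15 bytes; "more than" figures are taken as exact.
EVENTS_PER_DAY = 200_000_000_000
BYTES_PER_DAY = 10**15
FILES_PER_MINUTE = 40_000
SECONDS_PER_DAY = 24 * 60 * 60

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
gigabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**9
files_per_day = FILES_PER_MINUTE * 60 * 24
avg_event_bytes = BYTES_PER_DAY / EVENTS_PER_DAY
avg_file_megabytes = BYTES_PER_DAY / files_per_day / 10**6

print(f"{events_per_second:,.0f} events/s")
print(f"{gigabytes_per_second:.1f} GB/s")
print(f"{avg_event_bytes:,.0f} bytes per event on average")
print(f"{avg_file_megabytes:.1f} MB per arriving file on average")
```

Under these assumptions the stream works out to roughly 2.3 million events and 11.6 GB every second, with an average event of about 5 KB and an average arriving file of about 17 MB.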

Two challenges define the problem. First, the volume comes with hard performance requirements and low-latency expectations; a quality slip or a slow path at the source cascades through everything downstream, often before anyone catches it. Second, processing data at this scale is brutally expensive. Teams either build and maintain heavy ingestion machinery themselves or accept compute costs that cap what they can actually process.

To sustain its next round of growth, the team set out to build four capabilities into the foundation:

Ingestion that's economical at full volume: Compute lean enough to absorb the full event stream without costs capping what the team can process.
Data that lands ready: Validated, filtered, sorted, and compacted into well-balanced files on arrival, so there's no separate cleanup-and-optimization layer downstream.
An open foundation that grows with the workload: Sub-hourly aggregations as the default, semi-structured payloads handled natively, schema that evolves cleanly, engine-pluggable compute.
An AI-ready surface as data lands: Events catalogued with AdTech semantics on arrival, not made AI-ready as a downstream project.

Why Eon

AI-ready data at petabyte scale, at up to 30x lower cost

Rise processes more than a petabyte of data every day. Eon transforms that continuous stream into an open Apache Iceberg foundation that is clean, validated, and ready for analytics and AI with sub-minute freshness.

Powered by Eon's proprietary streaming compute engine, the platform handles the same workload with approximately 10x lower compute consumption, and up to 30x lower cost, than a comparable Apache Flink deployment, making AI-ready data infrastructure economically viable at massive scale.

As data arrives, Eon continuously optimizes Iceberg tables through intelligent sorting, balancing, and compaction, preventing the small-files problem that commonly degrades performance in high-throughput environments. The result is data that is immediately ready for fast analytics, AI workloads, and downstream consumption.
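
To illustrate the small-files problem and what compaction does about it, here is a minimal sketch that groups small files into batches close to a target size. It is not Eon's implementation: the function name `plan_compaction`, the 128 MB target, and the greedy first-fit-decreasing strategy are assumptions chosen purely for illustration.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[float], target_mb: float = 128.0) -> List[List[float]]:
    """Greedy first-fit-decreasing packing of small files into groups.

    Hypothetical helper for illustration: each returned group stays at or
    below target_mb and would be rewritten as a single larger file.
    A file that already exceeds target_mb ends up in a group of its own.
    """
    groups: List[List[float]] = []
    totals: List[float] = []
    for size in sorted(file_sizes_mb, reverse=True):
        for i, total in enumerate(totals):
            if total + size <= target_mb:
                groups[i].append(size)
                totals[i] += size
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([size])
            totals.append(size)
    return groups


# Sixteen small files, sized loosely after the ~17 MB average derived from the source figures.
small_files = [17.0] * 14 + [5.0, 3.0]
plan = plan_compaction(small_files)
print(len(small_files), "files ->", len(plan), "files")
```

In this toy input, sixteen small files collapse into two files of at most 128 MB each, which is the kind of reduction that keeps query planning and scans fast on high-throughput tables.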

Higher-value data from the moment it lands

Eon applies intelligent filtering, schema adaptation, and aggregation directly within the ingestion pipeline, removing redundant and low-value events before they ever reach the lakehouse. Schemas automatically evolve as payloads change, while aggregations are calculated in flight.

The result is less storage and compute overhead, cleaner datasets, and higher-signal inputs for analytics and machine learning systems.
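
As a rough picture of filtering and aggregating in flight, the sketch below drops duplicate and low-value events and rolls the rest up into 15-minute windows in a single pass. The event fields (`event_id`, `ts`, `campaign`, `bid_usd`), the zero-bid filter rule, and the window size are hypothetical; the source does not describe Eon's actual rules or schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 15 * 60  # assumed sub-hourly window size


def aggregate_in_flight(events: Iterable[dict]) -> Dict[Tuple[int, str], dict]:
    """Single pass: skip duplicates and zero-value events, aggregate the rest per window."""
    seen_ids = set()
    windows: Dict[Tuple[int, str], dict] = defaultdict(lambda: {"events": 0, "bid_usd": 0.0})
    for event in events:
        if event["event_id"] in seen_ids:
            continue  # redundant event: already counted
        seen_ids.add(event["event_id"])
        if event["bid_usd"] <= 0:
            continue  # low-value event under the assumed rule
        # Align the timestamp to the start of its window.
        window_start = event["ts"] - event["ts"] % WINDOW_SECONDS
        bucket = windows[(window_start, event["campaign"])]
        bucket["events"] += 1
        bucket["bid_usd"] += event["bid_usd"]
    return dict(windows)


sample = [
    {"event_id": "a", "ts": 1000, "campaign": "c1", "bid_usd": 0.5},
    {"event_id": "a", "ts": 1000, "campaign": "c1", "bid_usd": 0.5},  # duplicate
    {"event_id": "b", "ts": 1700, "campaign": "c1", "bid_usd": 0.0},  # low value
    {"event_id": "c", "ts": 1750, "campaign": "c1", "bid_usd": 0.25},
    {"event_id": "d", "ts": 1900, "campaign": "c1", "bid_usd": 1.0},
]
result = aggregate_in_flight(sample)
print(result)
```

Five raw events become two aggregate rows here. A production system would also need to bound the deduplication state, which this sketch keeps in an unbounded in-memory set for brevity.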

Built-in quality, governance, and metadata

Eon continuously validates data as it is ingested, performing checks for freshness, schema drift, null rates, uniqueness, and distribution anomalies before data reaches downstream systems.

Invalid records are automatically isolated, ensuring dashboards, ETL processes, and AI models operate only on verified data.
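
The validate-then-isolate pattern can be sketched in a few lines. This is an illustration only, not Eon's API: `validate_batch`, the required field names, and the 60-second freshness bound are hypothetical, and only three of the checks named above (null values, uniqueness, freshness) are shown.

```python
from typing import List, Tuple

REQUIRED_FIELDS = ("event_id", "user_id", "ts")  # hypothetical schema
MAX_AGE_SECONDS = 60  # assumed freshness bound


def validate_batch(records: List[dict], now: float) -> Tuple[List[dict], List[Tuple[dict, str]]]:
    """Split a batch into valid records and quarantined (record, reason) pairs."""
    valid: List[dict] = []
    quarantined: List[Tuple[dict, str]] = []
    seen_ids = set()
    for record in records:
        missing = [field for field in REQUIRED_FIELDS if record.get(field) is None]
        if missing:
            quarantined.append((record, "null field: " + ", ".join(missing)))
        elif record["event_id"] in seen_ids:
            quarantined.append((record, "duplicate event_id"))
        elif now - record["ts"] > MAX_AGE_SECONDS:
            quarantined.append((record, "stale"))
        else:
            seen_ids.add(record["event_id"])
            valid.append(record)
    return valid, quarantined


batch = [
    {"event_id": "e1", "user_id": "u1", "ts": 990},
    {"event_id": "e1", "user_id": "u2", "ts": 995},   # duplicate id
    {"event_id": "e2", "user_id": None, "ts": 999},   # null user_id
    {"event_id": "e3", "user_id": "u3", "ts": 900},   # older than 60 s
]
valid, quarantined = validate_batch(batch, now=1000)
reasons = [reason for _, reason in quarantined]
print(len(valid), "valid;", reasons)
```

Keeping the reason alongside each quarantined record means downstream consumers only ever see the valid list, while the isolated records remain available for inspection.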

At the same time, Eon captures rich metadata across tables, partitions, and fields, continuously maintaining a living catalog of events, schemas, and business semantics as new partners, formats, and data sources are introduced.

Data organized for consumption out of the box

Rather than simply storing data, Eon continuously organizes it around how it is actually used.

The platform analyzes query patterns and transformation behavior to automatically generate and maintain optimized views on top of raw datasets. Common reporting, analytics, and tenant-specific use cases are served from structures designed for fast access and efficient consumption, eliminating the need for platform teams to manually build and maintain layers of transformation logic.

Open by design

The entire foundation is built on Apache Iceberg and cloud object storage, allowing organizations to use the analytics, AI, and query engines of their choice.

Because the architecture is based on open formats rather than proprietary warehouses, organizations retain full ownership of their data while avoiding vendor lock-in. Eon's storage and deduplication architecture further reduces footprint and infrastructure costs, delivering the economics required to operate at petabyte scale.

The Solution

Rise partnered with Eon's Forward Deployed Engineering (FDE) team to build an AI-ready data foundation capable of processing more than a petabyte of data per day with sub-minute freshness. The engagement delivered:

Managed ingestion into an open Iceberg data lake: Eon's streaming engine continuously lands and optimizes data in Apache Iceberg on GCS, delivering sub-minute freshness, roughly half the raw Avro storage footprint, and open access from any Iceberg-compatible engine.
Smart filtration at ingest: Redundant and low-value events are filtered before they land, reducing storage and compute costs while improving downstream analytics and AI performance.
Built-in quality and observability: Data quality checks run continuously during ingestion, with automated quarantine, lineage tracking, and visibility into downstream impact.
AI-ready semantic layer and managed views: Eon continuously maintains event catalogs, schemas, and AdTech semantics, automatically generating managed views and publishing context through MCP for analytics and AI agents.
Automated Iceberg operations: Compaction, snapshot management, partition evolution, and file optimization run continuously, eliminating the need for separate maintenance pipelines.

The Results

Sub-minute freshness at petabyte scale

More than 200 billion events per day now land in the data lake with sub-minute end-to-end freshness, improving on the previous 3–5 minute baseline. Data is continuously available for analytics and AI, with an Iceberg footprint roughly half the size of the raw Avro data.

Clean, AI-ready data from the moment it lands

Events arrive validated, filtered, sorted, and compacted into optimized Iceberg tables. Data quality issues are caught before they reach downstream systems, while smart filtration removes redundant, low-value events before they enter the data lake.

Economics that make AI-ready data infrastructure practical

Eon's streaming engine processes the full event stream with approximately 10x lower compute consumption than a comparable Apache Flink deployment, delivering materially lower infrastructure costs while maintaining sub-minute freshness at massive scale.

The Future

Rise's AI-ready data foundation will continue to evolve as more workloads move to real-time processing, more transformation logic is automated within the platform, and AI agents gain direct access to governed data, semantics, lineage, and quality signals through MCP.

Built on open Iceberg and continuously managed by Eon, the foundation gives Rise a platform for long-term growth, enabling the company to scale data volumes, analytics, and AI initiatives without rebuilding the underlying architecture.
