Building an Integrated Infrastructure for Real-Time Business – The New Stack

Digitization is leading post-Covid IT spending with a focus on real-time customer engagement – and it’s a challenge.

Manish Devgan

Manish is the Product Manager at Hazelcast, one of the pioneers of real-time data. He has spent over 20 years in data management and analysis, creating cutting-edge products. Prior to Hazelcast, Manish led Software AG’s IoT platform product portfolio. Manish is a published author and a guest speaker at industry conferences.

The majority of companies have, for the first time, designed organization-wide digital transformation strategies; 55% of IT spending will be allocated to digitization by 2024, according to IDC.

But it is far from simple. IDC also believes that many will have “difficulty navigating” the digital world as companies try to figure out what digital really means to them.

The winners will be a group that IDC calls “digital aficionados,” defined as those who innovate customer engagement with data management and analytics technologies at their core.

It means moving to a type of customer engagement that goes beyond simply being online. This means real-time engagement: a tailored offer while the customer shops, a fraudulent charge blocked during the transaction, a part replaced before the machine breaks, or support delivered while the customer banks.

Achieving this level of engagement requires a 360-degree view of the customer and real-time personalization – and that’s the challenge, because it demands an accurate, actionable customer profile. These profiles are built from a combination of streaming event data – clicks on a site, machine-to-machine communications, transactions and so on, generated in milliseconds – and static historical data that provides context about the customer. Building that holistic picture takes a continuous analytics system, and while companies may believe in the power of real-time action and decision-making, many are too overwhelmed by customer data to act in real time.

According to a Forrester report, less than a third of executives can get the insights they need from their data. In a separate Forrester survey with CSG Systems, only half – 51% – said they could offer personalized interactions, and 46% could orchestrate actions in real time.

A three-part challenge

McKinsey analysis reinforces this point and suggests why this should be the case: only a fraction of the data from connected devices is ever ingested, processed, queried and analyzed. Translated: it is not the data that causes problems, but the way it is processed.

There are three reasons for this problem.

First, there is the increasingly decentralized nature of data generation. Data is created by apps, devices, servers and websites, with customer transactions taking place in every corner of a digital business. For those trying to understand and act in real time using streaming analytics, this creates a strategic architectural question of where and how to process data and run analytics. Should they process the data where it is created, or transfer it to a centralized data store? The former may offer only limited processing capacity, but the latter almost certainly means data travels back and forth across the network – which means delayed analysis and, just as importantly, delayed action.

Next comes the prevailing do-it-yourself approach to streaming data. More than a third of organizations are adopting streaming applications and environments for data pipelines, data integration and stream processing, according to Swim’s State of Streaming Data report. The problem is that 70% of them build their own streaming environments and infrastructure, leaving them to solve data storage, platform optimization and systems integration on their own. Getting it wrong creates inefficiencies and performance overhead that hamper data processing and analysis.

Finally, there are legacy data storage and analytics architectures. Databases and data warehouses store static, historical customer data, but they were not designed or optimized to capture, ingest, process or analyze fast-changing streams of event data. They must be integrated with streaming engines, with the attendant risk of performance inefficiencies. Databases also incur overhead, such as the need to invest in additional hardware to maintain performance.

Designing the architecture

What does it take to increase the rate at which data is ingested, processed, queried and analyzed? A real-time data architecture based on seven critical features:

  • An event broker and messaging tier. This layer ingests and moves data from different sources to consumers, queuing messages, brokering them between producers and consumers, and supporting communication models such as publish-subscribe.
  • A real-time data integration layer providing features such as data pipelines and streaming ETL – collecting data from sources (extract), converting it to the desired format (transform) and storing it in a data store (load).
  • A fast data management layer to quickly store and access data. This layer should be based on a storage medium and format considered “correct” for your SLA needs. Memory-first tiered storage models and SQL-based access are key enablers at this level.
  • Event and stream processing, supporting timely action and engagement based on the latest event data. Advanced features include analytics that aggregate information from continuously arriving events, the ability to join data streams with data stored in the data management tier, and scaling to handle millions of events per second.
  • Real-time analytics, aimed at analytical workloads that provide insights to downstream operational applications. These add value by replacing slow legacy batch jobs and speeding time to insight through open formats such as Parquet and faster compute engines, among other things.
  • Real-time machine learning (ML). ML is reshaping how businesses tailor personalized content and services by adapting models to user preferences, which often change in real time. Historically, ML was developed on batch data, with data scientists building and testing models against offline historical data. Real-time engagement, however, means feeding the model live data for continuous improvement. Core capabilities for real-time ML include online prediction and continuous learning – updating ML models in real time and incorporating new incoming data for accurate predictions.
  • Apps — software and services optimized for real-time architecture and streaming analytics. Examples include shopping carts that make recommendations based on a shopper’s clicks and past behavior, or fraud detection systems that learn the normal behavior on a person’s credit card and can alert them to potentially fraudulent transactions.
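To make the last two items concrete, here is a minimal, hypothetical sketch of online prediction with continuous learning: a tiny logistic model scores each incoming transaction for fraud, raises a flag above a threshold, and updates its weights from labeled feedback as events arrive. Plain Python stands in for a real ML runtime, and every name, feature and threshold is illustrative.

```python
import math

class OnlineFraudScorer:
    """Logistic model updated one event at a time (continuous learning)."""

    def __init__(self, n_features, lr=0.1, threshold=0.8):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr                  # learning rate for each SGD step
        self.threshold = threshold    # alert above this fraud probability

    def score(self, x):
        """Online prediction: probability that this event is fraud."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def is_suspicious(self, x):
        return self.score(x) >= self.threshold

    def learn(self, x, label):
        """One SGD step on a labeled event (label: 1 = fraud, 0 = legit)."""
        err = self.score(x) - label
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

scorer = OnlineFraudScorer(n_features=2)
# A stream of labeled transactions: [amount_zscore, foreign_country_flag].
for x, label in [([0.1, 0], 0), ([3.5, 1], 1), ([0.2, 0], 0), ([3.0, 1], 1)] * 50:
    scorer.learn(x, label)

print(scorer.is_suspicious([3.2, 1]))  # unusually large foreign transaction
print(scorer.is_suspicious([0.1, 0]))  # routine domestic purchase
```

Because the model learns per event rather than per batch, its notion of “normal” drifts with the cardholder’s behavior, which is exactly what batch-trained models struggle to do.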

Blueprint for a better way

These elements already exist in information architectures. What matters when building real-time applications is how they are implemented. They must be integrated. This means an architecture capable of streaming, querying and analyzing these events, but also querying and analyzing stored data.
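That integration can be sketched in a few lines. In this hypothetical example, a stream-processing loop enriches each live click event with stored customer history and maintains a running per-customer aggregate; plain Python dicts stand in for the event broker and the fast data management layer, and all field names are illustrative.

```python
from collections import defaultdict

# Static historical data (the "data management" tier in this sketch).
customer_history = {
    "c1": {"segment": "premium", "lifetime_orders": 42},
    "c2": {"segment": "new", "lifetime_orders": 1},
}

# Continuously updated state built from the stream.
clicks_per_customer = defaultdict(int)

def process(event):
    """Join one streaming event with stored context and update state."""
    profile = customer_history.get(event["customer"], {})
    clicks_per_customer[event["customer"]] += 1
    return {
        "customer": event["customer"],
        "page": event["page"],
        "segment": profile.get("segment", "unknown"),
        "session_clicks": clicks_per_customer[event["customer"]],
    }

# Events as they might arrive from an event broker.
stream = [
    {"customer": "c1", "page": "/shoes"},
    {"customer": "c2", "page": "/sale"},
    {"customer": "c1", "page": "/checkout"},
]

for event in stream:
    print(process(event))
```

The point of the sketch is the join inside `process`: every decision on a live event is made with both the latest stream state and the stored historical context in hand, rather than in two separate systems.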

Moreover, this architecture must resolve the strategic question of where to place your compute. In a distributed computing model, memory-first is your ally, so it pays to consolidate pools of memory and other low-latency storage tiers on locally available servers. Data then does not need to make a round trip to the data center, nor do you need specially hardened servers at the edge. This real-time data architecture delivers low-latency streaming analytics with access to fast contextual data – and with it, real-time customer engagement.


Beyond a simple online presence, the challenge for organizations is to engage with customers in real time. Becoming a “digital aficionado” means working from a 360-degree view of those customers – something that can only be assembled using streaming analytics grounded in an integrated, real-time data architecture.

The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Real.

Feature image via Pixabay
