Stream processing for real-time Big Data log analytics

Log analytics has been around for a while, but until recently “real-time log analytics” usually meant

a) slow answers (taking many minutes or even a few hours), and

b) low volumes (data arriving at a few thousand records per second).

Things, needless to say, are changing. It could be the recent rise in all-things-smart, or a genuine need for business decisions to be made as fast as possible, or a combination of both. In any case, we’ve recently been seeing real-time log analytics projects where the customer requires sub-second latency at data rates from 50,000 records per second upwards (the highest was 100 million records per second, but admittedly that’s a bit of an outlier, even by today’s Big Data standards).

An increasing number of business intelligence vendors are focussing on the log analytics and SIEM market (or operational intelligence, as some call it). These vendors offer a wide range of capability and deployment options, from on-premise through to pure Cloud SaaS. However, as requirements move towards real-time and high performance, we’re seeing the cost of these tools become prohibitive, and most also fail to scale beyond a few thousand records per second.

Cue stream processing. Our recent customers in markets as diverse as Internet clickstream analysis and online advertising, fraud detection and security intelligence, call monitoring in telecoms, and even GPS micro-location services all had very similar IT pains. Some had tried log monitoring and SIEM tools and failed to get the required performance, enterprise integration and analytics flexibility. Others had tried Apache open source options such as Kafka and Storm, and found the cost of ownership too high because of integration and application development effort, with inadequate latency and throughput, and initial projects taking six months or more.

Guavus SQLstream offers real-time log analytics with the low latency and high throughput you would expect of a stream processing platform, combined with the rapid, easy deployment, flexibility and interoperability you would expect of a dedicated log monitoring application.

  • Out-of-the-box log collection adapters and remote, embeddable agents
  • StreamLab for browsing and visualizing the streaming log data
  • Flexible, powerful analytics built using streaming SQL
  • Continuous ETL with Hadoop (fastest ingest available at over 400 MB/s) and RDBMS
  • Millisecond latency that scales to data arriving at millions of records per second
  • Support for Kafka as an option for a distributed stream processing architecture, and streaming SQL connectors for Storm to accelerate Storm deployments, improving both performance and time to market.
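
To give a flavor of what "analytics built using streaming SQL" typically means in practice, here is a minimal Python sketch of a sliding-window aggregation: a running count of log records per source over the last few seconds, updated as each record arrives. This is only an illustration of the general technique, not SQLstream's API or its SQL syntax. The record layout, the function name `sliding_counts` and the sample data are all hypothetical, and the sketch assumes records arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# Hypothetical record layout: (timestamp in seconds, source key such as a host name).
LogRecord = Tuple[float, str]


def sliding_counts(
    records: Iterable[LogRecord], window_seconds: float
) -> Iterator[Tuple[float, str, int]]:
    """Yield (timestamp, key, count) for each record as it arrives.

    `count` is the number of records seen for that key within the last
    `window_seconds`, including the current one. Assumes timestamps are
    non-decreasing, which a real log feed may not guarantee.
    """
    windows: Dict[str, Deque[float]] = defaultdict(deque)
    for ts, key in records:
        window = windows[key]
        window.append(ts)
        # Evict timestamps that have fallen out of the window for this key.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        yield ts, key, len(window)


# Made-up sample stream for illustration only.
sample = [(0.0, "web-1"), (1.0, "web-1"), (2.0, "web-2"), (5.5, "web-1"), (6.2, "web-1")]
results = list(sliding_counts(sample, window_seconds=5.0))
for ts, key, count in results:
    print(f"t={ts:>4} source={key} records_in_last_5s={count}")
```

Because the function is a generator, each result is emitted as soon as its record is read, rather than after a batch completes. That record-at-a-time behavior is the property that separates stream processing from the "answers in minutes or hours" approach described earlier.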

As Big Data technologies mature, it’s increasingly clear that if you need “real-time log analytics” AND “Big Data”, then stream processing is where you should be looking. Guavus SQLstream offers real-time, Big Data performance. With tools such as StreamLab, and a core stream processing platform that uses standard SQL queries to analyze data streams, solutions can be delivered

a) in a fraction of the time required by existing stream processing tools,

b) at much lower cost, through rapid development coupled with significantly less hardware, and

c) with the real-time performance that existing log monitoring and SIEM tools are unable to offer.