Blog


Vice President Marketing
January 26, 2012

QoS and service level monitoring has always presented a challenge for telecommunications companies. With the growing uptake of IP voice and video services, the vast data volumes generated and the lack of an end-to-end view make monitoring the service experience in real time increasingly difficult.

Real-time QoS Monitoring

In this blog I’m looking at the core building blocks of a real-time IP service monitoring solution, using a much simplified view of a real-time application. Diagram 1 illustrates the basic problem: how to monitor an IP service when the end-to-end view is only possible by piecing together large volumes of events from many different sources, including the core network provider’s network, the home network and the cable modem, and the service provider’s platforms.

In SQLstream we capture each event stream in real time. Applications are built as streaming pipelines. Unlike a traditional database solution, where event data must first be stored and then processed, SQLstream streams the data through processing views that capture, combine, filter, aggregate and apply analytics to the event streams without having to store the data. This enables real-time operational intelligence at very high data volumes and with very low latency.

The first views in the pipeline capture the data streams. A declaration for an external data feed, the real-time MyEvent stream, is shown below. The source of its events is an external system agent or integration adapter.

CREATE OR REPLACE STREAM MyEvent
( "eventName" VARCHAR(10),
  "eventSeq" BIGINT,
  "eventVal1" INTEGER,
  "eventVal2" SMALLINT,
  "eventVal3" BIGINT )
DESCRIPTION 'source of events';

The raw MyEvent data is first filtered to find the events of interest. As illustrated in the code example below, these initial views tend to be as simple as possible in order to maximize reuse, the simplest being a SELECT STREAM * FROM ... WHERE statement. Streams can be combined, grouped or joined in a single view, a separate view can be provided per stream, or both.

CREATE OR REPLACE VIEW RawEvents 
    AS SELECT STREAM * FROM MyEvent 
WHERE "eventName" = 'RawEvent';
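To make the behaviour of this capture view concrete, here is a minimal plain-Python sketch of the same idea: rows are passed through one at a time and only those named 'RawEvent' survive. It is a model of the semantics only, not SQLstream code. The Event class, the raw_events function and the sample feed are hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    # Hypothetical record mirroring the columns of the MyEvent stream.
    event_name: str
    event_seq: int
    event_val1: int
    event_val2: int
    event_val3: int


def raw_events(my_event: Iterable[Event]) -> Iterator[Event]:
    """Lazily pass through only the rows named 'RawEvent'."""
    for event in my_event:
        if event.event_name == "RawEvent":
            yield event


# Made-up sample feed; a real feed would be unbounded.
feed = [
    Event("RawEvent", 1, 10, 1, 100),
    Event("Heartbeat", 2, 0, 0, 0),
    Event("RawEvent", 3, 12, 1, 140),
]
kept = [e.event_seq for e in raw_events(feed)]
print(kept)
```

Because the function is a generator, nothing is stored: each row is examined as it arrives and either forwarded or dropped, which is the property the view relies on.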

Diagram 2 illustrates the concept of the streaming data pipeline, using a simplified example for exception detection. The SQL view illustrated above is the Stream Capture #1 view in the diagram. The use case is built on a real-world example: raising an exception if a number of events of a particular type or value are detected within a specified time window.

Real-time Stream Processing Pipeline

The second view in the pipeline, Stream Processor #1, is shown below. In this example the view is responsible for the basic processing of the stream, counting the number of events that occur within a time window, in this case 180 seconds.

CREATE OR REPLACE VIEW CountedEvents AS 
SELECT STREAM *, 
   COUNT("eventName") OVER win AS "eventCount",
   FIRST_VALUE(RE.ROWTIME) 
          OVER win AS "firstEventTime",
   FIRST_VALUE("eventSeq") 
          OVER win AS "firstEventSeq" 
FROM RawEvents AS RE 
WINDOW win AS (RANGE INTERVAL '180' SECOND(3) PRECEDING);
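The trailing 180-second window can be modelled in plain Python as follows. This is a sketch of the semantics rather than of how SQLstream implements it. It assumes rows arrive in time order and that a row exactly 180 seconds old still counts as inside the window. Row, CountedRow, counted_events and the sample timestamps are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    rowtime: float   # arrival time in seconds, standing in for ROWTIME
    event_seq: int


class CountedRow(NamedTuple):
    rowtime: float
    event_seq: int
    event_count: int
    first_event_time: float
    first_event_seq: int


def counted_events(rows: Iterable[Row],
                   window_seconds: float = 180.0) -> Iterator[CountedRow]:
    """Emit one output row per input row, annotated with window values."""
    window = deque()  # rows currently inside the window, oldest first
    for row in rows:
        window.append(row)
        # Evict rows that have fallen out of the window. A row exactly
        # window_seconds old is kept (assumed inclusive bound).
        while window[0].rowtime < row.rowtime - window_seconds:
            window.popleft()
        first = window[0]
        yield CountedRow(row.rowtime, row.event_seq, len(window),
                         first.rowtime, first.event_seq)


# Made-up arrival times: three events within 180 s, then a late one.
rows = [Row(0.0, 1), Row(60.0, 2), Row(170.0, 3), Row(300.0, 4)]
results = list(counted_events(rows))
counts = [r.event_count for r in results]
print(counts)
```

Only the rows inside the current window are held in memory. When the fourth event arrives at 300 seconds, the first two have expired, so the count drops back and the "first" values move on to the third event.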

The final stage in this particular processing pipeline is the detection of the alert.

CREATE OR REPLACE VIEW FlagTriggerEvents 
    AS SELECT STREAM *, 
    "eventCount" >= 3 AS "alert" 
FROM CountedEvents;
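One detail worth noting is that this view does not filter: every row is passed on, with a boolean "alert" column attached. A minimal plain-Python model of that behaviour is sketched below. The CountedEvent and FlaggedEvent types, the threshold parameter and the sample counts are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class CountedEvent(NamedTuple):
    event_seq: int
    event_count: int


class FlaggedEvent(NamedTuple):
    event_seq: int
    event_count: int
    alert: bool


def flag_trigger_events(rows: Iterable[CountedEvent],
                        threshold: int = 3) -> Iterator[FlaggedEvent]:
    """Attach an alert flag to every row; no rows are dropped."""
    for row in rows:
        yield FlaggedEvent(row.event_seq, row.event_count,
                           row.event_count >= threshold)


# Made-up counts as they might come out of the counting stage.
counted = [CountedEvent(1, 1), CountedEvent(2, 2),
           CountedEvent(3, 3), CountedEvent(4, 2)]
flagged = list(flag_trigger_events(counted))
alerts = [f.alert for f in flagged]
print(alerts)
```

A downstream consumer that only wants the alerts would add its own filter on the flag, which keeps this stage reusable by consumers that need the full stream.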

It would of course be possible to include all processing in a single view. However, maximizing reuse of views is a major consideration when building a stream processing application. The example illustrates how a pipeline can be constructed so that each view can have any number of consumers. For example, any number of Rule views can read from the Stream Processor #1 view, and any number of views can read directly from the stream capture view.
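The fan-out described here, one upstream view feeding several independent rules, can be modelled in plain Python with itertools.tee. This is an illustrative sketch only. The rule function, the two thresholds and the sample counts are hypothetical, and tee buffers rows in memory here, which says nothing about how a streaming engine delivers rows to its consumers.

```python
from itertools import tee
from typing import Iterable, List


def rule(counts: Iterable[int], threshold: int) -> List[bool]:
    """A hypothetical rule: flag each row whose count reaches the threshold."""
    return [count >= threshold for count in counts]


# Stand-in for the per-row eventCount values produced upstream.
counted = iter([1, 2, 3, 2])

# Two consumers read the same upstream rows without affecting each other.
for_minor, for_major = tee(counted, 2)
minor_alerts = rule(for_minor, threshold=2)
major_alerts = rule(for_major, threshold=3)
print(minor_alerts, major_alerts)
```

Each rule sees every upstream row, so a new rule can be added without touching the counting stage or the existing rules.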

The application includes significantly more sophisticated integrations, features and analytics than illustrated here. For example:

  • Multiple rules
  • Recording and forwarding the events responsible for the generation of the alerts
  • Detecting escalation
  • Detecting clearance events
  • Joining with alert history to identify exceptional events that deviate significantly from historical norms

These use cases are important components of a complete solution, and I’ll provide examples in subsequent blog posts explaining how they have been implemented.


Vice President Marketing
January 11, 2012

For Big Data, 2012 has started where 2011 left off, with a plethora of reports, articles and blogs. Interestingly, most still begin with the question “what is Big Data”. It appears ‘Big Data’ as a market is broadening its footprint far beyond its open source and Hadoop origins. My favourite new term in this quest for delineation is “Small Big Data”. (Isn’t that just “Data”?)

The most interesting trend for us is streaming Big Data processing and analytics. Edd Dumbill of O’Reilly Radar lists this as one of his “Five big data predictions for 2012”: “Hadoop’s batch-oriented processing is sufficient for many use cases, especially where the frequency of data reporting doesn’t need to be up-to-the-minute. However, batch processing isn’t always adequate, particularly when serving online needs such as mobile and web clients, or markets with real-time changing conditions such as finance and advertising.”

The real-time use case is an obvious one. If you need to respond or be warned in real time or near real time, for example about a security breach or a service-impacting event on a VoIP or video call, the high initial latency of batch-oriented data stores such as Hadoop is unacceptable.

However, there is also an emerging discussion on the storage of Big Data for Big Data’s sake. This is the blind collection and storage of data without due consideration as to how it’s going to be used. Dan Woods talks about this in his recent Forbes article “Curing the Big Data Storage Fetish”. The data will never create value without analysis, and little thought has been given to increasing analytics capacity.

There are many vendors emerging for the historical analysis of Big Data repositories, either on the Hadoop platform or on platforms from the other large-scale data warehouse vendors. However, there are very few vendors in the streaming Big Data analytics space, and even fewer products with the maturity, flexibility and scalability to process Big Data streams in real time.

Streaming Big Data analytics needs to address two areas. First, the obvious use case: monitoring across all input data streams for business exceptions in real time. This is a given. But perhaps more importantly, much of the data held in Big Data repositories is of little or no business value, and will never end up in a management report. Sensor networks, IP telecommunications networks and even data center log file processing are all examples where a vast amount of ‘business as usual’ data is generated. It’s therefore important to understand what’s being stored, and only persist what’s important (which admittedly, in some cases, may be everything). For many applications, streaming data can be filtered and aggregated prior to storing, significantly reducing the Big Data burden, and significantly enhancing the business value of the stored data. At least until we understand why we’re trying to store everything.
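As a rough illustration of filtering and aggregating before storing, the plain-Python sketch below reduces raw readings to one summary per time bucket and keeps individual readings only when they exceed a limit. The summarize function, the 60-second bucket, the limit of 100 and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def summarize(readings: Iterable[Tuple[float, float]],
              bucket_seconds: int = 60,
              limit: float = 100.0):
    """Reduce (timestamp, value) readings to bucket summaries plus exceptions."""
    buckets: Dict[int, Dict[str, float]] = defaultdict(
        lambda: {"count": 0, "total": 0.0})
    exceptions: List[Tuple[float, float]] = []
    for timestamp, value in readings:
        bucket = buckets[int(timestamp // bucket_seconds)]
        bucket["count"] += 1
        bucket["total"] += value
        if value > limit:
            # Keep the raw reading only when it is out of the ordinary.
            exceptions.append((timestamp, value))
    # One (count, mean) pair per bucket is all that gets persisted.
    summaries = {b: (int(s["count"]), s["total"] / s["count"])
                 for b, s in buckets.items()}
    return summaries, exceptions


# Made-up readings: (seconds since start, measured value).
readings = [(0, 10), (15, 20), (59, 30), (61, 250), (90, 50)]
summaries, exceptions = summarize(readings)
print(summaries, exceptions)
```

In this toy run, five raw readings become two summary rows and one retained exception. A streaming version would emit each summary as its bucket closes rather than holding all buckets in memory.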
