Blog


Vice President Marketing
January 10, 2013

The SQLstream Briefing Room webinar with Robin Bloor took place on Tuesday, January 8th, 2013. “Windows of Opportunity: Big Data on Tap” highlighted how the evolving Big Data landscape needs technologies that enable a much bigger enterprise-wide picture, complete with multiple data streams that can be combined to show what’s happening in real-time. The speakers included:

  • Eric Kavanagh, CEO, The Bloor Group, who hosted the event.
  • Robin Bloor, Chief Analyst, The Bloor Group, who led the online briefing.
  • Damian Black, President & CEO, SQLstream, discussing the emergence of streaming Big Data management as a key enabler for Big Data solutions, and how SQLstream is at the forefront of streaming innovation.

This was a very interesting and informative briefing on the emergence of streaming Big Data management and the use cases for real-time Big Data solutions.

Click here to watch the webcast …

Title: Windows of Opportunity: Big Data on Tap
Most business opportunities are moving targets these days, rendering static analytical solutions rather ineffective. Instead, organizations need technologies that enable a much bigger picture, complete with multiple data streams that can be combined to show what’s happening in real-time. And increasingly, companies need to analyze both traditional structured data and Big Data, including machine-generated data from all manner of enterprise systems.

Register for this episode of The Briefing Room to hear veteran Analyst Robin Bloor explain how a confluence of market forces has opened the door to a new analytical paradigm, one in which companies can leverage a vast array of data streams to pinpoint windows of opportunity as or even just before they appear. Bloor will be briefed by Damian Black of SQLstream, who will discuss his company’s analytical platform, which enables the management of dynamic information assets in much the way that traditional databases do for stored assets.

Posted under Events

Vice President Marketing
October 25, 2012

Real-time Big Data means the streaming, continuous integration of high-volume, high-velocity data from all sources to all destinations, coupled with powerful in-memory analytics. It’s a paradigm shift from conventional store-and-process systems, and one that’s playing well at the Intelligent Transportation Systems and Solutions World Congress here in Vienna.



Vice President Marketing
October 19, 2012

We’re at the Intelligent Transportation Society’s World Congress in Vienna, Austria next week, Mon 22 – Fri 26 October. The theme for this year is ‘Smarter on the Way’ and this is the first year since 2009 that it’s been held in Europe. The full panoply of intelligent transportation issues is addressed, everything to do with software, hardware and firmware that improves the traveler experience, transportation network utilization and capacity optimization. However, the combination of real-time traveler information systems, wireless sensors, GPS and predictive analytics is a specific focus. And increasingly, telematics and V2I/V2V applications are represented.

We’re going to be exhibiting, and also presenting on a special interest panel session: Optimizing ITS from a Customer Service Perspective. The session looks at how new technology can be used to make a significant and noticeable difference to a traveler’s experience. Our focus is to bring to bear our experience of consumer and customer quality of experience monitoring from other industries, particularly telecommunications. Social media, Twitter in particular, also has a role to play, with significant interest in extracting semantic information from Tweets to detect serious traffic incidents and congestion as they happen. However, there are issues with relying on social media, as I’ve discussed in a recent ITS International article, Social media mooted for traffic management.

Industries such as transportation, M2M and telematics are clearly leading the way in deploying real-time Big Data applications. That’s understandable: sensor networks and the Internet of Things are creating vast volumes of high-velocity data in which each event or record has value for only a very short period after its creation. If you’d like to understand more about streaming geospatial analytics from live vehicle GPS data, the video below presents a brief overview of a recent deployment.

Continuous ETL for Google BigQuery

There is another significant difference from the traditional business intelligence and real-time analytics market. The mainstream Big Data market is focused entirely on Business Intelligence, a one-way function for collecting data in a static store and displaying the results on a graph or report. Visualization is also an important consideration for industry, but streaming integration is the first requirement, and the ability of the platform to support real mission-critical applications a close second.

The Intelligent Transportation industry has all three requirements in abundance: vast volumes of fixed and wireless GPS sensor data that must be integrated seamlessly across the operational systems; mission-critical operation, since decisions with significant safety, environmental and traveler impact are made as a result; and the ability to visualize network flow, congestion and alarms.

So if you’re in Vienna, Austria next week, drop by our booth, P21.


Vice President Marketing
April 19, 2012

Joining real-time structured and unstructured data feeds for better accuracy and reliability from your operational intelligence, and the Text Analytics Summit, 2012, London.

Three IT trends have emerged over the past year: Big Data, real-time and the importance of unstructured data. Taking the latter first, there is an increasing awareness that much of the data we have available to us today is unstructured (Cloudera is among the many claiming that 80% of all data is unstructured). Unstructured data includes text messages, documents, tweets, emails and video content. There’s also a growing industry for tools and software that perform unstructured data analytics, primarily text analytics using semantic modeling, tagging and subsequent analysis.

The past year has also seen Big Data and Hadoop emerge from the rarefied atmosphere of California’s Silicon Valley into mainstream IT. Driven by statistics such as the claim that 90% of all data available today was generated in the past two years, Big Data as a functional area for primarily unstructured data is here to stay, and is effectively supercomputing lite for the masses.

The need for real-time streaming data management

However, the real-time trend is less well served today by either Hadoop or by the currently available tools and software for unstructured data analytics. Real-time is about the need for immediate detection and response – turning data sources into live data feeds, and processing the data on the fly, then loading batch based distributed platforms such as Hadoop as an output data stream.

‘Stream Reasoning’

I’ve also seen the term ‘stream reasoning’ used to describe the real-time processing of unstructured data, although this area is still less well developed and understood than the more mainstream text analytics on stored data. Stream reasoning is the ability to process and respond to semantic knowledge about tweets, messages and other social media interactions in real-time, on the fly. The diagram below illustrates how a semantic modeling library has been plugged into a real-time streaming pipeline in SQLstream. The example is based on SQLstream’s GATE UDX, but any library with reasonable performance and a query response API can be plugged in.

Combining streaming structured and unstructured live data feeds

Unstructured data feeds, such as text messages and tweets, are streamed through the semantic tagging UDX and library, with the output of this stage being real-time streams of semantic tagged data.  The data can then be analyzed and frequency charted in real-time.
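As a rough illustration of that flow, the Python sketch below tags each message as it arrives and keeps a running frequency count of the tags. It is only a toy model: the keyword table and the `tag_message` function are hypothetical stand-ins for a real semantic library, and nothing here is a SQLstream or GATE API.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical keyword-to-tag table standing in for a real semantic model.
KEYWORD_TAGS = {
    "accident": "incident",
    "crash": "incident",
    "jam": "congestion",
    "queue": "congestion",
}


def tag_message(text: str) -> List[str]:
    """Return the sorted set of tags whose keywords appear in the message."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return sorted({tag for keyword, tag in KEYWORD_TAGS.items() if keyword in words})


def tagged_stream(messages: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (message, tags) pairs one at a time, as each message arrives."""
    for text in messages:
        yield text, tag_message(text)


def running_tag_counts(messages: Iterable[str]) -> Iterator[Counter]:
    """Yield a snapshot of the tag frequencies after every message."""
    counts: Counter = Counter()
    for _, tags in tagged_stream(messages):
        counts.update(tags)
        yield counts.copy()


feed = [
    "Crash on the M4, huge jam building",
    "Long queue at junction 12",
    "Lovely morning",
]
snapshots = list(running_tag_counts(feed))
```

Each snapshot is what a live frequency chart would redraw from; the final one counts every tag seen so far.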

Text Analytics Summit, 2012, London

I’ll be speaking on this topic at the Text Analytics Summit, 2012, London. I’ll be discussing how to combine stream reasoning (admittedly, mostly over Twitter messages) with structured data, with the objective of improving the overall accuracy and reliability of the resulting operational intelligence. I’ll be using a couple of examples: customer experience management for IP content services such as VoIP and VoD, and improving the accuracy and reliability of traffic congestion and travel time information, where text analysis of tweets and messages can help to pinpoint the severity of road network traffic problems.

Look forward to seeing you there, or if you can’t make it, I’ll be blogging on the highlights next week.

 


Vice President Marketing
October 12, 2011

Visit SQLstream on Booth #1366

Technology and innovation are central themes of this year’s ITS World Congress. There’s been much written about the issues of congestion, green transportation schemes and improving personal mobility, not least in this blog. At SQLstream we’ve been playing our part to help revolutionize the Intelligent Transportation industry. It’s clear that the concepts of streaming data and real-time analytics are entering the mainstream, from low-level Big Data toolkits that require a streaming, low-latency front end, to the real world of sensor networks and industries such as smart grid and telecommunications.

This is just as true in transportation.  Here we have an industry with vast volumes of sensor data, a need for sophisticated real-time analytics, and platforms capable of driving real-time process automation.  We’ve been working with a number of transportation agencies for some time, and are about to launch a new ‘Insight’ product for intelligent transportation.  Our ‘Insight’ range provides tools and out of the box support for specific industry verticals based on our core Stream-to-Business platform.

Google Earth Display for Road Traffic Congestion


For Intelligent Transportation this means processing sensor data from GPS and fixed-road sensors, to deliver applications such as real-time Travel Time, live congestion detection and network KPI reporting.

Should you be attending the ITS World Congress, we’d be delighted to see you on our booth (#1366) for a demonstration.


Vice President Marketing
September 13, 2011

The 18th World Congress on Intelligent Transport Systems (ITS) is being held in Orlando from October 16th – 20th, 2011. This is the leading event for intelligent transportation solutions, and attracts a large audience of government, technology and industry professionals. The event seeks to demonstrate advances in the application of new technology and smart transportation. Major areas of focus include the reduction of traffic congestion and improvement in  personal mobility.

With 800 million vehicles on the world’s roads today, a number forecast to grow to between 2 and 4 billion by 2050, it is clear that transportation management  systems will need to analyze real-time sensor and GPS data dynamically on a massive scale to reduce congestion and optimize personal mobility. The objective is to achieve a fluid and reliable transportation network, that can respond dynamically to changing loads and conditions, and provide consistent and acceptable travel times.

The performance of a transportation network can be measured based on road usage (number of vehicles), and the travel speed and time from origin to destination.  Today’s traffic management systems rely on historical analysis of data from fixed sensors.  However, roadside and in-road sensor projects are very expensive to install and maintain. As a consequence, only a very limited view of the overall road network is available,  with sensor deployments focusing on primary routes and major intersections only. Also, fixed sensors tend to report traffic flow – at best a secondary measure of the real requirement –  congestion.

Most important however is the lack of real-time, dynamic behaviour from existing traffic management systems.  Flow control, for example at intersections and on freeways, is activated at specific times based on the historical analysis of the fixed sensor data – this helps, but is unable to react to changing patterns of traffic flow and congestion.

One approach to the problem is to introduce the latest wireless GPS sensor technology. Wireless GPS sensors have four significant advantages:

  1. Immediate, real-time information on vehicle speed and location.
  2. Low-cost solutions that can be deployed quickly, with little or no maintenance.
  3. A direct measure of vehicle speed, enabling real-time and accurate measurement of congestion.
  4. Complete network insight, covering highways and arterial routes, at a granularity of a few meters.

SQLstream ITS Insight

For example, when one national road agency was re-evaluating its approach to intelligent transportation systems, it identified wireless GPS technology as both a significantly cheaper and a potentially much superior solution for congestion detection and Travel Time. SQLstream was selected as the real-time traffic analytics and congestion detection platform, processing in-vehicle GPS sensor data. The SQLstream solution enabled the agency to cancel a $20 million fixed sensor program and to build a real-time traffic management platform based on SQLstream’s ITS Insight.

We will be demonstrating our real-time traffic management capabilities on our stand at ITS World Congress in Orlando. In addition, our CEO, Damian Black, will be participating in a number of related panel sessions on arterial travel time solutions and real-time data management for intelligent transportation. For those attending ITS World Congress, please visit us for a demo at Booth #1366, or visit our website for more information on SQLstream and real-time transportation management systems. We look forward to seeing at least some of you at the show.

August 31, 2011

I am going to discuss a SQLstream application for monitoring traffic flow in real-time. In this application, vehicles with GPS enabled devices transmit vehicle position along with other vehicle information such as speed and engine state. SQLstream receives this information as a real-time data stream and uses streaming SQL analytics to detect and predict the rapid onset of congestion on the road network in real-time.

Streaming SQL for Congestion Detection
The SQLstream application for congestion detection uses a typical streaming SQL processing pipeline. In this case, data is fed into the SQLstream pipeline using our Log File Adapter. SQLstream adapters provide an interface to sources and targets such as databases, log files, network sockets and mail servers. Adapters are built using the SQL/MED specification, which is part of the ANSI SQL standard. In this application, each log file contains the vehicle positions on the road network for the latest minute.

The conditioning pipeline performs data cleansing operations such as rejecting poor quality data (records with missing or out-of-bounds columns) followed by mapping of vehicle positions (lat/long pair) to a “road element” of the road network using a UDX to perform geo-spatial lookups in an external road network database.
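To make the cleansing step concrete, here is a minimal Python sketch of the kind of record filter described above. The field names, the `VehicleReport` type and the speed bound are assumptions made for illustration, not the actual schema or rules of the application, and the geo-spatial lookup to a road element is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class VehicleReport:
    """Hypothetical shape of one GPS report; the real columns may differ."""
    vehicle_id: str
    lat: Optional[float]
    lon: Optional[float]
    speed_kmh: Optional[float]


# Assumed plausibility bound; a real deployment would tune this.
MAX_SPEED_KMH = 250.0


def is_valid(report: VehicleReport) -> bool:
    """Reject records with missing or out-of-bounds columns."""
    if report.lat is None or report.lon is None or report.speed_kmh is None:
        return False
    if not -90.0 <= report.lat <= 90.0:
        return False
    if not -180.0 <= report.lon <= 180.0:
        return False
    return 0.0 <= report.speed_kmh <= MAX_SPEED_KMH


def clean(reports: Iterable[VehicleReport]) -> Iterator[VehicleReport]:
    """Lazily pass through only the reports that survive validation."""
    return (report for report in reports if is_valid(report))
```

Because `clean` returns a generator, records are filtered one at a time as they are consumed, which mirrors how a streaming pipeline treats its input.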

The diagram and the example SQL below show our implementation of a streaming SQL pipeline for congestion detection. Each vehicle reports its position and speed every minute. Two consecutive vehicle positions are then used to interpolate vehicle speeds for each road element on the vehicle path between the reporting positions. The interpolated speed is based on the actual distance traveled by the vehicle between two consecutive reports, and is calculated in a User Defined Transform (UDX) written in Java. The UDX also associates a confidence factor with each interpolated speed value, based on the position of the road element relative to the endpoints of the vehicle path.
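The interpolation idea can be sketched in a few lines of Python. This is not the Java UDX itself: the function name, the constant-speed assumption and especially the 0.5 / 1.0 confidence weights are hypothetical choices made only to show the shape of the calculation.

```python
from typing import List, Tuple

# (road element ID, metres of that element covered by the vehicle path)
PathSegment = Tuple[str, float]


def interpolate_speeds(
    path: List[PathSegment], elapsed_s: float
) -> List[Tuple[str, float, float]]:
    """Return (road element, speed in km/h, confidence) for each path element.

    ASSUMPTION: the vehicle moved at constant speed between the two reports,
    and endpoint elements (only partly traversed) get a lower confidence
    than interior ones. The 0.5 / 1.0 weights are illustrative only.
    """
    if elapsed_s <= 0 or not path:
        return []
    total_m = sum(length for _, length in path)
    speed_kmh = total_m / elapsed_s * 3.6  # metres per second to km/h
    results = []
    for index, (element_id, _) in enumerate(path):
        is_endpoint = index in (0, len(path) - 1)
        results.append((element_id, speed_kmh, 0.5 if is_endpoint else 1.0))
    return results


# Two reports 60 s apart covering 1,000 m across three road elements,
# which works out to a speed of about 60 km/h on each element.
estimates = interpolate_speeds(
    [("RE1", 200.0), ("RE2", 600.0), ("RE3", 200.0)], 60.0
)
```

Lower confidence at the endpoints reflects that the vehicle only covered part of those road elements, so its speed says less about them.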

Streaming Traffic Flow Analytics
As illustrated below, the analytics pipeline calculates 15-, 5-, 4-, 3-, 2- and 1-minute moving average speeds for each road element. Each road element is color coded based on the 15-minute moving average speed. The results are streamed to a Google Earth display.

-- Vehicle counts and confidence-weighted average speeds per road element,
-- computed over trailing 1-, 2-, 3-, 4-, 5- and 15-minute windows.
CREATE OR REPLACE VIEW "EstimatedReSpeeds" AS
SELECT STREAM "RE", "reID", "Carriageway", "rePrescribed", "reSpeedLimit",
  -- number of vehicle reports in each window
  SUM("reVehicles") OVER "last1" AS "reVehiclesLast1",
  SUM("reVehicles") OVER "last2" AS "reVehiclesLast2",
  SUM("reVehicles") OVER "last3" AS "reVehiclesLast3",
  SUM("reVehicles") OVER "last4" AS "reVehiclesLast4",
  SUM("reVehicles") OVER "last5" AS "reVehiclesLast5",
  SUM("reVehicles") OVER "last15" AS "reVehiclesLast15",
  -- weighted average speed = SUM(speed * confidence) / SUM(confidence)
  SUM("reSpeed" * "reConfidence") OVER "last1" /
  SUM("reConfidence") OVER "last1" AS "reSpeedLast1",
  SUM("reSpeed" * "reConfidence") OVER "last2" /
  SUM("reConfidence") OVER "last2" AS "reSpeedLast2",
  SUM("reSpeed" * "reConfidence") OVER "last3" /
  SUM("reConfidence") OVER "last3" AS "reSpeedLast3",
  SUM("reSpeed" * "reConfidence") OVER "last4" /
  SUM("reConfidence") OVER "last4" AS "reSpeedLast4",
  SUM("reSpeed" * "reConfidence") OVER "last5" /
  SUM("reConfidence") OVER "last5" AS "reSpeedLast5",
  SUM("reSpeed" * "reConfidence") OVER "last15" /
  SUM("reConfidence") OVER "last15" AS "reSpeedLast15"
FROM "Stage3"
WINDOW "last1" AS (PARTITION BY "RE"
  RANGE INTERVAL '1' MINUTE PRECEDING),
     "last2" AS (PARTITION BY "RE"
  RANGE INTERVAL '2' MINUTE PRECEDING),
     "last3" AS (PARTITION BY "RE"
  RANGE INTERVAL '3' MINUTE PRECEDING),
     "last4" AS (PARTITION BY "RE"
  RANGE INTERVAL '4' MINUTE PRECEDING),
     "last5" AS (PARTITION BY "RE"
  RANGE INTERVAL '5' MINUTE PRECEDING),
     "last15" AS (PARTITION BY "RE"
  RANGE INTERVAL '15' MINUTE PRECEDING);
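For readers less familiar with windowed SQL, the Python sketch below models what one of these windows computes for a single road element: a confidence-weighted average speed over a trailing time range. The `WeightedWindow` class is a hypothetical teaching aid rather than part of SQLstream, and it assumes rows arrive in time order.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WeightedWindow:
    """Confidence-weighted average speed over a trailing time window."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        # Each row is (time in seconds, speed, confidence).
        self.rows: Deque[Tuple[float, float, float]] = deque()

    def add(self, time_s: float, speed: float, confidence: float) -> Optional[float]:
        """Add a row and return SUM(speed * confidence) / SUM(confidence)."""
        self.rows.append((time_s, speed, confidence))
        # Evict rows that have fallen out of the window ending at time_s.
        while self.rows and self.rows[0][0] < time_s - self.window_s:
            self.rows.popleft()
        total_confidence = sum(c for _, _, c in self.rows)
        if total_confidence == 0:
            return None  # no usable rows, so no average
        return sum(s * c for _, s, c in self.rows) / total_confidence


# One window per road element, mirroring PARTITION BY "RE".
last1 = WeightedWindow(window_s=60.0)
first = last1.add(0.0, 60.0, 1.0)    # only one row in the window
second = last1.add(30.0, 30.0, 0.5)  # both rows still inside 60 s
third = last1.add(90.0, 40.0, 1.0)   # the row at t=0 has expired
```

Weighting by confidence means a low-confidence estimate pulls the average less than a high-confidence one, which is the purpose of the SUM ratio in the view.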

Detecting the rapid onset of congestion
Congestion is detected by comparing the moving average for a larger time window with that for a smaller one, for example a 2-minute average with a 1-minute average. The rule below chains this comparison across the 1- to 5-minute windows:

CREATE OR REPLACE VIEW "CongestionRule1" AS
SELECT STREAM
  -- name, ID, highway name, speed limit etc. for each road element
  "RE", "reID", "Carriageway", "rePrescribed", "reSpeedLimit",
  -- volume of vehicle reports in each time window
  "reVehiclesLast1", "reVehiclesLast2", "reVehiclesLast3",
  "reVehiclesLast4", "reVehiclesLast5", "reVehiclesLast15",
  -- estimated avg speed for each road element
  "reSpeedLast1", "reSpeedLast2", "reSpeedLast3",
  "reSpeedLast4", "reSpeedLast5", "reSpeedLast15"
FROM "EstimatedReSpeeds"
WHERE "reSpeedLast1" < 0.80 * "reSpeedLast2" AND -- slowdown by 20%
  "reSpeedLast2" < 0.80 * "reSpeedLast3" AND
  "reSpeedLast3" < 0.80 * "reSpeedLast4" AND
  "reSpeedLast4" < 0.80 * "reSpeedLast5";

SQLstream Traffic Congestion Detection - Visualization

Note that these estimated speeds are computed over overlapping windows, so the slowdown thresholds are set accordingly.
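A small worked example shows why the overlap matters. Suppose, as a simplifying assumption, that the latest minute and the minute before it carry equal weight, so the 2-minute average is the plain mean of the two per-minute speeds. The condition 1-minute average < 0.80 × 2-minute average then reduces to the latest minute being below two thirds of the previous one, a much bigger drop than 20%. The Python sketch below checks this numerically; the function names are illustrative only.

```python
from typing import Tuple


def overlapping_averages(prev_minute: float, last_minute: float) -> Tuple[float, float]:
    """Return (1-minute average, 2-minute average) for two per-minute speeds.

    ASSUMPTION: both minutes carry equal total confidence, so the 2-minute
    average is the plain mean of the two per-minute speeds.
    """
    return last_minute, (prev_minute + last_minute) / 2.0


def rule_fires(prev_minute: float, last_minute: float, threshold: float = 0.80) -> bool:
    """True when the 1-minute average is below threshold * 2-minute average."""
    avg1, avg2 = overlapping_averages(prev_minute, last_minute)
    return avg1 < threshold * avg2


# With 60 km/h in the previous minute, the break-even point is 40 km/h.
fires_at_48 = rule_fires(60.0, 48.0)  # a 20% minute-over-minute drop
fires_at_41 = rule_fires(60.0, 41.0)  # just above two thirds of 60
fires_at_39 = rule_fires(60.0, 39.0)  # just below two thirds of 60
```

Under these assumptions only the last case fires, so a 20% threshold on overlapping averages corresponds to a drop of roughly a third between consecutive minutes.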

Fine-tuning the slowdown thresholds and taking into account other information, such as the proximity of traffic lights and the volume of vehicle reports in each time window, improves the quality of the congestion detection algorithm.
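One way to picture this tuning is to treat the threshold and a minimum report volume as parameters of the rule. The Python sketch below does that for the cascaded 1- to 5-minute comparison; `min_reports` and its default value are hypothetical knobs introduced here for illustration, not settings taken from the deployed application.

```python
from typing import Sequence


def onset_detected(
    speeds: Sequence[float],
    vehicle_counts: Sequence[int],
    threshold: float = 0.80,
    min_reports: int = 3,
) -> bool:
    """Apply the cascaded slowdown rule to the 1- to 5-minute averages.

    speeds[i] and vehicle_counts[i] describe the (i + 1)-minute window.
    min_reports is a hypothetical tuning knob: windows with too few vehicle
    reports are treated as unreliable and suppress the alert.
    """
    if len(speeds) != 5 or len(vehicle_counts) != 5:
        raise ValueError("expected values for the 1- to 5-minute windows")
    if any(count < min_reports for count in vehicle_counts):
        return False
    # Each shorter window must be slower than threshold * the next longer one.
    return all(speeds[i] < threshold * speeds[i + 1] for i in range(4))


slowing = onset_detected([20.0, 30.0, 40.0, 55.0, 70.0], [5, 9, 14, 18, 22])
steady = onset_detected([50.0, 52.0, 54.0, 55.0, 56.0], [5, 9, 14, 18, 22])
sparse = onset_detected([20.0, 30.0, 40.0, 55.0, 70.0], [1, 2, 4, 6, 9])
```

Raising `threshold` makes the rule more sensitive, while raising `min_reports` trades sensitivity for fewer false alarms on lightly travelled road elements.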

The Google Earth screenshot illustrates the real-time traffic view, with detected slowdowns shown as pins. The severity of each slowdown is indicated by different shades of red.

November 8, 2010

PostgreSQL Conference: West 2010

Attendance at the PostgreSQL West 2010 Conference was encouraging, considering a million people had gathered in the city to celebrate the World Series victory of the San Francisco Giants.

I’ve posted the presentation from the event (previously blogged about here).

We presented the concepts of Streaming GIS, integrating SQLstream’s real-time streaming data analytics with the PostgreSQL-based Geographic Information Systems (GIS) engine PostGIS. With examples from SQLstream’s commercial traffic congestion monitoring application, we discussed how sophisticated, high-performance real-time geospatial applications can be delivered quickly and easily using standards-based SQL.

Streaming GIS using PostGIS and SQLstream at PostgreSQL West 2010

 

 


Vice President Marketing
October 26, 2010

A streaming SQL query is a continuous, standing query that executes over streaming data. Data streams are processed using familiar SQL relational operators augmented to handle time sensitive data. Streaming queries are similar to database queries in how they analyze data; they differ by operating continuously on data as they arrive and by updating results in real-time.
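The contrast can be sketched in Python, with a generator standing in for the standing query. This is an analogy only, under the assumption of a simple per-key average; it is not how a streaming SQL engine is implemented, and the function names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def running_average(events: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Standing query: emit an updated per-key average after every event."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for key, value in events:
        totals[key] += value
        counts[key] += 1
        yield key, totals[key] / counts[key]


def one_shot_average(stored: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Database-style query: scan stored rows once and return a final answer."""
    results: Dict[str, float] = {}
    for key, average in running_average(stored):
        results[key] = average  # keep only the last value per key
    return results


events = [("a", 10.0), ("b", 4.0), ("a", 20.0)]
updates = list(running_average(events))  # one result per arriving event
final = one_shot_average(events)         # a single answer at the end
```

The analysis is the same in both functions; the difference is that the streaming form produces an answer as each event arrives instead of waiting for a query to be run.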

Streaming SQL queries process dynamic, flowing data, in contrast to traditional RDBMSs, which process static, stored data with repeated single-shot queries. Streaming SQL is simple to configure using existing IT skills, dramatically reducing integration cost and complexity. Combining the intuitive power of SQL with this simplicity of configuration enables much faster implementation of business ideas, while retaining the scalability and investment protection important for business-critical systems.

By processing transactions continuously, streaming SQL directly addresses the real-time business needs for low latency, high volume, and rapid integration. Complex, time-sensitive transformations and analytics, operating continuously across multiple input data sources, are simple to configure and generate streaming-analytics answers as input data arrive. Sources can include any application inputs or outputs, or any of the data feeds processed or generated within an enterprise. Examples include financial trading data, internet clickstream data, sensor data, and exception events. SQL can process multiple input and output streams of data, for multiple publishers and subscribers. To learn more about Streaming SQL, please read our “Concepts in Streaming SQL” mini-white paper.

Posted under Streaming SQL

Vice President Marketing
October 4, 2010

Businesses need to respond faster than ever to customer information and demands, which are arriving in rapidly increasing volumes from ever more diverse and distributed systems. This need for real-time business models cannot be addressed by traditional integration and business intelligence solutions because streaming analytics and related concepts are central to the solution. The real-time model means responding immediately to new information as it arrives, and streaming analytics is at the core of these next-generation IT systems.

Increasing the speed of business under these pressures of rapidly increasing data volume and more diverse data sources has been expensive and complex. Rapid responsiveness has proved elusive because real-time needs simply cannot be met by delivering more information faster from historical data. Real-time businesses require distributed technology that provides low latency and high-performance processing of data and event streams. By using continuous, streaming SQL queries, business answers can be generated as soon as input data becomes available. Whereas databases query historical data, streaming SQL queries and transforms data on the wire without any prior staging in a database.

As a result, streaming SQL is complementary to traditional EAI, business intelligence, and data warehousing solutions. By completing real-time processing and analysis before storing the data, streaming SQL reduces the cost of processing rapidly arriving data. Even better, streaming SQL makes existing, in-house SQL skills immediately applicable to real-time analysis, reducing integration time and costs.

To learn more about the Business Case for Streaming SQL, please read our “Concepts in Streaming SQL” mini-white paper.

Posted under Streaming SQL
July 29, 2010

Railroads have used track side readers to scan bar codes on the sides of freight cars since the 1970s. Such sensors provided real time tracking of goods as they made their way from the supplier to the delivery point. Retail businesses increased the use of RFID tags in the past 20 years to track goods through the manufacturing process. Since the Indian Ocean tsunami of December 2004 the public has become aware of deep water pressure sensors which sit on the ocean floor to detect tsunamis and are intended to generate warnings about potential disasters.

The cost of sensors has decreased significantly in recent years, and as a result inexpensive sensors are present nearly everywhere in businesses. As the price of sensors decreases, it becomes economically feasible to deploy thousands and even millions of sensors. Such sensors cumulatively generate huge volumes of data. Imagine placing a sensor capable of measuring temperature, humidity, sunlight and air pressure within each square kilometer of the state of Iowa to assist farmers in managing crop production. Now imagine each of those 145,743 sensors generating 100 bytes of data every minute, resulting in a data volume of nearly 21 GB per day.
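The arithmetic behind that daily figure is easy to check. The short Python calculation below uses only the numbers quoted above and assumes decimal gigabytes (1 GB = 1,000,000,000 bytes).

```python
SENSORS = 145_743            # one per square kilometer of Iowa, as above
BYTES_PER_READING = 100
READINGS_PER_DAY = 24 * 60   # one reading per minute

bytes_per_day = SENSORS * BYTES_PER_READING * READINGS_PER_DAY
gigabytes_per_day = bytes_per_day / 1e9  # decimal gigabytes

print(f"{bytes_per_day:,} bytes per day, about {gigabytes_per_day:.2f} GB")
```

That is 20,986,992,000 bytes per day, which rounds to the "nearly 21 GB" quoted above.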

There is much buzz about Big Data and the challenges of applying traditional database management tools to extract business value from such data. Fortunately, there is a better way – integrating real time data, as provided by sensors, with stream analytic processing, allows timely enterprise decisions in response to changing conditions.

I urge you to read Damian Black’s recent postings on this blog describing the SQLstream approach to “Big Data”.



CEO
July 1, 2010

GigaOM Structure 2010 Big Data and Cloud Computing

Last week I was on a panel for “Big Data” at Structure2010, a GigaOm event. As usual, it was very well run and there was a large throng of Silicon Valley luminaries ranging from entrepreneurs to venture capitalists, scattered in with some large customers and users of technology. We clearly have moved on a long way from the days when I was told to change my slides and remove the cloud graphic and replace it with a box because “clouds are cloudy” (direct quotation from a tier one venture capitalist; I wish to protect his identity to avoid personal embarrassment).

SQLstream is already the market leader in applying stream computing to Intelligent Transportation Systems, and we also have the opportunity to have a similar impact on the Cloud Computing Service Monitoring space. It seems we have exactly the right solution to provide real-time insights into service usage, bottlenecks, error rates and service level compliance. And you can add regulatory compliance to that list too, with continuous alerting to complement the excellent historical solutions that are out there.

From the presentations at the show, it is clear that Cloud Computing has truly come of age. SQLstream uses cloud services for all demonstrations and also in our QA and Engineering processes. We also have customers deploying in the cloud. The latest emerging cloud solutions fill in many of the former technology gaps, allowing seamless integration into or transition from traditional data centers. You can even run your own private clouds leveraging the same APIs available on the public clouds.

On the Big Data front, on the panel alongside SQLstream were a Hadoop vendor and a high-performance column store data warehouse vendor. The other two panelists were users of “big data” technologies. It was interesting to discover that we already had two implementations where SQLstream operates in concert with or in parallel with the other two panelist vendors’ technologies.

There is even a customer (Mozilla) that uses all three technology approaches for download analytics: Hadoop in the form of HBase and a column store data warehouse for historical SQL queries over downloads, and SQLstream to generate high-performance continuous real-time analytics and reporting on download statistics for all versions of Firefox. This clearly demonstrates that there is a role for each of the Big Data technologies highlighted on the panel, and an interesting and growing market opportunity. It also indicates some clear partnership opportunities.

I look forward to seeing the developments in our space and in cloud computing over the coming year and hope to be invited back again soon. We were originally present on the Big Data panel at GigaOm’s inaugural Structure2008 event, so I guess we should be set for a reappearance at Structure2012?! If so, I am sure we will have some exciting new stories to share.

Here is a link to the video recording of the panel session. A big thank-you to Phil Hendrix for his excellent moderation of the panel and the professional preparation work he did beforehand so that the actual event went smoothly.

Posted under Big Data