Blog

April 8, 2014

SAN FRANCISCO, CA – April 3, 2014 – SQLstream, Inc., leaders in powering next-generation smart services, today announced the certification of its s-Server stream processor on Cloudera Enterprise 5. The certification couples the real-time power of SQLstream’s stream processor with the enterprise data hub capabilities of Cloudera’s platform for Big Data.

SQLstream s-Server brings SQL standards-compliant real-time stream processing to Cloudera’s next-generation Hadoop-powered platform.

“We see value in having SQLstream s-Server 4.0 certified on Cloudera Enterprise 5. Enterprises that deploy enterprise data hubs in their environments, and need SQLstream’s stream processing capabilities, will be able to take advantage of the latest innovations from both companies,” said Tim Stevens, vice president of Business and Corporate Development at Cloudera.

The certification covers the s-Server streaming connector for Hadoop, a high-performance adapter that automatically creates the MapReduce jobs required to load streaming data into Hadoop and to retrieve and replay stored streams. Data streams are automatically partitioned and distributed across replicated nodes in the Cloudera Enterprise Hadoop cluster, and can be queried using tools such as Hive.
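As a simple illustration of the replay side: once the connector has landed a stream in the cluster, the archived records can be queried with ordinary HiveQL. The table, column and partition names below are hypothetical, not taken from the product documentation.

    -- Hypothetical Hive table populated by the streaming connector,
    -- partitioned by ingest date so queries can prune to a single day.
    SELECT sensor_id,
           COUNT(*)         AS readings,
           AVG(temperature) AS avg_temp
    FROM   sensor_readings_archive
    WHERE  ingest_date = '2014-04-03'
    GROUP  BY sensor_id;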

“We are proud to work with Cloudera on this exciting initiative,” said Damian Black, SQLstream CEO. “Users of Cloudera Enterprise 5 who wish to power real-time smart services at massive scale will get those benefits from our s-Server while retaining all the benefits of a Cloudera-powered Enterprise Data Hub.”

SQLstream’s stream processor enables live data streams to be queried using standards-compliant, continuous SQL at rates of over 500,000 records/second per core, and to ingest data at hundreds of MB/second. For Cloudera customers, the certification of s-Server provides the infrastructure to automatically stream machine data and analytics into Hadoop at ingest rates that are significantly higher than any other interface; moreover, stored data can be joined with real-time streams or re-streamed for fast-forward analysis.
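For a flavor of what a continuous query looks like, the sketch below computes a one-minute moving average over a live stream. The stream and column names are hypothetical, and the SELECT STREAM windowing syntax is indicative of streaming SQL dialects rather than a verbatim s-Server example.

    -- Runs continuously, emitting one output row per input record with a
    -- one-minute sliding average of CPU load per host.
    SELECT STREAM
        ROWTIME,
        host,
        AVG(cpu_load) OVER (
            PARTITION BY host
            ORDER BY ROWTIME
            RANGE INTERVAL '1' MINUTE PRECEDING
        ) AS avg_cpu_1min
    FROM MachineDataStream;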

ABOUT SQLSTREAM
SQLstream Stream Processors power smart services and other real-time business processes using analytics from unstructured sensor, log, and other machine-generated Big Data streams. SQLstream’s core engine, s-Server, is a distributed, streaming SQL platform for real-time operational intelligence applications, fully integrated with Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

Posted under Press Releases
March 31, 2014

SQLstream’s Stream Processor to Deliver the Next Generation of Real-time Gaming Analytics  

SAN FRANCISCO, CA – April 1, 2014 – SQLstream Inc., the Big Data Stream Processors company, today announced that The Workshop has selected SQLstream for real-time gaming analytics. The Workshop, a leading US gaming studio and creator of the internationally acclaimed game Sorcery, selected SQLstream to improve game performance and monetization by delivering real-time, actionable intelligence from live streams of game events and platform performance log data.

SQLstream was chosen for its capacity to capture, analyze and visualize in real-time the high volume of unstructured event data generated by server logs, game events and player activity logs.

“SQLstream will enable us to measure and understand how players are interacting with our games in real-time and to respond quickly to their needs,” said Charles Tolman, CTO of The Workshop. “SQLstream gives us the real-time processing power required, plus the flexibility to deploy new and sophisticated game analytics quickly.”

Analytics platforms in the gaming industry have not kept pace with data volumes and the increasing requirement for real-time optimization of player experience and platform monetization. Analytics tools based on relational database technology remain widespread. The increasing adoption of Hadoop and Big Data NoSQL platforms has helped with capturing and analyzing unstructured data; however, neither RDBMS nor Hadoop storage platforms can process high-volume data streams while meeting the low-latency requirements and sophistication of analytics needed to power today’s games. Real-time gaming analytics for modern gaming platforms requires a stream processor.

Learn more about SQLstream’s stream processing solutions.

About The Workshop

Founded in 2007, The Workshop is a gathering of experienced industry vets intent on making games in a new way.  With more than 60 years of game development between them, founders Peter Akemann, Chris Busse and Charles Tolman believe that efficient management and engineering practices, along with strategic use of external technology solutions and co-development partnerships, allow them to maintain a tight-knit creative team that can focus on delivering world-class gameplay.  The Workshop has shipped multiple high-quality game products, including the highly acclaimed Sorcery for the PlayStation® Move as well as the Borderlands 2 DLC known as Mr. Torgue’s Campaign of Carnage.  For more information, visit http://theworkshop.us.com.

About SQLstream

SQLstream’s Stream Processors automate business processes using real-time analytics from unstructured sensor, log, and other machine-generated Big Data streams. SQLstream’s core stream processing engine, SQLstream s-Server, is a distributed, streaming SQL platform for real-time operational intelligence applications.  SQLstream offers a full range of machine data integration agents plus integration with Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

Media contact

Ronnie Beggs, VP Marketing

Email: pr@sqlstream.com

Phone: +1 415 758 8342

Website: www.sqlstream.com

Posted under Customer win · Press Releases
March 21, 2014

The Internet of Things has been on everyone’s lips (and pen) lately, as the biggest new source of money, customer product innovation, investment opportunities, and Sci-Fi-worthy stories. The truth is that the value of the Internet of Things only comes from the astounding mass of data it’s bound to produce, and there will be no money made until the data-processing market is ready to take on the challenge.

VentureBeat publishes a SQLstream article dissecting the main requirement for harvesting the full potential of the Internet of Things.

Read full article here

Posted under Impressions · In the News · Market Views
March 20, 2014
451 Research data platforms map, 2014

Analyst firm 451 Research publishes its long-awaited data management platform map and introduces areas like log management, in-memory and—finally!—stream management.

Posted under Analyst review · In the News
February 26, 2014

What goes around surely comes back around, which in the world of data is often called lifecycle management. Very few organizations have ever formalized and implemented such a grandiose practice; only recently has the concept become seriously doable without great expense.

Eric Kavanagh of Inside Analysis writes about a new, much more granular version of lifecycle management circulating these days, described by Dr. Robin Bloor as Data Flow, and lists SQLstream as one of the few innovative vendors focused on real-time stream processing.

 Click here to read article

Posted under Analyst review · In the News
February 20, 2014

APIs are creating a new age of intelligent machine-to-machine (iM2M) analytics apps aided by SaaS. M2M data is everywhere, but individuals don’t want the flood of data — just the pint they need when they thirst for it. Distinguished market researcher Kishore Jethanandani talks about how grossly underutilized data, often tucked away in some obscure database and accessible with APIs on the cloud, can be used to solve the problem, and lists SQLstream as one of the companies that found appropriate solutions by connecting the data dots.

Click here to read article

Posted under Analyst review · In the News · Market Views

Vice President Marketing

It was evident at the O’Reilly Strata Conference, Feb 11th through 13th, that stream processing is now a mainstream Big Data technology. In fact, judging by the level of interest, stream processing for real-time analytics over machine data is right up there as the main Big Data trend in 2014.

The tagline on our booth, “Big Data Stream Processors,” caused many of those walking by to stop and ask for more detail. We asked them to first describe their situation and needs, and we heard many stories about how data volumes, particularly machine data, are growing rapidly and that the time interval to react to the data keeps getting shorter. A constant theme was: “we can handle it now, barely, but it is clear that we are going to have to do something differently soon.”

Music to our ears, because that’s where a stream processor comes in. Attendees at Strata are generally up on the latest Big Data developments, and for the first time we didn’t need to spend any time defining streams and explaining how they are not just data coming from a database. Instead, attendees knew they have, or soon will have, Big Data streams, and wanted to know how a stream processor could help them solve their problems.

Understanding Stream Processors

A stream processor sits in the data path between the sources of machine data and the backing stores, so no rip and replace is required. Incoming data streams are analyzed and can be acted upon without delay; sub-second response times are easy for a SQLstream stream processor, even at tens or hundreds of thousands, or even millions, of events per second. The incoming streams can be forked off and stored raw in whatever persistent storage is already in use. The processed streams can be stored as well for later restreaming (restreaming is the fast-forward analysis of stored time-series data).
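A rough sketch of that forking pattern, expressed as two continuous queries; the stream names are hypothetical and the syntax is indicative of streaming SQL rather than verbatim product syntax. The raw feed is archived untouched for later restreaming while a filtered view is acted on in flight.

    -- Fork 1: archive the raw feed unchanged in existing persistent storage.
    INSERT INTO RawEventArchive
    SELECT STREAM * FROM IncomingEvents;

    -- Fork 2: act on the interesting subset without delay.
    INSERT INTO HighTemperatureAlerts
    SELECT STREAM ROWTIME, device_id, temperature
    FROM IncomingEvents
    WHERE temperature > 90;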

“Sounds exactly like what we need,” we often heard, “but I’m not sure I’ve got time to ramp up on a new system.” We explained our use of ANSI-standard SQL in a high velocity streaming environment (not a big surprise given our name) and they realized that their existing technical resources could be comfortable supporting stream processing of their high volume, velocity and variety Big Data streams.

Streaming SQL for Storm

Strata attendees asked us about Apache Storm, and we explained our support for Storm installations, bringing higher performance and ease-of-use to the popular platform. While Storm isn’t necessary to use SQLstream, we believe that SQLstream ensures Storm implementations reach their full potential – faster performance, less hardware, dynamic updates and faster time to value. Read about the SQLstream Stream Processor for Storm here.

Announcing SQLstream 4.0

Existing users of SQLstream were happy to hear that we introduced version 4.0 at the show, liking very much improvements in performance, visualization capability, and interoperability. SQLstream 4.0 also includes our APIs for Storm integration and a new HTML5 real-time dashboarding tool. Read more about SQLstream 4.0 here, or download and try out s-Server 4.0 here.

Strata turned out to be a great show. It appears very clear that the future SQLstream has long imagined, one where streams of data are used intelligently to drive business value in real time, is starting to come true.

Posted under Stream Processors
February 18, 2014

With O’Reilly Media’s Strata conference opening one week ago, a lot of big data companies announced new products and looked for the spotlight. In this review published by CNN Money, distinguished analyst Derrick Harris of GigaOM goes through the most notable events in Big Data and lists SQLstream as one of the new offerings for Storm integration.

Click here to read article

Posted under Analyst review · In the News · Product

Vice President Marketing
February 13, 2014

SQLstream 4.0 brings unparalleled real-time performance, Apache Storm integration, enhanced real-time visualization and new industry-specific StreamApps toolkits to the leading standards-based Big Data stream processor.

SAN FRANCISCO, CA – 11 February 2014 – SQLstream, Inc., the Big Data Stream Processors company, today announced the availability of SQLstream 4.0, the new generation of its flagship Big Data stream processing product suite.

SQLstream 4.0 includes the latest s-Server Stream Processor, the fastest and most scalable release of the company’s renowned product, and introduces new tools for real-time visualization of high-velocity data streams, StreamApps toolkits for rapid implementation of operational intelligence solutions, and support for the Apache Storm project, the open source distributed stream computing framework.

“Organizations are telling us that to remain competitive, they need business processes that respond in real time to machine data,” said Damian Black, CEO of SQLstream. “Traditional Big Data systems based on store-then-process technology don’t work in real time. Our SQLstream 4.0 Stream Processors help organizations automate the use of vast amounts of log and sensor data to drive responsive business processes.”

WHAT’S NEW IN SQLSTREAM 4.0
Market-leading performance: SQLstream’s s-Server 4.0 Stream Processor sets the industry standard for streaming data throughput performance. SQLstream s-Server 4.0 has been benchmarked at 500,000 records per second per core, representing the lowest total cost of performance for the million records-per-second requirements of today’s cybersecurity, telecommunications, and Internet of Things applications.

Real-time dashboards: SQLstream 4.0 introduces s-Visualizer, a visualization platform for enterprise power users who wish to build drag-and-drop, real-time dashboards over streaming analytics. s-Visualizer accepts high volume push-based streams and updates continuously without having to store data and refresh.

Stream Processor for Storm: A new SQLstream Stream Processor for Storm enables continuous SQL queries to be deployed seamlessly as Storm processing nodes (known as Bolts) in Storm topologies. For organizations considering Storm implementations, SQLstream’s Stream Processor for Storm offers a significant reduction in hardware costs, improved platform manageability, and with real-time analytics in streaming SQL, a dramatic reduction in time to value.

Accelerator kits for Operational Intelligence applications: SQLstream StreamApps are quick-start kits for real-time operational intelligence applications, enabling organizations to deploy real-time solutions quickly and easily. Each StreamApp contains a set of standard components for machine data collection, streaming analytics and real-time dashboards. StreamApps launched with the release of SQLstream 4.0 include the StreamApp for telecommunications CDR/IPDR analytics with further StreamApps scheduled for release throughout 2014.

DOWNLOAD AND TRY FOR FREE
For more information about SQLstream s-Server 4.0 and its new features, and to download a free trial, visit www.sqlstream.com/whats-new-in-4.

ABOUT SQLSTREAM
SQLstream’s Stream Processors automate business processes using real-time analytics from unstructured sensor, log, and other machine-generated Big Data streams. SQLstream’s core stream processing engine, SQLstream s-Server, is a distributed, streaming SQL platform for real-time operational intelligence applications. SQLstream offers a full range of machine data integration agents plus integration with Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

Posted under Press Releases

Vice President Marketing

SQLstream’s Stream Processor brings real-time streaming SQL analytics for machine data to Apache Storm.

SAN FRANCISCO, CA – 11 February 2014 – SQLstream, Inc., the Big Data Stream Processors company, today announced the availability of its Stream Processor for Apache Storm, the open source distributed stream computing framework. The release enables the real-time analysis of machine data in Storm using continuous SQL queries. Organizations benefit from the improved performance, operational flexibility and reduced management costs of using standards-compliant SQL for the real-time analysis and integration of streaming data.

Storm is a distributed stream-processing framework available under the Apache Open Source license. The primary focus of the Storm project is its distributed infrastructure, rather than the real-time analytics or connectors for machine data sources and enterprise storage platforms. That is where SQLstream comes in. SQLstream’s Stream Processor for Storm enables Storm implementations to utilize continuous SQL queries for streaming analytics.

“We are proud to enhance Storm with SQLstream’s capabilities,” said Damian Black, SQLstream CEO. “Now Storm users can benefit from the analytic power of continuous SQL queries and will also enjoy an order of magnitude improvement both in stream processing performance and in reduction of hardware costs over standalone Storm implementations.”

SQLstream’s Stream Processor for Storm enables SQLstream s-Server continuous SQL queries to be deployed as Storm Bolts (stream processing nodes) in a distributed Storm system. SQLstream’s s-Server stream processor can also be deployed as Storm Spouts (machine data sources), utilizing SQLstream’s agents and adapters for real-time machine data collection and integration.
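
To make the Bolt-as-SQL idea concrete, the hypothetical query below expresses the kind of per-minute rollup that would otherwise be hand-coded as a Java Bolt. The stream name, columns and window syntax are illustrative only, not taken from the product documentation.

    -- Tumbling one-minute window: count clicks per URL, emitted once per minute.
    -- Deployed as a Bolt, this declarative logic replaces a hand-written Java class.
    SELECT STREAM
        FLOOR(ClickStream.ROWTIME TO MINUTE) AS window_start,
        url,
        COUNT(*) AS clicks
    FROM ClickStream
    GROUP BY FLOOR(ClickStream.ROWTIME TO MINUTE), url;
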
Organizations with an existing Storm deployment, or considering a future Storm implementation, will benefit from:

  • Greater performance throughput on significantly less hardware. Benchmarks indicate that SQLstream’s Stream Processors on Storm deployments can achieve a 10x reduction in hardware for the same processing throughput performance.
  • Faster time to value for real-time applications. Storm Bolts can now be written as declarative, standards-compliant continuous SQL queries. Real-time analytics and continuous integration applications can be deployed in a fraction of the time required to develop low-level Java Bolts, for example.
  • Dynamic updates to operational Storm systems. Unlike a Storm-only implementation, Spouts and Bolts implemented using continuous SQL queries can be updated and changed dynamically without having to stop, rebuild and redeploy the Storm topology.

DOWNLOAD A FREE TRIAL
The SQLstream API for Apache Storm is included in the latest SQLstream s-Server 4.0. For more information on SQLstream’s Stream Processor for Storm, visit http://www.sqlstream.com/storm, and to download a free trial of SQLstream s-Server 4.0, visit http://www.sqlstream.com/downloads.

ABOUT SQLSTREAM
SQLstream’s Stream Processors automate business processes using real-time analytics from unstructured sensor, log, and other machine-generated Big Data streams. SQLstream’s core stream processing engine, SQLstream s-Server, is a distributed, streaming SQL platform for real-time operational intelligence applications. SQLstream offers a full range of machine data integration agents plus integration with Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

Posted under Press Releases

Esteemed analyst James Taylor from Decision Management Solutions reviews the latest release of SQLstream’s flagship product, s-Server, in its 4.0 version.

The new release includes features such as unparalleled performance (recently benchmarked at 1.3M records/second), integration with Apache Storm, fast-start kits for application development, and the world’s first real-time visualization engine.

Click here to read full review

February 7, 2014

Maybe the last time you heard the word “parallel,” before all the Hadoop MapReduce ballyhoo, was in geometry classes at school. Nowadays, however, the word seems to pepper a good deal of tech talk. Remember when the words “scale out” suddenly became a feature of all new database products? That was a parallelism thing. And if you investigate it, you’ll find it was pre-Hadoop. It all began in earnest when x86 CPUs began having multiple cores in 2004.

Esteemed analyst Robin Bloor describes his group’s view of the whole IT industry, which “willy-nilly is now going parallel”, and includes SQLstream as one of the top 10 products exploiting parallelism.

Click here to read full article

Posted under Analyst review · In the News
January 27, 2014

With SQLstream’s s-Server being named a trend-setting product for 2014, the industry is offering comments.

Click here to read what technology industry analyst Jeff Kagan had to say.

Posted under Analyst review · In the News
January 9, 2014

January 14 Webinar with EMA Industry Expert, SQLstream CEO, and InfoArmor CTO Will Share Expert Insights on How Enterprises Can Extract Value from Machine Data with Streaming Analytics

SAN FRANCISCO, CA – January 8, 2014 – SQLstream Inc., the streaming Big Data platform for real-time operational intelligence, today announced that CEO Damian Black of SQLstream, CTO Christian Lees of InfoArmor and EMA Senior Analyst John L. Myers will participate in a Bloor Briefing Room webinar on January 14 to discuss the case for streaming analytics.

Eric Kavanagh, CEO of the Bloor Group, will host the webinar, in which attendees will learn about the changing dynamics of modern analytics and its impact on Big Data, Business and Operational Intelligence.

WHAT: No Time Like the Present- the Case for Streaming Analytics

WHEN: Tuesday, January 14, 2014 | 1 p.m. PST/4 p.m. EST | Program duration is one hour

REGISTRATION: registration is available online here

WHO: CEO Damian Black of SQLstream, CTO Christian Lees of InfoArmor, and Senior Analyst John L. Myers of Enterprise Management Associates.

In addition, Damian will discuss the technology advantages of streaming analytics for achieving real-time Operational Intelligence, and will expand on the SQLstream SQL standards-compliant approach to streaming analytics solutions. Christian Lees, CTO of InfoArmor, will join Damian. As a SQLstream customer, Christian will present his company’s use of streaming analytics for Internet Surveillance and Security Intelligence.

 

About The Bloor Group
The Bloor Group provides detailed analysis of today’s enterprise software industry. Co-founded by Dr. Robin Bloor and Eric Kavanagh, the company leverages Web-based multimedia to deliver vendor-neutral education that is designed to reveal the essential characteristics of information technologies. The Bloor Group galvanizes the industry’s independent analysts to provide valuable insights via innovative Webcasts, research programs and white papers. More information available at http://www.insideanalysis.com and http://www.databaserevolution.com.

About SQLstream

SQLstream, Inc. (http://www.sqlstream.com) empowers businesses to perform real-time analytics and extract operational intelligence from unstructured log file, sensor and other machine-generated Big Data. The SQLstream streaming Big Data engine is a distributed, 100% SQL platform for real-time data management and operational intelligence applications. SQLstream offers a full range of integration adapters, including log files, support for Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

About InfoArmor


InfoArmor is a leader in offering identity theft protection and privacy management solutions to businesses. By focusing on employee benefits, InfoArmor provides a turn-key, value-adding benefit with easy employer administration. Rather than solely focusing on credit fraud, InfoArmor uses industrial strength technologies to enable consumers and businesses alike to monitor and identify fraudulent activity. For more information, visit http://www.infoarmor.com.

Media contact

Ronnie Beggs, VP Marketing

Email: pr@sqlstream.com

Phone: +1 877 571 5775

Website: www.sqlstream.com

Posted under Analyst review · Events · Press Releases
December 16, 2013

San Francisco, CA – SQLstream, Inc., the leading standards-based streaming Big Data analytics platform, announced today that Database Trends and Applications (DBTA) has listed SQLstream as a trend-setting product in real-time Big Data management and analytics. The Trend-Setting Products in Data for 2014 list honors organizations with a commitment to technology innovation and whose breakthrough products are arming organizations with the tools to address their rapidly changing market requirements.

Read press release

Posted under Analyst review · Press Releases

Vice President Marketing
December 9, 2013

SQLstream StreamApps are fast-start templates for real-time streaming machine Big Data applications.  Each StreamApp is a library of components for a specific operational business process. In a Big Data industry typified by a lack of standards and high development costs, StreamApps takes SQLstream’s standards-based SQL platform for streaming operational intelligence to the next level of usability and enterprise adoption.

StreamApps are built on SQLstream’s core s-Server streaming analytics and machine data integration platform, and enable organizations to realize immediate value from their streaming log file, sensor, device and network data. This new concept in Big Data solutions offers both rapid time to revenue and application flexibility, with the lowest Total Cost of Big Data Performance of any Big Data technology. Each StreamApp contains a set of standard components, typically including:

  • Log, equipment and device adapters for real-time data collection,
  • A package of streaming analytics,
  • Streaming dashboards,
  • General components such as data conditioning, filtering and streaming aggregation.

The latest SQLstream StreamApp is for CDR/IPDR Analytics. In the telecommunications industry, Call Detail Records (CDRs) and IP Detail Records (IPDRs) capture information such as call duration, location and quality for every traditional voice and VoIP call. Processing CDR and IPDR records in real time offers real-time rating, customer experience monitoring, fraud detection and real-time policy control, enabling service providers to improve overall call quality, optimize call rating, improve operational efficiency and reduce fraudulent network usage. The latest release adds support for SIP trunking and SBC network equipment providers.
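For illustration, a continuous query over a CDR stream might flag unusually heavy premium-rate calling as it happens. The stream, columns and threshold below are hypothetical, and the syntax is indicative of streaming SQL rather than a verbatim StreamApp query.

    -- Emit an alert row whenever a subscriber exceeds 60 minutes of
    -- premium-rate calls within the trailing hour.
    SELECT STREAM ROWTIME, subscriber_id, premium_minutes_1h
    FROM (
        SELECT STREAM
            ROWTIME,
            subscriber_id,
            SUM(duration_minutes) OVER (
                PARTITION BY subscriber_id
                ORDER BY ROWTIME
                RANGE INTERVAL '1' HOUR PRECEDING
            ) AS premium_minutes_1h
        FROM CdrStream
        WHERE called_number_type = 'PREMIUM'
    ) AS windowed
    WHERE premium_minutes_1h > 60;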

The StreamApp for CDR Analytics has recently been deployed by Veracity Networks, enabling the company to address the demands of its expanding customer base and to deliver near-zero-latency solutions for real-time call rating, service quality and performance monitoring, and fraud detection from streaming CDR and IPDR record streams.

SQLstream was chosen for its capacity to dynamically scale for the rapid growth in call data, and for its ability to capture, process and integrate CDR and IPDR data from many different platforms and in different formats. Real-time integration with existing operational platforms was a key requirement, as well as the need for immediate insight through real-time dashboards and reports.

 

Posted under Product
December 6, 2013

 


SQLstream was named as one of the top trend-setting technologies for data management for 2014 by the Database Trends and Applications (DBTA) magazine.

Two of the most pressing concerns that organizations face today are the need to provide analytic access to newer data types such as machine generated data, documents and graphics, and the need to control the cost of information management for their growing data stores, according to the “Achieving Enterprise Data Performance—2013 Database Growth Survey” (source: DBTA). The study, sponsored by Oracle, found that for 31% of the 322 data managers and professionals in the respondent pool, the greatest challenge for the 12 months ahead was the ability to analyze “more and different kinds of data.” Another 30% said it was the ability to decrease information infrastructure costs.

While many organizations believe that their existing technology is capable of allowing them to benefit from Big Data, current solutions in place also may be in need of an overhaul. The “2013 Big Data Opportunities Survey” found that more than a third of the 304 respondents, 36%, need faster querying of data while another 43% want to get faster access to larger datasets.

To help organizations manage and extract insight from today’s data-propelled environments, new products, of which SQLstream’s suite is an example, are coming to the fore. Capabilities are being added to leading relational and MultiValue data management systems. In addition, cutting-edge technologies and techniques are emerging in the form of NoSQL and NewSQL data management systems, SaaS and cloud solutions, real-time BI, data visualization, and more.

This December, DBTA published its list of Trend-Setting Products for 2014 and included SQLstream as a tool to address rapidly changing market requirements and resolve part of the Big Data monetization problems we’re facing today.

Read more HERE.

Posted under Analyst review · In the News

CEO
November 28, 2013

Just back from Silicon Valley Comes to Oxford (SVCO), an invite-only event at the Said Business School, University of Oxford, in the UK. The aim of the event is to provide insight to Business School graduates on how to start, scale and run high-growth companies. The speakers included prominent entrepreneurs, innovators and investors from Silicon Valley, among them Phil Libin, CEO of Evernote, and Mike Olson, co-founder of Cloudera.

As well as running a masterclass on entrepreneurship in Silicon Valley, my main presentation to the business school addressed the mismatch between Big Data hype and reality. The talk, entitled “The Clash of Big Data and Consumer Expectation”, examined how the Big Data movement, as epitomized by Hadoop and NoSQL platforms, is failing to keep up with consumer expectations. In large part this is down to the solution cost and immaturity of Hadoop-based technologies, which, in the face of dramatically increasing data rates from telecoms and the IoT, are unable to deliver the real-time, low-latency performance required at any reasonable cost. We’re also talking about a world of sensors and other sources where much of the data is business as usual and does not need to be stored.

Many industries are implementing ‘real-time’ applications, although the reality struggles to make it past the marketing departments. For example, travel information remains woefully inadequate and inaccurate, real-time ad placements are still based on simplistic measures such as recent site visits, telecoms service providers are struggling with the new Big Data streams from 4G/IP location-based services, and micro-location services (such as consumer behavior in shopping malls) are still in their infancy.

There are two issues at play: the collection of many different types and formats of machine-generated data from applications, web servers and devices, and the ability of Big Data platforms to deliver real-time, low-latency responses.

Fast, ad-hoc queries over large volumes of stored data are not an issue for Big Data technology platforms, but delivering low-latency answers is either not possible or prohibitively expensive in terms of the hardware and software resources required. For example, most Hadoop and NoSQL solution vendors have a social media demo based on Twitter. Yet Twitter at its peak is around 10,000 tweets per second. Applications in telecoms, cybersecurity, telematics and transportation are increasingly in the area of several million records per second.

Therefore, delivering on consumer expectations requires a rethink: how to deliver automated, low-latency actions from multiple high-velocity data streams, often with poor-quality data, low information content and differing data formats. This is where streaming data management comes into play. Streaming in the past has often been associated with event processing or BPM-based middleware platforms, but has now moved on to the next generation of streaming platforms, capable of executing many applications on the same core platform and scaling to the data rates generated by IoT and telecoms applications.

Posted under Big Data · Events · Impressions · Streaming SQL
November 18, 2013
SAN FRANCISCO, CA – 14 November 2013 – SQLstream Inc., the streaming Big Data platform for real-time operational intelligence, today announced StreamApps, fast-start templates for real-time streaming machine Big Data applications.

 

Each StreamApp is a library of components for a specific operational business process. In a Big Data industry typified by a lack of standards and high development costs, StreamApps takes SQLstream’s standards-based SQL platform for streaming operational intelligence to the next level of usability and enterprise adoption.

“Many organizations want to deploy real-time streaming data applications but are concerned about the costs and skills required,” said Damian Black, SQLstream CEO. “The SQLstream approach is to simplify Big Data by using standards such as SQL, but also to give the option to employ a full solution platform such as SQLstream s-Server, or to benefit from the fast deployment option offered by StreamApps.”

The announcement of StreamApps is supported by the release of the latest SQLstream StreamApp for CDR/IPDR Analytics. In the telecommunications industry, Call Detail Records (CDRs) and IP Detail Records (IPDRs) capture information such as call duration, location and quality for every traditional voice and VoIP call. Processing CDR and IPDR records in real time offers real-time rating, customer experience monitoring, fraud detection and real-time policy control, enabling service providers to improve overall call quality, optimize call rating, improve operational efficiency and reduce fraudulent network usage. The latest release adds support for Sonus equipment and SIP trunking.

StreamApps are built on SQLstream’s core s-Server streaming analytics and machine data integration platform, and enable organizations to realize immediate value from their streaming log file, sensor, device and network data. This new concept in Big Data solutions offers both rapid time to revenue and application flexibility, with the lowest Total Cost of Big Data Performance of any Big Data technology. Each StreamApp contains a set of standard components, typically including:

  • Log, equipment and device adapters for real-time data collection,
  • A package of streaming analytics,
  • Streaming dashboards,
  • General components such as data conditioning, filtering and streaming aggregation.

StreamApps can be developed by SQLstream and its accredited partners. The roadmap for StreamApps availability addresses all major industries, including additional telecommunications StreamApps, intelligent transportation, Smart City platforms and cybersecurity applications.

Visit http://www.sqlstream.com/products/streamapps/ for more information.

About SQLstream

SQLstream, Inc. empowers businesses to perform real-time analytics and extract operational intelligence from unstructured log file, sensor and other machine-generated Big Data. The SQLstream streaming Big Data engine is a distributed, 100% SQL platform for real-time data management and operational intelligence applications. SQLstream offers a full range of integration adapters, including log files, support for Hadoop and other enterprise data management platforms. SQLstream is the recipient of leading industry awards, including the Ventana Research Technology Innovation Award for IT Analytics and Performance. SQLstream is based in San Francisco, CA.

Media contact:

Ronnie Beggs, VP Marketing

Email: pr@sqlstream.com

Phone: +1 877 571 5775

Website: www.sqlstream.com

Posted under In the News · Press Releases · Product

Vice President Marketing
November 15, 2013

The Ventana Research Technology Innovation Summit was held this week in Chicago, where VR CEO Mark Smith and BI analysts such as Tony Cosentino presented their perspective on Big Data technology adoption and the issues being faced by organizations as a result. This was not a conference about Hadoop, far from it; in fact, the focus was more on how best to exploit Big Data as an asset and how this is being achieved across different software application areas. I would also mention that one of these areas was Operational Intelligence, for which SQLstream picked up the award for Technology Innovation.

Time to value was a recurring theme. It struck me that the issues emerging from organizations were similar to pre-Big Data concerns: too many systems, too many data silos, poor-quality data, technology unable to address business needs. What has changed is the scale of the problem. Big Data has elevated the classical data management challenges to new levels: more data, more data storage platforms, fewer standards, and resource-intensive solutions with high total cost of ownership. No wonder that time to value is an emerging theme as Big Data sails closer to mature enterprises.

We’re also experiencing a new phenomenon that means the time to value problem has to be addressed – the ‘Big Data Mindset’. This is perhaps the most important concept to emerge from the Big Data evolution. Not the technology, rather a step change in expectation as to the value locked in a data asset, and a willingness to explore new and innovative approaches in order to extract that value.

Organizations have seen the light, the art of the possible, and there’s no turning back from the Big Data Mindset. Organizations are now seeking technology maturity, standardization and architecture stability. This is one reason why we’re seeing SQL emerge as the Big Data query language of choice. Not the SQL we learnt about in textbooks that was inseparable from an RDBMS. The new SQL uses the power of the query language to extract intelligence from any type of data (structured, unstructured or semi-structured) and across any data management platform (MapReduce, NoSQL or streaming).

Reducing time to value requires standards and data management applications where the business logic is not locked in volumes of bespoke Java code. We require intelligent data management platforms that abstract away the underlying complexity, that scale up and out effectively (SQL as a parallel programming language), and deliver agile solutions that can be generalized and upgraded for new requirements rapidly and without system downtime.

Posted under Big Data
November 12, 2013

Veracity and SQLstream team up to deliver actionable Operational Intelligence for real-time call rating, performance management and fraud detection from streaming CDR and IPDR data

SAN FRANCISCO, CA – 12 November 2013 – SQLstream, Inc., the Streaming Big Data Company, today announced that Veracity Networks, an award-winning internet and telecommunications provider, selected SQLstream’s product suite as the streaming platform for its next generation of real-time operational management systems.  SQLstream enables Veracity Networks to address the demands of its expanding customer base and to deliver near-zero latency solutions for real-time call rating, service quality and performance monitoring and fraud detection from streaming CDR and IPDR record streams.

“We are excited to partner with SQLstream and look forward to the many benefits it will provide our company and our customers,” said Chris Modesitt, CTO of Veracity Networks. “SQLstream will allow Veracity to provide vital real-time reports to our customers that previously took hours to create. SQLstream will also provide real-time monitoring and insight into potential network concerns allowing Veracity to proactively address any such issues.”

SQLstream was chosen for its capacity to dynamically scale for the rapid growth in call data, and for its ability to capture, process and integrate CDR and IPDR data from many different platforms and in different formats. Real-time integration with existing operational platforms was a key requirement, as well as the need for immediate insight through real-time dashboards and reports.

“We’re delighted to have been chosen by Veracity Networks for such a challenging operational Big Data requirement,” said Damian Black, CEO of SQLstream. “We created a scalable, sustainable and repeatable model to manage applications for the VOIP telecommunications world, and seeing SQLstream making a real difference to people’s lives in this challenging arena is further validation of streaming Big Data technology and our approach to real-time operational intelligence.”

SQLstream’s streaming Big Data solution for Veracity includes SQLstream s-Server, the core streaming data management and integration platform, and SQLstream s-Visualizer, the world’s first push-based engine for real-time visualization of streaming alerts, alarms and analytics. The SQLstream portfolio delivers streaming dashboard visualizations, the highest performance for Big Data scalability, a wide range of integration agents and adapters, and fast, low-cost deployment supported by a 100% ANSI SQL streaming architecture.

Learn more about SQLstream’s solutions for telecommunications today.

——————–

About Veracity Networks

Veracity Networks is a locally owned, facilities-based telecommunications provider servicing commercial and residential customers on its extensive network throughout Utah. Veracity offers unmatched Utah Internet speeds, high-definition video and feature-rich telephone services.  Customers range from residential customers with a single phone line to one of the state’s largest private employers. Veracity Networks is the recipient of local and national awards such as Best of State, Utah Business Magazine’s Fast 50, MountainWest Capital Network Utah 100 and Inc. Magazine’s Inc. 5000. It is also dedicated to maintaining and improving its Tier-1 network infrastructure while setting industry standards in network stability and customer support. With more than 50 years combined leadership experience and the most advanced telecom platform in the industry, Veracity Networks delivers “World-Class Service with a Local Touch.” For more information, visit www.veracitynetworks.com.

Posted under Customer win · Press Releases
October 29, 2013

Possibly the simplest way to think about Operational Intelligence (OI) is that it is Business Intelligence put into action. In reality it is distinctly different from what we usually understand as Business Intelligence in respect of the underlying technologies that are deployed and the overall architecture of data and process flows, but it can be viewed in that manner from a business perspective.

The Bloor Group researched all the vendors who use the term Operational Intelligence and sorted through them to create an OI Market Directory list, along with the other component technologies they discovered in their research.

See the list HERE, or read more about how SQLstream, Vitria, Splunk and other similar technologies could do wonders in fixing some of the most current data-driven issues in the news (like the Obamacare page, for example).

Posted under Analyst review · Market Views
October 17, 2013

In the world of analytics there’s no time like the present, which is why there’s such a big push to retrofit Hadoop as a real-time system. But there are other approaches.

Alex Woodie of datanami.com analyzes the SQLstream solution and its use of SQL to query big data as it flows through the memory chips of cheap, commodity servers.

Read full article HERE

October 15, 2013

On October 4, 2013, CEO Damian Black briefed members of the Boulder Business Intelligence Brain Trust (BBBT) on the topic “From Big Data to Real-time Value: Streaming Operational Intelligence.”

Esteemed analyst Claudia Imhoff interviews Damian in a short podcast on Big Data, SQL and the future of streaming analytics.

Posted under Analyst review · In the News
October 10, 2013


On Friday, October 14, 2013, Damian Black, CEO, and Ronnie Beggs, Vice President of Marketing, of SQLstream, Inc. briefed members of the Boulder Business Intelligence Brain Trust (BBBT) on the topic “From Big Data to Real-time Value: Streaming Operational Intelligence.”

Distinguished BI Expert Paul te Vraak comments on the SQLstream technology and its unique ability to stream Big Data using SQL queries.

Read full commentary HERE.

 

Posted under Analyst review · In the News
October 8, 2013
monitor+dials

SQLstream s-Visualizer provides active graphical dashboards over streaming time-series data, mapping real time to human time without refresh

San Francisco, CA | October 8, 2013 – SQLstream Inc., the Streaming Big Data Company, announced today the world’s first visualization dashboard product for streaming high-velocity data. SQLstream s-Visualizer provides active, graphical insights over real-time streaming analytics and is fully integrated with SQLstream’s award-winning s-Server streaming Big Data platform.

SQLstream s-Visualizer is a drag-and-drop platform for building live dashboards over streaming data. Operational users rely on accurate, real-time information—yet traditional dashboard tools require data to be stored first, which introduces latency, and must be refreshed as new data arrive, which is unnecessarily resource-intensive. SQLstream s-Visualizer’s dashboards are updated in real time, eliminating both the latency introduced by storing and aggregating data and the resource overhead and inconvenience of constant refreshes, re-querying and polling.

“SQLstream s-Visualizer is the only product offering real-time immediate insight into fast data streams,” said Damian Black, SQLstream CEO. “Users benefit from immediate insight into business issues and are empowered to quickly avert them.”

Drinking from the fire hose of Big Data streams also poses a significant performance challenge for existing business intelligence and dashboard tools. Real-time operational intelligence applications must be able to process data at millions of records per second. SQLstream’s intelligent on-the-fly aggregation and filtering of Big Data streams enables s-Visualizer to display meaningful analytics regardless of data volume and velocity.

SQLstream s-Visualizer offers:

  • Access to streaming, real-time analytics on any platform – web, desktop and mobile devices;
  • Full integration with SQLstream s-Server for display of streaming operational intelligence, plus support for combining streaming data with historical trend data from any existing storage platform or data;
  • Customizable, metadata-driven interfaces for building personalized dashboards in minutes, with visual programming and drag-and-drop workflows.

Further information available at http://www.sqlstream.com/products/s-visualizer/.

 

About SQLstream

SQLstream, Inc. (www.sqlstream.com) empowers businesses to perform real-time analytics and extract operational intelligence from unstructured log file, sensor and other machine-generated Big Data. SQLstream’s award-winning streaming Big Data platform is a distributed, 100% SQL platform for real-time data management and operational intelligence applications. SQLstream offers a full range of integration agents and adapters for log file, sensor and telecommunications data, plus support for continuous integration with Hadoop and other enterprise data management platforms. Based in San Francisco, CA.

Media contact:

Ronnie Beggs, VP Marketing

ronnie.beggs@sqlstream.com

+1 415.758.8342

www.sqlstream.com

September 24, 2013

SQLstream to Demonstrate Real-time Big Data Solutions for Traffic Analytics and the Internet of Things in the Big Data and Business Analytics Industry Showcase

San Francisco, CA | September 23, 2013 – SQLstream Inc., the streaming Big Data engine for real-time operational intelligence and a Silver level member in Oracle PartnerNetwork (OPN), today announced that it is a Signature sponsor of Oracle OpenWorld 2013 in the Big Data and Business Analytics Industry Showcase. SQLstream will showcase real-time industry and public sector solutions, and will deliver a presentation on congestion reduction through real-time traffic analytics for the transportation industry.

Click here to read full press release

Posted under Events · In the News · Press Releases
September 11, 2013

Just when the IT media started presenting NoSQL as the current that would supersede the venerable standard of information management, we again see SQL on the rise. NoSQL engines are building SQL interfaces, as are vendors that analyze data streams, big data and other forms of information assets.

On Thursday, September 12 at noon PDT, host Eric Kavanagh from DM Radio interviews our CEO, Damian Black, alongside Niklas Bjorkman of Starcounter, Fred Gallagher of Actian and Jamie MacLennan of Predixion Software on the resurgence of SQL – come join!

Click here to read more and register.

Posted under Events · In the News
September 9, 2013

Inside Analysis

Distinguished analyst Robin Bloor has long been discussing the term “data science,” with posts ranging from explanatory to complaining (particularly about the use of terminology).

Throughout all, however, he recognizes data science as a sensible activity for a business to pursue.

In his latest post, he promotes that idea by discussing ten reasons behind the attention that data science has acquired, underlining SQLstream as one of the effective technologies that contributed to building its well-deserved momentum.

Click here to read the full post.

 

Posted under Analyst review · In the News

Vice President Marketing
September 5, 2013

SQLstream Explores the Total Cost of Streaming Big Data Solutions at TDWI Solution Summit 2013, September 15-17 — CEO Damian Black to Present “Operational Intelligence from Machine Big Data: Case Studies in Taming the Cost of Real-time Performance”

San Francisco, CA, September 4, 2013 — SQLstream Inc., the streaming Big Data engine for real-time operational intelligence, today announced that the company will speak on the cost of performance for real-time Big Data solutions at the TDWI Solution Summit in Austin, TX. SQLstream will also be demonstrating its streaming SQL platform for real-time operational intelligence from machine Big Data.

The session, scheduled for Tuesday September 17 at 10:00AM, will discuss current Big Data management technologies, the performance and cost tipping points for real-time Big Data solutions, and ways to solve real-time Big Data challenges with the lowest cost of performance. SQLstream CEO Damian Black will compare three very different customer case studies across a range of industries, highlighting how SQLstream’s streaming Big Data management platform has been leveraged to reduce traffic congestion, preempt identity theft and deliver log analytics for high performance computing.

Held at the Barton Creek Resort and Spa on September 15-17, TDWI Solution Summit is an exclusive event for experienced professionals, industry thought leaders, and top solution providers. Attendees gain real-world tips and best practices that will help them harness the power of big data analytics within their organizations.

Damian Black is known for his pioneering work in real-time and Big Data software and his commitment to reducing latency-to-action for business operations. “TDWI is the foremost organization for enterprise IT professionals and industry best practices,” said the CEO. “TDWI’s members recognize the importance of standards and SQL to their businesses, but also the need to adopt real-time Big Data technologies. SQLstream is the only company to offer the best of both worlds, delivering real-time operational intelligence from streaming Big Data using standard SQL as the query language.”

About SQLstream

SQLstream, Inc. empowers businesses to perform real-time analytics and extract operational intelligence from unstructured log file, sensor and other machine-generated Big Data. The SQLstream streaming Big Data engine is a distributed, 100% SQL platform for real-time data management and operational intelligence applications. SQLstream offers a full range of integration agents and adapters for log file, sensor and telecommunication data, plus support for Hadoop and other enterprise data management platforms. SQLstream is based in San Francisco, CA.

Posted under Press Releases

Vice President Marketing
August 27, 2013

The previous blog on Mastering the Internet commented on how government intelligence agencies are intercepting Internet communications en masse. And that, perhaps, this would signify the end of the first Internet age, “The Naive Age”, where many social media users are blissfully unaware of their lack of privacy and where phishing scams and identity theft are commonplace. However, the Naive Age is almost done. Enter “The Age of Enlightenment”, where Internet users are aware of the risks, and where Internet and technology providers must do more to protect their visitors and consumers.

In a recent article in the UK’s Financial Times, Internet launches fightback against state snoopers, Robert Cookson comments on the IETF’s intent to introduce encryption at ‘the heart of the web’ by mandating TLS in the next version of HTTP. Essentially, a proposal to extend the encryption technology used by banks and e-commerce retailers to include all web communication. The objective is to make eavesdropping and electronic snooping more difficult.

What is TLS?

TLS (Transport Layer Security) is the successor to SSL, the protocol used to encrypt traffic between web browsers and servers. TLS has been around since 1999, so why has adoption not been more widespread? The answer lies in the lack of an operating standard. There is no consistent adoption of TLS versions (there are several) and ciphers (multiple, as well) across the many web browser, web site and application vendors. For example, the OpenSSL website has a long list of different ciphers where the choice of any preferred cipher relies on best practice. Of greater importance is the lack of adoption of TLS 1.2, the most recent and only secure version. The key (pun intended) initiative from the IETF is therefore one of standards adoption.

How secure is TLS?

Will it make any difference? The Financial Times article focused on preventing government snooping. Unfortunately, secure TLS is likely to be more of an inconvenience than an insurmountable obstacle for most intelligence agencies. For example, the secure TLS mode on my Blackberry offers 128-bit encryption, which is better than the old 56-bit export-grade mode but hardly impregnable to a well-resourced agency. 1024-, 2048- and 3072-bit systems are supported and are increasingly secure, but processing overhead (and cost) increases dramatically with key size. And of course, there are many attacks other than the sledgehammer of a front-door approach, including TLS side-channel attacks and website spoofing (in the TLS context, an Internet take on the classic man-in-the-middle attack on public-key systems).

Privacy and Security

That said, the main concerns for most of us using the Internet are privacy and identity assurance: protecting key information from criminals, including our credit card and banking details, logins and passwords, and access to online tax, insurance and e-commerce sites. Most of us are awakening from “The Naive Age” having suffered from, or had near misses with, phishing scams and identity theft. We are all hoping for better security and protection. So perhaps this is where the IETF initiative will have most impact.

Real-time Security Intelligence

SQLstream recently announced that InfoArmor has adopted SQLstream as the core real-time data collection and analytics platform for its identity theft monitoring engine, which is now able to collect and analyze in real time the many different data sources required to detect patterns of identity theft. As we’ve seen in other areas such as cybersecurity, the more advanced the defense, the more sophisticated the attacks, and the better the detection systems need to be, which is where we plan to be of help.

Posted under Market Views

Vice President Marketing
August 20, 2013

Streaming Hadoop HBase Case Study: Turbo-charging Hadoop for real-time machine data analytics. Session to be Presented at NoSQL Now! on August 22nd at 9:30 PST, San Jose.

San Francisco, CA | August 19, 2013 – SQLstream Inc., a pioneer of the streaming Big Data engine for real-time operational intelligence, today announced that SQLstream CEO Damian Black has been invited to speak about the opportunity of accelerating Hadoop using streaming SQL at NoSQL Now! 2013, August 20-22.

Damian will be exploring the challenge of accelerating Hadoop for real-time operational intelligence by processing the live, high-velocity unstructured machine data streams before the data are stored. With the emergence of both streaming and the SQL language as key components of Big Data architectures, Damian’s presentation will demonstrate how to utilize the strengths of Hadoop in real-time operational intelligence applications using streaming SQL queries – a new concept for Big Data. The combination of streaming SQL for low latency analytics and Hadoop HBase for stream persistence and replay offers the lowest Total Cost of Performance for real-time Big Data applications.

Held at the San Jose Convention Center and now in its third year, NoSQL Now! gathers senior IT managers, analysts, architects and business executives interested to learn more about the new technology choices for Big Data analytics, integration and search. Damian will present Streaming Hadoop HBase Case Study: Turbo-charging Hadoop for real-time machine data analytics at 9:30AM PST on Thursday, August 22nd.

Damian Black was selected for his pioneering work in real-time and Big Data software. “Streaming SQL over Hadoop combines low latency analysis of the arriving data in real-time combined with the stream persistence, streaming joins and stream replay from Hadoop HBase,” said Damian Black, SQLstream CEO. “SQL may be a surprise to some, but is inevitable as Big Data storage technologies seek wider and mainstream adoption, and is the ideal platform for high velocity data analytics.”

If you are a member of the press or analyst community and are interested in setting up a meeting with SQLstream at NoSQL Now!, please contact Ronnie Beggs by email at ronnie.beggs@sqlstream.com.

About SQLstream

Breaking News! SQLstream named 2013 Ventana Research Technology Innovation Award Winner for IT Analytics and Performance.

SQLstream (www.sqlstream.com) is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put Big Data on Tap™ – enabling businesses to harness action-oriented analytics, with on-the-fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core data streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using continuous SQL, with support for streaming SQL query execution on Hadoop and other enterprise database, data warehouse and data management platforms. SQLstream’s headquarters are in San Francisco, CA.

Media Contact
Ronnie Beggs, VP Marketing
Company | SQLstream
Email | pr@sqlstream.com
Phone | +1 415 758 8342
Website | www.sqlstream.com

Resources
NoSQL Now! Agenda

Posted under Press Releases

Vice President Marketing
August 18, 2013

Ventana’s Technology Innovation Awards showcase “advances in technology that contribute significantly to improved efficiency, productivity and performance of the organization.” SQLstream’s IT Analytics and Performance award recognized SQLstream’s innovative technology and ability to optimize operational processes and systems. The award considered all aspects of SQLstream’s technology and business approach, in particular:

  • Vision – bringing standards to the world of Big Data and delivering a platform for real-time operational intelligence and integration across all industries.
  • Technology innovation – massively scalable, streaming data management platform using standard SQL (SQL:2008) as the query language.
  • Operational process optimization – enabling businesses to eliminate latency entirely from their operational processes.
  • Business impact – delivering real-time, low latency applications with the lowest Total Cost of Performance of any Big Data technology.
  • Customer ROI – delivering significantly lower cost solutions while directly improving end-customer quality of service and experience.

The SQLstream Vision – Standards, one platform and real-time

Big Data platforms and RDBMS both fall short as platforms for low-latency, high velocity data management. Storing data and continually polling to re-evaluate queries introduces unacceptable latency and cost (both operational and in lost opportunity). The few existing true real-time platforms are clustered in niche areas, require significant, specialized bespoke development, and are highly tailored to specific applications, making them both expensive and very limiting.

The SQLstream vision is simple – to minimize the Total Cost of Performance for real-time applications by delivering a true real-time data management platform that:

  • Scales to the most extreme high velocity Big Data applications.
  • Supports multiple real-time applications on a single core platform.
  • Enables organizations to build sophisticated analytics with low TCO.
  • Complements and accelerates the performance of existing IT infrastructure through continuous integration of streaming data and analytics.

Technology Innovation – Streaming SQL Big Data platform

SQLstream is a 100% SQL-standards based streaming data management platform that turns high-velocity machine data into real-time operational intelligence before the data are stored. SQLstream is built on a massively parallel, continuous query execution engine that delivers streams of valuable information to any consuming application or service. But why SQL? SQL is well defined and understood globally; it offers low-cost, rapid development; and it is ideally suited to dynamic optimization of streaming query execution over a massively distributed server infrastructure.

Operational Process Optimization

Streaming operational intelligence eliminates the gap between analytics and the operational processes that connect an organization with the outside world: its customers, consumers and partners. Eliminating latency from key operational processes directly improves the efficiency of an organization’s day-to-day execution. In particular, a streaming operational intelligence platform offers the capability to integrate real-time, predictive analytics across an organization’s existing operational systems, and either drive automated actions or support faster, more effective decision-making through traditional visualizations and alerts.

Business Impact

Most industries today are looking to eliminate latency from their business operations. Some are further advanced than others, but all are looking to lower the cost of their operations, maximize potential growth opportunities and improve the quality of experience and the quality of the delivered service. The business impact is both internal, in terms of OPEX and CAPEX improvements, but also can be measured through improved quality of experience and service for our clients’ customers, consumers and partners. For example:

  • Cybersecurity – detecting emerging advanced persistent threats requires processing vast volumes of (typically log) data in real time, coupled with rules-based pattern detection and geo-spatial, location-based analysis.
  • Fraud prevention – from an enterprise perspective, particularly in the telecommunications and M2M markets, the ability to detect and prevent fraudulent access in real time provides a major differentiator and a clear project ROI.
  • Identity theft monitoring – has a direct impact from the perspective of the individual.
  • Smart Cities – automated, real-time management of facilities such as street lighting.
  • Intelligent transportation – accurate real-time travel time for travellers, and automated traffic infrastructure management to reduce congestion.

Customer ROI – Lowest Total Cost of Performance

Change that can be directly attributed to the deployment of SQLstream’s streaming operational intelligence platform can be measured in several ways.

First, the direct ROI of the project. For example, in Intelligent Transportation, SQLstream’s ability to deliver accurate, real-time traffic flow and congestion prediction from GPS data saves transportation agencies millions of dollars in comparison with the high cost of deploying (less accurate) in-road and roadside sensor projects.

Second, OPEX and CAPEX reduction through improved operational efficiency. For example, Cornell Computing Centre was unable to find a log management tool that could scale to the volume and velocity of their log data. SQLstream provided a level of visibility into platform and application issues in real-time that they were unable to achieve using their own in-house development, or with any of the other log monitoring tools on the market.

And finally, improved customer satisfaction. For example, InfoArmor uses SQLstream as the platform for its real-time identity theft monitoring applications. SQLstream collects and analyzes a large number of different internet, log file and network data feeds in order to detect (and predict) patterns of identity theft. The existing environment was a common scenario: multiple different tools, applications and storage platforms with loose, bespoke integration, with the result that the entire IT environment was unable to scale to meet the business’s growth. SQLstream enabled InfoArmor to centralize its real-time, streaming analytics on a single, future-proof and scalable platform, and to integrate streaming intelligence across the remaining systems, improving overall business support and performance.

And finally

Operational Intelligence is emerging as the next phase of Big Data, enabling true real-time performance from structured and unstructured machine data. Many have been surprised that Hadoop and similar Big Data platforms have not been as real-time as hoped, yet Hadoop was not designed for low latency applications. As with all new technology sectors, there are a number of different approaches vying for control. SQLstream’s streaming data management platform offers the most comprehensive set of technology attributes.

The design and implementation of the SQLstream streaming data management architecture is a new paradigm for processing real-time data. The technology offers a true platform – SQL – as the streaming analysis development language, ensuring:

  • Lowest TCO: SQLstream offers the lowest initial lifecycle TCO for real-time, low-latency applications. Please refer to the attached performance benchmark press release for details.
  • Speed of application development: SQL enables applications to be developed and deployed using a fraction of the code required to deploy a comparable Java-based or SQL-like streaming application.
  • High-velocity data support: SQLstream applications are sized on the basis of 250,000 to 500,000 records per second (depending on record size), and scale efficiently across multiple servers to deliver applications capable of processing millions of records per second.
  • Low-latency analytics: SQLstream supports latency requirements down to single digit milliseconds (where latency is measured from the time of data creation to the availability of analytics or results).

But perhaps the single largest contributing factor was the realization of the original idea. SQLstream’s core technology began with a belief that real-time applications required standards, a true platform, and high performance. Streaming technology, coupled with SQL and the techniques used for multi-server scale-out, was seen by SQLstream as the future and the only path available. The current release of the technology, version 3.0, is the realization of that idea.


Vice President Marketing
August 15, 2013

SQLstream’s s-Streaming product suite has been selected by Ventana Research as the winner of its 2013 Ventana Research Technology Innovation Award

Ventana Industry Innovation Award

SAN FRANCISCO, CA | August 15, 2013 – SQLstream Inc., the Streaming Big Data Company, announced today that Ventana Research, the leading benchmark business technology research and advisory services firm, has selected SQLstream Inc. as the winner of its 2013 Ventana Research Technology Innovation Award for IT Analytics and Performance. The Ventana Research awards distinguish those pioneers that have developed clear visionary and transformative technology, and that are helping move businesses to the next level in 2013 and beyond.

Click here to read the full Ventana Research press release.

“Real-time intelligence from high velocity streaming data enables businesses to extract the maximum value from their Big Data asset, and enables them to eliminate latency completely from their business operations,” said Mark Smith, CEO and Chief Research Officer at Ventana Research. “SQLstream was selected for the Ventana Research Technology Innovation Award in IT Analytics and Performance for its innovative streaming platform, and for its use of standard SQL as the building block for real-time operational intelligence solutions. I would like to congratulate SQLstream on their pioneering technology and introduction of standards to the Big Data arena.”

SQLstream is the leading streaming data management platform for high velocity Big Data applications, delivering real-time operational intelligence and high performance analytics solutions across a wide range of industries, including telecommunications, transportation, security intelligence and IT operations. SQLstream’s pioneering standards-based core platform, SQLstream s-Server, uses SQL (SQL:2008) as the development language for real-time streaming data applications. Its s-Streaming product suite also includes s-Cloud for cloud deployments, s-Transport for geo-spatial and location-based analytics, and s-Analyzer for real-time log analysis.

“Recognition by Ventana Research is an honor and represents validation of our standards-based technology and that businesses are looking to transform their operations to real-time,” said Damian Black, SQLstream CEO. “SQLstream’s products enable organizations to improve and automate their operational processes through real-time intelligence extracted from all their log, sensor, network and device data on a single platform, and with the lowest Total Cost of Performance of any Big Data technology.”

If you are a member of the press or analyst community and are interested in finding out more about SQLstream and the Ventana Research award, please contact Ronnie Beggs at ronnie.beggs@sqlstream.com.

SQLstream s-Server is available to download and try for free by registering at www.sqlstream.com/downloads.

About SQLstream

SQLstream is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put Big Data on Tap™ – enabling businesses to harness action-oriented and predictive analytics, with on the fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using standards-based SQL, with support for streaming SQL query execution over Hadoop/HBase, Oracle, IBM, and other enterprise database, data warehouse and data management systems. SQLstream’s headquarters are in San Francisco, CA.

Media contact:
Ronnie Beggs, VP Marketing
Company | SQLstream
Email | pr@sqlstream.com
Phone | +1 415 758 8342
Website | www.sqlstream.com

Posted under Press Releases
August 14, 2013

NoSQL Now! Speaker Spotlight: Damian Black, SQLstream CEO, is interviewed by Dataversity prior to his presentation at NoSQL Now! 2013. Speaking on streaming SQL and operational intelligence for Hadoop and HBase, Damian discusses the challenges of accelerating Hadoop to process live, high velocity unstructured data streams, and the emergence of SQL as the query language of choice for Big Data analytics.

Click here to read the full Dataversity interview.

Posted under In the News

Vice President Marketing
August 12, 2013

The definition of machine data covers, not surprisingly, all data generated by machines – servers, applications, sensors, web feeds, networks and service platforms. It covers everything from data centers, telecommunications networks and services to machine-to-machine and the Internet of Things in a device-connected world.

The value of machine data is immense. Machine data contains a wealth of information on customer and consumer behavior and location, consumer quality of experience, financial transactions, security and compliance breaches, as well as the state of industrial processes, transportation networks and vehicle health.

Machine Data

Understanding Machine Data in Real-time

Log files are the most common source of machine data today across all industries, but sensor networks and applications are catching up fast with the growth of machine-to-machine and the Internet of Things. And telecommunications has always been a generator of Big Data, with Call Detail Records (CDRs), network equipment performance measurements, and subscriber and handset location data.

The following list provides an overview of the different types of machine data that SQLstream can collect and transform into real-time operational intelligence. The list is not exhaustive and is growing every day.

Application Logs
SQLstream’s log adapter turns any log file into a stream of real-time updates. Data from any log file type and format can be collected and analyzed. Common examples include server and cloud infrastructure performance logs, Apache error logs, syslog, firewall security logs and data generated by any application. Complete log files or just updates can be collected, and log data joined, transformed and aggregated as needed.
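
As a rough illustration of the general technique, rather than SQLstream's adapter itself, the following Python sketch tails a log file and turns each appended line into a structured record that downstream analytics could consume; the file path and field layout are hypothetical.

import time

def follow(path):
    """Yield lines appended to a log file, tail -f style."""
    with open(path, "r") as f:
        f.seek(0, 2)                # start at the end: stream new updates only
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.2)     # wait for the application to write more
                continue
            yield line.rstrip("\n")

def parse(line):
    """Split a space-delimited log line into a record (hypothetical layout)."""
    timestamp, level, message = line.split(" ", 2)
    return {"ts": timestamp, "level": level, "message": message}

for record in (parse(l) for l in follow("/var/log/myapp.log")):  # hypothetical path
    if record["level"] == "ERROR":
        print("alert:", record)     # stand-in for a downstream stream or queue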

Application Server Logs
Many application servers, including JBoss, WebSphere and WebLogic, generate log files using standard logging frameworks such as log4j. The data contains critical insights into application and application server operation and performance, while the transaction information also offers insights into business transactions, in particular fraud and security problems.

Call Detail Records for Telecom Services
Call Detail Records (CDRs) and IP Data Records (IPDRs) are generated by telecoms network equipment for every call and session, and contain the information necessary to produce billing records. They also contain information that can be used to determine service quality and customer experience issues, particularly when joined in real-time with GPS and location data.

Clickstream Data
Clickstream data captures users’ activity on websites. It contains valuable information on visitor activity that can be used to alert on customer experience issues, drive real-time ad placement and detect shopping cart abandonments for example.

Deep Packet Inspection (DPI) Probe Data
Deep Packet Inspection (DPI) tools use network probes to extract detailed information on connections, services and sessions. DPI probes generate a vast amount of real-time data that must be captured, correlated and combined into end-to-end connection and service information. The data contains key insights into Quality of Service, capacity management and network performance issues, as well as indicators of security attacks and breaches.

GPS Data
GPS data records the exact position of a device at a specific moment in time. GPS events can be transformed easily into position and movement information, for example for vehicles on a road network, or mobile subscribers with smartphones. Telecommunications, transportation, logistics and telematics rely on the accurate and sophisticated processing of GPS information.
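
As an illustration of the sort of transformation involved (generic Python, not SQLstream code), the sketch below converts two consecutive GPS fixes into an implied speed using the haversine great-circle distance; the sample coordinates are invented.

from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def speed_kmh(fix1, fix2):
    """Average speed implied by two (timestamp_seconds, lat, lon) fixes."""
    (t1, lat1, lon1), (t2, lat2, lon2) = fix1, fix2
    return haversine_km(lat1, lon1, lat2, lon2) / ((t2 - t1) / 3600.0)

# Invented sample: two fixes from the same vehicle, 30 seconds apart.
print(round(speed_kmh((0, 37.7749, -122.4194), (30, 37.7800, -122.4100)), 1), "km/h")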

IP Router Syslog
All IP network equipment from the major vendors uses the syslog format to capture connection status, capacity information, routing information, failure alerts, security alerts and performance data. When processed in real time, and when data from all sources can be joined and queried simultaneously, this data provides a unique insight into the operation of the network and offers a platform for predictive analytics and forecasting.

Sensor Data
The availability of low cost, intelligent sensors, coupled with the latest 3G and 4G wireless technology has driven a dramatic increase in the volume of sensor data, but also the need to extract operational intelligence in real-time from the data. Examples include industrial automation plants, smart metering, environmental monitoring and the oil and natural gas industry.

SCADA Data for Industrial Control Systems
Supervisory Control and Data Acquisition (SCADA) is the data management infrastructure for industrial control systems. SCADA systems produce an immense volume of measurement data, status information and failure alerts, and are widely deployed for remote equipment and process monitoring across the smart grid, oil and natural gas, transportation and utilities sectors.

Posted under Big Data
August 1, 2013

The number of articles about the Internet of Things [IoT], Machine-to-Machine communication [M2M], the Industrial Internet, the Internet of Everything [IoE] and the like has been increasing since this post introducing an IoT mindmap almost a year ago. We learn from some of them, nod sagely in agreement with others, and scratch our heads in confusion at the rest.

Distinguished analyst Joseph di Paolantonio establishes definitions and taxonomies for the Internet-of-Things and M2M and provides an industry review in this article for Constellation Research.

Read full article HERE.

Posted under Analyst review · In the News · Market Views

Vice President Marketing
July 15, 2013

Folklore has it that the term ‘Internet of Things’ (IoT) was first popularized in 1999 at MIT to describe the architecture of connected RFID devices. Cisco then looked to define when the IoT came into being as a concrete entity – defined as the year in which the number of connected devices exceeded the number of connected people (somewhere between 2003 and 2010 – see diagram).

Internet of Things


Cisco predicted that the number of devices will grow to 50 billion by 2020, a trend substantiated by many. And over the years, the use of the term ‘Internet of Things’ has been enlarged to describe connected devices across any type of communication technology, regardless of whether this is public Internet or private networks – the Internet of Everything.

Yet Cisco also reports that the Internet of Everything is failing to deliver on its revenue potential. Much like other exciting technology trends, such as semantic analytics, is it possible that this is technology-driven excitement looking for a business problem to solve?  An understanding of the value chain and related business models is required. For example, Telco M2M applications have been hindered by the cost of wireless backhaul, which given the low ARPU levels for M2M services, has meant M2M services have struggled to achieve mass adoption. Although the cost of optical backhaul capacity has dropped significantly over the past 10 years, it’s still the ‘last mile’ access that is the constraining cost factor.

Technology makes the possible probable

Just as the Big Data movement was enabled by the combination of technology drivers (increasing volumes of unstructured data, low cost commodity servers and a thirst for faster answers), similarly a number of different technology advances have converged to enable the Internet of Everything:

  • Low cost ‘smart’ wireless sensors
  • Significant advances in energy and battery technology
  • IPv6 capability and associated network devices
  • Proliferation of smart phones and instrumentation devices

Yet so far we have seen the cart driving the horse. If we look at M2M again, still seen as the great hope for mobile carriers looking to move further up the value chain, it has so far failed to live up to revenue and growth expectations (at least, as forecast by some). Healthcare (and micro-sensor monitoring) applications are starting to emerge, but require external investment in order to be viable; the connected home has several conferences dedicated to it, yet not much revenue; and Smart Grid applications have been hindered by the existing distribution infrastructure.

Yet there is hope. GPS-based solutions for Intelligent Transportation applications – traffic flow and congestion prediction maps, real-time journey time apps – have seen an increase in adoption as M2M services, as have consumer traffic apps, logistics and asset tracking, and partnerships with automotive manufacturers and transportation agencies for telematics. The network for IPv6 traffic is also evolving, with technologies such as Software Defined Networks (SDN) reducing the cost of network management and capacity optimization.

Real-time Performance, Health Monitoring and Analytics

The platforms for IoT/IoE applications have several major components, the main three being the service delivery platform (SDP) for delivery and definition of the services, billing and revenue generation, and real-time performance and health monitoring with real-time analytics. The latter is effectively a streaming operational intelligence problem, requiring:

  • Remote, intelligent data collection architecture, with the ability to perform remote filtering and analytics, improving the signal-to-noise ratio and reducing backhaul capacity requirements.
  • Real-time, predictive and prescriptive analytics – the ability to extract actionable intelligence from the streams of device and sensor data, to predict device failures and operational exceptions in advance and, where possible, to automate appropriate actions.
  • A massively scalable platform capable of managing the collection of all real-time data streams, and ensuring the reuse of raw and derived streams across any number of applications executing on the platform, such as performance and health monitoring, GPS analytics, streaming aggregation and backend integration.

Total Cost of Performance for IoT Platforms

The cost of achieving real-time solution performance remains an important consideration. SQLstream’s streaming platform delivers the lowest total cost of ownership for real-time applications across all data management technologies. Both Hadoop and traditional RDBMS-based systems have a technology tipping point for latency and throughput beyond which it is virtually impossible to improve performance, and even before this point, the cost of real-time performance makes these solutions unfeasible. There are many reasons, but the underlying architecture is simply not scalable for high velocity, low latency analytics: the principle of store first, analyze second introduces unacceptable delays.

A performance benchmark of SQLstream’s real-time operational intelligence scalability, a system for 4G wireless network performance monitoring, showed the core streaming architecture scaling to 1.3 million records per second on a standard 4-core server, and an overall architecture of 10 million records per second over only a few servers. The Total Cost of Ownership was the lowest by some margin (based on an independent benchmark program).

If you would like to know more about real-time operational intelligence as the performance platform for the IoT, register for our next webinar “Listen to your Sensors“, where we’ll be discussing the requirements and business case for large-scale intelligent sensor network management.

Posted under Sensor Networks

Vice President Marketing
July 5, 2013

The recent public release of information on the activities of our Intelligence Agencies makes interesting reading for Big Data professionals. In particular, how these agencies have mastered the Internet, with vast data collection and analytics facilities for monitoring Internet and smartphone activity. A series of news articles described GCHQ’s (the UK’s smaller cousin of NSA) role in the program and the numbers are staggering. For example, one claim is that as far back as 2011, the UK alone was collecting 39 billion events per day and storing over 20 petabytes of data per day. It is reported that the raw data is retained and analyzed over a 3 day period, scanned for 40,000 keyword searches simultaneously, with the results being correlated and stored as metadata. The resulting ‘analytical’ metadata is retained for a period of one month.

Staggering numbers, which we can only assume have grown significantly over the past two years. But there is also another technological feat here – collecting that much data. There are billions of active Internet and smartphone users worldwide at any given time, making it a seemingly daunting prospect to ‘monitor the Internet’. However, much of the Internet traffic flows into and out of any country through a relatively small number of very large pipes – usually submarine cables. And this is how data are captured – by tapping and storing the data from multiple 100Gbps trans-continental cables. Now, those really are some very Big Data firehoses from which to drink.

On one level, little has changed since the beginning of digital communication. Intelligence Agencies monitor and decode communications, searching for keywords that might indicate interesting communications by their respective enemies, terrorists and criminal operations. However, pre-Internet, the focus was on intercepting radio, satellite and microwave communications, and involved endless processing of industrial-strength, bulk cypher streams. Streaming Big Data indeed, but not on the same scale as its post-Internet cousin. The difficult part was the interception, something that is much more straightforward in the Internet age. Nor are today’s Internet and phone data as difficult to decipher. Much of the target data are plain text, or with lightweight encryption.

The complexity in the post-Internet age is volume, velocity and variety. Therefore a use case for Big Data technology. Intelligence Agencies and supercomputing centers have had Big Data Hadoop-like technology for thirty years or more. Today’s manifestation of Big Data with Hadoop is merely supercomputing ‘lite’ for the masses. Yet the scenarios described raise the question – why store the data at all? The processing described in the articles is nothing more than a short data processing pipeline with two stages, one with a three day rolling window, one with a one month rolling window. The entire article reads like a streaming data analytics problem, and perhaps this is the underlying technology layer, with stream persistence for additional cross-Agency correlation.

Streaming data management would also enable real-time analysis, rather than waiting three days for the results. Over the years, Intelligence Agencies have been accused of spending billions and yet still seem to find out about major security events in the same way as the rest of us – in the morning’s newspapers. So real-time must be a key requirement, and streaming data management is the only technology available that can deliver low latency analytics at this scale. Massively scalable, distributed streaming SQL platforms scale by simply adding more servers. A data velocity of 39 billion events per day may seem a lot, but is a little less than 500,000 per second. Given the number of rules, and an assumption of some fairly complex analytics (geo-spatial, semantic analytics, keyword matching etc), this is still well within the capability of streaming data management on a small number of commodity servers.
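
A quick back-of-the-envelope check of that rate, assuming events are spread evenly over the day:

events_per_day = 39 * 10**9
seconds_per_day = 24 * 60 * 60            # 86,400
print(events_per_day / seconds_per_day)   # roughly 451,000 events per second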

Streaming technology also has advantages in other areas for this type of high velocity use case. For example, in data collection, using remote streaming data collection agents with filtering and basic analytics capability, the volume of backhauled data can be reduced significantly. And with so many rules, streaming data platforms enable new rules and analytics to be added on the fly, without having to take the platform down.

And finally, side-stepping the issue of the ethical framework that allows our Intelligence Agencies to operate in this mode, what does this mean for the future of the Internet? You could argue that this first Internet age is built on naivety, “The Naive Age”. Many social media users seem blissfully unaware that anyone other than their immediate friends will want to read a post. Even those perpetrating criminal acts seem to like to boast on social media. And those involved in terrorist acts would seem to believe that each tweet, email and call is such a small needle in a very large haystack that discovery is unlikely.

This could be the wakeup call for the wider world of Internet users. Just as 100% pure water can be super-cooled below zero degrees until the first impurity, no matter how small, causes it to freeze, so perhaps the awareness of bulk monitoring will do the same for the Internet. This could spell the end of ‘The Age of Naivety’ and the beginning of ‘The Age of Enlightenment’ for the Internet: an age of privacy-aware users, and of social media platform vendors who must work within the constraints of national security. For the Intelligence Agencies, however, this may well be ‘back to the future’, applying tried and tested cypher analytics to Internet Big Data firehoses.

Posted under Big Data Analytics · Market Views
June 27, 2013

Webinar Series to Explore Challenges and Solutions for Low-latency Operational Intelligence in the Telecommunications, Cybersecurity and Large-scale Sensor Network Industries

San Francisco, CA | June 27, 2013 – SQLstream, Inc., the leading standards-based Streaming Big Data Platform, announced today that it will be hosting a series of educational webinars to take place through July 2013. The events will focus on exploring the total Cost of Performance (or total COP) for transforming log file and other machine-generated data into streaming Operational Intelligence. The series will dissect the benefits, approaches and value of low-to-no-latency streaming intelligence across a range of industries including telecommunications, cybersecurity, and massive-scale wireless sensor applications for the Internet of Everything. Click here to see topics and calendar.

Operational Intelligence platforms transform unstructured Big Data streams—from log file, sensor, network and service data—into real-time, actionable intelligence. Unlike traditional business intelligence applications, where results from historical, stored data are viewed on dashboards, operational intelligence drives real-world actions and updates based on streaming prescriptive analytics. Operational decisions can be made faster, and key operational processes—automated.

Experts in the field of real-time data analysis will deliver the “Inside the insight: a look at Operational Intelligence” webinar series.  Each webinar will explore the technical challenges, business value and ROI for a selected industry.  The opening webinar, Analytics, Predictive Analytics, Prescriptive Analytics: The Anatomy Of Operational Intelligence will begin with an analysis of the total Cost of Performance (COP) for real-time log analytics applications. Generating low-latency answers from high-velocity unstructured log data has both technical limitations and practical cost constraints for different Big Data technologies and existing tools. Understanding the tipping points for Hadoop and RDBMS, for example, is key to understanding the future growth potential of any deployed solution.

The “Inside the insight: a look at Operational Intelligence” webinar series includes the following topics:

  • July 9th, 2013: Analytics, Predictive Analytics, Prescriptive Analytics: The Anatomy Of Operational Intelligence; Guest speaker: Damian Black, CEO
  • July 16th, 2013, Listen To Your Sensors: A Tale Of Managing Large-Scale, Intelligent Sensor Networks In Real Time; Guest speaker: Glenn Hout, VP Americas
  • July 23rd, 2013, Predict and Avert: Using Log File Data to Prevent Cybersecurity and Fraud Attacks in Real Time; Guest speaker: Damian Black, CEO
  • July 30th, 2013, No More CPR For Your CDRs: Meet Real-Time Traffic Utilization, Billing And Fraud Detection; Guest speaker: Chris Clabaugh, VP Business Development

For more information about the webinar series or to register, please visit http://www.sqlstream.com/webinars/.

Posted under Events · Press Releases
June 20, 2013
GigaOM: Structure 2013 Logo

While Hadoop and relational databases have their purposes, SQLstream thinks companies can also benefit from analyzing data as it comes in.

While NoSQL databases have had their share of interest in recent years, that doesn’t mean developers should rule out running SQL queries to get insights into large amounts of data flowing in real time, and do so without spending an arm and a leg.

CEO Damian Black made the pitch for this streaming approach and talked about its Total Cost-of-Performance relative to RDBMS and Hadoop-based approaches at GigaOM’s Structure conference in San Francisco.

CEO Damian Black speaks about the total Cost-of-Performance @ GigaOM Structure

To read full article, click HERE.

 

Posted under Events · In the News · Market Views

Vice President Marketing
June 19, 2013

We participated in the “Architecting Big Data Systems for Speed” panel at E2 Conference. Great event, and a great opportunity to discuss technology in a business context. The panel offered a range of perspectives, with other panelists from Translattice and Oracle’s NoSQL division. A number of interesting topics emerged, including the meaning of real-time in a Big Data world, and how enterprise architectures are evolving to deliver low latency as well as volume, velocity and variety.

What is real-time Big Data?

People often seem disappointed when Hadoop turns out not to be as real-time as they hoped. Which is somewhat unfair. Hadoop was designed to deliver much faster results from petabytes of stored data. So if real-time means faster answers from unstructured stored data, great. And let’s face it, this is a significant improvement on the previous BI norm of 24-hour turnaround, if indeed you could tailor your RDBMS to store event data at all. However, if you consider real-time to be low latency actions measured from the time of data creation, that’s a different story. All this data arrives over time, so there’s obviously a velocity factor, and for some applications near real-time answers may be possible: Twitter peaks at around 10k tweets per second, for example, and a number of Big Data apps claim real-time Twitter analytics.

Big Data are valuable, but only for a short period of time

The world is moving on and the kinds of low latency requirements that are now emerging have significantly higher data rates. For example:

  • Telematics and M2M apps can generate 5 to 20 million car, transport network and environmental records per second.
  • Real-time Cybersecurity monitoring can be several million records per second.
  • Telecommunications – a 4G performance app can generate 10 million records per second, and IP monitoring can be many times that.

There are a number of new requirements evident in these industry sectors:

  • the need to collect and analyze many different sources and types of data
  • the need to drive real-world actions
  • the need for predictive and prescriptive analytics
  • and most importantly, low latency actionable intelligence

How real-time can we get with different Big Data technologies?

RDBMS – a few minutes
Bizarrely, the traditional RDBMS has a decent claim to some limited real-time credentials, so long as the bars for velocity and latency are not set too high. The data collection, ETL and data loading process introduces significant delay, and scale out is limited by the difficulties of distributing processing across multiple servers. This means the cost of delivering real-time performance ramps up significantly as requirements for velocity increase and latency decreases. But when requirements are modest, latency of a few minutes may be possible.

Hadoop – a few hours
With Hadoop, the story is somewhat different. The latency introduced by ingest and execution of Map-Reduce jobs is significant, in the order of twenty minutes to several hours for real-world scenarios. However, the cost of scaling for velocity (and therefore greater volume) remains significant, but infrastructure costs scale better than for comparable RDBMS solutions.  And of course, Hadoop suffers from the same problems as the RDBMS – queries must be executed in order to update answers when new data arrive.

Streaming – a few milliseconds
Streaming platforms offer the lowest latency, in the order of milliseconds, by processing data streams in-memory over moving time windows, without the overhead of having to persist the data first. The major advantage is that answers are updated continually as soon as new data arrive, without having to re-execute the queries. SQLstream’s technology is also very resource efficient, and scales out well over multiple servers, making it by far the most cost effective solution where low latency answers are required from high velocity data.
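
As a minimal sketch of the idea in generic Python, rather than SQLstream's streaming SQL, the example below keeps a moving time window in memory and refreshes the aggregate incrementally as each record arrives, instead of re-running a query over stored data; the window length and the sample feed are invented.

from collections import deque

class SlidingWindowAverage:
    """Average over the last window_seconds of data, refreshed on every record."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()   # (timestamp, value) pairs currently in the window
        self.total = 0.0

    def update(self, timestamp, value):
        # Add the new record, then drop anything that has aged out of the window.
        self.events.append((timestamp, value))
        self.total += value
        while self.events[0][0] <= timestamp - self.window:
            _, old = self.events.popleft()
            self.total -= old
        return self.total / len(self.events)   # answer updated as the data arrive

# Invented feed of (seconds, latency_ms) readings, averaged over a 60-second window.
win = SlidingWindowAverage(60)
for ts, value in [(0, 12.0), (20, 15.0), (45, 30.0), (70, 9.0), (95, 11.0)]:
    print(ts, round(win.update(ts, value), 2))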

The analogy here is viewing a webpage on the internet. With an RDBMS or Hadoop based solution, you’re continually hitting the refresh button to view updated information. With streaming, there’s no need for a refresh button: streaming queries run continuously, updating answers automatically when new data arrive, a bit like having your webpage update instantaneously as soon as anything of relevance to that page, anywhere on the web, changes.

Beyond Streaming Analytics – Driving Real-time Actions

SQLstream’s vision is to go much further than the generation of real-time analytics. In a real-time world it’s essential to be able to drive change automatically. For the examples discussed – telematics, telecoms, cybersecurity – what’s important is to drive action in real time: flagging bank accounts under attack, pushing driver updates, changing mobile service quality on the fly, and deploying predictive analysis to detect a business exception in advance. For the web page analogy, this means not going to the internet at all: your bank accounts were never hacked in the first place, you don’t need a tow truck because your car warned of ice on the road based on data from other vehicles, and your streaming video never dropped because the QoS was corrected.

Posted under Big Data Analytics

Vice President Marketing
June 18, 2013
GigaOM: Structure 2013 Logo

“The Total Cost of Performance in a Massively Connected World” Session to be Presented at GigaOM Structure on June 19th at 4:55PM PST in San Francisco

San Francisco, CA | June 18, 2013

SQLstream Inc., a pioneer of the streaming Big Data engine for real-time operational intelligence, today announced that SQLstream CEO Damian Black has been invited to speak about the cost of achieving real-time performance from Big Data technologies at GigaOM’s Structure 2013, June 19-20.

Damian will be exploring the cost and technical limitations of generating real-time operational intelligence from different Big Data technologies. As data-management architectures evolve, each new technology looks to increase real-time visibility and reduce the latency-to-action and decision making. Yet technologies such as Hadoop have an associated cost and a practical performance limit beyond which real-time, high velocity distributed data management becomes unfeasible.

Held at Mission Bay Conference Center at UCSF, and now in its sixth year, GigaOM Structure gathers data scientists, business leaders and investors interested to learn more about the emerging business opportunities for Cloud and Big Data. Structure 2013 will focus on how real-time business needs are shaping IT architectures, hyper-distributed infrastructure and creating a cloud that will look completely different from everything that’s come before. Damian will present The Total Cost of Performance in a Massively Connected World at 4:55PM PST on Wednesday, June 19.

Damian Black was selected for his pioneering work in real-time and Big Data software and his commitment to reducing latency-to-action for business operations. Black has participated in many GigaOM events, including the first ever Big Data panel session at GigaOM Structure 2008. “GigaOM has its finger on the pulse yet again with the focus on addressing real-time business needs from Big Data,” said Damian Black, SQLstream CEO. “The drive towards high velocity data coupled with low latency actions marks a new frontier for Big Data, one where the cost of performance is driving streaming data technology to the fore.”

If you are a member of the press or analyst community and are interested in setting up a meeting with SQLstream at GigaOM Structure, please contact Ronnie Beggs via email at ronnie.beggs@sqlstream.com.

About SQLstream

SQLstream (www.sqlstream.com) is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put Big Data on Tap™ – enabling businesses to harness action-oriented analytics, with on-the-fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core data streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using continuous SQL, with support for streaming SQL query execution on Hadoop and other enterprise database, data warehouse and data management platforms. SQLstream’s headquarters are in San Francisco, CA.

Media Contact

Ronnie Beggs, VP Marketing

Company Name |  SQLstream, Inc.
Contact E-mail | pr@sqlstream.com
Website URL | www.sqlstream.com

Resources

GigaOM Structure agenda

GigaOM Structure registration

SQLstream GigaOM Structure:Data 2013 Damian Black interview

 

Posted under Events · Press Releases
June 17, 2013
UBM TECH 32 CONFERENCE

Ronnie Beggs Participating on “Big Data: Architecting Systems at Speed” Panel at E2 Conference, June 18th at 2:30PM ET, Boston

San Francisco, CA | June 17, 2013 – SQLstream Inc., the streaming Big Data platform, today announced that SQLstream has been invited to speak at E2 Conference on the architectural challenges of generating real-time operational intelligence from high-velocity Big Data streams.

The panel gathers industry experts to explore how real-time, high velocity Big Data streams can be processed rapidly in order to make meaningful decisions quickly. It has been assumed that volume is the fundamental problem presented by Big Data. It’s not. It’s data velocity and speed to action. The panel, led by esteemed analyst Johna Till Johnson of Nemertes Research, will analyze the challenges of rethinking how Big Data processing is architected.

Held at Boston Marriott Copley Place, E2 is the only conference focused entirely on the choices and challenges organizations face as a result of shifts in technology. A new event in the Big Data world, E2 offers decision makers in IT, data scientists, and business leaders a progressive forum where they can engage in sessions on the future of software, and explore how it can dictate the needed changes in corporate strategies.

“What I like about E2 Conference is its focus on the challenges that organizations face as a result of technology shifts, rather than the technology itself,” said Ronnie Beggs, SQLstream’s Vice President Marketing.  “Our panel is timely as well, given the wider focus on the cost of Big Data performance for low latency actions from high velocity machine data.”

If you are a member of the press or analyst community and are interested in setting up a meeting with SQLstream at E2 Conference, please contact Ronnie Beggs via email at ronnie.beggs@sqlstream.com.

About SQLstream

SQLstream (www.sqlstream.com) is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put Big Data on Tap™ – enabling businesses to harness action-oriented analytics, with on-the-fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core data streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using continuous SQL, with support for streaming SQL query execution on Hadoop and other enterprise database, data warehouse and data management platforms. SQLstream’s headquarters are in San Francisco, CA.

 

Media Contact

Ronnie Beggs, VP Marketing, Company Name |  SQLstream, Inc.

Phone: +1 415 758 8342
Contact E-mail | pr@sqlstream.com
Website URL | www.sqlstream.com

 

Resources

E2 agenda

E2 registration

Posted under Events · Press Releases

Vice President Marketing
June 11, 2013

Sensors Expo is the leading industry event for the types of intelligent sensor-integrated systems that are driving the next generation Internet of Everything, Industrial Internet, telematics and Machine-to-Machine services. As a key sponsor in the Big Data and Wireless Systems pavilion, and speaking in the Big Data track, we found the event a great litmus test for the current state of the sensor-related industries.

This year, the focus was very much on the monetization of intelligent sensors networks. This included how businesses and consumers can benefit from the Internet of Things, but also a focus on infrastructure costs, in particular, architectures for managing the high cost of wireless backhaul for large scale sensor data transmission.

The clues were there at last year’s event, where the focus was not so much real-time operational intelligence and monetization as the more fundamental issue of real-time streaming sensor data integration. The main question twelve months ago was how such huge volumes of data from many different sources and locations could be integrated into operational platforms, and at an affordable cost.

Drinking from the Sensor Firehose

The Internet of Everything is the new frontier for real-time Big Data management, where data velocity now trumps data volume as the primary driver, where the geographical distribution of streaming data adds new levels of complexity, and yet the useful lifetime of data, the window within which to make a decision, has decreased dramatically. As data management architectures evolve, each new technology looks to increase real-time visibility and reduce the latency to actions and decision-making.

Which is exactly the problem that SQLstream’s VP of Americas, Glenn Hout, presented in his session on Real-Time and Big Data: The Perfect Union for Large-Scale Sensor Network Management. SQLstream is a streaming Big Data management platform: sensor data are processed in real time as they are created, and answers are generated with sub-second latency. SQLstream also collects unstructured sensor data of any type, from any number of sources simultaneously, and can deliver aggregated, real-time intelligence to multiple external systems such as operational management platforms, existing data warehouses and Big Data storage platforms.

The Intelligent Edge

However, the issue of transmission backhaul cost is also addressed by SQLstream. SQLstream’s agent framework enables intelligent data collection agents to be deployed as close to the data source as possible. These agents offer filtering, aggregation and analytics capability at the edge of the network. Depending on the information content of the data, the volume, and therefore the cost, of real-time transmission can be reduced significantly.
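
A rough sketch of what such an edge agent might do, in illustrative Python rather than SQLstream's agent framework: pre-aggregate readings locally and forward only per-batch summaries plus immediate exceptions, so the backhaul carries far less than the raw feed. The batch size and alarm threshold are invented.

import random
import statistics

def edge_agent(readings, batch_size=100, alarm_threshold=80.0):
    """Reduce a raw sensor feed to per-batch summaries plus immediate alarms."""
    batch = []
    for value in readings:
        if value > alarm_threshold:
            yield {"type": "alarm", "value": value}   # forward exceptions at once
        batch.append(value)
        if len(batch) == batch_size:
            yield {"type": "summary", "count": len(batch),
                   "mean": statistics.mean(batch), "max": max(batch)}
            batch = []

# Invented feed: 1,000 raw readings collapse to ~10 summaries plus any alarms.
raw = (random.gauss(50, 10) for _ in range(1000))
for message in edge_agent(raw):
    print(message)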

So the future is bright; the future is real-time operational intelligence. And achieving that requires the next generation of streaming data management and true real-time operational intelligence platforms.

June 6, 2013

dbta_logo

SQLstream was named as one of the top 100 companies that provide technologies and services for data management by the Database Trends and Applications (DBTA) magazine.

A result of lengthy research and compiled by the publication’s editorial staff, the first “DBTA 100: The Companies That Matter Most in Data” list recognizes industry leaders that are helping organizations derive tangible value and insight from Big Data.

Read more, HERE.

Posted under In the News
June 5, 2013

InfoArmor

InfoArmor and SQLstream Team Up to Deliver the Next Generation of Continuous Identity Monitoring Services at Massive Scale

SAN FRANCISCO – JUNE 5TH, 2013 – SQLstream, Inc., the Streaming Big Data Company, today announced that InfoArmor, a U.S. leader in identity theft monitoring and internet surveillance, selected SQLstream as the real-time engine for the next generation of its identity monitoring and surveillance platform. SQLstream’s s-Streaming products enable InfoArmor to meet the rapid growth in demand for its real-time identity monitoring services. SQLstream was chosen for its capacity to capture, process and integrate in real-time high volumes of unstructured data coming from a large variety of sources, its massive scalability for high velocity real-time operational intelligence, and its simple, fast deployment.

Identity theft in the U.S. alone claimed 12.6 million victims in 2012, a rise of over 1 million from 2011, resulting in more than $21 billion in damages. InfoArmor’s identity protection services are helping detect and correct identity theft before any damage is done or costs incurred. Real-time identity theft monitoring has complex, rules-based alerting intelligence that must be applied over a large number of different data sources simultaneously.

 “We needed a real-time operational intelligence platform that could scale to new levels of data acquisition, conditioning, analytics and alert delivery,” said Christian Lees, CTO of InfoArmor. “We evaluated building our own and explored other vendors, but chose SQLstream because they met our requirements entirely and they provided the only 100% ISO ANSI/SQL standards-based streaming platform. That enabled us massive scalability, a very fast deployment and a highly competitive TCO.”

The system captures and parses multiple data feeds in different XML-based formats. SQLstream parses and conditions the data feeds on the fly, and applies a sophisticated rules catalog across all data, delivering real-time alerts to consumers through SMS when rules and specific combinations of rules are breached.
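
That description suggests a pipeline along the following lines. This is a hedged Python sketch, not InfoArmor's or SQLstream's implementation: parse each incoming XML record, flatten the fields of interest, and evaluate a small rules catalog that emits an alert whenever a rule fires. The XML layout, the rule, and the alert hand-off are all invented for illustration.

import xml.etree.ElementTree as ET

# Invented rules catalog: each rule is a name plus a predicate over a flat record.
RULES = [
    ("ssn_seen_with_new_address",
     lambda r: r.get("field") == "ssn" and r.get("address_changed") == "true"),
]

def parse_feed(xml_text):
    """Turn one XML feed document into flat records (layout is hypothetical)."""
    for event in ET.fromstring(xml_text).iter("event"):
        yield {child.tag: (child.text or "").strip() for child in event}

def alerts(records):
    """Apply every rule to every record and emit an alert when one fires."""
    for record in records:
        for name, predicate in RULES:
            if predicate(record):
                yield {"rule": name, "record": record}   # stand-in for an SMS alert

sample = """<feed>
  <event><field>ssn</field><address_changed>true</address_changed></event>
  <event><field>email</field><address_changed>false</address_changed></event>
</feed>"""

for alert in alerts(parse_feed(sample)):
    print(alert)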

“We’re delighted to have been chosen by InfoArmor for such a challenging operational Big Data requirement,” said Damian Black, CEO of SQLstream. “Seeing SQLstream making a real difference to people’s lives is a validation of our approach to operational intelligence and predictive analytics.”

Learn more about SQLstream’s security solutions today.

About InfoArmor

InfoArmor is a leader in offering identity theft protection and privacy management solutions to businesses. By focusing on employee benefits, InfoArmor provides a turn-key, value-adding benefit with easy employer administration. Rather than solely focusing on credit fraud, InfoArmor uses industrial strength technologies to enable consumers and businesses alike to monitor and identify fraudulent activity. For more information, visit http://www.infoarmor.com.

Posted under Customer win · In the News · Press Releases
May 30, 2013
logo_sensors_0

San Francisco, CA | May 30, 2013 – SQLstream Inc., the Streaming Big Data Company, announced today that SQLstream VP Americas, Glenn Hout, has been invited to speak on real-time operational intelligence and prescriptive analytics for the Internet of Everything at Sensors Expo 2013, Jun 4-6.

Held in Rosemont, Illinois, and now in its 27th year, Sensors Expo is the leading industry event in its class and gathers the world’s top engineers, scientists and business leaders involved in the development and deployment of sensor networks and systems. Areas of focus include sensor Big Data analytics, wireless sensor networks for Machine-to-Machine services, telematics and smart energy.

Glenn Hout is presenting “Real-Time and Big Data: the Perfect Union for Large-Scale Sensor Network Management” at 10:10AM CT on Thursday, June 6th.  SQLstream was selected to address the Sensor Expo audience in the Big Data and Analytics track based on its pioneering work in real-time operational intelligence and streaming integration of prescriptive analytics for industrial sensor, telecommunications, intelligent transportation, telematics and M2M networks and services.

“It’s always a pleasure to engage with an audience so familiar with the practical implications of Big Data,” said Glenn Hout, SQLstream VP Americas. “The Internet of Everything is the new frontier for Big Data, and SQLstream’s prescriptive analytics engine is the only platform able to meet its real-time requirements. Scaling up to handle sensor data velocities of millions of records per second with ultra-low latency response is the prerequisite for generating and acting on operational intelligence.”

If you are a member of the press or analyst community and are interested in setting up a meeting with SQLstream at Sensors Expo, please contact Ronnie Beggs via email at ronnie.beggs@sqlstream.com.


Vice President Marketing
May 16, 2013

The Internet of Everything is the new frontier for real-time and Big Data, where Velocity now trumps Volume as the primary driver, where the geographical distribution of streaming data adds new levels of complexity, and yet the useful lifetime of data, the window within which to make a decision, has decreased dramatically.

Industries such as telecommunications, telematics, M2M and the Industrial Internet are starting to generate high velocity data streams at rates of millions of records per second. Gaining actionable insight from data of this magnitude and speed may be technically feasible to a point for the elastic scale-out architectures of the leading Big Data and Cloud technologies, but at what point does it cease to be financially viable?

The Cost of Real-time for Big Data

The Technology Tipping Point for Real-time Operational Intelligence

As data management architectures evolve, each new technology looks to increase real-time visibility and reduce the latency to action and decision-making. Yet each technology has an associated cost and a practical limit beyond which real-time, distributed data management becomes commercially infeasible and then technically impossible.

Big Data technologies emerged due to the combination of lower cost commodity hardware and Cloud infrastructure, coupled with the cost and the technical challenges of processing massive volumes of unstructured machine data (server logs, sensors and IP network data, for example) rapidly using conventional “store-first, query-second” relational database technology. This led to a charge towards new elastic scale-out processing frameworks based on HDFS and Map-Reduce. However, as Hadoop-based and other similar technologies mature, the true cost of real-time Big Data on Hadoop is starting to emerge.

For low latency operational intelligence applications, Big Data storage-based solutions are proving to be costly and complex to deploy, and technically challenging to scale for high velocity / low latency performance. Furthermore, today’s Cloud multi-tenant and shared infrastructures introduce unacceptable latency to real-time business analytics, and the cost ramps up significantly as ingress and egress data velocities increase.

The ROI and technical tipping points for storage-based architectures (both Big Data and traditional RDBMS) sit very much at the low end of the scale, typically in the order of 10,000 records per second. This is sufficient to process a Twitter firehose at full speed, but falls significantly short for industry applications for real-time operational intelligence. For example:

  • Operational performance & QoS monitoring in a 4G cellular network generates cell and call detail information at many tens of millions of events per second.
  • Telematics applications are generating data rates of between 5 million and 20 million records per second for a small installation.
  • For IP service networks, IP probe data regularly exceeds data rates of 10 million records per second.

These data rates are well within the technical capabilities of SQLstream’s Big Data streaming platform, and importantly, SQLstream has a proven ROI for high velocity data applications. That’s not to say in-stream analytics is the only platform required. SQLstream deploys Hadoop HBase as its default stream persistence platform, and a typical architecture uses SQLstream for the in-stream analytics, with continuous feeds of raw data and operational intelligence flowing from SQLstream to existing operational control and data warehouse platforms. The resulting architecture offers an attractive overall ROI and eliminates the technical tipping points for low latency, high velocity data management.

More on the cost of real-time performance in a Big Data world to follow as part of this series in subsequent blogs.


CEO
April 21, 2013

SQLstream sponsored the recent IE Group Big Data Innovation Summit in San Francisco, where I also presented on streaming SQL for Hadoop and extending Hadoop for real-time operational intelligence. As Big Data technologies and Hadoop push further into mainstream enterprises, the need for real-time business operations is an important parallel trend. ‘Real-time’ and ‘Hadoop’ had been considered synonymous by some, yet people are often surprised when Hadoop turns out not to be as real-time as they had hoped. This should not come as a surprise: Hadoop has many strengths, but it was never intended for low latency, real-time analytics over high velocity data.


Click to View Damian’s Presentation on Slideshare

Real-time Big Data or ‘real-time’ Big Data?

Which raises the question: what do we mean by real-time? Many products have emerged that claim ‘real-time’ analytics over Hadoop. Yet Hadoop remains a batch processing framework, and it struggles to deliver low latency analytics against high velocity streaming data for the same reasons as existing RDBMS-based data management platforms. These ‘real-time’ products may generate rapid results over the stored data, but they ignore the latency introduced by data collection and storage, and they also ignore the resource load of repeatedly executing queries to process newly arriving data. The latency issue may not be apparent for slower data streams, such as Twitter feeds, but at the machine data rates found in telecommunications, industrial automation, M2M and large scale security intelligence, the problem rapidly becomes extreme.

SQLstream’s core stream computing platform, s-Server, processes high velocity data as soon as it is generated, executing continuous SQL queries and analytics directly over log files, sensor feeds and any other machine-generated data source. We measure real-time from the time of data creation, completely eliminating the latency introduced by collecting, storing and repeatedly re-querying the data.
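
As a minimal sketch of what measuring from the time of data creation looks like in practice, the continuous query below filters a log stream as records arrive. The stream and column names are illustrative only, not taken from a specific deployment.

    -- Illustrative continuous query over a log stream: results are
    -- emitted as each record arrives, with no store-then-query cycle.
    SELECT STREAM l.ROWTIME, l.host, l.severity, l.message
    FROM "ServerLogs" AS l
    WHERE l.severity IN ('ERROR', 'FATAL');

Unlike a stored query, this statement never completes; it keeps producing rows for as long as the source keeps streaming.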

Drive real-time actions with streaming operational intelligence

We discussed in a previous blog how real-time operational intelligence eliminates the chasm between business operations and analytics. Operational intelligence is about more than the collection and analysis of log file and machine-generated data. One of the advantages of stream computing is the ease with which predictive analytics can be applied over multiple data streams. This makes it possible to alert on time and space-based patterns of machine, user and consumer behavior that are predictors of some future event – a security breach, network failure or service fault.


True operational intelligence platforms need to go one step further: it is not enough to visualize results on a dashboard. They must connect back to application and operational systems and drive automated updates, so that security breaches can be avoided, network resilience mechanisms activated, and service faults corrected before SLA breaches occur and customers are aware of the problem.

Real-time operational intelligence on Hadoop

So what does this mean for Hadoop? Streaming is not a new technology, but streaming platforms to date have focussed on single-source problems and have been deployed as standalone platforms for low velocity use cases. With SQLstream, standard SQL queries, albeit continuously executing ones, join, group, partition and analyze real-time machine data streams. There is a further difference: SQLstream’s s-Server streaming SQL platform can also be deployed as a streaming SQL query extension for Hadoop.

A number of streaming Hadoop scenarios are supported:

  • Stream persistence – Hadoop HBase as an active archive for streaming data and derived intelligence using the Flume API. SQLstream also performs continuous aggregation to support high velocity streams without data loss.
  • Stream replay – restream the complete history of persisted streams from HBase for ‘fast forwarding’ of time-based and spatial analytics. Various interfaces can be utilized, including Cloudera’s Impala.
  • Streaming data queries – joining streaming real-time data with historical streams and intelligence persisted in HBase (see the sketch below).
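
To illustrate the third scenario, the sketch below joins a live stream against history previously persisted to HBase, assuming that history has been exposed as a queryable table (for example via Hive or Impala). All names are hypothetical and the syntax is indicative rather than exact.

    -- Illustrative stream-to-history join: enrich live readings with a
    -- baseline persisted earlier, and flag significant deviations.
    SELECT STREAM s.ROWTIME, s.sensorId, s.reading, b.avgReading AS baseline
    FROM "SensorReadings" AS s
    JOIN "SensorBaselines" AS b
      ON s.sensorId = b.sensorId
    WHERE s.reading > 1.5 * b.avgReading;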

Making the Elephant fly

Accelerating Hadoop to process live, high velocity unstructured data streams delivers the low latency, streaming operational intelligence demanded by today’s real-time businesses. Hadoop has been the driving force behind Big Data analytics, but as the technology hits the mainstream, many industries are seeking to go a step further and eliminate latency from their business completely. With the SQL language emerging as the key enabler for the mainstream adoption of Hadoop, executing streaming SQL queries over Hadoop extends the platform out to the edge of the network, making it possible to query unstructured log file, sensor and network machine data sources on the fly and in real-time.

Posted under Big Data · Events
April 17, 2013

Organizations that deploy Big Data tools to analyze operational data soon learn staff’s expertise in SQL isn’t always helpful. New skills need to be learned.

Distinguished analyst Dan Kusnetzky of ZDNet discusses how SQLstream technology makes the SQL skills already found in most organizations useful for the analysis of streaming operational data.

Read full article HERE.

Posted under Analyst review · In the News
April 9, 2013

Unlike traditional relational database management systems (RDBMS), which persist data that can later be queried, SQLstream runs continuous queries on targeted data streams – enabling a real-time view of an organization’s situation.

Eric Kavanagh recaps SQLstream’s most recent session in The Briefing Room, held alongside analysts Mark Madsen and Robin Bloor – read the full summary here.

Posted under Analyst review · In the News
March 26, 2013

SQL or NoSQL? In-memory or hard disks? Graph? These questions have been top of mind in recent years as developers and IT administrators check out new-age databases capable of handling scale-out data sets. SQLstream CEO and Founder Damian Black, together with three other executives from new data management technology companies, showed how they stand out in a hot market at GigaOM’s Structure:Data.

Read full article and watch full recording on gigaom.com HERE.

Posted under In the News

CEO
March 20, 2013
  • SQLstream s-Streaming Big Data Engine Benchmarks at 1.35 Million Streaming Events Per Second per 4-core server  – Outperforming Twitter’s Storm Stream Computation Project with Significant Overall TCO Advantage

New York, NY | March 20, 2013 – SQLstream Inc., the Streaming Big Data Company, announced today at GigaOM’s Structure:Data the results of an independent performance benchmark, which measured the SQLstream s-Server 3.0 Big Data Engine processing 1.35 million 1Kbyte records per second per 4-core commodity server, outperforming a comparable configuration based on the Twitter Storm distributed real-time computation system. SQLstream’s s-Server outperformed the Storm-based solution by a factor of 15x.

SQLstream’s s-Streaming Big Data Engine delivers action-oriented analytics, extracting operational intelligence in real-time from high velocity, unstructured log file, sensor and other machine-generated data. Streaming intelligence can be persisted, queried and replayed in Hadoop, with additional connectors to all major storage platforms and data warehouses.

The streaming Big Data benchmark was conducted by a large enterprise with a roadmap to stream unstructured operational data from multiple remote log and machine data flows at up to 10 million records per second for each installation. The benchmark requirement was to perform advanced time-series analytics over mobile network infrastructure records in order to predict potential service-impacting problems. The benchmark projected that the s-Server platform would require just eight servers to scale up to 10 million records per second, versus an estimated 110-plus servers for the comparable Storm approach.

SQLstream s-Server 3.0 was able to demonstrate significant cost savings with dramatically lower TCO. The TCO savings came from a combination of reduced hardware and power consumption, the power and simplicity of SQL over low-level Java development, plus reduced maintenance requirements. Other factors influencing SQLstream s-Server’s TCO advantage came from its integrated Big Data platform architecture, ability to update on the fly as new data flows are incorporated, significantly faster implementation timescales using SQL for streaming analytics and integration, and automatic platform optimization for turbo-charged performance and parallel dataflow execution.

“SQLstream excels through the combination of its mature, industry-strength streaming Big Data platform, our support for standard SQL (SQL:2008) for streaming analysis and integration, plus a flexible adapter and agent architecture,” said SQLstream CEO Damian Black. “SQLstream s-Server is today’s clear streaming performance winner – with blazingly fast throughput, an ability to handle a wide variety of message types, sources and formats, and an efficient Streaming Data Protocol with compact optimized binary data formats.”

Advantages of SQLstream’s s-Server, the core element of the company’s s-Streaming Big Data Engine, as demonstrated in the performance benchmark project include:

  • Scaling to a throughput of 1.35 million 1Kbyte records per second per four-core server each fed by twenty remote streaming agents.
  • Expressiveness of the standards-based streaming SQL language with support for enhanced streaming User Defined Functions and User Defined Extensions (UDF/UDX).
  • Deploying new processing analytics pipelines on the fly without having to stop and recompile or rebuild applications.
  • Advanced pipeline operations including data enrichment, sliding time windows, external data storage platform read and write, and other advanced time-series analytics, all based on existing SQL standards (a sliding-window sketch follows this list).
  • Advanced memory management, with query optimization and execution environments to utilize and recover memory efficiently.
  • Higher throughput and performance per server for lower hardware requirements, lower costs and simple to maintain installations.
  • Proven and mature enterprise-grade product with a validated roadmap and controlled release schedule.
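
As a sketch of the sliding time window operations listed above, the query below maintains a one-minute moving average per cell over a stream of network events. The stream and column names are invented for illustration, and the precise windowing syntax depends on the s-Server release.

    -- Illustrative sliding-window analytic: a continuously updated
    -- one-minute moving average of latency, partitioned by cell.
    SELECT STREAM
      r.ROWTIME,
      r.cellId,
      AVG(r.latencyMs) OVER (
        PARTITION BY r.cellId
        ORDER BY r.ROWTIME
        RANGE INTERVAL '1' MINUTE PRECEDING
      ) AS avgLatency1m
    FROM "NetworkEvents" AS r;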

All modules used in the benchmark were integrated with s-Server 3.0, with 20 remote streaming agents connected to each SQLstream s-Server instance, each instance running on a four-core Intel® Xeon® server with Red Hat Enterprise Linux.

 

About SQLstream

SQLstream (www.sqlstream.com) is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put “Big Data on Tap”™ – enabling businesses to harness action-oriented and predictive analytics, with on-the-fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core V5 streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using standards-based SQL, with support for streaming SQL query execution over Hadoop/HBase, Oracle, IBM, and other enterprise database, data warehouse and data management systems. SQLstream’s headquarters are in San Francisco, CA.

Posted under Events · Press Releases
March 18, 2013
  • “Four For The Future: Upcoming Database Technologies” Session to be Held at GigaOM Structure:Data on March 21st at 4:40PM ET

San Francisco, CA | March 14, 2013 – SQLstream Inc., a pioneer of the streaming Big Data engine for real-time operational intelligence, today announced that SQLstream CEO Damian Black will speak about innovative real-time Big Data technologies at GigaOM’s Structure:Data 2013, March 20-21.

Held on Pier Sixty, The Chelsea Piers in New York, and now in its third year, Structure:Data gathers data scientists, business leaders and investors interested in learning more about the emerging business opportunities for Cloud and Big Data. Damian will speak on the invitation-only panel session Four For The Future: Upcoming Database Technologies at 4:40PM ET on Thursday, March 21.

Damian will be discussing the emergence of SQL for Big Data, and more specifically, SQL as a streaming language for low-latency, real-time operational intelligence in a Hadoop environment. Damian was selected for the panel because of his pioneering work in real-time and Big Data software. SQLstream’s s-Streaming products transform log, sensor and other machine-generated data into streaming operational intelligence in real-time. Its core streaming Big Data architecture is built on a massively parallel platform for processing high velocity machine data. Streaming intelligence can be persisted, queried and replayed in Hadoop, with additional connectors to all major storage platforms and enterprise data warehouses if required.

SQLstream CEO Damian Black has participated in many GigaOM events, including the first ever Big Data panel session at GigaOM Structure 2008. “It’s always a pleasure to speak at a GigaOM event and to engage with an audience at the forefront of Big Data,” said Damian Black. “Real-time is the next wave of Big Data, and SQLstream is the future of true real-time business decisions. Real-time is only a streaming SQL query away.”

If you are a member of the press or analyst community and are interested in setting up a meeting with SQLstream at GigaOM Structure:Data, please contact Ronnie Beggs via email at ronnie.beggs@sqlstream.com.

About SQLstream

SQLstream (www.sqlstream.com) is the pioneer and innovator of a patented Streaming Big Data Engine that unlocks the real-time value of high-velocity unstructured machine data. SQLstream’s s-Streaming products put “Big Data on Tap”™ – enabling businesses to harness action-oriented analytics, with on-the-fly visualization and streaming operational intelligence from their log file, sensor, network and device data. SQLstream’s core V5 data streaming technology is a massively scalable, distributed platform for analyzing unstructured Big Data streams using continuous SQL, with support for streaming SQL query execution on Hadoop and other enterprise database, data warehouse and data management platforms. SQLstream’s headquarters are in San Francisco, CA.

 

Resources

GigaOM Structure:Data 2013 agenda

GigaOM Structure:Data 2013 registration

SQLstream GigaOM Structure:Data 2013 Damian Black interview

Posted under Press Releases
February 26, 2013

In Part One of the Hadoop 2013 series, Merv Adrian points out how significant attention is being lavished on performance in 2013. In this second installment, the topic is projects, which are proliferating precipitously. One of the most frequent client inquiries is “which of these pieces make Hadoop?” As recently as a year ago, the question was pretty simple for most people: MapReduce, HDFS, maybe Sqoop and even Flume, Hive, Pig, HBase, Lucene/Solr, Oozie, Zookeeper. When the Gartner piece How to Choose the Right Apache Hadoop Distribution was published, that was pretty much it.

Since then, more projects have matured, and SQLstream is now part of the shortlist.

Read full article on gartner.com HERE. 

Hadoop 2013- Projects

Posted under Analyst review · In the News · Market Views

Vice President Marketing
February 20, 2013

Developed to Meet Demand from Universities and Supercomputing Centers

San Francisco, CA | February 20, 2013 – SQLstream, a pioneer of real-time Big Data, today announced an extension to its partner program that helps to fulfill the growing need for real-time data processing systems in supercomputing centers and academic cyberinfrastructure facilities.

The SQLstream University Partner Program (UPP) addresses educational institutions looking to expand research into streaming Big Data concepts by offering a non-commercial, royalty-free license to its market leading real-time analytics platform.

The Cornell University Center for Advanced Computing (CAC) headlines the growing list of UPP organizations using SQLstream’s streaming analytics platform to gain real-time insight into massive volumes of data. CAC is one of 17 partner institutions in the Extreme Science and Engineering Discovery Environment (XSEDE), a single virtual system that scientists can use to interactively share computing resources, data and expertise.

“Too many computing systems producing too many logs too quickly are unmanageable,” says Lucia Walle of Cornell University CAC. “SQLstream provided the real-time data mining system compatible with our local business logic that allows CAC to find events of importance using Sisyphus and queries, accurately simulate logs including the scale and pace of log activity, and monitor both simulated and actual system logs so that we can gain reliable insight into the cause and imminence of system failures before they actually occur.”

All UPP members have access to the full suite of SQLstream server technologies and to engineering support from SQLstream. Benefits of the SQLstream University Partner Program include:

  • Access to a non-commercial, royalty-free license to the SQLstream technology;
  • Access to a library of use cases and analytic algorithms;
  • Discounted training and kick-off assistance to research projects;
  • Discounted commercial license available for those Universities desiring to implement SQLstream in a production environment.

“One of our main goals as an organization is to accumulate business intelligence by adding research partners that can challenge and complement our existing solutions,” said Chris Clabaugh, VP of Business Development at SQLstream. “Streaming analytics and Big Data is seeing a lot of demand in the commercial and research space right now, so we are thrilled to be the streaming analytics technology supplier to university supercomputing centers.”

To learn more about UPP, please visit http://www.sqlstream.com/university-partner-program/.

 

SQLstream Inc. (www.sqlstream.com) makes systems responsive to real-time operational Big Data. SQLstream enables organizations to query their log, sensor and service data directly, and to share streaming operational intelligence with external systems, continuously and in real-time. SQLstream is built on a standards-based, distributed and massively parallel architecture, and uses industry standard SQL for the rapid analysis of high volume, real-time data streams. Standards mean lower costs, proven performance and seamless integration. Headquartered in San Francisco, CA, SQLstream is transforming the world of real-time, Big Data stream management.

Posted under Press Releases

Vice President Marketing
January 31, 2013

Today’s edition of Information Management’s DM Radio Broadcast, The Future of Integration: ETL, CDC and IOA, had a great panel line-up discussing the breadth of data integration issues in today’s world of Big Data, Cloud and traditional enterprise architectures. The session was hosted by Eric Kavanagh (Bloor) and Jim Ericsson (ex-Information Management – today was his last DM Radio after 5 years at the helm), and supported by Philip Russom (TDWI). SQLstream’s Damian Black was on the panel, with representatives from Denodo and Dell.

Extending the reach of Hadoop to the edge of your business.

A key observation emerged from the discussion: the world of ETL and data integration is changing with the need for lower latency business operations. Most businesses are now global; there is no longer a window of opportunity for batched data management processes. Exploiting streaming data enables organizations to extend the reach of their Big Data storage platforms and data warehouses out to the edge of their business, connecting these platforms directly to the data sources. Streaming data can be filtered, aggregated and integrated with minimal latency, and real-time operational intelligence extracted as the data streams past. This means businesses can be responsive to their real-time data without having to wait for data to be stored and processed.

Blurring the boundary between analytics and operations.

The emergence of low latency business operations is also blurring the boundary between traditional BI analytics and the world of business operations. Operational intelligence is the application of analytical techniques in real-time. The two are no longer distinct business functions, but rather a seamless continuum from streaming data to real-time business value. The benefit is that businesses can now extract the information they need in real-time from arriving data streams, while still populating existing Big Data storage platforms and data warehouses, enabling existing business processes to function as normal.

The rising power of SQL queries over Hadoop.

Integrating data between systems implies a degree of structure, a common understanding of sink and source formats. The majority of integration platforms today assume SQL as the data management language. However, SQL as a language does not necessarily mean that the underlying data store has to be relational. And this is the game changer for Hadoop that will open up the existing enterprise market – the emergence of a true standards-based SQL query interface. Cloudera is already some way down this road with Impala.

So what’s the answer: SQLstream = Real-time data integration + operational intelligence + SQL.


Vice President Marketing
January 24, 2013

I was listening to the BBC News on the Internet, and the technology section had two experts explaining Big Data to the layman. After some mutterings about it being ‘very large’ and ‘useful in genomics’, most listeners would have been left none the wiser. The BBC’s Science Editor, Susan Watts, had a bit of a better crack at it here. Our industry obviously has a way to go in explaining itself to the wider world.

Susan Watts’ BBC article on Big Data

Tackling the idea of ‘very large’ first: how big is Big Data? It turns out it isn’t that big, at least not yet. EMA Research recently published a report that debunked the myth of Big Data starting at petabytes. Authored by John Myers and Shawn Rogers, the report suggests that Big Data starts at 110GB, with typical use cases between 10 and 30TB.

But even at that size, why is the Big Data market still forecast to grow dramatically, with IDC predicting the market to be $24B by 2016? Surely this can’t all be put down to market hype and engineers’ excitement. Well, the answer lies in the ‘useful in genomics’. Not everyone has data of that scale or complexity, but most want to extract value from their data much more quickly and often in real-time. Big Data technology is aimed at making that easier, by removing many of the constraints of traditional data management and processing. For example, removing fixed schemas, processing all data, and generating answers on the fly, all with low cost, commodity hardware.

So Big Data Analytics is the killer app? Absolutely. Unfortunately, with today’s first generation Big Data platforms such as Hadoop, HDFS and even HBase, many are still struggling to analyze their data at scale and in real-time. That’s where technologies such as streaming, and the (re-)emergence of SQL as a Big Data processing language, are key. For example, Cloudera’s Impala is the first full frontal attack on the mature Big Data market for larger enterprises. Real-time and SQL will help us bridge not only the technology gap, but also the ability to explain in layman’s terms how Big Data really benefits the person in the street.

Posted under Big Data · Big Data Analytics

Vice President Marketing
January 17, 2013

Bloor Group’s Robin Bloor hosted SQLstream’s CEO Damian in The Briefing Room on Tuesday January 8th 2013. The webcast, entitled “Windows of Opportunity: Big Data on Tap” focussed on the emergence of both SQL and the streaming data platform as a key enabler for real-time Big Data solutions in an ever-maturing marketplace. You can watch the full webinar from the link below, but I’m going to focus on some of the topics arising from the online discussion between Robin, Damian and the audience.

It was an interesting discussion, covering Big Data and streaming in the wider context of enterprise deployments, but a number of important points were raised:

Hadoop is a data reservoir, not a real-time platform.

Many believe incorrectly that Hadoop is a platform for real-time low latency analytics. It’s not. Hadoop is a multi-purpose engine but not a real-time, high performance engine. The parallelism of Hadoop is great for processing the data once it’s stored, but end-to-end latency is high. However, with the integration of a streaming data platform for continuous data collection, analysis and streaming integration, Hadoop can be used as the active archive for a true real-time, streaming Big Data system.

Operational intelligence needs a Streaming Big Data Platform

The bulk of real-time operational intelligence today is derived from log and machine data – data generated by the Internet, Cloud infrastructure and applications, for example. There are many log monitoring tools out there, and while they are very capable, we’re finding that SQLstream’s real-time streaming Big Data platform is being used to solve the high volume, high velocity, complex data problems that log monitoring tools are unable to address at an affordable price point.

The emergence of SQL for Big Data

The first phase of Hadoop and Big Data platforms saw the emergence of NoSQL data storage platforms, looking to overcome the rigidity of normalized RDBMS schemas. However, as the technology hits mainstream industry, the need for simpler, high performance and reliable queries is driving a resurgence of SQL as the de facto language for Big Data processing (see Cloudera Impala for example). What is less widely appreciated is that SQL is also the ideal language for processing data streams using real-time, windows-based queries. The issue with normalization and rigid schemas is a non-issue for a streaming data platform – there are no tables, and no data gets stored!
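
As a sketch of a windows-based streaming query with no tables involved, the example below counts errors per host in one-minute buckets as the data flows past. The names are illustrative, and window syntax varies between streaming SQL implementations.

    -- Illustrative tumbling-window aggregate: per-minute error counts,
    -- emitted as each window closes, with nothing stored on disk.
    SELECT STREAM
      FLOOR(l.ROWTIME TO MINUTE) AS minuteStart,
      l.host,
      COUNT(*) AS errorCount
    FROM "ServerLogs" AS l
    WHERE l.severity = 'ERROR'
    GROUP BY FLOOR(l.ROWTIME TO MINUTE), l.host;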

So in summary, streaming Big Data is the emerging technology for 2013, and SQL is the (re-)emerging technology as Big Data hits mainstream industry. Processing real-time log and machine data streams is a key requirement today, but industries with sensor, M2M and telematics applications are catching up fast.

 

Posted under Big Data · Events
January 15, 2013

Dylan Janus from Information Today talks about the latest SQLstream release, 3.0. With improved speed and scalability, the flagship product introduces high performance distributed stream processing, Google BigQuery integration, and enhanced platform manageability and streaming application development.

“We have improved ways of processing machine data, so you can process much higher volumes and much more flexible and resilient processing of log file data,” Damian Black, CEO of SQLstream, tells 5 Minute Briefing.

Read the full article on DBTA.com here. 

Posted under Big Data · In the News

Vice President Marketing
January 10, 2013

The SQLstream Briefing Room webinar with Robin Bloor took place on Tuesday January 8th 2013. “Windows of Opportunity: Big Data on Tap” highlighted how the evolving Big Data landscape needs technologies that enable a much bigger enterprise-wide picture, complete with multiple data streams that can be combined to show what’s happening in real-time. The speakers included:

  • Eric Kavanagh, CEO, The Bloor Group, who hosted the event.
  • Robin Bloor, Chief Analyst, The Bloor Group, who led the online briefing
  • Damian Black, President & CEO, SQLstream, discussing the emergence of streaming Big Data management as a key enabler for Big Data solutions, and how SQLstream is at the forefront of streaming innovation.

This was a very interesting and informative briefing on the emergence of streaming Big Data management and the use cases for real-time Big Data solutions.

Click here to watch the webcast …

Title: Windows of Opportunity: Big Data on Tap
Most business opportunities are moving targets these days, rendering static analytical solutions rather ineffective. Instead, organizations need technologies that enable a much bigger picture, complete with multiple data streams that can be combined to show what’s happening in real-time. And increasingly, companies need to analyze both traditional structured data as well as Big Data, including machine-generated data from all manner of enterprise systems.

Register for this episode of The Briefing Room to hear veteran Analyst Robin Bloor explain how a confluence of market forces has opened the door to a new analytical paradigm, one in which companies can leverage a vast array of data streams to pinpoint windows of opportunity as or even just before they appear. Bloor will be briefed by Damian Black of SQLstream, who will discuss his company’s analytical platform, which enables the management of dynamic information assets in much the way that traditional databases do for stored assets.

Posted under Events
January 2, 2013

Justin Rowlatt, BBC World Service, explores how Big Data is not just a whole lot more than the old “small” data, but a transformative technology that is set to change the way we do everything. From Formula 1 racing to targeted retail, the effective use of data is going to be a basis for competition going forward and a necessary addition to the toolkit of every company.

Why Big Data is so Big (BBC World Service Report)

Damian Black, SQLstream CEO, joins Michael Chui of management consultants McKinsey in a talk about how the torrent of data that digital technology is creating can be used, and explains why all this data could allow companies and governments to learn all sorts of secrets about us all (which is, it seems, not all that bad).

Broadcast on: BBC World Service

Duration: 18 minutes

Click HERE to listen

Posted under Analyst review · In the News
December 21, 2012

This evolution toward coexistence of the Hadoop framework, NoSQL and SQL approaches could mark a new step in big data’s maturation. As 2013 approaches, there is the possibility that big data may move from hot topic to practical reality.

SQL may have taken a punch or two in 2012, but it refused to go down for the count. Companies specializing in the alternative NoSQL and Hadoop side of things brushed up their SQL credentials this year. A prime example was Hadoop startup Cloudera Inc. It looked to enhance its SQL standing with Impala, a Hadoop software offering that supports interactive queries done in standard SQL.

Read the full article HERE

Posted under In the News

Vice President Marketing
December 20, 2012

I have recently researched IBM® InfoSphere Streams’ “Stream Processing Language”, or SPL, after a pretty good talk at a recent SVForum SIG group meeting. Early users of this technology (really PS projects disguised as license sales) found that the Version 2.0 Stream Processing Application Declarative Engine (SPADE) programming language had been replaced by the IBM Streams Processing Language (SPL).

I suppose the research labs either changed interns for the summer or the products team has had changes in management. More seriously, this is the natural outcome of productizing a customer software development project, paid for by the US Government, and then re-badging it as a COTS (commercial off-the-shelf) product. IBM has time and again demonstrated their mastery of this jiu-jitsu move. It is truly impressive.

What we have done is, I believe, actually far harder to build, yet far simpler for the real users concerned with looking at and analyzing massive flows of streaming big data. From the start, we had a fundamental belief that our customers, partners and end users would need a familiar programming syntax and language. We would do all the hard stuff “under the covers”, as it were, so as to hide the complexities of aggregating and analyzing time-stamped streaming data over some time frame of relevance. We made a bet some 10 years ago that the deterministic and concise characteristics of SQL would continue to be a dominant force in the market; yet we undertook the very hard work of making difficult streaming data management appear easy to our end users and partners.

So, as IBM continues to revise / debug / enhance its various versions of SPL based upon various PS-led projects in the field, we view that as a great thing – for SQLstream and our customers. Our standard remains constant.

Posted under Market Views

Vice President Marketing
December 13, 2012

New SQLstream s-Server 3.0 enables faster response to operational Big Data by tapping into streaming log file, sensor and service data in real-time.

SAN FRANCISCO, CA, December 13, 2012 – SQLstream Inc. (www.sqlstream.com), a pioneer of real-time Big Data, today delivered the new generation of its streaming Big Data management platform. SQLstream s-Server 3.0 is the fastest and most scalable release of the company’s flagship product, introducing high performance distributed stream processing, Google BigQuery integration, and enhanced platform manageability and streaming application development.

SQLstream’s Big Data on Tap™ platform architecture is built from the ground up for real-time, streaming Big Data applications. The new release, with its real-time data collection, transformation and sharing capabilities, enables businesses to respond even faster to their operational Big Data. SQLstream s-Server 3.0 queries log file, sensor and service data in real-time, joins and transforms data streams using only the standard SQL language, and shares results continuously to outputs such as Big Data storage platforms. Streaming SQL enables high volume, high velocity applications for both structured and unstructured data to be built rapidly without having to resort to low level code development.
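
The sketch below shows the general shape of such a query: a continuous SELECT transforms records from a source stream and feeds them into an output stream wired to a destination such as a Big Data store. The stream names and expressions are assumptions made for illustration; in s-Server this kind of continuous insert is typically managed as a pump, and the exact DDL varies by release.

    -- Illustrative continuous transform-and-share: records flow from a
    -- raw input stream into an output stream bound to an external sink.
    INSERT INTO "EnrichedEventsOut"
    SELECT STREAM
      e.ROWTIME,
      e.deviceId,
      UPPER(e.eventType) AS eventType,
      e.payloadBytes / 1024 AS payloadKb
    FROM "RawEvents" AS e
    WHERE e.payloadBytes IS NOT NULL;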

With s-Server 3.0, throughput performance is up to 10 times faster than that of the previous releases. The performance breakthrough is enabled by lock-free distributed processing for live data streams, a method pioneered by SQLstream and essential for real-time Big Data scalability. In addition, s-Server 3.0 brings faster integration, achieved through a new bi-directional connector for Google BigQuery, and faster development through new streaming SQL operators and added Windows support.

“Recent EMA research shows that Big Data environments consist of multiple platforms including structured (RDBMS) and multi-structured (Hadoop and other NoSQL) data stores – each one handling the processing that matches the strengths of the platform. This collection is called the Hybrid Data Ecosystem,” said John Myers, senior business intelligence and data warehousing analyst at Enterprise Management Associates. “Streaming Big Data solutions like SQLstream s-Server offer continuous integration with real-time analysis. This will be important in the area of operational intelligence analysis where low query latency is key.”

“SQLstream s-Server 3.0 addresses the operational Big Data problems that other log and network monitoring software solutions are unable to solve,” said Damian Black, SQLstream CEO. “Organizations want to be more responsive to their real-time Big Data, and that requires SQLstream’s low latency and raw performance, plus our ability to join and share data continuously between any data source and any destination storage platform.”

An early availability program for SQLstream s-Server 3.0 validated the new performance and integration capabilities across a range of industries including telecommunications, transportation, financial services and High Performance Computing (HPC). High performance log and machine data processing was a key requirement, met by SQLstream s-Server 3.0’s ability to detect real-time operational issues in high volume, high velocity data streams that were out of range for existing log monitoring systems.

“We had too many systems producing too many logs too quickly for any of our existing tools to process in real-time or otherwise,” says Lucia Walle, Cornell University Center for Advanced Computing. “SQLstream is the solution that scaled to monitor logs in real time for key patterns indicating imminent and undesirable conditions.”

To download a free trial and for more information about SQLstream s-Server 3.0, its technical capabilities, business benefits, and our Big Data on Tap™ approach, visit www.sqlstream.com/whats-new-in-3.

About SQLstream Inc.

SQLstream Inc. (www.sqlstream.com) makes systems responsive to real-time operational Big Data. SQLstream enables organizations to query their log, sensor and service data directly, and to share streaming operational intelligence with external systems, continuously and in real-time. SQLstream is built on a standards-based, distributed and massively parallel architecture, and uses industry standard SQL for the rapid analysis of high volume, real-time data streams. Standards mean lower costs, proven performance and seamless integration. SQLstream is headquartered in San Francisco, CA.

Posted under Press Releases
December 4, 2012

The move over recent years has been towards increasingly distributed processing of data, both in terms of the underlying model and in terms of the processing architectures available. More and more of this data is also streaming and SQLstream is focused on effective access to streaming big data …

Read the full article on Decision Management Solutions: http://jtonedm.com/2012/12/03/first-look-sqlstream/

 

 

Posted under In the News

Vice President Marketing
November 21, 2012

I have a difficult time restraining my impulse to interrupt customers, vendors, employees, the janitor etc., when I hear them refer to ‘real time data’ when what is meant is the ability to ingest massive amounts of data and to interpret or analyze this in a short amount of time. Invariably they will say, “Well, this is clearly a ‘real time’ data problem” or the like. In short, they have confused acceptable low latency with streaming flows of data.

As the character Luke (portrayed by Steve McQueen, who many moons ago drove Vic Hickey’s Baja Boot in one of our Baja 1000 excursions, the Baja Boot being the precursor to the Humvee – but that’s another story) says in the classic 1967 movie Cool Hand Luke, “What we’ve got here is failure to communicate”. I submit that ‘real time’ can never actually be achieved or determined; it is very much akin to the intersection of Brownian motion with Heisenberg’s Uncertainty Principle – assuming you can catch / read / interrogate something in ‘real time’, time has already passed and the mere action of observing the object has caused it to move away. That is, ‘real time’ can never be truly achieved or realized; an event is over by the time one tries to measure or quantify it. Low latency, or rather low latency within a time frame of relevance, is actually what these folks mean.

Streaming data, however, is constant, flowing, measurable, and accessible should one have the means to do so, SQLstream s-Server being a great tool for such purposes. Results from analysis of such streams are streams themselves, or can be quantized into discrete events which can then be used to trigger or visualize events of import, leading systems, administrators, executives, etc., to take appropriate actions. This needs to happen in a low latency time constraint; however, ‘low latency’ itself may have different meanings for different applications of these technologies.

Posted under Market Views

Vice President Marketing
November 16, 2012

Silicon Valley Comes to Oxford 2012 (SVCO 12) is an iconic entrepreneurship event now in its 12th year, that brings the most relevant technology entrepreneurs from Silicon Valley to Oxford, UK to engage with University of Oxford’s Saïd Business School students plus members of the wider Oxford entrepreneurial ecosystem. SQLstream CEO Damian Black is hosting and presenting sessions on both aspects – the trials and tribulations of getting a tech Big Data startup off the ground in Silicon Valley, plus insight into the current state and future potential of the Big Data market. There’s also the debate at the University of Oxford Student Union where Damian will be leading one side of the debate, but more on that in a separate blog.  Other speakers include senior representatives from Google, IBM and McKinsey. Click here for the full list of speakers.

What they don’t teach you in Business School about creating and building a start-up company.

Damian’s workshop on tech startups focuses on a real-world perspective that is often overlooked in Business Schools. Most are aware of the importance of developing the right ideas and the right business plans, building the right team and attracting the right capital. These are of course all useful steps, but the reality is often very different.

Damian is leading a workshop that will cover the lessons and scars from the early days of SQLstream and other start-ups, giving a real-world perspective on what was easy, what was hard, what worked, what didn’t, and the unanticipated pitfalls and unexpected boosts that together made up the start-up journey of taking a company from crazy concept to everyday reality.

The full material and paper from the workshop will be published here on the blog over the coming weeks.

Also coming up in the next few weeks: The “Big Data” Janus Effect (and why everyone might be looking the wrong way). Damian’s talk on Big Data at SVCO 12 will look at the intersection of Big Data and real-time, and whether the current emerging Big Data storage technologies are the beginning of a new era, or the final death throes of the static storage approach to business and operational intelligence that has persisted since the 70s.

Silicon Valley Comes to Oxford is on the 18th and 19th of November at the Oxford Centre for Entrepreneurship and Innovation.

Posted under Uncategorized

Vice President Marketing
October 25, 2012

Real-time Big Data means the streaming, continuous integration of high volume, high velocity data from all sources to all destinations, coupled with powerful in-memory analytics. It’s a paradigm shift from conventional store-and-process systems that’s playing well here at the Intelligent Transportation Systems and Solutions World Congress in Vienna.


October 20, 2012

The latest installment in the datacentre mysteries – in which Hadoop has long been the prime suspect for writing off the relational database (RDB) – is out now. With fox-like cunning, [the NoSQL movement] devised a plan so that traditional databases could be slowly migrated onto this new model, without anyone noticing.

And they would have got away with it too, if it wasn’t for a young startup, SQLstream, sticking its nose in where it doesn’t belong. Damian Black, its CEO, might have accidentally disrupted the perfect disposal of the RDB, by creating a real time analytical system that doesn’t need to number crunch big data. Instead, it intercepts live data streams, in real time, and creates insights from them a lot quicker. If they used it to analyse tweets about traffic jams, for example, they could avoid hundreds of millions of snarl ups on the motorway by tipping people off quicker than those ludicrous outdated ‘information’ notices they have on all our major trunk roads.

Click HERE to read the article in full.

Posted under In the News

Vice President Marketing
October 19, 2012

GigaOM took their successful format for the Structure events to Europe this week for the first time and SQLstream’s CEO, Damian Black, was one of the key speakers. In a session with GigaOM’s leading Big Data writer, Derrick Harris, Damian discussed how real-time Big Data can only be achieved using continuous streaming integration, and how standard SQL is the perfect vehicle as a powerful language for the massively parallel distributed processing of live data streams.

As always, the GigaOM events were extremely well executed, and drew many of the main thought leaders and business leaders in the industry – for example, Amr Awadallah, CTO of Cloudera, and Werner Vogels, CTO of Amazon.

Real-time Distributed Big Data Processing

Damian’s talk was entitled ‘The Rebirth of Dataflow Computing: Real-time Distributed Big Data Processing’, and discussed how dataflow, an architecture for massively parallel distributed computing, is ideal for streaming Big Data integration and analysis. And SQLstream’s dataflow architecture harnesses the power of parallel dataflow using standard SQL as the data management language.

SQL is ideally suited to driving massively scalable applications based on a parallel, distributed processing architecture. In turn, this enables our customers to respond to each and every new record or piece of data as they arrive, in real time. So SQLstream is super low latency, unlike traditional data warehousing and Hadoop storage platforms, where data must be stored first and then queried. Even if you query faster, querying faster is not real-time.

Posted under Big Data · Real-time · Streaming SQL

Vice President Marketing

We’re at the Intelligent Transportation Society’s World Congress in Vienna, Austria next week, Mon 22 – Fri 26 October. The theme for this year is ‘Smarter on the Way’ and this is the first year since 2009 that it’s been held in Europe. The full panoply of intelligent transportation issues is addressed, everything to do with software, hardware and firmware that improves the traveler experience, transportation network utilization and capacity optimization. However, the combination of real-time traveler information systems, wireless sensors, GPS and predictive analytics is a specific focus. And increasingly, telematics and V2I/V2V applications are represented.

We’re going to be exhibiting, but also presenting on a special interest panel session: Optimizing ITS from a Customer Service Perspective – essentially looking at how new technology can be used to make a significant and noticeable difference to a traveler’s experience. Our focus is to bring to bear our experience of consumer and customer quality of experience monitoring from other industries, particularly telecommunications. Social media and Twitter also have a role to play, with significant interest in extracting semantic information from Tweets to detect serious traffic incidents and congestion as they happen. However, there are issues with relying on social media, as I’ve discussed here in a recent ITS International article, Social media mooted for traffic management.

Industries such as transportation, M2M and telematics are clearly leading the way in deploying real-time Big Data applications. And that’s understandable: it’s sensor networks and the Internet of Things that are creating vast volumes of high velocity data where each event or record has value for only a very short period of time after its creation. If you’d like to understand more about streaming geospatial analytics from live vehicle GPS data, the video below presents a brief overview of a recent deployment.

Continuous ETL for Google BigQuery

There is another significant difference from the traditional business intelligence and real-time analytics market. The mainstream Big Data market is focused entirely on Business Intelligence, a one-way function for collecting data in a static store and displaying the results on a graph or report. Visualization is also an important consideration for industry, but streaming integration is the first requirement, and the ability of the platform to support real mission critical applications a close second.

The Intelligent Transportation industry has all three requirements in abundance – vast volumes of fixed, wireless GPS sensor data that must be integrated seamlessly across the operational systems, mission critical operation as significant safety, environmental and traveller impact decisions are made as a result, and the ability to visualize network flow, congestion and alarms.

So if you’re in Vienna, Austria next week, drop by our booth, P21.


Vice President Marketing
October 11, 2012

Perhaps the highlight of Oracle OpenWorld last week, or at least the demo most commented on by attendees at our booth, was Larry Ellison’s demo of Exadata and Exalytics – querying 10 days or so of stored twitter feeds with the hope of finding the best US athlete from the recent London 2012 Olympics to endorse a car company. This seemed to strike a chord with the audience. How many organizations employ a marketing analytics company to spend a vast amount of time poring over data to work out the top candidates for a marketing campaign? That said, would the CMO really go with a query result, or choose their favorite in any case?

Cloud and business applications were a focus, although as others have blogged elsewhere, despite 80+ acquisitions in the past few years, Oracle remains a database company. Major announcements / news included:

  • Release of Oracle 12c (the ‘c’ for ‘cloud’), and the announcement of its first multi-tenanted and ‘pluggable’ databases got a few ripples of applause from the audience.
  • Exadata X3 box, the in-memory machine with 22 raw TB of memory and a claimed 10X compression, making a total of 220TB of ‘memory’ in a rack. Oracle claims this is 100 times faster than the Exadata machines Oracle launched in the last few years.

Streaming analytics and Twitter

Back to Larry’s Twitter example. Of course, this can be achieved easily as a streaming application in real-time. Semantic streaming is something SQLstream’s been doing for some time, taking unstructured data such as tweets, emails and texts and determining sentiment and aggregated scoring in real-time. Use cases include identifying traffic incidents on the road networks to augment geospatial analysis of vehicle GPS data, and also in telecommunications, to better determine in real-time a customer’s true perception of their quality of experience for delivered services.

The numbers seemed impressive – Larry crunched nearly five billion tweets and 27 billion social media relationships. But breaking this down, is this really a Big Data problem? Five billion tweets, even over a one day period (I believe the demo covered 10 days), is only around 58,000 tweets per second – over ten days it is closer to 5,800 per second. This is well in excess of Twitter’s top peak loads during major events such as the Superbowl, but well within the capability of SQLstream’s real-time streaming Big Data platform, even on an entry-level, single-server, 2-core machine. Of course, the complete solution architecture may include data storage platforms such as Hadoop or Oracle, where aggregated streaming results can be loaded and persisted in real-time, further crunched in the data warehouse, and historical analysis joined back with the real-time streams to help identify moving trends.

It was an interesting demo nonetheless, and one that really should be completed in real-time as a streaming problem. SQLstream’s ability to analyze and aggregate streams across, in this case, keywords and hashtags, to provide geospatial and clustering analysis, and to deliver raw and aggregated data as continuous streams to the backend storage platforms, makes this very achievable today.

On the show floor

Apart from the heavy footfall at the SQLstream booth, perhaps most notable was the increasingly uninventive marketing used to persuade unsuspecting attendees to listen to product pitches on the promise of winning a piece of Apple hardware. Surely marketing managers can think up something a bit more inventive than an iPad? The exception was the speaking robot. I’m not sure if it was an official exhibit floor attraction, although I saw it ‘chatting’ to passersby on the Wipro booth.

Contact us if you’d like to find out more about SQLstream and our streaming Big Data management platform.


Vice President Marketing
September 28, 2012

We’re going to be exhibiting at Oracle OpenWorld with SQLstream next week. We’re looking forward to it following some great analyst briefings over the past few weeks, and a lot of interest now from companies looking for a real-time Big Data management platform on which they can build their apps using standards-based streaming SQL.

In the beginning, or at least 18 months ago, there was Hadoop, and an increasing awareness that the Internet, sensors, M2M and cloud infrastructure were generating more data than existing platforms could manage. The trend was Map-Reduce, NoSQL databases and a wide range of analytics companies claiming ‘real-time’. But querying faster is not real-time, even with the distributed power of Map-Reduce and more structured storage platforms such as HBase and Cassandra. Nor do Hadoop and related technologies address one key requirement – the real-time integration of high volume, high velocity and high variety data. Real-time, streaming integration is essential to making Big Data applications work.

Even more importantly, mainstream industry is looking to adopt real-time streaming technology, but does not want to base its mission critical applications on high latency, non-standardised development frameworks.

This is where SQLstream comes in: a real-time Big Data management platform for integrating live data, and for performing real-time, in-memory analytics on the data as it streams past. Industry and major companies have a real-time need, but prefer to use standards such as SQL. The ability to lift existing SQL from their high performance analytics databases and data warehouses and drop the queries into a real-time streaming platform is key.

So if you’re going, come and see us, we’re at #106 Moscone South.

Posted under Uncategorized

Vice President Marketing
September 25, 2012

This week we’re at the Intelligent Transportation Society of California’s Annual Conference & Expo in Sacramento. The conference is focussed on the adoption of advanced technologies to improve traveler mobility and heighten safety in California.  Attendees include specialists from government, industry and academia.

This week we’ll be demoing real-time traffic congestion analytics and visualization based on processing live vehicle GPS data. Play the video below to watch a short demonstration of the live customer case study.

The real-time traffic analytics and connected vehicle programs are of particular interest to SQLstream.  SQLstream turns live data into real-time value, enabling industries such as Transportation, Telecommunications and Automotive Telematics to integrate and analyze all their real-time data on a single, streaming Big Data platform.

So if you are visiting Sacramento this week, we’d love to meet up.


Vice President Marketing
August 23, 2012

SAN FRANCISCO, CA, August 23, 2012. SQLstream Inc, the leading standards-based real-time Big Data solutions provider, today announced the appointments of Glenn Hout as Vice President of Sales for the Americas and Chris Clabaugh as Vice President of Business Development. Hout and Clabaugh join the senior management team to build on SQLstream’s strong foundation in North America, and to expand SQLstream’s success into Asia Pacific and Europe.

Damian Black, SQLstream CEO, attributes SQLstream’s success to the rate with which industries such as transportation and telecommunications are adopting real-time, streaming Big Data technology. “As a result, we have seen a surge in customers seeking real-time Big Data solutions,” continues Damian. “Chris and Glenn bring passion, experience and leadership, as well as the existing relationships necessary to establish SQLstream as industry’s preferred partner for mission critical, real-time Big Data applications.”

Glenn Hout brings twenty-four years’ experience in the database and software applications industry with both early-stage and established software organizations. Glenn joins SQLstream from Big Data startup Algebraix Data, and prior to that held management positions at Oracle, Information Builders and Hyperion Solutions.

Chris Clabaugh joins SQLstream from Kabira Technologies (acquired by TIBCO) where he was Vice President of Business Development, overseeing ISV and OEM partnerships globally, managing branch operations in Japan, and opening new markets and geographical expansion in the Americas. Prior to Kabira, Chris held management positions at Collabnet, Progress Software, Allegrix and SCO.

Interested in Real-time Big Data solutions? Register here for a 60-day free trial of SQLstream s-Server. The download version is fully functional, and includes all adapters, drivers and development tools.

About SQLstream Inc.

SQLstream is the leading standards-based platform for integrating and analyzing streaming Big Data, forging real-time competitive advantage from live service, system and sensor data. SQLstream’s standards-based, distributed and scalable architecture uses industry standard SQL for the rapid analysis of high volume, real-time data streams. Standards mean lower costs, proven performance and seamless integration. With SQLstream, our customers are turbo-charging their Big Data environments for real-time, and responding with confidence to business exceptions based on accurate, up to the second information. SQLstream is headquartered in San Francisco and can be found on the web at www.sqlstream.com.

Media contact: Ronnie Beggs, +1 877 571 5775, pr@sqlstream.com

Posted under Press Releases
August 3, 2012

Datanami article on SQLstream for real-time Big Data solutions, reporting on a SQLstream application for real-time traffic analytics based on live GPS data.

“Providing real-time traffic updates is a practical and feasible application of the big data analysis technology that exists today. According to real-time streaming big data platform vendor, SQLstream, which aims to integrate and quickly analyze live data feeds,  “High-performance systems such as SQLstream Transport are able to transform large volumes of raw GPS data into real-time actionable information.” The transformation in question took place on the roads of Venezuela in the form of SQLstream’s ETL Connector for Google Big Query….”

Read the full article on datanami here http://bit.ly/M9ps1b

Posted under In the News

Vice President Marketing
July 24, 2012

SQLstream, the leading platform for real-time Big Data integration and analytics, today announced it has joined the Google Cloud Platform Partner Program as a Technology Partner with its release of the Continuous ETL connector for real-time Big Data integration with Google BigQuery.

Continuous ETL for Google BigQuery

SQLstream’s continuous ETL is the key building block for powerful real-time Big Data solutions that integrate vast volumes of real-time streaming Big Data with historical trend information. SQLstream provides streaming integration for structured and unstructured data in real-time and on a massive scale, overcoming the poor scalability and high latency issues that are typical with traditional batch-based solutions. With SQLstream, organizations can act instantly on new information as it arrives, improving operational efficiency and driving new revenue opportunities.

“The Google Cloud Platform Partner Program enables us to integrate our product offerings for real-time Big Data with the power of the Google cloud platform,” said Damian Black, SQLstream CEO. “SQLstream is the only real-time, streaming Big Data management platform that is built on the ANSI and ISO SQL standards, making it the perfect real-time complement for Google Big Query. Our customers benefit from the power of the Google Cloud Platform while utilizing the same SQL skills to perform streaming Big Data integration with real-time analytics.”

The continuous ETL integration has already been deployed by Grupo Intech in Venezuela as an integral component of a SQLstream and Google BigQuery solution for detecting road network traffic congestion in real-time using GPS data. “Understanding the quality and coverage offered by each of our GPS data providers is essential,” said Marcelo Ricigliano Cantos, CEO of Grupo Intech. “Google BigQuery is updated continuously and in real-time by SQLstream, and generates on-demand confidence indicators as to GPS data quality and reliability.”

“To help customers get the most out of our cloud platform products,” explains Eric Morse, Head of Sales and Business Development for Google’s Cloud Platform, “we work closely with technology companies, like SQLstream, that provide powerful complementary solutions integrated with our platform.”

Google’s Cloud Platform products enable customers to implement:

  • Cloud app solutions, such as mobile apps, social apps, business process apps, and websites, using Google App Engine and Google Cloud SQL.
  • Cloud storage solutions, such as high-end backup and recovery, active archiving, global file sharing/collaboration, and primary SAN/NAS, using Google Cloud Storage.
  • Large-scale computing solutions, such as batch processing, data processing and high performance computing using Google Compute Engine.
  • Big data solutions, such as interactive tools, trend detection and BI dashboards, using Google BigQuery and Google Prediction API.

About SQLstream Inc.
SQLstream is the leading standards-based platform for integrating and analyzing streaming Big Data, forging real-time competitive advantage from live service, system and sensor data. SQLstream’s standards-based, distributed and scalable architecture uses industry standard SQL for the rapid analysis of high volume, real-time data streams. Standards mean lower costs, proven performance and seamless integration. With SQLstream, our customers are turbo-charging their Big Data environments for real-time, and responding with confidence to business exceptions based on accurate, up to the second information. SQLstream is headquartered in San Francisco and can be found on the web at www.sqlstream.com.

Contact: Ronnie Beggs, VP of Marketing, +1 877-571-5775

Posted under Press Releases

Vice President Marketing

SQLstream has joined the Google Cloud Platform Partner Program as a Technology Partner with the release of our Continuous ETL connector for real-time Big Data integration with Google BigQuery. Continuous ETL solves the real-time performance issue for Big Data storage platforms. Although very capable of processing large datasets once the data has been stored, Big Data storage platforms are not designed to process real-time streaming data.

Continuous ETL for Google BigQuery

SQLstream’s Continuous ETL connectors enable SQLstream to integrate and analyze vast volumes of live, real-time Big Data, and to update the Big Data storage platforms immediately as input data arrives. With existing ETL solutions, the data warehouse and Big Data storage platforms are updated in batch mode and are therefore always out of date. Continuous ETL updates continuously and in real-time.

Real-time, integrated Big Data solutions

The continuous ETL integration was carried out by Grupo Intech as an integral component of a SQLstream and Google BigQuery solution for detecting road network traffic congestion in real-time using GPS data.

GPS records are collected from several specialist data providers. Each GPS record contains the position, direction and speed of a vehicle. SQLstream collects GPS records from all the providers in real-time, turns the live GPS data feeds into real-time traffic flow information and, by applying a variety of congestion detection algorithms, generates a map of current traffic congestion positions, plus additional information on the extent and severity of each problem.

SQLstream calculates real-time traffic analytics and displays the results in real-time on a map-based display. Roads are color-coded based on the average traffic speed relative to the posted speed limits for each road segment. Colored push-pins appear on the highways to provide further information on the potential problem, such as the average speed for every minute over the previous 15 minutes.
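
As a hedged sketch of how the per-segment analytics might be expressed in streaming SQL, the view below computes a 15-minute rolling average speed for each road segment; the GpsPositions stream and its columns are assumed names for illustration, not the actual application schema. The map layer can then compare the rolling average against the posted limit for each segment to choose the color coding.

CREATE OR REPLACE STREAM GpsPositions
( "segmentId" INTEGER,
  "speed"     DOUBLE )
DESCRIPTION 'map-matched GPS records, one row per vehicle observation';

-- Rolling 15-minute average speed and sample count per road segment
CREATE OR REPLACE VIEW SegmentSpeeds AS
SELECT STREAM "segmentId",
    AVG("speed") OVER last15 AS "avgSpeed",
    COUNT(*)     OVER last15 AS "samples"
FROM GpsPositions
WINDOW last15 AS
    (PARTITION BY "segmentId"
     RANGE INTERVAL '15' MINUTE PRECEDING);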

Continuous ETL and Google BigQuery

So how does BigQuery help? GPS data is of variable quality, and, as the data is sourced from many different providers, the network coverage of each provider can differ significantly.

Google BigQuery is used to store the real-time traffic information produced by SQLstream, and to generate a confidence index on the coverage and quality, and therefore the reliability, of the incoming GPS records. Effectively, Google BigQuery is helping Grupo Intech answer the question: do we have sufficient valid data for the real-time congestion information to be reliable?

SQLstream’s Connector for Google BigQuery delivers real-time analyzed data using a Continuous ETL operation. With continuous ETL, streaming data is continuously acquired, cleaned, conditioned and transformed, then periodically appended to the BigQuery table – for example, every minute or every 500,000 events. The data in BigQuery is therefore always up to date and accurate.
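
The sketch below illustrates the periodic-append idea in streaming SQL (it is not the actual connector configuration): a tumbling one-minute aggregation produces one row per road segment per minute, and the connector appends each completed minute as a batch. The EnhancedSegments stream and its columns are hypothetical, and the exact tumbling-window idiom may differ between releases.

CREATE OR REPLACE STREAM EnhancedSegments
( "segmentId" INTEGER,
  "speed"     DOUBLE )
DESCRIPTION 'enhanced and interpolated segment records';

-- One aggregated row per segment per minute, ready to append downstream
CREATE OR REPLACE VIEW SegmentsPerMinute AS
SELECT STREAM FLOOR(s.ROWTIME TO MINUTE) AS "minuteStart",
    s."segmentId",
    AVG(s."speed") AS "avgSpeed",
    COUNT(*)       AS "samples"
FROM EnhancedSegments AS s
GROUP BY FLOOR(s.ROWTIME TO MINUTE), s."segmentId";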

The BigQuery table holds a minimum of 3 months of enhanced and interpolated data for ad-hoc data mining; the period can be extended if needed for improved accuracy.

Continuous ETL – more than real-time

SQLstream’s analysis of the road network is based on individual 10m road segments. Typically, GPS records are not available for every vehicle on each and every road segment, so SQLstream interpolates missing road segment data in real-time and continuously appends those results into the Google BigQuery table as well – storing a complete real-time map of traffic flow for the entire road network at a granularity of 10 meters. Unlike traditional ETL tools, which tend to offer only a basic mapping function, SQLstream therefore delivers significantly more information into Google BigQuery than is available in the source data.

How does it work?

SQLstream allows the user to zoom in and center the geographic map on an area of interest. This action defines a bounding box, and the coordinates of the bounding box are used in a first query to extract the number of 10-meter road segments within the bounding box. Defining the bounding box in SQLstream launches a query similar to the following example:

select count(*) from road_elements
where
  "reLatitude" between 7.7109920000000001 
                   and 9.8525100000000005 AND
  "reLongitude" between -65.754776000000007 
                    and -62.958756000000001

This query returns the number of road segments within the bounding box (in this example, 191,866 road segments), which is then plugged into the query executed by BigQuery to calculate, for each GPS data provider, the percentage of those road elements actually covered – the road network coverage within the bounding box.

select vendorID, float(integer(count(*) / 191866 * 1000)/10) as Confidence
from (select vendorID, reID from sample.july2nd
where 
 float(reLatitude) between 7.7109920000000001 
                       and 9.8525100000000005 AND
 float(reLongitude) between -65.754776000000007 
                        and -62.958756000000001
group by vendorID, reID)
group by vendorID order by Confidence desc

The query results are returned to SQLstream, which displays the answer to the user. BigQuery is ideal for these types of geospatial coverage metrics: there is a high volume of meaningful data stored in the BigQuery table, and the coverage calculation is available immediately. By choosing a variety of bounding boxes, it is possible to obtain coverage metrics on a country-wide basis, on a metropolitan-area basis, or for a much smaller problem area.

Conclusions

In conclusion, we have demonstrated how continuous ETL is the basic building block for integrating real-time and historical data. Continuous ETL enables offline data storage to be kept up to date directly from real-time applications. And unlike conventional ETL solutions that simply map data between systems, SQLstream’s Continuous ETL connector for Google BigQuery delivers aggregated data and analytics, as well as augmented and interpolated data values.

 


Vice President Marketing
June 25, 2012

We’ve been exhibiting at Structure 2012 in San Francisco, where our CEO Damian Black was speaking on dataflow architectures for massively scalable real-time Big Data computing. In fact, this was a milestone for us as Damian was on the very first Big Data panel at the first Structure event in 2008.

Dataflow is a technique for parallel computing that emerged from research in the 1970s. It’s based on graph-based execution models where data flows along the arcs of a graph and is processed at the nodes. It was decades ahead of its time, in an era when hardware was expensive and massively parallel, low latency computing architectures were not needed in the mainstream. However, dataflow as an architecture has found its place and time with the emergence of Big Data volumes, real-time low latency requirements, commodity hardware and low cost storage. Dataflow is driving the architectures for today’s real-time Big Data solutions.

Structure 2012 - Dataflow comes of age

Click to view Structure 2012 presentation video

SQLstream adopted the principles of dataflow as the basis of the architecture for SQLstream s-Server. Our adapters turn any data source into a live stream of data tuples which are combined, aggregated and analyzed by the SQLstream s-Server platform. SQLstream has added one essential feature to dataflow – the use of SQL as a dataflow management language. SQL has long been the language of choice for relational database management systems, and in that context is getting a bad press in light of new structures for Big Data storage and NoSQL queries. However, SQL is powerful, declarative (so applications can be built easily, quickly and cheaply) and is a natural, powerful paradigm for processing streaming dataflows. The benefit is extremely low latency with the ability to process massive volumes of live data over an unlimited number of servers – exactly the requirements of real-time Big Data. In fact, this is the only architecture capable of processing real-time Big Data streams. With real-time requirements now in the 20 to 100 million events per second range, power, scalability and low latency are key.

Dataflow architecture for real-time streaming Big Data computing

Diagram 1: Dataflow architecture for real-time streaming Big Data computing

The SQLstream s-Server architecture concept is illustrated in Diagram 1. As a dataflow architecture, each node is a streaming SQL statement – a continuous SQL query, processing arriving data over a moving time window (time windows can be from 1 millisecond for ultra low latency requirements, to months or even years where comparison against long term moving averages is required, for example, Bollinger bands). Why is this important? Well, it’s the only approach for low latency, real-time solutions, as information flows out of the system as soon as input data arrives, that is, the high latency of batch-based approaches such as Hadoop Map-Reduce is removed completely.
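
As a small, hedged illustration of the long-window case, the view below computes a 20-day moving average with Bollinger-style bands. The Prices stream, its columns and the 20-day interval are assumptions for illustration, and it presumes a STDDEV aggregate is available as a windowed function.

CREATE OR REPLACE STREAM Prices
( "symbol" VARCHAR(10),
  "price"  DOUBLE )
DESCRIPTION 'live price ticks';

-- 20-day moving average with bands at +/- two standard deviations
CREATE OR REPLACE VIEW PriceBands AS
SELECT STREAM "symbol", "price",
    AVG("price") OVER d20 AS "movingAvg",
    AVG("price") OVER d20 + 2 * STDDEV("price") OVER d20 AS "upperBand",
    AVG("price") OVER d20 - 2 * STDDEV("price") OVER d20 AS "lowerBand"
FROM Prices
WINDOW d20 AS
    (PARTITION BY "symbol"
     RANGE INTERVAL '20' DAY PRECEDING);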

Mozilla Glow: Real-time download monitor with SQLstream and HBase


Damian presented a simple example of SQLstream and parallel dataflow in action. Mozilla’s Glow application was a continuously updating download counter for the Firefox 4 browser when it was released. The application used SQLstream s-Server to collect live download statistics from all the download servers worldwide. Download records were processed and aggregated in real-time and displayed on the Glow visualization map, showing exactly how many copies of the browser had been retrieved. SQLstream s-Server also provided a continuous ETL operation into Apache HBase, storing aggregated and filtered records for further in-depth analysis. Click here to watch the application in action.

Finally, and in contrast to Structure, we also attended a Gartner session last week with Merv Adrian and Svetlana Sicular, which sought to bring some sense of perspective to Big Data. This was really a reality check on the current maturity of the Hadoop Big Data platforms and the effort required to deploy them. Wider adoption across industry in general will require significantly more mature products and applications, particularly around the OPEX costs of deployment, security concerns, and the ability to deliver business intelligence to all consumers in a large organization. The recommendation was to use an integrator such as Cloudera or Hortonworks. Mainstream organizations are looking at the Hadoop / Big Data approach, but many do not currently see either a use case or a reason for adoption. It was interesting to hear a perspective that didn’t need to be buzzword compliant, and that presented a positive yet realistic view of wider adoption.

Posted under Big Data · Streaming SQL · Uncategorized

Vice President Marketing
June 7, 2012

We’re at Sensors Expo this week, showcasing in the Big Data & Analytics Pavilion. This is the first year the event has included a specific area for real-time Big Data solutions for sensor networks.

Real-time control in a Big Data World

SQLstream CEO Damian Black presented on Real-time Control in a Big Data World. The presentation focused on the growth of sensor data and the emergence of the “Sensor Internet”, plus the applications required to collect and analyze streaming sensor data and to drive real-time actions and updates – in particular, addressing the emerging real-time Big Data challenges driven by wireless and GPS technologies, M2M applications and V2V/V2I.

SQLstream Damian Black Sensors Expo 2012 Real-time Big Data Integration

Click here to view Real-time Big Data integration and analytics for sensor networks

Real-time streaming data integration for Big Data

It’s clear the primary challenge is not managing the data volume per se, or even delivering real-time operational intelligence; rather, it’s the more fundamental issue of real-time streaming data integration. How can such huge volumes of data from many different sources and locations be integrated into the operational platforms, and how can the issues of multiple operational silos be overcome to provide an integrated real-time control platform?

Interestingly, these are the exact same issues SQLstream addresses for the Big Data and Hadoop world in general – getting data in, getting data out, connecting existing data stores in real-time, and delivering real-time in-memory analytics on the data as it streams past:

  • Real-time streaming data integration of any data source and between existing storage platforms and operational systems
  • Real-time streaming monitoring and analytics on the arriving and streaming data
  • Scalability through parallel distributed processing of processing pipelines

The importance of geospatial analytics

Geospatial analytics is a key requirement in the sensor data market. Big Data analytics in general deals with one-dimensional problems, usually the correlation of similar events, or the correlation of events over time. The geospatial dimension is the key difference between Big Data platforms for the “Sensor Internet” and wider IT / machine data applications. Fortunately this has been a feature of SQLstream for some time, and central to many of our customer deployments – for example, real-time traffic analytics from GPS data, and real-time seismic monitoring.

Posted under Uncategorized

Vice President Marketing
June 5, 2012

Glue Conference 2012, Denver CO, at the end of May was a great conference – well attended, with knowledgeable participants, and the only conference I know of that looks at gluing cloud and mobile applications together with a developer focus.

There was the usual wave of NoSQL, cloud storage, cloud platform and Hadoop presentations, as you’d expect, but also some interesting keynotes. Ray O’Brien, CTO for IT at NASA, talked about the evolution of Nebula and OpenStack at NASA, and James Governor from Redmonk talked about the evolution of historical analytics.

From our perspective, the strength of the show was in making physical rather than logical connections – both partnerships and potential customer interest in building real-time Big Data applications, and in how SQL has been repurposed as an API for streaming Big Data, moving it forward significantly from its roots as a static data management language.

Real-time, streaming Big Data

Relational Streaming for real-time Big Data scaling


Our CEO, Damian Black, presented on real-time streaming Big Data, both as a real-time alternative to Hadoop, and also as a complement to add real-time responses and streaming integration to existing Hadoop installations. One question we were asked several times was why SQL? A good question. This isn’t a religious debate about the language by any means, and if we had opted to build a Big Data batch storage and analytics platform (e.g. like Hadoop), we would have gone a different route.

However, when it comes to processing streaming tuples in real-time, a standard SQL approach has two big advantages over all others. First, with the extension of the SQL WINDOW operator to process streaming data over fixed time windows, both structured and unstructured data can be processed painlessly without having (no pun intended) to define a static schema and without the need for any coding whatsoever. In effect, SQLstream processes streams of arriving tuples over time windows and pushes out the results to other systems.  Similar in concept at least to Hadoop, although Hadoop is purely batch-based, processing static files and pipelining sets of tuples through low level Map-Reduce functions.

Relational Streaming for Real-time Big Data

Click on the title slide to view the real-time Big Data presentation

However, the second benefit is equally important. Streaming SQL queries include standard operators such as GROUP BY and PARTITION. These provide the best clues possible to a query planner capable of automating the dynamic scaling of streaming pipelines over vast numbers of servers, giving a reliable and controllable mechanism for Big Data scalability without the need for hardcoded server allocation hints.
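
A minimal sketch of the idea is shown below, using an assumed LogEvents stream: the PARTITION BY key tells the planner that each server’s substream can be processed independently, so the pipeline can be fanned out across nodes without any placement hints appearing in the query itself.

CREATE OR REPLACE STREAM LogEvents
( "serverId" VARCHAR(32),
  "severity" VARCHAR(8) )
DESCRIPTION 'parsed log records from the collection adapters';

-- Rolling one-minute error count per server; each partition can be
-- scheduled on a different node by the planner
CREATE OR REPLACE VIEW ErrorsPerServer AS
SELECT STREAM "serverId",
    COUNT(*) OVER lastMinute AS "errorCount"
FROM LogEvents
WHERE "severity" = 'ERROR'
WINDOW lastMinute AS
    (PARTITION BY "serverId"
     RANGE INTERVAL '1' MINUTE PRECEDING);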

Real-time at GlueCON

The strength of the real-time track at GlueCON was encouraging. It was interesting, though, that the term ‘real-time’ is now about as overused as ‘Big Data’, and about as poorly understood. For SQLstream, it means the streaming integration of any and all data sources with in-memory analytics, processing streams at millions of events per second. For some other vendors, it appears real-time drops off at significantly lower rates and numbers of connections!

Next stop, GigaOM Structure in San Francisco

Next stop GigaOM Structure in San Francisco, June 20 / 21 at the Moscone Center. Visit us there if you’re attending.

Posted under Big Data · Real-time · Streaming SQL · Uncategorized

Vice President Marketing
May 22, 2012

We’re at the ITS America Annual Meeting, National Harbor, Washington DC this week (see the team in action below) – the yearly opportunity for the US intelligent transportation community to get together and discuss how IT and technology can better serve travelers, industry and government. Real-time, the Smart City and Vehicle-2-Vehicle communication have been the trending topics. Bill Ford’s earlier observation that “the car will be a rolling set of sensors” has echoed here. Martin Thall, VP Telematics at Verizon, talked about the issues of getting data up from the vehicle, and how 4G / LTE will be key due to the large volume of fast data; however, the pricing and subscription model for 4G services is currently one of the hurdles to wider telematics adoption. These are essentially real-time Big Data requirements for mission critical applications – the area where SQLstream excels.

Chris Vein, Deputy White House CTO, discussed the need for open standards, and for government to find ways for private industry to participate more: the idea of “government as platform”, encouraging the “lean startup”, and offering “smart disclosure” of data to encourage information sharing and real innovation.

Why is my GPS Travel Time always wrong?

Ursula Burns, Xerox CEO, captured exactly the need for SQLstream s-Transport: how come, every morning when I drive to work, my GPS tells me the journey will take 37 minutes, and it has yet to take 37 minutes? Accurate and reliable real-time Travel Time was the first problem SQLstream solved with the SQLstream s-Transport product – using GPS data to monitor traffic flow accurately and in real-time, and to present correct end-to-end journey time information throughout the journey, not just at the beginning. (More on SQLstream s-Transport here.)

I’ll publish a final report on the show over the next few days. If you’re attending the ITS America Annual Conference, National Harbor, Washington DC, please drop by Booth #631.

Posted under Uncategorized

Test
May 10, 2012

This week I’m attending an interesting conference at UC Berkeley called the “Berkeley Conference on Streaming Data”. The organizers are primarily astronomers and statisticians, but the talks discuss issues and solutions to streaming data problems across a wide selection of scientific areas and engineering applications. Real-time streaming Big Data applications presented included oceanography, biology, genetics, handwriting recognition, astrophysics, particle physics, recommendation engines for social media, and inevitably, real-time fraud detection from live data feeds.

I presented on a deployment of SQLstream as a Dynamically Scalable Cloud Platform for the Real-Time Detection of Seismic Events. Based on work with UCSD seismologists, SQLstream has been deployed to detect significant events in data collected from a large grid of seismic sensors. A large-scale data infrastructure (the OOI/CI) provides raw signal data over an AMQP message bus.

Plot of Seismic Events

SQLstream monitors live seismic data feeds in real-time, applying heuristic algorithms that look for patterns indicating earthquakes. The live system scales dynamically across multiple servers in a cloud environment based on the current demand. You can view the presentation here.  I also blogged previously on the application here.

In conclusion, I have two main observations from the conference so far (it continues until Friday). The first is that the majority of fields in science and technology appear to have a Big Data, and often a real-time Big Data, problem. The second is the extent of the innovation and computer science resources dedicated to solving these problems – in particular, for this conference, developing algorithms for data analysis and machine learning (that is, automatic pattern recognition) that work on streams of flowing data. It’s clear that traditional data management and even Big Data batch-based methods don’t work when you need continuous results from dynamic data. And the amount of data is huge.


Vice President Marketing
April 25, 2012

The Text Analytics Summit in London this week was an opportunity to catch up on the latest trends and state of the Text Analytics market.  An interesting couple of days with a few themes emerging.

Firstly, Big Data.  Not entirely unexpected, but almost every presentation referred to Big Data in some shape or form.  In part this was referring to the volume of data to be processed, but primarily in the context of databases for the storage and processing of unstructured data of any volume.

Although not discussed explicitly, there’s obviously a search for business models that work.  Most applications were B2B platforms, sold as a package of product, services and consultancy, enabling organizations to better mine text data for market and competitor intelligence.  However, some were seeking to monetize through subscriptions to information feeds.

 

For SQLstream, we presented on the use of real-time text analytics for improving incident detection and prediction.  In particular, the use of real-time Twitter and text messages for identifying Quality of Experience issues with IP content services, but also the use of Twitter for improving real-time incident detection in transportation networks.  And in line with the rest of the conference, we did our bit for Big Data, describing how real-time streaming integration and analytics can be built on unstructured data analytics as an integrated real-time Big Data and Hadoop platform.


Vice President Marketing
April 19, 2012

Joining real-time structured and unstructured data feeds for better accuracy and reliability from your operational intelligence, and the Text Analytics Summit, 2012, London.

Three IT trends have emerged over the past year – Big Data, real-time, and the importance of unstructured data. Taking the latter first, there is an increasing awareness that much of the data we have available to us today is unstructured (Cloudera is among the many claiming that 80% of all data is unstructured). Unstructured data includes text messages, documents, tweets, emails and video content. There’s also a growing industry for tools and software that perform unstructured data analytics – primarily text analytics using semantic modeling, tagging and subsequent analysis.

The past year has also seen Big Data and Hadoop emerge from the rarefied atmosphere of California’s Silicon Valley into mainstream IT. Driven by statistics such as the claim that 90% of all data available today has been generated in the past two years, Big Data as a functional area for primarily unstructured data is here to stay, and is effectively supercomputing lite for the masses.

The need for real-time streaming data management

However, the real-time trend is less well served today by either Hadoop or by the currently available tools and software for unstructured data analytics. Real-time is about the need for immediate detection and response – turning data sources into live data feeds, processing the data on the fly, and then loading batch-based distributed platforms such as Hadoop from an output data stream.

‘Stream Reasoning’

I’ve also seen the term ‘stream reasoning’ used to describe the real-time processing of unstructured data, although this is still an area that is less well developed and understood than mainstream text analytics over stored data. Stream reasoning is the ability to process and respond to semantic knowledge about tweets, messages and other social media interactions in real-time, on the fly. The diagram below illustrates how a semantic modeling library has been plugged into a real-time streaming pipeline in SQLstream – the example is based on SQLstream’s GATE UDX, but any library with reasonable performance and a query response API can be plugged in.

Combining streaming structured and unstructured live data feeds

Unstructured data feeds, such as text messages and tweets, are streamed through the semantic tagging UDX and library, with the output of this stage being real-time streams of semantically tagged data. The data can then be analyzed and frequency-charted in real-time.
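
A hedged sketch of the frequency-charting step is shown below. It assumes the tagging UDX has already produced a TaggedMessages stream with one row per detected tag; the stream and column names are illustrative only.

CREATE OR REPLACE STREAM TaggedMessages
( "messageId" BIGINT,
  "tag"       VARCHAR(64) )
DESCRIPTION 'one row per semantic tag emitted by the tagging UDX';

-- Rolling five-minute mention count per tag, ready for charting
CREATE OR REPLACE VIEW TagFrequency AS
SELECT STREAM "tag",
    COUNT(*) OVER last5 AS "mentions"
FROM TaggedMessages
WINDOW last5 AS
    (PARTITION BY "tag"
     RANGE INTERVAL '5' MINUTE PRECEDING);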

Text Analytics Summit, 2012, London

I’ll be speaking on this topic at the Text Analytics Summit, 2012, London. I’ll be discussing how to combine stream reasoning (admittedly, mostly over Twitter messages) with structured data, with the objective of improving the overall accuracy and reliability of the resulting operational intelligence. I’ll be using a couple of examples – customer experience management for IP content services such as VoIP and VoD, and improving the accuracy and reliability of traffic congestion and travel time information: how can text analysis of tweets and messages help to pinpoint the severity of road network traffic problems?

I look forward to seeing you there, or if you can’t make it, I’ll be blogging on the highlights next week.

 


Vice President Marketing
April 2, 2012

Last week SQLstream sponsored and CEO Damian Black presented at Structure Data in New York, a conference exploring “the technical and business opportunities spurred by the growth of big data”.

It’s clear that Big Data has moved on considerably in a very short space of time – from the Silicon Valley, 101 world of Java developers and Hadoop, into the wider mainstream business world (but still with Hadoop!).

Some themes emerging from the conference:

  • The basic need to deliver high performance, massively scalable computing infrastructure as data volumes grow exponentially. It’s clear that the pain from structured and unstructured data is driving different approaches at different stages in the data management lifecycle – better visualizations, better cleansing and filtering, and a better understanding of the appropriate analytics tools that are most applicable at each stage.
  • The emergence of the SQL layer. It’s clear Hadoop has its strengths and is here to stay. It’s effectively ‘supercomputing lite’ and, given today’s data volumes, is just the tool for the job. However, a couple of trends are emerging. First, is it actually necessary to store all the data when much of it is obviously not of interest? Second, once the initial analysis of both structured and unstructured data is achieved, there’s an emerging layer above Hadoop that’s looking very structured. Both these functions are looking much more SQL-like.
  • Real-time, low latency analytics. Hadoop is not, nor does it claim to be, a low latency, real-time data management platform. There is a well-defined business need to analyze log file, sensor and network data in real-time (sub-second to a few minutes latency), but also to stream the arriving data through to Hadoop for further analysis. Obviously this layer needs to be as scalable as, if not more scalable than, the underlying Hadoop platform.

Damian’s presentation at Structure Data focused on relational streaming – massive-scale parallel data processing using SQL, generating real-time results from streaming input data. The talk described relational streaming both as a standalone real-time management layer and as SQLstream integrated with Hadoop, acting as the streaming layer in the Big Data stack (you can also read the GigaOM report on the presentation here).

 

Posted under Big Data · Events · In the News

Vice President Marketing
March 26, 2012

GigaOM reports on Damian Black, SQLstream CEO, talking about streaming Big Data at Structure Data, New York, March 21 – 22. In the talk entitled “Streaming Big Data: Millions of events per second”, Damian discussed the similarities between Hadoop Map/Reduce and the parallel, distributed architecture used for streaming data processing in SQLstream.

Click here to read the GigaOM report.

Posted under In the News

Vice President Marketing
March 20, 2012

Visit our new website to find out more about real-time Big Data applications

Big Data is here to stay. The breadth of the term Big Data may change as it becomes as much a marketing imperative as the ‘Cloud’ word, but the requirement for ‘supercomputing lite’ processing for the non-supercomputing world of enterprise data is a must-have.

The rise of Big Data has happened in parallel with the emergence of real-time operational intelligence, and the extension of real-time analytics into the world of real-time updates and process control. Much of the recent interest has focussed on how these two worlds merge into a single complementary solution.

The NoSQL Big Data platforms offer massively scalable, resilient data processing over commodity hardware, ideally suited to scaling large-scale data problems over hundreds or thousands of servers. However, platforms such as Hadoop do not support, nor were they designed to support, real-time streaming data processing and analytics. Their forte is the batch-based, highly scalable, store-compute loop of map/reduce.

That’s where SQLstream comes in. SQLstream collects and conditions real-time updates from sources such as log files, sensor networks and GPS events, both integrating streaming data into and out of Big Data stores, and generating real-time analytics from the data as it streams past. The SQLstream architecture also has parallels to that of map/reduce. SQLstream uses Relational Streaming, a paradigm for processing streaming Big Data tuples using standard SQL queries. SQL offers strong potential for automatic optimization and distributed parallel processing of streaming data. Whereas platforms such as Hadoop execute batch queries over stored tuples, SQLstream and Relational Streaming execute continuous queries over arriving data.

We’re also at Structure Data this week in New York, where our CEO, Damian Black, will be presenting on the wider area of streaming Big Data and massive scalability. If you are attending, visit us for a demo of the ‘millions of events per second’ program, and a demonstration of massively parallel stream processing on an Elastic Compute Cloud.


Test
March 13, 2012

SQL is a declarative language – a SQL query is a specification for the result, it’s neither a recipe nor a program to produce the results. A traditional relational database query returns a set of rows, the ResultSet. A streaming SQL query in SQLstream returns a stream of rows. That is, the ResultSet may never end. In a traditional relational database query, all the rows are fetched, and the ResultSet scans them. With a relational streaming query, the result rows do not exist as yet – as time goes by they come into existence as arriving data are processed.
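
To make the contrast concrete, the pair of queries below uses the SALES example schema that ships with SQLstream; the first is a conventional finite query, the second never terminates.

-- Finite: scans the stored rows and the ResultSet eventually ends
SELECT * FROM SALES.EMPS;

-- Infinite: the STREAM keyword makes this a continuous query whose
-- ResultSet produces rows as new bids arrive, potentially forever
SELECT STREAM * FROM SALES.BIDS;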

However, just like any relational database, the SQLstream stream computing Server has two main components:

  1. The query engine, or planner, calculates the most efficient plan to produce the requested results – this is the query plan.
  2. The data engine, or kernel, executes the query plan to produce the results. The scheduler controls the execution process.

Streaming dataflow graphs

Executing a query means computing the results from the inputs. For a streaming query, that means processing the rows in the input streams as they arrive. The execution is organized as a dataflow graph – a directed graph in which the nodes represent elementary operations on data, and the edges into and out of a node represent its input and output data streams. In effect, it is an assembly line that produces results, where the nodes are the machines or stages on the line.

A query plan defines one of these data flow graphs. The executor runs the data through the graph: it is responsible for executing the nodes, where each node consumes rows arriving on input edges and produces rows on the output edges. Of course each output edge is often the input of a downstream node.

Multiple, connected query plans

In a traditional static database, each query plan is independent and transitory, and operates against persistent tables and indexes. In a relational streaming platform, the query plans last forever and are interconnected. Although streams are used just like tables in SQL, they are not persistent – in fact they have no contents at all – and can be seen as rendezvous points that accept input rows and pass them on to their output consumers.

Now if the data flow graph were a physical system — say, a collection of transparent plastic straws with colored water flowing through them — then all the processing would be happening simultaneously. However, for the software abstraction of the streaming straw pipelines, it’s not practical or necessary to run all the nodes at the same time. It is the scheduler that manages this network of interconnected query plans, and when and how to execute each node.

In a traditional, static database, the result of a query is a set of rows that are computable all at once. The executor can give good performance by running one node at a time, pushing batches of rows through the graph. A streaming database is different. When the inputs are streams of recent events, arriving in real time, it’s important to produce the outputs fast enough so that the result rows are timely.

The execution works by pushing outputs, not by pulling inputs, and it means executing several nodes at the same time, whenever possible. This requires a finer management of the execution objects and the ability to schedule parallel execution on the nodes.

Parallel scheduling of stream execution

In SQLstream, multiple, interconnected query plans are being executed at the same time. Together they constitute a large dataflow graph in which each node is a mini data processing machine that performs a simple operation on its input data, and passes its output to the next node.

The scheduler is responsible for managing the interconnected dataflow graph. It keeps track of the status of each node: at any moment, some are running, some are ready to run, some are waiting for more input, some are waiting for their output to be consumed. Each node is allowed to run for a quota of time, and where possible nodes are selected to execute in parallel as separate threads.

The SQLstream scheduler may not want to be fair: some branches in the graph (some streams) may be more important, some may need high throughput, some may need lower latency. The application designer decides, and the SQLstream scheduler delivers.

Next time …

This is the first in a series of blogs discussing both the principles and practical examples of parallel stream execution. The next blog in the series will look at some real-world examples, and at how parallel execution is essential to meet both high throughput and low latency requirements.

February 17, 2012

One of the great advantages of SQLstream as an analytical platform is that it uses the most popular, standardized language for data analysis, SQL. SQLstream worked to make only the minimum number of extensions to SQL necessary to encompass the streaming data paradigm, so that most streaming SQL pipelines look almost indistinguishable from SQL for reading static relational data. This enables data analysts to leverage virtually all of their existing SQL skills in the streaming context.

Similarly, SQLstream felt it was important to make the streaming environment feel familiar and productive to application developers as well, so SQLstream supports the standard JDBC interface for using streams, again with just the minimum extensions necessary to encompass streams.

This post assumes a basic familiarity with JDBC and its main components: connections, statements, and result sets. First we’ll look at these in their usual tabular context, then see what it takes to extend the model to streaming data. All the data items come from the SALES example schema that comes with SQLstream.

Reading JDBC Data

In a database world, the pattern for reading data is quite standardized: Connect to the database, execute a query, and read each row that comes back until there are no more rows. For example,


Connection c = getConnection();
try {
     Statement s = c.createStatement();
     ResultSet rs = s.executeQuery("SELECT * FROM SALES.EMPS");
     while (rs.next()) {
           System.out.println(rs.getString("NAME") + " " + rs.getString("EMPID"));
     }
     rs.close();
     s.close();
     c.close();
} catch (SQLException se) {
     // handle or log the error in real code
}

This simple example loops through the entire EMPS table, printing the name and employee ID number for each row in the table, then finishes. The key for this model is that the result set is finite. Even if this were a very large table, the loop would eventually process every row. So you can handle reading the data from a table as a monolithic step in a sequence of procedures.

The Challenge of Streams

In a streaming data environment, however, you have to change a couple of your basic assumptions. In particular, I have found three Rules of Streaming that dictate how to write client code:

  1. There always might be more data.
  2. You never know when the next row might arrive.
  3. The rate of the rows matters.

The same pattern shown earlier for reading finite tables will work for streams, as long as you don’t expect your application to do anything else. For a streaming ResultSet, the next() method only returns false when the stream closes. In many applications, that might never happen, or at least might not happen for weeks or months or longer, so clearly you cannot simply wait for each stream to end.

This is particularly critical in an application such as SQLstream Studio, where a developer needs to be able to edit the definitions of objects and at the same time be able to watch data flowing in existing streams. These streaming data views – known as Inspect windows – have to function in an event-driven, multitasking environment. There can be a virtually unlimited number of them active at any given time, along with editors, console views, and other dynamic content. So nothing should block the updates of other views. And for a little added complexity, Studio also needs to be able to handle non-streaming items as well, so preferably the same code should handle either tables or streams.

Studio also has to deal with a number of other requirements that are fairly unique to the development environment, such as how to manage updating a human-readable window of maybe 20 or 30 rows at a time from a stream that might be flowing at many thousands of rows per second. For now, we will just focus on the tasks common to handling streams in any application environment.

Reading in the Background

As with any long-running task, the solution involves partitioning the work into threads. Because of Rule #1 (“There always might be more data.”), you should just assume that reading from a stream needs to happen in a background thread. There is essentially never a scenario in which you want to wait for an entire stream to be read before proceeding to the next task.

To some extent how you implement your background stream-handling tasks will depend on the environment you are working in. Variations in the data rate and the required responsiveness of the application might cause you to make some different choices.

One good rule of thumb is to do as little processing as possible in the stream-reading loop. Slow processing of incoming rows can result in “back-pressure” in the data pipeline. As a result, it’s best to read rows from the ResultSet as expeditiously as possible, handing off the data to other threads for processing.

Here, for example, is a simple thread to read rows from the SALES.BIDS stream:


class BidReader extends Thread
{
      @Override
      public void run()
      {
            try {
                 Connection c = getConnection();
                 Statement s = c.createStatement();
                 ResultSet rs = s.executeQuery("SELECT STREAM * FROM SALES.BIDS");
                 while (!interrupted() && rs.next()) {
                       // read columns and put into work queue for processing thread
                 }
                 // close rs, s, and c
            } catch (SQLException se) {
                 // handle or log the error in real code
            }
      }
}

Note that the loop doesn’t depend solely on the ResultSet’s next() method, but also tests whether the thread has been interrupted. You probably wouldn’t use this exact mechanism (I prefer to override Thread.interrupt() and set a boolean flag), but it shows you succinctly that you need to be aware of more things than just whether there is more data to read.

Another thing to keep in mind is that ResultSet.next() blocks until either more data arrives or the stream closes. That’s yet another reason to have this happening in a background thread.

You’ll notice that code snippet references a “processing thread.” That’s because of Rule #2 (“You never know when the next row might arrive.”). It’s generally good to decouple the reading code from the processing code. One easy model is to have the reader read each row, then add the row to a shared queue structure, where the processing thread can pick up the rows and process them asynchronously. This minimizes the chances of a slow processing step causing the reader to slow down and potentially push back up the stream. To the extent possible, you’d like your reader to be sitting and waiting for the next row when it arrives, rather than constantly trying to handle a backlog of incoming rows. If you have to have a backlog, you want it to be in your application, not in the stream server, because that will slow everyone down, including other applications trying to read from the same streams.

Such decoupling doesn’t really make sense in a pure database application, since there will be a finite number of rows to process, so your goal is to minimize total processing time. If there are 1,000 rows, it doesn’t matter whether you read them all in one second, then spend an hour processing, or read them over the period of an hour, processing as you go. The total amount of data read and processed is the same.

In the streaming world, it is important to read the data as quickly as possible in order to maximize the overall throughput of the system. That’s Rule #3 (“The rate of the rows matters.”) coming into play. The data throughput of a given stream is gated by its slowest reader. Part of the contract your client needs to fulfill is to not bog down the system for other clients.


Vice President Marketing
January 26, 2012

QoS and service level monitoring have always presented a challenge for telecommunications companies. The increasing uptake of IP voice and video services, the vast data volumes generated, and the lack of an end-to-end view make monitoring the service experience in real-time increasingly difficult.

Real-time QoS Monitoring

In this blog I’m looking at the core building blocks of a real-time IP service monitoring solution using a much simplified view of a real-time application. Diagram 1 illustrates the basic problem – how to monitor an IP service when the end-to-end view is only possible by piecing together large volumes of events from many different sources: the core network provider’s network, the home network and cable modem, and the service providers’ platforms.

In SQLstream we capture each event stream in real-time. Applications are built as streaming pipelines – unlike a traditional database solution, where event data must first be stored and then processed, SQLstream streams the data through processing views, capturing, combining, filtering, aggregating and applying analytics to the event streams without having to store the data. This enables real-time operational intelligence at extremely high volumes with very low latency.

The first views in the pipeline capture the data streams. A declaration for an external data feed is shown below – the real-time MyEvent stream, where the source of events is the external system agent or integration adapter.

CREATE OR REPLACE STREAM MyEvent
( "eventName" VARCHAR(10),
  "eventSeq" BIGINT,
  "eventVal1" INTEGER,
  "eventVal2" SMALLINT,
  "eventVal3" BIGINT )
DESCRIPTION 'source of events';

The raw MyEvent data is first filtered, searching for the events of interest. As illustrated in the code example below, these initial views tend to be as simple as possible in order to maximize reuse – the simplest being a SELECT STREAM * FROM ... WHERE ... statement. Streams can be combined, grouped or joined in a single view, or a single view can be provided per stream, or both.

CREATE OR REPLACE VIEW RawEvents 
    AS SELECT STREAM * FROM MyEvent 
WHERE "eventName" = 'RawEvent';

Diagram 2 illustrates the concept of the streaming data pipeline, using a simplified example for exception detection. The SQL view illustrated above is the Stream Capture #1 view in the diagram. The use case is built on a real world example, raising an exception if a number of events of a particular type or value are detected within a specified time window.

Real-time Stream Processing Pipeline

The second view in the pipeline, Stream Processor #1, is shown below. In this example the view is responsible for the basic processing of the stream, counting the number of events that occur within a time window, in this case 180 seconds.

CREATE OR REPLACE VIEW CountedEvents AS
SELECT STREAM *,
    COUNT("eventName") OVER win AS "eventCount",
    FIRST_VALUE(RE.ROWTIME) OVER win AS "firstEventTime",
    FIRST_VALUE("eventSeq") OVER win AS "firstEventSeq"
FROM RawEvents AS RE
WINDOW win AS (RANGE INTERVAL '180' SECOND(3) PRECEDING);

The final stage in this particular processing pipeline is the detection of the alert.

CREATE OR REPLACE VIEW FlagTriggerEvents 
    AS SELECT STREAM *, 
    "eventCount" >= 3 AS "alert" 
FROM CountedEvents;

It would of course be possible to include all processing in a single view. However, maximizing reuse of views is a major consideration when building a stream processing application. The example illustrates how a pipeline can be constructed in which each view can have any number of consumers. For example, any number of rule views can read from the Stream Processor #1 view, and any number of views can read directly from the stream capture view, as the sketch below illustrates.
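
A minimal sketch of a second, hypothetical rule reading from the same CountedEvents view (the view name and the threshold are for illustration only):

-- Hypothetical second rule: flag rows whose "eventVal1" exceeds a threshold
CREATE OR REPLACE VIEW FlagLargeValueEvents AS
    SELECT STREAM *,
        "eventVal1" >= 1000 AS "alert"
    FROM CountedEvents;

Both FlagTriggerEvents and this second view read from the same upstream CountedEvents view, mirroring the reuse described above.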

The application includes significantly more sophisticated integrations, features and analytics than illustrated here. For example:

  • Multiple rules
  • Recording and forwarding the events responsible for generating the alerts
  • Detecting escalation
  • Detecting clearance events
  • Joining with alert history to identify exceptional events that deviate significantly from historical norms

These use cases are important components of a complete solution, and I’ll be providing examples in subsequent blogs, explaining how these have been implemented.


Vice President Marketing
January 11, 2012

For Big Data, 2012 has started where 2011 left off, with a plethora of reports, articles and blogs. Interestingly, most still begin with the question “what is Big Data”. It appears ‘Big Data’ as a market is broadening its footprint far beyond its open source and Hadoop origins. My favourite new term in this quest for delineation is “Small Big Data”. (Isn’t that just “Data”?)

The most interesting trend for us is streaming Big Data processing and analytics. Edd Dumbill of O’Reilly Radar lists this as one of his “Five big data predictions for 2012”: “Hadoop’s batch-oriented processing is sufficient for many use cases, especially where the frequency of data reporting doesn’t need to be up-to-the-minute. However, batch processing isn’t always adequate, particularly when serving online needs such as mobile and web clients, or markets with real-time changing conditions such as finance and advertising.”

The real-time use case is an obvious one. If you need to respond or be warned in real time or near real time, to security breaches, for example, or to a service-impacting event on a VoIP or video call, the high initial latency of batch-oriented data stores such as Hadoop rules them out.

However, there is also an emerging discussion on the storage of Big Data for big data’s sake. This is the blind collection and storage of data without due consideration as to how it’s going to be used. Dan Woods talks about this in his recent Forbes article “Curing the Big Data Storage Fetish”. The data will never create value without analysis, and little thought has been given to increasing analytics capacity.

There are many vendors emerging for the historical analysis of Big Data repositories, either on the Hadoop platform or on platforms from the other large-scale data warehouse vendors. However, there are very few vendors in the streaming Big Data analytics space, and even fewer products with the maturity, flexibility and scalability to process Big Data streams in real time.

Streaming Big Data analytics needs to address two areas. First, the obvious use case: monitoring across all input data streams for business exceptions in real time. This is a given. But perhaps more importantly, much of the data held in Big Data repositories is of little or no business value, and will never end up in a management report. Sensor networks, IP telecommunications networks, even data center log file processing are all examples where a vast amount of ‘business as usual’ data is generated. It’s therefore important to understand what’s being stored, and only persist what’s important (which admittedly, in some cases, may be everything). For many applications, streaming data can be filtered and aggregated prior to storing, significantly reducing the Big Data burden and significantly enhancing the business value of the stored data. At least until we understand why we’re trying to store everything.
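
To make that concrete, here is a minimal sketch of pre-storage aggregation in streaming SQL, reusing the MyEvent stream declared in the QoS monitoring example above (the view name and the choice of a one-minute rollup are illustrative assumptions, not a prescription):

-- Sketch only: roll the raw stream up to one row per event type per minute
CREATE OR REPLACE VIEW MinuteSummary AS
    SELECT STREAM
        FLOOR(e.ROWTIME TO MINUTE) AS "minuteStart",
        "eventName",
        COUNT(*) AS "eventCount",
        MAX("eventVal1") AS "maxVal1"
    FROM MyEvent AS e
    GROUP BY FLOOR(e.ROWTIME TO MINUTE), "eventName";

Persisting one summary row per event type per minute, rather than every raw event, is one way to cut the storage burden while keeping the figures a management report would actually use.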

Posted under Big Data

Vice President Marketing
November 18, 2011

Event report from the ITS California Annual General Meeting and Exhibition in Long Beach, Nov 13–15, 2011.

ITS California’s AGM is SQLstream’s local intelligent transportation event, and an opportunity for public and private sector companies to discuss issues specific to the state of California. The event has grown significantly over the past few years, with new organizations attending, all keen to contribute and discuss different perspectives on the various transportation problems. ITS-CA was established in 1994 as a not-for-profit organization with the remit to foster the adoption of ITS technology across the state. Its funding comes from a mix of public and private industry.

SQLstream setting up the booth at ITS California

California has some acute transportation issues. For example, in Los Angeles County, the number of new residents is set to increase traffic on already congested roadways by an estimated 39 percent. Roadway expansion over the same period is set to increase by only 3 percent, resulting in congestion levels rising by more than 200 percent in the next 25 years. Addressing these issues has been the focus of the Californian transportation industry as a whole, but ITS-CA serves as an annual focal point for discussing them in a wider forum.

The theme of this year’s event, therefore (not surprisingly), was “Discovering Keys to the Next Decade”, with sessions focused on:

  • Support for connected vehicle deployments.
  • Transportation systems efficiency, for example reliable travel time and pricing.
  • Analyzing the success of traffic light systems and how close we are to the ‘continuous green’ vision.
  • Improving traveller information systems (of specific interest to SQLstream).
  • The effectiveness of schemes such as car sharing.

There’s always an over-arching theme or direction that emerges, and in this case it was the increasing presence of the private sector, and how private-sector innovation is helping to drive the adoption of new technology and approaches. This was a common topic at the ITS-CA meeting in Berkeley last year, and has continued to grow. It was also a trend raised frequently at the recent ITS World Congress in Orlando. Long may it continue.

November 9, 2011

The Tutorial blog series helps SQLstream developers build streaming SQL applications. This blog is the third and final part of the Geospatial Visualization tutorial. The first blog in the series set out the streaming use case for connecting SQLstream to a Google Earth visualization, and described the initial steps required to capture the data and create a display list using Rails. The second part of this tutorial presented the core of the application – how to render the display list. In this concluding part, the final key element is discussed: how to get the data flowing.

Getting the Data Flowing

The last step is to tie SQLstream and Postgres together. First we need to give SQLstream the credentials to access Postgres. Create a new file:

$QUAKES/quake.postgres.properties
1 URI=jdbc:postgresql://localhost/quakekml_development 
2 DRIVER=org.postgresql.Driver 
3 CONNPARAMPREFIX=dbConn_ 
4 dbConn_databaseName=quakekml_development 
5 dbConn_user=USER
6 dbConn_password=PASSWORD
7 dbConn_applicationName=SQLstream TableReader Adapter

On lines 5 and 6 insert the Postgres user and password you set up (same as in $QUAKES/quakekml/config/database.yml). Create a directory under $SQLSTREAM_HOME/plugin called jndi (if it’s not there already), copy your quake.postgres.properties file there and restart the SQLstream server.


linux> cd $SQLSTREAM_HOME/plugin
linux> mkdir jndi
linux> cp $QUAKES/quake.postgres.properties jndi/
linux> cd $SQLSTREAM_HOME
linux> bin/sqlstreamd

If you haven't loaded webfeed.sql and usgs.sql as described in the beginning of this post, you should do so now. Now we need to write the SQL that bridges the databases: $QUAKES/viz.sql


 1 SET SCHEMA '"WebFeed"';
 2 
 3 CREATE OR REPLACE FOREIGN DATA WRAPPER "TableUpdate"
 4     LIBRARY 'class com.sqlstream.plugin.tableupdate.TableUpdateStreamControlPlugin'
 5     LANGUAGE java
 6     DESCRIPTION 'adapter for doing insert/update/merge/delete to an external database';
 7 
 8 --
 9 -- Create SQL/MED foreign server and foreign stream
10 --
11 CREATE OR REPLACE SERVER "Postgres_TableUpdate"
12 FOREIGN DATA WRAPPER "TableUpdate"
13 OPTIONS (
14     connParams 'quake.postgres',
15     sqlDialect 'Postgres 8.x',
16     pollingMillis '5000',
17     commitCount '1000',
18     commitMillis '2000')
19 DESCRIPTION 'Postgres database with visualization';
20 
21 --
22 -- Display list for quake events
23 --
24 CREATE OR REPLACE FOREIGN STREAM "QuakeEventsDB" (
25     "SQLS_opcode" CHAR(2) NOT NULL,
26     "SQLS_chg" VARBINARY(32),
27     "id" INTEGER options("insert" 'skip', "update" 'skip'),
28     "when" TIMESTAMP,
29     "lat" DOUBLE,
30     "lon" DOUBLE,
31     "mag" DOUBLE,
32     "created_at" TIMESTAMP,
33     "updated_at" TIMESTAMP
34     )
35     SERVER "Postgres_TableUpdate"
36     OPTIONS (
37         TYPE 'tableUpdates',
38         MASTER 'true',
39         updatesTable 'quake_events')
40     DESCRIPTION 'table updated with quake events';
41 
42 --
43 -- Pump quake events into display list
44 --
45 CREATE OR REPLACE PUMP "1000-QuakeEventsPump" STOPPED
46 DESCRIPTION 'pump from "SmallQuakesDay" view to "QuakeEventsDB" foreign stream' AS
47     INSERT INTO "QuakeEventsDB" (
48         "SQLS_opcode", "when", "lat", "lon", "mag",
49         "created_at", "updated_at")
50     SELECT STREAM 'IN',
51         q.ROWTIME,
52         CAST(SUBSTRING("point", 1, POSITION(' ' IN "point") - 1) AS DOUBLE),
53         CAST(SUBSTRING("point", POSITION(' ' IN "point") + 1) AS DOUBLE),
54         "mag",
55         CURRENT_TIMESTAMP, CURRENT_ROW_TIMESTAMP
56     FROM "SmallQuakesDay" as q;

The first 20 lines set up the table updater (note the reference to our properties file on line 14). We then create a foreign stream to describe the Postgres table quake_events. When Rails created the table, it added an auto-incrementing id field and two timestamps, created_at and updated_at. The options specified on line 27 cause SQLstream to ignore the id column and let Postgres maintain it. At line 45 we describe the pump that reads from SmallQuakesDay and inserts into the stream we defined above. In the select clause we use the 'IN' opcode to indicate that we're inserting and the ROWTIME of the quake record to set the 'when' column in the display list. We parse the 'point' column the USGS provides, which is in the format "lat<space>lon", on lines 52 and 53. To set the Rails timestamps correctly we use CURRENT_TIMESTAMP to get the creation time and CURRENT_ROW_TIMESTAMP to get the time of this update. Load this file into SQLstream with:

linux> sqllineClient < viz.sql

Now all that's left to do is start the pump. Create this one-line file:

$QUAKES/start-pump.sql
1 ALTER PUMP "1000-QuakeEventsPump" START;

and start the pump:

linux> sqllineClient < start-pump.sql

Data should now be flowing from the USGS web service, through SQLstream, into Postgres, and rendered to Google Earth from Postgres via Rails. To verify that data is flowing, you can use the 'Edit quake events' link in your web app. When you follow the Google Earth link, you should see pins scattered across the globe indicating the past day's earthquakes, updated every 60 seconds. You'll probably have to zoom out a bit to see them, unless San Francisco is having another bad day.
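
Another quick sanity check, if you prefer to look at the stream directly, is to run a streaming SELECT from a sqllineClient session. This is a sketch only; it assumes, as viz.sql does, that SmallQuakesDay lives in the WebFeed schema (adjust if your usgs.sql placed it elsewhere):

-- Sketch: watch quake rows arrive on the SmallQuakesDay view
SET SCHEMA '"WebFeed"';
SELECT STREAM * FROM "SmallQuakesDay";

Rows should scroll past as new quake events arrive; cancel the query when you've seen enough.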

And finally

That concludes the streaming data visualization tutorial. In case you've missed the earlier posts, the complete series can be found using these links:

  • Part 1: Capture the data and create the display list
  • Part 2: Rendering the display list
  • Part 3: Making it work with flowing data

Please contact us if you have any questions.

Posted under SQLstream Tutorials
November 2, 2011

The Tutorial blog series helps SQLstream developers build streaming SQL applications. This blog is the second in the Geospatial Visualization tutorial.  The first blog in the series set out the streaming use case for connecting SQLstream to a Google Earth visualization, and described the initial steps required to capture the data and create a display list using Rails.  In the second part of this tutorial, we’re going to discuss the meat of the application –  how to render the display list.

Rendering the Display List

To keep Google Earth continuously updated with the data flowing from SQLstream we’ll have to serve two different KML files: one contains a KML Placemark for each quake, and the other gives the URL of the quakes feed and tells Google Earth to continuously refresh it (in KML, this is the NetworkLink). We’re going to be serving compressed KML to cut down on the transmission time, so we’ll need the rubyzip gem we installed earlier included in our web app. Stop the server, go to $QUAKES/quakekml, and edit the file “Gemfile” to add this line to the end of the file:

gem 'rubyzip'

then issue these commands:

linux> bundle
linux> bundle package
linux> rails generate controller home index
linux> rails generate controller quakes start feed

The first two commands bundle up the application, including the new gem. The third command creates a Rails controller for our home page, while the last command creates a controller with two actions, one for each of our services. If you restart the server now, you can view these services at http://localhost:3000/home/index, http://localhost:3000/quakes/start and http://localhost:3000/quakes/feed. Edit $QUAKES/quakekml/config/routes.rb and add this line after the “get” commands to make home/index the home page for the web app:

root :to => "home#index"

You’ll also have to remove the file $QUAKES/quakekml/public/index.html. Restart the server and visit http://localhost:3000; you should now see the home#index default page rather than Rails’ startup page. Note that it shows you the name of the template file for this page, relative to the $QUAKES/quakekml directory. Edit $QUAKES/quakekml/app/views/home/index.html.erb to create the content for your landing page; at some point in the body, add this line to create a link to the start service:

<%= link_to 'Earthquake events (open in Google Earth)',
      :controller => 'quakes', :action => 'start' %>

You should also add this link to the scaffolding for quake events:

<%= link_to 'Edit quake events', quake_events_path %>

You shouldn’t have to restart the server; just refresh http://localhost:3000 to see the changes.

The Ruby code that implements our services is in $QUAKES/quakekml/app/controllers; starter code has already been written by the “rails generate” commands we’ve been issuing. The ancestor of all of our controller classes is in application_controller.rb. We’ll add a method for setting up parameters available in any request (in this case, only one: the path to the server) and two methods for sending text so that the browser recognizes it as KML or KMZ:

$QUAKES/quakekml/app/controllers/application_controller.rb

 1 require 'zip/zip'
 2
 3 class ApplicationController < ActionController::Base
 4   protect_from_forgery
 5
 6   # Set up the common params to be computed
 7   # after the request is received (can't go in
 8   # initialize). These are available in all views.
 9   #
10   def setup_common
11     @path = request.host + ':' + request.port.to_s
12   end
13
14   # Output the kmz, given the kml
15   #
16   def send_kmz(kml)
17     t = Tempfile.new("zipout-#{request.remote_ip}")
18     Zip::ZipOutputStream.open(t.path) do |zos|
19       zos.put_next_entry("sqlstream.kml")
20       zos.print kml
21     end
22
23     send_file t.path,
24       :type => "application/vnd.google-earth.kmz",
25       :filename => "sqlstream.kmz"
26
27     t.close
28   end
29
30   # Output the kml directly
31   #
32   def send_kml(kml)
33     render :text => kml,
34       :layout => false,
35       :content_type => "application/vnd.google-earth.kml+xml"
36   end
37 end

Next we edit quakes_controller.rb to write the methods that respond to the quakes/start and quakes/feed requests. Each method uses Rails’ template support to render a KML template, with an option set to prevent it from being laid out like an HTML page. The start method sets an instance variable, @refresh, to the number of seconds we want to wait before refreshes. The feed method uses Rails’ database support to store all of the quake event rows in @quakes. These instance variables are expanded in the templates.

$QUAKES/quakekml/app/controllers/quakes_controller.rb

 1 class QuakesController < ApplicationController
 2   def start
 3     setup_common
 4     @refresh = 60
 5     kml = render_to_string :template => 'quakes/start.kml',
 6       :layout => false
 7     send_kmz kml
 8   end
 9
10   def feed
11     setup_common
12     @quakes = QuakeEvent.all
13     kml = render_to_string :template => 'quakes/feed.kml',
14       :layout => false
15     send_kmz kml
16   end
17 end

The path for the template files is relative to $QUAKES/quakekml/app/views; the quakes directory there should already exist and contain the default templates generated by Rails. We’ll create two new template files, starting with the one for quakes/start:

$QUAKES/quakekml/app/views/quakes/start.kml

 1 <?xml version="1.0" encoding="UTF-8"?>
 2 <kml
       xmlns="http://www.opengis.net/kml/2.2"
       xmlns:gx="http://www.google.com/kml/ext/2.2"
       xmlns:kml="http://www.opengis.net/kml/2.2"
       xmlns:atom="http://www.w3.org/2005/Atom">
 3 <Document>
 4   <name>Earthquake Monitor</name>
 5   <open>1</open>
 6   <visibility>1</visibility>
 7   <LookAt>
 8     <longitude>-122.418955</longitude>
 9     <latitude>37.775410</latitude>
10     <altitude>359000.0</altitude>
11     <range>37000.0</range>
12     <altitudeMode>relativeToGround</altitudeMode>
13   </LookAt>
14   <NetworkLink>
15     <name>Quakes</name>
16     <open>0</open>
17     <visibility>1</visibility>
18     <refreshVisibility>0</refreshVisibility>
19     <flyToView>0</flyToView>
20     <Link>
21       <href><%= url_for(:controller => 'quakes', :action => 'feed', :only_path => false) %></href>
22       <refreshMode>onInterval</refreshMode>
23       <refreshInterval><%= @refresh %></refreshInterval>
24       <viewRefreshMode>onStop</viewRefreshMode>
25       <viewRefreshTime>1.0</viewRefreshTime>
26     </Link>
27   </NetworkLink>
28 </Document>
29 </kml>

The start KML begins with a LookAt element specifying the starting view (directly above SQLstream HQ!). The NetworkLink element includes two substitutions handled by Rails: at line 21 we insert the URL for the quakes/feed service, and at line 23 we insert the refresh rate.

$QUAKES/quakekml/app/views/quakes/feed.kml

 1 <?xml version="1.0" encoding="UTF-8"?>
 2 <kml xmlns="http://www.opengis.net/kml/2.2"
       xmlns:gx="http://www.google.com/kml/ext/2.2"
       xmlns:kml="http://www.opengis.net/kml/2.2"
       xmlns:atom="http://www.w3.org/2005/Atom">
 3 <Document>
 4   <name>Pins</name>
 5   <open>1</open>
 6   <visibility>1</visibility>
 7   <Style id="pin">
 8     <IconStyle id="pin">
 9       <scale>1.0</scale>
10       <Icon>
11         <href>http://<%= @path %>/images/pin.png</href>
12       </Icon>
13     </IconStyle>
14     <LabelStyle>
15       <color>ff0000dd</color>
16       <scale>1.2</scale>
17     </LabelStyle>
18   </Style>
19   <Folder>
20     <name>Earthquakes</name>
21     <open>0</open>
22     <visibility>1</visibility>
23     <description></description>
24     <%= render :partial => "quake_event", :collection => @quakes %>
25   </Folder>
26 </Document>
27 </kml>

At line 11 we use the @path variable to specify the location of an image we want to appear on the globe at each quake location. You should place a suitable pin image at $QUAKES/quakekml/public/images/pin.png.

The other substitution, at line 24, causes a partial template to be rendered for each record in the @quakes collection. According to Rails’ naming conventions, the partial must be in this controller’s view directory with the name _quake_event.html.erb:

$QUAKES/quakekml/app/views/quakes/_quake_event.html.erb

 1 <Placemark>
 2   <name><%= quake_event.mag %></name>
 3   <open>1</open>
 4   <visibility>1</visibility>
 5   <description><![CDATA[<%= render :partial => 'quake_description',
         :locals => {:quake_event => quake_event} %>]]></description>
 6   <styleUrl>pin</styleUrl>
 7   <Point id="quake">
 8     <extrude>false</extrude>
 9     <coordinates><%= quake_event.lon %>,<%= quake_event.lat %>,0</coordinates>
10   </Point>
11 </Placemark>

The quake event records are inserted at lines 2 and 9. At line 5 we reference another partial that renders the HTML for the popup that appears when you click on the pin in Google Earth:

$QUAKES/quakekml/app/views/quakes/_quake_description.html.erb

 1 <table style="text-align: center; width: 300px;" border="0" cellpadding="2" cellspacing="2">
 2   <tbody>
 3     <tr align="left">
 4       <td>on <%= quake_event.when.strftime("%Y/%m/%d") %>
          at <%= quake_event.when.strftime("%X %Z") %></td>
 5     </tr>
 6     <tr align="left">
 7       <td>at lat: <%= number_with_precision(quake_event.lat, :precision => 6) %>
          lon: <%= number_with_precision(quake_event.lon, :precision => 6) %></td>
 8     </tr>
 9   </tbody>
10 </table>

Note that this separation into multiple templates lets us express code in Ruby files, KML in KML files, and HTML in HTML files. Details about how the data is presented, such as how we format a timestamp, are taken out of the code and expressed in markup language.

You should now have a working KML renderer. Go to your app’s home page and follow the ‘Edit quake events’ link to add one or more fake quakes. Use SQLstream’s lat/lon from start.kml (above) for at least one of them. Now follow the Google Earth link from your app’s home page (you may have to instruct your browser to open KML/KMZ files in Google Earth, if you’ve never done this before). Google Earth should open and zoom to SQLstream HQ, and there should be a pin indicating an earthquake there.

You can now refine the visualization of a quake event by updating the templates and refreshing the display (you can refresh the quakes/feed stream by right-clicking on ‘Quakes’ in the Places tree on the left and selecting ‘Refresh’). We now have a tool to display whatever quake events are dropped into the database; the next step is to feed it from SQLstream.

Next time
Part 3 of the visualization tutorial concludes this series and will be published next week. It will discuss the final key element – how to get the data flowing.

Posted under SQLstream Tutorials

Vice President Marketing
October 21, 2011

It struck me that the underlying theme of the conference could be described as ‘low risk innovation’.  An oxymoron?  At first glance, yes, but in this case it describes the cautious adoption of new technology while protecting existing investment.

The perception of the ITS industry, rightly or wrongly, has been one of an industry focussed on major manufacturing and hardware deployment projects rather than software and new technologies. Perhaps as a result, the industry hasn’t seen the level of growth and innovation that has occurred in other areas, telecommunications being a good example. Or perhaps it simply hasn’t been possible until new technologies came along with a sufficiently compelling business case: lower-cost solutions that are faster to deploy.

New technology introduction – complementary and overlaid

GPS and Bluetooth featured strongly at the event, both on the exhibition floor and in the breakout panel sessions. It was clear, however, that the preference was to use these either as a proof of concept across a small area or, where monitoring infrastructure already exists, as complementary solutions, extending the accuracy and scope of existing fixed-road sensor deployments.

Software solutions and architecture

It was interesting how many of our booth visitors were initially attracted by SQLstream ITS Insight, but were actually looking for more of a horizontal platform solution. Yes, out-of-the-box capability was important to get solutions up and running quickly, but wider concerns touched on some fundamental software engineering principles: openness, scalability and interoperability.

Open and flexible

The focus on manufacturing and hardware-oriented projects tends to produce software support systems that are built to do just that: support the particular hardware installed for that project. This leads to capable and often feature-rich systems, but systems that are difficult to extend and configure for new technologies and requirements. Openness was a definite theme in the questions being asked at our booth: open platforms, where an agency’s IT department, or a consulting partner, can add new applications easily, and in fact is encouraged to do so.

Performance and scalability

Systems are required to scale in two ways. The first is raw performance, as the number of sensor events increases and real-time performance is required. The second, and perhaps less obvious, is that transportation agencies are looking to consolidate systems and provide common systems across multiple counties and even at the state level. This immediately highlights the scalability issues with existing systems. As the user base grows, the geographical scope of the system increases and the drive for real-time information intensifies, the weaknesses of existing systems have been exposed.

Standards

The emergence of IT standards is a sure sign of increasing maturity in any industry.  This represents a move away from limited, siloed solutions to consideration of the wider integration issues – integration with network hardware, but also the integration between management systems.

A final word …

One final word: not quite a trend yet, but ITS appears to be embracing the Cloud as a solutions platform, as a mechanism for providing access to applications, but also for scalability and as a lower-cost option where large infrastructure deployments would otherwise be required.


Vice President Marketing
October 19, 2011

A defining feature of the show is the Technology Showcase, featuring demonstrations of some of the technologies and applications that are bringing the future of transportation to life. Each ‘village’ covers a specific theme such as Safety, Mobility, Environment/Sustainability and Pricing. Environment/Sustainability focuses on the potential for reducing emissions. One interesting demonstration, offered by Imperial College London, uses a pollution sensor mounted on the roof of a demonstration vehicle travelling around the site; data is transferred over a GSM connection and viewed over the internet. Another is Ricardo Engineering’s demonstration of improving fuel efficiency using GPS navigation and traffic signal phase and timing data.

On the theme from yesterday of using simulators to demonstrate the benefits of technology, Toyota’s Star Safety System is a star of the show: attendees get a spin in an off-road driving simulator, once with all electronic aids turned off, and once with traction control, stability control, brake assist and anti-lock braking turned on.

As for SQLstream, it was another busy day, with many questions and demonstrations of real-time traffic analytics and congestion detection with SQLstream ITS Insight. Our core Stream-to-Business platform, and the ability to integrate and extend SQLstream ITS Insight easily and quickly, is generating significant interest.


Vice President Marketing
October 18, 2011

The ITS World Congress claims to be the largest transportation event of 2011. Certainly the range of attendees and exhibitors is impressive, from software products to the latest in roadside hardware infrastructure. Some interesting themes are emerging: vehicle-to-vehicle communication and electronic driver aids are generating a lot of interest, and there are some great simulators to amuse the attendees.

‘Real-time’ is also a common theme across the exhibition hall, and in particular the use of new technologies such as Bluetooth and GPS. Damian Black, SQLstream’s CEO, was speaking in a session today on how to achieve accurate arterial travel times. This is a hot topic right now, as existing in-road sensors are too expensive and the solutions too inaccurate. Bluetooth and GPS are the two emerging, although not necessarily competing, technologies; both can be used simultaneously, each reinforcing the other.

SQLstream in action – Day 1

It was great to have so much interest at the booth in our real-time intelligent transportation solutions. We announced the public launch of our SQLstream ITS Insight solution this morning. We may be one of the few new companies at the show, but the ability to build real-time traffic analytics solutions that maximize the use of all available sensor data is generating a lot of interest.