Blog


Vice President Marketing
November 16, 2012

Silicon Valley Comes to Oxford 2012 (SVCO 12) is an iconic entrepreneurship event, now in its 12th year, that brings the most relevant technology entrepreneurs from Silicon Valley to Oxford, UK, to engage with students at the University of Oxford’s Saïd Business School and with members of the wider Oxford entrepreneurial ecosystem. SQLstream CEO Damian Black is hosting and presenting sessions on two topics: the trials and tribulations of getting a Big Data tech startup off the ground in Silicon Valley, and the current state and future potential of the Big Data market. There’s also the debate at the University of Oxford Student Union, where Damian will be leading one side, but more on that in a separate blog. Other speakers include senior representatives from Google, IBM and McKinsey. Click here for the full list of speakers.

What they don’t teach you in Business School on how to create and build a start-up company.

Damian’s workshop on tech startups focuses on a real-world perspective that is often overlooked in Business Schools. Most are aware of the importance of developing the right ideas, the right business plans, on building the right team and attracting the right capital. These are of course all useful steps but the reality is often very different.

Damian is leading a workshop that will cover the lessons learnt and the scars earned during the early days of SQLstream and other start-ups, giving a real-world perspective on what was easy, what was hard, what worked, what didn’t, and the unanticipated pitfalls and unexpected boosts that together made up the start-up journey of taking a company from crazy concept to everyday reality.

The full material and paper from the workshop will be published here on the blog over the coming weeks.

Also coming up in the next few weeks: The “Big Data” Janus Effect (and why everyone might be looking the wrong way). Damian’s talk on Big Data at SVCO 12 will look at the intersection of Big Data and real-time, and whether the current emerging Big Data storage technologies are the beginning of a new era, or the final death throes of the static storage approach to business and operational intelligence that has persisted since the 70s.

Silicon Valley Comes to Oxford is on the 18th and 19th of November at the Oxford Centre for Entrepreneurship and Innovation.

Posted under Uncategorized

Vice President Marketing
September 28, 2012

We’re going to be exhibiting at Oracle OpenWorld with SQLstream next week. We’re looking forward to it, following some great analyst briefings over the past few weeks and a lot of interest now from companies looking for a real-time Big Data management platform where they can build their apps using standards-based streaming SQL.

In the beginning, or at least 18 months ago, there was Hadoop, and an increasing awareness that the Internet, sensors, M2M and cloud infrastructure were generating more data than existing platforms could manage. The trend was Map-Reduce, NoSQL databases and a wide range of analytics companies claiming ‘real-time’. But querying faster is not real-time, even with the distributed power of Map-Reduce and more structured storage platforms such as HBase and Cassandra. Nor do Hadoop and related technologies address one key requirement – the real-time integration of high volume, high velocity and high variety data. Real-time, streaming integration is essential to making Big Data applications work.

Even more importantly, mainstream industry is looking to adopt real-time streaming technology, but does not want to base its mission critical applications on high latency, non-standardised development frameworks.

This is where SQLstream comes in: a real-time Big Data management platform for integrating live data and performing real-time, in-memory analytics on the data as it streams past. Industry and major companies have a real-time need, but prefer the use of standards such as SQL. The ability to lift existing SQL from their high performance analytics databases and data warehouses and drop the queries into a real-time streaming platform is key.

So if you’re going, come and see us, we’re at #106 Moscone South.

Posted under Uncategorized

Vice President Marketing
June 25, 2012

We’ve been exhibiting at Structure 2012 in San Francisco, where our CEO Damian Black was speaking on dataflow architectures for massively scalable real-time Big Data computing. In fact, this was a milestone for us as Damian was on the very first Big Data panel at the first Structure event in 2008.

Dataflow is a technique for parallel computing that emerged from research in the 1970s. It’s based on graph-based execution models where data flows along the arcs of a graph and is processed at the nodes. It was decades ahead of its time, in an era when hardware was expensive and there was no mainstream requirement for massively parallel, low latency computing architectures. However, dataflow as an architecture has found its place and time, with the emergence of Big Data volume, real-time low latency requirements, commodity hardware and low cost storage. Dataflow is driving the architectures for today’s real-time Big Data solutions.

Structure 2012 - Dataflow comes of age

Click to view Structure 2012 presentation video

SQLstream adopted the principles of dataflow as the basis of our architecture for SQLstream s-Server. Our adapters turn any data source into a live stream of data tuples which are combined, aggregated and analyzed by the SQLstream s-Server platform. SQLstream has added one essential feature to dataflow – the use of SQL as a dataflow management language. SQL has been used for some time as the language of choice for relational database management systems, and in this context is getting bad press in light of new structures for Big Data storage and NoSQL queries. However, SQL is powerful, declarative (therefore applications can be built easily, quickly and cheaply) and is a natural paradigm for processing streaming dataflows. The benefit is extremely low latency with the ability to process massive volumes of live data over an unlimited number of servers – exactly the requirements of real-time Big Data. In fact, this is the only architecture capable of processing real-time Big Data streams. With real-time requirements now in the 20 to 100 million events per second range, power, scalability and low latency are key.

Diagram 1: Dataflow architecture for real-time streaming Big Data computing

The SQLstream s-Server architecture concept is illustrated in Diagram 1. As a dataflow architecture, each node is a streaming SQL statement – a continuous SQL query, processing arriving data over a moving time window (time windows can be from 1 millisecond for ultra low latency requirements, to months or even years where comparison against long term moving averages is required, for example, Bollinger bands). Why is this important? Well, it’s the only approach for low latency, real-time solutions, as information flows out of the system as soon as input data arrives, that is, the high latency of batch-based approaches such as Hadoop Map-Reduce is removed completely.
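To make the continuous-query idea a little more concrete, here is a minimal Python sketch of a moving time window that emits a result for every arriving row. It is not SQLstream code or streaming SQL syntax; the class name, the window length and the band width of two standard deviations are all assumptions made for illustration.

```python
from collections import deque
from math import sqrt


class SlidingWindowBands:
    """Moving average and Bollinger-style bands over a time window.

    Hypothetical helper for illustration only, not a SQLstream API.
    """

    def __init__(self, window_seconds, num_std=2.0):
        self.window_seconds = window_seconds
        self.num_std = num_std
        self.rows = deque()  # (timestamp, value) pairs, oldest first

    def push(self, timestamp, value):
        """Add one event and return (mean, lower, upper) for the current window."""
        self.rows.append((timestamp, value))
        # Evict rows that have fallen out of the moving window.
        while self.rows[0][0] <= timestamp - self.window_seconds:
            self.rows.popleft()
        values = [v for _, v in self.rows]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        band = self.num_std * sqrt(variance)
        return mean, mean - band, mean + band


if __name__ == "__main__":
    window = SlidingWindowBands(window_seconds=3)
    for ts, price in [(0, 10.0), (1, 12.0), (2, 11.0), (3, 13.0), (4, 9.0)]:
        print(ts, window.push(ts, price))
```

The point of the sketch is that output is produced as each input row arrives, rather than after a batch has been collected and processed.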

Mozilla Glow: Real-time download monitor with SQLstream and HBase

Damian presented a simple example of SQLstream and parallel dataflow in action. Mozilla’s Glow application was a continuously updating download counter for the Firefox 4 browser at the time of its release. The application used SQLstream s-Server to collect live download statistics from all the download servers worldwide. Download records were processed and aggregated in real-time and displayed on the Glow visualization map, illustrating exactly how many copies of the browser had been retrieved. SQLstream s-Server also provided a continuous ETL operation into Apache HBase, storing aggregated and filtered records for further in-depth analysis. Click here to watch the application in action.

Finally, and in contrast to Structure, we also attended a Gartner session last week with Merv Adrian and Svetlana Sicular, which sought to bring some sense of perspective to Big Data. This was really a reality check as to the current maturity levels of the Hadoop Big Data platforms and the effort required to deploy them. Wider adoption across industry in general will require significantly more mature products and applications, particularly around the OPEX costs for deployment, security concerns and the ability to deliver business intelligence to all consumers in a large organization. The recommendation was to use an integrator such as Cloudera or Hortonworks. Mainstream organizations are looking at the Hadoop / Big Data approach, but many do not currently see either a use case or a reason for adoption. It was interesting to hear a perspective that didn’t need to be buzzword compliant, and that presented a positive yet realistic view of wider adoption.

Posted under Big Data · Streaming SQL · Uncategorized

Vice President Marketing
June 7, 2012

We’re at Sensors Expo this week, showcasing in the Big Data & Analytics Pavilion. This is the first year the event has included a specific area for real-time Big Data solutions for sensor networks.

Real-time control in a Big Data World

SQLstream CEO Damian Black presented on Real-time Control in a Big Data World. The presentation focused on the increase of sensor data and the emergence of the “Sensor Internet”, plus the applications required to collect and analyze streaming sensor data, and to drive real-time actions and updates. In particular, it addressed the emerging real-time Big Data challenges in this area, driven by wireless and GPS technologies, M2M applications and V2V/V2I.

Click here to view Real-time Big Data integration and analytics for sensor networks

Real-time streaming data integration for Big Data

It’s clear the primary challenge is not managing the data volume per se, or even delivering real-time operational intelligence; rather, it’s the more fundamental issue of real-time streaming data integration. How can such huge volumes of data from many different sources and locations be integrated into the operational platforms, and how can the issues of multiple operational silos be overcome to provide an integrated real-time control platform?

Interestingly, these are the exact same issues SQLstream addresses for the Big Data and Hadoop world in general – getting data in, getting data out, connecting existing data stores in real-time, and delivering real-time in-memory analytics on the data as it streams past:

  • Real-time streaming data integration of any data source and between existing storage platforms and operational systems
  • Real-time streaming monitoring and analytics on the arriving and streaming data
  • Scalability through parallel distributed processing of processing pipelines

The importance of geospatial analytics

Geospatial analytics is a key requirement in the sensor data market. Big Data analytics in general is about one dimensional problems, usually the correlation of similar events, or the correlation of events over time. The geospatial dimension is the key difference between Big Data platforms for the “Sensor Internet” and the wider IT / machine data applications. Fortunately this has been a feature of SQLstream for some time, and central to many of our customer deployments. For example, real-time traffic analytics from GPS data, and real-time seismic monitoring.
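As a small illustration of what the geospatial dimension adds, the Python sketch below derives vehicle speed from a stream of GPS fixes using the haversine great-circle distance. It is a simplified example rather than anything taken from a SQLstream deployment; the function names and the (timestamp, latitude, longitude) tuple layout are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def speeds_kmh(fixes):
    """Yield the speed between consecutive (timestamp_s, lat, lon) fixes.

    Hypothetical helper: fixes are assumed to arrive in time order.
    """
    previous = None
    for fix in fixes:
        if previous is not None:
            hours = (fix[0] - previous[0]) / 3600.0
            if hours > 0:
                distance = haversine_km(previous[1], previous[2], fix[1], fix[2])
                yield distance / hours
        previous = fix


if __name__ == "__main__":
    track = [(0, 37.7749, -122.4194), (60, 37.7799, -122.4194)]
    print(list(speeds_kmh(track)))
```

Because `speeds_kmh` is a generator, each speed is produced as soon as the next fix arrives, which is the property a real-time traffic application needs.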

Posted under Uncategorized

Vice President Marketing
June 5, 2012

Glue Conference 2012 in Denver, CO, at the end of May was a great conference: well attended, with knowledgeable participants, and the only conference I know that looks at gluing cloud and mobile applications together with a developer focus.

There was the usual wave of NoSQL, cloud storage, cloud platforms and Hadoop presentations, as you’d expect, but also some interesting keynotes. Ray O’Brien, CTO for IT at NASA, talked about the evolution of Nebula and OpenStack at NASA, and James Governor from RedMonk talked about the evolution of historical analytics.

From our perspective, the strength of the show was in making physical rather than logical connections: both partnerships and potential customer interest in building real-time Big Data applications, and in how SQL has been repurposed as an API for streaming Big Data, moving it forward significantly from its roots as a static data management language.

Real-time, streaming Big Data

Relational Streaming for real-time Big Data scaling

Our CEO, Damian Black, presented on real-time streaming Big Data, both as a real-time alternative to Hadoop, and also as a complement to add real-time responses and streaming integration to existing Hadoop installations. One question we were asked several times was why SQL? A good question. This isn’t a religious debate about the language by any means, and if we had opted to build a Big Data batch storage and analytics platform (e.g. like Hadoop), we would have gone a different route.

However, when it comes to processing streaming tuples in real-time, a standard SQL approach has two big advantages over all others. First, with the extension of the SQL WINDOW operator to process streaming data over fixed time windows, both structured and unstructured data can be processed painlessly (no pun intended) without having to define a static schema and without the need for any coding whatsoever. In effect, SQLstream processes streams of arriving tuples over time windows and pushes out the results to other systems. This is similar in concept, at least, to Hadoop, although Hadoop is purely batch-based, processing static files and pipelining sets of tuples through low level Map-Reduce functions.
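For readers who think in code rather than SQL, the fixed time window idea can be sketched in a few lines of Python. This is only an analogy for what a windowed streaming query does, not SQLstream syntax; the function name and the (timestamp, payload) event layout are assumptions, and the payload is deliberately left untyped to reflect the "no static schema" point.

```python
def tumbling_counts(events, window_seconds):
    """Yield (window_start, count) for each fixed window as it closes.

    `events` is an iterable of (timestamp, payload) tuples in time order.
    Windows that receive no events are simply skipped in this sketch.
    """
    current_start = None
    count = 0
    for timestamp, _payload in events:
        start = (timestamp // window_seconds) * window_seconds
        if current_start is None:
            current_start = start
        elif start != current_start:
            # The previous window has closed, so push its result downstream.
            yield current_start, count
            current_start, count = start, 0
        count += 1
    if current_start is not None:
        yield current_start, count  # flush the last, still-open window


if __name__ == "__main__":
    stream = [(0, {"url": "/a"}), (5, "free text"), (10, 42), (25, None)]
    for window_start, n in tumbling_counts(stream, window_seconds=10):
        print(window_start, n)
```

Each result is emitted as soon as its window closes, so downstream systems see a steady flow of aggregates instead of waiting for a batch job to finish.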

Relational Streaming for Real-time Big Data

Click on the title slide to view the real-time Big Data presentation

However, the second benefit is equally important. Streaming SQL queries include standard operators such as GROUP BY and PARTITION. These provide the best clues possible to a query planner capable of automating the dynamic scaling of streaming pipelines over vast numbers of servers. This gives a reliable and controllable mechanism for Big Data scalability without the need for hardcoding server allocation hints.
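Why does a grouping key help with scaling? Rows that share a key can be routed to the same server, so each server can aggregate its own keys independently. The Python sketch below shows the idea with simple hash partitioning. It is an illustration of the general principle, not a description of how the SQLstream query planner works; the function names and the use of CRC32 as the hash are assumptions.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_workers):
    """Map a grouping key to a worker index with a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_workers


def route(rows, key_field, num_workers):
    """Split rows (dicts) into per-worker lists by the grouping key."""
    shards = defaultdict(list)
    for row in rows:
        shards[partition_for(row[key_field], num_workers)].append(row)
    return shards


def count_by_key(rows, key_field):
    """The per-worker aggregation: a plain GROUP BY ... COUNT(*)."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key_field]] += 1
    return dict(counts)


if __name__ == "__main__":
    rows = [{"city": c} for c in ["SF", "NY", "SF", "LA", "NY", "SF"]]
    for worker, shard in sorted(route(rows, "city", 3).items()):
        print(worker, count_by_key(shard, "city"))
```

Since every key lands on exactly one worker, the per-worker results can be combined without any further merging of counts, which is what makes the grouping key such a useful hint for distributing work.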

Real-time at GlueCON

The strength of the real-time track at GlueCON was encouraging. It was interesting, though, that the term ‘real-time’ is now about as overused as ‘Big Data’, and about as poorly understood. For SQLstream, it’s the streaming integration of any and all data sources with in-memory analytics, processing streams at millions of events per second. For some other vendors, it appears real-time drops off at significantly lower rates and numbers of connections!

Next stop, GigaOM Structure in San Francisco

Next stop GigaOM Structure in San Francisco, June 20 / 21 at the Moscone Center. Visit us there if you’re attending.

Posted under Big Data · Real-time · Streaming SQL · Uncategorized

Vice President Marketing
May 22, 2012

We’re at the ITS America Annual Meeting in National Harbor, Washington DC, this week (see the team in action below), the yearly opportunity for the US intelligent transportation community to get together and discuss how IT and technology can be used to better serve travelers, industry and government. Real-time, the Smart City and Vehicle-2-Vehicle communication have been the trending topics. Bill Ford’s earlier remark that “the car will be a rolling set of sensors” has been echoed here. Martin Thall, VP Telematics at Verizon, talked about the issues of getting the data up from the vehicle, and how 4G / LTE will be key due to the large volume of fast data; however, the pricing and subscription model for 4G services is currently one of the hurdles to wider telematics adoption. Essentially, these are real-time Big Data requirements for mission critical applications, the area where SQLstream excels.

Chris Vein, Deputy White House CTO, discussed the need for open standards, and for government to find ways for private industry to participate more. He described the idea of “government as platform”, encouraging the “lean startup” and offering “smart disclosure” of data to encourage information sharing and real innovation.

Why is my GPS Travel Time always wrong?

Ursula Burns, Xerox CEO, captured exactly the need for SQLstream s-Transport: how come, every morning when I drive to work, my GPS always tells me the journey will take 37 minutes, and it has yet to take 37 minutes? Accurate and reliable real-time Travel Time is the first problem SQLstream solved with the SQLstream s-Transport product. It uses GPS data to monitor traffic flow accurately and in real-time, and presents correct end to end journey time information throughout the journey, not just at the beginning. (More on SQLstream s-Transport here).

I’ll publish a final report on the show over the next few days. If you’re attending the ITS America Annual Meeting in National Harbor, Washington DC, please drop by Booth #631.

Posted under Uncategorized

Test
May 10, 2012

This week I’m attending an interesting conference at UC Berkeley called the “Berkeley conference on Streaming Data”. The organizers are primarily astronomers and statisticians, but the talks discuss issues and solutions to streaming data problems across a wide selection of scientific areas and engineering applications. Real-time streaming Big Data applications presented included oceanography, biology, genetics, reading handwriting, astrophysics, particle physics, recommendation engines for social media, and inevitably, real-time fraud detection from live data feeds.

I presented on a deployment of SQLstream as a Dynamically Scalable Cloud Platform for the Real-Time Detection of Seismic Events. Based on work with UCSD seismologists, SQLstream has been deployed to detect significant events in data collected from a large grid of seismic sensors. A large-scale data infrastructure (the OOI/CI) provides raw signal data over an AMQP message bus.

Plot of Seismic Events

SQLstream monitors live seismic data feeds in real-time, applying heuristic algorithms that look for patterns indicating earthquakes. The live system scales dynamically across multiple servers in a cloud environment based on the current demand. You can view the presentation here.  I also blogged previously on the application here.
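To give a flavour of what a streaming detection heuristic can look like, here is a small Python sketch of a short-term average / long-term average (STA/LTA) trigger, a widely used technique for picking events out of seismic signals. I should stress that this is a stand-in chosen for illustration: it is not the algorithm used in the deployment described above, and the function name, window lengths and threshold are all assumptions.

```python
from collections import deque


def sta_lta_triggers(samples, sta_len, lta_len, threshold):
    """Return sample indices where the STA/LTA energy ratio exceeds threshold.

    Illustrative only: real detectors add filtering, de-triggering logic
    and agreement across multiple stations.
    """
    sta = deque(maxlen=sta_len)  # short window: recent signal energy
    lta = deque(maxlen=lta_len)  # long window: background energy
    triggers = []
    for i, sample in enumerate(samples):
        energy = sample * sample
        sta.append(energy)
        lta.append(energy)
        if len(lta) < lta_len:
            continue  # wait until the long window is full
        long_avg = sum(lta) / lta_len
        short_avg = sum(sta) / len(sta)
        if long_avg > 0 and short_avg / long_avg > threshold:
            triggers.append(i)
    return triggers


if __name__ == "__main__":
    signal = [1.0] * 20 + [10.0] * 3 + [1.0] * 5
    print(sta_lta_triggers(signal, sta_len=2, lta_len=10, threshold=3.0))
```

The detector only ever looks at the most recent samples, so it can run continuously against a live feed without storing the full signal.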

In conclusion, I have two main observations from the conference so far (it continues until Friday). The first is that the majority of fields in science and technology appear to have a Big Data problem, and often a real-time Big Data problem. The second is the extent of the innovation and computer science resources dedicated to solving these problems, in particular, at this conference, the development of algorithms for data analysis and machine learning (that is, automatic pattern recognition) that work on streams of flowing data. It’s clear that traditional data management and even Big Data batch-based methods don’t work when you need continuous results from dynamic data. And the amount of data is huge.


Vice President Marketing
April 25, 2012

The Text Analytics Summit in London this week was an opportunity to catch up on the latest trends and state of the Text Analytics market.  An interesting couple of days with a few themes emerging.

Firstly, Big Data.  Not entirely unexpected, but almost every presentation referred to Big Data in some shape or form.  In part this was referring to the volume of data to be processed, but primarily in the context of databases for the storage and processing of unstructured data of any volume.

Although not discussed explicitly, there’s obviously a search for business models that work.  Most applications were B2B platforms, sold as a package of product, services and consultancy, enabling organizations to better mine text data for market and competitor intelligence.  However, some were seeking to monetize through subscriptions to information feeds.

 

For SQLstream, we presented on the use of real-time text analytics for improving incident detection and prediction.  In particular, the use of real-time Twitter and text messages for identifying Quality of Experience issues with IP content services, but also the use of Twitter for improving real-time incident detection in transportation networks.  And in line with the rest of the conference, we did our bit for Big Data, describing how real-time streaming integration and analytics can be built on unstructured data analytics as an integrated real-time Big Data and Hadoop platform.


Vice President Marketing
August 10, 2011

The latest product update of SQLstream, version 2.5.1, has just been released and shipped.  This will be the final 2.x release prior to the SQLstream 3 launch, and although SQLstream 2.5.1 is predominately a maintenance release, it does include a range of feature enhancements, including:

- Support for exponentially decaying averages in windowed aggregation functions

- Enhancements to the standard Log File and Socket Adapters
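For anyone unfamiliar with exponentially decaying averages, the Python sketch below shows the general idea: older values are weighted down smoothly with age instead of dropping out of a window all at once. This illustrates the concept only and says nothing about how the feature is implemented or invoked in SQLstream 2.5.1; the class name and the half-life parameter are assumptions.

```python
class DecayingAverage:
    """Time-weighted average where a value's weight halves every half-life.

    Hypothetical helper for illustration; timestamps must not go backwards.
    """

    def __init__(self, half_life_seconds):
        self.half_life = half_life_seconds
        self.weighted_sum = 0.0
        self.weight_total = 0.0
        self.last_timestamp = None

    def update(self, timestamp, value):
        """Fold in one observation and return the current decayed average."""
        if self.last_timestamp is not None:
            elapsed = timestamp - self.last_timestamp
            decay = 0.5 ** (elapsed / self.half_life)
            # Age everything seen so far by the elapsed time.
            self.weighted_sum *= decay
            self.weight_total *= decay
        self.weighted_sum += value
        self.weight_total += 1.0
        self.last_timestamp = timestamp
        return self.weighted_sum / self.weight_total


if __name__ == "__main__":
    avg = DecayingAverage(half_life_seconds=10)
    for ts, latency_ms in [(0, 10.0), (10, 20.0), (20, 20.0)]:
        print(ts, avg.update(ts, latency_ms))
```

Only two running totals are kept, so the memory cost is constant no matter how long the stream runs.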

A number of our customers across a range of industries, including transportation, sensor network management, e-Commerce and web analytics, have already downloaded and upgraded to the new version.

Click here if you’d like to register for SQLstream product downloads (note – registration is required), or would like to know more about solutions for real-time analytics and data management.

But what of SQLstream 3?  Well, watch this space.  SQLstream 3 will take a major step forward, not just in SQLstream’s stream computing product capability, but also in the way stream computing solutions are built and deployed.  More on this over the coming months.

Posted under Uncategorized

Vice President Marketing
April 7, 2009

April 7, 2009. Perez Hilton tweets and then blogs that Lindsay Lohan was just spotted leaving Santa Monica Pier while drinking a Red Bull. Paparazzi swarm the promenade as curious onlookers follow Twitter from their iPhones. Tipped off by their new real-time BI solution, a cutting edge advertising executive receives a mobile alert that there is a spike in Twitter activity about a client. The agency shelves the usual morning Raisin Bran ads for an energy drink campaign, all in real-time. Sales volume surges as the agency establishes web advertising leadership by delivering unique value to their client. The morning ends with executives on their Blackberries discussing the rapid progress made toward KPIs and sales goals while drinking a…….

Fresh off the Pentaho Partner Summit, SQLstream, Pentaho and SQL Power announce a collaborative partnership to deliver actionable, affordable real-time business intelligence. This easily integrated solution will query, triage and analyze high volume data feeds such as Twitter, on the wire, then deliver it via sub-second updates into an executive business intelligence dashboard.

SQLstream’s real-time analytics engine bridges the gap between operational business systems and the data warehouse. Pentaho is the world’s most widely deployed open source business intelligence suite and is based on the open source Mondrian OLAP server. SQL Power is the premier business intelligence and data migration consultancy in Canada, specializing in the implementation of cost effective BI solutions.

“Committed to open-source and open standards, the SQLstream, Pentaho & SQL Power alliance will break down the barriers that have prevented business leaders from capitalizing on their data,” said Damian Black, CEO of SQLstream. “We do more than monitor ad campaigns in real-time. SQLstream enables continuous revenue optimization by enabling ad campaigns to be adjusted in real-time, while at the same time optimizing pricing, catching fraud, detecting service failure and recognizing customer disaffection. By processing the data continuously in real-time, the classic ELT/ETL bottlenecks are removed from the data warehouse.”

“Enterprises need more from their data warehouses but can not afford to be locked into expensive and proprietary architectures,” said Lance Walter, Vice President of Marketing at Pentaho, the open source business intelligence leader. “Pentaho, SQLstream & SQL Power are dedicated to the open architectures that have become a non-negotiable requirement for enterprise software.”

“In this tough economic climate, now more than ever, companies need real time business intelligence in order to improve efficiencies and quickly react to changing market conditions. The integration of SQLstream’s real-time technology, Pentaho’s OLAP technology and SQL Power’s Wabit Dashboard functionality will deliver affordable, real time performance metrics to progressive organizations around the world,” said Sam Selim, President of SQL Power. “Together, SQL Power, Pentaho & SQLstream can help clients react to new information and fresh business opportunities in real time, thus improving performance on key metrics and positively impacting their bottom line.”

Both SQLstream and Pentaho executives will be speaking at the 2009 MySQL Conference and Expo, April 20th – 23rd in Santa Clara, California. On Tuesday at 11:55 a.m., Pentaho (booth #308) will explain “MySQL Data Warehousing.” On Wednesday at 10:50 a.m., SQLstream (booth #117) will present a technical session on “Eliminating MySQL Bottlenecks with Continuous ETL” as part of the Products & Services track. To see a demo of the Real-Time Twitter BI application, attend SQLstream’s session or visit SQLstream at booth #117 and have yourself a Red Bull.

About SQLstream, Inc.

SQLstream is making the real-time enterprise possible, enabling business decisions based on up to the minute information. Using the industry standard SQL language, SQLstream executes queries before data reaches the warehouse. Built on open source technologies, SQLstream enhances existing business intelligence solutions while maximizing the value received from your organization’s data warehouse when really urgent analytics are required. SQLstream’s investors and advisors include Bob Frankenberg, former CEO of Novell and current Board Member at National Semiconductor, Dick Watts, former member of Hewlett-Packard’s Executive Committee and Duane Zitzner, former HP Executive. SQLstream is headquartered in San Francisco, California and is on the web at www.sqlstream.com.

About Pentaho Corporation

Pentaho Corporation is the commercial open source alternative for Business Intelligence (BI). Pentaho BI Suite Enterprise Edition provides comprehensive reporting, OLAP analysis, dashboards, data integration, data mining and a BI platform that have made it the world’s leading and most widely deployed open source BI suite. Pentaho’s commercial open source business model eliminates software license fees, providing support, services, and product enhancements via an annual subscription. In the years since Pentaho’s inception as the pioneer in commercial open source BI, Pentaho’s products have been downloaded more than three million times, with production deployments at companies ranging from small organizations to The Global 2000. For more information, visit www.pentaho.com.

About SQL Power

Founded in 1988, SQL Power Group is a leading Business Intelligence software and consulting firm. Our proven methodology, highly skilled consultants and our use of state-of-the-art open source productivity tools have delivered Business Intelligence solutions of the highest quality to value-oriented clients. SQL Power’s dedication and ingenuity have established us as the premier Business Intelligence and Data Migration solution provider in Canada. For more information, visit www.sqlpower.ca.