Extract, Translate and Load – ETL
Before people started using MOM (Message-Oriented Middleware) to share changing data between applications, they used a technology known as ETL (Extract, Translate and Load). ETL collected data; cleaned, aggregated and transformed it; and shipped it to a central location, optionally to populate a database or data warehouse.

Historically Batch-Oriented

The ETL process was and is inherently batch-oriented. As market pressures and business needs have forced ETL vendors to offer more of a real-time solution, their main response has been to move to micro-batches. The idea is simple: if the batches are relatively small, vendors can label their activities “near real-time” without having to significantly rework their product or approach. However, moving to micro-batches offers no architectural benefits and no competitive leverage against products designed from inception to provide real-time, continuous processing.

ETL Traditional Tools Characteristics

ETL tools are extremely useful and perform valuable tasks, but they are fundamentally functionally inferior to Relational Asynchronous Messaging (RAM) and the RAM Management System (RAMMS) applied to the same task set. Basically, ETL tools are
As a result, they unsurprisingly do not support standard driver interfaces such as ODBC or JDBC, and they do not follow the SQL security model or other related standards. They do not offer the plug-in extensibility and elegance of a RAMMS.

ETL tools often do support rich and intuitive user interfaces, and they make batch processes easy to schedule and repeat. They are also reliable in delivering and loading data into their target databases and data warehouses.

However, many of the graphical user interfaces available in today's ETL tools map simple transformations of data by using mouse clicks and lines dragged from source fields to destination fields. This approach is great for small numbers of fields and simple data remapping, but it does not scale to the increasingly complicated, real-world scenarios facing customers today.

The most maintainable and scalable approaches are all language-based. SQL shines here by providing excellent facilities for transforming individual fields and handling complicated joins and aggregations of data. Another advantage of the SQL approach is that it enables solutions that can be updated programmatically. In other words, one can create SQL-based programs and readily edit the SQL rules to handle changes in business policy. Normally, such changes are simple, such as a changed threshold or validation parameter.

Functionality Subsumed in the RAM Model

In very many ways, the RAM model subsumes the functions of ETL while also subsuming the functions of MOM. Moreover, RAM achieves all of this by leveraging a standards-based, widely understood and widely used language, SQL, allowing nearly instant productivity for the large pool of SQL-literate programmers.

Basic ETL Requirements Met by RAM

A review of the high-level requirements for ETL tools reveals just how applicable the RAM model is:
Additional Facilities Inherent in RAM

Features inherent in RAM that ETL tools cannot easily provide include the following capabilities:
While ETL may have a host of other deficiencies when compared to RAM, the ones listed above serve to illustrate the powerful advantages RAM provides over the “old school of integration”: ETL.

RAM as the Grail of Merged EAI and EII

Many industry analysts have been looking for the “Holy Grail” of data and application integration. That Grail would be a way to seamlessly blend Enterprise Application Integration (EAI) with ETL and Enterprise Information Integration (EII). We would argue that RAM is in fact that Holy Grail. We explore why RAM in many ways subsumes the need for EII on the EII page.
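As a small illustration of the language-based approach described under the ETL tool characteristics, the sketch below expresses one transformation rule in SQL: a join, a field cleanup, an aggregation, and a business threshold. It is only a sketch under stated assumptions, not a RAM or RAMMS implementation. The `customers` and `orders` tables, the `summarize` helper, and the `min_total` threshold are all hypothetical names invented for this example, and Python's built-in `sqlite3` module stands in for whatever SQL engine would actually be used.

```python
import sqlite3

# Hypothetical transformation rule: join two sources, normalize a field,
# aggregate, and filter by a business threshold. The threshold is a bound
# parameter, so a policy change means editing one value, not redrawing a map.
TRANSFORM_SQL = """
    SELECT UPPER(TRIM(c.region)) AS region,
           COUNT(*)              AS order_count,
           SUM(o.amount)         AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY UPPER(TRIM(c.region))
    HAVING SUM(o.amount) >= :min_total
    ORDER BY region
"""


def build_source(conn):
    """Create and fill the hypothetical source tables."""
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
        );
        INSERT INTO customers VALUES (1, ' east'), (2, 'East '), (3, 'west');
        INSERT INTO orders VALUES (1, 1, 100.0), (2, 2, 50.0), (3, 3, 30.0);
    """)


def summarize(conn, min_total):
    """Run the rule and return (region, order_count, total_amount) rows."""
    return conn.execute(TRANSFORM_SQL, {"min_total": min_total}).fetchall()


conn = sqlite3.connect(":memory:")
build_source(conn)
print(summarize(conn, 100))  # stricter threshold
print(summarize(conn, 0))    # relaxed threshold, same rule text
```

Changing the business policy here means passing a different `min_total` (or editing one line of the rule), which is the kind of simple, programmatic update the SQL approach is meant to allow.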