Cloud-based database services such as Amazon Redshift, Microsoft SQL Azure and Google BigQuery are now commonplace. But handling the ETL (Extract, Transform, Load) processing required to prepare their data for analytics presents some challenges.
It is possible to run ETL routines in the Cloud and only incur cost for as long as you need them, but you will have to rethink your architecture and unlearn past architectural patterns.
Traditionally, companies invested in physical database servers. These were specified and sized to allow capacity for running ETL routines. Later, these physical servers became virtualised, and then they were outsourced to data centres which provided failover and disaster recovery. Now databases have become services and exist in the cloud, with no direct access to the hardware that provides the service.
All this poses a conundrum: where do you run your ETL routines now?
One solution would be to create a dedicated ETL virtual machine hosted in a data centre or by a cloud compute provider such as Amazon, Microsoft or Google. However, this suffers from the same problems and constraints as the legacy model of running these routines on a traditional database server. In fact, these issues are exacerbated, as you now incur the cost of a permanent ETL server as well as that of the database service.
Because you are falling back on old hardware patterns, you are now paying for and maintaining a permanent virtual machine that has been provisioned and sized to cope with the largest ETL load. Most of the time it sits idle, since ETL routines typically run on a regular cycle, so you are paying for a resource that you are underutilising.
A better approach is to rethink how you deploy, schedule and run ETL routines. These are often self-contained packages which perform a set task, such as moving and transforming data from one place to another. One job might be dependent upon another for its input data, but this is a scheduling and orchestration issue rather than a physical connection between the two jobs.
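Treating inter-job dependencies as pure orchestration can be sketched as a small dependency resolver. The job names here (`extract_sales`, `transform_sales` and so on) are illustrative only; in practice a scheduling service would own this logic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Illustrative ETL jobs and their input dependencies (hypothetical names):
# transform_sales needs extract_sales to have produced its data first.
dependencies = {
    "extract_sales": set(),
    "extract_costs": set(),
    "transform_sales": {"extract_sales"},
    "load_warehouse": {"transform_sales", "extract_costs"},
}

# The scheduler only needs this dependency graph, not a physical link
# between jobs: each job can still run on its own disposable machine.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Any ordering that respects the graph is valid, which is exactly why the jobs themselves can stay independent and separately packaged.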
That means you can package up each ETL routine separately. You can then spin up a cloud compute instance to undertake this (and only this) task, and dispose of it cleanly once the task is complete.
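The spin-up, run and dispose lifecycle can be sketched as follows. The `provision`, `execute` and `dispose` callables are stand-ins for real cloud API calls (launching and terminating a VM); the stubs below only demonstrate the shape of the pattern.

```python
from typing import Callable

def run_ephemeral_job(provision: Callable[[], str],
                      execute: Callable[[str], None],
                      dispose: Callable[[str], None]) -> None:
    """Spin up an instance, run one packaged ETL task, then dispose of it.

    The instance is always disposed of, even if the task itself fails,
    so no permanent server is left running between cycles.
    """
    instance_id = provision()
    try:
        execute(instance_id)
    finally:
        dispose(instance_id)

# Stub implementations so the lifecycle can be demonstrated locally:
log = []
run_ephemeral_job(
    provision=lambda: (log.append("provisioned"), "i-123")[1],
    execute=lambda i: log.append(f"ran ETL task on {i}"),
    dispose=lambda i: log.append(f"terminated {i}"),
)
print(log)
```

The `try`/`finally` is the important part: disposal happens unconditionally, which is what keeps the environment clean and the bill bounded.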
This way, you only pay for the computing power that you need and for the time that you need it. Smaller, lighter-weight processes can be run on smaller, lower-powered virtual machines, minimising cost.
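A back-of-envelope comparison makes the saving concrete. The hourly rate and run time below are assumed figures for illustration, not real cloud prices:

```python
# Assumed figures: an always-on ETL server vs an on-demand instance
# that runs for one hour per daily ETL cycle.
hourly_rate = 0.10            # assumed cost per instance-hour
hours_in_month = 24 * 30

always_on_cost = hourly_rate * hours_in_month   # permanent VM, mostly idle
on_demand_cost = hourly_rate * 1 * 30           # 1 hour/day for 30 days

print(f"always-on: ${always_on_cost:.2f}, on-demand: ${on_demand_cost:.2f}")
```

Under these assumptions the permanent server costs twenty-four times as much as the ephemeral one for the same monthly workload.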
Another advantage of this approach is that every ETL routine runs in a new, clean environment and is isolated from any other tasks. That means you will have no garbage accumulation issues, no interference from rogue tasks and no competition for local resources. It is altogether a much smoother and more reliable deployment environment.
The best way to achieve this is to deploy each ETL process as its own virtual machine using a tool such as Packer. This enables you to provision and size each individual machine according to its actual hardware requirements, and to spin each one up individually as it is needed and only for as long as it takes to complete its work. The only overhead you require is a permanently available, but lightweight, scheduling and orchestration service or server.
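A per-job Packer template might look something like the minimal sketch below. The AMI ID, region, instance size and file names are all placeholders chosen for illustration, not values from any real environment:

```hcl
# Minimal sketch of a Packer template for one ETL job's machine image.
# All identifiers below are placeholders.
source "amazon-ebs" "etl_job" {
  ami_name      = "etl-sales-transform-{{timestamp}}"
  instance_type = "t3.small"                 # sized for this job only
  region        = "eu-west-1"
  source_ami    = "ami-0123456789abcdef0"    # placeholder base image
  ssh_username  = "ubuntu"
}

build {
  sources = ["source.amazon-ebs.etl_job"]

  # Bake the packaged ETL routine and its dependencies into the image,
  # so each instance boots ready to run its one task.
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y python3",
    ]
  }
  provisioner "file" {
    source      = "etl_sales_transform.py"   # hypothetical job package
    destination = "/tmp/etl_job.py"
  }
}
```

Because each job gets its own image, the instance type can be chosen per job, which is what makes the right-sizing described above possible.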
Mark Fulgoni is a Principal Consultant in Red Olive’s Data Practice.