Forum One works with organizations to create and encourage positive and progressive change in the world. In order to do this, we take a data-driven approach to shape how we identify and approach opportunities with our clients. We think our clients should, too, especially when it comes to defining strategy.
But we hear you: data is hard. It is difficult to capture, laborious to normalize, and counterintuitive to analyze and derive meaning. Thankfully, technologies centered on the intake, storage, analysis, and sharing of data have profoundly improved over the last decade. Tools that focus on facilitating, automating, and analyzing data throughout this process are available to help modernize your data operations.
There are two basic approaches your organization can employ to move towards making your data operations more strategically focused.
Approach 1: Extract, Transform, Load
As it stands, you might be housing and analyzing your data in a number of places, such as Excel sheets, Flat files, or custom SQL instances. Traditionally the process of combining these various data sets is by using an Extract, Transform, Load (ETL) model. In this way, data is:
- Extracted: copied from each individual data source to a separate staging area
- Transformed: reformatted or enriched, applying various business logic to convert it into the necessary end format
- Load: copied from the data staging into the warehouse
From there an additional API or business intelligence tool is used to distribute or analyze the data.
- Better performance; data is ready for analysis when it’s in the warehouse
- Easier to get answers from data; it is already in the formats necessary and end users don’t have to do complicated transformations in order to use it
- Requires more up-front planning; you may not be able to get back to source data if the needs change
- Significant transformations will slow down the pipeline, leading to bottlenecks in the process and higher hardware requirements
Approach 2: Extract, Load, Transform (ELT)
An Extract, Load, Transform (ELT) approach has the same elements as the more traditional ETL process; however, the difference is that the data is not transformed between the data source and the warehouse but is instead done prior to analysis.
- Data can be repurposed at any time without having to go back to original sources
- Additional data sources can be added and mixed in at any time
- May have data inconsistencies, such as date and boolean formats
- Since transformation happens at runtime it may require additional processing or intermediate data storage increasing hardware requirements
Which approach should you use?
Your next question is surely, “how do I know which approach I should use for our organization?” Which one is better depends on your priorities and the type of data you are dealing with. In fact, you may use both, depending on what insight you’re looking to achieve.
If you have numerous disparate data sources coming from different formats, it may be best to use the ELT approach to preserve your original sources. Also, if you have requirements that need real-time data slicing and dicing, or analysis of information in different contexts, the ELT approach may provide more flexibility to do that in parallel from a single content model.
In the case that your sources for data are fewer and more consistent, the need for complex transformation can be limited. In this case, the ETL approach may work well. Additionally, if your needs for data analysis are more singularly focused and your reporting needs consistently structured, the ETL approach provides a more straightforward path for storage and transformation.
Choosing your data management process is an important decision and has both short-term and long-term ramifications, and so the reasons you would choose one method over another will be different based on your specific needs and situation.