Unlocking a BigCommerce Superpower: BigQuery + DBT
BigCommerce’s native BigQuery integration is one of its most compelling features. But it’s underutilized. Let’s change that.
In mid-2021, BigCommerce released a native integration with BigQuery. With just a few keystrokes, BigCommerce merchants could be up and running with BigQuery — replicating customer, order, and product data into Google’s fully-managed data warehouse solution.
As far as I know, BigCommerce was the first eCommerce SaaS platform to launch its own ETL (extract, transform, load) solution for replicating eCommerce data into a data warehouse (it might still be the only eCommerce platform that supports this out of the box…).
While many, many BigCommerce merchants now rely on this integration to power mission-critical BI dashboards, I believe this incredibly powerful integration is underutilized, and its potential is often overlooked when comparing BC with the likes of Shopify and Magento.
Particularly when considering a migration from a legacy eCommerce platform to a modern eCommerce SaaS offering, BigCommerce’s BigQuery integration should be top of mind.
Below, I’ll explain why and how…
Data Migrations When Re-Platforming eCommerce Websites
One of the trickiest, and most expensive, steps in migrating from a legacy eCommerce platform to a modern eCommerce solution like BigCommerce or Shopify is the migration, or blending, of historical order data between the two platforms.
Many teams attempt to address this by migrating legacy orders into the new platform directly. But because product IDs don’t match between platforms, this can be challenging. Either you write a highly complex migration that maps old IDs to new IDs, or you migrate legacy orders using custom line items (i.e., you migrate only the total order values, without matching or translating product line items).
Both approaches above are problematic. Often the better approach is simply to blend historical order data with new data structures in a data warehouse.
But how do you do this quickly and effectively? Between a legacy platform and BigCommerce, you could have hundreds of database tables replicated into your data warehouse. How do you efficiently match all this data up?
Working with all this data in raw SQL can be fragile: the transformations can run to thousands of lines of queries, and monolithic SQL scripts are difficult to modularize, document, and maintain.
DBT (Data Build Tool) is a relatively new tool that allows data engineering teams to transform warehouse data using SQL templated with Jinja, a lightweight, Python-based templating language.
Instead of writing thousands of lines of SQL to transform data, DBT allows teams to write a small amount of code (with inline documentation…) that compiles into SQL at runtime.
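As a quick sketch of what this looks like (the model and column names below are hypothetical, not a real BigCommerce schema), a DBT model is just a SQL file with Jinja templating, and documentation lives in YAML alongside it:

```sql
-- models/customer_orders.sql
-- {{ ref() }} resolves to the upstream model's table at compile time.
select
    customer_id,
    count(*)         as order_count,
    sum(order_total) as lifetime_value
from {{ ref('stg_orders') }}  -- stg_orders is a hypothetical staging model
group by customer_id
```

```yaml
# models/schema.yml -- inline documentation for the model above
models:
  - name: customer_orders
    description: "One row per customer, with rollup order metrics."
    columns:
      - name: lifetime_value
        description: "Sum of order totals across the customer's orders."
```

At runtime, DBT compiles the Jinja into plain SQL and materializes the result as a table or view in your warehouse.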
DBT is easy to learn. It’s open source. It can run on your laptop for manual, one-off transformations, or it can be hosted for scheduling data transformation jobs. DBT Labs provides a fully-hosted SaaS edition that’s highly affordable and a great place to get started.
Leveraging DBT with BigCommerce Data in BigQuery
Before founding Fueled, a customer data platform for eCommerce, I ran a data engineering team that leveraged DBT every day to transform BigCommerce data.
When migrating a merchant to BigCommerce, we’d spin up DBT to blend historical data with BigCommerce data within the data warehouse — saving ourselves many hours (sometimes days) of work...
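A minimal sketch of what that blending can look like in a DBT model — the table and column names here are illustrative, not the actual BigCommerce BigQuery schema:

```sql
-- models/blended_orders.sql
-- Normalize legacy and BigCommerce orders into one shape, then union them.
with legacy as (
    select
        cast(order_id as string) as order_id,
        customer_email,
        order_total,
        created_at,
        'legacy' as source_platform
    from {{ source('legacy_platform', 'orders') }}
),

bigcommerce as (
    select
        cast(id as string) as order_id,
        email              as customer_email,
        total_inc_tax      as order_total,
        date_created       as created_at,
        'bigcommerce' as source_platform
    from {{ source('bigcommerce', 'orders') }}
)

select * from legacy
union all
select * from bigcommerce
```

Downstream models and dashboards can then treat orders from both platforms as a single dataset — no risky order migration required.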
Taking BigCommerce, BigQuery, and DBT to the Next Level
eCommerce Data Modeling
Blending eCommerce data with DBT is just the first step. What’s even more awesome is that DBT provides tools for building and documenting data models for all of these data transformations. Then, these models can be consumed directly by modern BI tools like Superset and, my new favorite, Lightdash. (Both Superset and Lightdash are open source too…)
By surfacing these models directly within your BI tool, your entire organization can work with this data directly, without having to understand complex queries or look up table and column names. Operators and analysts can quickly get to work.
Another data engineering process that’s made easier with DBT is “flattening” eCommerce data into tables that include calculated data about orders and customers.
In short, because data warehouses aren’t as constrained as traditional databases when it comes to table size and row counts, you can leverage DBT to include “rollup” data about orders and customers in just a handful of tables.
For example, we often build a flattened orders table, where each row represents an order — and then we include additional columns that calculate metrics such as:
Days between the current order and the customer’s first order.
The difference in order value between the current order and the customer’s average order value.
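In BigQuery SQL, both of the metrics above fall out of window functions. A sketch, again with hypothetical model and column names:

```sql
-- models/orders_flattened.sql
-- Assumes a hypothetical upstream model stg_orders with
-- order_id, customer_id, order_total, and created_at columns.
with orders as (
    select * from {{ ref('stg_orders') }}
)

select
    order_id,
    customer_id,
    order_total,
    created_at,

    -- days between this order and the customer's first order
    date_diff(
        date(created_at),
        date(min(created_at) over (partition by customer_id)),
        day
    ) as days_since_first_order,

    -- difference between this order's value and the customer's
    -- average order value
    order_total
      - avg(order_total) over (partition by customer_id)
      as diff_from_avg_order_value

from orders
```

Because the window functions partition by customer, each order row carries its customer-level rollups with it — exactly the kind of “flattened” table BI tools love.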
Obviously, you could flatten data like this in raw SQL, but again, because queries like this can get really long and are hard to document and test as monolithic scripts, it’s much easier to maintain these transformations as DBT models.
Enter Fal: Now Let’s Really Blow Some Minds…
Soon, I predict that DBT will be a household name among eCommerce engineers and analytics teams.
In the not-too-distant future, these teams will also start leveraging tools like Fal (Features & Labels) to extend DBT and incorporate advanced statistical analysis packages into their work with eCommerce data.
Combining Fal with DBT, teams can transform eCommerce data, and then apply forecasting algorithms, like Meta’s Prophet package.
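Here’s a rough sketch of what that might look like. Note that Fal injects helpers like `ref` and `write_to_model` into a script’s scope at runtime, so this won’t run standalone, and the model and column names are illustrative:

```python
# fal_scripts/forecast_revenue.py
# Sketch of a fal script: `ref` and `write_to_model` are provided
# by the fal runtime, not imported. Model/column names are hypothetical.
from prophet import Prophet

# Pull a dbt model's output into a pandas DataFrame via fal's ref()
df = ref("daily_revenue")  # assumed columns: order_date, revenue

# Prophet expects columns named `ds` (date) and `y` (value)
train = df.rename(columns={"order_date": "ds", "revenue": "y"})

model = Prophet()
model.fit(train)

# Forecast the next 30 days and write the result back to the warehouse
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
write_to_model(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]])
```

The punchline: a revenue forecast, materialized as just another table in BigQuery, ready to chart in your BI tool.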
Admittedly, I’ve thrown out a lot of tools in this post without breaking down these concepts appropriately for eCommerce business leaders and marketers. (It’s a Friday night, and I want to watch the latest Rings of Power episode…). But I’d love to talk more about how my fellow fans of BigCommerce can work together to keep BC at the forefront of eCommerce-specific data and analytics engineering.