You might have missed it, but the free export of Google Analytics V4 (GA4) data into BigQuery (BQ) is one of the great new features included in GA4. It’s one of several features that used to be $$$ but is now available to all users.
But, you’ve survived this long without it. Why would you start now?
Here’s our quick take on GA4 & BigQuery and a few things to keep in mind as your organization moves into the next chapter of Google Analytics.
Google BigQuery is a professional-grade cloud database tool used by teams who need more capacity and relational structure than spreadsheets can offer. As the name suggests, BigQuery handles multiple, large datasets which you can freely query with simple (or super complex) SQL statements. Yes, your data is still sitting in the cloud on Google servers, but it’s much more like owning your data than just viewing via the normal Google Analytics web experience (UI) or Looker Studio Data Connector (formerly Google Data Studio or GDS).
What are the primary benefits of using Google BigQuery for storing GA4 data?
- You *can* handle the truth!
BigQuery gives you direct access to your raw data and a no-nonsense view of Google’s new data model. Maybe you don’t care, and that’s ok, but for a clear unfiltered look at what *exactly* you are collecting in your analytics, the BigQuery dataset is for you. There are some downsides to this, but for those of you who like to look under the hood, BQ is for you.
- Lightning fast dashboarding and no limits!
Ever been on a zoom call with your boss, waiting for the Looker Studio hamsters to run a little faster? Even worse, have you run into the dreaded quota limitations, where the GA4 LookerStudio Connector just up and decides to take a smoke break while you’re trying to build out that new dashboard due tomorrow morning?
BigQuery is the rather spectacular solution to all of these problems. This is one of the main reasons we run our dashboards on BigQuery whenever possible. We hate limitations and nobody’s got the time for laggy, draggy dashboard reporting.
- Custom reporting & integrations
Google Analytics data isn’t the only thing you can store in your BigQuery account. In fact, Google Search Console (GSC) has recently enabled a connector too, so you can easily export site & url aggregated data sets into a BigQuery project. And, there are plenty of third-party tools that will push other datasets from your favorite CRM (eg Hubspot), CMS (eg WordPress), or offline records into your BQ project. This is when the party really gets started, allowing you to easily query across multiple, related datasets.
- Data control & retention
With your data exported daily from GA4 into your BigQuery database, you have much more control over what happens to it, who has access, and how long you care to keep it. This can be comforting in an age where the powers that be are fighting over where, how much, and how long Google should be able to retain analytics data. With the BigQuery export, you have whatever you want at your fingertips for download or long-term cloud storage.
- Avoid data sampling
Sampling refers to metrics being estimated when the underlying data gets too big or complex. You may be familiar with this from Universal Analytics, where it was a common problem. Most websites won’t run into this problem with GA4 very often because Google significantly increased the sampling quota to 10 million events. Plus, sampling only really applies to the custom exploration reports, not the standard, pre-processed reports. Still, in BigQuery, you are always dealing with the raw, unprocessed data and don’t have to worry about running into any “guesstimating”.
But wait, is thresholding the same as data sampling? No. Those are similar but different controls used in the GA4 UI. Thresholding doesn’t really apply to BigQuery data because (see below) the BQ dataset simply doesn’t get the kind of demographics data that would be hidden or protected by thresholding in the GA4 web experience.
Lots of people are finding that things quickly get confusing after pushing that BigQuery GA4 export button. Here are a few cautions to keep in mind and maybe even deal-breakers for you using it at all depending on your situation.
- Steep learning curve
There’s lots of support out there if you want to learn, but for those unfamiliar with Google Cloud Console, BigQuery, or the new GA4 data model, you’ve got some serious learnin’ ahead. It’s technical, tedious, and murky. We think it’s worth it, but don’t underestimate the effort it requires.
- Building everything from scratch
You’re on your own now. You quickly find out how much pre-processing and behind-the-scenes manipulation Google Analytics does to make those nice reports available in the UI and Looker Studio connectors. Not so with the BigQuery data. You want a house? Here, have some bricks.
Many of the metrics you’re used to pulling off the shelf into your dashboard have to be created or calculated from scratch. If you love working with complex SQL statements, you’re in luck. Otherwise, maybe stick with reporting in the UI.
- Data gaps & discrepancies
This is an important one and often a surprise. But, while a BigQuery export might be your organization’s best source of truth, there are a number of discrepancies and important gaps to keep in mind. We’re keeping this short, but the big ones are:
- No demographics data (Age, Gender, etc from Google Signals).
- Session attribution is still a black box (there are rumors Google will add this).
- There are complex differences between user counts in the GA4 Interface vs GA4 BQ.
- Data freshness. Similar to Google Search Console, the most recent three days worth of data isn’t really finalized until 72 hours. Hopefully, by Monday morning, most or all of your data from the previous work week will be good to go.
- No demographics data (Age, Gender, etc from Google Signals).
- Cost management
For most SMBs with typical usage, BigQuery is going to be pretty cost effective. Many businesses never pay more than a couple dollars per month, if anything. Problem is, it’s difficult to understand what you’re getting yourself into as the cost structure is complex and varies by your usage and storage sizes, which will increase over time. Good news is there is a free tier with very high monthly limits. So unless you plan on going crazy right away, you can reasonably expect to use BigQuery for free for quite a while before incurring any serious costs at all.
Luckily, you’re in control if things ever get expensive. And, Google Cloud Console (GCC) allows for budgeting your billing, so you can be alerted if the price starts getting out of control.
How to: https://cloud.google.com/billing/docs/how-to/billing-access
Setting up the initial BigQuery GA4 export option is almost a no-brainer. But, wrapping your head around the system itself, navigating the new GA4 data model, and building everything you need from scratch can lead to some real headaches. Plus, the threat of creeping costs is something to keep an eye on.
However, we are mostly fans and include it in many of our implementations for fast & reliable dashboarding, data governance, and custom data integrations.
But, not every problem is solved with a hammer. Because we can’t get everything we want in one place, we’ll continue to pick and choose between GA4 Explorer (UI), Looker Studio Connector, and the GA4 API, in addition to BiqQuery.
- Google BigQuery Pricing Page
- Google Support Page on GA4 UI vs BigQuery Data Discrepancies
- Johan van de Werken & Simo Ahava – Tutorials & SQL Snippets
- Interactive Data Model Reference Tool