Data Migration: Google Analytics to BigQuery using Airbyte

Introduction:

The integration of Google Analytics with BigQuery represents a powerful synergy, enabling organizations to unlock the full potential of their data assets. Airbyte, an open-source data integration platform, serves as the bridge between these two platforms, facilitating seamless data migration from Google Analytics to BigQuery.

Prerequisites:

  • GA4 account
  • GCP login
  • Airbyte login

GA4 setup:

  • To set up data collection for your website or app, choose from where you will be collecting data (the Web, an Android app, or an iOS app). In this case choose web platform.
  • Enter the URL of website and provide a stream name.
  • On creating the stream, a measurement id corresponding to the stream will be generated. 
  • Add the data collection tag to that source.  Go to Web Stream Details, Manage connected site tags and add the measurement id and click connect. 
  • Implement google tag for the data stream. Select the Square space platform and enter the URL of the website. Now the data stream is set up for Squarespace website. 

Connection to Squarespace site:

  1. Sign in to your Squarespace account.
  2. On the “Dashboard”, choose the site you want to modify.
  3. From the main menu, click Settings.
  4. Click Developer Tools.

5. Select External API Keys.

6. Paste your tag ID.

7.Click Save.

Data is now flowing in the Data Stream: 

We should now be able to access all the behavioural data of Squarespace site. User acquisitions, User behaviour, demographic details, etc. Please note that it would take minimum 24 hours for the data stream to start collecting the data. 

Setup Airbyte source and destination connection:

Step 1: Set up the GA4 source connector. 

  • From the Airbyte UI, click the Sources tab, then click + New source  
  • 2.Select Google Analytics 4(GA4) from the list of available sources. 
  • 3.Enter the preferred source connector name and for credentials choose ‘Authenticate via Google (Oauth)’. On entering the property id click ‘Authenticate your Google Analytics 4 (GA4) Account’. This will navigate to authentication portal where, choose the email with which account is created. Click on set up source and now the source connector is configured. 

Step 2: Set up the BigQuery destination connector. 

To use a Google Cloud Storage bucket:

  • Create a Cloud Storage bucket with the Protection Tools set to none or Object versioning. Make sure the bucket does not have a retention policy.
  • Create an HMAC key and access ID.
  • Grant the Storage Object Admin role to the Google Cloud Service Account. This must be the same service account as the one you configure for BigQuery access in the BigQuery connector setup step.
  • Make sure your Cloud Storage bucket is accessible from the machine running Airbyte.

Set up the BigQuery connector:

  • Log into your Airbyte Cloud or Airbyte open-source account.
  • Click Destinations and then click ​+ New destination.
  • On the Set up the destination page, select BigQuery from the Destination type dropdown.
  • Enter the name for the BigQuery connector.
  • For Project ID, enter your Google Cloud project ID.
  • For Dataset Location, select the location of your BigQuery dataset.
  • For Default Dataset ID, enter the BigQuery Dataset ID.
  • For Loading Method, select Standard Inserts or GCS Staging.
  • For Service Account Key JSON (Required for cloud, optional for open source), enter the Google Cloud Service Account Key in JSON format.
  • Test the connection.

Set up Connection:

  • Click the new connection in Connection tab. 
  • Choose the GA4 source connector created. 
  • Choose the BigQuery destination connector created. 

Configure the connection. 

  • Enter the preferred connection name and data residency. 
  • Enter the preferred Schedule type either Manual or Schedule, Destination Namespace, Destination Stream Prefix and Detect and propagate schema changes. Destination Namespace is the target schema into which the data is to be replicated. 
  • Enable the streams which needs to be synced and click set up connection. 

Sync data from GA4 to Snowflake: 

  • Select the connection created and click ‘sync now’ to let the connection sync data from the GA4 to Snowflake.  

Check the status of the job. 

  • Check the status of the connection. 
  • Validate the data in the BigQuery.

Conclusion:

In conclusion, the above procedure helps leveraging Airbyte to sync Google Analytics data to Google BigQuery provides businesses with a powerful solution for unlocking the full potential of their web analytics data.

Cittabase excels in modern data platform deployments and is a proud Airbyte partner. Feel free to reach out for your data integration needs using Airbyte. 



Leave a Reply