Docs

Google Analytics 4 BigQuery (Service Account) Extractor

What is Google Analytics 4 Extractor?

Google Analytics 4 is an analytics service to measure traffic and engagement across your websites and apps. It collects event-based data from both websites and apps.

Google Analytics 4 BigQuery (Service Account) extractor by Electrik.AI is designed to export Google Analytics 4 property data from your linked BigQuery dataset and load it into your database. You no longer need to manually export GA4 property data to database niether do you need to bother about unnesting the property data or about merging the multiple events_YYYYMMDD tables.

With Electrik.AI’s Google Analytics 4 BigQuery (Service Account) Extractor, you can get daily report in the database of your choice. Electrik.AI makes it easy to get the data you want so that you can focus on analysis and reporting.

How to set up Analytics for a website and/or app

How to add a Google Analytics 4 BigQuery (Service Account) Connection in Electrik.AI?

Follow these steps to connect your Google Analytics 4 BigQuery (Service Account) to Electrik.AI

Step 1: Ensure you have following prerequisites to create a Google Analytics 4 BigQuery (Service Account) Connection in Electrik.AI

  1. Access to Google BigQuery Project (the one Linked to GA4 property)
  2. Google BigQuery Service Account Key File (extension JSON).
  3. Google Analytics Account ID

Step 2: Sign-in into your Electrik.AI account

Step 3: Select IO in Electrik.AI dashboard

Select IO in ElectrikAI Dashboard - ElectrikAI

Step 4: Select the Connections tab in the left most panel and click on “Add” button.

Select the Connections tab in the left most panel Conn - ElectrikAI

Step 5: The Source default tab is selected just click on “Google Analytics 4 BigQuery (Service Account)” connection.

Source default tab is selected just click on Google Analytics 4 BigQuery (Service Account) connection - ElectrikAI

Step 6: Provide your Google Analytics 4 BigQuery (Service Account) details.

  1. Choose the BigQuery Service Account Key File
  2. Choose the project that is linked to your GA4 Property
  3. Select the dataset created by GA4 in your project for the linked GA4 property viz analytics_<property_id>
  4. Provide your Google Analytics Account ID (Make sure you provide correct id as it will not get validated)
Provide your Google Analytics 4 BigQuery (Service Account) details

Step 7: Click on Save button on the top right corner of the screen.

Click on Save button on the top right corner of the screen - ElectrikAI

Congratulations, you have successfully created a Google Analytics 4 BigQuery (Service Account) Connection in Electrik.AI

Congratulations, you have successfully created a Google Analytics 4 BigQuery (Service Account) Connection - ElectrikAi

What are the supported Data Destinations?

Electrik.AI supports the following databases and Data Warehouse to export Google Analytics Data. It is important to note that Electrik.AI itself is not a destination. A destination is required to export data from various marketing data sources using Electrik.AI

PostgreSQL Data Destination-Eletrik.AI
MySQL Data Destination-Electrik.AI
AzureSQL Data Destination-Eletrik.AI
Azure Blob Storage - ElectrikAI

How to export Google Analytics 4 Data?

Electrik.AI creates a data flow (or ETL process) that exports Google Analytics 4 property data to a target database or data warehouse or cloud storage. This process runs daily at scheduled time and exports data for the previous day.

Follow these steps to create a Data Flow from Google Analytics 4 BigQuery (Service Account) to Google BigQuery

Step 1: Ensure you have the following prerequisites.

  1. Access to Google BigQuery Project (the one Linked to GA4 property)
  2. Google BigQuery Service Account Key File (extension JSON).
  3. Google Analytics Account ID
  4. Access to a Database of your choice (PostgreSQL, MySQL, Azure SQL)
  5. Database Connection Details such as Host, Port, DB Name, DB User.

Step 2: Sign-in into your Electrik.AI account.
Step 3: Select IO in Electrik.AI dashboard

Select IO in ElectrikAI Dashboard - ElectrikAI

Step 4: Select the Connections Icon in the left most panel and Click on Add button to create a Google Analytics 4 BigQuery (Service Account) Data Extractor Flow.

Select the Connections Icon in the left most panel Flow - ElectrikAI

Step 5: Select Google Analytics 4 BigQuery (Service Account) Connection present in the list of available sources.

Conn Step5-Select-the-Connections-Icon-in-the-left-most-panel-Flow-ElectrikAI

Step 6: Select your existing Google Analytics 4 BigQuery (Service Account) connection in Electrik.AI.

Select your existing Google Analytics 4 BigQuery (Service Account) connection - ElectrikAI

To configure a new connection, refer to this link.

Step 7: Click on a destination of your choice from list of available destinations.

In this example we are choosing PostgreSQL. 

Click on a destination of your choice from list of available destinations - ElectrikAI

Step 8: Click on a destination of your choice from list of available destinations. 

Click on a destination of your choice from list of available destinationss - ElectrikAI

 Step 9: Review the Data Source, Data Destination, Database Schema, Provide Success Notification Email, Failure Notification Email. Click “Save and Proceed” to continue.

Review the Data Source Data Destination Database Schema - ElectrikAI

Success Notification Email: Enter a valid Email ID to receive the Flow Success Email Notification.
Failure Notification Email: Enter a valid Email ID to receive the Flow Failure Email Notification.
Note: By Default, Electrik.AI uses the signed-in user’s Email Id.

Congratulations, Flow is created. Please wait while you are redirected to Flow Details Screen.

Congratulations, Flow is created. Please wait while you are redirected to Flow Details Screen - ElectrikAI

You have now successfully setup Google Analytics 4 BigQuery (Service Account) to PostgreSQL flow in Electrik.AI

You have now successfully setup Google Analytics 4 BigQuery (Service Account) to PostgreSQL flow in ElectrikAI

You have now successfully setup Google Analytics 4 BigQuery (Service Account) to PostgreSQL flow in Electrik.AI

Your Google Analytics data will start loading after 24 hours in your Google BigQuery Dataset - ElectrikAI

How to find your account_id in Google Analytics

Step 1: Login to your Google Analytics account.

Step 2: Click on ‘Admin’, in lower left corner of your screen.

Click on Admin, in lower left corner of your screen - ElectrikAI

Step 3: Click on ‘Account Settings’

Click on GA4 Account Settings - ElectrikAI

Step 4: Copy the “Account Id”

Copy the GA4 Account Id - ElectrikAI

Sample Queries To Get Google Analytics 4 Data

1. Gives event param key and values for a specific event name.

SELECT
event_date,
event_name,
EP.key AS param_key,
COALESCE(value.string_value,CAST(value.int_value AS STRING), CAST(value.float_value AS STRING),
CAST(value.double_value AS STRING)) AS param_value
FROM
`your_project_id.your_dataset_id.events_*`,  -- Replace table name.
UNNEST(event_params) AS EP
WHERE
event_name = 'user_engagement'  -- Provide required event_name.
AND _TABLE_SUFFIX BETWEEN '20230801' AND '20230831';  -- Replace date range.

2. Gives all available event parameters and counts of their occurrences in your dataset.

SELECT
  EP.key AS event_param_key,
  COUNT(*) AS occurrences
FROM
   `your_project_id.your_dataset_id.events_*`,   -- Replace table name.
   UNNEST(event_params) AS EP
WHERE
  _TABLE_SUFFIX BETWEEN '20230801' AND '20230802'   -- Replace date range.
GROUP BY
  event_param_key
ORDER BY
  event_param_key ASC;

3. Gives event params values for all the available event param keys in your dataset.

SELECT distinct key,
COALESCE(value.string_value,CAST(value.int_value AS STRING), 
CAST(value.float_value AS STRING), CAST(value.double_value AS STRING)) 
AS param_value
FROM  
`your_project_id.your_dataset_id.your_table_id`, -- Replace table name.
UNNEST(event_params) AS event_params
-- Use this to filter for a specific key and provide the required key
-- WHERE key in ('ga_session_id','ga_session_number','session_engaged','campaign','medium') 
ORDER BY key;  

4. Gives list of items.

SELECT event_date, event_timestamp, event_name, user_pseudo_id,items.*
FROM  `your_project_id.your_dataset_id.your_table_id`,  -- Replace table name.
UNNEST(items) AS items 
ORDER BY 3;

5. Gives list of user properties.

SELECT event_date, event_timestamp, event_name, user_pseudo_id,
key,
COALESCE(value.string_value,CAST(value.int_value AS STRING), CAST(value.float_value AS STRING), CAST(value.double_value AS STRING)) AS param_value
FROM  `your_project_id.your_dataset_id.your_table_id`,  -- Replace table name.
UNNEST(user_properties) AS uprop
ORDER BY 1;

6. Gives unique events counts by a specifc period of days and by selected event name.

SELECT
  event_date,
  event_name,
  COUNT(*) AS event_count -- Counts unique events by date and by event name for a specifc period of days
FROM  `your_project_id.your_dataset_id.events_*`   -- Replace table name.   
WHERE 
  event_name IN ('page_view', 'session_start', 'purchase') --PASS enent_names here
  AND _TABLE_SUFFIX BETWEEN '20230701' AND '20230708'   -- Replace date range TO get data for specific date range
GROUP BY 1, 2;

7. Gives day wise user count for specific date range.

SELECT _TABLE_SUFFIX AS event_date,count(distinct user_pseudo_id) AS date_wise_user_count
FROM    `your_project_id.your_dataset_id.events_*`   -- Replace table name.
WHERE   _TABLE_SUFFIX BETWEEN '20230717' AND '20230722' -- Replace date range TO get data for specific date range
GROUP BY _TABLE_SUFFIX
ORDER BY 1;

8. Gives total user count and new user count for specific date range.

WITH
  new_users AS (
    SELECT
      user_pseudo_id,
      MAX(IF(event_name IN ('first_visit', 'first_open'), 1, 0)) AS is_new_user
    -- Replace table name.
    FROM  `your_project_id.your_dataset_id.events_*` 
    WHERE _TABLE_SUFFIX BETWEEN '20230701' AND '20230708'   -- Replace date range TO get data for specific date range
    GROUP BY 1
  )
SELECT
  COUNT(*) AS total_user_count,
  SUM(is_new_user) AS new_user_count
FROM new_users;
--
-- Queries the individual timestamps and values for all 'purchase' events.

SELECT
  event_timestamp,
  (
    SELECT COALESCE(value.int_value, value.float_value, value.double_value)
    FROM UNNEST(event_params)
    WHERE key = 'value'
  ) AS event_value
FROM
  -- Replace table name.
   `your_project_id.your_dataset_id.events_*`
WHERE
  event_name = 'purchase'
  -- Replace date range.
  AND _TABLE_SUFFIX BETWEEN '20230801' AND '20230802';

9. Gives the latest ga_session_id and ga_session_number for specific users during last ‘N’ days.

DECLARE N INT64 DEFAULT -3 ; -- Replace 3 BY LAST NUM OF DAYS REQUIRED
DECLARE REPORTING_TIMEZONE STRING DEFAULT 'America/Los_Angeles'; --Replace America/Los_Angeles BY REQUIRED TIMEZONE

CREATE TEMP FUNCTION GetParamValue(params ANY TYPE, target_key STRING)
AS (
  (SELECT `value` FROM UNNEST(params) WHERE key = target_key LIMIT 1)
);

CREATE TEMP FUNCTION GetDateSuffix(date_shift INT64, timezone STRING)
AS (
  (SELECT FORMAT_DATE('%Y%m%d', DATE_ADD(CURRENT_DATE(timezone), INTERVAL date_shift DAY)))
);
WITH
  userpseudoid AS(
  SELECT
    DISTINCT user_pseudo_id
  FROM
     `your_project_id.your_dataset_id.events_*`,
    UNNEST(event_params) AS event_params
  WHERE KEY IN ('ga_session_id','ga_session_number')
  AND _TABLE_SUFFIX BETWEEN '20230517' AND '20230822'
  )

SELECT DISTINCT
  events.user_pseudo_id,
  FIRST_VALUE(GetParamValue(event_params, 'ga_session_id').int_value)
    OVER (UserWindow) AS ga_session_id,
  FIRST_VALUE(GetParamValue(event_params, 'ga_session_number').int_value)
    OVER (UserWindow) AS ga_session_number
FROM
  -- Replace table name.
    `your_project_id.your_dataset_id.events_*` events
  INNER JOIN   userpseudoid AS upid
          ON(upid.user_pseudo_id = events.user_pseudo_id)
WHERE
RIGHT(_TABLE_SUFFIX, 8)
BETWEEN GetDateSuffix(N, REPORTING_TIMEZONE)
AND GetDateSuffix(0, REPORTING_TIMEZONE)
WINDOW UserWindow AS (PARTITION BY events.user_pseudo_id ORDER BY events.event_timestamp DESC)
ORDER BY 1;