How to export Google Analytics data from BigQuery with standard SQL
This post is an update to “how to export Google Analytics data from BigQuery,” I wrote two years ago. If you are unfamiliar with Google Analytics BigQuery data or would like to review the export process, it is still worth reading.
In June 2016 Google introduced BigQuery, 1.11 wich features a revamped SQL dialect named “standard”. The previous SQL is now called “legacy.”
I thought it would be a good idea to update the BigQuery data export SQL script it to the new “standard.”
Below is a snippet of the SQL query. You can find the complete SQL queries on my GitHub.
Standard SQL
The #standardSQL signals BigQuery the SQL version you are using.
#standardSQL
SELECT
-- Hits are recorded in UTC. Customize your timezone to convert to your timezone.
FORMAT_TIMESTAMP("%Y-%m-%d %H:%M:%S", TIMESTAMP_SECONDS(SAFE_CAST(visitStartTime+hits.time/1000 AS INT64)), "America/Los_Angeles") AS hit_timestamp,
visitNumber,
visitId,
fullVisitorId,
FORMAT_TIMESTAMP("%Y-%m-%d %H:%M:%S", TIMESTAMP_SECONDS(SAFE_CAST(visitStartTime AS INT64)), "America/Los_Angeles") AS hit_visitStartTime,
CONCAT(SUBSTR(date,0,4),"-",SUBSTR(date,5,2),"-",SUBSTR(date,7,2)) AS date,
trafficSource.referralPath,
trafficSource.campaign,
trafficSource.source
...
-- Events
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
hits.eventInfo.eventLabel,
hits.eventInfo.eventValue,
-- Custom Dimensions (Add your custom dimensions by adding a line for each dimension and chaging the index)
(SELECT MAX(IF(index=1, value, NULL)) FROM UNNEST(hits.customDimensions)) AS dimension1,
-- Custom Metrics (Add your custom metrics by adding a line for each metric)
(SELECT MAX(IF(index=1, value, NULL)) FROM UNNEST(hits.customMetrics)) AS metric1
FROM `dataset_id.ga_sessions_YYYYMMDD` , UNNEST(hits) as hits
Legacy SQL
The #legacySQL signals BigQuery the SQL version you are using.
#legacySQL
SELECT
STRFTIME_UTC_USEC(SEC_TO_TIMESTAMP(visitStartTime + hits.time/1000),"%Y-%m-%d %H:%M:%S") as hit.timestamp,
-- ROUND(visitStartTime + hits.time/1000) as hits.timestamp, /* Use for UNIX timestamp instead of timestamp */
visitNumber,
visitId,
fullVisitorId,
STRFTIME_UTC_USEC(SEC_TO_TIMESTAMP(visitStartTime),"%Y-%m-%d %H:%M:%S") as hit.visitStartTime,
LEFT(date,4)+"-"+SUBSTR(date,5,2)+"-"+RIGHT(date,2) as date,
trafficSource.referralPath,
trafficSource.campaign,
trafficSource.source,
...
-- Events
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
hits.eventInfo.eventLabel,
hits.eventInfo.eventValue,
-- Custom Dimensions (Add your custom dimensions by adding a line for each dimension)
MAX(IF (hits.customDimensions.index = 1, hits.customDimensions.value, NULL)) WITHIN RECORD AS dimension1,
-- Custom Metrics (Add your custom metrics by adding a line for each metric)
MAX(IF (hits.customMetrics.index = 1, hits.customMetrics.value, NULL)) WITHIN RECORD AS metric1,
FROM [dataset_id.ga_sessions_YYYYMMDD]