Joao Correia

Today, web clickstream data is mostly in the guarded hands of web analytics vendors, though some of these vendors allow you to access the raw data. The human and computational resources needed to put this raw data to work are more accessible and cheaper than ever. Why and how can you take back ownership of your web analytics (clickstream) data?

What is Clickstream Data?

Every time a user views a webpage or takes a tracked action on a website, a “hit” is recorded. A hit is simply a row of data containing rich information about the event and the visitor. A typical hit can include dimensions such as: date, time, user ID, browser, operating system, country, city, requestedURL, hostname, etc.
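To make the idea of a hit as "a row of data" concrete, here is a minimal sketch of one hit as a Python record. The field names follow the dimensions listed above, but the exact names, types, and example values are assumptions for illustration; every vendor defines its own schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass(frozen=True)
class Hit:
    """One row of clickstream data: a single tracked page view or action."""
    timestamp: datetime       # date and time of the hit
    user_id: str
    browser: str
    operating_system: str
    country: str
    city: str
    requested_url: str
    hostname: str


# Example values are invented for illustration only.
hit = Hit(
    timestamp=datetime(2015, 3, 1, 14, 30, 5),
    user_id="u-1001",
    browser="Chrome",
    operating_system="Windows",
    country="United States",
    city="Sacramento",
    requested_url="/pricing",
    hostname="www.example.com",
)

# Flatten the record into a plain dict, i.e. one row of a table.
row = asdict(hit)
```

A real hit usually carries many more columns than this, but the shape is the same: one flat record per page view or tracked action.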

Clickstream data is composed of thousands, millions or even billions of hits which together tell the story of who your visitors are, what they do on your website, and how they do it. This is the data web analytics vendors use to provide you with statistics about your website.
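As a rough illustration of how summary statistics fall out of raw hits, the sketch below rolls a handful of hits up into total hits, unique visitors, and page views per URL. The hit fields and the `summarize` helper are hypothetical, and real vendors apply far more processing (sessions, filters, sampling) than this.

```python
from collections import Counter

# A tiny, invented clickstream: one dict per hit.
hits = [
    {"user_id": "u1", "requested_url": "/"},
    {"user_id": "u1", "requested_url": "/pricing"},
    {"user_id": "u2", "requested_url": "/"},
    {"user_id": "u3", "requested_url": "/"},
    {"user_id": "u2", "requested_url": "/contact"},
]


def summarize(hits):
    """Aggregate raw hits into a few report-style statistics."""
    pageviews = Counter(h["requested_url"] for h in hits)
    unique_visitors = len({h["user_id"] for h in hits})
    return {
        "total_hits": len(hits),
        "unique_visitors": unique_visitors,
        "pageviews_by_url": dict(pageviews),
    }


summary = summarize(hits)
```

The summary is all a standard report shows you. With the raw hits in hand, you can re-aggregate them along any dimension you like.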

Why Own Your Web Analytics Data

If you can relate to the questions below, then keep reading this post to find out how to own your clickstream data.

State of the Analytics Nation

We are in a booming cloud computing age:

Almost every web analytics vendor tracks via JavaScript (or via mobile SDKs), firing a pixel that carries the data identifying each hit: browser resolution, page path information and other rich metadata. Most of this information doesn’t go to your own server logs; it is stored inside your vendor’s data warehouse and you have to work to get it out (if they provide it to you at all).
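The pixel mechanism can be sketched in a few lines: the hit's metadata is encoded into the query string of an image request, and the collecting server decodes it on arrival. The endpoint URL and the parameter names (`uid`, `res`, `path`) below are hypothetical and do not correspond to any particular vendor's protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collector endpoint, for illustration only.
COLLECTOR = "https://collector.example.com/pixel.gif"


def build_pixel_url(hit):
    """Encode a hit's fields as query parameters on the pixel request."""
    return f"{COLLECTOR}?{urlencode(hit)}"


def parse_pixel_url(url):
    """Recover the hit's fields on the collecting side."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


url = build_pixel_url({
    "uid": "u-1001",
    "res": "1920x1080",
    "path": "/pricing?plan=pro",  # special characters get percent-encoded
})
decoded = parse_pixel_url(url)
```

Whoever operates the collector endpoint ends up holding the decoded hits, which is why the data lands in the vendor's warehouse rather than in your own server logs.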

Many of the paid vendors provide this granular clickstream data (Google Analytics Premium via BigQuery, Adobe Analytics via Data Feeds, etc.), and it is up to you to extract it.

Today, companies should not be limited to answering only basic questions for lack of clickstream data.

How to Free Your Digital Analytics Data

If you are a Google Analytics Premium client, you can enable the Google BigQuery integration which will make the clickstream data available to you on a daily basis.

If you use Google Analytics Standard (free version), Google does not give you access to the clickstream data. You can query the API for certain data, but what you get back is summarized and processed. Fear not though; you are not stuck! One way to take ownership of your GA Standard clickstream data is to use Blast’s Clickstreamr product, which captures a copy of the same data that is sent to GA.

If you use Adobe Analytics, you can have your clickstream data delivered daily via FTP in multiple zipped files.
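Once zipped files like these land on your side, loading them is straightforward. The sketch below reads every file inside a zip archive and yields one dict per hit. It assumes each file is tab-delimited text with a header row, and the file and column names are invented. The actual feed layout depends on how the feed is configured, so treat this as a starting point, not a description of Adobe's format.

```python
import csv
import io
import zipfile


def read_zipped_hits(archive):
    """Yield one dict per row from every file inside a zip archive.

    Assumes each member is UTF-8, tab-delimited text with a header row.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text, delimiter="\t")


# Build a small in-memory archive so the example runs without any downloads.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("hits_part1.tsv", "user_id\trequested_url\nu1\t/\n")
    zf.writestr("hits_part2.tsv", "user_id\trequested_url\nu2\t/pricing\n")
buffer.seek(0)

rows = list(read_zipped_hits(buffer))
```

In practice you would pass the path of each delivered archive instead of the in-memory buffer, and write the rows to a database or data warehouse.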

If you use Snowplow Analytics, you own your clickstream data. This sophisticated event analytics platform leverages Amazon Web Services to scale. You can store and analyze your data using Hadoop Hive, Spark, Redshift or PostgreSQL. Notably, the platform also offers real-time clickstream analytics.

Owning your clickstream data has never been so easy and affordable! Some of the most popular applications of this data include:


Having a solid web analytics implementation with summarized data can bring value to organizations, but as your analytics maturity evolves you’ll find that highly granular, user-level data will allow you to answer far deeper and more valuable business questions. Stop waiting for your analytics tool to build a feature to answer your business questions and move into the limitless world of clickstream data analysis!

