Tag clouds are a very common paradigm for navigation on blogs and web sites. The typical tag cloud consists of a block of words, one after another, usually ordered alphabetically, with font size used to represent a variable such as the number of articles so tagged or the number of searches performed on a particular tag.
In this article I set out to create a Google Gadget that implements a tag cloud from data obtained from Google Analytics.
Using custom variables to tag pages
First, consider how you can "tag" pages with Google Analytics. There are two approaches to tagging pages with the Tracking API. The first involves custom variables and the _setCustomVar() JavaScript tagging function.
So to tag a page with the tags "analytics" and "reports", you could have, for example:
_setCustomVar(1, "tag", "analytics")
_setCustomVar(2, "tag", "reports")
A query such as the following can then retrieve the values of the page tags:
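If your pages use the asynchronous ga.js snippet, the same calls would be queued on the _gaq array rather than called directly. The following is only a sketch: the helper name queuePageTags is hypothetical, the account setup call is omitted, and the page-level scope (3) is passed explicitly. Custom variables need to be queued before the pageview is tracked.

```javascript
// Queue used by the asynchronous ga.js snippet; ga.js takes it over once loaded.
var _gaq = _gaq || [];

// Hypothetical helper: queue one page-level custom variable per tag.
// GA allows five custom variable slots, so extra tags are dropped.
function queuePageTags(queue, tags) {
  const MAX_SLOTS = 5;
  const PAGE_SCOPE = 3; // 3 = page-level scope
  tags.slice(0, MAX_SLOTS).forEach((tag, i) => {
    // Slot indexes start at 1.
    queue.push(["_setCustomVar", i + 1, "tag", tag, PAGE_SCOPE]);
  });
  return queue;
}

queuePageTags(_gaq, ["analytics", "reports"]);
_gaq.push(["_trackPageview"]); // the custom variables travel with this pageview
```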
SELECT METRICS ga:pageviews ON COLUMNS DIMENSIONS ga:pagePath, ga:customVarValue1,ga:customVarValue2 ON ROWS TOP 50 BY ga:pageviews DESC FROM [www.shufflepoint.com] WHERE TIMEFRAME yearToDate FILTER ga:customVarName1 == "tag" OR ga:customVarName2 == "tag"
The results would have to then be post-processed to merge the two columns of tags into a single tag list sorted by pageviews.
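A minimal sketch of that post-processing step follows. It assumes the result rows have already been reduced to plain arrays of the form [pagePath, tag1, tag2, pageviews]; that shape, the sample data, the "(not set)" placeholder check, and the function name mergeTagColumns are all assumptions made for illustration.

```javascript
// Merge two tag columns into one tag list sorted by pageviews.
// Assumed row shape (hypothetical): [pagePath, tag1, tag2, pageviews].
function mergeTagColumns(rows) {
  const totals = new Map();
  for (const [, tag1, tag2, pageviews] of rows) {
    // A page carrying the same tag in both slots is counted once.
    const tags = new Set([tag1, tag2]);
    for (const tag of tags) {
      // Skip empty slots and the assumed "(not set)" placeholder.
      if (!tag || tag === "(not set)") continue;
      totals.set(tag, (totals.get(tag) || 0) + Number(pageviews));
    }
  }
  // Convert to an array and sort by pageviews, highest first.
  return Array.from(totals, ([tag, pageviews]) => ({ tag, pageviews }))
    .sort((a, b) => b.pageviews - a.pageviews);
}

// Made-up sample rows for illustration only.
const merged = mergeTagColumns([
  ["/blog/a", "analytics", "reports", 120],
  ["/blog/b", "reports", "(not set)", 80],
  ["/blog/c", "analytics", "", 30],
]);
// merged lists "reports" (200 pageviews) before "analytics" (150 pageviews)
```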
The advantage of tagging pages using custom variables is that you get to choose the domain of tags. However, GA only allows five custom variables per page, so you really need to think about both the tag domain and each page's tags. It sure would be nice to have ten custom variables available. I don't see anything that would prevent one from implementing this approach, but since we don't have such tags in our GA data now, it wasn't implemented for this article.
Using keywords instead of tags
The other way to approach building a tag cloud from GA is to use keywords as tags. This is thus closer to a "folksonomy" in that your users are effectively creating the domain of tags. There are two possible sources of keyword tags available in GA - organic search and site search.
If you set up Google Site Search early on and have steered users towards it, that is the best place to obtain your tag folksonomy. A query such as the one below will get your top search terms. In this example, I've filtered out "search" as a keyword because that word may be present in the search text field as a user hint, and I don't want to count those results. I've also limited the results to the top three dozen by pageviews and specified with NOOTHER that I don't want ShufflePoint to calculate an "Other" record.
SELECT METRICS ga:pageviews ON COLUMNS DIMENSIONS ga:searchKeyword ON ROWS TOP 36 NOOTHER BY ga:pageviews DESC FROM default WHERE TIMEFRAME yearToDate FILTER ga:searchUsed == "Visits With Site Search" AND ga:searchKeyword != "search"
If you haven't set up Site Search, or you find that you don't have enough usage to get a good tag set, you can try looking at organic search as a source for your tag domain. This would be accomplished with a query such as:
SELECT METRICS ga:pageviews ON COLUMNS DIMENSIONS ga:keyword ON ROWS TOP 36 NOOTHER BY ga:pageviews DESC FROM default WHERE TIMEFRAME yearToDate FILTER ga:keyword != "(not set)"
ShufflePoint will return the dataset in Google's Visualization API JSON format. So I have two tasks to tackle. The first is to render the results out as an HTML tag cloud. The second is to add a useful behavior to this tag cloud.
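Before rendering, the response has to be reduced to a simple list of tags and counts. The sketch below assumes the standard Visualization API table layout (a cols array plus rows whose cells sit in a c array with a v value), and that the keyword is in the first column and pageviews in the second. The sample table, its column ids, and the function name tableToTags are illustrative assumptions, not actual ShufflePoint output.

```javascript
// Reduce a Visualization API style table to [{ tag, count }, ...].
// Assumes column 0 holds the keyword and column 1 the pageviews.
function tableToTags(table) {
  return table.rows
    .map((row) => ({
      // A cell may be null when the value is missing.
      tag: row.c[0] ? String(row.c[0].v) : "",
      count: row.c[1] ? Number(row.c[1].v) : 0,
    }))
    .filter((entry) => entry.tag !== "" && entry.count > 0);
}

// Made-up sample table for illustration only.
const sampleTable = {
  cols: [
    { id: "keyword", label: "Search Keyword", type: "string" },
    { id: "pageviews", label: "Pageviews", type: "number" },
  ],
  rows: [
    { c: [{ v: "excel" }, { v: 412 }] },
    { c: [{ v: "powerpoint" }, { v: 187 }] },
    { c: [null, { v: 5 }] }, // dropped: no keyword
  ],
};

const tagList = tableToTags(sampleTable);
```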
Rendering the cloud
I searched for tag cloud generators and found nothing complicated: you just have a list of links in a div element and set each link's font size to be proportional to its frequency. If you want some richer tag cloud visualizations, there is a nice jQuery library at http://plugins.jquery.com/node/3109. For some ideas on how cool a tag cloud can be, check out http://www.wordle.net/. If anybody knows of an interactive (i.e. Flex or AJAX) tag cloud generator like Wordle, please, please let us know!
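The basic approach can be sketched as a function that returns the HTML for the div. This is not the gadget's actual code: the function names, the tag-cloud class name, the 12 to 36 pixel range, and the linear scaling between the smallest and largest counts are all assumptions chosen for illustration.

```javascript
// Escape text for safe use in HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a tag cloud as an HTML string.
// tags: [{ tag, count }], linkFor: function mapping a tag to its URL.
function renderTagCloud(tags, linkFor, minPx = 12, maxPx = 36) {
  if (tags.length === 0) return '<div class="tag-cloud"></div>';
  const counts = tags.map((t) => t.count);
  const low = Math.min(...counts);
  const high = Math.max(...counts);
  const span = high - low || 1; // avoid dividing by zero when all counts match
  const links = tags
    .slice() // leave the caller's array untouched
    .sort((a, b) => a.tag.localeCompare(b.tag)) // alphabetical order
    .map((t) => {
      // Scale the font size linearly between minPx and maxPx.
      const px = Math.round(minPx + ((t.count - low) / span) * (maxPx - minPx));
      const href = escapeHtml(linkFor(t.tag));
      return `<a href="${href}" style="font-size:${px}px">${escapeHtml(t.tag)}</a>`;
    });
  return `<div class="tag-cloud">${links.join(" ")}</div>`;
}

// Made-up counts and a placeholder link function for illustration.
const cloudHtml = renderTagCloud(
  [
    { tag: "reports", count: 200 },
    { tag: "analytics", count: 150 },
    { tag: "excel", count: 50 },
  ],
  (tag) => "/search?q=" + encodeURIComponent(tag)
);
```

In a gadget, the returned string could be assigned to the innerHTML of a container element.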
Handling clicks
How clicks are handled depends on which pattern of query was used to get the tag list. If the Site Search pattern was used, then the tags should be linked to the Site Search page, with the selected tag set as the search parameter. This is the ideal approach because users remain on your web site (Google provides a JSON search feed interface).
If you are using the organic search query pattern, then you have two options. You could sign up for Site Search now and create a new page (using Google's JSON feed) to display search results. If you don't want to sign up for Site Search, then your tag links will point to Google's search page, so users will leave your site to see the list of pages that contain the tag. But the search expression will include a site qualifier, so all search result links will direct the user back to your own site. Site Search is only $100/year for a typically sized site, a small price to pay for the capabilities it offers.
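The two link styles described in this section might be built as follows. This is a sketch under assumptions: the /search path and q parameter for the on-site results page are hypothetical and depend on how your search page is set up, and the function names are illustrative. The Google link uses the site: operator as the site qualifier.

```javascript
// Link to an on-site search results page (path and parameter are assumptions).
function siteSearchLink(tag, searchPage = "/search", param = "q") {
  return `${searchPage}?${param}=${encodeURIComponent(tag)}`;
}

// Link to Google's search page, restricted to one site with the site: operator.
function googleSearchLink(tag, siteHost) {
  const query = `${tag} site:${siteHost}`;
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
}

const onSite = siteSearchLink("tag cloud");
// "/search?q=tag%20cloud"
const offSite = googleSearchLink("tag cloud", "www.shufflepoint.com");
// "https://www.google.com/search?q=tag%20cloud%20site%3Awww.shufflepoint.com"
```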
This screenshot shows an example of the final cloud. Notice that a timeframe dropdown is present. It lets the visitor dynamically adjust the timeframe of the query that retrieves the ranked keywords used to render the cloud.