Center for Citizen Media

Citizen Media Business Issues: Web Statistics

(This is the sixteenth in a series of postings about citizen media business issues. See the introduction here. All of these entries are considered to be in “beta” and will be revised and refined as they find a home on a more permanent area of the Center for Citizen Media web site. To that end, your comments, additional examples, and criticisms are welcome and will be invaluable contributions to this process.)

How many people are reading what you’re writing? Who are they? How did they find you? If applicable, how likely are they to click an ad or buy a t-shirt? And without negatively affecting the users’ experience, how can you attract more visitors or increase the probability that they’ll click the ad or buy the t-shirt?

For the final three Citizen Media Business Issues posts, we’ll try to answer these questions by exploring Web statistics, traffic rankings, search engines, and optimization (for search engines as well as for other goals).

Web Statistics

Often referred to as analytics, Web statistics are various measures of website activity intended to help the webmaster or marketer. Webmasters use the information to attract more visitors and improve overall user experience. Marketers use it to maximize revenue and determine the value of ad space.

While companies have been providing and/or selling analytics software and services since the mid-90s, the average webmaster without a budget for such things had to settle, for a long time, for the once-ubiquitous odometer-styled hit counter. This simple, public measure of how many times files had been accessed since the counter’s creation was, aside from being a little tacky, unreliable and open to manipulation by those seeking to misrepresent their traffic data. Counters just didn’t do a good enough job of explaining what really happens on a site. Even if you found a free piece of real analytics software, you needed a good amount of technical savvy and access to your site’s server to get it going and maintain it.

Google Analytics has brought statistics to the Web mainstream, providing a free service that requires no downloads and very little effort. All you have to do is copy and paste a little snippet of code into each of your pages (or just the template, if you’re using one). Google Analytics offers more metrics, and more ways to analyze them, than you will likely need (like which ISP your visitors are using most), but let’s take a look at how to use it so you can figure out which stats are most important to you.

[Note: The majority of this article deals with Google’s product because it is well-known, comprehensive, easy, and free, but it’s far from the only game in town. Drawbacks and alternatives will be discussed at the end, but almost all of the features and metrics discussed in the context of Google Analytics are applicable to its competitors, as well.]

Using Google Analytics

The default tab, the Dashboard, contains the information you’ll most regularly want to check. It’s customizable: click the little X in the corner of a box to remove it, or use the “Add to Dashboard” button at the top of any section in the other tabs to add that section. The first two things to note, because they apply to every Analytics page, are the date range and the graph. The date range (top right corner) is, of course, the period of time that the statistics reflect, and you can cover as much or as little time as you want by adjusting it. The graph displays a particular metric over the selected date range (number of visits, by default, on the Dashboard). The drop-down in the top-right corner of the graph allows you to change which metric is displayed.

Google Analytics Dashboard

The rest of the data can be roughly explained in four groups: how many visitors you have, who they are, what they do on your site, and where they came from.

How many, which includes stats like the numbers of visits and pageviews, is, of course, the most definitive measure of how popular your site is and often the primary motivator to use this software to begin with. A visit is logged any time someone starts a new session on your site, with pageviews counting all pages loaded within that session. To Google, a “session” times out after 30 minutes of inactivity. If you visit, leave, and come back within 30 minutes, it will likely be counted as a single visit. Likewise, if you idle for 30 minutes and then click a link to another page on the same site, you’ll probably be starting a new visit/session on the new page.
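To make the visit/pageview distinction concrete, here is a minimal Python sketch of the 30-minute rule described above. It is an illustration only, not Google’s actual implementation: the function name `count_visits` and the sample timestamps are made up for this example, and it assumes a new visit starts once the gap between two pageviews reaches 30 minutes.

```python
from datetime import datetime, timedelta

# Assumption for illustration: a visit ends after 30 minutes of inactivity.
SESSION_TIMEOUT = timedelta(minutes=30)


def count_visits(pageview_times):
    """Return (visits, pageviews) for one visitor's pageview timestamps."""
    visits = 0
    last_seen = None
    for ts in sorted(pageview_times):
        # A new visit starts on the first pageview, or once the gap since
        # the previous pageview reaches the timeout.
        if last_seen is None or ts - last_seen >= SESSION_TIMEOUT:
            visits += 1
        last_seen = ts
    return visits, len(pageview_times)


start = datetime(2007, 1, 1, 12, 0)  # arbitrary sample date
times = [
    start,                          # arrives on the site
    start + timedelta(minutes=5),   # second page, same visit
    start + timedelta(minutes=20),  # leaves and comes back within 30 minutes
    start + timedelta(minutes=65),  # idles 45 minutes, then clicks a link
]
visits, pageviews = count_visits(times)
print(visits, pageviews)  # 2 4
```

The first three pageviews fall inside one visit, while the click after the 45-minute idle period opens a second one, so four pageviews are logged as two visits.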

Many webmasters want to know who their visitors are in order to improve user experience. Google tracks geographic data, browser types, operating systems, ISPs, connection speeds, percentage of users with Flash or Java installed, and the number that have been to your site before. If, for example, you find that half of the people looking at your page use Internet Explorer on dial-up connections, you probably wouldn’t want to require use of a Firefox plug-in or use a lot of large files like videos or high-resolution images.

Google Analytics City Detail

Information relating to what people do on your site includes amount of time spent on site, average pageviews, and bounce rate (a term that refers to the percentage of visitors who left the page they arrived on without checking out any others).
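As a rough illustration of how these figures relate to one another, the sketch below computes average pageviews and bounce rate from a handful of invented visits. The function name `summarize_visits` and the sample paths are hypothetical, and a “bounce” is assumed here to be any visit with exactly one pageview.

```python
def summarize_visits(visits):
    """Summarize visits, where each visit is the list of pages viewed in it."""
    total_visits = len(visits)
    if total_visits == 0:
        return {"visits": 0, "avg_pageviews": 0.0, "bounce_rate": 0.0}
    total_pageviews = sum(len(pages) for pages in visits)
    # Assumption: a bounce is a visit that never got past its landing page.
    bounces = sum(1 for pages in visits if len(pages) == 1)
    return {
        "visits": total_visits,
        "avg_pageviews": total_pageviews / total_visits,
        "bounce_rate": bounces / total_visits,
    }


sample = [
    ["/"],                                    # bounce
    ["/", "/about"],
    ["/post-1"],                              # bounce
    ["/", "/post-1", "/post-2", "/contact"],
]
summary = summarize_visits(sample)
print(summary)  # {'visits': 4, 'avg_pageviews': 2.0, 'bounce_rate': 0.5}
```

Two of the four sample visits stop at the landing page, so the bounce rate is 50 percent even though the average visit covers two pages.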

Unfortunately, due to tabbed browsing, idle time, session timeouts, and wildly varying personal browsing habits, it’s hard to get a lot of meaning from the average logged amount of time on site/length of visit, but we’ll talk about some of its possible uses in comparisons later. Also, average pageviews, bounce rate, and depth of visit are really only useful to you if you’re not running a blog or news site. Blogs are almost invariably set up to display many articles on one page and the news sites that don’t use blogs still make use of scannable headlines, placing a good deal of content on the front page. What reason would someone have, then, to explore other pages?

A useful what people do stat is under the Content tab, where you can see a break-down of how popular each of your pages is in terms of pageviews. With this you can assess where your strengths and weaknesses are, perhaps spot a problem if something should be higher or lower (a mistyped link, for example), or figure out what sort of content the search engines most closely associate with your site (more on this in the next post).
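If you ever want to reproduce this kind of breakdown from your own records, a per-page tally takes only a few lines. The sketch below is an assumption-laden example rather than how Analytics works internally: `pageviews_by_page` is a made-up helper and the paths are invented, including one mistyped link (`/abuot`) to show how such a problem might surface.

```python
from collections import Counter


def pageviews_by_page(paths):
    """Return (path, pageviews) pairs, most viewed first."""
    return Counter(paths).most_common()


# Invented sample: one entry per pageview, identified by the page's path.
log = ["/", "/post-1", "/", "/about", "/post-1", "/", "/abuot"]

for path, views in pageviews_by_page(log):
    print(f"{views:>3}  {path}")
```

A path you don’t recognize near the bottom of the list, like `/abuot` here, is often the first hint of a broken or mistyped link somewhere.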

If your goal is to get more traffic, then the most valuable data for you here probably has to do with where visitors come from, which can be found under the Traffic Sources tab. “Where” in this case doesn’t refer to geographical location, but how people find you on the Internet. Most people probably don’t type your URL into their address bar, but those that do are counted as “direct traffic,” as are those who have it bookmarked in their browser and access your pages that way. High direct traffic is usually the result of offline marketing (business cards, print ads, etc.), an extremely accessible/memorable URL, or high reader loyalty.

Traffic that isn’t “direct,” then, must have come from some other point on the Web. The Referring Sites section displays not only the names of pages that link to you and how many of your hits came from there, but also the trends of each group of referred users. So you can see how the average pageviews, time on site, or bounce rate differs between users who clicked in from one particular site and those who found you through your friend’s blogroll. The only search engine you will probably see on the list is Google Images. Search engines have their own section (not sure why Google Images isn’t included there).

By taking a close look at your referring sites, you can tell why people are linking to you (they’re probably linking to particular pages or topics), how relevant the referral was (a visitor from Site A may spend twice as long and look at twice as many pages as Site B), learn your strengths and weaknesses, and gain feedback about what you’re writing (though rarely criticism—that usually just shows up as a lack of hits unless you’re sufficiently polemic).
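The split between direct traffic, referring sites, and search engines comes down to the referrer that a browser sends along with each request. Here is a simplified sketch of that sorting. It is not how Google Analytics actually classifies visits: the `classify_source` function, the short `SEARCH_HOSTS` list, and the example hostnames are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, deliberately short list of search engine hostnames.
SEARCH_HOSTS = {"www.google.com", "search.yahoo.com"}


def classify_source(referrer, own_host="example.org"):
    """Label one pageview by where the visitor came from."""
    if not referrer:
        # No referrer: a typed URL or a bookmark, i.e. "direct traffic".
        return "direct"
    host = urlparse(referrer).netloc.lower()
    if host == own_host:
        return "internal"  # a click from another page on your own site
    if host in SEARCH_HOSTS:
        return "search engine"
    return "referring site"


# Invented sample referrers, one per visit.
referrers = [
    "",
    "http://www.google.com/search?q=citizen+media",
    "http://friend.example.net/blogroll",
    None,
    "http://friend.example.net/2007/05/some-post",
]
sources = Counter(classify_source(r) for r in referrers)
print(sources)
```

Grouping visits this way first, and then comparing pageviews or bounce rate within each group, is the same kind of comparison the Referring Sites section lets you make.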

The last major part of where users come from is your search engine data, including keywords. The next post will cover all things search engine in detail.

FeedBurner for RSS

Google Analytics is a fantastic tool that can really help you improve your site (or at least provide you with some fun trivia about your readers), but the major thing it’s missing is data about RSS subscriptions.

Many if not most regularly-updated sites have RSS feeds these days. If you run a blog via any popular weblog software, in fact, you definitely have one. Unless you’re hiding it, odds are that a significant percentage of your visitors get your content that way. So if you have an RSS feed and want the same kinds of information about it that you now have for regular Web traffic, head over to FeedBurner.

FeedBurner was recently acquired by Google, so it may be integrated into Analytics soon, but for now it’s the best place to get statistics and add features to your feed. The service works by replacing your current RSS or Atom feed, directing visitors to subscribe to your FeedBurner feed instead. The end user’s experience doesn’t change unless FeedBurner’s compatibility tweaks make the content more readable or you add features.

The two main RSS stats of note are subscribers and reach. Subscribers is a measure of how many people used their RSS reader to check whether you had new content. Reach is the number of people who actually see your content, either through an RSS reader or otherwise, such as on a news-aggregating website or an RSS search engine.

Drawbacks to Google Analytics and Privacy Concerns

The major feature-based drawback to Google Analytics has to do with the availability of the data. Google decides when reports are generated, not you, so information you see is usually from at least a few hours ago. But features aside, perhaps the biggest concerns people have about using this software have to do with privacy.

While use of Google Analytics is monetarily free (unless you get more than 5 million pageviews per month), you are paying them in the form of information. All of the data you collect about your site, including the information visitors “give” to you, is collected by Google. Per the Google Privacy Policy and Analytics Terms of Use, the company can/will collect information you provide in user sign-up forms, search histories, emails, information about your browser and computer via cookies (which includes at least that which you can see about your own site’s visitors), what sites you’ve visited (lots of pages use Google AdSense and/or Analytics), and so on. And though they assure us it will not be shared with any third parties (it’s mainly for making Google AdWords/AdSense advertising more relevant), Google has a massive amount of data about the world’s Internet users, and that makes some people uneasy.

Alternative Statistics Programs

If it’s possible for you to do so, the best options will generally be programs that you host on your own Web server. They’re the most reliable, most customizable, give you total control over your data (and ownership thereof), and won’t limit how many pageviews you can analyze like most of the hosted services. The downside to these is that they require access to your server and the technical know-how to install, configure, and access the software yourself. If you feel comfortable going down this route, Piwik’s website should be one of your first stops. It’s a very good, free, open-source program with a large base of developers behind it. It positions itself as the “open source alternative to Google Analytics” and it’s about as user-friendly as this sort of thing can get. Other free server-based options include AWStats, SlimStat, and Webalizer, but while each of these has its own unique benefits, they are decidedly more difficult to use than Piwik.

Hosted alternatives to Google (where a company runs the software on its own server so you don’t have to worry about installing it on yours) are generally paid services or are limited in the number of pageviews per day or month they will analyze. W3Counter, for example, has a good free service, but it’s limited to 5,000 pageviews/day and you’re required to display a small logo of theirs on each of your pages. The upgraded plan, which is currently $9.95/mo, removes the logo obligation and allows up to a million pageviews/month (among other features). As another option, StatCounter’s free plan offers almost all of the features of the premium plans, which range from $9 to $29/month. The difference is in the amount of analysis it will perform, which is broken down into two levels. At the no-cost level, basic statistics are viewable for 250,000 pageviews/month, but detailed information is limited to the last 500 visitors. Both W3Counter and StatCounter offer real-time reporting.

Almost all of these services have a demonstration page on their websites where you can test-drive the program before you sign up or install it, so check out a few before deciding.

(Ryan McGrady is a new media graduate student at Emerson College where he is studying knowledge, identity, and ideas in the information age.)

1 Comment on “Citizen Media Business Issues: Web Statistics”

  1. #1 Steve (UK)
    on Jun 3rd, 2009 at 12:45 pm

    Re Google Analytics.
    What is the outcome/effect on stats for site visitors (like me) who actively and specifically disable Google Analytics (e.g. via NoScript)?

    Are such site visitors invisible and missed from the data collected/logged – or does Google Analytics recognise/count the visitor but fail to provide detailed analysis?

    Cheers, Steve.