An overview of the two methodologies used in web analytics: page tagging and web server log files
As mentioned in my other post about What Google Analytics Can’t Do, one of the key factors in getting your site a visible ranking is knowing how well it performs with search engines and how well it meets your visitors’ needs. To find out, you need one or more web analytics tools, whose job is to collect, measure and report web traffic data. Many tools exist, and most if not all of them are based on one of two methodologies: page tagging or web server log files.
Page tagging method
Google Analytics is an example of a web analytics application that uses the page tagging method. Let’s take a look at the whole process involved:
You tag your web pages by inserting the tracking code on each page -> Visitors visit your website -> Visitors’ web browsers send information to the Google Analytics servers -> Google Analytics stores and processes the data -> You access Google Analytics to view the metrics
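To make the "browser sends information" step more concrete, here is a minimal sketch of the idea. Real tags are JavaScript running in the visitor’s browser; this Python version only illustrates the handoff of data as query parameters on a request. The collector URL, the parameter names and the `build_beacon_url` helper are all made up for illustration and are not Google Analytics’ actual protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collector endpoint; a real vendor defines its own.
COLLECTOR = "https://collector.example.com/hit"


def build_beacon_url(page_url, title, referrer, visitor_id):
    """Pack details about one page view into the query string of a request."""
    # Parameter names are assumptions chosen for readability.
    params = {
        "page": page_url,
        "title": title,
        "ref": referrer,
        "vid": visitor_id,
    }
    return COLLECTOR + "?" + urlencode(params)


# What the tag on a page might send when the page is viewed.
beacon = build_beacon_url(
    page_url="https://www.example.com/pricing",
    title="Pricing & plans",
    referrer="https://www.example.com/",
    visitor_id="abc123",
)

# What the vendor's server would read back out of that request.
received = parse_qs(urlsplit(beacon).query)
print(received["page"][0], received["vid"][0])
```

The point is simply that the data travels from the browser to the vendor, not to your own web server, which is why the vendor ends up holding it.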
Web server log file method
A web server log (file) is a log file, usually a plain text file, that the web server creates automatically with details of its activity. Every visitor to your website is tracked or, more exactly, logged by the server. The server creates an entry in its log for each request, with details such as the visitor’s IP address, the date and time, the page or file requested, bytes served, referrer, user agent etc. Log files hold a wealth of data, and because of this, each log file is pretty big (we’re talking about tens or hundreds of megabytes per text file). As a result, to analyze log files you need a web log analyzer, a piece of software that can read/import log files and spit out useful information in a user-friendly way. This technique, which does not depend on the visitor’s browser, is referred to as server-side data collection.
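As a rough illustration of what a log analyzer does with each entry, here is a minimal sketch that parses one line. It assumes the widely used "combined" log format; your server may be configured to write a different layout, and the sample line, the `COMBINED` pattern and the `parse_entry` helper are made up for the example.

```python
import re
from datetime import datetime

# Assumed layout: the "combined" log format. Adjust if your server differs.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Turn one log line into a dict of fields, or None if it doesn't match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Invented sample entry using documentation-reserved addresses.
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"https://www.example.com/" "Mozilla/5.0"'
)
entry = parse_entry(sample)
print(entry["ip"], entry["path"], entry["bytes"], entry["referrer"])
```

Each field named in the paragraph above (IP address, date and time, requested page, bytes served, referrer, user agent) ends up as one key in the resulting dict.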
ClickTracks and Awstats are examples of web log analyzers. Let’s take a look at what happens:
Visitors visit your website -> Your web server creates entries in its log file -> You use ClickTracks/Awstats to process the logs and get the reports/metrics
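The last step, turning raw entries into a report, can be sketched in a few lines. This is a toy version of what tools like these do, not how they actually work internally: it assumes combined-format lines, counts only successful (2xx) requests, and treats distinct IP addresses as a crude stand-in for unique visitors. The `REQUEST` pattern, the `summarize` helper and the sample lines are invented for the example.

```python
import re
from collections import Counter

# Pulls just the IP, requested path and status code out of a log line.
REQUEST = re.compile(
    r'^(?P<ip>\S+) .*?"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def summarize(lines):
    """Count successful hits per page and the number of distinct IPs seen."""
    hits = Counter()
    visitors = set()
    for line in lines:
        match = REQUEST.match(line)
        # Skip lines that don't parse and requests that did not succeed.
        if match is None or not match["status"].startswith("2"):
            continue
        hits[match["path"]] += 1
        visitors.add(match["ip"])
    return hits, len(visitors)


# Invented sample data.
log_lines = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1500 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:14:02:10 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:14:02:11 +0000] "GET /missing HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
]

hits, unique_ips = summarize(log_lines)
for path, count in hits.most_common():
    print(path, count)
print("distinct IPs:", unique_ips)
```

A real analyzer does far more (sessions, robots filtering, referrer reports), but the shape is the same: read entries, group them, report totals.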
Judging from the two flows above, page tagging involves more steps and so seems more time-consuming and error-prone. However, it is now considered the new standard in web analytics. The advantages and disadvantages of each method, described below, explain why.
|Web server log / Web log analyzer||Page tagging|
|Log files are generated by web servers; even if you don’t run your own web server and instead buy a hosting package from a hosting company, you can still get the log files. Thus, the raw data is readily available without you having to change or tag your web pages.||To collect data via page tagging, you need to tag your web pages properly. Pages that are not tagged, or not tagged properly, won’t have data.|
|The raw data, again, is on your own or your hosting provider’s server, so you can easily access and archive the log files, or switch to another web analytics package if you want to and still be able to analyze your data, including historical data.||Data is on your web analytics vendor’s server. You don’t have access to the raw data and can’t move or archive it. Thus, if you want to switch vendors, you have to consider that you may lose your historical data. Even if you can keep historical data, it may not be raw and may not be usable with other vendors.|
|The web server only logs an entry when it receives a request. If a page is stored in and served from the browser cache, no request reaches the server, so no activity is recorded in the log file. That visit is not counted, and not counting visits from cached pages can really distort your data.||Page tagging collects data no matter where the page is served from, the browser cache or the server. As soon as the page loads in the browser, the tracking code fires and data is collected.|
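The caching difference in the last row is easy to see in a toy simulation. This is a deliberately simplified model, not a description of how any real browser or analytics tool behaves: it assumes a cached page is always reused without contacting the server, and that the tracking code runs on every view. The `simulate` function and the sample views are invented for the example.

```python
def simulate(page_views):
    """Compare what a server log and a page tag would record for the same views."""
    browser_cache = set()  # (visitor, page) pairs already stored by a browser
    server_log = []        # entries the web server would write
    tag_hits = []          # hits the tracking code would send

    for visitor, page in page_views:
        if (visitor, page) not in browser_cache:
            # First view: the browser requests the page, so the server logs it.
            server_log.append((visitor, page))
            browser_cache.add((visitor, page))
        # The tag runs in the browser on every view, cached or not.
        tag_hits.append((visitor, page))

    return server_log, tag_hits


views = [
    ("alice", "/home"),
    ("alice", "/home"),  # served from alice's browser cache
    ("bob", "/home"),
    ("alice", "/home"),  # served from cache again
]
server_log, tag_hits = simulate(views)
print("server log entries:", len(server_log))
print("page tag hits:", len(tag_hits))
```

In this model, every repeat view of a cached page is a view the server log never sees, while the tag still reports it.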
Each method on its own has strengths and weaknesses. Page tagging, however, is considered the standard in web analytics, mostly because it can collect and provide data about the interaction between your web audience and the site. With this type of data, you and your web team can decide whether the current layout/model is good enough, or whether you need to change the way your site/page looks and functions to achieve the established goals.
Again, hope this helps!