



Socrethics – Yearly Visits


B. Contestabile    admin@socrethics.com




An introduction to the reliability of website statistics can be found in Web Analytics (Wikipedia).

The web analytics software used by Socrethics is StatCounter.

         StatCounter is based on a client program. It uses JavaScript and an HTTP cookie in the user's web browser.

         StatCounter discards visitors such as internet bots and web crawlers, which do not use browsers and JavaScript. It can therefore be assumed that only human visits are counted.

Server-based programs show considerably higher numbers than StatCounter.

How can the difference be explained?



Implausible explanations

Most of the explanations for the difference given in "Page Tagging and Log Analysis" and "Comments on Google Analytics" do not apply to StatCounter:

1.      If JavaScript is disabled, the <noscript> section of the StatCounter routine becomes active. It records the page load but not the detailed visitor data.

2.      If the cookie is blocked, expired or deleted, every hit is recorded as a unique visitor.

3.      Since the StatCounter routine is inserted at the beginning of the webpage, it also records visitors who leave the webpage before it has loaded completely.

4.      If the webpage is accessed by two devices (e.g. PC and laptop) with the same IP address, two visitors are counted, not just one.

5.      Files without a StatCounter tag (such as PDF files) do not exist within the Socrethics project.
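
Points 2 and 4 can be illustrated with a minimal sketch of cookie-based counting. It is not StatCounter's actual code; the names `Hit` and `count_unique_visitors` and the sample data are assumptions made for illustration. The sketch only shows the principle that identity comes from the cookie, not from the IP address.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    ip: str                    # client IP address (not used for identity)
    cookie_id: Optional[str]   # visitor ID from the cookie, None if blocked or deleted


def count_unique_visitors(hits):
    """Count unique visitors by cookie ID, never by IP address (hypothetical helper)."""
    seen = set()
    for hit in hits:
        # Without a usable cookie a returning visitor cannot be recognised,
        # so each such hit gets a fresh ID and counts as a unique visitor.
        visitor_id = hit.cookie_id or uuid.uuid4().hex
        seen.add(visitor_id)
    return len(seen)


hits = [
    Hit("203.0.113.7", "pc-cookie"),      # PC, first page load
    Hit("203.0.113.7", "pc-cookie"),      # PC again: same visitor
    Hit("203.0.113.7", "laptop-cookie"),  # laptop behind the same IP address
    Hit("198.51.100.4", None),            # cookie blocked
    Hit("198.51.100.4", None),            # cookie blocked, counted again
]
print(count_unique_visitors(hits))  # 4
```

The two devices behind one IP address count as two visitors, and the two cookieless hits count as two more.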



Plausible explanations

1.      Missing information in StatCounter:

a.    The web browser asks the server whether the webpage has changed. If the page is unchanged (HTTP 304), it is read from the user's cache and not from the server. StatCounter only increases the count if the webpage is loaded from the server into the web browser.

b.   Programs that copy websites, such as HTTrack, are not registered by StatCounter.

c.    Visitors who disable cookies, JavaScript and image loading in their browser are missing in StatCounter. According to Andy Crestodina (chapter 14), only 1-2% of visitors disable JavaScript.
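
The caching effect described in (a) can be pictured with a toy model of a conditional request. The classes `ToyServer` and `ToyBrowser` are hypothetical and do not reproduce StatCounter or any real server; the sketch simply assumes, as stated above, that only a full load (HTTP 200) is counted on the client side, while the server log records every request.

```python
import hashlib


class ToyServer:
    """Serves one page and logs every request, including 304 responses."""

    def __init__(self, body: str):
        self.body = body
        self.etag = hashlib.sha256(body.encode()).hexdigest()[:12]
        self.log = []  # status code of every request, as an access log would show

    def get(self, if_none_match=None):
        if if_none_match == self.etag:
            self.log.append(304)  # page unchanged: no body is sent
            return 304, None, self.etag
        self.log.append(200)
        return 200, self.body, self.etag


class ToyBrowser:
    def __init__(self):
        self.cached_body = None
        self.cached_etag = None
        self.full_loads = 0  # pages actually transferred from the server

    def visit(self, server):
        status, body, etag = server.get(if_none_match=self.cached_etag)
        if status == 200:
            self.cached_body, self.cached_etag = body, etag
            self.full_loads += 1
        return self.cached_body  # on 304 the cached copy is shown


server = ToyServer("<html>Socrethics</html>")
browser = ToyBrowser()
for _ in range(3):
    browser.visit(server)

print(server.log)          # [200, 304, 304]
print(browser.full_loads)  # 1
```

Three visits leave three entries in the server log but only one full load, which is the kind of gap described in (a).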


2.      Unreliable information in server-based statistics:

Server-based programs separate bots/crawlers/spiders from human visitors by means of publicly available data about blacklisted IP addresses, such as MYIP. There is no guarantee, however, that this separation is flawless. "Black-hat" bots that do not identify themselves as such and that appear to have a legitimate user agent cannot be filtered in this way. Furthermore, the undesirable IP addresses are constantly changing. A comparison of the server-based program Logaholic with StatCounter (using Socrethics data) showed that the number of bots/crawlers/spiders was five times higher than the number of genuine visitors.
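
Why such filtering stays incomplete can be seen in a minimal sketch. It is not how Logaholic or any particular product works; the blacklist entries, the marker list and the function `looks_human` are assumptions chosen for illustration.

```python
BLACKLISTED_IPS = {"192.0.2.10"}            # hypothetical blacklist entry
BOT_MARKERS = ("bot", "crawler", "spider")  # substrings of self-declared bot user agents


def looks_human(ip, user_agent):
    """Return True if the request passes both the IP and the user-agent filter."""
    if ip in BLACKLISTED_IPS:
        return False
    ua = user_agent.lower()
    return not any(marker in ua for marker in BOT_MARKERS)


requests = [
    ("192.0.2.10", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),    # blacklisted IP
    ("198.51.100.20", "ExampleBot/1.0 (+http://example.com/bot)"),  # declares itself a bot
    ("203.0.113.30", "Mozilla/5.0 (X11; Linux x86_64)"),            # human visitor
    ("203.0.113.99", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),  # black-hat bot, new IP
]

counted = [r for r in requests if looks_human(*r)]
print(len(counted))  # 2
```

The last request comes from a bot with a fresh IP address and a browser-like user agent, so it is counted as a human visit alongside the genuine one.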