Spiders/Bots - WebLog Expert does not give true visit count...

When I run a report with "spiders as visits" checked, I get an increase in visits (expected), but when I run it with "spiders as visits" unchecked, I get a number that isn't correct either.

After manually parsing the bots out of my logs (not a fun and a very time-consuming process) and cross-checking against Omniture, I've found that the true customer visit count can be derived from WebLog Expert, but only in a convoluted way.

The true visitor count (minus the bots) is actually derived by running the report both ways (with and without "spiders as visits" enabled) and subtracting the without-spiders visit count from the with-spiders visit count. Then I take that result and subtract it from the number shown in the without-spiders report.
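In other words (the numbers below are made-up placeholders, just to show the arithmetic I'm describing):

```python
# Made-up report totals -- only to illustrate the arithmetic described above.
visits_with_spiders = 12000      # "spiders as visits" checked
visits_without_spiders = 10000   # "spiders as visits" unchecked

# Visits that the reports attribute to spiders/bots.
spider_visits = visits_with_spiders - visits_without_spiders   # 2000

# What I end up treating as the true customer visit count.
true_visits = visits_without_spiders - spider_visits           # 8000
print(true_visits)
```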

In short, the without-spiders-as-visits totals are not correct at face value in the reports, which is why I have to go through all of this to arrive at truly accurate numbers!

HELP, can WebLog Expert be updated so that spider/bot visits don't show up in the standard report when the "spiders treated as visits" checkbox is unchecked?

I have good documentation to show any engineer at WebLog Expert as proof, so you can improve your system in this area, rather than just telling me I'm wrong and moving on with your day.
Michael
If you wish to get information on visitors who accessed pages only (like Omniture does), you can create an include "Requested file" filter with the value %PageFiles%. Otherwise the program also includes visitors that accessed only non-page files (e.g. images).
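Roughly speaking, the idea of that filter is like the sketch below. This is not WebLog Expert's actual implementation, and the extension list is only an assumption about what counts as a page file:

```python
# Rough sketch of a "pages only" filter -- not WebLog Expert's actual logic.
# Assumption: "page files" means page-type extensions, not images/CSS/JS.
PAGE_EXTENSIONS = (".html", ".htm", ".php", ".asp", ".aspx")

def is_page_request(requested_file: str) -> bool:
    path = requested_file.split("?", 1)[0].lower()   # drop any query string
    return path.endswith("/") or path.endswith(PAGE_EXTENSIONS)

print(is_page_request("/products/index.html"))   # True  -> counted
print(is_page_request("/images/logo.png"))       # False -> excluded
```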

However, the program will probably still show more visitors than JavaScript-based counters for two main reasons:

1. The program cannot identify some robots, either because it doesn't have their user agents in its robot database or because these robots use the user agents of regular browsers (illustrated in the sketch after this list).

2. Some visitors may use multiple IPs, so they are counted as multiple visitors.
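To illustrate point 1: user-agent matching along these lines can only catch robots whose user agents match known patterns. The pattern list here is a tiny hypothetical example, not the program's actual robot database:

```python
import re

# Tiny hypothetical robot-pattern list; a real robot database is far larger.
BOT_RE = re.compile(r"googlebot|bingbot|crawler|spider", re.IGNORECASE)

def looks_like_bot(user_agent: str) -> bool:
    return bool(BOT_RE.search(user_agent))

# A robot reporting a regular browser user agent slips through undetected:
print(looks_like_bot("Googlebot/2.1 (+http://www.google.com/bot.html)"))         # True
print(looks_like_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"))  # False
```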

Unfortunately it's not possible to avoid these two issues just by analyzing regular logs. The method you mentioned (subtracting the difference from the current results) seems to show more accurate results by coincidence: you simply subtract some number from the results, and since the accurate figures should be smaller than the shown ones, the subtraction brings you closer to them. However, for a site with a different percentage of spiders you may get a completely inaccurate result; for example, on a site with a very large number of spiders you could actually get a negative value.
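For example, with made-up numbers (your method is equivalent to 2 x (visits without spiders) - (visits with spiders)):

```python
# Your method: estimate = without - (with - without) = 2*without - with.
def subtraction_estimate(with_spiders: int, without_spiders: int) -> int:
    return 2 * without_spiders - with_spiders

# Moderate spider share: the estimate happens to look reasonable.
print(subtraction_estimate(12000, 10000))   # 8000

# Spider-heavy site: the same formula produces a negative "visit count".
print(subtraction_estimate(30000, 10000))   # -10000
```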