How we find fraud
TrueStats uses many different detection methods to determine whether a user is valid. Some of these methods are algorithmic; others are learned over time by detecting patterns in the data. To protect our intellectual property and to prevent reverse-engineering, not all methods are listed here.
Some methods are fairly conclusive about whether the user is valid or not. Other methods produce a likelihood of validity, which we show as a score from 0 to 1000. The lower the score, the more likely the user is invalid. For example, a user proxying in through a data center's I.P. address is highly unlikely to be a valid user. On the other hand, certain countries originate the bulk of invalid traffic but also have real users, so country of origin alone is not conclusive.
Scores will generally be between zero and about 500. A score below 100 is considered 'invalid', meaning it is almost certain that the user is not real. The upper end of the scoring range will be used in the future for whitelisting methods.
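As a rough illustration of the threshold described above, the following sketch classifies a score. The function and constant names are hypothetical and are not part of any TrueStats API; only the 0 to 1000 range and the cutoff of 100 come from the text.

```python
# Hypothetical names; only the range and the threshold come from the text above.
SCORE_MIN = 0
SCORE_MAX = 1000
INVALID_THRESHOLD = 100  # a score below 100 is considered 'invalid'


def is_invalid(score: int) -> bool:
    """Return True when a session score falls below the 'invalid' threshold."""
    if not SCORE_MIN <= score <= SCORE_MAX:
        raise ValueError(f"score must be between {SCORE_MIN} and {SCORE_MAX}")
    return score < INVALID_THRESHOLD
```

Under this sketch a score of 99 is treated as invalid, while a score of exactly 100 is not.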
A large and growing percentage of web traffic is generated by bots, spiders, extensions, headless browsers, toolbars and other means (collectively called bots). Bots have become increasingly sophisticated in how they disguise themselves, which requires continuously evolving detection methods.
Here are some of the methods we employ:
|Block List||We check every I.P. address against our database of known infected machines. This detects machines that have been hijacked as spambots, as well as machines that are infected with viruses and generate large amounts of automated traffic and clicks. This database is maintained in real time in order to detect emerging sources.|
|Data Center Origin||We maintain a database of data center I.P. address ranges, since many bot networks will use data centers to create or proxy traffic. A session from within, for example, an Amazon AWS data center address block is unlikely to be valid.|
|Public Web Proxies||Like data centers, public web proxies are also used to proxy traffic. We maintain a real-time database of public web proxies in order to score sessions from them.|
|TOR||TOR has legitimate uses, but hides the origin of the user, so it can be used to generate random sessions.|
|Spoofed User Agents||Bots often rotate their user agents in order to appear to be more than one device and generate realistic looking traffic. We have developed technology to match the user agent to the browser's capabilities and detect sessions that have altered their user agent.|
|Invalid Searches||Bots often create fake referrer headers in order to appear to come from a search engine. In many cases, these headers differ from real search engine referrer structures.|
|Collusion||This method detects the coincidence of a set of I.P. addresses and a set of publisher sites.|
|Other Proprietary Methods||We have developed several other methods for detecting fraudulent sessions, and this continues to be a primary focus of our research efforts.|
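To make the Data Center Origin check more concrete, here is a minimal sketch of testing an address against a list of address ranges, using Python's standard `ipaddress` module. It is not the TrueStats implementation. The function name is hypothetical, and the ranges are reserved documentation networks standing in for a real, maintained database of data center blocks.

```python
import ipaddress

# Placeholder ranges only: these are reserved documentation networks,
# used here in place of a real database of data center address blocks.
DATA_CENTER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_data_center_ip(address: str) -> bool:
    """Return True if the address falls inside any listed range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DATA_CENTER_RANGES)
```

A linear scan is fine for a handful of ranges. A production-sized list would more likely use a sorted structure or a prefix tree so that each lookup stays fast.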
Hidden users are from sessions where no page is ever visible on the screen. This is often, but not necessarily, due to bots: many hidden sessions are generated by search engines pre-loading pages in the background in order to improve performance. A page may also be in a tab that is never shown, or be offscreen. For this reason, hidden sessions score zero.
Primary reasons for hidden sessions:
|Preloading||Search engines will preload pages in the background while a user types in a search query. The search engine attempts to predict which link or links the user will click on and loads the pages from those links. This improves the performance of web browsing; however, many of the preloaded pages are never made visible and should not be counted.|
|Browser Window Hidden||This occurs when a browser window is behind another window.|
|Background Browser Tabs||A browser tab can be launched in the background and load pages. These pages are never visible unless the user opens the tab.|
|Bots||Even if a bot session is not detected by other methods, it will often never be visible and will therefore be scored as invalid.|
Our technology tracks whether a session is ever viewed and updates the visibility based on that. For example, if a page is hidden during a pre-load, it is initially recorded as hidden with a score of zero. If the user clicks on the link to view the preloaded page, that is detected and the session is updated with a new score.
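The update described above can be sketched as a small state change on a session record. The `Session` class, its fields, and `mark_visible` are hypothetical names chosen for illustration. The new score passed in stands for whatever the other detection methods would assign, which this sketch does not model.

```python
from dataclasses import dataclass

HIDDEN_SCORE = 0  # hidden sessions score zero, per the text


@dataclass
class Session:
    """Hypothetical session record; not the actual TrueStats data model."""
    session_id: str
    visible: bool = False
    score: int = HIDDEN_SCORE


def mark_visible(session: Session, new_score: int) -> Session:
    """Record that a page became visible and replace the hidden score."""
    session.visible = True
    session.score = new_score
    return session


# A preloaded page starts hidden with a score of zero...
preloaded = Session("example-session")
# ...and is re-scored once the user actually views it.
mark_visible(preloaded, 350)
```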
Each session is scored, and all reports have options to include or exclude users based on score. For example, you may want to view campaigns where the score is less than 100. This would show you the campaigns that are referring the worst quality users.
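The kind of filtering described above can be approximated outside the reports as well. The sketch below assumes session data is available as (campaign, score) pairs. That shape, the function name, and the sample campaign names are all assumptions made for illustration.

```python
from collections import Counter


def low_score_campaigns(sessions, max_score=100):
    """Count sessions scoring below max_score, grouped by campaign.

    `sessions` is an iterable of (campaign_name, score) pairs, an
    assumed shape rather than a TrueStats export format.
    """
    counts = Counter(
        campaign for campaign, score in sessions if score < max_score
    )
    return counts.most_common()


# Made-up sample data for illustration only.
sample = [
    ("spring_sale", 40),
    ("spring_sale", 0),
    ("newsletter", 420),
    ("banner_a", 95),
]
worst = low_score_campaigns(sample)
```

Campaigns with the most low-scoring sessions come first in the result, so the worst referrers appear at the top.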
TrueStats is a pay-as-you-go service based on the total number of hits in a month.
Up to 500,000 hits
Up to 1,000,000 hits
Up to 2,000,000 hits
Up to 3,000,000 hits
Hits are the total sessions, events, page views and API calls.
We offer our first price break at 1 million hits for $200 per month. Each additional million hits after that costs only $100 per month. If your website records more than 5 million hits per month, we recommend contacting us about high-volume pricing; above 40 million hits, contact us about enterprise pricing.
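The arithmetic behind these price breaks can be sketched as follows. Two things here are assumptions rather than stated policy: partial millions are rounded up to the next whole million, and the function (a hypothetical name) declines to price the 500,000-hit tier and volumes above 5 million, since those prices are not given in the text.

```python
import math

MILLION = 1_000_000
FIRST_MILLION_USD = 200       # stated: 1 million hits for $200 per month
ADDITIONAL_MILLION_USD = 100  # stated: each additional million costs $100


def monthly_price_usd(hits: int) -> int:
    """Estimate the monthly charge for 500,001 to 5,000,000 hits."""
    if hits <= 500_000:
        raise ValueError("the price of the 500,000-hit tier is not stated here")
    if hits > 5 * MILLION:
        raise ValueError("over 5 million hits: contact sales for pricing")
    # Assumption: usage is billed in whole millions, rounded up.
    millions = math.ceil(hits / MILLION)
    return FIRST_MILLION_USD + ADDITIONAL_MILLION_USD * (millions - 1)
```

Under these assumptions, 2 million hits come to $300 per month and 3 million hits come to $400 per month.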
How do I choose a plan?
TrueStats is only available as a pay-as-you-go service. The monthly service charge is based solely on the amount of traffic that TrueStats records and analyses. That traffic is measured in hits: the total sessions, page views and API calls the TrueStats servers receive from your website in a single month.
Is there a discount for a bulk purchase?
TrueStats offers bonus hits when you purchase quarterly or annually. Credit is applied to your account and you are given an extra 10% of hits a month to use. We recommend this discount for high volume and enterprise customers, but it is available to all customers. Contact our sales team or call 888-453-5291 to learn more.