
One million web sites scored by Compete.com: how does Compete eliminate bias, blend multiple data sources and standardize unique counts?

Bigger, more diverse, more actionable online data

Since we started Compete, we have been continuously updating the quality and consistency of our data. With clickstream data available since 2002, and 10 terabytes of new data arriving monthly, we have amassed and organized hundreds of terabytes of daily consumer digital behavior from a dynamic panel of 2 million consumers. But for an industry overloaded with data, how do we make sure our clients’ research investments generate a measurable impact on marketing ROI? How do we connect the dots from audience research and media planning to quantifiable engagement and sales outcomes?

When it comes to online panels, size matters

Compete manages the largest panel of its kind in the industry, combining the online behaviors and attitudes from 2 million consumers across the United States. Our online panel is comprised of a statistically representative cross-section of consumers who have given permission to have their internet clickstream behaviors and opt-in survey responses analyzed anonymously as a new source of marketing research. The Compete panel is several times larger than traditional panels, which means that we help clients measure and benefit from more insights.

Panel representativeness depends on panelist diversity

Compete has pioneered the use of “panel multi-sourcing” to create our panel. This approach is unique in the industry and enables Compete to maintain a large, highly diverse and representative consumer panel. Panel multi-sourcing involves integrating online consumer behavior data from proprietary panels with the same data from licensed clickstream partnerships. Our sources differ by collection, geography, browser, target audiences, and other variables. Without diverse sources, source bias cannot be identified or remedied. Diverse panel sources also allow us to better represent the actual internet browser population with our sample.

Compete recruits proprietary panelists directly by inviting consumers to install our clickstream collection software in order to participate in our panel. In addition, Compete has clickstream-sharing partnerships with Internet Service Providers and Application Service Providers, which provide additional granularity to Compete’s base of proprietary panelists.

Our panel methodology merges these two major sources of data into a single, statistically representative consumer panel. Compete’s methodology uses the multiple individual sources that comprise our panel to normalize, calibrate, and project accurate audience and engagement metrics. No other panel can represent highly fragmented online audiences as effectively as the Compete panel.

Not all data are created equal.

Compete is committed to providing marketers with the most actionable digital intelligence in the industry.

What are the strengths of Compete’s data?

Compete’s clickstream data are collected from a 2,000,000-member panel of US Internet users (about a 1% sample), using diverse sources. Using a rigorous statistical normalization methodology, Compete creates precise projections of the behavior of the entire US Internet browser population on a monthly and weekly basis. In addition, Compete provides daily estimates of the share of consumer attention garnered by the top Internet sites and the velocity of change of this attention. Compete is the only commercial web analytics provider to make its data freely available online for all Internet users.

What is clickstream data?

Clickstream data are the sequence and timing of the Uniform Resource Locators (URLs) visited by panel members on the web.

How does Compete protect the privacy of its panel members?

For Compete’s own panels, personally identifying information (PII) is processed and removed at the member’s computer and before transmittal to Compete. When Compete receives data from licensors and partners, Compete scrubs the clickstream with its own quality assurance procedures to eliminate PII.

Why is it important to have a large panel?

Compete believes it has the largest active panel of US Internet users. Its 2,000,000 members represent approximately 1% of the US Internet browser population. The benefit of this large panel is its ability to represent both broad and granular activity. Using the breadth of its panel, Compete is able to estimate activity on the top 1,000,000 Internet sites and to track detailed consumer behavior patterns that would not be possible with smaller panels (e.g., shoppers for a Toyota Prius who are comparing it against Honda Civics).
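To make the arithmetic concrete: a 2,000,000-member panel that is about 1% of the population implies roughly 200 million US Internet users, so each panelist stands for about 100 people. The sketch below is an illustration only. It assumes simple random sampling, which is a simplification and not a description of Compete’s proprietary methodology, and the function name and the example site reach are hypothetical. It shows how the relative sampling error of a projected visitor count shrinks as the panel grows.

```python
import math

POPULATION = 200_000_000  # implied by "2,000,000 members is about 1%"

def project_visitors(panel_visitors, panel_size, population=POPULATION):
    """Scale a panel count up to the population.

    Returns (projected visitors, relative standard error).
    Assumes simple random sampling, which is a simplification.
    """
    p = panel_visitors / panel_size      # observed reach within the panel
    estimate = p * population            # projected unique visitors
    # Binomial standard error of p, expressed relative to p itself
    rel_se = math.sqrt((1 - p) / (panel_size * p))
    return estimate, rel_se

# A hypothetical small site reached by 0.005% of Internet users
for size in (100_000, 2_000_000):
    visitors = round(size * 0.00005)
    est, rse = project_visitors(visitors, size)
    print(f"panel={size:>9,}  panel visitors={visitors:>4}  "
          f"projected={est:,.0f}  relative SE={rse:.0%}")
```

Under these assumptions the projected count is the same for both panel sizes, but the smaller panel sees only a handful of visitors to the site, so its estimate is far noisier.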

How does Compete estimate site traffic?

Compete’s experts in the fields of mathematics, statistics and the data sciences have developed a proprietary methodology to aggregate, normalize and project the data to estimate US Internet activity. Based on the daily web usage of more than 2,000,000 members in the Compete community, Compete estimates total traffic, rank and other statistics for the top 1,000,000 sites on the web for use by consumers. More detailed and granular metric studies are done for clients on any relevant site.

How are Compete metrics different from Alexa, comScore, Hitwise and Nielsen Online?

Compete estimates site traffic and engagement metrics based on the daily browsing activity of over 2,000,000 U.S. Internet users, the largest, most diverse, most actionable panel in the industry.

Why are diverse panel sources important?

Compete contrasts the data from its diverse sources against each other to identify and eliminate source bias. Without diverse sources, source bias cannot be identified or remedied. Diverse panel sources also allow us to better represent the actual internet browser population with our sample. Our sources differ by collection, geography, browser, target audiences, and more.
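One simple way to picture this kind of cross-source comparison is to compute a site’s reach within each source and compare it with the reach in the pooled panel. The sketch below is an illustration under stated assumptions, not Compete’s actual procedure: the source names, the counts, and the 25% flagging threshold are all hypothetical.

```python
# Hypothetical per-source counts for one site:
# (panelists who visited the site, panelists in the source)
sources = {
    "proprietary_software": (1_200, 400_000),
    "isp_partner": (2_050, 1_000_000),
    "asp_partner": (1_250, 600_000),
}

def reach_index(sources):
    """Return each source's reach divided by the pooled reach (1.0 = no skew)."""
    total_visitors = sum(v for v, _ in sources.values())
    total_panelists = sum(n for _, n in sources.values())
    pooled = total_visitors / total_panelists
    return {name: (v / n) / pooled for name, (v, n) in sources.items()}

for name, ratio in reach_index(sources).items():
    # The 25% threshold is arbitrary and only for illustration
    flag = "  <- check for source bias" if abs(ratio - 1) > 0.25 else ""
    print(f"{name:22s} index={ratio:.2f}{flag}")
```

With a single source there is nothing to compare against, which is why a skew of this kind could not be detected without multiple sources.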

What is the value of precise Unique Visitor estimates compared to rankings and shares?

Compete uses a scientific normalization and prediction process to make precise estimates of US Internet activity. Without precise estimates of total activity, share and ranking publications raise questions about the representativeness of the sample and exactly what marketplace is being represented. Compete’s precise estimates answer these questions clearly.

Why doesn’t Compete’s estimate of site visitors match my local analytics (e.g. Google Analytics, Omniture)?

Compete’s site profiles estimate how many people visit your site from a diverse sample of people that is statistically normalized and projected to represent the size and demographic composition of the total active U.S. Internet population. Compete does not rely on cookies, which log-file analysis and web metrics firms often use. Because of cookie deletion, return visits by the same person (with deleted cookies) wrongly appear to come from a new unique visitor. In addition, if cookie implementation on the server side is done incorrectly, with vague or inconsistent definitions, visitors will be overcounted.

Compete measures only US activity, so traffic at sites with significant international visitors will be understated by Compete’s projections. In addition, local analytics solutions will often include the activity of spiders and bots that appear as traffic but do not represent actual human activity. Compete’s services do not rely on log files or cookies and therefore do not count this activity as traffic.
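The cookie-deletion effect is easy to see with a toy example. In the sketch below, all data are hypothetical: each visit is recorded with the person who made it and the cookie ID the site saw. A cookie-based tool can only count distinct cookie IDs, so anyone who clears cookies is counted again.

```python
# Hypothetical visit log: (person, cookie_id) per visit. A real cookie-based
# tool never sees the "person" column; it is included only for comparison.
visits = [
    ("alice", "c1"),
    ("alice", "c1"),
    ("alice", "c2"),  # alice cleared her cookies and received a new ID
    ("bob", "c3"),
    ("bob", "c4"),    # bob cleared his cookies once
    ("carol", "c5"),
]

unique_cookies = len({cookie for _, cookie in visits})  # what the tool reports
unique_people = len({person for person, _ in visits})   # the true audience

print(f"unique visitors by cookie: {unique_cookies}")
print(f"actual people:             {unique_people}")
```

Here three people generate five cookie IDs, so the cookie-based count overstates the audience.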

How are Compete Unique Visitors counts different than Unique Visitors reported by log files and local web analytics tools?

How does Compete normalize its data?

Compete applies a rigorous normalization methodology, leveraging scientific multi-dimensional scaling to ensure metrics are representative of the U.S. Internet population. Compete members are recruited through multiple sources to ensure a diverse distribution of user types and to facilitate de-biasing across the data sources. Compete’s standard normalization process is applied to our monthly and weekly metrics. Compete relies on share-based measures to report its daily metrics.
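The details of Compete’s normalization are proprietary and are not given here. As a generic illustration of what re-weighting a panel toward a population means, the sketch below uses basic post-stratification on a single variable. This is a swapped-in textbook technique, not Compete’s method, and the age groups, shares, and visit rates are all hypothetical.

```python
# Hypothetical share of each age group in the panel and in the population
panel_share = {"18-34": 0.50, "35-54": 0.35, "55+": 0.15}
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

# Hypothetical fraction of panelists in each group who visited a given site
visit_rate = {"18-34": 0.10, "35-54": 0.05, "55+": 0.02}

# Post-stratification weight: how much each panelist in a group should count
weights = {g: population_share[g] / panel_share[g] for g in panel_share}

# Reach as measured in the raw panel versus after re-weighting
raw = sum(panel_share[g] * visit_rate[g] for g in panel_share)
weighted = sum(panel_share[g] * weights[g] * visit_rate[g] for g in panel_share)

print(f"raw reach={raw:.4f}  weighted reach={weighted:.4f}")
```

In this made-up case the panel over-represents the youngest group, which visits the site most often, so the raw panel overstates reach until the weights correct for it.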

Does Compete data include international Internet users?

Compete’s panel measures US users, but not international users. Compete has worked diligently to develop what it believes is the largest and most diverse panel of online consumers in the United States.

How does Compete measure the web sites it reports on, and why can’t I see my site?

With a panel of 2,000,000 users, Compete is able to measure and report on the actual internet behavior of our panelists, including the websites they visit and the search terms they use. Based on this, we are able to report web metrics for any sites that our panelists visit. This includes the top million websites in the US, and is different from direct measurement, which requires site owners to tag and track their sites. If you cannot see your site, it likely means we do not have enough statistically relevant user information to project its traffic.

For more details, visit http://www.compete.com/us/about/our-data/

Comment by Amy on February 14, 2012 at 8:06am
They just released the January numbers. They are off (way too low) for all the websites that I checked. It looks like they changed the definition of a unique user, making comparison with previous months meaningless.

Comment by Vincent Granville on February 9, 2012 at 4:31pm

One area where Compete.com could get better is in reducing KPI volatility for smaller websites, by taking historical data into account and using better time series models.

© 2014   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
