Crowdsourced consumer data: how do we make sure it's good?

Article: Friday, 6 July 2018

Crowdsourcing data through online marketplaces such as Amazon Mechanical Turk poses new challenges about how consumer research should be designed, conducted and analysed. Additionally, it raises questions about the validity of the participants and the information they provide. As protocols for crowdsourcing data are still being worked out, we have developed a few guidelines that will benefit those using such platforms for research purposes.

When Amazon launched Mechanical Turk (MTurk) in 2005, executives touted it as a way to augment artificial intelligence with the old-fashioned human variety. Organizations would post details about a small task that needed to be completed, such as writing product descriptions or identifying performers on music CDs, and then people searching the MTurk site would browse the jobs available and start working on those they were qualified for and thought sounded interesting or lucrative.

But as is often the case with an innovation, one of MTurk’s most popular applications seems to have caught Amazon by surprise: social science research. In consumer behaviour research alone, over 15,000 studies have been published based on evidence collected from MTurk workers, making it perhaps the most represented pool of participants in the history of my discipline. In the Journal of Consumer Research, one of the major journals in my area of specialism, 43 per cent of behavioural studies in the June 2015-April 2016 volume were conducted using MTurk.

MTurk and analogous platforms such as Prolific have enabled my colleagues and me to collect samples much more quickly and more cheaply than through traditional alternatives (eg, university participant pools), and those of us who looked into the quality of the resulting data found it to be comparable to such alternatives.

But every powerful new tool creates a new set of risks, and crowdsourced data is no exception. Some fear that the people filling out the materials might not be giving them their full attention. Others worry that people who take many surveys may no longer be naïve respondents, possibly compromising the results. Finally, how do you know the person filling out the survey is who he says he is? After all, as the old New Yorker cartoon put it, on the internet, nobody knows you’re a dog.

Getting it right

To address these worries, my colleague Joe Goodman of the Ohio State University and I undertook a review of the evidence underlying them, and came up with some guidelines for survey and experimental researchers to harness the benefits of online pools and avoid their drawbacks.

Some of these problems are difficult to prevent, but you can steer clear of most of them by adopting a few strategies. Our paper covers more, but here are the main ones:

Avoid asking for a specific quality

Avoid asking for a specific quality unless it’s a pre-sortable category, such as geography. Without knowing in advance what you want, respondents won’t be tempted to lie about themselves simply to get the job.
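One way to read this guideline is that the study advertisement stays silent about the quality of interest, every participant answers the same neutral question, and the filter is applied only at analysis time. The sketch below illustrates that reading in Python. It is an assumption-laden illustration rather than a procedure taken from the paper: the `Response` record, the `owns_dog` item and the three sample responses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    worker_id: str
    owns_dog: bool  # hypothetical neutral item, asked of every participant
    answers: dict = field(default_factory=dict)


def split_by_eligibility(responses, is_eligible):
    """Split completed responses into eligible and ineligible groups.

    Everyone completes (and is paid for) the same study; the criterion
    is never shown to participants and is applied only afterwards.
    """
    eligible = [r for r in responses if is_eligible(r)]
    ineligible = [r for r in responses if not is_eligible(r)]
    return eligible, ineligible


# Made-up responses, for illustration only.
responses = [
    Response("w1", True, {"q1": 5}),
    Response("w2", False, {"q1": 3}),
    Response("w3", True, {"q1": 4}),
]

eligible, ineligible = split_by_eligibility(responses, lambda r: r.owns_dog)
print([r.worker_id for r in eligible])
```

Because nobody is turned away for giving the "wrong" answer, there is no payoff for misreporting. The cost is that you also pay for responses you may not analyse.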

Require participants to formally enrol

Require participants to formally enrol before you show them the study. Requiring enrolment prevents participants from previewing the study (which can compromise its validity) and raises the time they must invest up front. That added effort before the survey starts also makes quitting halfway through less attractive.

Consider using a third-party crowdsourcing support service

If you are worried about “professional participants”, consider using a third-party crowdsourcing support service such as Turk Prime that can help you recruit people with somewhat less experience taking psychological surveys.

Pay a reasonable rate

Pay a reasonable rate. Though data quality seems to be relatively independent of pay rate, underpaying may compromise your individual reputation and make participating in your future studies, and in MTurk in general, less attractive. (Besides, there are obvious ethical reasons not to exploit those who work for you, right?)

Resist blocking any MTurk worker

For similar reasons, resist blocking any MTurk worker. This can get them knocked off the site (also hurting your own reputation in the end).

Other issues

Other issues may take more time to work out. Crowdsourced data has been around for roughly a decade now but protocols for its use are still being worked out. Our work has proposed some guidelines for handling it better, but there are other issues that remain.

For instance, we need time to learn the distinctive qualities of the crowd behind the data. One survey has found that American MTurk workers tend to score higher on need for cognition (the tendency to engage in and enjoy effortful thinking) and on civics questions. They tend to be younger and better educated than the general population. They are also unusual in that they are slightly more introverted, show greater levels of social anxiety, and have slightly lower self-esteem. This should serve to remind us that, absent more sophisticated recruitment tools, we should always treat crowdsourced samples as non-representative.

Also, the technology itself still has plenty of room for improvement. For example, though third parties can help, these sites don’t have a good way yet to handle interaction between participants. Similarly, easier tools to share projects and data across researchers would be helpful. By enabling researchers to collect larger samples, crowdsourcing is already contributing to making consumer research better, but more can be done to facilitate more open collaboration between scientists.

Crowdsourcing websites like MTurk make survey and experimental investigations more efficient. When used conscientiously, crowdsourcing can also help improve consumer science by enabling more numerous and informative studies and increasing participant and researcher diversity. However, online research and crowdsourcing in particular have their own set of risks, and researchers need to design studies in ways that mitigate them.

Dr. Gabriele Paolacci

Associate Professor

Rotterdam School of Management (RSM)

Journal of Consumer Research

The paper Crowdsourcing consumer research, written by Joseph K. Goodman and Gabriele Paolacci, is published in the Journal of Consumer Research (STAR), Volume 44, Issue 1, 1 June 2017, p196–210.

Erika Harriford-McLaren

Corporate Communications & PR Manager

31(0)104082877

harriford@rsm.nl

Danielle Baan

Science Communication and Media Officer

31(0)104082028

baan@rsm.nl

Rotterdam School of Management, Erasmus University (RSM) is one of Europe’s top-ranked business schools. RSM provides ground-breaking research and education furthering excellence in all aspects of management and is based in the international port city of Rotterdam – a vital nexus of business, logistics and trade. RSM’s primary focus is on developing business leaders with international careers who can become a force for positive change by carrying their innovative mindset into a sustainable future. Our first-class range of bachelor, master, MBA, PhD and executive programmes encourage them to become critical, creative, caring and collaborative thinkers and doers. www.rsm.nl

For more information about RSM or this article, please contact Danielle Baan, Media Officer for RSM, via +31 10 408 2028 or baan@rsm.nl.
