Online participant recruitment has become central to psychological research, including work on individual differences such as face identity processing ability. A recent study by researchers at UNSW Sydney reports significant differences between participants recruited via Amazon Mechanical Turk (MTurk) and those recruited through Prolific or tested in laboratory settings.
The study examines normative test scores for three standard measures of face identity processing ability: the Glasgow Face Matching Test 2 (GFMT2), the Cambridge Face Memory Test (CFMT+), and the Models Face Matching Test (MFMT). It found that MTurk participants scored approximately 10 percentage points lower on all three tests than both Prolific participants and university students tested face-to-face. This discrepancy raises important questions about data quality and the dependability of participant pools commonly used in psychological research.
Researchers have long recognized the benefits of online recruitment, such as cost efficiency and greater demographic diversity. Recent analyses indicate that MTurk appears in 30-40% of papers published in prominent psychology journals. This trend brings both opportunities and challenges: greater accessibility comes with concerns about attentiveness and data validity.
During the study, the researchers applied stricter exclusion criteria based on attention checks. This adjustment produced a much higher exclusion rate in the MTurk group (around 62%) than in the Prolific group (22%). Even after these exclusions, test scores for MTurk participants remained lower than those of the other groups, supporting the researchers' conclusion that MTurk may not provide generalizable normative measures.
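The effect of tightening an attention-check criterion can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `apply_attention_exclusion`, the `checks_passed` and `accuracy` fields, the thresholds, and the five toy records are hypothetical and are not the study's data or its actual exclusion rule.

```python
def apply_attention_exclusion(records, min_checks_passed):
    """Keep participants who passed at least `min_checks_passed` attention checks.

    Returns the retained records and the proportion of participants excluded.
    The rule and field names are hypothetical, not the study's own criteria.
    """
    if not records:
        raise ValueError("records must not be empty")
    kept = [r for r in records if r["checks_passed"] >= min_checks_passed]
    exclusion_rate = (len(records) - len(kept)) / len(records)
    return kept, exclusion_rate


def mean_accuracy(records):
    """Mean test accuracy (as a proportion) of the given records."""
    return sum(r["accuracy"] for r in records) / len(records)


# Toy data, invented purely for illustration.
PARTICIPANTS = [
    {"id": "p1", "checks_passed": 3, "accuracy": 0.82},
    {"id": "p2", "checks_passed": 1, "accuracy": 0.55},
    {"id": "p3", "checks_passed": 2, "accuracy": 0.74},
    {"id": "p4", "checks_passed": 3, "accuracy": 0.79},
    {"id": "p5", "checks_passed": 0, "accuracy": 0.51},
]

if __name__ == "__main__":
    for threshold in (2, 3):  # lenient rule, then stricter rule
        kept, rate = apply_attention_exclusion(PARTICIPANTS, threshold)
        print(f"threshold={threshold}: excluded {rate:.0%}, "
              f"mean accuracy of retained = {mean_accuracy(kept):.3f}")
```

In this toy sample the stricter rule excludes more participants and raises the mean accuracy of those retained. In the study, by contrast, MTurk scores stayed lower than the other groups' even after the stricter exclusions.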
Participants also completed the GFMT2-Short and GFMT2-High subtests. MTurk participants scored in line with earlier studies on the platform, averaging 75% accuracy on GFMT2-Short, while Prolific participants scored somewhat higher, at 80-82%. The study also reports that the GFMT2 subtests show strong reliability and validity.
Face identity processing ability matters well beyond the laboratory, with applications ranging from identity verification tasks to clinical assessment of social cognition disorders. Reliable tests are especially valuable as public discussion of security and social interaction grows and as societies become more diverse. Because super-recognisers play notable roles across these settings, research needs to identify them accurately and make good use of their expertise.
Test scores also show an age-related pattern in accuracy, with performance peaking around age 36. The study provides normative data for the GFMT2 across participant groups and proposes the scores from Prolific samples as the standard reference for evaluating test performance.
For participants recruited from MTurk, the findings call for caution in how normative scores are interpreted. Because MTurk norms sit lower, using them as a reference could lead to underdiagnosis of impaired ability and to misclassification of exceptionally high ability in the population.
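A minimal sketch shows why the choice of normative sample matters for classification. It assumes that individual scores are compared with a norm using z-scores and a cutoff of 2 standard deviations, a common convention but not necessarily the study's own rule. The norm means and standard deviations below are hypothetical; only the gap of roughly 10 percentage points between the two norms follows the article.

```python
def z_score(score, norm_mean, norm_sd):
    """Standardise a test score against a normative mean and standard deviation."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (score - norm_mean) / norm_sd


def classify(score, norm_mean, norm_sd, cutoff=2.0):
    """Label a score relative to a norm (assumed cutoff: 2 SD either side)."""
    z = z_score(score, norm_mean, norm_sd)
    if z <= -cutoff:
        return "impaired range"
    if z >= cutoff:
        return "superior range"
    return "typical range"


# Hypothetical norms in percent correct: (mean, SD).
# The lower norm sits 10 points below the higher one.
HIGHER_NORM = (80.0, 10.0)  # stands in for a Prolific-style norm
LOWER_NORM = (70.0, 10.0)   # stands in for an MTurk-style norm

if __name__ == "__main__":
    for score in (60.0, 90.0):
        print(f"score={score}: "
              f"higher norm -> {classify(score, *HIGHER_NORM)}, "
              f"lower norm -> {classify(score, *LOWER_NORM)}")
```

Under these assumed numbers, a score of 60% falls in the impaired range against the higher norm but looks typical against the lower one, while a score of 90% looks superior only against the lower norm.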
The takeaway is clear: as online participant pools evolve, so must the methods and benchmarks researchers use. The authors encourage researchers to adopt the normative scores for GFMT2-S and GFMT2-H derived from Prolific samples as the standard for future studies of face identity processing.
By combining rigorous assessment with an updated view of participant characteristics, the research aims to lay the foundations for reliable measurement of the cognitive processes involved in face recognition. That matters not only for psychological assessment but also for the wider scientific community seeking to understand the cognitive underpinnings of perception and identity.