01 March 2025

New Study Predicts YouTube Users' MBTI Personalities From Comments

Research reveals the dominant personality types among users engaging with conspiracy, spirituality, and travel topics.

For decades, personality measurement has relied primarily on self-report questionnaires. A recent study by researchers Luisa Stracqualursi and Patrizia Agati explores an alternative: using machine learning to analyze language and predict the Myers-Briggs Type Indicator (MBTI) personality types of YouTube users from their comments. The approach could also support more personalized online content.

The classifier developed for the study uses natural language processing to assign an MBTI type to YouTube commenters, focusing on videos about conspiracy theories, spirituality, and travel. The results show a clear predominance of a few types among commenters on these topics.

The researchers scraped 140,933 comments from YouTube and kept only users who had written at least 100 words. The most common MBTI type among commenters was INFP, the 'Mediator', often described as open-minded and imaginative. Close behind was INTP, a type associated with logic and intellectual exploration.
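
The 100-word filter can be illustrated with a minimal Python sketch. It assumes the threshold applies to each user's combined comments, in line with the "100 words per user" wording quoted from the study, and that words are counted by splitting on whitespace. The function name, the `(user_id, text)` input shape, and the sample data are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict

MIN_WORDS = 100  # threshold reported in the article


def filter_users_by_word_count(comments, min_words=MIN_WORDS):
    """Group comments by user and keep users whose combined text
    contains at least `min_words` whitespace-separated words."""
    text_by_user = defaultdict(list)
    for user_id, text in comments:
        text_by_user[user_id].append(text)

    kept = {}
    for user_id, texts in text_by_user.items():
        combined = " ".join(texts)
        if len(combined.split()) >= min_words:
            kept[user_id] = combined
    return kept


# Made-up sample: user_a writes 110 words across two comments, user_b only two.
sample = [
    ("user_a", "word " * 60),
    ("user_a", "word " * 50),
    ("user_b", "too short"),
]
kept = filter_users_by_word_count(sample)
print(sorted(kept))
```

Pooling text per user before counting means that several short comments can together clear the threshold, which a per-comment filter would not allow.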

The study also noted distinct patterns among users commenting on different themes. For conspiracy theory discussions, INTP types emerged as the most prominent, whereas the INFP personality dominated the conversations surrounding spirituality and travel. The pattern suggests that emotional engagement and personal interests shape online interactions.

"The most common MBTI type among YouTubers who comment on these types of videos is INFP, followed by INTP, INFJ, and INTJ, indicating a predominance of Introverted and Intuitive individuals," the authors wrote. These personality types may engage more deeply with content, potentially reflecting a preference for solitary contemplation or introspection.
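
To show how a ranking of four-letter types translates into a statement about individual dimensions such as Introversion and Intuition, here is a small Python sketch. The counts are invented toy numbers chosen only to mirror the reported ordering (INFP, INTP, INFJ, INTJ); they are not the study's data, and the helper name `dimension_shares` is hypothetical.

```python
from collections import Counter


def dimension_shares(type_counts):
    """Return the share of users carrying each MBTI letter,
    given a mapping from four-letter type to user count."""
    total = sum(type_counts.values())
    if total == 0:
        return {}
    letters = Counter()
    for mbti_type, n in type_counts.items():
        for letter in mbti_type:
            letters[letter] += n
    return {letter: count / total for letter, count in letters.items()}


# Toy counts (illustrative only, not taken from the study).
toy_counts = Counter({"INFP": 40, "INTP": 30, "INFJ": 15, "INTJ": 10, "ESTP": 5})
shares = dimension_shares(toy_counts)

print(toy_counts.most_common(2))  # the two most frequent types
print(shares["I"], shares["N"])   # share of Introverted and Intuitive users
```

Because the four leading types all begin with "IN", the Introverted and Intuitive shares are high whenever those types dominate the counts.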

The findings contribute to a broader approach to personality assessment, one that moves away from traditional self-reports and aligns with the 'self-presentation' view of personality. This perspective treats individuals' choices of language and expression as performative acts that are significant for characterizing personality.

Over the years, the MBTI has faced criticism for its methodology, but it remains popular, particularly in organizational contexts. This research suggests it is still relevant, as machine learning models can draw predictive insights about user behavior and personality traits from language analysis.

The researchers trained an Extreme Gradient Boosting (XGBoost) model on the widely used MBTI Kaggle dataset and report reliable personality predictions. The study applies MBTI prediction to social media data and supports the idea that personality can be predicted from as few as 100 words per user.

The authors write, "Our work introduces the notion...that merely 100 words per user can yield adequate personality predictions for the first time." The statement underlines the potential for rapid personality assessment from minimal text, which could eventually lead to more customized user experiences on platforms like YouTube.

The findings could be applied beyond YouTube comments, for example in targeted advertising, content recommendations, and engagement strategies. With growing demand for personalized content online, machine learning-driven personality assessments could improve audience targeting across digital platforms.

Despite these promising results, the authors acknowledge limitations, such as the imbalanced representation of personality types in the dataset. Such imbalances could skew predictions for certain personality types, so organizations applying these insights should do so with care.

Overall, the study presents evidence that language patterns can reveal personality traits, drawing attention to the intersection of psychology and technology. Framing personality assessment through language use and machine learning may open new pathways for how content is marketed, displayed, and personalized online.