Governments around the world are increasingly treating population wellbeing as a key measure of societal prosperity. Recognizing the limitations of traditional metrics like GDP, New Zealand has been at the forefront of this movement, seeking new methods to assess the wellbeing of its citizens. A recent study developed and validated machine learning models that predict subjective wellbeing outcomes from census-level administrative data, with the aim of enriching the existing wellbeing framework.
The New Zealand General Social Survey (GSS) collects wellbeing data from approximately 10,000 individuals, but a sample of this size often fails to capture the experiences of smaller, underrepresented population subgroups, such as people living in social housing. To bridge this gap, the researchers used the Integrated Data Infrastructure (IDI) to develop predictive models for four aspects of wellbeing: life satisfaction, life worthwhileness, family wellbeing, and mental wellbeing.
Three predictive models were evaluated: Stepwise Linear Regression, Elastic Net Regression, and Random Forest. Random Forest proved the most effective, with root mean square error (RMSE) values of around 1.5, which makes it the preferred candidate for generating estimated wellbeing scores for larger populations.
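To make the comparison concrete, here is a minimal sketch of how RMSE can be computed and used to rank candidate models. The scores and predictions below are hypothetical illustrations, not data or results from the study, and the model names are only labels for the three lists.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical held-out wellbeing scores (assumed 0-10 scale), not study data.
actual = [8, 7, 9, 6, 8, 5, 10, 7]

# Hypothetical predictions from three candidate models, for illustration only.
predictions = {
    "stepwise_linear": [7.0, 7.5, 7.5, 7.0, 7.5, 7.0, 8.0, 7.5],
    "elastic_net": [7.2, 7.4, 7.8, 7.0, 7.6, 6.8, 8.0, 7.4],
    "random_forest": [7.6, 7.2, 8.2, 6.6, 7.8, 6.2, 8.6, 7.2],
}

# Score every model on the same held-out data and pick the lowest RMSE.
scores = {name: rmse(actual, preds) for name, preds in predictions.items()}
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.3f}")
print("Lowest RMSE:", best_model)
```

Because RMSE is expressed in the same units as the outcome, a value of 1.5 can be read as a typical prediction error of about one and a half points on the response scale.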
Despite these promising results, the models had low R-squared values, meaning they explain little of the variation in outcomes between individuals. This points to the need for continued refinement of the analytical methods so that they better capture the determinants of wellbeing.
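A low RMSE and a low R-squared can coexist, and the standard definitions show why. When both are computed on the same sample, R-squared compares the squared prediction error with the variance of the outcome itself. The worked number below uses an assumed outcome standard deviation of 1.6, which is a hypothetical value chosen for illustration and not a figure reported by the study.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
    \;=\; 1 - \frac{\mathrm{RMSE}^2}{\sigma_y^2},
\qquad
\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2

% Illustration with an assumed \sigma_y = 1.6 and RMSE = 1.5:
R^2 \;=\; 1 - \frac{1.5^2}{1.6^2} \;=\; 1 - \frac{2.25}{2.56} \;\approx\; 0.12
```

In other words, if responses cluster tightly around the mean, a model can be off by only about 1.5 points on average and still do little better than predicting the mean for everyone.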
New Zealand’s commitment to wellbeing is evidenced by its landmark ‘wellbeing budget’, introduced by the government to tackle pressing social issues, including child poverty and mental health crises. Yet evaluating the effectiveness of such policies has remained difficult because population-level data are insufficient. Machine learning models offer a way to extrapolate wellbeing measures across diverse demographic segments.
The research used data from the GSS and the New Zealand Census of Population and Dwellings, which allowed demographic variables such as age, income, and education to be explored as predictors of subjective wellbeing. By linking these data sources, the study aimed to create broader, population-level estimates of wellbeing that can inform policy recommendations.
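The linking step can be sketched as a simple join on a shared identifier. This is an illustrative outline only: the field names (`person_id`, `age_band`, `income_band`, `education`), the records, and the helper `link_records` are all hypothetical, and the study's actual linkage within the IDI is not described here.

```python
# Hypothetical survey records: wellbeing is observed only for respondents.
survey = [
    {"person_id": 1, "life_satisfaction": 8},
    {"person_id": 3, "life_satisfaction": 6},
    {"person_id": 99, "life_satisfaction": 7},  # no matching census record
]

# Hypothetical census records: demographics are available for everyone.
census = [
    {"person_id": 1, "age_band": "30-44", "income_band": "high", "education": "degree"},
    {"person_id": 2, "age_band": "15-29", "income_band": "low", "education": "school"},
    {"person_id": 3, "age_band": "65+", "income_band": "low", "education": "school"},
    {"person_id": 4, "age_band": "45-64", "income_band": "mid", "education": "diploma"},
]


def link_records(survey_rows, census_rows, key="person_id"):
    """Inner-join survey rows to census rows on a shared identifier."""
    census_by_key = {row[key]: row for row in census_rows}
    linked = []
    for row in survey_rows:
        match = census_by_key.get(row[key])
        if match is not None:
            # Combine census predictors with the surveyed outcome.
            linked.append({**match, **row})
    return linked


# Linked rows have both predictors and an outcome: usable for model training.
training_rows = link_records(survey, census)

# Census rows without a survey response are where predictions would be needed.
surveyed_ids = {row["person_id"] for row in survey}
unlabelled_rows = [row for row in census if row["person_id"] not in surveyed_ids]

print(len(training_rows), "linked rows for training")
print(len(unlabelled_rows), "census-only rows to predict")
```

The split mirrors the idea in the text: a model is fitted where survey outcomes and census predictors overlap, then applied to the much larger set of people who appear only in the census.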
This approach highlights the potential of combining administrative data with machine learning techniques to improve wellbeing predictions, allowing policymakers to craft more targeted interventions. The study also opens avenues for future research to incorporate richer datasets, including environmental factors, which were found to rank among the top predictors of wellbeing.
While the findings provide a foundational framework for conducting wellbeing research with machine learning, they also point to the need for further methodological advances to capture the multifaceted nature of subjective wellbeing. The limitations identified, such as response biases and skewed data distributions, make model training harder and reduce predictive accuracy.
Nevertheless, this exploration marks a significant step toward deeply embedding wellbeing analytics within New Zealand’s policy framework, with the ultimate goal of fostering happier and healthier communities.