Date of Award
2020
Degree Type
Thesis
Degree Name
Honors Thesis
Department
Mathematics
First Advisor
Rasitha Jayasekare
Abstract
The Social Security Administration publishes annual reports on the U.S. Old-Age, Survivors, and Disability Insurance (OASDI) program that give aggregate program information for each U.S. ZIP code, including the types of beneficiaries and the monthly benefits received. These reports present an opportunity for contemporary analysis of the OASDI program. To better capture the significance of the most recent report, for 2018, this project will use model-based cluster analysis, an unsupervised machine-learning method for grouping similar data points, to compare the 2017 and 2018 data. Because of the large amount of data, the project will look solely at the state of Indiana. The form of model-based clustering used in this research assumes that the data are generated by a Gaussian mixture model, which determines the probability that each data point belongs to each cluster. The parameters of the model will be estimated by maximum likelihood, computed with the Expectation-Maximization algorithm, and the Bayesian Information Criterion will be used to select the optimal number of clusters. This model should uncover underlying patterns in Social Security benefits paid in Indiana in 2017 and 2018, as categorized by ZIP code.
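The pipeline described in the abstract (a Gaussian mixture fitted by Expectation-Maximization, with the number of clusters chosen by the Bayesian Information Criterion) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a single numeric variable rather than the multivariate ZIP-code data, uses synthetic values in place of the OASDI figures, and takes BIC in the form -2 log L + p ln n, where lower is better (some software reports the opposite sign). The function names, the quantile-based initialization, and the variance floor are all illustrative choices.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=100, var_floor=1e-3):
    """Fit a k-component 1D Gaussian mixture by EM (illustrative helper).

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles,
    # a shared starting variance, and equal weights.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each cluster.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: maximum-likelihood updates given those probabilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # guard against collapse

    log_lik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC = -2 log L + p ln n, with p = 3k - 1 free parameters in 1D."""
    n_params = 3 * k - 1  # (k - 1) weights + k means + k variances
    return -2.0 * log_lik + n_params * math.log(n)


# Synthetic stand-in data: two well-separated groups (not OASDI data).
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(100)]
        + [random.gauss(10.0, 1.0) for _ in range(100)])

fits = {k: fit_gmm_1d(data, k) for k in (1, 2, 3)}
bic_by_k = {k: bic(fits[k][3], k, len(data)) for k in fits}
best_k = min(bic_by_k, key=bic_by_k.get)
print("BIC by number of clusters:", bic_by_k)
print("Selected number of clusters:", best_k)
```

In a multivariate setting such as the one the abstract describes, each cluster would carry a mean vector and a covariance matrix instead of a scalar mean and variance, and the parameter count in the BIC penalty would change accordingly.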
Recommended Citation
Spencer, Gwendolyn, "Model-Based Cluster Analysis of Indiana Social Security Beneficiary Data" (2020). Undergraduate Honors Thesis Collection. 526.
https://digitalcommons.butler.edu/ugtheses/526