Date of Award


Degree Type


Degree Name

Honors Thesis


Actuarial Science

First Advisor

Rasitha Jayasekare


The Centers for Medicare & Medicaid Services (CMS) releases annual reports on the Market Saturation and Utilization of nationwide Medicare coverage. CMS data provide an opportunity for an in-depth analysis of Medicare usage patterns within the United States that may offer insight into socioeconomic conditions in certain regions. To discover potential patterns, the KAMILA (KAy-means for MIxed LArge data sets) clustering algorithm was applied to the most recent CMS dataset, from 2018. Because the original dataset is large, this research is limited to Illinois Medicare data, grouped by the 102 counties in Illinois. The KAMILA algorithm extends the well-known k-means clustering algorithm to mixed-type data by using a weighted semi-parametric procedure that balances the contributions of quantitative and qualitative variables. The optimal number of clusters is decided in part by the operator of the algorithm, with respect to the number of cross-validation runs. KAMILA was applied both to the main CMS dataset and to a modified version that excludes Cook County, and two clusters were found in each case. This offers insight into the structure of Medicare services in the state of Illinois.
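To illustrate how a clustering procedure can weigh quantitative and qualitative variables together, the following is a minimal sketch and not the KAMILA implementation used in the thesis. It assumes a unit-variance Gaussian score for the continuous part, whereas KAMILA estimates the continuous density nonparametrically, and it uses smoothed category frequencies for the categorical part. The function name `fit_mixed_clusters`, its parameters, and the toy values are all hypothetical and are not drawn from the CMS data.

```python
import math
from collections import Counter


def fit_mixed_clusters(cont, cat, init_centers, n_iter=10, smooth=1.0):
    """Simplified mixed-type clustering (an illustration, not full KAMILA).

    cont: list of rows of continuous values (assumed already standardized)
    cat: list of rows of categorical labels
    init_centers: starting centers for the continuous variables
    """
    k = len(init_centers)
    n_cat = len(cat[0])
    n_dim = len(cont[0])
    centers = [list(c) for c in init_centers]
    # Number of distinct levels per categorical variable, used for smoothing.
    n_levels = [len({row[j] for row in cat}) for j in range(n_cat)]
    # Empty counts give uniform category probabilities on the first pass.
    counts = [[Counter() for _ in range(n_cat)] for _ in range(k)]
    sizes = [0] * k
    labels = [0] * len(cont)

    for _ in range(n_iter):
        # Assignment step: add the continuous and categorical log-scores.
        new_labels = []
        for x, c in zip(cont, cat):
            scores = []
            for g in range(k):
                # Assumption: unit-variance Gaussian log-density (up to a constant).
                cont_score = -0.5 * sum(
                    (xi - mi) ** 2 for xi, mi in zip(x, centers[g])
                )
                # Smoothed log-probability of each observed category in cluster g.
                cat_score = sum(
                    math.log(
                        (counts[g][j][c[j]] + smooth)
                        / (sizes[g] + smooth * n_levels[j])
                    )
                    for j in range(n_cat)
                )
                scores.append(cont_score + cat_score)
            new_labels.append(max(range(k), key=scores.__getitem__))
        labels = new_labels

        # Update step: recompute centers and category counts per cluster.
        for g in range(k):
            members = [i for i, lab in enumerate(labels) if lab == g]
            sizes[g] = len(members)
            if members:
                centers[g] = [
                    sum(cont[i][d] for i in members) / len(members)
                    for d in range(n_dim)
                ]
            counts[g] = [
                Counter(cat[i][j] for i in members) for j in range(n_cat)
            ]
    return labels, centers


# Invented toy data: one continuous and one categorical variable per row.
toy_cont = [[0.1], [0.2], [0.0], [3.0], [3.1], [2.9]]
toy_cat = [["rural"], ["rural"], ["rural"], ["urban"], ["urban"], ["urban"]]
labels, centers = fit_mixed_clusters(toy_cont, toy_cat, [[0.0], [3.0]])
print(labels)
```

Because both parts are expressed as log-scores and summed, neither variable type is converted into the other, which is the balancing idea described in the abstract. The actual study would rely on an established KAMILA implementation rather than this sketch.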

Included in

Mathematics Commons