In this blog series, we're proud to shine a light on one of the top Capstone projects from the graduating class of Data Science for All / Women. Capstone projects are a critical component of the DS4A / Women curriculum in which teams get together to work on projects that solve real-world data challenges faced by today’s leading companies and public sector organizations.
“I am a research scientist with an interest in personalized medicine and digital health technologies. I have a PhD from the University of Washington and am currently working at a startup focused on expanding access to diagnostic tests. I joined DS4A Women to expand my knowledge of ML/AI technologies and am excited to apply what I learned to new projects.”
“I received my PhD in Psychology from Stanford University and completed a postdoctoral fellowship at UC Berkeley. I am currently a research scientist at UCSF, where I use neuroimaging and behavioral datasets to understand human brain function. I joined DS4A to update my analytics skill set and to connect with other women working in data-driven fields.”
“I am a recent graduate of UCLA with a bachelor’s degree in statistics and a minor in digital humanities, and I currently work as a data analyst at TikTok (Bytedance Inc.). I decided to apply to DS4A Women to work on a data science project with excellent individuals from a larger community and to enhance my data science skills through exclusive classes. My goals post-DS4A include improving my data and analytical skills to find better and more impactful ways to approach problems in tech, user experience, and social issues.”
“I am a postdoctoral scholar at the Institute for Health Metrics and Evaluation, with a PhD in Applied Mathematics from the University of Washington. I joined DS4A / Women for professional development mentoring and to network with other data science professionals. I am looking forward to applying all that I learned from DS4A to my current work in global health.”
"I have dual bachelor’s degrees from the University of Pennsylvania in Computer Engineering and Economics. As a member of the executive track cohort of DS4A / Women, I bring an extensive background in analytics, with experience applying my analytical skills at Nielsen, Hulu, and my own successful e-commerce business. Most recently, I have been working at Spark Foundry, where I help clients assess, measure, and optimize their marketing efforts using analytics. I gained a lot of great experience at DS4A, including the teamwork on the final project, as well as the classes and mentorship provided."
"I have a bachelor’s degree in advertising and am completing my master’s degree in Business Analytics at UCLA. I currently intern as a data scientist at Shopify. I joined DS4A to expand my professional network and to apply advanced analytics skills to an impactful capstone project."
"I am a recent double master’s graduate in International Business and Business Analytics and now work in the tech industry as a data analyst. At Toric, I help pave the way to introduce no-code business intelligence. I joined DS4A to expand my knowledge and professional network and in hopes of pushing towards a data-focused future in my new role."
About the Project: Prenatal Care and its Impact on Healthy Births
The United States has higher rates of infant morbidity and mortality when compared to countries with a similar GDP. In this project, we set out to understand the various factors that contribute to infant morbidity and mortality by analyzing data from the CDC Natality and Linked Births / Infant Deaths data sets. Our initial exploratory data analysis revealed many benefits of prenatal care, so we used a combination of modeling techniques to investigate the following questions: (1) How does prenatal care affect birth outcomes? (2) What factors influence who receives prenatal care?
What was the most exciting/surprising finding from your project?
One lesson we learned was the importance of using balanced data sets when trying to predict rare events. For example, only 1.48% of the records in our data involved no prenatal care of any form, so our initial model simply predicted that everyone would receive prenatal care. However, by sampling smaller data sets with an equal number of records with and without prenatal care, we were able to build a model with similar levels of specificity and sensitivity.
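For readers who want to see the balancing idea in code, here is a minimal sketch in Python. It is not the team's actual pipeline: the field name `no_prenatal_care`, the helper functions, and the 1,000-record toy data set are all made up for illustration, and the "model" is just the always-predict-care baseline described above.

```python
import random


def undersample_balanced(records, label_key, seed=0):
    """Return a shuffled sample with the same number of records per class.

    Every class is cut down to the size of the rarest class.
    """
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(rec[label_key], []).append(rec)
    n = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 15 of 1,000 records (1.5%) have no prenatal care.
records = [{"no_prenatal_care": 1 if i < 15 else 0} for i in range(1000)]
labels = [r["no_prenatal_care"] for r in records]

# Baseline that predicts "received care" (label 0) for every record.
always_care = [0] * len(labels)
sens, spec = sensitivity_specificity(labels, always_care)
print(f"baseline sensitivity={sens:.2f}, specificity={spec:.2f}")

# Balanced training sample: 15 records with care, 15 without.
balanced = undersample_balanced(records, "no_prenatal_care")
print(len(balanced), sum(r["no_prenatal_care"] for r in balanced))
```

On this toy data the baseline never flags a record without prenatal care, so its sensitivity is zero even though almost all of its predictions are correct. Training on the balanced sample removes the incentive to ignore the rare class; any resulting model would still need to be evaluated on held-out data that reflects the original class proportions.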
What were some challenges you faced and how did you overcome them?
The data sets we chose to use had many interesting patterns and dimensions, so narrowing our topic into something feasible given our timeline was one of the hardest parts of the project. Ultimately, we chose to center our analysis around a common theme (prenatal care), where we could explore questions that we hoped would lead to useful insights with the potential for positive impact.
Who is your team's mentor and how did they help?
Our mentor was Wen-Ying Feng. She met with our team at various stages of the project to help us define and scope our problem, to suggest useful models and provide technical advice, and to review our final results.
What do you view as the impact of your project?
Overall, we observed that those with access to prenatal care had lower infant death rates, lower rates of admission to the neonatal intensive care unit, and a lower percentage of infants with very low birth weight. Furthermore, starting prenatal care earlier in the pregnancy was a significant factor in lowering death rates, with a greater impact for infants born at earlier gestational age. Finally, we found that differences in insurance status, race, and education were important indicators in predicting who receives prenatal care. These results suggest that expanded access to and use of prenatal care, especially within specific population groups, could help to lower rates of infant morbidity and mortality in the US.