In this blog series, we're proud to shine a light on one of the top Capstone projects from the graduating class of Data Science for All / Women. Capstone projects are a critical component of the DS4A / Women curriculum in which teams get together to work on projects that solve real-world data challenges faced by today’s leading companies and public sector organizations.
“I am a research scientist with an interest in personalized medicine and digital health
technologies. I have a PhD from the University of Washington and am currently working at a startup focused on expanding access to diagnostic tests. I joined DS4A Women to expand my knowledge of ML/AI technologies and am excited to apply what I learned to new projects.”
“I received my PhD in Psychology from Stanford University and completed a
postdoctoral fellowship at UC Berkeley. I am currently a research scientist at
UCSF, where I use neuroimaging and behavioral datasets to understand
human brain function. I joined DS4A to update my analytics skill set and to
connect with other women working in data-driven fields.”
“I am a recent graduate of UCLA with a bachelor’s degree in statistics and a minor in digital humanities, and I currently work as a data analyst at TikTok (Bytedance Inc.).
I decided to apply to DS4A Women to work on a data science project with excellent individuals from a larger community and to enhance my data science skills through exclusive classes. My goals post-DS4A include improving my data and analytical skills to find better and more impactful ways to approach problems in tech, user experience, and social issues.”
“I am a postdoctoral scholar at the Institute for Health Metrics and Evaluation, with a PhD in Applied Mathematics from the University of Washington. I joined DS4A / Women for professional development mentoring and to network with other data science professionals. I am looking forward to applying all that I learned from DS4A to my current work in global health.”
“I have dual bachelor’s degrees from the University of Pennsylvania in Computer Engineering and Economics. As part of the executive track cohort of DS4A / Women, I have an extensive background in analytics, with experiences applying my analytical skills at Nielsen, Hulu, and my own successful e-commerce business. Most recently, I work at Spark Foundry, where I help clients assess, measure, and optimize their marketing efforts using
analytics. I gained a lot of great experiences at DS4A, including the teamwork in the final project, as well as the provided classes and mentorship.”
“I have a bachelor’s degree in advertising and am completing my master’s degree in Business Analytics at UCLA. I currently intern as a data scientist at Shopify. I joined DS4A to expand my professional network and to leverage advanced analytics skills on an impactful capstone project.”
"I am a recent double master’s graduate in International Business and Business
Analytics and now work in the tech industry as a data analyst. At Toric, I
help pave the way to introduce no-code business intelligence. I joined DS4A to
expand my knowledge and professional network and in hopes of pushing
towards a data-focused future in my new role."
The United States has higher rates of infant morbidity and mortality when
compared to countries with a similar GDP. In this project, we set out to understand the various factors that contribute to infant morbidity and mortality by analyzing data from the CDC Natality and Linked Births / Infant Deaths data sets. Our initial exploratory data analysis revealed many benefits of prenatal care, so we used a combination of modeling techniques to investigate the following questions: (1) How does prenatal care affect birth outcomes? (2) What factors influence who receives prenatal care?
Click to read the datafolio
Read the presentation of this capstone project
One lesson we learned was the importance of using balanced data sets when
trying to predict rare events. For example, only 1.48% of the records in our data
reported no prenatal care of any kind, so our initial model predicted that everyone
would receive prenatal care. However, by sampling smaller data sets with an
equal number of records with and without prenatal care, we were able to build a
model with similar levels of specificity and sensitivity.
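
To make the balancing idea concrete, here is a minimal Python sketch of random undersampling, under stated assumptions. It is not the team's actual code: the records are synthetic, and the field name `no_prenatal_care` and both helper functions are hypothetical names chosen for illustration. The sketch also shows why a model that always predicts the majority class looks accurate yet has zero sensitivity for the rare class.

```python
import random


def undersample_balanced(records, label_key, seed=0):
    """Return a shuffled subset with an equal number of records per label value."""
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(rec[label_key], []).append(rec)
    # Every class is cut down to the size of the rarest class.
    n = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Compute (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Synthetic, illustrative data only: 1 = no prenatal care (rare), 0 = received care.
# 15 of 1,000 records (1.5%) are in the rare class.
records = [{"no_prenatal_care": 1 if i < 15 else 0} for i in range(1000)]

# A model that predicts "received care" for everyone is right 98.5% of the
# time here, but it never identifies a single rare-class record.
y_true = [rec["no_prenatal_care"] for rec in records]
always_majority = [0] * len(records)
baseline = sensitivity_specificity(y_true, always_majority)
print(baseline)  # (0.0, 1.0)

# Undersampling keeps all 15 rare records and 15 randomly chosen common ones.
balanced = undersample_balanced(records, "no_prenatal_care")
print(len(balanced))  # 30
```

A model trained on the balanced subset can no longer score well by ignoring the rare class. The trade-off is that most majority-class records are discarded, so repeating the sampling with different seeds is one way to check that results are stable.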
The data sets we chose to use had many interesting patterns and dimensions, so
narrowing our topic into something feasible given our timeline was one of the
hardest parts of the project. Ultimately, we chose to center our analysis around a
common theme (prenatal care), where we could explore questions that we hoped
would lead to useful insights with the potential for positive impact.
Our mentor was Wen-Ying Feng. She met with our team at various stages
of the project to help us define and scope our problem, to suggest useful
models and provide technical advice, and to review our final results.
Overall, we observed that those with access to prenatal care had lower infant
death rates, lower rates of admission to the neonatal intensive care unit, and a
lower percentage of infants with very low birth weight. Furthermore, starting
prenatal care earlier in the pregnancy was a significant factor in lowering death
rates, with a greater impact for infants born at earlier gestational age. Finally, we
found that differences in insurance status, race, and education were important
indicators in predicting who receives prenatal care. These results suggest that
expanded access to and use of prenatal care, especially within specific
population groups, could help to lower rates of infant morbidity and mortality in