Data Analytics and Healthcare Innovation: Insights from Najat Khan


Meet Najat Khan, Ph.D. of the Janssen Pharmaceutical Companies of Johnson & Johnson to learn more about data and healthcare innovation.


Written in collaboration with Iris González

As the Chief Data Science Officer and Global Head, Strategy and Operations for Research & Development at the Janssen Pharmaceutical Companies of Johnson & Johnson, Najat Khan has remarkable insights into the intersection of data and healthcare.

And, in late 2021, she took time to share them as part of Correlation One's DS4A / Women training program.

Presentation attendees commented afterward that they were fascinated by three main topics Khan touched upon:

  • What it’s like to work at the intersection of data science and healthcare innovation
  • How Janssen’s R&D data science team contributed to the development of J&J’s COVID-19 vaccine
  • How predictive and research analytics can contribute to the development of innovative and more targeted medical treatments

Intrigued? Let’s take a closer look at Khan’s remarks.

All in a Day’s Work

"I have to tell you it's one of the most fun and impactful jobs I've ever had because it combines medical science, data science, and business strategy and acumen so that you can have a true and lasting impact on patients," Khan said of her role at Janssen. "Data science has already transformed so many other industries like retail and finance. The time is ripe for that transformation to take hold in medicine."

As DNA and single-cell sequencing provide more data on our biological complexity, she noted, there's a push for more personalized medicines: more targeted therapies prescribed for the right patients at the right time.

"The AI-based algorithms are getting more sophisticated for data science in healthcare applications, and computing is getting faster. Not surprisingly, the amount of investment in the space has skyrocketed," Khan said. "What's needed in this industry is just the will and the fortitude to integrate data science in our ways of thinking, working, and culture to drive decisions."

Khan said that the R&D data science team at Janssen, which is composed of over 100 data scientists and data engineers, uses analytics and artificial intelligence (AI) to improve the probability that research programs will produce desired outcomes for better medicine.

One example of their work with which you may be familiar? J&J’s historic COVID-19 vaccine response.

The Power of Data and Medicine

Knowing what's causing the disease is just as important as working on promising innovations, as Khan pointed out regarding the Janssen team’s work on J&J’s COVID-19 vaccine development.

For the effort, data scientists and data engineers under Khan’s leadership collaborated with their clinical and operations counterparts in a crucial, game-changing, cross-team endeavor.

"Data science was central to how we answered many questions,” Khan recalled. “First, [we asked] how do we understand what's moderate to severe COVID? How do we know what's bad versus medium versus mild disease? How do we predict what the risk factors are? Do we know who's getting COVID?"

When the time came to plan clinical trials for the Johnson & Johnson vaccine, the Janssen data specialists built a machine learning model, in partnership with the Massachusetts Institute of Technology (MIT), to predict where pandemic “hot spots” would erupt months down the road. These predictions were later used to select clinical trial sites: placing sites in areas with high infection rates, and thus higher rates of potential participant exposure to COVID-19, would enable medical researchers to collect richer data sets more quickly.
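To see why site placement matters, consider a rough back-of-the-envelope sketch. It is not the Janssen/MIT model, which the talk did not describe in detail. It simply assumes a constant predicted infection rate per region and estimates how long a site would need to observe a target number of cases. The class, function names, and all numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateSite:
    name: str
    # Hypothetical model output: predicted new infections per person per week.
    predicted_weekly_incidence: float


def weeks_to_target(site: CandidateSite, participants: int, target_events: int) -> float:
    """Weeks needed to observe `target_events` infections, assuming a constant rate."""
    expected_events_per_week = participants * site.predicted_weekly_incidence
    return target_events / expected_events_per_week


def rank_sites(sites: list[CandidateSite]) -> list[CandidateSite]:
    """Order candidate sites from highest to lowest predicted incidence."""
    return sorted(sites, key=lambda s: s.predicted_weekly_incidence, reverse=True)


# Illustrative numbers only; not taken from any real trial.
sites = [
    CandidateSite("Region A", 0.001),
    CandidateSite("Region B", 0.004),
    CandidateSite("Region C", 0.002),
]

for site in rank_sites(sites):
    weeks = weeks_to_target(site, participants=5000, target_events=40)
    print(f"{site.name}: about {weeks:.1f} weeks to reach 40 observed cases")
```

Under these assumptions, a region with four times the predicted incidence reaches the same number of observed cases in a quarter of the time, which is the intuition behind placing trial sites in predicted hot spots.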

The team’s predictions were 90% accurate, helping to accelerate the vaccine’s development by approximately 6 weeks, at a time when every moment mattered. Additionally, placing sites in hot spots, such as South Africa, allowed the team to discover “how [our] vaccines would hold up in the real world, especially as we had new variants coming through,” Khan said.

Looking Beyond the Pandemic

Khan noted that, with the COVID-19 vaccine trials, the team maintained a commitment to testing diverse groups of people. This was a critical choice, especially given the racial disparities in healthcare that have made the pandemic harder for minority populations. 

As part of their work, the R&D data science team at Janssen included disease risk factors like age, socioeconomic status (because risk can vary between income levels), and degree of social compliance.

"We ended up having one of the most diverse COVID-19 Phase 3 trials,” Khan said. “When you're building a medicine or a vaccine and running trials, the patient population should be representative of the patients you're going to treat or reach."

The use of data science in healthcare and therapeutics research isn't limited to COVID-19. In fact, Janssen is leveraging data science across 90% of its pharmaceutical R&D pipeline.

"We have over a hundred programs outside of COVID-19 because there are patients with many different diseases, from cancer to other infectious diseases who are waiting," Khan said.

One area with high potential for impact is the rare disease space. Patients with rare diseases typically have poorer outcomes because they are diagnosed too late in their illnesses.

As with the COVID-19 vaccine, Khan and her data team count on predictive analytics when developing treatments for patients suffering from rare diseases. At any given time, in fact, they’re running multiple research projects.

For instance, the Janssen team once brainstormed strategies to help detect pulmonary arterial hypertension (PAH) earlier. PAH is a rare, life-threatening disease with diagnostic delays of two to three years from symptom onset.

Analyzing common patient data gathered on PAH, the team decided to take a deep dive into electrocardiogram (ECG) test results for patients in the early stages of the disease.

Working from a massive patient data set from the tests, Khan's team applied an AI algorithm to detect subtle signs that a patient may have PAH.

"We have been able to develop an accurate algorithm that can detect PAH within about half-a-year to a year-and-a-half with 88% accuracy," Khan said. “That translates into more time for patient intervention and is hoped to decrease patient burden and translate into improved quality of life and longevity.”

What Matters Most

From embracing team and clinical trial population diversity to integrating cross-disciplinary approaches and leveraging analytics in new ways, Khan believes her team’s work will continue to have real-world implications for patients.

This is a simple but important fact that, Khan noted, J&J and Janssen data professionals keep top of mind in their work.

"[It] goes back to the impact you can have on a person's life and their family,” Khan said of the work. “That's what keeps us going."

Explore More

  • To discover more about how Janssen is leveraging R&D across its pipeline, visit their website.
  • To learn more about Correlation One, including how we partner with companies like Janssen and Johnson & Johnson to develop emerging data talent and upskill employees, visit our Enterprises page.