The coming decade is likely to see even more medical breakthroughs arising through the combination of intelligent algorithms and mass, combined datasets. However, for the world of data to achieve its full potential, we need to break up individual silos and create a collective global intelligence within the medical community.

Jurgi Camblong

From drastically improving diagnostics and treatment decisions to accelerating patient recruitment for clinical trials, the combination of artificial intelligence (AI) and Big Data is changing healthcare for the better. This progress has been made possible by the massive increase in computing power and the growing amounts of patient information stored in various databases around the world.

Scientists predict that the world’s genomic databanks will soon contain more data than all earth sciences and the endless flow of information on social media combined [1]. How we leverage these growing datasets to uncover new healthcare solutions will define the next decade.

Cancer medicine in front

AI and real-world clinical data will eventually transform healthcare across the spectrum, from neurodegenerative diseases to diabetes. But it is in cancer medicine that we have so far seen the biggest benefits from this combination. This is due in part to the intrinsic nature of the disease and how it progresses over time.

Cancer is not only a multifaceted disease but also a constantly evolving one. Understanding its dynamics therefore requires a large amount of statistical power. This can only be derived from cleaned and labelled real-world datasets that are more easily annotated.

Training AI algorithms on clinical datasets has already demonstrated effectiveness in identifying subgroups of cancer patients who respond best to particular treatments. The AI revolution is also coming to the lab.

In the coming years, cancer treatment is likely to become more precise as a result of digital pathology, one of the fastest-moving areas in oncology. Pathologists worldwide will soon digitize immunohistochemistry information, generating a new class of data which has been dubbed ‘histomics’.

We believe this will make it easier for smart algorithms to step in and speed up the diagnostic process. They will do this by learning to detect the earliest signs of cancer, rapidly directing clinicians to areas of abnormality in tissue samples, as well as enabling more accurate treatment recommendations.

Computer-assisted AI technology will be able to use this data, for example, to obtain a better signal for PD-L1 tests. Having this data will be extremely important at all stages of a patient’s longitudinal journey. If you have CT or MRI scans of the tumor, or its molecular characterisation, and can combine these sources of information, you can make much more accurate predictions of how patients will respond to treatments.

This is because you aren’t looking at just one source of data or one ‘picture’. By combining these multiple sources of health data, individual frames of reference become more like a ‘movie’. This ‘moving picture’ shows us the progression of the disease in a way that you could never see by focusing on a single frame.

Future of clinical trials

Real-world evidence is changing the way we study drug safety and effectiveness. Medical product development is on the brink of a new age of evidence generation. In addition to optimizing the choice of existing treatments for patients, the combination of AI and real-world clinical data is also playing a role in identifying new treatments for the future.

One of the ongoing problems with clinical trials is their skyrocketing cost. A randomized clinical trial tries to show efficacy. It depends on a comparator or control arm and asks how well a drug works under highly controlled circumstances. In contrast, a real-world evidence study tries to show how well something works in a real-world scenario compared to an existing treatment.

This is not only extremely valuable from a product development perspective; it can also reduce costs by eliminating the need for the traditional ‘control arm’ in favour of what has come to be known as a ‘synthetic arm’. Fortunately, regulatory bodies such as the FDA are now shifting towards utilizing clinical-grade real-world data as a way of making trials more efficient and effective.

Instead of recruiting a control group of patients to receive a placebo, more and more trials will compare new drugs to a synthetic control arm. This models a comparison group using previously collected data through sources such as historical clinical trials and selected platforms that are computing the data from electronic health records.
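To make the idea concrete, here is a minimal sketch of how a synthetic control arm could be assembled from historical records. It assumes a deliberately simple matching rule (same disease stage, closest age within a tolerance); the `Patient` fields, the `max_age_gap` parameter and the sample data are hypothetical illustrations, not a description of any regulatory-grade method.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    stage: int
    responded: bool


def synthetic_control(treated, historical, max_age_gap=5):
    """Match each treated patient to one unused historical patient.

    Hypothetical rule: same disease stage, closest age within max_age_gap.
    Treated patients without a suitable match are skipped.
    """
    pool = list(historical)
    controls = []
    for t in treated:
        candidates = [
            h for h in pool
            if h.stage == t.stage and abs(h.age - t.age) <= max_age_gap
        ]
        if not candidates:
            continue
        best = min(candidates, key=lambda h: abs(h.age - t.age))
        pool.remove(best)  # each historical patient is used at most once
        controls.append(best)
    return controls


def response_rate(patients):
    """Fraction of patients who responded; NaN for an empty group."""
    if not patients:
        return float("nan")
    return sum(p.responded for p in patients) / len(patients)


# Illustrative, made-up records.
treated = [Patient(60, 2, True), Patient(70, 3, True), Patient(55, 2, False)]
historical = [
    Patient(62, 2, False),
    Patient(58, 2, True),
    Patient(71, 3, False),
    Patient(40, 1, True),
]

controls = synthetic_control(treated, historical)
print("treated response rate:", response_rate(treated))
print("synthetic control response rate:", response_rate(controls))
```

A real study would match on many more covariates and adjust for differences statistically; the point here is only that the comparison group is drawn from previously collected data rather than newly recruited.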

AI platforms can also help accelerate recruitment for trials. As many as 80% of clinical trials fail due to insufficient patient recruitment, but intelligent systems are likely to improve this situation by acting as a digital matchmaker. They can recommend patients who could benefit from a trial, and make trials simpler and less time-consuming for hospitals to run.
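The ‘digital matchmaker’ role can be pictured as a screening step that compares patient records against a trial’s eligibility criteria. The sketch below is a simplified assumption of how such a filter might look; the field names (`diagnosis`, `age`, `biomarkers`) and the criteria are hypothetical, and real eligibility rules are far more detailed.

```python
def is_eligible(patient, criteria):
    """Return True if a patient record meets every (hypothetical) criterion."""
    return (
        patient["diagnosis"] == criteria["diagnosis"]
        and criteria["min_age"] <= patient["age"] <= criteria["max_age"]
        # Every required biomarker must be present in the patient's profile.
        and criteria["required_biomarkers"] <= patient["biomarkers"]
    )


def match_patients(patients, criteria):
    """List the IDs of patients who could be proposed for the trial."""
    return [p["id"] for p in patients if is_eligible(p, criteria)]


# Illustrative, made-up trial criteria and patient records.
trial_criteria = {
    "diagnosis": "lung cancer",
    "min_age": 18,
    "max_age": 75,
    "required_biomarkers": {"EGFR"},
}

patients = [
    {"id": "P1", "diagnosis": "lung cancer", "age": 64, "biomarkers": {"EGFR", "KRAS"}},
    {"id": "P2", "diagnosis": "lung cancer", "age": 80, "biomarkers": {"EGFR"}},
    {"id": "P3", "diagnosis": "breast cancer", "age": 50, "biomarkers": {"EGFR"}},
    {"id": "P4", "diagnosis": "lung cancer", "age": 45, "biomarkers": {"KRAS"}},
]

matches = match_patients(patients, trial_criteria)
print(matches)
```

The final decision would of course remain with clinicians; a filter like this only shortens the list they have to review.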

This could accelerate the drug development cycle, which may mean successful drugs arriving earlier and, hopefully, trials and ultimately drugs that cost less.

Teasing out the algorithm

Intelligent computing is the engine and datasets the fuel. But the performance of even the most powerful engine will be limited if the fuel is of poor quality. The challenge is getting the right signal from the data. How do we obtain datasets of sufficient statistical integrity for intelligent algorithms to be able to extract a signal which is truly representative of the patient population?

Indeed, real-world data has the potential to change the way we will execute many of tomorrow’s clinical trials, but we will have to ensure the data is not only accurate but also standardized. This requires not just real-world data, but clinical-grade real-world data. Data has to be collected, cleaned and labelled to meet regulatory-grade criteria and to enable the building of better algorithms.

Selected platforms not EHRs

One of the keys to increasing the statistical power of datasets is standardizing the data across clinical centers. However, despite the introduction of electronic health records (EHRs), smoothing out the differences between hospitals remains a sizeable hurdle.

EHRs will not help to standardize the data anytime soon. The standardization of data will only come through many hospitals around the world using platforms like the SOPHiA Platform to improve treatment decisions. These selected platforms are then able to collate the information from multiple centers.
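As a rough illustration of what smoothing out the differences between hospitals involves, the sketch below maps records from two centers onto one common schema. The unit table, the diagnosis synonyms and the record fields are all hypothetical examples, not the workings of any particular platform.

```python
# Hypothetical lookup tables for harmonizing site-specific conventions.
UNIT_TO_MM = {"mm": 1.0, "cm": 10.0}
DIAGNOSIS_SYNONYMS = {
    "nsclc": "non-small cell lung cancer",
    "non small cell lung ca": "non-small cell lung cancer",
}


def harmonize(record):
    """Convert one site-specific record to a shared schema.

    Diagnosis labels are lower-cased and mapped to a common term, and tumor
    sizes are converted to millimetres.
    """
    label = record["diagnosis"].strip().lower()
    unit = record["size_unit"].strip().lower()
    return {
        "site": record["site"],
        "diagnosis": DIAGNOSIS_SYNONYMS.get(label, label),
        "tumor_size_mm": record["tumor_size"] * UNIT_TO_MM[unit],
    }


# Illustrative, made-up records from two centers using different conventions.
raw_records = [
    {"site": "A", "diagnosis": "NSCLC", "tumor_size": 2.5, "size_unit": "cm"},
    {"site": "B", "diagnosis": "Non small cell lung ca", "tumor_size": 31, "size_unit": "mm"},
]

standardized = [harmonize(r) for r in raw_records]
for row in standardized:
    print(row)
```

Once records share labels and units, data from many centers can be pooled and compared directly, which is where the gain in statistical power comes from.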

Break up the silos

The coming decade is likely to see even more medical breakthroughs arising through the combination of intelligent algorithms and mass, combined datasets. The combination of all health data, organized together in new ways that can help discover breakthrough medicines and patient care solutions, will prove to be very powerful.
However, for the world of data to achieve its full potential, we need to break up individual silos and create a collective global intelligence within the medical community. This will not be easy, but it is essential. It is a paradigm shift in the way all players in the sector need to think.

The more knowledge we’re able to collect and share, while still respecting patient privacy, the more we can leverage it to develop better treatments and help them get to market faster. We already have the computing power and the technology to build solid algorithms. The major challenge awaiting the healthcare industry over the next decade is pooling the data to create the collective intelligence that is so badly needed to improve treatments, for the benefit of all.

1. Stephens ZD, Lee SY, Faghri F, et al. Big Data: Astronomical or Genomical? PLoS Biol. 2015;13(7).