Exploring The Role Of AI And Machine Learning In Clinical Trials

We live in a rapidly evolving digital era shaped by a continuous stream of pioneering technological advances. The world of clinical research is no exception, and in recent years technology has been increasingly used to enable trials to be managed more efficiently, more accurately, and more creatively. Numerous breakthroughs in technology have enabled this to happen, but two areas in particular – Artificial Intelligence (AI) and Machine Learning (ML) – are becoming pivotal in spearheading industry-wide change.

Here, one of our Machine Learning Engineers, Nicolas Huet, discusses all things AI and ML, including what they actually mean and how they work alongside each other.

What’s the Difference Between Artificial Intelligence and Machine Learning?


Artificial Intelligence is the general goal that we want to achieve: for example, algorithms that can reproduce what a human would do in specific situations. We can define sub-goals under this general goal, such as algorithms that can process images (the field of computer vision) or human text (the field of natural language processing). But this says nothing about how to achieve these goals.


Machine Learning is a set of techniques or algorithms that help us to achieve the goals mentioned above. These algorithms are trained on data built specifically for a given task.

A famous early example of AI was when Deep Blue defeated Kasparov in chess in 1997. The technique behind Deep Blue was a tree search algorithm [1], which is not machine learning. Now, the best algorithms for playing board games use machine learning [2]. Maybe in ten years’ time, we’ll stop focusing on machine learning and utilize other techniques that will enable us to achieve AI, but for now, machine learning is the main technique powering AI.
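To illustrate what "tree search without machine learning" means, here is a minimal sketch in Python. It is not Deep Blue's engine: it assumes a much simpler, hypothetical game (players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins) and explores the full game tree. The function names `can_win` and `best_move` are made up for this example. Note that nothing is learned from data; the program simply tries every possible continuation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win(stones):
    """Return True if the player to move can force a win with `stones` left."""
    # A position is winning if at least one legal move leaves the
    # opponent in a losing position. With 0 stones there is no move,
    # so the player to move has already lost.
    return any(not can_win(stones - take)
               for take in (1, 2, 3) if take <= stones)

def best_move(stones):
    """Return a winning number of stones to take, or None if none exists."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return None

move_from_10 = best_move(10)  # searches the tree below a pile of 10 stones
move_from_8 = best_move(8)    # every move from 8 leaves a winning pile
```

Chess is far too large to search exhaustively like this, which is why real engines need additional techniques on top of the basic search idea.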

What Does Machine Learning Look Like from a Practical Perspective?

Machine learning refers to algorithms that are able to extract patterns and correlations from data in a meaningful way so that they are able to use these patterns to process new and unseen data. This extraction process is called the learning or the training phase.

Training such an algorithm requires feeding it with data tailored to your specific task. To give a basic example, you could design an algorithm that recognizes images of cats. You would then feed the algorithm images that do and don’t contain cats, and you would need to provide a lot of pictures, with very different cats, different backgrounds, etc. This would allow the algorithm to extract the meaningful features (e.g. cat whiskers, paws, legs, tails) that are required to identify a cat.
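The training phase described above can be sketched in a few lines of Python. This is a simplified illustration, not a real image model: it assumes each picture has already been reduced to two hypothetical numeric features (a whisker score and an ear-pointiness score, both invented for this example), and it uses logistic regression trained by gradient descent instead of a neural network. The point is only to show the two stages: learning from labeled examples, then predicting on unseen data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=2000, lr=0.5):
    """Training phase: fit weights by batch gradient descent on the logistic loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Difference between the predicted probability and the true label
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias

def predict(model, x):
    """Return 1 (cat) or 0 (not a cat) for a feature vector."""
    weights, bias = model
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

# Hypothetical toy data: (whisker score, ear pointiness), label 1 = cat
train_x = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3)]
train_y = [1, 1, 1, 0, 0, 0]

model = train(train_x, train_y)

# New, unseen examples that were not part of the training data
unseen = [(0.8, 0.8), (0.2, 0.2)]
predictions = [predict(model, x) for x in unseen]
```

With real images, the features would not be hand-made like this; the model would have to extract them from the pixels itself, which is why far more data and a more expressive model are needed.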

The underlying techniques of machine learning range from linear regression to the transformer architecture [3], generative adversarial networks (GANs) [4], and many more. For each specific use case, you need to select the most relevant method. For example, in the case described above, convolutional neural networks [5], which are a specific type of neural network, are often used.
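To give a feel for the "convolutional" part of a convolutional neural network, here is a small Python sketch of the core operation: sliding a small filter (kernel) over an image to produce a feature map. It is only an illustration under simplifying assumptions. The tiny image and the kernel values are invented and set by hand, whereas in a real network the kernel values are learned during training, and many such filters are stacked in layers. The function name `conv2d` is also just a label chosen for this example.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    This is the cross-correlation form that deep learning libraries
    conventionally call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy 3x4 grayscale image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set kernel that responds to dark-to-bright vertical edges
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
# The response is non-zero only where the window straddles the edge:
# [[0, 2, 0], [0, 2, 0]]
```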

How Does CluePoints’ Risk-Based Quality Management Platform Leverage Machine Learning?

At CluePoints, machine learning is mainly used in two areas:

  • USER EXPERIENCE MANAGEMENT: By using machine learning, some key tasks of the CluePoints platform (grouping of risks, centralized monitoring study setup, etc.) are automated, which makes the platform easier and faster to use.
  • KNOWLEDGE RETRIEVAL: As machine learning algorithms are very good at extracting meaningful information, we are retrieving valuable insights from past studies and bringing them to users of the platform. This helps Sponsors and CROs to more effectively plan, manage, and document their studies.

Given the large volume and diversity of data on the CluePoints platform, there are a lot of opportunities for machine learning to add value and we are constantly working on new algorithms to bring this value to our users.


In a project that started this year (and is still in development), we use machine learning to detect risk signals that correspond to actual study issues.

In the CluePoints platform, users can create risk signals any time they suspect that a potential issue exists. These signals are then used to monitor and track investigations that should ultimately determine whether or not there was a real issue requiring corrective action. This results in free text entered by various users to document their findings. We developed and trained a new deep learning model that can analyze the text data describing those findings and flag signals with a real issue. This algorithm could be used to automatically analyze signals from the platform in order to prioritize signal review and ensure the most effective follow-up and documentation of findings.
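The general idea of flagging free-text findings and prioritizing their review can be sketched as follows. This is emphatically not the CluePoints model: the article describes a deep learning model, while this sketch swaps in a much simpler bag-of-words Naive Bayes classifier so that it fits in a few lines of plain Python. The class name `NaiveBayesText`, the labels `"issue"` and `"no_issue"`, and all example findings are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def issue_probability(self, text):
        """Estimated probability that a finding describes a real issue."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    count = self.word_counts[c][word] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            log_scores[c] = score
        # Normalize the log scores into probabilities
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        return exp_scores["issue"] / sum(exp_scores.values())

# Hypothetical, hand-written training findings (not real study data)
train_texts = [
    "site confirmed data entry error corrected",
    "protocol deviation confirmed retraining required",
    "device miscalibrated readings corrected",
    "values explained by patient population no action",
    "site confirmed values are accurate no action",
    "expected variation no action required",
]
train_labels = ["issue", "issue", "issue", "no_issue", "no_issue", "no_issue"]

model = NaiveBayesText().fit(train_texts, train_labels)

# New findings, ranked so the most likely real issues are reviewed first
new_findings = [
    "values accurate no action needed",
    "confirmed protocol deviation at site retraining planned",
]
ranked = sorted(new_findings, key=model.issue_probability, reverse=True)
```

Sorting by the estimated probability, rather than applying a hard yes/no threshold, matches the prioritization use described above: reviewers can start from the top of the list.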

How is Machine Learning Shaping the Future of Clinical Trials?

It’s always challenging to make predictions in a complex field like this one. However, given the recent major developments in the field of healthcare [6] and especially with the use of natural language processing [7] in the mining of free text healthcare data, it’s virtually certain that machine learning will have ever-increasing application in clinical trials.

Data management in clinical trials is a good example of what machine learning can bring, as it can help automate some tasks that are still mostly done manually [8]. Given the amount of data to handle, the cost of these tasks is very high.

To fully leverage the potential of machine learning and apply it in the real world in the coming years, there are several big challenges to overcome [9], including model robustness in production, privacy protection, model interpretability, and ethical issues. These challenges are common to all machine learning applications but are especially prominent in healthcare and clinical trials.


[1] M. Campbell, et al. (2002), Deep Blue, Artificial Intelligence 134, 57–83
[2] D. Silver, et al. (2018), A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science 362 (6419), 1140–1144
[3] A. Vaswani, et al. (2017), Attention Is All You Need, arXiv:1706.03762
[4] I. J. Goodfellow, et al. (2014), Generative adversarial nets, In Proceedings of NIPS, 2672–2680
[5] A. Krizhevsky, et al. (2012), ImageNet classification with deep convolutional neural networks, In Proceedings of NIPS, 1097–1105
[6] P. Shah, et al. (2019), Artificial intelligence and machine learning in clinical development: A translational perspective, NPJ Digit Med 2, 69
[7] T. Brown, et al. (2020), Language Models are Few-Shot Learners, arXiv:2005.14165
[8] Society for Clinical Data Management (2020), The Evolution of Clinical Data Management to Clinical Data Science, Part 2
[9] M. Brundage, et al. (2020), Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims, arXiv:2004.07213
