What Is Trending In Data Management And AI?

With the advent of AI promising to transform the way we process and interpret data, there has never been a more exciting time to be working in clinical data management.

The buzz around technologies like machine learning and deep learning (ML/DL) has been palpable throughout 2023, and it was a particularly hot topic at this year’s Society for Clinical Data Management (SCDM) annual conference, where there was standing room only for the AI stream.

Here, CluePoints’ co-founder and chief product and technology officer, Francois Torche, shares his key takeaways on the challenges and opportunities of AI in clinical data management, explains the importance of retaining “human in the loop” control, and argues that big questions around intellectual property (IP), compliance, and unfair competitive advantage remain.

An evolving landscape

AI is not completely new in the clinical research industry, which has been using ML, a subset of AI that utilizes algorithms to spot patterns in large datasets, for some time. It is also worth remembering that statistics is a core component of data analytics and machine learning, and statistics is clearly not new in our industry.

One such example is data analytics-driven risk-based quality management (RBQM), which allows research teams to compare accruing data from across sites to detect anomalies that could threaten data quality or integrity, and investigate whether action is needed. More recently, DL, a subset of ML capable of continuously learning in an unsupervised way from data processed across multiple parameters or layers, has been used to streamline time-consuming functions such as medical coding and electronic data capture (EDC) query posting.
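To make the idea of comparing accruing data across sites concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any RBQM product: the function name `flag_outlier_sites`, the threshold of 2.0, and the blood pressure readings are all hypothetical, and a simple leave-one-out z-score stands in for the more robust statistical tests a real system would use.

```python
from statistics import mean, stdev


def flag_outlier_sites(site_values, threshold=2.0):
    """Flag sites whose mean is far from the means of all other sites.

    Each site is compared against the remaining sites (leave-one-out),
    so a single extreme site does not hide itself by inflating the spread.
    """
    site_means = {site: mean(vals) for site, vals in site_values.items()}
    flagged = {}
    for site, site_mean in site_means.items():
        others = [m for s, m in site_means.items() if s != site]
        if len(others) < 2:
            continue  # not enough other sites to estimate a spread
        spread = stdev(others)
        if spread == 0:
            continue  # sketch limitation: identical peers give no scale
        z = (site_mean - mean(others)) / spread
        if abs(z) > threshold:
            flagged[site] = z
    return flagged


# Hypothetical systolic blood pressure readings collected per site.
readings = {
    "site_A": [120, 118, 125, 122],
    "site_B": [119, 121, 124, 120],
    "site_C": [123, 117, 121, 122],
    "site_D": [121, 120, 119, 124],
    "site_E": [140, 142, 139, 141],
}

flagged = flag_outlier_sites(readings)
print(sorted(flagged))  # site_E stands out from its peers
```

A flag of this kind is only a prompt for investigation. Whether the difference reflects a data quality problem or a genuine difference in the site's patient population is a question for the study team.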

The next hot topic is generative AI, which can create new content based on historical data, and includes large language models (LLMs), the most obvious example of which is ChatGPT.

LLMs work by learning patterns in huge structured and unstructured text datasets, then using what they have learned to predict the most likely “answer” to a given question. More precisely, they estimate the probability of each possible next word and build the answer one word at a time. Over the last 12 to 18 months, pharmaceutical and biotech companies have been using LLMs to assist in tasks such as designing protocols and performing risk assessments. Such applications, speakers at SCDM explained, can shave precious weeks off study setup timelines, accelerating the overall development pathway.
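The next-word idea can be illustrated with a deliberately tiny sketch. It is an assumption-laden toy, not how ChatGPT or any production LLM is built: real models use neural networks trained on vast corpora, whereas this example simply counts which word follows which in a three-sentence, made-up corpus. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the follower counts for one word into probabilities."""
    followers = counts.get(word.lower())
    if not followers:
        return {}  # word never seen, so the toy model has no prediction
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}


# Made-up training text, for illustration only.
corpus = (
    "the patient reported mild headache "
    "the patient reported severe nausea "
    "the patient withdrew consent"
)

model = train_bigram(corpus)
probs = next_word_probabilities(model, "patient")
most_likely = max(probs, key=probs.get)
print(probs)
print(most_likely)  # "reported", which follows "patient" in 2 of 3 cases
```

The toy also hints at why training data matters so much: the model can only ever predict what it has already seen, in the proportions in which it has seen it.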

In the future, it is thought that generative AI could also be useful in developing simulated control groups, or “synthetic arms”, for clinical trials. Such an approach, though not yet fully validated, could significantly streamline study initiation, particularly in indications where a control group is hard to recruit or retain, such as rare diseases or those carrying imminent mortality risks.


Generative AI has its limitations, and we can see that systems and approaches are evolving to meet the needs of the biopharma space.

Firstly, the latest LLM iterations have been developed to introduce a degree of critical thinking. While previous versions would have generated a protocol for a pregnancy-related clinical trial in men without question, for example, newer models will now flag the inconsistency in the request. In addition, many of the larger pharmaceutical companies have developed their own LLMs to avoid the obvious IP and privacy-related dangers of feeding confidential clinical and trial data into publicly available models such as ChatGPT.

However, many questions remain.

At CluePoints, we believe there is a danger that the adoption of generative AI could widen the gap between large companies, which have access to decades of historical data, and smaller or younger organizations, which will need to rely on less accurate public data to build their models. The result would be a disparity in potential achievement, to the detriment of smaller companies and, ultimately, patients.

In addition, the industry is still unclear on how such models can be validated – a concern raised by FDA, EMA, and MHRA representatives in attendance at SCDM 2023. Traditional models of validation, which compare the outputs of new approaches to those of their gold standard predecessors, are not feasible when we consider that these models are continually learning and refining their results.

Human in the loop

The common perception that AI will displace skilled workers will also need to be overcome if the industry is to unlock the technologies’ potential.

We first need to accept that modern research, which is generating thousands more datapoints from dozens more data sources than ever before, is simply not possible without analytical models, whether AI/ML/DL or more traditional statistical techniques. ML and LLMs are essential tools in a world of increasingly complex clinical trials, empowering clinical data managers to make faster, more informed decisions.

Rather than resist change, we need to ensure systems are built on a human-in-the-loop model. That means delegating the repetitive, manually intensive tasks of data amalgamation, crawling, and pattern detection to machines, while retaining the vital, and arguably more interesting, critical thinking and decision-making functions in the hands of humans. Of course, some degree of skills evolution will be necessary, but the industry will likely need people to check that what the machines suggest actually makes sense until the technology reaches the highest degree of accuracy.
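One way to picture a human-in-the-loop workflow is sketched below in Python. Everything in it is hypothetical, including the class names, the record identifiers, the confidence scores, and the reviewer verdicts. The point is only the division of labor: the machine proposes and ranks findings, and nothing becomes final until a named person approves or rejects it.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    record_id: str
    finding: str
    confidence: float  # hypothetical model score between 0 and 1


@dataclass
class Decision:
    record_id: str
    finding: str
    approved: bool
    reviewer: str


def build_review_queue(suggestions):
    """Order suggestions so reviewers see the least certain ones first."""
    return sorted(suggestions, key=lambda s: s.confidence)


def record_decision(suggestion, approved, reviewer):
    """Store the human verdict alongside the machine's suggestion."""
    return Decision(suggestion.record_id, suggestion.finding, approved, reviewer)


# Machine output: hypothetical pattern-detection findings.
suggestions = [
    Suggestion("rec-001", "possible duplicate visit date", 0.97),
    Suggestion("rec-002", "implausible weight change", 0.55),
    Suggestion("rec-003", "missing adverse event end date", 0.80),
]

# Human input: verdicts a data manager might give after checking each record.
human_verdicts = {"rec-001": True, "rec-002": False, "rec-003": True}

queue = build_review_queue(suggestions)
decisions = [
    record_decision(s, human_verdicts[s.record_id], reviewer="data_manager_1")
    for s in queue
]
for d in decisions:
    print(d.record_id, "approved" if d.approved else "rejected", "by", d.reviewer)
```

Sorting by confidence is a design choice rather than a requirement. It simply directs human attention to the suggestions the machine is least sure about, while still keeping every decision in human hands.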

The next chapter

AI techniques such as DL and LLMs hold huge potential. They can make clinical research faster and more efficient, streamlining studies and shortening the path to market. Yet while some applications are already up and running in our industry, many others are still in their infancy.

SCDM 2023 illustrated that there is still some way to go before the promise of AI can be fully realized. We need, for example, more guidance on the ethical and compliance considerations, and on how to validate these new approaches for use in clinical trials.

Such answers will not come overnight, but we are already making headway. As long-term members of the SCDM family, we were delighted to see the 2023 conference was the society’s largest ever and look forward to continuing to work alongside colleagues from all parts of the industry, supporting discussion and sharing knowledge on the future of AI in clinical trials.
