Myths of Risk-Based Monitoring



Part I: The Myth of Reduced SDV

I have collaborated with many sponsors, regulators and CROs, and I want to highlight the various myths surrounding risk-based monitoring (RBM) in a series of articles. In this first article, I address the myth of reduced source data verification (SDV) and suggest a better way of looking at the problem of too much SDV. The industry has been debating the appropriate level of SDV for years.

RBM has been touted as a methodology for achieving lower levels of SDV on studies, saving money by decreasing the percentage of data that gets SDV’d. There is a much better way of looking at the problems of SDV. We can all agree that there is a lot of SDV on studies, and modern electronic data capture (EDC) systems contribute to an emphasis on data verification that becomes counterproductive. Because it is so easy to see how much data has been SDV’d at any point, monitors are incentivized to chase that metric, and the SDV percentage can become a holy grail, with very real consequences for those who fall behind for whatever reason.

Let’s only SDV important data

One assertion of RBM regarding reducing SDV is that not all data is the same: some data is more important than other data. By analyzing and agreeing on which data is important, we can prioritize which data points need to be SDV’d. This prioritization should result in less SDV and thus save time and money. Under this model, it is imperative to meet with the various groups during database design to discuss which fields should or should not have SDV performed on them.

Depending on the organization, there are two possible outcomes from this discussion. The first outcome is that all data is deemed important and the list of data that doesn’t need SDV is very small. In this situation, there are no immediate savings in SDV to be realized. However, this warrants further analysis. Are people being too possessive about their data? In my experience, the question to ask at this point is whether the data in question will actually be analyzed as part of the analysis plan: not just summary or descriptive analysis, but actual hypothesis-testing analysis. If the answer is no, then the data probably isn’t important. Answering that question will often get you to the second possible outcome even if we didn’t arrive there initially. The second outcome is that the list of important data is very small, so SDV is not required for the data deemed less important. This is the ideal situation for reducing SDV.

At this point, we need to ask why we are collecting data that isn’t important or isn’t part of the analysis plan. We should push back and collect only (or, in reality, mostly) important data that we will actually analyze into the EDC system. Once our database contains only important data, we should SDV all of it.
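As a thought experiment, the triage described above can be sketched in a few lines of code. The field names and the analysis-plan set below are purely hypothetical examples, not a real study design; the point is simply that any field absent from the analysis plan is a candidate for removal rather than for reduced SDV.

```python
# Hypothetical field triage for an EDC build. Field names and the
# analysis-plan set are illustrative assumptions only.

ANALYSIS_PLAN_FIELDS = {"primary_endpoint", "dose", "ae_term", "ae_severity"}

edc_fields = [
    "primary_endpoint", "dose", "ae_term", "ae_severity",
    "visit_room_number",   # never analyzed: candidate for removal
    "device_serial_log",   # never analyzed: candidate for removal
]

def triage(fields, plan):
    """Split fields into 'keep and SDV 100%' vs 'question why we collect it'."""
    keep = [f for f in fields if f in plan]
    question = [f for f in fields if f not in plan]
    return keep, question

keep, question = triage(edc_fields, ANALYSIS_PLAN_FIELDS)
print("Keep and SDV 100%:", keep)
print("Reconsider collecting:", question)
```

In this toy model, everything in `keep` gets full SDV, and everything in `question` goes back to the protocol team with the hypothesis-testing question above.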

Another approach to reducing SDV is to collect as much data as possible directly, without human data entry, so that the need for SDV is greatly reduced or eliminated. By using validated, direct data transfers from an automatically generated source, we can eliminate the need for SDV of that data. This is already common for central lab values, but it can be extended further with direct capture technologies such as eSource and electronic medical records. Much of this technology is still maturing, but in the long run it will pay large dividends by reducing the need for SDV of data captured this way.

For optimal efficiency, our real goal should be to get as close as possible to 100% SDV of the data that is manually transcribed, and to make sure that 100% of the data being captured is important. Streamlining the database to include only important data provides a much greater overall savings, as there is less data to process and manage, not just less data to perform SDV on.

Another thing to keep in mind when analyzing data importance is that not everything required by the protocol needs to be recorded in the EDC database. This is a challenge to get our heads around, because we have the idea that if the protocol stipulates a procedure, we must record it in the EDC database. That is not necessary at all. If the procedure is deemed critical, it does make sense to document in the monitoring plan that it will be checked. But unless the data is going to be analyzed, it doesn’t necessarily need to be recorded in the EDC database. For example, if the protocol requires specific monitoring to help identify AEs, the results of that monitoring do not necessarily need to be recorded in the EDC database; the monitoring process has this covered, with relevant source documentation in place to confirm it.

Using an elegant statistical algorithm to sample data for SDV

I have been asked many times, “What percentage of SDV is the right amount?” Another assertion of RBM is that we can define some elegant statistical sampling algorithm to determine which data will have SDV performed on it. There are lots of discussions in the industry about sampling algorithms and what the target percentage should be. For this method to be statistically valid, two requirements need to hold: the errors we are trying to detect need to be randomly distributed throughout the dataset, and the data points need to be independent of each other. This approach fails on both counts.

Human error in clinical trials is not randomly distributed; errors are not random events scattered evenly across the data. Certain data is more problematic, and thus contains more errors, than other data. Many factors come into play: the database design, the application being used, the use of defaults in data entry, the protocol, the site personnel, and so on. While there is a small background noise of truly random errors, for the most part errors cluster around specific data elements or case report form pages. Because errors are mostly not random, a random sample will tend to grossly underestimate them.
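A toy Monte Carlo sketch makes the clustering argument concrete. The numbers here are illustrative assumptions (10,000 data points, all errors concentrated on one problem CRF page): a uniform 10% sample surfaces only about 10% of the clustered errors, while spending the same review budget on the problem page surfaces essentially all of them.

```python
import random

random.seed(0)

N = 10_000                      # total data points in the study (assumed)
error_page = set(range(500))    # 5% of points sit on one problem CRF page
# Errors cluster on that page rather than spreading randomly:
errors = set(random.sample(sorted(error_page), 200))

def random_sdv(sample_frac):
    """Errors found by SDV on a uniform random sample of all data points."""
    sample = set(random.sample(range(N), int(N * sample_frac)))
    return len(sample & errors)

def targeted_sdv(budget):
    """Errors found by spending the same budget on the problem page first."""
    reviewed = set(list(error_page)[:budget])
    return len(reviewed & errors)

budget = int(N * 0.10)  # a 10% SDV budget either way
found_random = sum(random_sdv(0.10) for _ in range(100)) / 100
found_targeted = targeted_sdv(budget)
print(f"Random 10% sample finds ~{found_random:.0f} of 200 errors")
print(f"Targeted review, same budget, finds {found_targeted} of 200 errors")
```

The random sample is not wrong as an estimator, but it leaves roughly 90% of the clustered errors uncorrected, which is exactly the practical failure the paragraph above describes.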

The second problem is that errors are not independent of each other; they tend to cluster. And the idea of having a monitor do SDV on only the sampled data points is, in practice, not going to happen. Good monitors will want to review plenty of additional data to get context for the patient and for the specific data point that requires SDV. So although the algorithm may stipulate that the monitor needs to verify only 50% of the data points, depending on how that 50% is defined, monitors will likely review much more than that to get context and understand the patient. The savings from the reduction in required SDV will therefore not be as large as expected.
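This "context creep" can also be sketched with a small simulation. The model below is an assumption for illustration: each visit has 40 CRF fields, and for every sampled field the monitor also reads a few neighboring fields for context. Under those assumptions, a nominal 50% sample translates into near-total review, so the promised savings largely evaporate.

```python
import random

random.seed(1)

FIELDS = 40   # fields on one visit's CRF pages (illustrative assumption)
WINDOW = 3    # neighboring fields a monitor reads on each side for context

def effective_coverage(nominal_frac, trials=1000):
    """Average fraction of fields actually reviewed, given context reading."""
    k = int(FIELDS * nominal_frac)
    total = 0.0
    for _ in range(trials):
        sampled = random.sample(range(FIELDS), k)
        reviewed = set()
        for i in sampled:
            # Each sampled field pulls in its surrounding context window.
            reviewed.update(range(max(0, i - WINDOW), min(FIELDS, i + WINDOW + 1)))
        total += len(reviewed) / FIELDS
    return total / trials

print(f"Nominal 50% SDV -> effective ~{effective_coverage(0.5):.0%} reviewed")
```

With even a modest context window, the effective coverage sits well above the nominal sampling rate, which is why the billed savings of sampled SDV rarely materialize in the field.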

There is another factor in reduced-SDV algorithms that we often fail to acknowledge: trust. When someone finds an error in monitored data, the monitor is inevitably blamed: “This is such a simple error, why wasn’t it caught by the monitor?” Monitors know this and are afraid to let anything get by them. The reality is that, despite having an algorithm in place to reduce SDV, many monitors will do far more SDV to cover their bases. Keep in mind that there will be errors in the data even with the most stringent monitoring, and that is okay. Tough words to say, but we need to say them. The perfect data set does not exist. In my experience, these statistical algorithms for reducing SDV all fail when it comes to the practicality of monitors actually applying them in the field. Let’s focus on monitoring 100% of the data and making sure that it is all important.

What About the Term “Risk-Based Monitoring”?

Let’s go back to the term “risk-based monitoring.” Instead of focusing on how we can reduce SDV, we should increase the quality of our studies by focusing on risks, ensuring we have mitigations in place, and documenting that we are monitoring those mitigations. One aspect of that is to ensure we collect only important data, data that will be analyzed, in the EDC database, and that we do a thorough SDV of all the data we collect. Other data that does not get collected may be identified as a risk, with a mitigation put in place to check and document that those items are being monitored. The identified risks and the agreed mitigations then drive the monitoring at the site. The monitoring report needs to reflect that and document that the mitigations are being addressed. Any issues that arise as part of the monitoring process need to be documented and followed to closure. By focusing on what is important in the data monitoring process using a risk-and-mitigation approach, the SDV percentage becomes irrelevant.


At Kestrel Biologic, we are passionate about helping clinical operations run more efficient trials that implement risk-based monitoring. Through our iQROS™ platform and our embedded risk manager on every study, we can significantly reduce or even eliminate many of the pain points that sponsors are dealing with today. Together, we will make a difference. Let’s chat.

Call us at +1 (949) 200-8885 or send an e-mail to