4.3. Are any baseline data implausible?
- The reviewer should consider the plausibility of the baseline characteristics.
- ‘Plausibility’ includes clinical or biological plausibility and numerical plausibility. Domain knowledge is necessary to judge clinical or biological plausibility.
- It is important to remember that participants in a clinical trial may not be representative of any particular patient population, and so characteristics of trial participants are not expected to be “typical”. Even if a reasonably representative sample is achieved, random variation means that the characteristics of the sample may not match those of the target population, and this does not indicate a problem.
- The reviewer should consider the magnitude, frequency, variance, and repetition of the values reported for distinct measurements within a table.
- Patterns seen in known problematic studies include an excess of even or odd numbers and an excess of multiples of 5.
- There are proposals to formally assess the degree of balance in baseline characteristics using one of several methods. These balance checks may be useful when applied appropriately by researchers with an understanding of the underlying methodology. However, these methods may malfunction if not used correctly, potentially leading to spurious concerns. As such, the routine use of these methods by non-experts is not recommended at present.
- The routine use of methods to assess digit distribution (for example, for conformity with Benford’s law) is not recommended. Benford’s law is not expected to hold for the majority of variables found in RCT baseline tables.
- The reviewer should consider whether unusual values could be due to reporting errors (for example, standard errors reported instead of standard deviations). Such errors may not warrant concerns about trustworthiness, but they would need to be corrected if the reviewer uses the data in a meta-analysis.
- The answer to this check should contribute to a domain-level judgement.
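As an illustration of the reporting-error point above (standard errors reported instead of standard deviations), the following is a minimal sketch of how a reviewer might test that explanation. It relies only on the standard relationship SD = SE × √n. The function name and the numbers are hypothetical and are not part of this guidance. If the implied SD is clinically sensible while the value as printed is not, a labelling error becomes a reasonable explanation.

```python
import math


def sd_implied_by_se(reported_value: float, n: int) -> float:
    """Return the SD implied if `reported_value` is actually a standard error.

    Uses the standard relationship SD = SE * sqrt(n), where n is the
    number of participants in the group.
    """
    if n <= 0:
        raise ValueError("n must be a positive group size")
    return reported_value * math.sqrt(n)


# Hypothetical example (invented numbers): age is reported with an "SD" of
# 1.2 years in a group of 100 adults, which looks implausibly small.
implied_sd = sd_implied_by_se(1.2, 100)
print(f"SD implied if the reported value is an SE: {implied_sd:.1f}")
```

The sketch only shows whether a mislabelled standard error is arithmetically consistent with the table. Whether the implied SD is plausible remains a matter for domain knowledge.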
Example of check 4.3
A baseline table contains identical values for means, SDs or range limits for both study groups for nine of 11 reported baseline characteristics, reported to two decimal places. The reviewer judges that this is unlikely to be explained by the method of allocation used in the study (which was simple randomisation) and answers “yes” for this check. This response contributes to the domain-level judgement.
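The tally in this example can be made by hand, but for a long table a reviewer might prefer to count matches programmatically. The sketch below is one possible way to do so, not a method prescribed by this guidance. The data layout, the function name and all values are invented for illustration. Values are kept as strings so that they are compared exactly as printed in the table. The result is a descriptive count only, not a formal test of balance.

```python
from typing import Dict, List, Tuple

# Each characteristic maps to (group A statistics, group B statistics).
# Statistics are stored as strings, exactly as printed in the table.
Row = Tuple[Dict[str, str], Dict[str, str]]


def characteristics_with_identical_values(table: Dict[str, Row]) -> List[str]:
    """List characteristics where any shared statistic is identical in both groups."""
    matches = []
    for name, (group_a, group_b) in table.items():
        shared_stats = set(group_a) & set(group_b)
        if any(group_a[stat] == group_b[stat] for stat in shared_stats):
            matches.append(name)
    return matches


# Hypothetical baseline table (invented values, for illustration only).
hypothetical_table: Dict[str, Row] = {
    "age_years": ({"mean": "54.31", "sd": "8.20"}, {"mean": "54.31", "sd": "8.20"}),
    "bmi": ({"mean": "27.45", "sd": "3.10"}, {"mean": "27.90", "sd": "3.10"}),
    "systolic_bp": ({"mean": "131.20", "sd": "12.44"}, {"mean": "129.87", "sd": "11.95"}),
}

flagged = characteristics_with_identical_values(hypothetical_table)
print(f"{len(flagged)} of {len(hypothetical_table)} characteristics share a value: {flagged}")
```

The count still has to be interpreted by the reviewer, who should take into account the number of decimal places reported, the sample size and the method of allocation.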