Faillie JL, Ferrer P, Gouverneur A, Driot D, Berkemeyer S, Vidal X, Martínez-Zapata MJ, Huerta C, Castells X, Rottenkolber M, et al. A new risk of bias checklist applicable to randomized trials, observational studies, and systematic reviews was developed and validated to be used for systematic reviews focusing on drug adverse events. J.Clin.Epidemiol. Epub 2017 May 6. PMID: 28487158.

OBJECTIVES:
The objective of the study was to develop and validate an adequate tool to evaluate the risk of bias of randomized controlled trials, observational studies, and systematic reviews assessing drug adverse events.
STUDY DESIGN AND SETTING:
We developed a structured risk of bias checklist applicable to randomized trials, cohort, case-control and nested case-control studies, and systematic reviews focusing on drug safety. Face and content validity were judged by three experienced reviewers. Interrater and intrarater reliability were determined using 20 randomly selected studies, assessed by three other independent reviewers, including one performing a 3-week retest.
RESULTS:
The developed checklist examines eight domains: study design and objectives, selection bias, attrition, adverse events information bias, other information bias, statistical methods to control confounding, other statistical methods, and conflicts of interest. The total number of questions varied from 10 to 32 depending on the study design. Interrater and intrarater agreements were fair with Kendall's W of 0.70 and 0.74, respectively. Median time to complete the checklist was 8.5 minutes.
CONCLUSION:
The developed checklist showed face and content validity and acceptable reliability to assess the risk of bias for studies analyzing drug adverse events. Hence, it might be considered as a novel useful tool for systematic reviews and meta-analyses focusing on drug safety.
Copyright © 2017 Elsevier Inc. All rights reserved.

Full text of the checklist is available as online supplementary data.

DOI: http://dx.doi.org/10.1016/j.jclinepi.2017.04.023.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28487158.
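
A note on the statistic reported above: Kendall's W measures concordance among multiple raters scoring the same items. Below is a minimal Python sketch of the computation (without tie correction), using an invented rater-by-study matrix; it illustrates the statistic generically and is not the authors' analysis code.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(scores):
    """Kendall's coefficient of concordance W for an (m raters x n items)
    score matrix; 0 = no agreement, 1 = perfect agreement. No tie correction."""
    ranks = np.array([rankdata(row) for row in scores])  # rank items within each rater
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)                        # total rank per item
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()      # spread of the rank totals
    return 12 * s / (m**2 * (n**3 - n))

# Hypothetical data: 3 raters each scoring 20 studies on a 0-32 point scale.
rng = np.random.default_rng(0)
quality = rng.uniform(0, 32, size=20)
scores = quality + rng.normal(0, 3, size=(3, 20))        # raters agree up to noise
print(f"Kendall's W = {kendalls_w(scores):.2f}")
```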

Fabritius ML, Wetterslev J, Dahl JB, Mathiesen O. An urgent call for improved quality of trial design and reporting in postoperative pain research. Acta Anaesthesiol.Scand. 2017 Jan;61(1):8-10. PMID: 27726129.
DOI: http://dx.doi.org/10.1111/aas.12820.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=27726129.

Ferrante di Ruffano L, Dinnes J, Sitch AJ, Hyde C, Deeks JJ. Test-treatment RCTs are susceptible to bias: a review of the methodological quality of randomized trials that evaluate diagnostic tests. BMC Med.Res.Methodol. 2017 Feb 24;17(1):35. PMID: 28236806.

BACKGROUND:
There is growing recognition of the need to expand our evidence base for the clinical effectiveness of diagnostic tests. Many international bodies are calling for diagnostic randomized controlled trials to provide the most rigorous evidence of impact on patient health. Although these so-called test-treatment RCTs are very challenging to undertake due to their methodological complexity, they have not been subjected to a systematic appraisal of their methodological quality. The extent to which these trials may be producing biased results therefore remains unknown. We set out to address this issue by conducting a methodological review of published test-treatment trials to determine how often they implement adequate methods to limit bias and safeguard the validity of results.
METHODS:
We ascertained all test-treatment RCTs published 2004-2007, indexed in CENTRAL, including RCTs which randomized patients to diagnostic tests and measured patient outcomes after treatment. Tests used for screening, monitoring or prognosis were excluded. We assessed adequacy of sequence generation, allocation concealment and intention-to-treat, appropriateness of primary analyses, blinding and reporting of power calculations, and extracted study characteristics including the primary outcome.
RESULTS:
One hundred three trials compared 105 control with 119 experimental interventions and reported 150 primary outcomes. Randomization and allocation concealment were adequate in 57% and 37% of trials, respectively. Blinding was uncommon (patients 5%, clinicians 4%, outcome assessors 21%), as was an adequate intention-to-treat analysis (29%). Overall, 101 of 103 trials (98%) were at risk of bias, as judged using standard Cochrane criteria.
CONCLUSION:
Test-treatment trials are particularly susceptible to attrition, inadequate primary analyses, lack of blinding, and underpowering. These weaknesses pose much greater methodological and practical challenges to conducting reliable RCT evaluations of test-treatment strategies than of standard treatment interventions. We suggest a cautious approach that first examines whether a test-treatment intervention can accommodate the methodological safeguards necessary to minimize bias, and highlight that test-treatment RCTs require different methods to ensure reliability than standard treatment trials. Please see the companion paper to this article: http://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-016-0286-0.

FREE FULL TEXT: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5326492/pdf/12874_2016_Article_287.pdf
DOI: http://dx.doi.org/10.1186/s12874-016-0287-z.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28236806.
PubMed Central: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5326492.

Grindlay D. Search strategies for finding systematic reviews. Br.J.Dermatol. 2017 Jun;176(6):1672. PMID: 28295197.

[First paragraph]

This study makes a very important point about the influence of funding sources and conflicts of interest on the methodological quality of systematic reviews. However, I have some concerns about the search strategy used to find systematic reviews for this analysis. There seems to be a recent trend for more overviews of systematic reviews or so-called umbrella reviews to be published, and I have observed that many share the same issues in the search strategies used. This study used only the search terms ‘meta-analysis’ and ‘systematic review’ to find systematic reviews, and, as a result, some relevant systematic reviews would not be found in the searches.

DOI: http://dx.doi.org/10.1111/bjd.15455.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/28295197.
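
To make the letter's point concrete, the sketch below contrasts a narrow free-text search for systematic reviews with a broader one that adds PubMed's publication-type and subset filters. The query strings are illustrative assumptions rather than the strategies discussed in the letter, and the contact email is a placeholder.

```python
from Bio import Entrez  # Biopython

Entrez.email = "you@example.org"  # placeholder; NCBI requires a contact address

def pubmed_count(query):
    """Return the number of PubMed records matching a query."""
    handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
    count = int(Entrez.read(handle)["Count"])
    handle.close()
    return count

# Narrow strategy of the kind criticized above: free-text phrases only.
narrow = '"systematic review"[tiab] OR "meta-analysis"[tiab]'
# Broader (illustrative) strategy: add indexing-based filters.
broad = (narrow
         + ' OR "Meta-Analysis"[pt]'  # indexed publication type
         + ' OR systematic[sb]')      # PubMed's systematic-review subset

print("narrow:", pubmed_count(narrow))
print("broad: ", pubmed_count(broad))
```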

Hills RK. Non-inferiority trials: No better? No worse? No change? No pain? Br.J.Haematol. 2017 Mar;176(6):883-7. PMID: 28106905.

With improvements in care over time, it becomes harder to improve clinical outcomes in conditions where cure rates are high. The focus of research can thus turn to the so-called non-inferiority trial, where the main aim is not to improve clinical outcome but instead to provide evidence of a lack of difference, whilst other issues, such as cost or toxicity, are improved. The interpretation of such trials is not always straightforward. The burden of proof is reversed compared with a traditional superiority trial, which means that many of the statistical safeguards, such as significance and intention-to-treat, that act as restraints against the overhasty adoption of a new therapy may actually work in the opposite fashion. The issues regarding non-inferiority and equivalence trials are considered, and their interpretation is discussed.

DOI: http://dx.doi.org/10.1111/bjh.14504.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28106905.
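
As a concrete illustration of the reversed burden of proof Hills describes, this sketch declares non-inferiority on a binary outcome when the upper confidence bound of the excess risk falls below a pre-specified margin. The event counts and the 5-percentage-point margin are invented for illustration.

```python
from math import sqrt

def noninferior(events_new, n_new, events_std, n_std, margin, z=1.96):
    """Non-inferiority holds if the upper 95% CI bound of the excess risk
    (new minus standard, for a harmful outcome) lies below the margin."""
    p_new, p_std = events_new / n_new, events_std / n_std
    rd = p_new - p_std                                    # risk difference
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    upper = rd + z * se
    return rd, upper, upper < margin

# Hypothetical trial: 80/1000 failures on the new therapy vs 70/1000 on standard.
rd, upper, ok = noninferior(80, 1000, 70, 1000, margin=0.05)
print(f"risk difference = {rd:+.3f}, upper 95% bound = {upper:+.3f}, non-inferior: {ok}")
```

Note that the verdict turns on a confidence bound rather than on statistical significance, which is why sloppy conduct that dilutes between-arm differences, ordinarily penalized in a superiority trial, can instead push a non-inferiority trial towards a favourable conclusion.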

Huttner A, Kaiser L, CMI Editors. Fair reporting of study results. Clin.Microbiol.Infect. 2017 Jun;23(6):345-6. PMID: 28232166.

[First paragraph, reference html links removed]

Of all studies conducted, fewer than half see their results published. Disappointingly, the advent of free study registries such as EudraCT and ClinicalTrials.gov has not led to much improvement [1]. Results that do see the light of day are overwhelmingly positive: studies with statistically significant results are considerably more likely to be published than those finding no difference between study groups. This publication bias afflicts observational studies (the most commonly conducted study type) disproportionately: these are four times more likely to be published if they can report statistically significant differences among groups [2].

DOI: http://dx.doi.org/10.1016/j.cmi.2017.02.015.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/28232166.

Ioannidis JPA. Hijacked evidence-based medicine: stay the course and throw the pirates overboard. J.Clin.Epidemiol. 2017 Apr;84:11-3. PMID: 28532611.

The article discusses a number of criticisms that have been raised against evidence-based medicine, such as focusing on benefits and ignoring adverse events; being interested in averages and ignoring the wide variability in individual risks and responsiveness; ignoring clinician-patient interaction and clinical judgement; leading to some sort of reductionism; and falling prey to corruption from conflicts of interest. I argue that none of these deficiencies is necessarily inherent to evidence-based medicine. In fact, work in evidence-based medicine has contributed a great deal towards minimizing these deficiencies in medical research and medical care. However, evidence-based medicine is paying the price of its success: having become more widely recognized, it is manipulated and misused to support subverted or perverted agendas that are hijacking its reputation value. Sometimes the conflicts behind these agendas are so strong that one worries about whether the hijacking of evidence-based medicine is reversible. Nevertheless, evidence-based medicine is a valuable conceptual toolkit, and it is worth trying to remove the biases of the pirates who have hijacked its ship.

DOI: https://doi.org/10.1016/j.jclinepi.2017.02.001.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28532611.

Matsumoto H, Kataoka Y. Risk of Bias and Heterogeneity. JAMA Oncol. 2017 Jun 1;3(6):857-8. PMID: 28334363.

[First paragraph, reference html links removed]

First, we are concerned that their systematic review lacked an assessment of the risk of bias that may affect the cumulative evidence. Their systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The PRISMA statement consists of a 27-item checklist including risk of bias across studies [2] and states that "authors should provide a rationale if no assessment of risk of bias was undertaken" [2]. We ask whether risk of bias should be assessed using some established tools [3].

Comment on:

Nishino M, Giobbie-Hurder A, Hatabu H, Ramaiya NH, Hodi FS. Incidence of Programmed Cell Death 1 Inhibitor-Related Pneumonitis in Patients With Advanced Cancer: A Systematic Review and Meta-analysis. JAMA Oncol. 2016 Dec 1;2(12):1607-16. PMID: 27540850. DOI: http://dx.doi.org/10.1001/jamaoncol.2016.2453.

DOI: http://dx.doi.org/10.1001/jamaoncol.2017.0164.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28334363.

Nguyen TL, Collins GS, Lamy A, Devereaux PJ, Daures JP, Landais P, Le Manach Y. Simple Randomization Did not Protect Against Bias in Smaller Trials. J.Clin.Epidemiol. 2017 Apr;84:105-13. PMID: 28257927.

OBJECTIVES:
By removing systematic differences across treatment groups, simple randomization is assumed to protect against bias. However, random differences may remain if the sample size is insufficiently large. We sought to determine the minimal sample size required to eliminate random differences, thereby allowing an unbiased estimation of the treatment effect.
STUDY DESIGN AND SETTING:
We reanalyzed two published multicenter, large, and simple trials: the International Stroke Trial (IST) and the Coronary Artery Bypass Grafting (CABG) Off- or On-Pump Revascularization Study (CORONARY). We repeated the analysis originally reported by the investigators 1,000 times in random samples of varying size. We measured covariate balance across the treatment arms. We estimated the effect of aspirin and heparin on death or dependency at 30 days after stroke (IST), and the effect of off-pump CABG on a composite primary outcome of death, nonfatal stroke, nonfatal myocardial infarction, or new renal failure requiring dialysis at 30 days (CORONARY). In addition, we conducted a series of Monte Carlo simulations of randomized trials to supplement these analyses.
RESULTS:
Randomization removes random differences between treatment groups when at least 1,000 participants are included, resulting in minimal bias in effect estimation. Below this threshold, substantial bias is observed. In a short review, we show that such an enrollment is achieved in 41.5% of phase 3 trials published in the highest-impact medical journals.
CONCLUSIONS:
Conclusions drawn from completely randomized trials enrolling few participants may not be reliable. In these circumstances, alternatives such as minimization or blocking should be considered for allocating the treatment.
Copyright © 2017 Elsevier Inc. All rights reserved.

DOI: https://doi.org/10.1016/j.jclinepi.2017.02.010.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/28257927.
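
The paper's Monte Carlo argument is easy to reproduce in outline. The sketch below simulates simple (coin-flip) randomization at several sample sizes and records the between-arm imbalance in one standardized prognostic covariate; it is a schematic illustration of the phenomenon, not the authors' reanalysis of IST or CORONARY.

```python
import numpy as np

rng = np.random.default_rng(42)

def mean_imbalance(n, reps=1000):
    """Average absolute between-arm difference in a standardized covariate
    under simple randomization of n participants."""
    diffs = []
    for _ in range(reps):
        covariate = rng.normal(0, 1, size=n)  # prognostic factor
        arm = rng.integers(0, 2, size=n)      # coin-flip allocation
        if arm.sum() in (0, n):               # degenerate split; skip
            continue
        diffs.append(abs(covariate[arm == 1].mean() - covariate[arm == 0].mean()))
    return float(np.mean(diffs))

for n in (20, 50, 100, 500, 1000, 5000):
    print(f"n = {n:>5}: mean |imbalance| = {mean_imbalance(n):.3f}")
```

The imbalance shrinks roughly as 1/sqrt(n), which is the paper's point: only large trials reliably wash out random covariate differences between arms.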

Palmas W. The CONSORT guidelines for noninferiority trials should be updated to go beyond the absolute risk difference. J.Clin.Epidemiol. 2017 Mar;83:6-7. PMID: 28093264.

HIGHLIGHTS
• Noninferiority clinical trials create important ethical challenges, and a wide societal discussion of those challenges appears warranted.
• However, the results of many noninferiority trials are reported mostly as the absolute risk difference between the newer treatment and the standard of care, a format that may be uninformative for clinicians.
• This commentary uses, as a case study, a recent publication to show how the relative risk and the number needed to harm may add clarity to the conversation (see the worked sketch after this list).
• It proposes a change to the CONSORT guidelines to require the inclusion of those statistics in the Abstract and Discussion sections of future articles.
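
A minimal numeric sketch of the commentary's point, using invented trial results: the same harm data look reassuring as an absolute risk difference but much less so as a relative risk and number needed to harm.

```python
def harm_summaries(events_new, n_new, events_std, n_std):
    """Absolute risk difference, relative risk, and number needed to harm
    for a harmful outcome, comparing a new treatment with standard care."""
    risk_new, risk_std = events_new / n_new, events_std / n_std
    ard = risk_new - risk_std
    rr = risk_new / risk_std
    nnh = 1 / ard if ard > 0 else float("inf")
    return ard, rr, nnh

# Hypothetical noninferiority trial: 30/1000 harmful events on the new drug
# vs 20/1000 on the standard of care.
ard, rr, nnh = harm_summaries(30, 1000, 20, 1000)
print(f"absolute risk difference = {ard:.1%}")  # 1.0% -- looks negligible
print(f"relative risk            = {rr:.2f}")   # 1.50 -- a 50% relative increase
print(f"number needed to harm    = {nnh:.0f}")  # one extra event per 100 treated
```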

DOI: https://doi.org/10.1016/j.jclinepi.2016.12.014.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28093264.

Perlmutter AS, Tran VT, Dechartres A, Ravaud P. Statistical controversies in clinical research: comparison of primary outcomes in protocols, public clinical-trial registries and publications: the example of oncology trials. Ann.Oncol. 2017 Apr 1;28(4):688-95. PMID: 28011448.

Background:
Protocols are often unavailable to peer-reviewers and readers. To detect outcome reporting bias (ORB), readers usually have to resort to publicly available descriptions of study design such as public clinical trial registries. We compared primary outcomes in protocols, ClinicalTrials.gov and publications of oncology trials and evaluated the use of ClinicalTrials.gov as compared with protocols in detecting discrepancies between planned and published outcomes.
Method:
We searched for phase III oncology trials registered in ClinicalTrials.gov and published in the Journal of Clinical Oncology and New England Journal of Medicine between January 2014 and June 2015. We extracted primary outcomes reported in the protocol, ClinicalTrials.gov and the publication. First, we assessed the quality of primary outcome descriptions by using a published framework. Second, we evaluated modifications of primary outcomes between each source. Finally, we evaluated the agreement, specificity and sensitivity of detecting modifications between planned and published outcomes by using protocols or ClinicalTrials.gov.
Results:
We included 65 trials, with 81 primary outcomes common among the 3 sources. The proportion of primary outcomes reporting all items from the framework was 73%, 22%, and 75% for protocols, ClinicalTrials.gov and publications, respectively. Eight (12%) trials presented a discrepancy between primary outcomes reported in the protocol and in the publication. Twelve (18.5%) trials presented a discrepancy between primary outcomes registered at ClinicalTrials.gov and in publications. We found a moderate agreement in detecting discrepant reporting of outcomes by using protocols or ClinicalTrials.gov [κ = 0.53, 95% confidence interval 0.25-0.81]. Using ClinicalTrials.gov to detect discrepant reporting of outcomes showed high specificity (89.5%) but lacked sensitivity (75%) as compared with use of protocols.
Conclusion:
In oncology trials, primary outcome descriptions in ClinicalTrials.gov are often of low quality and may not reflect what is in the protocol, thus limiting the detection of modifications between planned and published outcomes.

DOI: http://dx.doi.org/10.1093/annonc/mdw682.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28011448.
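
The agreement statistics above can be reconstructed from a 2×2 cross-classification of trials by whether the registry and the protocol each flag an outcome discrepancy. The cell counts below are hypothetical, chosen only to be consistent with the abstract's figures (the actual cross-tabulation is not reported there); the code shows the mechanics of κ, sensitivity, and specificity.

```python
def agreement_stats(both, registry_only, protocol_only, neither):
    """Cohen's kappa, plus sensitivity/specificity of the registry for
    detecting discrepancies when the protocol is taken as the reference."""
    n = both + registry_only + protocol_only + neither
    po = (both + neither) / n                       # observed agreement
    p_reg = (both + registry_only) / n              # registry flags a discrepancy
    p_pro = (both + protocol_only) / n              # protocol flags a discrepancy
    pe = p_reg * p_pro + (1 - p_reg) * (1 - p_pro)  # agreement expected by chance
    kappa = (po - pe) / (1 - pe)
    sensitivity = both / (both + protocol_only)
    specificity = neither / (neither + registry_only)
    return kappa, sensitivity, specificity

# Hypothetical counts for the 65 trials.
kappa, sens, spec = agreement_stats(both=6, registry_only=6, protocol_only=2, neither=51)
print(f"kappa = {kappa:.2f}, sensitivity = {sens:.0%}, specificity = {spec:.1%}")
```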

Rogozinska E, Khan K. Grading evidence from test accuracy studies: what makes it challenging compared with the grading of effectiveness studies? Evid.Based Med. 2017 Jun;22(3):81-4. PMID: 28600330.

Guideline panels need to process a sizeable amount of information to issue a decision on whether or not to recommend a health technology. Grading of Recommendations Assessment, Development, and Evaluation (GRADE) is frequently applied in guideline development to facilitate this task, typically for the synthesis of effectiveness research. Questions regarding the accuracy of medical tests are ubiquitous, and they temporally precede questions about therapy. However, the literature summarising the experience of applying the GRADE approach to accuracy evaluations is not as rich as that for effectiveness evidence. The cross-sectional study design, the two-dimensional nature of the performance measures (sensitivity and specificity), the propensity towards a higher level of between-study heterogeneity, poor reporting of quality features, and uncertainty about how best to assess publication bias, among other features, make this task challenging. This article presents solutions adopted to address the above challenges for judicious estimation of the strength of test accuracy evidence used to inform evidence syntheses for guideline development.

DOI: http://dx.doi.org/10.1136/ebmed-2017-110717.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28600330.

Shekelle PG, Shetty K, Newberry S, Maglione M, Motala A. Machine Learning Versus Standard Techniques for Updating Searches for Systematic Reviews: A Diagnostic Accuracy Study. Ann.Intern.Med. 2017 Jun 13. PMID: 28605762.

[First paragraph]

Background: Systematic reviews are a cornerstone of evidence-based care and a necessary foundation for care recommendations to be labeled clinical practice guidelines. However, they become outdated relatively quickly and require substantial resources to maintain relevance. One particularly time-consuming task is updating the search to identify relevant articles published since the last search.

DOI: http://dx.doi.org/10.7326/L17-0124.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28605762.
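
The excerpt names the task but not the method, so the following is a generic sketch of the kind of classifier typically evaluated for search updating: TF-IDF features over titles and abstracts feeding a logistic-regression screener that ranks new citations by predicted relevance. It is an assumption-laden illustration, not the system the authors tested, and the toy labelled data are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: abstracts screened for the original review (1 = included).
abstracts = [
    "randomized trial of statin therapy for primary prevention",
    "cohort study of dietary patterns and cardiovascular outcomes",
    "randomized controlled trial of statins after myocardial infarction",
    "case report of an adverse reaction to a lipid-lowering agent",
]
labels = [1, 0, 1, 0]

screener = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
screener.fit(abstracts, labels)

# Rank citations retrieved by the updated search by predicted relevance.
new_hits = [
    "statin trial with randomized allocation in elderly patients",
    "editorial on guideline development",
]
for text, p in zip(new_hits, screener.predict_proba(new_hits)[:, 1]):
    print(f"{p:.2f}  {text}")
```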

Smith AR, Rule S, Price PC. Sample size bias in retrospective estimates of average duration. Acta Psychol. 2017 Mar 25;176:39-46. PMID: 28351001.

People often estimate the average duration of several events (e.g., on average, how long does it take to drive from one's home to his or her office). While there is a great deal of research investigating estimates of duration for a single event, few studies have examined estimates when people must average across numerous stimuli or events. The current studies were designed to fill this gap by examining how people's estimates of average duration were influenced by the number of stimuli being averaged (i.e., the sample size). Based on research investigating the sample size bias, we predicted that participants' judgments of average duration would increase as the sample size increased. Across four studies, we demonstrated a sample size bias for estimates of average duration with different judgment types (numeric estimates and comparisons), study designs (between and within-subjects), and paradigms (observing images and performing tasks). The results are consistent with the more general notion that psychological representations of magnitudes in one dimension (e.g., quantity) can influence representations of magnitudes in another dimension (e.g., duration).

DOI: http://dx.doi.org/10.1016/j.actpsy.2017.03.008.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28351001.

Spencer-Bonilla G, Singh Ospina N, Rodriguez-Gutierrez R, Brito JP, Iñiguez-Ariza N, Tamhane S, Erwin PJ, Murad MH, Montori VM. Systematic reviews of diagnostic tests in endocrinology: an audit of methods, reporting, and performance. Endocrine. 2017 Jul;57(1):18-34. PMID: 28585154.

BACKGROUND:
Systematic reviews provide clinicians and policymakers estimates of diagnostic test accuracy and their usefulness in clinical practice. We identified all available systematic reviews of diagnosis in endocrinology, summarized the diagnostic accuracy of the tests included, and assessed the credibility and clinical usefulness of the methods and reporting.
METHODS:
We searched Ovid MEDLINE, EMBASE, and Cochrane CENTRAL from inception to December 2015 for systematic reviews and meta-analyses reporting accuracy measures of diagnostic tests in endocrinology. Experienced reviewers independently screened for eligible studies and collected data. We summarized the results, methods, and reporting of the reviews. We performed subgroup analyses to categorize diagnostic tests as most useful based on their accuracy.
RESULTS:
We identified 84 systematic reviews; half of the tests included were classified as helpful when positive, one-fourth as helpful when negative. Most authors adequately reported how studies were identified and selected and how their trustworthiness (risk of bias) was judged. Only one in three reviews, however, reported an overall judgment about trustworthiness and one in five reported using adequate meta-analytic methods. One in four reported contacting authors for further information and about half included only patients with diagnostic uncertainty.
CONCLUSION:
Up to half of the diagnostic endocrine tests in which the likelihood ratio was calculated or provided are likely to be helpful in practice when positive, as are one-quarter when negative. Most diagnostic systematic reviews in endocrinology lack methodological rigor and protection against bias, and offer limited credibility. Substantial efforts, therefore, seem necessary to improve the quality of diagnostic systematic reviews in endocrinology.

DOI: http://dx.doi.org/10.1007/s12020-017-1298-1.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28585154.
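
The "helpful when positive/negative" classification rests on likelihood ratios. The sketch below computes both from sensitivity and specificity; the thresholds in the comments (LR+ ≥ 10, LR− ≤ 0.1) are a common rule of thumb assumed here for illustration, since the abstract does not state the review's exact cut-offs.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a diagnostic test."""
    lr_pos = sensitivity / (1 - specificity)  # factor by which a positive result raises the odds
    lr_neg = (1 - sensitivity) / specificity  # factor by which a negative result lowers the odds
    return lr_pos, lr_neg

# Hypothetical test: 90% sensitivity, 95% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
print(f"LR+ = {lr_pos:.1f}")   # 18.0 -> helpful when positive (>= 10)
print(f"LR- = {lr_neg:.2f}")   # 0.11 -> just misses 'helpful when negative' (<= 0.1)
```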

Stern RS. 'Whoever has will be given more' (Matthew 13:12): authorship of systematic reviews and clinical trials in the biological age. Br.J.Dermatol. 2017 Jun;176(6):1422-4. PMID: 28581224.
DOI: http://dx.doi.org/10.1111/bjd.15568.
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/?term=28581224.