Six Red Flags for Identifying Falsified Data in Randomized Controlled Trial Publications

Randomized controlled trials (RCTs) are often upheld as the gold standard of medical research. As such, the academic community has been buzzing about news articles from Nature and Science reporting on a research study by John Carlisle. In his study, Carlisle claims that, of 526 evaluated manuscripts reporting RCTs, 73 (14%) had falsified data. Carlisle’s alarming findings have sparked a broader discussion within the academic community about the trustworthiness of published research and the plague of false data in global healthcare research.

Nature, Science, and other outlets primarily focus on the systemic causes of this problem and on solutions to prevent it from occurring in the first place. While these are crucial conversations, the proposed industry shifts and large-scale changes will take time. In the interim, healthcare professionals can take immediate steps of their own. In this article, we’ll discuss some key things that everyone in the healthcare field, from administrators to practitioners, should know about how to identify and react to falsified data in research.

What’s considered ‘falsified data’?

When considering falsified data, the first thing to come to mind is blatant data fabrication. This can involve inventing new information about experiments that were never conducted or modifying the results of performed experiments to better match a study’s hypothesis or produce impressive results.

However, Carlisle’s research suggests that this type of data fabrication is relatively uncommon. Instead, Carlisle explains that most instances of falsified data arise from errors, such as duplicated data entries and typos. While these errors are much less ethically concerning, the unfortunate outcome remains the same: inaccurate data and findings have entered our collective medical knowledge, potentially leading to misinterpretations of results and incorrect conclusions that impact clinical practice.

How can falsified data be identified?

Unfortunately, spotting falsified data can be a highly challenging and time-consuming task. In his research, Carlisle requested anonymized individual participant data for each study and meticulously scrutinized the datasets—most of which spanned hundreds of rows and columns. He checked for suspicious patterns, such as duplicated, missing, or repeated data sequences.
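As a rough illustration of the pattern checks described above, the following Python sketch flags duplicated rows, repeated runs of values within a column, and missing entries. It is not Carlisle’s actual procedure: the function names, the example columns (age, systolic blood pressure, HbA1c), and the participant values are all hypothetical and exist only to show the idea.

```python
from collections import defaultdict


def find_duplicate_rows(rows):
    """Map each row that appears more than once to the indices where it occurs."""
    seen = defaultdict(list)
    for index, row in enumerate(rows):
        seen[tuple(row)].append(index)
    return {values: indices for values, indices in seen.items() if len(indices) > 1}


def find_repeated_runs(column, run_length=3):
    """Map each run of `run_length` consecutive values to its start positions,
    keeping only runs that occur at more than one position in the column.
    Overlapping occurrences are counted, so long constant stretches are flagged too."""
    starts_by_run = defaultdict(list)
    for start in range(len(column) - run_length + 1):
        run = tuple(column[start:start + run_length])
        starts_by_run[run].append(start)
    return {run: starts for run, starts in starts_by_run.items() if len(starts) > 1}


def count_missing(rows, missing_markers=(None, "")):
    """Count missing entries per column index (columns with none are omitted)."""
    counts = defaultdict(int)
    for row in rows:
        for column_index, value in enumerate(row):
            if value in missing_markers:
                counts[column_index] += 1
    return dict(counts)


# Hypothetical participant data: (age, systolic blood pressure, HbA1c).
participant_rows = [
    (54, 120, 7.1),
    (61, 135, 6.8),
    (47, 118, 7.4),
    (54, 120, 7.1),   # same values as row 0
    (61, 135, 6.8),   # same values as row 1
    (39, None, 6.9),  # missing blood pressure
]

duplicates = find_duplicate_rows(participant_rows)
systolic = [row[1] for row in participant_rows]
repeated_runs = find_repeated_runs(systolic, run_length=2)
missing = count_missing(participant_rows)
```

A flagged row or run is only a prompt for closer inspection, since identical values can also arise legitimately in small or coarsely measured datasets.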

In a similar effort to expose fraudulent clinical trial reports, Mol et al. identified significant data replication and outcome duplication across separate studies published by the same author team. Like Carlisle, they also observed data repetition and unexpected internal consistency between patients’ outcomes and laboratory values, as well as incorrect statistical calculations, such as P values.

For modern healthcare professionals, conducting such in-depth analysis is often impractical. Still, there are some critical warning signs to be vigilant for when reading a scientific research report.

  1. Notice when key methods information is missing. Mol et al. found that several allegedly fraudulent articles did not disclose their recruitment sites or exact study time period, making it harder for skeptical readers to verify the study’s legitimacy. Check to see whether critical information that could be used to hold authors accountable for their data is missing.
  2. Stay alert for internal inconsistencies. If a research paper reports different overall survival rates in its abstract, results, and figures, it’s worth delving deeper into the details.
  3. Check the authors’ history. Researchers with a history of misconduct, undisclosed conflicts of interest, academic scandal, or manuscript retractions may be more likely to produce unreliable results.
  4. Trust your instincts. If the results of a study feel too good to be true, be sure to investigate further. Integrity is critical in healthcare research, and raising your concerns to the journal editors is a responsible action.
  5. Evaluate the journal. While no journal is infallible in detecting research misconduct, some journals have more rigorous integrity checks than others. Highly respected journals in your field are more likely to invest substantial time and resources in integrity evaluations than smaller journals or those that are dedicated to fast publication turnaround times.
  6. Discuss with colleagues. Seek input from peers, whether within your organization or through online communities of healthcare professionals in your specific subject area, to gain others’ perspectives on the paper’s credibility.
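The internal-consistency check in item 2 can be partly done with simple arithmetic. The sketch below, with hypothetical function names, tests whether a reported percentage agrees with its reported count and total, and whether the same outcome is reported with different values in different parts of a paper. The 73-of-526 figure comes from the Carlisle study discussed above; the survival rates are invented for illustration.

```python
def percent_is_consistent(count, total, reported_percent, decimals=0):
    """Return True if reported_percent matches count/total once rounding
    to `decimals` decimal places is allowed for."""
    actual = 100 * count / total
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9  # half a unit in the last reported digit
    return abs(actual - reported_percent) <= tolerance


def values_disagree(values_by_section, tolerance=0.05):
    """Return True if the same quantity is reported with values that differ
    by more than `tolerance` across sections of a paper."""
    values = list(values_by_section.values())
    return max(values) - min(values) > tolerance


# 73 of 526 manuscripts, reported as 14%: 73 / 526 is about 13.9%, which rounds to 14.
carlisle_ok = percent_is_consistent(73, 526, 14)

# Hypothetical overall survival rates (%) reported in three places in one paper.
survival_mismatch = values_disagree(
    {"abstract": 62.0, "results": 62.0, "figure 2": 58.5}
)
```

A failed check does not prove misconduct; as noted earlier, typos and honest errors are a common cause, but a mismatch is a reasonable cue to read the paper more closely.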

The responsibility for safeguarding research integrity extends beyond the authors of a study, journal editors, and peer review team. Healthcare professionals play a vital role in maintaining the credibility and trustworthiness of the field by critically evaluating and responding to research findings. By actively engaging in these practices, you contribute to the overall improvement of research integrity in healthcare.

Understanding Gray Literature: The Value of Nontraditional Publications

As a standard practice, many literature reviews exclude ‘gray literature,’ a category that describes research and literature published outside of the traditional academic publishing industry. However, completely overlooking gray literature results in a wide array of valuable and excellent research being excluded from the overall body of scientific knowledge. A thorough understanding of what gray literature is and the ideal circumstances and cautions for using it could help you uncover hidden evidence that greatly improves your research.

What’s Considered Gray Literature?

In the broadest sense, gray literature comprises any material that isn’t published through a traditional academic publisher (e.g., in a journal). Importantly, this typically means that the piece hasn’t undergone the traditional peer review process. These pieces may report on a research study, an individual’s opinion, an event, a stakeholder or advisory board discussion, an organizational policy, and more. They’re usually published online only. Some common examples of gray literature include governmental or industry reports/white papers, graduate dissertations, conference proceedings, newsletters or mail-outs, policy documents, and blog posts.

When Should Gray Literature Be Used?

Early during the information synthesis process for a research project, it’s a good idea to complete at least a cursory review of the gray literature. By searching through gray literature, you could find niche or null-result studies that may influence your research direction. For example, for a hypothesis-driven research study, you may find a report from researchers who attempted to answer the same or a similar question but ran into unexpected hurdles or null results. Often, null-result studies aren’t accepted for publication by traditional academic journals, but this type of crucial information could affect how you choose to proceed.

In many cases, gray literature expresses findings and opinions that are more indicative of the ‘real world’ than the often contrived or carefully constructed scenarios used in academic research. Additionally, gray literature often includes a more diverse range of authors who may often be excluded from the profit-driven traditional publishing industry.

However, there are some times when gray literature isn’t appropriate to include. Gray literature often isn’t used in highly regimented systematic reviews with strict inclusion/exclusion criteria. Fast-paced or urgent projects that must be published quickly may be hampered by the time required to sort through the large pool of gray literature available online.

Cautions of Using Gray Literature

It’s absolutely crucial to critically evaluate all gray literature that contributes to your research. While you should never fully depend upon peer review and blindly trust the reliability and rigor of journal-published research, it’s especially important to note that gray literature undergoes no such review. Some gray literature sources (such as governmental reports or conference proceedings) may be less inclined to bias and misinformation than others (such as blog posts), but all sources should be critically reviewed. Consider using medical librarian Jess Tyndall’s AACODS checklist for evaluating gray literature, which includes:

  • Authority. Who wrote the piece? Do they have the expected credentials or experience to speak knowledgeably on the topic?
  • Accuracy. Does this piece have adequately rigorous methodology, evidence, or data to support its claims? Are their sources properly cited?
  • Coverage. Does the piece outline its limitations or the authors’ biases/conflicts of interest? Is the scope of the piece clearly outlined?
  • Objectivity. Do the authors or the organization have any explicit bias, such as financial interest in promoting specific results or opinions? Are counterarguments or conflicting evidence/perspectives presented?
  • Date. How long ago was the piece published? Have new advancements or discoveries been made that might disprove, support, or otherwise affect the information in the piece?
  • Significance. Does the piece include enough important, feasible, and relevant information that enriches your own research to justify its inclusion in your citation list?

Additionally, it should be noted that including gray literature in your evidence synthesis will greatly expand the scope of your search. Plan to budget in extra time to evaluate the relevance of sources as well as their credibility.

Tips for Finding Gray Literature

There are a few key strategies you can use to help you navigate the wide range of resources available online. For example, be sure to make the most of Google search techniques! Some helpful search modifiers include:

  • Phrase searching, or using quotation marks around a word or phrase to only pull results that use the exact same words. Example: “Art in sustainability” university programs
  • Site type searches, or using site: in combination with specific website extensions (e.g., .edu, .org, .gov) to only pull results from those pages. Example: Biophysics of mitosis site:edu
  • File type searches, or using filetype: in combination with a specific file extension (e.g., ppt, pdf, xlsx [for Excel documents]) to only generate results for a specific document type. Example: Romanesque vs classical art filetype:ppt
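If you run many of these searches, the modifiers above can be assembled programmatically. The following Python sketch uses two hypothetical helper functions, build_query and search_url, to combine a phrase, free-text terms, a site: restriction, and a filetype: restriction into one query string. It assumes extensions are written without a leading dot, as in the examples above, and strips one if it is supplied.

```python
from urllib.parse import urlencode


def build_query(terms="", phrase=None, site=None, filetype=None):
    """Combine search modifiers into a single query string."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')  # exact-phrase search
    if terms:
        parts.append(terms)
    if site:
        parts.append("site:" + site.lstrip("."))  # accepts "edu" or ".edu"
    if filetype:
        parts.append("filetype:" + filetype.lstrip("."))  # accepts "ppt" or ".ppt"
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a URL-encoded Google search link."""
    return "https://www.google.com/search?" + urlencode({"q": query})


phrase_query = build_query("university programs", phrase="Art in sustainability")
site_query = build_query("Biophysics of mitosis", site=".edu")
filetype_query = build_query("Romanesque vs classical art", filetype=".ppt")
site_link = search_url(site_query)
```

Modifiers can also be combined in one call, for example a phrase search restricted to PDF files on .gov sites, which is often a quick way to surface governmental reports.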

You can also explore databases that collect high-quality gray literature, such as WorldCat, Open Grey, and GreyNet International. Consider also talking to your colleagues about your search; they may know of subject-specific resources that would be hard to discover independently!

Open Science and Data Sharing: What Research and Publication Professionals Need to Know

The open science movement is poised to become a momentous industry shift in medical publication. The National Institutes of Health, one of the largest U.S. medical research funding bodies, recently implemented a policy requiring all applications to include a formal data management plan, with resultant data being publicly available. This policy, described as “seismic” by Nature, has already caused a ripple effect of similar policy shifts throughout the field. Here, we’ll discuss the meteoric rise of open science and data sharing in recent years, with a focus on what anyone working with medical publications needs to know.

What actually is ‘open’ science?

There have been several ‘open’ shifts in medical publications over the last few decades. For most, the first identifiable trend was open access publication, which was recently discussed on Cabells’ The Source blog. Since then, many other aspects of medical research have, so to speak, opened, including open peer review, open source software, and open altmetrics, to name a few. In the context of these initiatives, ‘open science’ has become an umbrella term referring to movements toward public exchange of research information and data.

Who supports open science?

Open science has gained proponents from many sectors of the industry. Most notably, as we previously mentioned, the National Institutes of Health recently mandated that all applications must have a data management plan that outlines how the raw data will be made publicly accessible. Supporting this decision were several additional U.S. government statements and policies, including the White House Office of Science and Technology Policy’s new requirement for “free, immediate, and equitable access to federally funded research.”

This movement kickstarted a wave of open science policies. As of September of 2022, it is estimated that approximately half of the 110 largest health research funders either recommend or mandate data sharing. Many academic journal publishers have also committed to this initiative by strengthening their data sharing policies. Though most high-impact journals recommend data deposition in publicly available databases, publishers like PLOS have taken the extra step to require data deposition as a condition of publication. Data sharing requirements are especially common for clinical trial reports.

Key concepts for research and administrative professionals

The open science initiative is, in many ways, still in its fledgling stages. While it’s unclear what the long-term implications will be for this movement, there are a few key concepts that anyone working in the research or publication industries should be aware of.

Data repositories

Data repositories are one of the fastest-growing aspects of the open science movement. Essentially, these platforms serve as comprehensive databases to effectively store and organize the massive amounts of raw data generated by today’s medical research industry. Repositories are typically restricted to a specific niche, either by topic or by file type (e.g., image-only repositories). Deposited data are assigned a digital object identifier (DOI) or alternative persistent identifier and are typically published under a Creative Commons Attribution (CC BY) license.

Standardly, depositing data in a repository is free, with no upfront or recurring charges for data management. It’s possible that this may change in the future, especially as repositories bear the weight of years upon years’ worth of massive datasets. Some institutions have also begun establishing their own internal repositories intended for use by their research faculty.

When selecting a repository, there are several key factors to consider. First, be sure to contact your institution’s library or research administration to check whether there are any explicitly recommended or banned repositories. Second, determine what kinds of data you plan to deposit. Many repositories only process specific types of data or data files. Third, identify any discipline-specific repositories that are related to your research focus. There are several reputable websites that evaluate and recommend repositories across several fields, such as PLOS’s Recommended Repositories page or Harvard’s Data Repositories resource. Fourth, check the websites of the potential repositories you identified. Some repositories may have particular guidelines regarding who can share data and when data can be deposited, among other considerations, which may guide your selection process.

Data availability statements

As the name implies, data availability statements are short, one-to-two sentence statements published with manuscripts that describe how data can be made available to interested readers. Heavily encouraged since the late 2010s, data availability statements have become very common in medical research publications, especially among high-impact journals. There are several templates available for these statements based on the particular circumstances of your research project; Taylor & Francis have an excellent table detailing statement types and templates/examples.

I want to submit a manuscript to a journal. How do I find their data requirements?

Most journals will include their data sharing requirements on the ‘Instructions for Authors’ page (this might also be titled ‘Guide for Authors’, ‘Submission Guidelines’, etc.). At this point, if there’s no information about data sharing mandates on this page, it can generally be assumed that the journal does not have any data sharing–related publication conditions. However, it’s always a good idea to reach out to the editorial board if you have any questions. Contact information should be available either on the ‘Instructions for Authors’ page, the ‘About the Journal’ page, or a dedicated ‘Contact Us’ page. 

Giving up the Ghost: Unmasking Unethical Ghostwriting in Medical Publications

You’ve probably heard the term ‘ghostwriting’ before, maybe while discussing how most politicians have entire teams of speech writers or how many celebrities don’t actually write their own autobiographies. You may not have heard, however, that ghostwriting is also quietly rampant throughout academia, with a wide array of implications and potential dangers.

How Does Academic Ghostwriting Work?

When we talk about ghostwriting in academia, there are two distinct concepts that often become conflated. Though the difference between them is nuanced, each carries its own implications and threats.

Ghost/gift/guest/honorary authorship

As you can tell, this form of ghostwriting goes by many names. ‘Ghost’, ‘gift’, ‘guest’, or ‘honorary’ authorship is a serious ethical violation in which someone who hasn’t adequately contributed to a manuscript is still listed as an author. Adequate contribution is standardly defined as meeting the International Committee of Medical Journal Editors’ (ICMJE’s) 4 requirements for authorship. This is an incredibly widespread practice; in fact, a 2020 study published in the journal Accountability in Research found that researchers perceive this to be the “most prevalent” form of research misconduct.

Unattributed authors

This definition of ghostwriting more closely aligns with the traditional, non-academic version and is becoming increasingly exposed in scholarly publishing. In a sense, this is essentially the opposite of the above definition: rather than an author who didn’t make substantial contributions being included as an author on a paper, unattributed authorship ghostwriting entails someone who has made a significant contribution to the paper being excluded from the author list, without any acknowledgment. As the scholarly publishing resource American Journal Experts (AJE) explains, the intentions of this practice can be benign (e.g., an author hiring a medical writer to write their paper with insufficient attribution) or malicious (e.g., a hidden industry sponsor intentionally obfuscating its involvement in a research study).

Why is Ghostwriting Dangerous?

Though ghostwriting comes with an array of concerns, there are a few primary dangers associated with this practice.

  • The unattributed authorship form of ghostwriting poses a major public health risk, as the ghostwriters may unintentionally miscommunicate or, in the most malicious cases, overstate or outright lie about results. These papers then influence physicians’ clinical decision-making and treatment recommendations. This is especially worrisome when considering how some drug manufacturers have been found to misrepresent their research in order to improve physicians’ opinion of a drug or product, also called ‘spinning’ their results.
  • Ghost/guest/gift/honorary authorship improperly confers academic respect or credit onto individuals, and job positions or promotions could be granted in part because of their false authorship credits. These physicians may be consulted on clinical cases related to the literature they’ve ‘published’ on, but they may not actually have the relevant specialist knowledge.
  • There’s an argument to be made that ghostwriting may be a form of plagiarism, though this is an active debate in the scholarly publishing community.
  • Especially with paper mills, authorship has become a black-market trade, with some sites being specifically designed to buy and sell authorship. For example, Retraction Watch called public attention to the Latvian website Science Publisher Company in September 2021; over a year later, their website still explicitly states that they provide “ready-made articles on a wide variety of topics,” ranging from medicine to architecture or law.

Steps Forward: How to Address Ghostwriting

Unfortunately, there’s no singular solution that can entirely erase the ghostwriting problem. However, there are several smaller steps that can be implemented to confront it.

  1. Prioritize thorough, data-driven research on ghostwriting. Due to the inherently secretive nature of ghostwriting, we don’t have a comprehensive modern understanding of how widespread the problem is. In a 2019 meta-analysis, DeTora et al. found that most research on this topic is conducted with small sample sizes or is opinion-based, resulting in prevalence metrics ranging from <1% to 91%. To help understand the current state of the problem, we need to establish a more universally accepted definition of ghostwriting, clarify indexing terms, and promote systematic and evidence-based research on the topic.
  2. Strengthen journals’ authorship evaluations. The practice of journals identifying ghostwriters, also referred to as ‘ghostbusting’, is gaining traction with some publishers. For example, editors at PLoS Medicine have called for increasingly stringent requirements for author contribution statements, disclosures of industry involvement, and “enforceable sanctions,” such as retracting published articles that have retroactively been found to involve ghostwriters.
  3. Increase efforts to eliminate paper mills and malicious ghostwriting firms. Thankfully, exposure of these practices is already increasing, to the point that a 2022 United States congressional hearing was called exclusively to discuss the dangers of and responses to paper mills.
  4. Educate authors on the dangers of ghostwriting. This solution is designed to address benign or unintentional cases of ghostwriting. Landmark studies have found that researchers, especially in low- or middle-income countries, don’t receive sufficient education on research/publication ethics; further training is vital for avoiding inadvertent or nonmalicious ghostwriting.
  5. Reduce the publication pressure that researchers face. Publication pressure is an unignorable undercurrent of the ghostwriting problem. The current academic environment places immense, unsustainable pressure on researchers to publish their research, and unfortunately, many authors are beginning to see ghostwriting or paper mills as their best chance at a successful research career.

Overall, a focus on prevention and halting the problem before it snowballs will be critical in order to substantially reduce ghostwriting in academic publications.