Famed Bangladesh Mask Study Excluded Crucial Data

By James Agresti

Published April 12, 2022

Overview

With one exception, every gold standard study of masks in community settings has failed to find that they slow the spread of contagious respiratory diseases. The outlier is a widely cited study run in Bangladesh during the Covid-19 pandemic, and some of its authors claim it proves that mask mandates “or strategies like handing out masks at churches and other public events—could save thousands of lives each day globally and hundreds each day in the United States.”

In reality, the authors altered their study to exclude the data that could prove or disprove that very claim. This is a blatant violation of research ethics, and it biases the study to hide the harms of masks, which are far more common and serious than portrayed by governments and media outlets.

Study Design

The Bangladesh mask study, published by the journal Science in December 2021, was a “cluster-randomized controlled trial.” In plain language:

Randomized controlled trials (RCTs) are studies in which people are randomly assigned to receive or not receive a certain treatment, like wearing a mask. This allows the study to control for every possible confounding factor, something that “is not possible with any other study design.” Thus, clinical research guides call RCTs the “gold standard.”

Cluster RCTs involve giving “the same treatment” to groups of people who interact with one another, such as households, villages, or workplaces. This is useful for studies on masks because the “prevention of one infection in an individual can prevent a chain of subsequent transmission” to others.

In short, the basic study design was ideally suited to determine if masks work, but as shown below, its execution and interpretation were not.

Excluding the Death Data

As required by research ethics, the authors of the Bangladesh mask study published a pre-analysis plan explaining how their RCT would be conducted and what it would measure. Pre-analysis plans are a tool to prevent biased or dishonest researchers from moving the goalposts after results begin to pour in. Per the journal Epidemiologic Reviews:

In designing the protocol of any clinical trial conducted today, there would be a requirement that the endpoints and case definitions be clearly laid out in advance. In fact, regulatory authorities hold the investigators to these predetermined endpoints to avoid what is sometimes termed “data dredging,” or looking for those outcomes for which significant differences would be found.

Likewise, an article from the World Bank’s Development Research Group explains that the primary reason for pre-analysis plans “is to avoid many of the issues associated with data mining and specification searching by setting out in advance exactly the specifications that will be run and with which variables.”

The pre-analysis plan for the Bangladesh mask study states that it will measure “hospitalizations and mortality,” but these measures are completely absent from the study results. This is a flagrant breach of research ethics, and it obscures the only data that can objectively prove whether masks save or cost lives on net.

Import of the Death Data

To accurately measure the impact of masking or any other medical intervention on death, one has to measure actual deaths, not some other variable. Measuring whether masks prevent C-19 infections, as the Bangladesh study does, reveals neither how many people died from C-19 nor how many died from the lethal risks of masks identified in medical journals, such as:

- cardio-pulmonary events.
- elevated CO2 inhalation, which can impair high-level brain functions and lead to fatal mistakes.
- social isolation, which can lead to drug abuse and suicide.
- heat, humidity, and other discomforts of wearing a mask, which can cause increased error rates and response times in situations where mental sharpness is vital to safety.

Only RCTs that measure deaths can capture the net effects of all such factors. That’s why medical journals call “all-cause mortality” in RCTs:

- “the most objective outcome” (Journal of Critical Care)
- “the most relevant outcome” (The Lancet Respiratory Medicine)
- “the most significant outcome” (JAMA Internal Medicine)
- “the most important outcome” (PLoS Medicine)
- “the most important outcome” (Journal of the National Medical Association)
- “the most important outcome” (International Journal of Cardiology)
- “a hard and important end point” (JAMA Internal Medicine)

Unlike other data, which can be easily manipulated through statistical tampering, all-cause mortality in RCTs is straightforward and solid. If an RCT is large enough and properly conducted, a simple tally of all deaths among people who receive and don’t receive a treatment proves whether the treatment saves more lives than it takes. This gets more complicated for cluster RCTs, but it is still a clear-cut process.
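
As a rough illustration of that tally, the Python sketch below compares all-cause deaths per 1,000 people in two arms of a trial. The function names and every count in it are hypothetical placeholders, not figures from the Bangladesh study, and it ignores the clustering adjustments that a cluster RCT would need.

```python
def deaths_per_thousand(deaths, participants):
    """All-cause deaths per 1,000 participants in one arm of a trial."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return 1000 * deaths / participants


def net_effect_per_thousand(treated_deaths, treated_n, control_deaths, control_n):
    """Control rate minus treated rate.

    Positive means fewer deaths in the treated arm; negative means more.
    """
    return (deaths_per_thousand(control_deaths, control_n)
            - deaths_per_thousand(treated_deaths, treated_n))


# Hypothetical placeholder counts, chosen only to show the arithmetic.
effect = net_effect_per_thousand(
    treated_deaths=45, treated_n=10_000,
    control_deaths=50, control_n=10_000,
)
print(f"Difference in deaths per 1,000 people: {effect:.1f}")
```

A real analysis would also report the margin of error around that difference.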

The Excuses

Because the lead and final authors of a clinical study are most responsible for it, Just Facts asked Yale economics professors Jason Abaluck and Ahmed Mushfiq Mobarak why they flouted their pre-analysis plan to measure deaths. As documented in this full record of the email exchange, they gave answers that conflict with the facts and then stopped replying after they painted themselves into a corner.

In a key part of the exchange, Abaluck claimed:

Collecting mortality data would have required us to revisit every household at endline in order to survey them (we only collected blood from the small subset of households symptomatic during our study period). Given the nationwide lockdown that went into effect, another round of revisits would have been prohibitively expensive and complicated, and we prioritized the other outcome variables where we had much better hope of being statistically powered.

Directly quoting the authors’ working paper, Just Facts asked:

Given that your team was “able to collect follow-up symptom data” from “98%” of the individuals in the study, why would they need to “revisit every household at endline to survey them”?

Likewise, the working paper reveals that their study “surveyed all reachable participants about Covid-related symptoms” and then used the data to calculate that masks reduce the risk of “Covid-like symptoms.”

During the very same surveys, they could have easily asked the participants if anyone in their household died. In fact, the authors may have done that, because they wouldn’t answer these questions:

Did you collect mortality data during any part of the study before the endline? If so, would you share it?

Just Facts asked those straightforward questions twice, but the authors did not reply.

Other Issues

Beyond excluding the death data, the authors engaged in other actions that reflect poorly on their integrity. One of the worst is touting their findings with far more certainty than warranted by the actual evidence. For example, some of the authors wrote a New York Times op-ed declaring that “masks work,” a claim undercut by the following facts from their own study:

By their study’s “primary outcome,” a positive blood test for Covid-19 antibodies, less than 1% of the participants caught C-19: 0.68% in villages where people were pressured to wear masks and 0.76% in villages where they were not. That is a difference of 0.08 percentage points in a study of more than 300,000 people.
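
The size of that gap can be checked in a few lines of Python. This sketch uses only the two rounded rates quoted above; the variable names are illustrative, and the relative figure is a crude ratio of those rounded rates rather than the study’s modeled estimate.

```python
# Rates quoted above, in percent of participants.
control_pct = 0.76  # villages not pressured to wear masks
treated_pct = 0.68  # villages pressured to wear masks

# Absolute gap, in percentage points.
gap_points = round(control_pct - treated_pct, 2)

# The same gap expressed as people per 100,000.
gap_per_100k = round((control_pct - treated_pct) / 100 * 100_000)

# Crude relative gap: the absolute gap as a share of the control rate.
relative_pct = round((control_pct - treated_pct) / control_pct * 100, 1)

print(f"Absolute gap: {gap_points} percentage points")
print(f"Equivalent to about {gap_per_100k} people per 100,000")
print(f"Crude relative gap: {relative_pct}%")
```

Keeping percentage points and percent separate matters here: the first describes how many people were affected, while the second only compares two small rates with each other.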

Their paper lays down 4,000 words before it reveals the sampling margins of error in the results above, which show with 95% confidence that:
- surgical masks reduced the relative risk of catching symptomatic C-19 by anywhere from 22% to not at all.
- cloth masks reduced the relative risk of catching symptomatic C-19 by as much as 23% or increased it by as much as 8%.

“Not statistically significant” is the common term used to describe study results that aren’t totally positive or totally negative throughout the full margin of error, like the results above. Yet, the authors skip this fact in their op-ed and bury it in their paper, writing at the end of an unrelated paragraph that it showed “no statistically significant effect for cloth masks.”
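
The rule described above can be written as a simple check on the two ends of a margin of error. The sketch below is a minimal illustration that assumes the rounded bounds quoted above; the function name is hypothetical, and negative numbers stand for a reduction in risk.

```python
def is_statistically_significant(low, high):
    """True only if the whole interval lies strictly on one side of zero."""
    return (low < 0 and high < 0) or (low > 0 and high > 0)


# Percent change in relative risk at each end of the 95% interval,
# using the rounded figures quoted in the text (negative = lower risk).
intervals = {
    "surgical masks": (-22.0, 0.0),
    "cloth masks": (-23.0, 8.0),
}

results = {
    name: is_statistically_significant(low, high)
    for name, (low, high) in intervals.items()
}

for name, significant in results.items():
    label = "significant" if significant else "not statistically significant"
    print(f"{name}: {label}")
```

An interval that touches or crosses zero fails the check, because "no effect" remains within the margin of error.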

Their analysis doesn’t quantify the uncertainty caused by the fact that they tested only 3%, or 10,790, of the study’s 342,126 participants. This sample may not reflect the other 97% because:
- the study didn’t attempt to test people for Covid-19 unless the “owner of the household’s primary phone” admitted that a member of their household had symptoms like a fever, sore throat, fatigue, and headache.
- 60% of the people who reportedly had symptoms did not submit to a Covid-19 test.
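
For scale, the share of participants who were tested follows directly from the two counts quoted above. This is a minimal sketch using only those counts, with illustrative variable names.

```python
participants = 342_126  # everyone in the study
tested = 10_790         # participants who were tested

tested_share_pct = round(tested / participants * 100, 2)
untested = participants - tested
untested_share_pct = round(untested / participants * 100, 2)

print(f"Tested: {tested:,} ({tested_share_pct}%)")
print(f"Not tested: {untested:,} ({untested_share_pct}%)")
```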

Their analysis assumes that the following “mask promotion interventions” had no effects on the objectivity or willingness of participants to accurately report symptoms:
- Making them watch “a brief video of notable public figures discussing why, how, and when to wear a mask.”
- Sending them “twice-weekly text message reminders about the importance of mask-wearing.”
- Asking them to make “a verbal commitment to be a mask-wearing household.”
- Asking “them to place signage on doors that declares they are a mask-wearing household.”
- Giving a “monetary incentive” of 190 U.S. dollars to “the village leader” if at least 75% of the village adults wore masks.

Like their paper, their op-ed states that “people over age 50 benefited most, especially in communities where we distributed surgical masks,” but this “does not suggest that only older people need to wear masks, but rather that widespread community mask wearing reduces Covid-19 risk, especially for older people.” However, when the results are broken down by “each decade” of age, as their pre-analysis plan specified, they show no statistically significant effects among people aged 18–29, 30–39, 40–49, and 70+. Furthermore, the authors excluded this breakdown from their paper and relegated it to a supplement.

Their paper engages in the dishonest practice of data dredging by featuring results that were not included in their pre-analysis plan, like “imputing symptomatic-seroprevalence for missing blood draws.” This allows them to transform statistically insignificant results into significant ones.

Their analysis uses complex analytic strategies like a “generalized linear model with a Poisson family and log-link function,” evoking these warnings from academic works:
- “Manipulation of data involves subjecting data to multiple statistical techniques until one achieves the desired outcome.”
- “A general principle of data analysis recommends using the most appropriate, yet simplest, statistical techniques in research so findings can be better understood, interpreted, and communicated.”
- Statistical “malpractice typically occurs when complex analytical techniques are combined with large data sets. … Indeed, as a general rule, the better the science, the less the need for complex analysis….”

Summary

Based on their study’s finding that “mask-wearing” was 13% in villages that were not pressured to wear masks and 42% in those that were, the authors extrapolated their results to claim that:

“if everyone wore masks, the reductions in Covid-19 cases would most likely have been substantially larger.”
“our best estimate is that every 600 people who wear surgical masks in public areas prevent an average of one death per year given recent death rates in the United States.”

Those claims are based on shaky statistics laced with absurd assumptions, and they fail to account for any of the lethal risks of masks. Worse still, simply counting the people who died in their study could settle this issue, or at the very least, define the outer boundaries of the effects of masks on death. Yet, the authors didn’t collect this easily obtained data as specified in their pre-analysis plan—or they are withholding it.

Photo by 7C0, Attribution 2.0 Generic (CC BY 2.0).