Co-written with Alex Park
In late August 2014, Tom Frieden, then director of the Centers for Disease Control and Prevention, traveled to West Africa to assess the raging Ebola crisis.
In the five months before Frieden’s visit, Ebola had spread from a village in Guinea, across borders and into cities in Liberia and Sierra Leone. Médecins Sans Frontières, the first international responder on the scene, had run out of staff to treat the rising numbers of sick people and had deemed the outbreak “out of control” back in June.
But when Frieden arrived in West Africa, the World Health Organization, the United Nations agency charged with coordinating the global response to disease outbreaks, had only just declared Ebola to be an international public health emergency. Although WHO had announced a $100 million Ebola action plan the week prior to that declaration, many major donors were still sitting on the sidelines....
Read the rest at Huffington Post
HER2 and Herceptin represent the biggest success story in breast cancer of the last two decades. All might be as advertised, but growing evidence raises key questions: Does HER2 drive breast cancer? Is it prognostic? Is the definition of HER2-positive clinically validated? Are HER2 assays reproducible? Can re-testing HER2 status change trial outcomes? Does HER2 predict benefit from Herceptin? Does Herceptin’s mechanism of action depend on HER2? Has Herceptin increased five-year breast cancer survival at the population level? Would Herceptin be approved by the FDA today? In the most recent trials of Herceptin and T-DM1, can the spectacular success of adding pertuzumab to Herceptin be squared with the failure of pertuzumab to make any difference when given with Herceptin-based T-DM1?
Examination of these questions (ten in total) casts doubt on the validity and clinical utility of the HER2 subtype and current treatment guidelines for Herceptin in breast cancer.
1. Does HER2 drive breast cancer?
Summary: Evidence conflicts regarding whether HER2 drives cancer in humans; the decades-old experiments should be repeated to resolve the question definitively. Today, HER2 is the sole “Class I” oncogene that works by overexpression or amplification rather than mutation; no other type of cancer with a targeted therapy is driven by overexpression or amplification of the target. GRB7 is frequently co-amplified with HER2 and might be required to drive cancer, or GRB7 might contribute to oncogenesis independently of HER2. Conceivably, neither HER2, GRB7 nor any of the genes on the amplicon they share drive oncogenesis.
Robert Weinberg’s lab discovered that mutation of neu in rats drives cancer. But the analogous gene in humans, HER2, is rarely mutated in breast cancer. Instead, HER2 is either amplified or its protein product overexpressed. However, Weinberg’s lab found that amplifying neu 100-fold and increasing expression 10-fold did not transform mouse cells from normal to cancerous. Subsequently, the lab of Stuart Aaronson performed two similar experiments in the same mouse NIH/3T3 cell line but amplifying HER2 rather than neu. The first experiment confirmed Weinberg’s finding. But a second test used a different promoter that ratcheted HER2 expression five to ten times higher than in the first experiment, resulting in transformed cells, according to the researchers. Three decades later, however, Weinberg seems unpersuaded: the “report of Aaronson… may or may not have been independently replicated over the past 30 years since it appeared,” Weinberg wrote in an email.
The experiments should be repeated. (I was not able to determine if HER2 has been shown to be transforming in human cell lines.)
HER2 is the only “Class I” oncogene that drives by overexpression and/or amplification
A 2004 census of amplified and overexpressed oncogenes found that just six met the most stringent “Class I” criteria. But by 2010, this fell to three: EGFR, AR and HER2. Now in 2016, HER2 stands alone as a Class I gene.
Class I genes must meet the lesser criteria of Class II and III genes and also require “that a drug that targets the encoded protein is used to treat patients for which efficacy must have been shown in clinical trials.” For EGFR in colorectal cancer, the targeted agent is Erbitux. However, although the FDA label for Erbitux lists EGFR/ERBB1 expression as an indication, the “consensus is that ERBB1 expression not required for therapeutic success,” according to Bert Vogelstein at Johns Hopkins University. In prostate cancer, amplification of the androgen receptor gene (AR) does not initiate prostate cancer but results from treatment. That leaves HER2 in a class by itself.
I asked Mike Stratton, a co-author of the 2010 census paper and director of the Sanger Institute, if HER2 was indeed the sole Class I gene. Stratton replied: “I think that what you say is correct.”
A fundamental principle of targeted therapy is to attack tumor cells without harming normal cells. Consequently, targets are usually mutations. Herceptin’s ostensible target, however, is unmutated HER2. There is no driving oncogene for any human cancer where a targeted therapy aims for an unmutated target—except HER2 breast cancer.
In cancer cells, amplified regions are quite common. The near absence of Class I genes, according to the 2010 census paper, “reflects the difficulty encountered in identifying the true cancer gene on amplicons that often include several candidate genes.” GRB7 resides on the same amplicon as HER2, and one research group found that both genes are co-amplified in 15% of invasive breast cancers. The same group suggested HER2 was not transforming by itself but required GRB7 co-amplification, creating the possibility that a “combination of multiple genes, which do not have independent transforming activity, causes transformation.” That is, HER2 might not be transforming. An analysis of the Cancer Genome Atlas concurred that “GRB7 may be necessary for cancer cells harboring this amplicon, as previously suggested.” Betsy Ramsey’s lab also studied GRB7 and HER2. In an email, Ramsey summarized: “Certainly GRB7 seems to be a player with or perhaps without HER2.”
2. Is HER2 prognostic in breast cancer?
Summary: Individual studies come to disparate conclusions about whether HER2 is prognostic in breast cancer. Reviews of the literature have been tagged with Expressions of Concern, leaving the belief that HER2 is prognostic unsupported.
Women testing HER2-positive opt for more radical surgery than patients found to be HER2-negative. According to an analysis of more than 113,000 women, “mastectomy rates were higher in women with HER2-positive tumors than in those with HER2-negative tumors.” The reason is not known although the negative prognosis associated with HER2 breast cancer, long considered as “aggressive,” might be leading doctors and patients to pursue more aggressive treatment. But the evidence no longer supports HER2 as prognostic.
The HER2 prognostic literature begins with a 1987 paper from Slamon et al. which considered 86 node-positive patients from an unrelated clinical trial. (This might have made the cases non-random. In addition, the authors did not discuss whether the 86 were all from the same arm, raising the possibility of treatment confounding the analysis.) Slamon and colleagues found a statistically significant association between degree of HER2 amplification and recurrence and, to a lesser degree, survival: greater HER2 amplification increased the risk of recurrence and shortened survival. But when simply separating cases into amplified (n=34) vs. non-amplified (n=52), no difference in recurrence or survival emerged. To show such a difference and that HER2 was prognostic, the researchers dropped patients with a middling HER2 copy number of 2-4 (n=23) from the analysis. Based on the few remaining amplified patients (n=11), with gene copy numbers of five or more, the authors reported a statistically significant difference in disease free survival (p=0.015) although still no difference in survival (p=0.06). Thus was born HER2’s reputation as an aggressive form of breast cancer.
More than one hundred studies followed Slamon et al., with an array of disparate results that might be expected given the methodology of the founding paper. However, the contradictory mess resolved into iron scientific consensus: “Initial conflicting reports regarding the prognostic relevance of HER2 were resolved with improved methodologies,” wrote Mark Moasser “and the overwhelming data now confirms this initial [Slamon et al.] landmark genetic-biologic finding.” The overwhelming data, according to Moasser, was “nicely reviewed” in a paper by Jeffrey Ross and colleagues.
Ross served as first author on three reviews of the HER2 prognostic literature, published in 1999, 2003, and 2009. However, in March of this year, all three reviews received an Expression of Concern from The Oncologist. (The EOC was prompted by my re-analysis of the Ross et al. reviews.)
The 2009 review included 107 papers. No search or inclusion criteria were specified, creating the possibility of selection bias. More important than how the 107 papers were chosen, however, the review contains errors on 30. More than one in four (28%) of the papers reviewed are mis-reported or should not have been included. For example, a paper from Battifora et al. is misreported: Ross et al. categorized it as “yes” under multivariate analysis of prognostic factors when the paper clearly did not find that HER2 was independently prognostic:
“This analysis identified independent prognostic factors of DFS and OS when all variables were considered together. Independent predictors of DFS included stage of disease, histology, and nuclear grade. Nuclear grade and stage were the only significant predictors of OS.”
There are 10 errors of this particularly blatant sort among the 107 papers reviewed in Ross et al. 2009. A separate group of seven of the 107 papers conducted no multivariate analysis of whether HER2 was prognostic, but Ross et al. reported that those studies did and that each of the seven found HER2 independently prognostic. Six of the seven did not conduct a multivariate analysis of any kind; one of the seven did but all patients were HER2-positive.
An additional set of 11 papers should have been excluded. Nine of these correlated HER2 with different biomarkers, not clinical outcomes. Among these, one paper included 3,655 patients, by far the largest study in the review. Together, these 11 papers contributed 7,511 (19%) of the 39,730 patients in the review. Of these extraneous patients, 7,213 (96%) were adduced in support of HER2 being independently prognostic.
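The proportions quoted for the 2009 review can be re-derived from the counts given above. The following is a minimal Python sketch that uses only figures stated in the text; the variable names and the `pct` helper are illustrative assumptions, not part of any published analysis.

```python
# Counts as reported in the text for the Ross et al. 2009 review.
papers_reviewed = 107
papers_with_errors = 30
patients_total = 39_730
patients_extraneous = 7_511            # from the 11 papers that should have been excluded
patients_extraneous_supporting = 7_213  # of those, cited as supporting HER2 as prognostic


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


error_share = pct(papers_with_errors, papers_reviewed)
extraneous_share = pct(patients_extraneous, patients_total)
supporting_share = pct(patients_extraneous_supporting, patients_extraneous)

print(error_share, extraneous_share, supporting_share)  # 28 19 96
```

The three results match the 28%, 19% and 96% figures cited in the text.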
I contacted Jeffrey Ross regarding these errors. Concerning those in his 2003 paper, Ross acknowledged “scattered errors.” Ross disputed none of the 30 errors I identified.
There do not appear to be other literature reviews showing HER2 to be prognostic in breast cancer, leaving that belief unsupported.
3. Is the definition of HER2-positive clinically validated?
Summary: There is no gold standard assay for definitively identifying HER2-positive tumors. The preferred assay and cutoff values have changed and changed back over time. But the modified standards are arbitrary, not clinically validated. The changes may reduce interobserver disagreement but do not increase accuracy in identifying true HER2-positives.
Changing assays: IHC vs. FISH
“There is no gold standard at present,” according to the most recent guidelines for HER2 testing, published in 2013. Perhaps for that reason, the preferred method for determining HER2 status has changed over time. Early research had shown that “amplification added little predictive value to the expression data.” Instead, the assay of choice, immunohistochemical staining (IHC), measured HER2 overexpression. The senior author of the paper, Dennis Slamon, subsequently led the first Phase III trial of Herceptin, for metastatic breast cancer. That study used immunohistochemical analysis and led to the first FDA approval of Herceptin in 1998.
However, not long after, re-analysis by the trialists showed that amplification measured by fluorescence in situ hybridization (FISH) predicted response better than IHC:
“FISH assays have higher sensitivity and higher accuracy and more frequently correctly identify altered HER-2/neu status (amplification/overexpression) in previously molecularly characterized specimens than did the FDA-approved immunohistochemistry assays interpreted manually.”
Of note, two of the authors, Michael Press and Dennis Slamon, also co-wrote the paper nine years before which found that “amplification added little predictive value to the expression data.” Now it was the opposite. Slamon-led researchers ultimately dismissed IHC with prejudice in 2005: “We do not consider immunohistochemistry screening for entry to clinical trials or for selection to Herceptin immunotherapy to be an acceptable strategy.”
The FDA, not long after approving Herceptin and IHC, voiced its displeasure about HER2 testing and the “many unanswered questions regarding HER2 detection systems…” The agency threatened to change the Herceptin label because of the “considerable confusion and misunderstanding on the part of the oncology community,” which was “significant enough to warrant general precautionary comment in the trastuzumab [Herceptin] label…”
By 2006, HER2 testing was broken, “a disorganized practice” with a “high rate of inaccuracy,” according to guidelines published that year by the American Society of Clinical Oncology (ASCO) and the College of American Pathologists (CAP). Both organizations had previously recommended HER2 testing but without seeing a need to specify how to do it.
Instead of choosing sides in the IHC vs. FISH debate, the new guidelines put the two assays together. No single kind of test sufficed to identify all HER2-positive tumors. An equivocal result from one assay would trigger another test using the other assay. The UK used this “two tier” system, and the ASCO/CAP 2006 guidelines proposed the same approach for the United States.
Although the UK guidelines detailed testing procedures, they did not cite clinical evidence in their support. Instead, the UK authors “expected that emerging data on accuracy of prediction of response to HER2 targeted treatments will influence the choice of testing method.” The US guidelines also adduced no evidence in support of the two tier system, acknowledging that, “current data are insufficient to define whether these patients represent true- or false-positives.”
Some of the ASCO/CAP panelists had wanted to scrap IHC entirely: “A minority view expressed within the panel was that IHC is not a sufficiently accurate assay to determine HER2 status and that FISH should be preferentially used.” Instead, between 80 and 90% of primary HER2 testing in the United States is done with IHC, and only 10 to 20% uses FISH, according to figures from 2008. A 2010 paper reported that 80% of HER2 assessments in the United States used the IHC-based HercepTest, manufactured by Dako. FISH costs more and requires more expensive equipment, a possible barrier for some hospitals and conceivably part of why FISH is not mandatory.
Herceptin trials were no help in shaping the ASCO/CAP guidelines: “the large prospective randomized clinical trials of trastuzumab were not prospectively designed to answer these questions.” Instead, retrospective “correlative studies” would have to suffice. Also, instead of seeking a true gold standard, any new assays would be measured by how well they reproduced the results of the old assays. According to the guidelines:
“Although a new HER2 assay ideally should have its clinical utility validated using specimens from prospective therapeutic trials that tested the effects of anti-HER2 therapy, the Update Committee recognizes that the rarity of these valuable specimens requires that new HER2 assays be approved on the basis of concordance studies comparing them with other established HER2 tests.”
In addition, the guidelines aimed for concordance rather than accurate diagnosis, with concordance substituting for accuracy. Inaccurate diagnosis is not possible to detect because there is no gold standard. By contrast, discordance can be measured, causes embarrassment and raises concerns at the FDA. As the guidelines recognized, however, interobserver agreement is not the same thing as accurate diagnosis: “concordance of assays does not assure accuracy (i.e., how close the measured values are to a supposed true value…).” Early UK guidelines strove to reduce “interobserver variation in the assessment of staining” by standardizing scoring “against known positive, negative, and borderline cases.” I asked Ian Ellis, corresponding author of the long-ago UK guidelines paper, whether the cases were clinically validated or if the staining patterns were selected for the purpose of minimizing interobserver disagreement. Ellis did not reply to multiple inquiries.
In 2006, widespread HER2 assay discordance prompted ASCO/CAP to announce more stringent cutoffs than the FDA because “the original US Food and Drug Administration-approved interpretation guidelines provide insufficient specificity.” The FDA had approved an IHC staining threshold of 10% of cells. But the panelists believed this resulted in an unacceptably large number of false positives. The guidelines recommended a higher cutoff of 30%, not based on published evidence but anecdote, “the cumulative experience of panel members that usually a high percentage of the cells will be positive if it is a true IHC 3+.”
The guidelines referred to “published reports using cutoff values higher than 10%,” but the footnote pointed to a single study in France which achieved 95% concordance between IHC and FISH by using a vastly higher staining cutoff of 60%. Why the revised US guidelines recommended a 30% cutoff is not clear.
The cutoff for FISH HER2/CEP 17 ratio was also raised, from 2.0 to >2.2. In addition to changing the definition of IHC 3+, the guidelines no longer considered IHC 2+ patients HER2-positive, as they had been in the original, FDA approval-winning trial of Herceptin.
But then seven years after raising the IHC thresholds, the ASCO/CAP guidelines rolled them back. The 2013 guidelines committee “decided to revert to the previously used IHC criterion of more than 10% cells staining.” Re-examination of one trial with 2,904 patients found that the higher threshold only excluded 107 or 3.7%. This seemed to contradict the “cumulative experience” of the 2006 panel that “usually a high percentage of the cells will be positive if it is a true IHC 3+.” Now it seemed to be remarkably rare for IHC 3+ to have more than 10% staining.
The re-examination found that patients not meeting the higher ASCO/CAP guidelines showed no difference (p=.55) in disease free survival whether they received Herceptin or not. But the guidelines were switched back nonetheless.
The threshold for IHC had been raised in 2006 out of concern for false positives. But a then-ongoing study assuaged this worry. The 2013 guidelines said that study found “less than 6% of patients initially considered eligible were not subsequently centrally confirmed as being HER2-positive.” The US, at least, appeared to have its house in order.
The 6% figure came from a central laboratory at the Mayo Clinic that re-examined about one thousand locally-tested patients. However, central vs. local testing in Europe of more than 8,000 patients yielded discordant results in 15% of cases. And when the US and European central labs later compared results, examining only samples known to be false negatives, they differed on IHC scores for 6 of 25 cases (24%) while FISH scores differed for 3 of 25 (12%). Moreover, the Mayo Clinic systematically assigned higher IHC and FISH scores to a set of 23 cases previously judged as equivocal in local testing. Of the 23, the Mayo Clinic found 15 to be HER2-positive versus 11 according to the European lab. In other words, the Mayo Clinic might have generated a high percentage of false positives, the concern that had led to raising the IHC staining threshold to 30%, which the 2013 guidelines rolled back.
Which central lab was right? Did the high US concordance rate mean US labs were correctly identifying true HER2-positive patients and the European lab was wrong? “It is not possible to know,” according to the ring study paper, “which central laboratory determination of HER2 status… was biologically correct in terms of distinguishing patients who do or do not benefit from HER2-targeted… therapies.”
4. Are HER2 assays reproducible?
Summary: Neither IHC nor FISH, whether performed locally or centrally, generates reliably reproducible HER2 results. Re-examination of the clinical trials leading to FDA approval of Herceptin found discordant HER2 status in as many as 26% of patients. Technical shortcomings of both IHC and FISH assays contribute to reproducibility problems. Each laboratory may be using its own cutoff criteria in judging these semi-quantitative assays. The type of assay, its manufacturer and who performs the test can decisively influence the HER2 status of any given patient.
The major trials leading to FDA approvals of Herceptin in breast cancer have been re-examined, producing substantially different results for patients’ HER2 status. As mentioned, the breakthrough trial leading to the first FDA approval of Herceptin depended on IHC. But re-examination found amplification measured by FISH predicted response to Herceptin more accurately; i.e., FISH did not reproduce IHC.
Central testing frequently contradicts local results, a problem which affected both US trials that supported FDA approval of Herceptin in the adjuvant setting. Central retesting of patients in NCCTG N9831 failed to reproduce a local HER2-positive result in as many as 26% of re-examined cases. Similarly, retrospective analysis of NSABP B-31 found 18% of tumors were HER2-positive according to local testing but HER2-negative by central testing using both IHC and FISH.
Trial MA.31 compared Herceptin and lapatinib in metastatic breast cancer in 652 patients found HER2-positive by local testing. However, central re-testing found 115 patients (18%) were not HER2-positive, although they had been enrolled in the trial and treated with an anti-HER2 therapy.
False negatives are also a problem. A 2014 study of 552 locally HER2-negative patients found that 4% were HER2-positive by central testing.
Non-reproducibility also impacts central laboratory testing. As mentioned, a US and European lab each examined the same set of 23 equivocal HER2 cases. The US lab found 15 (65%) HER2-positive while the European lab found only 11 (48%) HER2-positive.
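The text gives each central lab's count of positives but not the case-by-case agreement. Those counts alone still bound how many of the 23 cases the two labs must have classified differently. The sketch below is an illustrative calculation, not something reported in the ring study; it assumes each case is treated simply as HER2-positive or not, and the function name is hypothetical.

```python
def disagreement_bounds(n_cases, positives_a, positives_b):
    """Bound the number of cases two labs classify differently,
    given only each lab's count of positives among the same n_cases."""
    # The number of cases both labs call positive is limited by the counts.
    max_overlap = min(positives_a, positives_b)
    min_overlap = max(0, positives_a + positives_b - n_cases)
    # Disagreements = (A positive, B not) + (B positive, A not).
    fewest = positives_a + positives_b - 2 * max_overlap
    most = positives_a + positives_b - 2 * min_overlap
    return fewest, most


# 23 equivocal cases: 15 positive at the US lab, 11 at the European lab.
fewest, most = disagreement_bounds(23, 15, 11)
print(fewest, most)  # 4 20
```

Under that assumption, the two central labs disagreed on at least 4 of the 23 cases (about 17%), and the reported counts would be compatible with far more.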
Contributing to reproducibility problems are shortcomings of the tests themselves. In 2006, the ASCO/CAP HER2 testing panel had considered throwing out IHC, the original technology for identifying HER2-positive breast cancer. Skepticism, even condemnation, of IHC continues to this day. “The IHC assay is lousy,” according to Bert Vogelstein at Johns Hopkins University. “No IHC assay is great, many are inaccurate,” he added. FISH, according to Vogelstein, “is not that great either, but it’s the best that the pathologists have.”
For both IHC and FISH, the handling of samples before the test can affect test results. Also, the assays are only semi-quantitative. As the FDA observed in 2001, it “views both IHC and FISH as semi-quantitative if performed under ideal circumstances.” In addition, “Both methods require subjective interpretation.”
According to David Rimm, a HER2 testing expert at Yale Medical School, each lab has its own cutoffs, which he considered a “dirty secret.” (In reply to my initial inquiries about issues with HER2 testing, Rimm replied: “You are about to uncover a landmine.”) The College of American Pathologists (CAP) sends HER2 testing facilities samples to measure and encourage adherence to common, cross-laboratory criteria for HER2-positives and negatives. But according to Rimm, “the College doesn’t send them too many hard ones,” possibly to avoid generating discordant, non-reproducible results.
An additional complicating and underappreciated problem is that tumors are sometimes heterogeneous. A biopsy from one part of a tumor can test HER2-positive while a sample from a different part of the tumor tests negative. Researchers reporting such a case wrote: “We do not know the frequency with which a disparity of this degree occurs, but it is not even mentioned in reviews on this subject or consensus guidelines published previously. We therefore assume that it must be a rare phenomenon or one clearly underappreciated.”
According to Kornelia Polyak of the Dana Farber Cancer Institute, the phenomenon is not rare: “This is a pretty serious problem as we see that ~30-40% of HER2+ tumors have high heterogeneity for HER2 itself within the tumor.”
Not only do different labs sometimes disagree about assay results for a given specimen; where the tissue sample comes from in the tumor can also determine whether a patient is deemed HER2-positive or HER2-negative. (Also, tumor cells can interconvert between HER2-positive and HER2-negative states. See question 6.)
5. Can non-reproducible HER2 assays alter trial outcomes?
Summary: Central and local testing can produce conflicting HER2 assessments. In at least one trial, central retesting changed the outcome of the study. There are implications not only for bioethics but also for a forthcoming meta-analysis that attempts to measure Herceptin’s effects across trials, some with multiple, conflicting HER2 assessments.
The outcome of trial MA.31 depended on whether local or central HER2 determinations were used. By central testing, Herceptin extended life more than lapatinib; by local testing, there was no difference. Recall that in MA.31, local testing identified 652 patients as HER2-positive but central re-testing later found 115 (18%) weren’t HER2-positive.
This makes trial interpretation difficult or impossible, while the timing of the tests created an ethical conundrum. The central re-testing occurred during MA.31 according to Karen Gelmon, the study’s corresponding author. Regarding the unusual timing, Gelmon said: “the thought was to make it easier for the patients and doctors to use local HER2 for randomization to avoid a delay in starting treatment…” However, in speeding patients into the possible benefits of an untested treatment regimen, the design and conduct of the trial resulted in centrally HER2-negative patients being treated with anti-HER2 therapies.
Gelmon said “the central results are considered definitive.” But most patients, including those that were centrally HER2-negative, appear to have completed the study. Regarding this ethical conundrum, Gelmon said: “If the central confirmation showed negative results it was up to the treating physician to decide how to treat, which is always how it is, and they could continue the Herceptin if they thought the local testing was valid.” Gelmon did not reply when asked how many patients stopped receiving anti-HER2 treatment after being found HER2-negative by central testing. It is not clear if patients were informed of the equivocal test results. Anti-HER2 therapies are of course not approved for use in HER2-negative patients because the benefits, if any, are outweighed by toxicities and other side effects.
Although this ethical dilemma arose in a clinical trial, it conceivably impacts every breast cancer patient. It is unclear whether to believe local testing, central testing or neither. Gelmon doubled down on central testing: “yes – central or validated testing is what should be recommended.” However, the ASCO/CAP guidelines recommend only that the testing laboratory be accredited. Gelmon simultaneously regards central testing as definitive but supports optionally ignoring it. Edith Perez, prompted by FISH-negative cases who later turned out HER2-positive by IHC, recommended that “in the case of negative results, it’s advisable to repeat the test you started with or to run a different test,” perhaps making it sound like testing should be continued until the result is positive. Perez led the NCCTG N9831 study.
It is unclear that a coherent testing algorithm is obtainable from the recommendations and practices coming from these clinical trials.
The Clinical Trials Service Unit (CTSU) at Oxford is conducting a meta-analysis of Herceptin in early breast cancer. But how will the study define “HER2-positive?” For trials with two sets of assessments, the meta-analysis will have to choose between them (and ignore one set) or present results for both local and central testing. However, not all trials re-tested HER2 status, further complicating the aggregation of Herceptin’s effects across trials.
In addition, the type of assay used might be important and worth reporting. “Scientists who actually do these assays (rather than see the reports of the results) know that neither of these assays (FISH or IHC) are particularly reliable on clinical samples,” said Bert Vogelstein. Given the actual complexity and uncertainties in HER2 assessments, CTSU could (and should) examine their accuracy, computing the likelihood of an assessment being correct and/or linking assessment accuracy estimates to the confidence interval around Herceptin’s clinical benefit. Since there is no way to identify “true” HER2-positives, perhaps the best that can be done is to calculate the likelihood of concordance or discordance if a sample were subjected to re-testing.
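One way to put a number on the likelihood of discordance on re-testing is to treat a trial's retest counts as a binomial sample and attach an interval to the observed rate. The sketch below does this for the MA.31 counts quoted earlier (115 of 652). The Wilson score interval is an assumption chosen here for illustration; nothing in the text indicates that CTSU or the trialists use it, and the function name is hypothetical.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# MA.31 counts quoted in the text: 115 of 652 locally HER2-positive
# patients were not HER2-positive on central re-testing.
discordant, retested = 115, 652
low, high = wilson_interval(discordant, retested)
print(f"discordance {discordant / retested:.1%}, "
      f"95% interval {low:.1%} to {high:.1%}")
```

The same function could be applied to any trial that reports both local and central results. It describes only how often two assessments disagree, not which one is correct, since there is no gold standard to compare against.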
I asked CTSU’s Richard Gray: “Will your study use the initial or retested results for those trials? How will your meta-analysis deal with the mixture of protocols for determining HER2 status across trials?” Gray replied indirectly: “One prime aim of the meta-analysis will be to investigate whether there is benefit in HER2 receptor equivocal patients, and we’ll collect results of all available local and central assays to look at this.”
However, arguably the real problem is how to perform a meta-analysis of randomized clinical trials where a main variable, HER2, was not controlled. Kornelia Polyak believes Herceptin works, but observed: “If you pick variable patients by definition you will have variable responses leading to confusion.”
6. Is HER2 a valid biomarker that predicts benefit from Herceptin in breast cancer?
Summary: The published literature no longer supports the validity of HER2 as a biomarker for Herceptin in breast cancer. Both US trials leading to FDA approval of Herceptin in early breast cancer later found HER2 did not predict benefit from Herceptin. In small subgroups, HER2-negative patients appeared to benefit more than HER2-positive patients, and Herceptin is now being tested in HER2-negative patients. Alternative biomarkers for Herceptin have been proposed but none accepted.
The FDA approved Herceptin in 2006 for early breast cancer based mainly on two US trials: NCCTG N9831 and NSABP B-31. Both trials later announced some patients had been misdiagnosed as HER2-positive, making it possible to examine clinical outcomes of patients negative for HER2 by central testing who had been treated with Herceptin.
In 2007, one year after helping win FDA approval of Herceptin, B-31 trialists reported neither FISH nor IHC predicted response to Herceptin: “No statistical interaction was found between DFS benefit from trastuzumab and levels of protein (p=0.26) or HER2 gene copy number (p=0.60).” However, although the B-31 trialists wrote of “no statistical interaction,” it appears that HER2-negative patients benefited more than HER2-positive patients. The subgroups were small but nonetheless the researchers reported significant values for each of them, turning the HER2 world on its head.
Figure: HER2-negative patients (FISH-negative, IHC 0-2+) benefited more from Herceptin than HER2-positive patients in NSABP B-31. (Adapted from Paik et al., 2007)
In 2013, B-31 trialists sought a new biomarker, reiterating that “HER2 itself failed to show predictive interaction…”
The second trial key to FDA approval, NCCTG N9831, corroborated B-31’s finding that FISH did not predict response to Herceptin. A 2010 re-analysis of N9831, again by the original investigators, found “Trastuzumab benefit seemed independent of HER2/centromere 17 ratio and chromosome 17 copy number,” i.e. independent of FISH.
The two trials which had established HER2 by IHC and/or FISH as the biomarker for Herceptin subsequently disestablished both. On this basis alone, HER2 would seem to no longer be a valid biomarker for predicting response to Herceptin. Logically, HER2 status no longer stands as a valid indicator for treatment with Herceptin. We have known this since 2010.
FISH and IHC might have found redemption in Hera, the trial that led to approval of Herceptin in the adjuvant setting in Europe and also supported FDA approval in the US. Post-approval, Hera trialists examined the relationship between degree of HER2 amplification by FISH and benefit from Herceptin. But they chose not to examine 41 patients with a FISH ratio under two, i.e. HER2-negative by FISH. “We deemed it inappropriate to analyze this small group,” wrote the investigators. Consequently, they could not say how FISH-negative patients responded to Herceptin and whether or not IHC by itself predicted response.
Without looking at the FISH-negative group, researchers continued to posit “a strong threshold effect whereby any degree of amplification above the cutoff ratio of 2.0 is of equal clinical significance.” However, the earlier B-31 analysis and N9831 found no threshold among a combined 330 FISH-negative patients. The Hera team simply looked away.
The Hera trialists also examined IHC staining intensity and clinical outcomes. This time, IHC-negative patients were not included, which prevented analysis of whether FISH predicted response to Herceptin. Hera investigator Mitch Dowsett explained, in June 2014: “Because of our policy on recruiting only centrally confirmed HER2-positive cases to Hera we were not in a position to do this.” However, apprised that Hera enrolled at least 299 centrally confirmed, HER2-positive patients who were IHC-negative, Dowsett revised his explanation. “I think ‘policy’ is overstating things. We could and maybe should have looked at this group in more detail previously.” But, “prompted by a UK pathologist,” rather than the failure of IHC to predict response in B-31 and N9831, Dowsett said the Hera trialists would examine the IHC-negative, FISH-positive subgroup.
In August 2015, more than a year later, I asked Dowsett how the project was going. “The work was conducted and a manuscript created,” he replied. But then the primary investigator, Bharat Jasani, “left for [a] job in Kazakhstan,” said Dowsett, stalling the investigation. I emailed Jasani and asked: “Were you examining IHC-negative, FISH-positive cases from Hera before leaving for Kazakhstan?” Jasani seemed to contradict Dowsett: “The simple answer is no and I would like to confirm once again that I have not examined at any time any IHC-negative, FISH-positive cases from Hera.”
Analyses of key subgroups in the Hera trial appear to have been avoided. As it stands, every re-test of any assay used to assess HER2 in the FDA approval-winning trials in early breast cancer found that that assay did not predict benefit from Herceptin, or that being HER2-negative predicted greater benefit.
Similar to the Hera trialists’ evasions, HER2 testing experts also avoided addressing HER2’s validity as a biomarker when I raised the issue with them in 2014. I emailed John Bartlett, at the Ontario Institute for Cancer Research, asking: “what established HER2 as a biomarker and what data informed the cutoff point for positive vs. negative?” Bartlett previously co-authored HER2 testing guidelines. His assistant replied: “John says he should be able to answer it via email.” However, Bartlett eventually wanted to speak by phone. When I requested email, the assistant wrote back: “Unfortunately Dr. Bartlett is unable to answer this question.”
The lead author of HER2 testing guidelines, Antonio Wolff, wrote me that FISH “Absolutely yes” predicts response to Herceptin even though N9831 and B-31 showed it did not. “I do fear that the dots you are connecting don’t quite tell a story,” Wolff said. Rather than explain, he wrote: “I think I will stop here.” He requested to speak by telephone but would not allow recording it: “Recording our conversation will not be ok and you do not have my permission.” He added: “My goal was to walk you through your questions informally as an expert source.” Arguably, Wolff declined to go on record to explain why FISH remained valid.
Reliably identifying HER2-positive patients might be impossible. According to Daniel Haber, at Massachusetts General Hospital, “Whether there are ‘true HER2’ tumors or not is up for discussion.” Instead of HER2 status predicting response to Herceptin, response to Herceptin determines who is HER2-positive. Said Haber: “the real definition is probably whether they [patients] respond to HER2 therapy or not…” But if so, HER2 is not a valid biomarker for Herceptin, and Herceptin has no valid biomarker, and prescribing Herceptin for HER2-positive patients makes no medical sense.
7. Does Herceptin’s mechanism of action depend on HER2?
Summary: Recent research suggests Herceptin does not block HER2 signaling, once considered its mechanism of action. No new mechanism of action has been clearly established. Some researchers believe Herceptin might work in HER2 0 patients, i.e. independently of HER2 status.
Herceptin does not block HER2 signaling
“[T]he talking points, the posters, the advertisements, are all about ‘HER2 blockade.’ It makes a good story, much simpler to understand, very pretty pictures, and nicely amenable to commercialization. Unfortunately it's not true.”
So wrote Mark Moasser, at the University of California at San Francisco, in email. More formally, in a published paper, Moasser wrote that Herceptin “was developed on the basis of 1980s understanding of HER2, and it is now clear that it does not actually inhibit HER2 signaling functions very well.” Tyrosine kinase inhibitors like lapatinib do block HER2 signaling but the clinical benefits of lapatinib are scant. Dual anti-HER2 therapy in which lapatinib is added to Herceptin showed no survival benefit in either the adjuvant or neoadjuvant settings as tested in the ALTTO and NeoALTTO trials. A lapatinib-only arm in ALTTO was closed early due to futility.
Additionally, there appears to be no consensus whether degree of HER2 positivity increases response to Herceptin, with greater amplification or overexpression leading to more pronounced clinical benefit. Also unexplained is how Herceptin might work in patients positive for HER2 by amplification but negative for overexpression. Krop and Burstein further fragment the HER2 edifice: in wondering “qui bono” or who benefits from Herceptin, they posit that “the mechanisms may differ in early- and late-stage breast cancer.”
Also, tumor cells appear to convert back and forth between HER2-positive and HER2-negative. Thus HER2 expression “identifies dynamic functional states,” according to Jordan et al. Interconversion may make tumors heterogeneous for any HER2 signal and might partly explain difficulties linking HER2 expression to any tumor phenotype.
HER2 may have nothing to do with Herceptin’s mechanism of action: “We don’t know that trastuzumab would not work in the adjuvant setting for HER2 0 patients,” according to Lou Fehrenbacher. Fehrenbacher is leading a trial, NSABP B-47, which tests Herceptin in HER2-negative patients. The trialists considered enrolling HER2 0 patients, but according to Fehrenbacher, this was deemed “too adventurous.” It might have undermined the entire HER2/Herceptin story. Instead of asking the question: “do Herceptin’s effects have anything to do with HER2,” B-47 answers the question: “Should HER2 low patients also be treated with Herceptin?” This represents a stepwise distancing from current orthodoxy rather than quick, complete abandonment. B-47, which only includes HER2 1+ or 2+ patients, might also result in a large increase in patients treated with Herceptin. According to Fehrenbacher:
“[T]he number of women with 1+ and 2+ non HER2-positive tumors in the US, is 4x the number with HER2-positive. So if the trial is successful the number of women benefiting from trastuzumab will rise to a level 500% of the current number.”
By contrast, a trial design including HER2 0 patients might have shown no relationship between Herceptin and HER2 or perhaps an inverse relationship like the re-analysis of B-31.
B-47 represents an opportunity to test both whether Herceptin works in HER2 0 patients and whether the degree of HER2 positivity predicts greater benefit from Herceptin. B-47 should add an arm of HER2 0 patients allocated to Herceptin or placebo. In addition, a partial arm of 3+ patients should be added, all receiving Herceptin, to allow comparison of the drug’s effect across the range of HER2 positivity, from 0 to 3+.
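The arithmetic behind Fehrenbacher’s 500% figure can be checked with a minimal sketch. The function name and parameters below are illustrative, not from any source; the only input taken from the quote is the 4x ratio, and the sketch assumes every newly eligible patient would be treated.

```python
def expanded_population_percent(current: float, ratio_new_to_current: float) -> float:
    """Treated population after expansion, as a percentage of the current one.

    Assumes all newly eligible patients are treated (an illustrative simplification).
    """
    newly_eligible = ratio_new_to_current * current
    return 100.0 * (current + newly_eligible) / current


# With the newly eligible group four times the size of the current one,
# the total is five times the current number.
print(expanded_population_percent(current=1.0, ratio_new_to_current=4.0))
```

Note that “500% of the current number” corresponds to a 400% increase, since the current patients are counted in the new total.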
There is no agreed upon alternative mechanism of action for Herceptin. A leading but unproven candidate is antibody dependent cellular cytotoxicity (ADCC). The current FDA label says “Herceptin is a mediator of antibody-dependent cellular cytotoxicity,” but only based on in vitro evidence. For clinical evidence, Mark Moasser pointed to “A recent landmark study… that showed response/resistance to trastuzumab is powerfully predicted by the immunological signature.” This re-examination of NCCTG N9831 found “that immune function genes are strongly linked to clinical outcome.” The authors proposed a complicated signature comprised “of any nine or more of 14 immune function genes at or above the 0.40 quantile for the population.”
But a critique by NSABP B-31 trialists found that randomly selecting any 14 genes at any expression level resulted in an interaction probability of less than 0.01 in 92% of 10,000 runs conducted using data from 731 patients in B-31. Consequently, “the conclusion that immune-related genes are driving the observation may not be valid because this criterion can be eliminated without effect on model performance.”
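To make the proposed signature rule concrete, here is a minimal sketch of how “any nine or more of 14 genes at or above the 0.40 quantile for the population” could be evaluated for one patient. The data layout, function names, and the linear-interpolation quantile method are assumptions for illustration; the published analysis may have computed quantiles differently.

```python
from typing import Dict, List


def quantile(values: List[float], q: float) -> float:
    """Quantile by linear interpolation between order statistics (assumed method)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def signature_positive(expr: Dict[str, List[float]], patient: int,
                       genes: List[str], min_genes: int = 9,
                       q: float = 0.40) -> bool:
    """True if the patient is at or above the q-quantile in at least min_genes genes.

    expr maps each gene name to expression values indexed by patient (hypothetical layout).
    """
    hits = sum(1 for g in genes if expr[g][patient] >= quantile(expr[g], q))
    return hits >= min_genes
```

Because the rule accepts any nine of the 14 genes and uses a low cutoff, many different gene sets can satisfy it. That is the kind of flexibility the B-31 critique probed by substituting randomly selected genes.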
ADCC remains a hypothesis. “As far as I’m concerned, the jury is still out whether Herceptin works by ADCC, through other indirect mechanisms, or through interrupting some signaling pathway,” according to Bert Vogelstein. “If it does involve ADCC, it would have to discriminate between low and high amounts of cell surface ERBB2 protein.” It is “not so obvious” how Herceptin might do this given the widely varying levels of ERBB2 protein on cancer cells even in HER2-positive tumors.
In 2010, Edith Perez recommended against Herceptin for HER2-negative patients. One reason: “It doesn’t make any biological sense,” according to Perez. If Herceptin’s mechanism of action is independent of HER2, seemingly it would not make biological sense to recommend Herceptin for HER2-positive cases.
Ultimately, however, Mark Moasser is not worried by there no longer being an agreed upon mechanism of action for Herceptin: “At the end of the day, it doesn’t really matter what the mechanism is, as long as it works.”
8. Would the FDA approve Herceptin today?
Summary: Herceptin’s approval in the metastatic setting benefited from a new FDA fast track. The single phase III trial providing the basis for approval underwent extensive mid-trial modifications—adding different treatment arms and unlike patients—practices that are no longer permitted. Avastin later faced a different FDA process which led to revocation of its approval.
In early breast cancer, had Herceptin’s FDA application been based on the re-analyses of NCCTG N9831 and NSABP B-31, it presumably would have been rejected. The initially impressive results presented to the FDA likely required the heavy modifications made to them, including merging N9831 and B-31 together while dropping one arm which showed no survival benefit for Herceptin. It is unlikely such changes would be allowed today. The trials were enabled and shaped by new NCI policies that allowed cooperative groups like NCCTG and NSABP to conduct phase III trials in support of FDA approval while permitting those groups to accept funding from pharmaceutical companies.
Herceptin in metastatic breast cancer
Genentech’s Herceptin first won FDA approval in 1998 for metastatic breast cancer. With the process taking just five months, Herceptin benefitted from being the second drug to come off a new FDA fast track. Public perception at the time was that an approval logjam was blocking life-saving cancer drugs from reaching patients. But going faster required relaxing standards. On the fast track, “potential” effectiveness was to be considered with the standard only that “potential effectiveness of the treatment should outweigh its toxicities.” These educated guesses would not necessarily have to be checked later: “A post-approval study will not necessarily be required in the exact population for which the approval was granted.”
Trial H0648g provided the basis of Herceptin’s FDA application. But according to the FDA review, “multiple major changes in the protocol were enacted during the conduct of the study.” The biggest mid-course change added entirely new arms to the trial after enrollment of only about 100 patients. The original design tested Adriamycin and cyclophosphamide (AC) against AC + Herceptin (H). The new arms tested a taxane (T) against T + H.
The new arms were then pooled with the original arms—to the chagrin of the FDA. It considered patients in the AC and T arms as “clinically distinct.” The taxane cohort represented a “different prognostic group” than the AC patients and “baseline characteristics differed markedly between paclitaxel and AC patients regardless of assignment to Herceptin therapy or not.” However, the FDA acquiesced on pooling.
Remarkably, as arms were added, the double-blind with placebo design was dropped and the trial became open label. “Patients and investigators object to the placebo,” said the FDA, again accepting a fait accompli.
The trial found adding Herceptin to AC made no difference in overall survival. Similarly, in the new taxane arms, adding Herceptin did not increase survival. But the pooling of the AC and taxane arms, which the FDA had frowned upon, produced a statistically significant overall survival benefit, albeit with a confidence interval touching 1.0. Absent the large, mid-course alterations to the trial, Herceptin would have shown no survival benefit.
Avastin, also from Roche/Genentech, lost its FDA approval for treating breast cancer after post-approval trials failed to demonstrate a survival benefit. Genentech proposed Avastin for treatment of metastatic HER2-negative breast cancer. As with Herceptin earlier, an accelerated FDA application for Avastin relied on an open label trial, E2100. Although the FDA initially approved Avastin, the review scolded the drug sponsor: “Genentech did not meet with FDA to reach agreement on the design of Study E2100 prior to study initiation.” The FDA found a host of problems with trial E2100 including the open label design and loss of patients to follow-up:
“[T]he effect on PFS by an independent group, masked to treatment assignment, was not implemented during the conduct of the trial. Retrospective analyses by an endpoint review team masked to treatment assignment to independently confirm the E2100 results was marred by substantial loss to follow-up prior to the independent review team’s confirmation of disease progression.”
In addition, the lack of independent review led to investigator bias toward Avastin. According to the FDA, “the discordance rates are slightly different for the two study arms, with the difference favoring the PAC/Bev [Paclitaxel/Avastin] arm over the PAC arm in ECOG investigator-determined assessment of PFS.” The FDA also looked at missing data and found that a worst case analysis resulted in “elimination of the treatment effect altogether.”
The FDA examined financial ties to the sponsor and found five of the sixteen members of the data monitoring committee received payments greater than $25,000 from Genentech. A sixth reported compensation that “could be affected by the study outcome.” In addition, “Eight out of 26 investigators (30%) who provided financial disclosure in the E2100 study administration body and data monitoring committee reported financial conflict of interest for receiving payment from pharmaceutical companies.” One of the study co-chairs “failed to reply to the Financial Disclosure requests.”
For Herceptin, by contrast, the FDA did not examine financial ties of trial investigators. But when the study was published in the New England Journal of Medicine, nine of the 12 authors reported relationships with Genentech. The FDA allowed arms to be added mid-trial for Herceptin whereas for Avastin, simply starting a trial without it being OK’d by the FDA drew censure.
Subsequent testing of Avastin in a double-blind, placebo-controlled design required by the FDA found no overall survival benefit, and the FDA revoked its approval of Avastin for breast cancer. For Herceptin to show a survival benefit in the metastatic setting had required pooling of arms the FDA regarded as distinct in an open label design.
Herceptin’s approval hurdles were lower; it might not have met later, higher standards.
In early breast cancer, the FDA approved Herceptin in 2006 based on a joint analysis of NCCTG N9831 and NSABP B-31. But by 2007, B-31 trialists reported that HER2 didn’t predict response to Herceptin, whether measured by IHC or FISH: “No statistical interaction was found between DFS benefit from trastuzumab and levels of protein (p=0.26) or HER2 gene copy number (p=0.60).” And although the authors reported no statistical interaction, HER2-negative patients appeared to benefit more than HER2-positive cases. Corroborating B-31’s results, N9831 trialists reported in 2010 that FISH did not predict response to Herceptin: “Trastuzumab benefit seemed independent of HER2/centromere 17 ratio and chromosome 17 copy number…”
Had these results been presented to the FDA when considering the application for Herceptin in early breast cancer, the application presumably would have been rejected.
What the FDA saw in the Herceptin application was a single successful trial, which was actually made from two studies merged together, with one arm discarded. The unplanned changes were made while the trials were in progress.
In N9831, Arm B tested sequential Herceptin in roughly one thousand women and ultimately showed no overall survival benefit from Herceptin: five-year survival for arm B was 89.3% versus 88.4% in the control arm. Arm B was dropped when N9831 was joined to NSABP B-31.
Although the FDA went along with merging the trials, the oncology community was divided. In 2006, one specialist noted that “In terms of combining the data from the two trials, some oncologists were initially questioning whether that was legitimate.” Sandra Swain, who was at NCI when the trials were joined, answered it was “clearly legitimate.” She asserted that the trials were combined because they were going well: “No one had any idea that we’d have the benefit that we do.” However, joining trials increases statistical power, enabling detection of weaker effects while, obviously, dropping a low or non-performing group of patients might have enhanced the perceived effects of Herceptin in the remaining arms.
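The point about statistical power can be illustrated with a rough sketch. It uses a normal approximation for comparing two proportions, which is a simplification: the actual trials analyzed time-to-event endpoints. The function name, the survival proportions, and the arm sizes are hypothetical and chosen only to show the direction of the effect.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control: float, p_treated: float,
                 n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for a difference in proportions.

    Normal approximation; the negligible opposite-tail probability is ignored.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treated * (1 - p_treated) / n_per_arm)
    z_effect = abs(p_treated - p_control) / se
    return NormalDist().cdf(z_effect - z_alpha)


# Hypothetical numbers: the same small effect, tested in one trial
# and then in a pooled population twice the size.
single = approx_power(0.85, 0.88, n_per_arm=1000)
pooled = approx_power(0.85, 0.88, n_per_arm=2000)
print(single < pooled)
```

For a fixed effect size, the pooled design always has the higher approximate power, which is why merging trials makes weaker effects detectable.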
An FDA spokesperson offered conflicting answers in 2014 regarding whether the individual trials would have met their endpoints, initially saying: “The FDA cannot speculate on if the trials would or would not have met their original endpoints.” But subsequently the spokesperson speculated that the trials would have been “likely to demonstrate efficacy as individual trials…”
No results for B-31 have been published. A number of researchers suggested in a letter, “Trastuzumab: possible publication bias,” published by the Lancet in 2008, that the results of the individual trials should be published separately. Asked in 2014 for efficacy data, NSABP’s Soon Paik declined, saying only that “they are essentially the same as what is in the combined analysis.”
A meta-analysis of Herceptin being conducted by the Clinical Trials Services Unit (CTSU) at Oxford will include all N9831 patients, including Arm B. According to CTSU’s Richard Gray: “We will analyse the combined concurrent and sequential trastuzumab arms versus no trastuzumab from the 3-way randomisation periods of N9831,” as well as the concurrent and sequential arms separately.
The N9831 and B-31 trials were conducted by cooperative groups, the North Central Cancer Treatment Group (NCCTG) and the National Surgical Adjuvant Breast and Bowel Project (NSABP) respectively. Originally, cooperative groups were funded by NCI. However, in 2000, NCI allowed cooperative groups to accept industry funding. And two years before, the FDA said it would accept trials performed by cooperative groups to support applications for FDA approval. Previously, cooperative groups mostly conducted phase II trials. Arguably, these decisions transformed a public and publicly funded system into one dominated by pharmaceutical companies. B-31 and N9831 were started around the time of the new NCI and FDA policies, in July 1999 and April 2000.
In 2002, the FDA sought to tighten a number of clinical trials policies, but they were opposed by the cooperative groups. NSABP, joined by NCCTG, challenged the FDA reforms in a letter signed by John Bryant, the statistician for the joint N9831/B-31 trial. The FDA had sought to treat the cooperative groups as a sponsor, perhaps because they had begun receiving industry funding. Also, the FDA wanted more blinding of study teams and greater independence of statisticians preparing reports. But the NSABP letter answered that “it will not be practical to arrange for statisticians independent of the Cooperative Groups to prepare and present interim reports…” There were too many trials and “simply not enough qualified personnel available to do so.” The cooperative groups claimed these and other proposed changes would have a “substantial negative impact” on clinical trials including even patient safety.
Concern about industry funding of previously trustworthy cooperative groups surfaced at a 2009 NCI workshop on “Multi-Center Phase III Clinical Trials and NCI Cooperative Groups.” As one participant said:
“If we do not have a robust independent review of these trials, the criticism will be raised quite quickly that these trials are being done by industry and that public dollars should not pay for them. What will protect these trials is that they have a very robust independent review, not just a cooperative group–only review.”
The FDA audited none of the US Herceptin trials. According to the FDA medical review of the joint N9831/B-31 trial, “A DSI [Division of Scientific Investigations] inspection was not performed for this application; given the large number of sites and small percentage of patients enrolled at any individual site, no single study or limited number of sites would have substantial impact on the study results.” If multiple sites and widely distributed patients protect against improprieties, then perhaps no phase III trial would ever need to be audited.
The FDA was unable to confirm that the cooperative groups audited their Herceptin trials: “Because of the nature of the conduct and reporting of the clinical site audits, it cannot be determined whether a specific study was audited during the clinical site inspection…” A statement by the sponsor about audits provided “no information on the actual results of site audits,” according to the FDA review of Herceptin.
(I suggested to Richard Gray that the CTSU Herceptin meta-analysis could attempt to reproduce the results of the individual studies as one kind of check on the un-audited trials.)
The possibility of investigator bias was not examined. The FDA “did not request confirmation of the events by an independent endpoint assessment panel that was masked to treatment assignment.” The FDA reported “approximately 4% of the population in the ITT efficacy dataset had missing information with respect to surgical type, nodal status, hormone receptor status, tumor size, histological grade and histologic type.” However, the FDA did not examine whether the gaps could have influenced trial endpoints, whereas in the case of Avastin, a worst case analysis found that missing data eliminated the reported treatment effect.
In 2014, the FDA modified the Herceptin label to state for the first time that the drug increases overall survival in early breast cancer. However, the benefit was found in an “efficacy evaluable” population rather than the gold standard, intention to treat population (ITT). In an April 2014 conference call, the FDA asserted that the ITT and efficacy evaluable populations were identical and that the sponsor, Roche/Genentech, requested that the label read “efficacy evaluable.” Why a pharmaceutical company would request a lower grade of evidence for the lifesaving benefits of its drug is not clear.
Also, in the joint trial, disease free survival falls while overall survival climbs. Perhaps only Provenge demonstrates a similar pattern among cancer drugs. Provenge does not enjoy the same reputation for efficacy as Herceptin.
“It is what [it] is,” N9831 statistician Vera Suman wrote in email.
Joint N9831/B-31 trial results over time, including the hazard ratio (HR) for disease events. (Source: Vera Suman personal communication, 18 October 2013)
Also, in the final report on the joint study, years of median follow-up took an unusually large, 4.5-year leap in the space of approximately one calendar year. Rebecca Gelman, statistician at the Dana Farber Cancer Institute, brought this to my attention in 2013:
“As a side comment, this all leads me to wonder about the ‘8.4 years of follow-up’ in the 2012 SABC abstract, since it is so much longer than the 2011 JCO paper. Either someone did a big update of survival in 2012 (by calling all the patients or by checking the National Death Index), or else the SABC abstract was reporting OS at a time past the median survival.”
Another statistician described the leap in follow-up as “impossible,” saying that median follow-up usually goes up about one year for every calendar year. The FDA said the difference might be explained by the data lock dates for the two papers. However, the agency didn’t provide dates that would allow verifying their explanation.
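The statistician’s rule of thumb can be seen in a minimal sketch, under the simplifying assumption that the same closed cohort is followed to each data lock with no new enrollment and no earlier censoring. The enrollment and lock dates below are invented for illustration and are not the trials’ actual dates.

```python
from datetime import date
from statistics import median
from typing import List


def median_follow_up_years(enrolled: List[date], data_lock: date) -> float:
    """Median time from enrollment to data lock, in years.

    Simplification: every patient is followed until the lock date.
    """
    return median((data_lock - d).days / 365.25 for d in enrolled)


# Hypothetical enrollment dates for a closed cohort.
cohort = [date(2000, 4, 1), date(2001, 6, 15), date(2002, 9, 30),
          date(2003, 1, 10), date(2004, 12, 1)]
first = median_follow_up_years(cohort, date(2011, 1, 1))
second = median_follow_up_years(cohort, date(2012, 1, 1))
print(round(second - first, 2))
```

Under these assumptions, the median rises by exactly the time elapsed between the two locks. A jump several times larger would require something else to change, such as the data lock dates being much further apart than the publication dates or a large update of patient status, as Gelman suggested.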
9. Can trial estimates of survival increases be squared with population-level survival figures?
Summary: Some medical researchers have suggested that therapies containing Herceptin may cure breast cancer. A Genentech-funded study estimated Herceptin saved 156,413 total life years in the United States from 1999 to 2013 for metastatic breast cancer alone. However, NCI reports only a 1.1% increase in five-year survival over a similar period. Estimates of Herceptin’s life-extending benefits should be compared to population-level figures.
HER2 prevalence at the population level is only 15%, according to NCI, well below early estimates of 25-30%.
Impact of Herceptin on five-year survival at the population level
At the 2012 SABC, presenting joint N9831/B-31 results, co-primary investigator Edith Perez advanced the idea that Herceptin cures breast cancer: “We believe that the data support the concept that many patients who present with HER2-positive breast cancer may be cured with combination strategies.” Herceptin had come a long way. Dennis Slamon, a main progenitor of Herceptin, originally believed that, by itself, Herceptin was only cytostatic, halting tumor growth which “resumed on termination of antibody therapy, indicating a cytostatic effect.”
According to a Genentech-funded study, Herceptin has saved 156,413 total life years in the United States from 1999 to 2013 for metastatic breast cancer alone. But it is unclear if population level statistics corroborate Herceptin’s curative powers. It is not known whether five-year survival has increased as much as it would need to in order to match the Genentech-funded estimate of years of life added. “We did not try to triangulate our results to the overall population,” said corresponding author of the study, Mark Danese.
In the overall population, according to NCI’s Jenny Haliski, “we are seeing a small increase in survival since 1998,” the year of Herceptin’s first FDA approval. Haliski is NCI Media Branch Chief. “Part of this increase can be attributed to improvements in treatment,” said Haliski. However, clinical trials are conducted in “ideal situations and usually include younger patients without comorbidity,” according to Haliski. “Thus, treatment efficacy in a clinical trial is usually higher than treatment effectiveness at the population level.”
It ought to be possible and instructive to decompose the 1.1% increase in five-year survival from 1999 to 2012 to determine the contribution from Herceptin. As I have suggested to Richard Gray, CTSU’s meta-study could and perhaps should try to square its estimate of Herceptin’s benefits with population-based survival figures.
Herceptin won FDA approval for metastatic breast cancer in 1998 and early breast cancer in 2006. (Chart source: National Cancer Institute, SEER Cancer Statistics Review 1975-2013, Table 4.13, all ages, all races)
10. Can the Cleopatra and Marianne trials be reconciled?
Summary: The Cleopatra trial, which added pertuzumab to Herceptin and a taxane, produced the largest survival increase of any clinical trial of Herceptin, nearly 16 months. But the Marianne trial seems to contradict Cleopatra. Marianne tested a version of Herceptin, T-DM1. Adding pertuzumab provided no more clinical benefit than T-DM1 alone. The phase II NeoSphere trial of pertuzumab and Herceptin also did not produce the remarkable results of Cleopatra.
Adding pertuzumab to Herceptin and a taxane in the Cleopatra trial yielded a remarkably large increase in survival, nearly 16 months longer than the standard of care, Herceptin + taxane. However, the Marianne trial seems to contradict Cleopatra: an arm testing pertuzumab with the Herceptin-based T-DM1 did no better than Herceptin + taxane. As a notice on the ASCO website said: “the addition of pertuzumab to T-DM1 provided no efficacy benefit.” T-DM1 conjugates the cytotoxic emtansine to the Herceptin antibody.
Similarly, in the neoadjuvant setting, adding pertuzumab to Herceptin showed no benefit in the NeoSphere trial which reported that “progression-free survival and disease-free survival at 5-year follow-up show large and overlapping CIs.” Pertuzumab by itself showed very little single agent activity in a phase II trial, so the benefit of combination with Herceptin is presumably synergistic. Why would it not also be synergistic with T-DM1?
Paul Ellis, who led the Marianne trial, pointed to a “number of possibilities and probably a mix of a number of issues” that explained why pertuzumab showed no benefit. In Cleopatra, said Ellis, “patients have Taxol/ Taxotere as a backup” if they do not respond to Herceptin. However, T-DM1 by itself performed just as well as Herceptin plus a taxane. No “backup” needed, and the question is why including pertuzumab added nothing in Marianne.
Ellis also observed that the “Herceptin dose per week [was] higher than T-DM1.” Yet the dose of T-DM1 was apparently high enough to perform as well as H + T. And there does not appear to be support for another trial with a different dose. Said Ellis, T-DM1 “will now never see the light of day” in early breast cancer.
Also figuring in Ellis’ possibilities were “slightly different patient populations.” However, the differences would need to be extreme rather than slight: no response at all to pertuzumab in Marianne and incredible life-extending responses among Cleopatra patients.
That leaves the idea that “maybe [T-DM1] binds differently and alters configuration in a different way” than Herceptin. However, Ellis acknowledged this directly contradicted expectation: “Every senior clinician I know in his area expected Marianne to be positive!” In addition, prior to Marianne, one research group reported “T-DM1 plus pertuzumab resulted in synergistic inhibition of cell proliferation and induction of apoptotic cell death” while another found “Trastuzumab-DM1 (T-DM1) retains all the mechanisms of action of trastuzumab.” Commented Ellis: “I think this study [Marianne] has forced them to go back into the lab and try and understand it better.” According to Ellis, “even the guy at Genentech who invented both Pertuzumab and T-DM1 can’t really understand why” pertuzumab did nothing in Marianne.
In other words, it appears that conjugating emtansine to Herceptin completely cancels synergy with pertuzumab, although both drugs were designed by the same person. Alternatively, Marianne disconfirms the results of Cleopatra.
I also asked Allan Lipton about the Cleopatra-Marianne dissonance. Lipton replied: “I do not think I am the right person to answer your Cleopatra questions.” But Lipton has co-authored several papers on alternative assays for determining HER2 status and investigated HER2:HER3 dimerization and pertuzumab. I replied to Lipton: “I wonder if you aren’t the ideal person to answer such questions.” He demurred: “I don't think I have any answers for you on these observations from clinical trials.”
Although pertuzumab is frequently described as completing the blockade of HER2 and HER3, according to Mark Moasser, “pertuzumab doesn’t interfere with dimerization when HER2 is overexpressed.” HER2 overexpression has been thought of as the sine qua non of HER2-positive breast cancer. Moasser emphasized that it is “very true” that pertuzumab doesn’t block HER2 signaling when HER2 is overexpressed. Instead, “trastuzumab and pertuzumab work through immunologic mechanisms in HER2-positive cancers, and two antibodies provides double the tumor cell coverage and better immunologic targeting by the immune system.” He added: “This is not universally accepted by everyone but at this point the data is pretty clear to me and many others.”
Moasser attributes the disappointing performance of pertuzumab + T-DM1 in Marianne to the absence of a taxane: “I would say it's because taxol (or taxotere) is so effective, it’s not a shortcoming of T-DM1.” Paul Ellis advanced a similar argument. However, T-DM1 by itself performed as well as Herceptin and a taxane. In fact, progression-free survival with T-DM1 alone was higher, 14.1 months vs. 13.7 months, although not significantly so. But adding pertuzumab to T-DM1 did nothing.
According to Moasser:
“Chemos have a 12-hour high concentration exposure and cause a lot of tumor cell kill in a short time leading to release of many cellular antigens, etc. T-DM1 provides continuous exposure and there is incremental tumor cell killing day-by-day rather than mass killing on one day. That may be less immunogenic than the chemo method.”
However, emtansine provided enough immunologic kick for T-DM1 to equal the clinical benefits of Herceptin and a taxane. Thus Moasser’s explanation for the futility of pertuzumab seems to require that pertuzumab has different immunological prerequisites than T-DM1.
In the trial which won Herceptin initial FDA approval, adding a taxane to Herceptin delayed disease progression by 3.9 months, while in Cleopatra, further adding pertuzumab to the regimen added nearly 16 months. This quite massive effect is unexplained. Said Moasser: “I don’t claim to know all the nuances of how chemo and immunology interact with each other, and frankly nobody really does, the field is still in its infancy.” The pharmacologists, however, have somehow hit a home run with Herceptin and pertuzumab although swinging as if with eyes closed.
With EGFR inhibitors in lung cancers or BRAF inhibitors in melanomas, the mechanisms of action are clear as are the clinical results. However, said Bert Vogelstein, “we do not know how or why Herceptin works,” and the conflicting results of Cleopatra and Marianne show “that all conclusions or predictions are on thin ice,” according to Vogelstein.
The HER2 and Herceptin story used to be simple and compelling: we knew who it worked for and why. Now we don’t, despite nearly two decades of learning. The current balance of scientific evidence arguably no longer supports the idea of a HER2 subtype in breast cancer.
There is conflicting evidence whether HER2 is even transforming and whether it drives breast cancer. Also, the Ross et al. literature reviews supporting the prognostic role of HER2 are especially dubious. (Those papers should be corrected or retracted.) At present, the view that HER2 is prognostic is unsupported.
Although medical diagnostics have gray areas, the reproducibility of HER2 testing appears to be in a range where it perhaps should not be considered scientific. Different pre-analytic conditions, different assays, different subjective assessment criteria, tumor heterogeneity and the lack of any gold standard lead to conflicting results which are resolved arbitrarily. That the Hera trialists evade or perhaps even dissimulate regarding investigations of particular subgroups that could help validate or further discredit FISH and IHC might point to a widening disparity between appearance and reality. The main Herceptin orthodoxy has broken down completely: Herceptin does not block HER2 signaling and its mechanism of action might have little or nothing to do with HER2.
Nonetheless, Krop and Burstein contend: “Beyond a doubt, trastuzumab works.” Yet absent questionable modifications to key trials, Herceptin might not have won FDA approval. Avastin, which lost FDA approval, also works for some breast cancer patients, but there is no biomarker to predict response. The published literature demonstrates that HER2 does not predict response to Herceptin, leaving Herceptin without a valid biomarker. To paraphrase Daniel Haber: “HER2-positive” just means “responds to Herceptin.” Even HER2-negative patients can benefit, perhaps even more than HER2-positive patients. Seemingly, either all breast cancer patients should get Herceptin or none should, the latter option representing the FDA’s decision for Avastin.
At present, the standard of care is for all breast cancer patients to be tested for HER2. The tests suffer very considerable reproducibility problems. In addition, based on the re-analysis of clinical trials leading to FDA approval, HER2 doesn’t predict response to Herceptin. We don’t know who should get Herceptin but current guidelines pretend otherwise with HER2 tests that are too much like divining rods.
The clinical benefits of Herceptin might be smaller than thought. The modifications of the trials leading to FDA approval might have artificially pumped up the drug’s benefits. But in addition, at the population level, five-year survival has only increased about 1.1% since the introduction of Herceptin. Converting that modest rise into median number of months of increased survival per Herceptin patient might be instructive—perhaps corrective—of strong claims regarding the curative powers of Herceptin-containing treatment regimens.
The Cleopatra trial reported the largest increases in survival of any Herceptin trial ever. The addition of pertuzumab to Herceptin and a taxane pushed median survival up by an incredible 16 months, whereas adding the supposed workhorse of the two, Herceptin, to a taxane produced only a 4-month rise. Furthermore, in the Marianne trial, adding pertuzumab to the Herceptin-based T-DM1 did no better than T-DM1 alone, adding zero months of survival instead of 16. Worryingly, researchers who might be able to explain the seemingly contradictory results are silent. Somewhat as with conflicting HER2 assessments, researchers and physicians can just choose what to believe.
A kind of HER2 fundamentalism has taken hold as foundational truths have broken down: “clinicians should rely on established markers of HER2 expression for selecting patients,” suggested Krop and Burstein. But those very same biomarkers are what have been dis-established. Also, “established” does not mean valid; rather, physicians are counseled to use the old knowledge from when the HER2/Herceptin story was compelling and coherent.
Like efforts to keep the earth at the center of the solar system, complicated epicycles have been devised to hold on to HER2 orthodoxies. A simpler explanation might better fit the contradictory evidence: while HER2 overexpression and amplification are real phenomena, there might not be a clinically meaningful HER2 breast cancer subtype.
Summary of Recommendations
- Reconduct the experiments addressing whether HER2 is transforming in mouse cell lines
- Add arms to B-47. Include HER 0 patients, receiving either Herceptin or a placebo, and a partial arm of HER2 3+ patients, all receiving Herceptin
- Decompose the 1.1% increase in five-year breast cancer survival from 1999 to 2012 to determine the contribution from Herceptin
The Herceptin meta-analysis being conducted by the Clinical Trials Services Unit at Oxford should:
- Attempt to reproduce findings of the individual studies, including the joint N9831/B-31 trial that led to FDA approval of Herceptin in early breast cancer
- Estimate the likelihood of assessed HER2 status being correct, if that is possible
- Allow confidence intervals around Herceptin’s clinical benefits to reflect estimated HER2 test accuracy
- Check estimates of Herceptin’s contribution to overall survival against population-based survival figures
A map from the Malaria Atlas Project, modified and superimposed on a photograph of Maarten Vanden Eynde’s “IKEA Vase”
The Malaria Atlas Project (MAP) found that human interventions this century averted fully 663 million cases of the disease. “Malaria in Africa,” according to MAP, “has halved since the turn of the millennium.”
MAP’s interactive application visually depicts human triumph over disease, malaria driven back, year after year. But is the triumph real or a special effect? More broadly, is malariology accurately representing reality or is it giving malaria a makeover?
Both the visual aspects and the science of MAP invite scrutiny and raise questions. What the maps show sometimes diverges from what the data actually say, for example. And MAP's data sometimes contradict the World Malaria Report when they ought to be nearly the same. It is doubtful that MAP accounted for age shifting while it is certain that MAP did not model the impact of an epidemic of insecticide resistance on the effectiveness of insecticide-treated bed nets. Both decisions might lead to an overestimate of human progress against malaria. Indeed, a different set of choices might show malaria is now resurgent rather than falling.
Images and science are being tweaked elsewhere in the malaria world. A paper in the Lancet on insecticide resistance presents a map that may have been improperly manipulated. In a separate study of insecticide resistance, a senior author "muted" the finding that resistance substantially reduced the protective benefit of bed nets. In addition, estimates from malaria researchers of the economic benefits of malaria have jumped implausibly from $0 in 2010 to $4 trillion today. Malariologists are also going as far as saying that artemisinin-resistant malaria is spreading in Southeast Asia and threatens a leap to Africa when current published evidence does not support this contention.
A Lancet ombudsman fended off criticism of one publication, saying: “the paper conveys information that suffices for the message,” a philosophy that misinforms too much malaria research.
These dissimulations may be well-intentioned, but they are not science.
Malaria Atlas Project (MAP)
Modelers make choices that shape the model. A few shards from an IKEA coffee mug became an amphora (pictured above) by the hands of artist Maarten Vanden Eynde. Similarly, the actual shape of malaria’s burden is ambiguous. Shards of malaria incidence data are so scarce that the World Health Organization (WHO) said it can't tell if cases are rising or falling in 32 of the 45 countries in the Africa region.
The MAP visualization mostly displays modeled estimates, not data. Importantly, MAP relies not on reports of malaria cases (which tend to be few and dubious) but parasite prevalence surveys. These surveys test for malaria parasites in blood samples taken from people in numerous different locales over time. MAP combines geo-located survey information with many other factors, like satellite weather data, all processed by minutely engineered statistical methods. Along with the visualization, MAP and other malaria researchers (Bhatt et al.) produced a numerical summary, published in Nature last September: “The effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015.”
But for countries like Madagascar, the map and the numbers used in the paper disagree. MAP displays malaria cases estimated from parasite prevalence surveys while, behind the scenes, the aggregate statistics for malaria cases in Bhatt et al. are based on country data.
Maps of Madagascar that should show about the same amount of malaria, based on the bar graph (top), but do not
According to the bar graph, average malaria incidence in Madagascar was roughly 69 cases per thousand people in both 2005 and 2014. The map for 2014, however, clearly shows far less malaria than in 2005. Peter Gething, corresponding author for Bhatt et al., confirmed the disparity, saying it is “entirely correct that there is a mis-match between the time series and the map.”
More than mis-matched, the map and data tell contradictory stories. Visually, malaria in Madagascar appears to be getting crushed. But the data—considered by the researchers as more reliable—say malaria has been rising since 2008, approaching the same level seen in 2000. Malaria in Madagascar looks much better than scientists believe it really is.
Gething, who leads the MAP effort, explained that for some countries “it makes far more sense to base estimates of cases on the country-reported data.” He added: “The list of countries for which the second approach was used is listed in the paper if you are interested.” But the list is not in the paper. Gething did not reply to multiple email requests for the list. However, after the editors of Nature intervened, the MAP tool was changed to list the 11 countries handled like Madagascar where the country data “may not correspond to the parasite-rate derived maps.”
The text in red is a post-publication clarification for MAP (Source: Malaria Atlas Project)
Gething said the 11 countries had smaller malaria burdens and better health systems, leading to more reliable reporting. “Many were unambiguous but inevitably some were borderline situations where arguments could be made for either approach.” But the position of the borderline might have decided the conclusions of the paper. Based on WHO-published country data, reported malaria cases appear to be rising in up to 28 African countries. If some or all of those 28 countries had been chosen, Bhatt et al. might have found a malaria resurgence. Gething would not further detail the criteria used to select the 11 countries.
In concussion research, the NFL stands accused of cherry picking data to produce a milder picture of head trauma. As one critic put it, in excluding unflattering data, “You’re not doing science here; you are putting forth some idea that you already have,” like Maarten Vanden Eynde choosing to make an amphora from fragments of a coffee mug.
Of concern, MAP used country reporting for Gambia, Mauritania and Senegal, three countries which WHO categorized as not having assessable country data. Also puzzling, Senegal has a relative abundance of parasite prevalence surveys. (See figure 2 in the Bhatt et al. supplement.) Senegal even contributed to the much more rarefied data used to transform parasite prevalence into malaria case estimates.
There are presumably good reasons to use country data for Senegal, but Gething would not say what they were. Also, it is not clear if MAP used all the available parasite prevalence surveys or, NFL-style, an unspecified subset. Again, Gething would not say. (He answered 2 of 13 emails which I sent over a three-month period.)
More concerning, what Gething described as “official country-reported data” used by MAP differs radically at times from similar data published by WHO in the 2015 World Malaria Report (WMR). Gething said “some adjustments for known under-reporting or missing data” were applied to the country data. But for Rwanda in 2014, MAP and the WMR present very different pictures of what is happening, although both are based on some form of national reporting.
WHO shows malaria surging in Rwanda (top graph, orange line) whereas MAP (bottom graph) shows malaria tailing off in recent years. (Note: The time axis for the MAP chart runs from 2000 to 2015, one more year than the WHO chart.)
A press account corroborates a sizable malaria resurgence in Rwanda: “Malaria cases in Rwanda rose at 68.6% last year  to reach 1,598,076, against 947,689 cases last year; According to figures released by the Rwandan Ministry of Health.”
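The percentage in the press account can be checked against the two case counts it quotes. This is a minimal sketch, assuming that 947,689 is the earlier year's count and 1,598,076 the later year's (the quote says “last year” for both); the function name is illustrative only.

```python
def percent_change(earlier: int, later: int) -> float:
    """Relative change from `earlier` to `later`, in percent."""
    return (later - earlier) / earlier * 100


# Case counts as quoted in the press account (assumed order: earlier, later).
rwanda_rise = percent_change(947_689, 1_598_076)
print(f"Rise in reported cases: {rwanda_rise:.1f}%")
```

On those assumptions, the two counts are consistent with the 68.6% rise stated in the quote.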
It’s not just Rwanda. For 2014, of the 11 nations for which MAP used country reporting, MAP figures undershoot WHO confirmed cases in five: Botswana, Namibia, South Africa, Swaziland
(Swaziland, as a side note, is not mapped but shows as gray for all years, indicating either intermittent malaria transmission or none. Intentional or not, a gray Swaziland slyly promotes the strategy of “shrinking the malaria map.”)
MAP does not use country data for Burundi. MAP’s survey-based algorithms, however, produce estimates that directly contradict WHO-reported country data.
Malaria incidence in Burundi: Rising sharply according to WHO-reported country data (top, orange line) but falling steadily to its lowest point this century according to the Malaria Atlas Project (bottom).
“We are not sure why the estimates exceed the reported number of cases,” said WHO’s Richard Cibulskis, who is also a co-author of Bhatt et al. Cibulskis was uncertain “whether this reflects some double counting of cases or the estimates are just off.” Double counting can be excluded, unless it also afflicts previous versions of the WMR which show much the same chart for Burundi. WHO has not corrected the 2015 edition, so the MAP estimates are “just off.” While data and estimates must be expected to differ in a modeling exercise, the degree of divergence in Burundi might raise proportional concern regarding the model’s validity.
Gaussian process models and reconstruction paste
Maarten Vanden Eynde’s amphora mostly took its shape from reconstruction paste, with just a few pieces of the original blue coffee mug. Similarly, the malaria map for Chad is almost all model. Over the 2000-2015 period, MAP had only a single 2004 study of 960 people.
Red arrow points to the single data fragment, during the 2000-2015 period, to map malaria for all of Chad. (Adapted from Bhatt et al. supplement, Figure 2.)
MAP fills in this data void with exquisite math, computing power and data borrowed from elsewhere to find a steady decline of malaria in Chad, from a peak in 2006 to a low in 2015.
But an amphora is not a coffee mug and malaria in Chad is differently shaped in the eyes of other academics. According to Foster et al., “616,722 malaria cases were reported in 2012, an increase of over 200,000 cases since 2006.” The World Malaria Report also sees malaria in Chad very differently. (Graph not shown.)
Richard Cibulskis suggested that greater use of rapid diagnostic technology possibly increased detection of cases, although “this does not necessarily reflect a true increase in malaria incidence, just an increase in diagnostic effort.”
But Cibulskis acknowledged there were true increases: “Some countries such as Uganda have experienced a resurgence in cases.” To distinguish signal from noise, researchers consider malaria hospital admissions, deaths, diagnostic practices and test positivity rates. I asked Cibulskis if, after taking those factors into account, “can an increase in cases be ruled out for any of the countries which are showing increasing confirmed cases?” In other words, can the possibility that malaria is actually on the rise across most of Africa be excluded? Cibulskis did not reply.
Paying it forward: shifting malaria to older age groups
Malaria interventions frequently target very young children who lack immune protection which develops over time—and by becoming infected with malaria. Averting malaria in the very young reduces cases and overall malaria transmission, but it also prevents acquisition of immunity. As children get older, even where malaria transmission has been pushed down, some will become clinically ill with malaria because of reduced immunity. Overall, cases are greatly diminished but some are “shifted” to older age groups.
Bhatt et al. report parasite prevalence estimates for a cohort aged from 2 to 10. But it is unclear if they adjusted their estimates for the age shifting effects of malaria interventions. If not, their estimates might overstate progress against malaria by leaving the effects of age shifting off the books.
Best scientific practice seems to require accounting for age shifting. According to Briët & Penny (2013): “Many malariological studies limit themselves to examining malaria in children under ten or under five years of age...” However, “analyses for the whole population are preferred as the analyses for children under five do not capture the shifts of morbidity and mortality to older age groups...” I asked Melissa Penny, a co-author of Bhatt et al., whether that paper did as she recommended. Penny deputized Peter Pemberton-Ross to answer my question, but he didn’t. He said the software used “certainly includes the possibility for age-shifting through its immunity submodel.” He also said that inferring incidence data from prevalence data “may implicitly assume some age-shifting.” But Pemberton-Ross would not say, yes or no, if the Bhatt et al. estimate of 663 million cases averted accounted for age shifting. Peter Gething did not reply to my inquiry about age shifting.
Spatial only Gaussian Markov random field
Bed nets were “by far the largest contributor,” to averting those estimated 663 million cases, blocking 68% or 450 million malaria episodes, according to Bhatt et al. However, although the MAP interactive application shows the distribution of insecticide-treated nets (ITNs) changing in both space and time, the Bhatt et al. paper used a spatial only model for bed nets. The paper’s supplement states that a spatial only model for bed nets “was preferred over the spatio-temporal model.” Researchers made do with “national means estimated previously” for nets, published in the 2013 World Malaria Report.
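The headline bed-net figure follows from simple arithmetic on the two numbers quoted from Bhatt et al. The sketch below only reproduces that arithmetic; it is not the paper's model, and the variable names are illustrative.

```python
# Figures quoted from Bhatt et al.; everything else here is plain arithmetic.
TOTAL_CASES_AVERTED_MILLIONS = 663
NET_SHARE = 0.68  # share of averted cases attributed to insecticide-treated nets

net_averted = TOTAL_CASES_AVERTED_MILLIONS * NET_SHARE
other_averted = TOTAL_CASES_AVERTED_MILLIONS - net_averted

print(f"Attributed to nets:  {net_averted:.0f} million")
print(f"Attributed to other interventions: {other_averted:.0f} million")
```

The product comes to roughly 451 million, in line with the rounded 450 million cited. Because nets carry about two thirds of the total, any error in how nets are modeled passes almost directly into the 663 million estimate.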
Conceivably this creates a mismatch between the maps of bed nets shown by the interactive application and the data used to estimate cases averted, perhaps like the mismatch of map and data for Madagascar. But a spatial only model for nets might mean a mismatch for all countries.
According to Pemberton-Ross, the spatial only model is just “a technical issue… This choice will have affected the results, but not necessarily by making them less accurate.”
The issue might be fundamental rather than technical, but Peter Gething did not reply when I asked if the conclusion that nets averted 450 million cases of malaria since 2000 rested on a comparatively crude, spatial only model. I also asked Gething whether the interactive mapping tool was displaying spatio-temporal bed net data when, behind the scenes in the Bhatt et al. paper, calculations for cases averted were actually based on a static, spatial only model. Gething did not reply.
I raised these issues with Nature. The editors were responsive to the Madagascar map discrepancy, and appear to have occasioned MAP’s listing of the 11 countries treated like Madagascar “in the interests of transparency,” said Rebecca Walton, Nature’s Senior Press Manager. But the other issues were met with pro forma dismissal: “The paper was rigorously peer reviewed as part of our usual editorial procedures.”
Insecticide resistance—and intransigence
Although bed nets are thought to have stopped 450 million cases of malaria, Bhatt et al. urged that maintaining their effectiveness in the face of insecticide resistance “should form a cornerstone” of future control strategies. But this grave threat to bed nets is entirely absent from the model, as if it’s solely a future concern. (Peer reviewers presumably agreed.)
However, the very distribution of hundreds of millions of nets sparked a proportionally vast rise of resistant mosquitoes, “a worsening situation that needs urgent action to maintain malaria control,” as the subtitle of a recent paper put it. Only a single class of insecticide, pyrethroids, is used to treat nets. Unsurprisingly, mosquitoes have developed multiple genetic escape mechanisms, very much as they did when faced with DDT, the primary weapon used in earlier, mid-20th century efforts to eradicate malaria.
Mosquito resistance to DDT increased gradually, ultimately rendering it ineffective and leading to the failure of eradication efforts. With pyrethroids, we seem to be watching a brutal remake of the DDT story. But researchers raise more questions than they answer about pyrethroid resistance: "Is it a problem? How do you know?" asked David Smith, a member of MAP and co-author of Bhatt et al.
In addition to doubting if insecticide resistance is a problem, Smith suggested that dispelling such doubts is nearly impossible: “The experimental unit is the population,” he contended, and “we would need to start collecting data from across the continent,” meaning Africa. A second study, also at continent scale, would be needed to measure the “attributable effect of resistance.” Even more remarkably, Smith said he “would expect the effect size of ITNs to go up overall,” if these two massive studies were somehow completed.
Smith is not alone in denial and casuistry. In Strode et al., researchers set out to investigate “the evidence that resistance is attenuating the effect of ITNs.” But instead, the scientists declared “ITNs are more effective than [untreated nets] regardless of resistance,” which is tautological. Until 100% of mosquitoes are 100% resistant to pyrethroids, an insecticide-treated net will always be more effective than an untreated one.
“Agreed with respect to the tautology,” acknowledged first author Clare Strode. “The ability of ITNs to kill insecticide resistant mosquitoes was significantly reduced when faced with resistance mosquitoes,” but this message was “muted” in the paper, according to Strode. “I originally included a much stronger statement of fact that ITNs kill fewer resistant mosquitoes than susceptible counterparts,” she continued, “but the statistician and Cochrane expert recommended a less bold statement.”
The Cochrane expert, Paul Garner, did not dispute Strode’s account or explain why he recommended a less bold statement. (Also at Garner’s suggestion, the Strode et al. review excluded 914 studies without explaining why.) In 2004, Garner was involved in the Cochrane Review of bed nets that provided much of the basis for the massive scale-up of that intervention. Muting the conclusions of Strode et al. might serve to protect the conclusions of the earlier review and the subsequent, massive, bed net intervention that seems to have gone awry.
Many malariologists demand proof that pyrethroid resistance reduces the impact of nets, a stance akin to the tobacco lobby’s denial that cigarettes cause cancer. According to Clare Strode, “I cannot see how increasing resistance would NOT impact ITN efficacy.” It’s undeniable: “There is no biological basis to argue otherwise,” said Strode.
Nick Hamon agrees: “Yes, resistance compromises efficacy. That is no longer in question.” Hamon runs IVCC, a consortium tasked with developing new insecticides.
Nonetheless, recent peer-reviewed papers still ignore resistance. A paper in the Lancet estimating how much malaria might be reduced by further expanding interventions “assumed no loss of effect due to drug or insecticide resistance.” In fact, the authors almost doubled the killing effect of nets in their model, adapted from Menach et al. (2007). Senior and corresponding author, Azra Ghani, did not reply to emails asking how to reconcile today’s stronger insecticide resistance with an assumption of greater killing power than nine years ago.
Insecticide resistance has been modeled. Brady et al. found that resistance cuts the effectiveness of nets by half or as much as two thirds, depending on how swiftly resistance develops. (See Figure 3D.)
In 2013, Penny & Briët determined that introducing nets in high transmission areas with insecticide resistance only reduced transmission by 75% instead of 90%. Penny went on to co-author Bhatt et al., but that paper ignored insecticide resistance even though one third of the population of Africa lived in high transmission settings in 2000.
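The gap between a 75% and a 90% reduction is larger than it looks when expressed as the transmission that remains. A minimal sketch of that comparison follows; it uses only the two percentages quoted above and is not a reconstruction of the Penny & Briët model. The function name is illustrative.

```python
def residual_percent(reduction_percent: int) -> int:
    """Transmission left over after a given percentage reduction."""
    return 100 - reduction_percent


with_resistance = residual_percent(75)     # nets facing resistant mosquitoes
without_resistance = residual_percent(90)  # nets facing susceptible mosquitoes

ratio = with_resistance / without_resistance
print(f"Remaining transmission: {with_resistance}% vs {without_resistance}%")
print(f"Ratio: {ratio}x")
```

On these figures, residual transmission with resistance is 25% of baseline rather than 10%, or two and a half times as much.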
I am not aware of any papers estimating increased malaria cases and/or deaths resulting from insecticide resistance. IVCC’s Nick Hamon made recourse to “two respected, independent malaria scientists” for a crude estimate of the number of deaths that would be averted if there were a new insecticide to replace failing pyrethroids. According to Hamon: “One scientist gave me a range of 141,000 – 228,000 and another 125,000.” Adding even part of 125,000 to WHO’s estimated 395,000 malaria deaths in 2015 would make for a very sizeable increase in mortality.
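To put the back-of-envelope figures in proportion, each can be expressed as a share of WHO's 395,000 estimate. This sketch assumes the scientists' numbers would simply add to the WHO figure for the same year and region, which is a simplification made here for illustration and not something Hamon or his sources stated.

```python
WHO_DEATHS_2015 = 395_000  # WHO estimate cited above

# Estimates of deaths attributable to failing pyrethroids, as relayed by Hamon.
estimates = {
    "scientist 2": 125_000,
    "scientist 1 (low)": 141_000,
    "scientist 1 (high)": 228_000,
}

# Each estimate as a percentage of the WHO figure.
increase_percent = {
    label: extra / WHO_DEATHS_2015 * 100 for label, extra in estimates.items()
}

for label, pct in increase_percent.items():
    print(f"{label}: +{pct:.1f}% relative to the WHO estimate")
```

Under that additive assumption, the estimates correspond to increases of roughly 32% to 58% over the WHO figure.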
Hamon cautioned that “these are ‘back of envelope calculations’ and should be treated as such,” adding that “nobody wanted to be quoted, and for good reasons.” He would not say what the good reasons were. Nonetheless, among themselves, malaria experts countenance disturbingly large increases in malaria deaths resulting from insecticide resistance.
Another mostly insider conversation is the effect of resistance on the infectivity of mosquitoes. Against hope and expectation, early indications are that the genetics of resistance also increase mosquito susceptibility to infection by malaria parasites. (See Ndiaith et al. and Alout et al.) Conceivably, not only has the scale-up of bed nets sparked a massive wave of resistant mosquitoes, those mosquitoes are also more likely to become infected with malaria and thus are potentially more likely to transmit the disease.
Averting the appearance of a malaria disaster: Possible image manipulation in Hemingway et al.
Another paper this year in the Lancet, “Averting a malaria disaster,” drew attention to mounting insecticide resistance and the need to develop new chemicals to replace those that are failing.
However, the map (Figure 2B) accompanying the paper might have understated the extent of the resistance problem. According to the caption, Figure 2B was “reproduced” from an online tool called IR Mapper. However, Figure 2B includes a number of green dots, indicating no resistance, that are not found on IR Mapper, in Sudan, for example. Some yellow dots in IR Mapper, showing possible insecticide resistance in Angola, appear as green dots in Figure 2B, indicating no resistance.
I found at least seven such discrepancies between Figure 2B and IR Mapper. In addition, Figure 2B does not display any yellow dots. IR Mapper has been displaying red, green and yellow dots since its inception in 2012, according to Duncan Kobia Athinya of Vestergaard Frandsen, which oversees IR Mapper. Said Athinya: “I cannot speak as to why the Lancet figure does not feature possible resistance (yellow) points, but IR Mapper has followed WHO criteria since its launch in 2012.”
Also unexplained is the green dot in Sudan in Figure 2B. Said Vestergaard’s Melinda Hadi: “I still do not have an answer regarding the green susceptible point in Sudan.” Figure 2B was created in October of 2014 but not published until April of 2016. Studies, and thus dots, have been added subsequently. Also, some might have been removed. But regarding the green dot in Sudan, Hadi said: “I can confirm a publication was not removed from IR Mapper.”
Perhaps explaining these and other discrepancies, according to Hadi: “the maps in the Lancet article were reproduced. The IR Mapper database was provided to the Liverpool School.” Of possible relevance, IR Mapper includes a “View own data” facility that allows users to create a map from a database.
Hadi explained that yellow dots were left out: “Data points that were classified as possible resistance (90-97% mortality) were not presented in Figure 2.” In addition, Figure 2B “only included data from peer-reviewed publications, so you will note other data points available on the platform (e.g., PMI data) were excluded.” PMI, the President’s Malaria Initiative, collected data in 18 countries.
Asked about issues regarding Figure 2B, first author Janet Hemingway said: “the figure is a screen shot downloaded back in 2014…” Hemingway is Director of the Liverpool School of Tropical Medicine. The discrepancies resulted from the passing of time, according to Hemingway: “Lancet have sat on the paper for almost a year since submission and acceptance so I guess it is possible over this period that IR mapper has been updated for historical data, but we made no alterations to the download.”
Hemingway declined to answer any more questions: “I have no intention of responding further on this, as there is no further explanation…”
However, the passing of time does not explain discrepancies such as the missing yellow dots, the green dot in Sudan and the half a dozen other anomalies that surfaced in a non-exhaustive analysis.
I brought these issues with Figure 2B to the attention of Lancet editor-in-chief Richard Horton. Horton did not reply.
Prompted by the intervention of the Committee on Publication Ethics (COPE), Lancet editor Zoë Mullan relayed an explanation from Hemingway (who had earlier declined to answer questions). Hemingway said: “The single green dot in Sudan he refers to, I suspect is a point that was subsequently corrected if GPS co-ordinates had been incorrectly allocated for example.” Hemingway did not say where the dot could now be found.
Regarding the absence of yellow dots, Mullan explained, perhaps implausibly: “IR Mapper was clearly not showing any yellow dots on the day the authors downloaded the screenshot that became their figure.” Besides the unfortunate timing, Mullan’s explanation would also seem to require that the authors, who are experts on insecticide resistance, didn’t notice the absence of yellow dots that indicate possible resistance, nor did peer reviewers.
In addition to being an editor at the Lancet, Mullan is a trustee of COPE, perhaps creating a conflict of interest in responding to COPE-initiated inquiries. (Mullan’s role at COPE was not disclosed to me; I happened upon it later by chance.)
I also asked Richard Horton directly: “Is Figure 2B a screen capture or reproduced from the IR Mapper database? Why does Figure 2B not show any yellow dots?”
However, Horton only replied that the editors “feel confident that the data reported” in the paper “are accurate and reliable.” He vouched, to some degree, for the data but not for the accuracy of Figure 2B. He did not address the absence of yellow dots.
COPE declined to press the Lancet further. Wrote COPE’s Iratxe Puebla: “Given that the issues relate to a specific figure, we do not feel this falls within COPE’s remit to evaluate...” Furthermore, “COPE considers it beyond its remit to comment on… how facts are presented in individual publications.”
Richard Horton recently criticized COPE for not intervening sufficiently in a controversy regarding statins, bemoaning “the lack of a central institution where scientists who wish to question the actions or ethics of other scientists or scientific institutions can go.”
COPE suggested that I contact “the authors' institution so that they can review and consider what follow up may be appropriate.” Via transatlantic mail, I contacted LSTM board secretary R.E. Holland inquiring about a possible institutional investigation. Holland replied, also by letter: “Having reviewed the issue thoroughly, and having spoken to several experts in respect of resistance incident imaging, I have concluded there is no case for the authors to answer in respect of your complaint.”
Holland’s review assumed that Figure 2B was a screen shot. His letter said: “it is impossible to compare a snapshot of image data from one period to that of another and therefor there is no case to answer.” Holland’s answer was essentially the same as that of LSTM’s Director, Janet Hemingway. Holland did not explain the absence of yellow dots or the green dot in Sudan. He touted LSTM’s “rigorous research misconduct policy,” which had been used in this case.
“[T]he paper conveys information that suffices for the message”
Figure 2B and the statements by the authors and editors of the Lancet also passed muster with Lancet ombudsman, Malcolm Molyneux. However, Molyneux acknowledged the possibility that Figure 2B was not a screen capture.
The word ‘reproduced,’ according to Molyneux, “can mean either of the possibilities - a screen-shot or a figure re-drawn from data.” He added: “I really do not think it matters.” In his view, if the authors changed the figure to suit their purposes—including adding dots—they were within their rights: “the paper conveys information that suffices for the message - removing or adding yellow dots or (a few) other dots would make no difference at all to that message.”
Molyneux's statement, “the paper conveys information that suffices for the message,” nicely captures the philosophy that is mis-informing too many papers in malaria.
Placing a green dot in Sudan or anywhere appears to be legitimate in the eyes of the Lancet ombudsman, as long as the number of dots added does not exceed “a few.” Leaving dots out is no infraction, according to Molyneux, because “the legend to Fig 2 says ‘reproduced from...’ - it does not say ‘with no subtractions’.”
He distinguished “falsification of data” from “simplification for purposes of clarity.” However, adding green dots that do not actually represent studies of insecticide resistance means those dots are fake, while changing the color of dots misrepresents the findings of actual studies. It is hard to see how that would not be falsification of data.
Continued Molyneux, “if the authors had been trying to manipulate the figure in order to make their case more compelling, we would expect them to err towards the red in the later time-period (2b)… In every single case you mention of a difference between the IRMapper and Fig 2b, the difference is from red to green, not the other way round.”
Image manipulation requires no explanation. However, as I wrote to Molyneux, “the title of the paper is ‘Averting a malaria disaster.’ Unless the map shows that there is a disaster to avert then the title doesn't fit… A sea of red and yellow dots might lead readers to conclude that mosquitoes had already won.”
As it stands, figures “reproduced” in the Lancet may differ in unspecified, undisclosed ways from the source in order to convey the authors' message, which might differ from their scientific findings.
Tale of two resistances
If the malaria research community is downplaying insecticide resistance, it is exaggerating the threat of drug resistance in Southeast Asia spreading to Africa. Resistance to artemisinin is not spreading even in Southeast Asia and faces scientifically demonstrated obstacles to overtaking Africa. It’s not happening, but researchers are saying it is.
Arjen Dondorp heads malaria research at the Mahidol-Oxford Tropical Medicine Research Unit in Bangkok. He laid out an accurate chronology of the discovery of resistance to artemisinin-based malaria treatments. Resistance was first found in Western Cambodia, then at the Thai-Myanmar border, in Myanmar, Northern Cambodia, Northeastern Thailand, Eastern Cambodia, Southwestern Vietnam, and Southern Laos.
“Out of the ‘epicentre’ of Western Cambodia,” said Dondorp, “over time the resistant parasite has spread westward, northward, and eastward.” He concluded: “This is spread.”
However, Dondorp’s statement, if not simply false, is not scientifically supported. He described the spread of surveillance, a trick that could equally demonstrate that broken arms or bad breath are “spreading” in Southeast Asia just by conducting surveys in the same places and order he described for artemisinin resistance.
Genetic sequencing has, against expectation, found that nearly every artemisinin resistance hot spot emerged independently, not as a result of spread. Future research might change the current understanding of the epidemiology of artemisinin resistance, but the most comprehensive survey found only three instances of spread out of 112 samples from across the region.
More word play and dissembling are on view in a Lancet paper on resistance in Myanmar that included the word “spread” in its title but adduced little evidence and no claims for it. I wrote two of the authors, saying “the title of your paper, ‘Spread of artemisinin-resistant Plasmodium falciparum in Myanmar’ seems belied by the evidence actually in the paper (and other papers).” I asked them if they would correct “any misperceptions on my part,” but neither Mallika Imwong nor Charles
François Nosten, who runs a clinic in Mae Sot near the Thai-Myanmar border, also claims resistance is spreading. “Resistance to artemisinin,” according to Nosten, “has emerged in different places in SEA [Southeast Asia] but then it has spread.” I asked: “Can you describe unpublished data or point to papers where spread is documented?” Nosten, regarded by many as a public health hero, replied not with science but with anecdote and sophistry: “We find that over 80% of our patients with malaria have parasites that are resistant to artemisinin. It did not emerge in each and every one independently, did it?” Nosten is correct that it did not emerge in each patient independently, but that is not at all the same as spread. Establishing spread requires DNA sequencing from at least two places; Nosten claims spread based on a single cohort and no sequencing data.
A malaria press tour to Southeast Asia, funded by Malaria No More, featured journalist visits and interviews with Nosten and Dondorp. Stories in Slate and other outlets told readers artemisinin resistance was spreading and threatened a malaria apocalypse in Africa.
Distorted science is creating distorted journalism. An AFP story suggested that the reason artemisinin resistance hadn’t spread to Africa was that “international efforts to contain the spread of resistant parasites have been effective.” It is more the case that biologically it is difficult or impossible to install the multiple genetic changes required to create artemisinin resistance. However, the international containment efforts, by increasing drug pressure, might be forcing malaria down evolutionary pathways which could result in a more compact genetic form of resistance that could be more easily exported to Africa.
Meanwhile, the actual spread of insecticide resistance in Africa is ignored. Another journalist field trip funded by Malaria No More featured Tanzania as the destination. Insecticide resistance might have been among the briefing topics, but it did not appear once in an article for Vice written by one of the journalists on the trip.
No one has died from drug-resistant malaria. “As far as I know,” said Nosten, “there has been no confirmed fatal case.” Meanwhile, according to Nick Hamon’s sources, some part of 125,000 people (or more) have died from malaria because insecticide resistance has reduced the effectiveness of bed nets.
Debasing the currency: $4 trillion drawn on the account of science
Worth less than the paper it’s printed on (Source: Wikimedia)
In 2010, researchers concluded that malaria eradication was unlikely to break even and advised that “financial savings should not be a primary rationale for elimination.” But a few years later, an overlapping constellation of researchers discovered that eradication would quickly generate $4.1 trillion in economic benefits, in just 15 years.
The paper touting a $4.1 trillion windfall “is an advocacy document rather than an academic analysis,” according to Rima Shretta at the University of California, San Francisco (UCSF). Shretta is part of the UCSF group which led development of the 2010 Lancet series which found no cost savings from eradication. Shretta also served on the Action and Investment for Malaria task force that developed the advocacy document projecting $4.1 trillion in benefits.
To reconcile the academic analysis of the 2010 Lancet paper with the later discovery of trillions of dollars in benefits, Shretta seems to suggest scientists are free from the standards of science if they are engaged in advocacy. And functionally, it appears malaria advocacy has detached from science, although much of the advocacy comes from scientists.
The End: Malaria Goes Hollywood
Images, which can surreptitiously mislead, end up exposing a conscious mis-shaping of malaria research. Authors are making undisclosed and perhaps improper choices regarding the visual elements in their papers. However, within the papers, the same authors are free to make any number of choices that can decisively influence findings and there is little or no possibility of suggesting impropriety. The sources of data, how they are processed, type of model and parameters partly or entirely decide the research results. Authors can “mute” statements about the loss in bed net effectiveness caused by pyrethroid resistance. Editors can entitle a paper “spread of resistance” when there is none mentioned in the paper and the evidence contradicts the spread hypothesis. But when the philosophy of managing reader beliefs extends to choices about visual elements, the curtain is drawn aside and we see not a scientist but the Wizard of Oz.
A spot of bother in Maiduguri district, Nigeria (Source: Wikimedia)
Worldwide, in all but three of 155 countries, the trivalent oral polio vaccine has been replaced with the bivalent oral vaccine. The bivalent formulation includes only attenuated versions of types 1 and 3 of poliovirus. The type 2 component has been dropped because, far more than the other types, it sometimes mutates back into virulent form. Also, type 2 polio was eradicated in 1999.
But just as the world moved to the bivalent vaccine, Nigeria reported finding a type 2 vaccine-derived virus in a sewage sample. Consequently, right on the heels of the vaccine switch, the type 2 vaccine is being immediately pressed back into service, although it will be used by itself, in monovalent form, according to the Global Polio Eradication Initiative.
Sequencing indicates the Nigerian virus has been circulating undetected since May of 2014. The sample comes from Maiduguri district, an area contested by government forces and Boko Haram, making vaccination problematic.
Initially, the polio eradication project envisioned stamping out all type 2 vaccine-derived virus transmission before dropping the type 2 vaccine component. But plans to switch vaccines ultimately went ahead despite the likelihood of continued circulation of type 2 vaccine-derived virus somewhere in the world.
There are now multiple hotspots. Besides Nigeria, according to the CDC's Steve Wassilak, "We consider [the] Guinea and Myanmar outbreaks still active." In addition, Brazil reported what researchers described as a "highly evolved" type 2 vaccine-derived virus found in sea water off São Paulo. Found in January 2014, sequencing indicates the virus has been circulating undetected for eight years. Brazil has very high population immunity to polio, so this virus likely came from somewhere else, according to Wassilak.
Eight years of undetected circulation suggests a perhaps large and as yet undiscovered surveillance gap somewhere in the world. Asked whether eight years set the record for undetected circulation, Wassilak answered: "Nigeria had documented circulation for 10 years." However, in Nigeria, there were multiple transmission chains, and it is not clear from Wassilak's answer if any one chain circulated eight years. The Brazilian isolate also had mutations at antigenic sites, suggesting possible evolution of resistance. However, researchers reported that type 2 antibodies still killed the virus.
The process of switching to the bivalent formulation also risks creating new type 2 vaccine-derived virus. The switch was synchronized globally because if use of the trivalent vaccine continues anywhere, it might infect children who have only been immunized with the bivalent vaccine. According to WHO:
"The primary risk associated with the cessation of use of type 2 OPV [oral polio vaccine] is the re-introduction of disease-causing type 2 poliovirus into a population with increasing susceptibility to type 2 poliovirus. The switch from tOPV to bOPV must therefore be globally synchronized to minimize the risk of new cVDPV type 2 emergence."
The precision of the large and un-rehearsable switch remains to be seen. Globally, susceptibility to type 2 vaccine-derived virus is now rising given the switch to bivalent vaccine and the slow (and arguably belated) introduction of the injected vaccine, which includes all three virus types in a form in which mutation is not possible. Also, while the injected vaccine protects against paralysis caused by poliovirus, it neither prevents infection nor halts transmission. Polio circulated in Israel without causing any cases of paralysis because coverage with the injected vaccine was so high. Eventually, however, circulation might find someone missed by vaccination or with a compromised immune system, resulting in polio's hallmark acute flaccid paralysis.
The success in beating back wild poliovirus bodes well for the eradication effort to also smash outbreaks from vaccine-derived virus. But, out of the gate in the post-trivalent world, the race is already on. And, in Nigeria at least, type 2 vaccine-derived virus circulation has gone uninterrupted for a decade.
Perhaps you read some of the same publications as Bill Gates, like the New York Times or Slate. You tune into NPR and watch the PBS NewsHour, part of the sacred ritual of thoughtful Americans becoming informed citizens.
From Slate, we know time is running out to eliminate drug-resistant malaria. The Gates Foundation believes this too. But is the foundation’s logic irresistible or did Slate run an infomercial for the foundation funded by a $40,000 grant? The story (including a trip to Thailand) was paid for by Malaria No More, which has received $20 million in Gates Foundation grants.
Media matter. As Bill Gates observed, even Theodore Roosevelt’s reform program “wasn’t really successful until journalists at McClure’s and other publications had rallied public support for change.” Now Gates has rallied public support for malaria eradication in Slate, and President Obama tentatively endorsed it in the State of the Union.
It’s not just Slate or only global health. Carefully restricted Gates Foundation grants to NPR, the PBS NewsHour, the Pulitzer Center on Crisis Reporting and other news organizations shape what gets covered, what doesn’t, when and how.
The Gates-funded PBS NewsHour just began a new series on education called “Making the Grade.” The first episode is difficult to distinguish from an earlier Gates Foundation video on postsecondary education. In the NewsHour version, Gates-funded journalists and academics deliver the messages of the foundation’s postsecondary strategy, but neither the foundation nor its funding role are mentioned.
Trusted media organizations receiving Gates Foundation grants are not following good journalistic practices. And like improper food labeling, undisclosed funding misleads news consumers about what they are actually getting.
Readers got nothing on Ebola from the Gates-funded Pulitzer Center on Crisis Reporting until more than half a year after the crisis broke. Pulitzer Center stories appear in an array of top-shelf outlets like the New York Times, Nature, and the Economist, where the center’s first article on Ebola eventually came out. Although restricted Gates funds paid for 59 of 240 Pulitzer Center stories over a 30-month period, neither readers nor perhaps even the editors publishing them could tell which were actually Gates-funded.
NPR, with its Gates grant, cut staffing for covering climate change in order to expand and transform its global health coverage into an upbeat, advocacy-oriented approach, the opposite of muckraking. Gates funding of this specific initiative is not disclosed. While NPR gives the impression that the Gates Foundation just writes a check to support all of NPR’s good work, it doesn’t.
NPR’s restricted Gates grant actually requires NPR to contribute unrestricted money towards Gates-initiated projects. Ironically, listener donations might be funding broadcast of the Gates Foundation’s news values on public radio.
Which is perhaps why you think like Bill Gates when it comes to global health.
Slate: not so clean
In late December, CNN ran an op-ed advocating malaria eradication, written by the CEO of the advocacy group Malaria No More. In January, a week later, Slate too proclaimed “The World Can Eliminate Malaria.” The article delivered Malaria No More’s messages but was written by a Slate staff writer—funded by the Gates-backed Malaria No More. Jackpot: advocacy runs as news from a credible source.
Nightline veteran Dan Green orchestrates the Gates Foundation’s media and communications grant portfolio. Speaking in 2011 on the “media metamorphosis,” Green observed that with the demise of old media, many news organizations “don’t have a global health reporter anymore.” Consequently, when journalists cover global health, “they need more guidance.” For advocacy groups, according to Green, this created “an enormous opportunity for you to educate those reporters about what it is they need to be thinking about.”
The Malaria No More grant provided reporters with ample guidance:
During the tour, participants will conduct site visits to clinics and treatment centers, attend briefings with health officials and disease experts, hear from organizations working to eliminate the disease and meet with local journalists covering the issue.
Participants will be expected to produce stories based on the information gathered and contacts made during the tour.
Slate staff writer, Joshua Keating, while possessed of formidable reporting chops, focuses on international affairs and does not appear to write much about malaria for Slate. When domain expertise is short, journalists are at the mercy of their sources. When a journalist’s sources are curated by an advocacy group, the result is not journalism.
Technically, Keating’s trip wasn’t directly funded by Malaria No More. Indeed, it is unlikely Slate would have accepted money straight from an advocacy group. Instead, Malaria No More funded the International Center for Journalists (ICFJ). Passing the money through ICFJ, which called the five-day trip a “fellowship,” seemed to overcome any journalistic scruple. As Slate science editor, Laura Helmuth, wrote me:
Josh Keating’s editors were all fully aware of his trip and how it was funded, and we fully support him and the reporting that came out of his trip and his story in Slate.
I asked Executive Editor Josh Levin about Slate’s policy on accepting funding from advocacy groups. Levin did not reply.
Drug-resistant malaria is undoubtedly important. But for Thailand, is it more important than dengue? Globally, multidrug-resistant tuberculosis might be far more urgent and deadly, with half a million cases a year. CDC Director Thomas Frieden believes “There can be no delay” in combating drug-resistant TB. But Frieden’s views appear on the CDC blog, not in Slate. (Slate has covered the media’s neglect of TB.)
Slate’s malaria piece takes for granted that a single-disease approach to public health is best, without considering whether health systems might be more effective. In addition, current scientific evidence suggests drug-resistant malaria has not spread even within Southeast Asia and faces surprising barriers to taking over in Africa.
For all of malaria’s considerable importance, neither Slate nor perhaps any other media outlet has written about why Roll Back Malaria, the global consortium responsible for combatting malaria, disbanded itself in 2015.
For its grant to Slate, Malaria No More got a narrowly focused piece getting out its key messages. Indeed there were four other ICFJ fellowships, so Slate participated in an orchestrated news boomlet. ICFJ would not disclose the names of the other publications, so the impact (and degree of funding disclosure) are untrackable.
The over $200,000 spent on these trips could go a long way towards putting a journalist on the global health beat. But who needs global health reporters if it’s possible to generate “news” that faithfully delivers an advocacy message?
Structural changes in the news industry have made this easier. Said the foundation’s Dan Green, back in 2011: “You have now media organizations that are far more open to innovative partnerships.” Why? Because “their resources are stretched.” As revenue streams for traditional media dried up, enter the world’s wealthiest foundation as innovative partner.
Promise to say you’re independent
With much solemnity, the foundation and its media partners proclaim the full editorial independence of Gates grantees. But Green acknowledged a “fear” felt by Gates-supported news organizations:
...that fear that as my grant ends, will I get renewed and will any foundation funder, or any outside philanthropic funder, say, ‘Hmm. I looked at the stories and they weren’t all that positive, and they weren’t filled with success. Maybe we don’t want to fund that anymore.’
Green insisted it would be short-sighted for funders to take such an approach. And yet the Gates Foundation seeks demonstrable results, according to Green: “We as funders try to think in terms of outcomes. What would be the outcomes we’re hoping for by telling these stories, by engaging with the content creator?”
The foundation engages with content creators not to give readers a puzzle to solve thoughtfully but to deliver pre-specified, actionable messages. “We really think a lot about ‘Is it reaching an audience that we think is an important audience we need to reach?’ ” Green opined in 2013. “And, if it is, does it have the credibility and the trust so when it puts out evidence-based information that people say, ‘I believe that. I’ll follow what that says?’ ”
Wearing his journalist hat, Green said, “Now you come from journalism and we don’t sit around talking about messaging. Messaging makes us cringe. Because then it makes us feel that you’re using all the journalists as tools for your messages.” Green concluded, forthrightly: “You might say, ‘Yeah, we are.’ ”
Green defended using journalists as tools because “it’s a mistake to think that if your subject that you care about is getting talked about, and stories are being told and information is out there, that is incredibly valuable.” Journalists get to cover global health; the price is carrying the foundation’s messages. It’s painting by numbers, but it’s still painting.
The dissolution of traditional media, according to Green, brought fragmentation and proliferation of information outlets, and created a news environment with fewer facts and more opinions. Some digital media consultants, said Green, recommended that “the louder and stronger your opinion is, sometimes the more people gravitate to you…” However, even Theodore Roosevelt’s bully pulpit did not suffice to create change. Regarding the loud opinion strategy, Green said “I’m not a huge fan of that necessarily.” Far better that the foundation’s opinions appear as news.
Like Slate’s malaria piece.
The Pulitzer Center—presented by the Gates Foundation
The Pulitzer Center on Crisis Reporting frowns on free trips. Pulitzer Center-funded articles appear in elite publications like the New Yorker, Nature, the Economist, the Washington Post, Slate, Foreign Policy, National Geographic, etc.
But regarding trips, the Pulitzer Center’s ethics policy says journalists “should not normally accept free travel, with the exception of military embeds and other situations in which travel assistance is essential to the reporting.” To further protect its integrity, the center counsels writers to “avoid activities that might interfere with your ability to function as a journalist.” Otherwise, “you may be precluded from working on certain topics for the Pulitzer Center if you're personally involved.”
Although the center closely polices the integrity of worker bee journalists, different standards apply to donors. Many donors write a check with no strings attached, leaving the Pulitzer Center with full editorial discretion. “In recent years,” said Executive Director, Jon Sawyer, “we have consistently gotten 50 percent or more of our budget from unrestricted donations…”
However, the other 50 percent of donations have strings, although the center’s ethics policy seems to guard against any improper influence. The policy asserts: “Donors will not dictate in any way the editorial products of the Pulitzer Center.” But restricted donors, like the Gates Foundation, restrict their grants because they do not believe the Pulitzer Center would, by itself, create the desired editorial products. Influencing the Pulitzer Center’s editorial products is the only reason restrictions exist.
“Over a four-year period our Gates funding has totaled approximately $2.4 million,” said Sawyer. “These were restricted grants but the terms were broad, with funding for a broad range of global health/development topics and educational outreach and full autonomy as to the selection of specific projects, news-media placements and outreach activities.” But the center’s “full autonomy” is over selecting specific projects. The Gates Foundation draws the big picture and contracts out for the needed words and images.
Recall that the Pulitzer Center will disqualify journalists from writing on subjects in which they are personally involved. To guard against donor bias, the center’s ethics policy asserts: “We do not accept donations that raise the possibility, or the appearance, of a conflict of interest.” However, the center’s Gates funding, at minimum, creates the possibility of a conflict. The Gates Foundation is the largest in the world. Most of its donations go to global health and development, the same subjects funded by its grants to the Pulitzer Center. The foundation, far from being policy-agnostic, funds research into policy and advocates for specific approaches to global public health.
This possible conflict of interest is not disclosed to readers nor perhaps even to editors of the publications running stories from the Pulitzer Center. Slate at least disclosed the funding of its story on malaria. Slate didn’t just name the funding intermediary, the International Center for Journalists, it named (sort of) the funder, Malaria No More. Anyone wanting to dig further could discover the Gates Foundation’s $20 million funding of Malaria No More, which advocates for the foundation’s malaria policy, eradication, set by the foundation in 2007.
The Pulitzer Center, with its Gates funding, produced a substantial amount of global health coverage. Over the 30 months of its most recent Gates grant, “we applied Gates funds to support a total of 59 projects,” said Sawyer. “For purposes of comparison, over that same 30 month period we supported some 240 projects overall.” These stories ran with the disclosure of funding provided by the Pulitzer Center. However, one in four is actually the Pulitzer Center presented by the Gates Foundation.
Which 59 projects were Gates funded? Sawyer would not say. He previously mentioned: “On some of those [Gates] projects we also drew on funds from other donors.” He emphasized the point: “Also, as point of clarification, our grants to journalists often mix restricted/unrestricted funds.” Sawyer perhaps was suggesting that mixed funding mitigates conflicts of interest. The idea might be that if funding from interested donors passes through intermediaries who stir in some amount of disinterested money, then journalism is not compromised and disclosure is unnecessary.
From the Gates Foundation perspective, however, adding unrestricted funds to those of its restricted grant leverages the foundation’s investment. (It’s possible the grant stipulated that the Pulitzer Center contribute additional funds.) The restricted Gates grant shifted Pulitzer Center resources to more closely match the news values of the Gates Foundation. Maybe not by much; maybe a lot.
Initially, Sawyer wrote me: “Happy to discuss this further. Complicated numbers and we're eager to have it reported accurately.” But when I asked for a spreadsheet listing Gates-funded projects and the funding mix for each, Sawyer did not reply.
Sawyer defended the center’s work: “I hope you'll take the time to read some of the reporting,” he wrote me. “It's quite good!” Read the stories; don’t ask where they came from. But Sawyer is right about quality: the center’s production values are top-shelf, and the finely wrought stories bring attention to a broad array of important but neglected subjects. Slate’s article on the neglect of TB, for example, was supported by the Pulitzer Center. Nonetheless, reporting loses the name of journalism when it comes from restricted funding.
The Pulitzer Center website quotes Joseph Pulitzer: “We will illuminate dark places and, with a deep sense of responsibility, interpret these troubled times.” But Sawyer shed very little light on funding of stories bearing Pulitzer’s name. “Ebola, malaria and other health projects relied in part on Gates, in part on other funding sources,” he said, perhaps again suggesting that mixed funding ameliorated conflicts of interest not disclosed by the Pulitzer Center.
It is true that finding such conflicts is much harder when 59 restricted projects are mixed with 181 that are not. However, in a far from exhaustive search, I came across a speech in which Bill Gates advocated an intervention called seasonal malaria chemoprevention. Later, there is a Pulitzer Center article about it, indeed a multi-article project on the subject. Whatever the merits of seasonal malaria chemoprevention, there is no way to determine if its coverage was funded by an interested party.
The Pulitzer Center tells its reporters: “Let the audience know any information about yourself or your sources that might affect its understanding of your work.” Brick-laying journalists are closely scrutinized but the audience has no idea even of the existence of restricted donors shaping the overall news architecture.
The void: Why no Ebola coverage for half a year?
If Gates Foundation influence on malaria, for example, is worrisome, evidence on Pulitzer Center coverage of Ebola raises far more serious concerns: The Pulitzer Center supported no stories on Ebola for more than half a year.
The outbreak began in March of 2014 but no Pulitzer Center stories appeared on Ebola until mid-December. The center’s full name is the Pulitzer Center for Crisis Reporting, and Ebola is the most important global health crisis since HIV/AIDS. Although funded by the Gates Foundation to cover global health, the Pulitzer Center produced nothing on Ebola for the better part of a year.
I conducted my search for “Ebola” articles using the center’s website. (I asked Jon Sawyer for confirmation of my results. He did not reply.) The first article I found is dated December 13, 2014, “The Fight Against Ebola: Donating the Cure,” appearing in the Economist.
According to Sawyer, the Pulitzer Center received what he described as an “extension” grant of $300,000 from the Gates Foundation. It is possible that the timing of the grant coincides with the onset of Pulitzer Center stories about Ebola.
In difficult-to-parse grammar, Sawyer said: “Gates extension was continuation of previous grant, support for reporting/outreach on broad range of global health/development issues: choice of projects, journalists and outlets left to us.”
Unsure whether that meant “no,” the extension grant did not fund the center’s Ebola coverage, I asked Sawyer again, several times, if the grant was to cover Ebola. I sought details on timing and who approached whom. Sawyer did not reply.
When I inquired of the Gates Foundation’s Bryan Callahan whether the extension grant was for Ebola, he did not reply. Callahan is the foundation’s Senior Program Officer for Program Advocacy & Communications.
Back in 2011, the foundation’s Dan Green claimed: “We want people to say ‘We get our money from the Gates Foundation.’ ” Later, writing on the foundation’s blog, Green put transparency first among the guiding principles for media grants. Green also promised “in the coming weeks I’ll post another blog listing all of our current investments in this portfolio.” I asked Green for the listing of the foundation’s current media grants. He did not reply.
I asked Amy Maxmen, who wrote stories on Ebola for the Pulitzer Center, whether she knew if her efforts had been Gates funded. “I don't know where the Pulitzer Center gets their funding,” answered Maxmen, without saying yes or no. “I admit I don't ask.”
Maxmen did assert: “I independently came up with the idea for my reporting on Ebola.” However, Maxmen thinks a lot like the Gates Foundation.
The Pulitzer Center’s Ebola project is entitled “Disaster Science During the Ebola Outbreak.” The center took care to explain this odd-seeming focus: “Research during a disaster can seem frivolous when there aren’t enough resources to handle the immediate response. But in the Ebola outbreak it's become clear that data collection must happen now.” The Pulitzer Center had ignored Ebola for more than half a year and now focused not on an Ebola response but Ebola research, rather like the Gates Foundation.
Had Médecins Sans Frontières (MSF) been the funder of the Pulitzer Center’s Ebola coverage, the stories would likely have come sooner, indeed immediately, and with a different emphasis: the need to act.
In contrast to MSF, the Gates Foundation remained silent on Ebola for months until moments before WHO’s belated declaration of an emergency. Barely beating WHO to the punch, the foundation announced an Austin Powers-sized $1 million grant to “help address the immediate need on the ground.” One day after its token grant, the foundation blogged that meningitis “could end up being far more destructive than the current Ebola epidemic.” Remarkably, the foundation moved on from Ebola before WHO even declared it to be an emergency.
The crisis worsened. As it reached increasingly apocalyptic scale and the world belatedly mobilized billions of dollars, the foundation chipped in $50 million. The announcement committed $10 million to “emergency operations” but also to “R&D assessments.” For the remaining $40 million, “the foundation will provide further details on its funding commitments to on-the-ground operations and to research and development for Ebola drugs, vaccines, and diagnostics.” The foundation was not going to fund the operational response costing billions but research costing millions. The Pulitzer Center’s Ebola coverage, when it finally came, also focused on research.
The center’s Ebola coverage can be seen as favorable to the Gates Foundation, which funded the stories, at least in part. Maxmen’s first article, for example, appearing in the Economist, focused on the silver bullet of blood transfusions potentially curing Ebola. It turned out not to work, and new research contributed little to containing the epidemic. However, one of Maxmen’s stories, appearing in Newsweek, criticized the Ebola response as wastefully managed. Undoubtedly. But the foundation had mostly not contributed to on-the-ground efforts which, in the end, worked.
In the pages of Nature, Maxmen reminded readers of the importance of malaria and that Ebola was disrupting mass administration of anti-malarial drugs.
Another Maxmen piece provided a reporter's timeline of the world's “plodding attack on Ebola.” It pummeled bureaucratic organizations “bogged down in democratic decision-making processes and bureaucratic policies,” perhaps meaning the World Health Organization. The timeline doesn’t mention the inaction of the Gates Foundation. Nor does the article examine the role of the CDC, which only declared Ebola a top-level emergency one day before WHO.
Latest of all, however, was the Gates-funded Pulitzer Center.
The Economist: “We do not publish articles 'supported' by any organisation”
Maxmen's article in the Economist runs without disclosure of Pulitzer Center funding. I asked Economist science editor, Geoffrey Carr, whether the Pulitzer Center disclosed to the Economist any funding of its work by the Gates Foundation.
Carr replied: “We do not publish articles 'supported' by any organisation, and we certainly do not publish anything funded by anyone.” The Economist is journalism at its purest, or at least proudest.
I pointed out that the Ebola story appearing on the Pulitzer Center site was identical to the one appearing in the Economist. (The Pulitzer Center lists 25 articles and 1 photo as published by the Economist.)
Carr changed his tune: He described Maxmen as “a freelance who seems to have some sort of travel and support grant from the Pulitzer Centre.” Carr added: “I don't see any impropriety in this, since we pay our freelances a market rate for their copy.”
The Economist does publish articles supported by other organizations, but without disclosing that support to its readers. (In this regard, the Economist is perhaps the perfect vehicle for maximally credible stories with undisclosed conflicts of interest.) Regarding the question of whether any Gates funding of the Ebola article was disclosed to the Economist, Carr wrote: “I will pass your thoughts on to the Editor of the Middle East and Africa section, whose section this story appeared in.”
NPR: Gates and Soda
Think of your brain as a pie chart, the slices representing the subjects you pay attention to, and the size of the slice indicating how much. If NPR programming influences your pie chart, then your slice on climate change might have shrunk like a receding glacier.
In 2014, NPR cut its environment team to one reporter, according to Inside Climate News, with resources reassigned to “the outlet’s global health and development coverage, which includes a new project launched this summer using a grant from the Bill and Melinda Gates Foundation.”
NPR will not say how much of that project, called Goats and Soda, is Gates-funded. One report said it would “likely not exist” absent Gates funding. But NPR’s Isabel Lara said: “Goats and Soda is possible in large part due to the Gates Foundation grant but it isn't accurate to say that it wouldn't exist otherwise.” Lara is NPR’s Media Relations Director. When asked for details, Lara would only repeat the amount and duration of the grant. “Cannot get more specific than that,” Lara said.
The Gates Foundation’s funding relationship with NPR goes back 15 years. Its most recent grant in 2013 provided $4.5 million to “advance global health and development coverage.”
The Gates initiatives at NPR, however, are not 100%-funded by the foundation. According to Lara: “As is common with many foundation grant agreements, our Gates agreement references NPR’s proposed budget for the initiative which included other resources beyond their investment.” More plainly, the Gates grant requires NPR to help fund the foundation’s projects.
I asked Lara if the “other resources” contributed by NPR included listener donations. She did not reply. However, as at the Pulitzer Center, a restricted Gates grant might be drawing unrestricted funds into the support of the foundation’s news values. Conceivably, listeners are funding NPR’s Gates-designed presentation of global health news.
NPR does not disclose Gates Foundation support for Goats and Soda on its website except, it seems, when Gates or his foundation are the subject. A commentary applauding Bill Gates’ views on solar power, for example, parenthetically disclosed: “As our readers may know, the Gates Foundation is a funder of NPR.” But readers of the laudatory piece on Bill Gates do not know that the Goats and Soda enterprise is mainly and specifically funded by Bill Gates.
Goats and Soda might even be preferentially covering its funder. I asked the author of the commentary, Michael Hayden, if he approached Goats and Soda or vice versa, but he would not say. “Sorry,” Hayden wrote back, “what are you trying to do exactly?”
Unlike NPR’s Goats and Soda, the Guardian puts the Gates Foundation’s logo on all the pages appearing in its Gates-funded development section. Guardian readers do not have to guess what is Gates-funded and what is not. Whether foundation influence extends beyond what it pays for is another question. But a dedicated page describes the funding relationship including the declaration that “content is editorially independent.”
I wrote to Goats and Soda editor, Vikki Valentine, asking whether Gates funding was properly disclosed. Valentine did not reply.
Solutions journalism: turn that frown upside down
Goats and Soda represents not just a switch in coverage from climate to global health. The news production line now turns out a very different editorial product based on a new template, solutions journalism.
In 2012, the Gates Foundation issued a challenge to “find ground-breaking ways to gather and share stories of aid working well.” In the foundation’s view, “The media seems full of stories of corruption, waste and broken systems.”
Responding to the challenge, New York Times writer David Bornstein and colleagues won an initial $100,000 grant from the foundation for an idea called “solutions journalism.” As Bornstein explained:
So much of what we do as journalists is aimed at holding powerful people accountable and identifying failure, which is very important and valuable. But if we stop there, with just identifying failures and the bad actors, it becomes frustrating to people. It’s a broken narrative.
The foundation has supported Bornstein’s efforts with a further $1 million.
Solutions journalism, according to Bornstein, “has more in common with a Harry Potter novel, a quest or struggle, than the traditional journalism narrative.” Harry Potter, of course, is fiction.
Traditional journalists on the global health beat, like Tom Paulson, questioned the solutions emphasis: “A number of journalists, including me, remain concerned that making reporters responsible for emphasizing solutions – along with this Gates push for ‘success stories’ – could undermine basic watch-dogging.”
Paulson leaned toward what he called “cranky” stories. The blog Paulson edits, Humanosphere, ran a story entitled “How Tanzania failed to fix its water access problem.” The piece delivers a very cranky, evidence-based beatdown of the World Bank. The story held powerful people accountable and identified failure. The story was not solutions journalism.
By contrast, a Goats and Soda article on water featured a solution: Bill Gates drinking water “made from poop.” The Gates-funded piece stars Gates and promotes a Gates-funded project. The article’s solutions journalism style, favored and funded by the Gates Foundation, leaves readers with gee whiz wonderment, a sense that there’s an app for the water crisis.
Although the water-from-waste system appears to be the size of a small refinery, the story does not delve into what it costs to construct or operate. The price of a gallon of water and whether the system works where there is no sewage system or electricity are not addressed. Broken narratives about the water crisis, however, are avoided.
Change the perception, change the reality
Sally Struthers, circa 1992, told television viewers: “Every year, 10 million third world children don’t live to see their third birthday.” Ten million avoidable child deaths, said Struthers, and that’s on you, viewer. Look: tiny bodies, bloated bellies, skeletal ribs, eyes outlined in flies.
Global Health, 1992
Today, moralizing and macabre messages are out. Even the news category “global health” has been left behind. NPR buried its old Twitter handle @nprglobalhealth, pointing followers instead to the new @nprgoatsandsoda. In place of 1990s-era, grim scenes of despair, a Goats and Soda music video shows the modern day “bliss” of living in low-income rural India.
Goats and Soda, 2015
Struthers’ moral importuning came in television commercials clearly paid for by the Christian Children's Fund. By contrast, what Goats and Soda presents appears as NPR-certified reality, a perception unspoiled by disclosure of Gates Foundation funding.
Very few Struthers-like sermons have appeared in Goats and Soda. Indeed, a story about ethics and the making of blue jeans argued against moralizing. The piece concluded with a quotation from a researcher: “To get people to be more ethical, do not ever present your message as, 'If you're not doing this, you're a bad person...'”
And instead of counting dead children, today we count those who have been saved. Said Melinda Gates at Davos recently, “When we look at the fact that since 2000, childhood deaths have been cut in half, a big percentage of that is because of vaccines.” Quite reasonably, Melinda describes the glass as half full. And the world is doing great on vaccinating children, right?
Omission of bad news is bad journalism—or worse
There is one hiccup: measles vaccination is “falling behind,” according to a story in Goats and Soda. Not to worry, though. Annual measles deaths have fallen from 546,800 to 114,900 since 2000. That’s fantastic, except measles progress actually flattened back in 2007. The good news stopped eight years ago but is still being reported.
More than just measles vaccination is falling behind. Of six targets set in 2010 for global child vaccination, “Just one of these six is on track to be achieved,” according to a report from WHO’s Strategic Advisory Group of Experts (SAGE). At Davos, Melinda Gates chose to speak about the one target that was on track: introduction of new vaccines.
Goats and Soda’s measles story promised to explain “why the world is falling behind,” but did not. Solutions journalism style, however, it covered “new strategies that seem promising” and “other success stories from the front lines.”
By contrast, SAGE explained what had actually gone wrong:
The targets each relate to different vaccines and diseases, but common threads run throughout: failure to extend vaccination services to people who cannot currently access them at all, and failure to strengthen the healthcare system so that all doses of vaccine are reliably provided.
In addition, the total number of unvaccinated children had “basically not changed” and those at greatest risk became more vulnerable: “Looking closer, the number in the lowest bands is getting worse not better,” SAGE reported. However, few or no journalists explored the halt in progress and backslide in immunizing the world's children. How this failure is possible and who is responsible is not a solutions journalism story. Adding to the broken narrative, SAGE wrote: “The habit of missing major vaccination targets undermines global trust in these efforts…” Global trust, however, remains high because no one reads SAGE reports.
The Gates-funded Center for Global Development reported that new vaccine introductions have made no detectable difference in saving lives, finding only “small and statistically insignificant effects for the three high-priced vaccines promoted by Gavi...”
Vaccine coverage, not introductions, is what saves lives. And according to SAGE, immunization coverage has recently shown “no improvement,” leaving the number of unvaccinated children at 22 million. Children that aren’t vaccinated can and do die from preventable disease in large numbers. “1.5 million children die every year of diseases that could be readily prevented by vaccines that already exist,” SAGE reported, based on a 2008 WHO estimate.
Not a problem for solutions journalism.
PBS NewsHour: copying the Gates Foundation's homework
The Gates and MacArthur foundations both support the PBS NewsHour. The two are frequently credited together, which is misleading. The two foundations hold very different, indeed opposing worldviews.
Gates grants are, once more, restricted. A $3.6 million grant to the NewsHour in 2008 supported only global health coverage. A current Gates grant directs $320,000 toward stories that “inform the public” about higher education issues. This media spend hits its mark.
Given this problem, the question and title of the NewsHour segment was: “Should more kids skip college for workforce training?”
No one from the Gates Foundation appeared in the NewsHour segment. Their parts were taken by people funded by the Gates Foundation. The NewsHour introduced series host, John Tulenko, as a “special correspondent from Education Week.” Education Week’s parent company has received $12.6 million in Gates Foundation funding. Before joining Education Week, Tulenko worked at Learning Matters, recipient of $1 million in Gates grants.
Tulenko interviewed Anthony Carnevale, head of Georgetown’s Center on Education and the Workforce (CEW) and recipient of $9.7 million in Gates grants. CEW’s postsecondary policy appeared as early as 2012 in a Gates-funded report. CEW’s research informs the Gates Foundation’s current postsecondary strategy. It also appeared in Bill Gates’ blog, in the foundation’s video on postsecondary success, and most recently on the Gates-funded PBS NewsHour.
Tulenko also interviewed Michael Petrilli, president of the Fordham Institute, recipient of $7.8 million in Gates funding. Petrilli, Carnevale and the Gates Foundation argue that too many students go to college and amass debt only to drop out. The solution they propose is that students at risk of dropping out receive advice to consider vocational education instead of going to college.
The only person on the show opposed to re-directing students toward job skills programs was Carol Burris. Burris worried that such career advice would be based on stereotypes, especially racial stereotypes. Of the three academics interviewed, Burris was the only one not funded by the Gates Foundation.
For journalism, however, the question is not whether the Gates Foundation’s postsecondary policy should be followed or not. The issue is that the PBS NewsHour ran a story as news that is not distinguishable from the advocacy of a funder.
The Gates Foundation’s role as funder in the story also was not visible to viewers. The credits for the segment stated that principal support came from the Noyce Foundation. The Noyce Foundation is defunct. And although NewsHour spokesperson Nick Masella said “NewsHour's education funders are listed on our education web page,” the Noyce Foundation is not among them.
I asked Masella why the NewsHour used a “special correspondent” rather than a NewsHour correspondent and whether Education Week contributed funding. Masella did not reply. Similarly, Masella would not say whether its Gates Foundation grant supported the segment, only that: “The PBS NewsHour credits the Gates Foundation every night on our broadcast, as we do with other foundations, in accordance with PBS's funding standards.”
But the NewsHour gives viewers the impression that the Gates Foundation supports all the NewsHour's good work, when actually Gates money funds stories only on education, stories which do not disclose this restricted funding. By contrast, when the NewsHour covers, for example, rail issues, it clearly states that it receives funding from BNSF.
More in line with the impression PBS gives to viewers, the MacArthur Foundation does support all the NewsHour's good work. MacArthur's $1.5 million grant is not restricted. Although MacArthur does issue some restricted journalism grants, according to Kathy Im, MacArthur’s Director of Journalism and Media: “When we have a well-established relationship with a grantee and have confidence in their editorial vision and dissemination strategies, we tend to provide unrestricted support in order to provide maximum flexibility to the organization and its leadership.”
Gates Foundation v. the People of the United States
MacArthur supports journalism in the public interest; the Gates Foundation supports journalism in support of its policy interests. The MacArthur Foundation believes in open society principles; Bill Gates believes institutions of civil society are iffy: “The closer you get to it and see how the sausage is made, the more you go, oh my God!” Gates told the Financial Times. He wondered whether in American democracy, “can complex, technocratically deep things – like running a healthcare system properly in the US in terms of impact and cost – can that get done?”
Imagine, continued Gates, “the idea that all these people are going to vote and have an opinion about subjects that are increasingly complex... Do democracies faced with these current problems do these things well?” Perhaps if they are shown how by their betters.
Whether foundations “do” global health better than democracies and the institutions of civil society is a question that is not asked. Instead of holding the Gates Foundation accountable, a number of influential journalists at trusted news organizations write to foundation storylines and pay down their mortgages with foundation funding.
Muckrakers might have called this corruption. At the Gates Foundation, it’s philanthropy.
"I do not see any conflict of interest." - Seth Berkley
Gavi's new board chair, Ngozi Okonjo-Iweala, simultaneously joined the sovereignty practice at Lazard, an investment bank which Bloomberg has described as "banker to the broke." A number of Gavi-supported countries are Lazard clients. Also, Gavi-eligible countries might consider retaining Lazard to enhance their prospects for Gavi funding.
[See previous article, "Gavi Board Chair-elect Joins Lazard's Sovereignty Practice the Same Day."]
However, Gavi CEO, Seth Berkley, said: "I do not see any conflict of interest." Continued Berkley: "Many of our Board members have other jobs and board positions and we are very careful to monitor any potential conflict of interest issues." As a sign of the legitimacy of the arrangement, Berkley said "the announcement of her work with Lazard and Gavi were coordinated by the communication team and announced the same day." Previously, Gavi spokesperson Rob Kelly denied such coordination of the announcement. Kelly also said Gavi had not facilitated Okonjo-Iweala's employment arrangement with Lazard.
As board chair of Gavi, Okonjo-Iweala will oversee and have signing authority on Gavi grants. Gavi recently raised $7.5 billion to fund vaccine grants over the next four years.
Although Gavi board members do indeed have other jobs, I asked Berkley:
Is it not the case that, as Gavi board chair, Ngozi Okonjo-Iweala will have a hand in distributing several billion dollars to finance ministries while, on the other hand, as part of Lazard, Okonjo-Iweala will be receiving money from finance ministries?
Berkley did not reply.
The Gavi board, which elected Okonjo-Iweala unanimously, appears to have been unaware of her arrangement with Lazard. According to Berkley, "The Board delegated responsibility for due diligence to the Board-appointed Recruitment Committee." I asked Gavi board member Zulfiqar Buttha if the board knew of the Lazard affiliation when the board elected Okonjo-Iweala. He wrote back: "No."
None of the Gavi board members I emailed responded regarding whether joint Gavi-Lazard positions represented a conflict of interest. I emailed:
- Flavia Bustreo, WHO (Assistant Director-General)
- Zulfiqar Buttha, Unaffiliated
- Tim Evans, World Bank (Director)
- Geeta Rao Gupta, Unicef (Deputy Executive Director)
- Orin Levine, Gates Foundation (Director)
- Katie Taylor, United States (Deputy , USAID)
Okonjo-Iweala's election appears to violate Gavi statutes and bylaws
The Gavi statutes state:
"Board members will select the Chair and a Vice Chair of the Board from among their own voting members..."
The Gavi bylaws appear to reinforce the statutes. Section 2.6 says:
"The Chair and Vice Chair will be selected according to Article 12 of the Statutes from among voting Board Members (not Alternate Board Members)."
Okonjo-Iweala was not a Gavi board member. Perhaps to circumvent Gavi statutes, Okonjo-Iweala was both named to the board and elected board chair at the September 2015 board meeting. By contrast, outgoing board chair, Dagfinn Høybråten, previously served on Gavi's board before becoming chair.
I asked Seth Berkley if Okonjo-Iweala's election violated Gavi statutes. He did not reply.
Alleged disappearance of Nigerian oil money; Gavi investigation of Nigeria
Okonjo-Iweala is widely known as an anti-corruption crusader. However, the former finance minister of Nigeria has been caught up in allegations that Nigerian oil revenues were improperly diverted. Investigations continue. The newly-elected president of Nigeria, Muhammadu Buhari, recently said, "We have some documents where Nigerian crude oil was lifted illegally and the proceeds were put into some personal accounts instead of the federal government accounts." One estimate put the amount of fraud at $20 billion. A former oil minister is being investigated and others might be named.
In theory, Gavi is still investigating misuse of Gavi funds distributed to Nigeria while Okonjo-Iweala was Finance Minister. A 2014 audit recommended that the Economic and Financial Crimes Commission carry out "a thorough and detailed investigation of the Gavi grants disbursed to Nigeria." In addition, the audit sought "a full-scale audit to cover both select, high-risk expenditures in prior years, and other expenditure from the period 2011-2013" not examined in the course of the 2014 audit.
It is difficult to see how Gavi could pursue or cooperate fully with any investigation of wrongdoing that might involve its board chair. I emailed Simon Lamb, Gavi's Managing Director, Audit and Investigations, and asked if Gavi was following through on the 2014 audit recommendations. He did not reply.
Problems with 30 of 107 papers reviewed
In the 1998 paper, 10 of the 47 studies were mishandled; correcting the errors overturned the review's conclusion that HER2 is independently prognostic.
The error rate increased in subsequent reviews. The 2003 update added 34 more papers and 10 new errors. The most recent review, published in 2009, added 26 papers and 10 more new errors.
All told, in the 2009 review, of the 107 papers reviewed, a total of 30 (28%) are either miscoded or should not have been included:
- 10 papers coded 'Yes' for multivariate significance should be 'No'
- 7 papers coded 'Yes' for multivariate significance should be 'NA'
- 11 papers should not have been included
- 2 papers coded 'No' for multivariate significance should be 'Yes'
Appendix A below defines what constitutes an error. Appendix A also enumerates and explains the 30 errors contained in the 2009 review.
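The tallies above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the counts stated in this article, and the variable names are assumptions chosen for readability, not anything taken from the reviews themselves.

```python
# Error categories reported above for the 2009 review (counts from the article).
error_counts = {
    "coded Yes, should be No": 10,
    "coded Yes, should be NA": 7,
    "should not have been included": 11,
    "coded No, should be Yes": 2,
}

# Papers added and new errors introduced by each review (counts from the article).
papers_added = {"1998": 47, "2003": 34, "2009": 26}
new_errors = {"1998": 10, "2003": 10, "2009": 10}

total_errors = sum(error_counts.values())
total_papers = sum(papers_added.values())
overall_rate = total_errors / total_papers

# Error rate among the papers newly added by each review.
rate_by_review = {year: new_errors[year] / papers_added[year] for year in papers_added}

print(f"{total_errors} of {total_papers} papers ({overall_rate:.0%})")
for year, rate in rate_by_review.items():
    print(f"{year}: {new_errors[year]} errors in {papers_added[year]} new papers ({rate:.0%})")
```

Computed this way, the four categories sum to the 30 errors cited, the three reviews sum to 107 papers, and the share of errors among newly added papers rises with each review.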
Stuffing the ballot box
First author acknowledges possibility of errors, disputes none of them
Jeffrey Ross, the first author on all three reviews, acknowledged the first two might contain errors. Regarding the 1998 review, Ross wrote in an email: "It is certainly possible that the studies you have cited were not perfectly listed in my manuscript from so many years ago."
With respect to the 2003 review, Ross wrote: "I have no reason to believe that your conclusions are not correct and that there were scattered errors in the meta-analysis of the published literature in our 2003 manuscript."
However, contacted regarding the most recent, 2009 paper, Ross wrote: "Due to time constraints, I am unable at this time to either agree or disagree with your analysis..." In PubMed, the 2009 review is cited 133 times.
No response from The Oncologist
According to the Committee on Publication Ethics (COPE) guidelines, journal editors should consider issuing a correction if "a small portion of an otherwise reliable publication proves to be misleading (especially because of honest error)."
Three emails documenting possible issues in the Ross et al. reviews, sent to Martin Murphy, executive editor at The Oncologist, have not been answered. The Oncologist is a member of COPE.
Papers counted as representing an error were either miscoded or inappropriately included. Note the 2009 review includes all the papers and errors contained in the 1998 and 2003 reviews.
Enumeration of Errors
84. Lal et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes, as the title suggests: "Correlation of HER-2 Status With Estrogen and Progesterone Receptors and Histologic Features in 3,655 Invasive Breast Carcinomas"
85. Huang et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes, as the title suggests: "Association between tumour characteristics and HER-2/neu by immunohistochemistry in 1362 women with primary operable breast cancer"
87. Ariga et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes, as the title suggests: "Correlation of Her-2/neu Gene Amplification with Other Prognostic and Predictive Factors in Female Breast Carcinoma"
89. Prati et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes, as the title suggests: "Histopathologic Characteristics Predicting HER-2/neu Amplification in Breast Cancer"
90. Tanner et al.: Yes to NA
The study does not include a multivariate analysis of HER2 as an independent prognostic factor. In the paper's only multivariate analysis, all the patients were HER2+:
91. Diallo et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes.
99. Sandri et al.: Yes to Exclude
Examines HER2 in serum, as the title suggests: "Serum EGFR and serum HER-2/neu are useful predictive and prognostic markers in metastatic breast cancer patients treated with metronomic chemotherapy"
101. Sunami et al.: Yes to Exclude
Correlates HER2 with other biomarkers not clinical outcomes, as the title suggests: "Estrogen receptor and HER2/neu status affect epigenetic differences of tumor-related genes in primary breast tumors"
106. Ludovini et al.: Yes to No
Found HER2 by IHC and FISH significant in univariate analysis. But only serum HER2 was found prognostic in the multivariate analysis. (See table 5.)
HER2 is widely, even universally recognized as prognostic of adverse clinical outcomes in breast cancer. However, two review papers supporting this belief contain a remarkable number of errors, raising the question of what evidence now supports a prognostic role for HER2.
Correcting the errors in a 1998 review of 47 studies by Jeffrey Ross and Jonathan Fletcher overturns the review's conclusion that HER2 is independently prognostic. Ross did not dispute the corrections.
The 47 papers and the errors of the 1998 review are included in a 2003 update from Ross et al. The 2003 edition adds 34 more papers and introduces 10 new errors. All told, the 2003 review examined 81 papers and erred on 20.
I previously documented the mistakes of the 1998 review. There were nine coding errors and two papers that should not have been included in the review. (One of the two papers was also miscoded, but I count that paper only once, making for 10 total errors rather than 11.)
The 2003 review adds the following 10 new errors:
- 5 papers coded 'Yes' for multivariate significance should be 'No'
- 2 papers coded 'Yes' for multivariate significance should be 'NA'
- 1 paper should not have been included
- 2 papers coded 'No' for multivariate significance should be 'Yes'
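As a quick arithmetic check on the 2003 totals, the sketch below adds the four categories of new errors to the figures carried over from the 1998 review. It uses only numbers stated in this post, and the variable names are illustrative.

```python
# New errors introduced by the 2003 update, by category (from the list above).
new_errors_2003 = {
    "Yes should be No": 5,
    "Yes should be NA": 2,
    "should not have been included": 1,
    "No should be Yes": 2,
}

# Figures carried over from the 1998 review.
papers_1998, errors_1998 = 47, 10
papers_added_2003 = 34

total_papers_2003 = papers_1998 + papers_added_2003
total_errors_2003 = errors_1998 + sum(new_errors_2003.values())

print(f"{total_errors_2003} of {total_papers_2003} papers "
      f"({total_errors_2003 / total_papers_2003:.1%})")
```

The result matches the text: 20 errors across 81 papers, or roughly one paper in four.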
The basis for these conclusions is found in Appendix I below.
Contacted regarding these errors, first author Jeffrey Ross replied that because he was traveling, he didn't "have complete access to review your findings." But, continued Ross: "I have no reason to believe that your conclusions are not correct and that there were scattered errors in the meta-analysis of the published literature in our 2003 manuscript."
The scope and scale of the errors might make both papers candidates for correction or retraction. The Oncologist published both. Executive Editor Martin Murphy did not reply to an email regarding problems with the 1998 review.
5 papers coded 'Yes' for multivariate significance should be 'No'
1) Jukkola et al. (2001)
The abstract reports: "In multivariate regression analysis, only tumour size and nodal involvement were risk factors for poor survival when analysed separately together with c-erbB-2 and receptor status..."
Section 3.2 states: "In multivariate Cox stepwise regression analsis, tumour size and nodal involvement emerged as independent prognostic factors when analysed separately in combination c-erbB-2, indicating a 2.9 (90% CI 1.9-4.4) risk of death in node-positive patients. For patients with tumour sizes T3 or T4 the risk of death was 2.7 (90% CI 1.4-5.1) and 4.8 (90% CI 2.5-9.5), respectively, c-erbB-2 status did not reach significance in this model, nor when analysed in combination with tumour size, nodal involvement and receptors."
2) Rudolph et al. (2001)
HER2 only emerges as prognostic if CR is removed: "When all variables that attained statistical significance in the univariate analysis were included in the multivariate model, the CR was the first and most significant independent indicator of both AOS and DFS (P .0001; Table 3). Next to CR, only PR status was found to be an independent prognostic factor, albeit of borderline significance."
3) Pinto et al. (2001)
HER2 is prognostic only in combination with other markers, not independently: "C-erbB-2 is an independent prognostic indicator when evaluated in conjunction with ploidy and SPF."
4) Suo et al. (2002)
HER2 is only prognostic when combined with EGFR or HER4. See Table 5.
5) Spizzo et al. (2002)
The paper states: "Multivariate analysis for DROS revealed that nodal status, EpCAM overexpression, tumor size and histological grade were significant prognostic factors. Hormone receptor expression and Her-2/neu overexpression were not significant predictors of DROS. For DFS, nodal status, Ep-CAM overexpression, tumor size and progesterone receptor expression were significant prognostic factors. Her-2/neu overexpression, histologic grade and estrogen receptor expression had no prognostic value for disease-free survival (Table III)."
2 papers coded 'Yes' for multivariate significance should be 'NA'
1) Agrup et al. (2000)
No multivariate analysis
2) Horita et al. (2001)
No multivariate analysis
1 paper should not have been included
Wright et al. (1989) is one of three studies incorporated in Gullick et al. (1991) with the result that the same 185 patients are counted twice.
2 papers coded 'No' for multivariate significance should be 'Yes'
1) Scorilas et al. (1999)
Tables 2 and 3 show HER2 overexpression prognostic in multivariate analyses of early relapse and overall survival.
2) Rosenthal et al. (2002)
A paper on which Ross is senior author found "Multivariate analysis of the combined LN+ and LN− lobular and ductal cases revealed that HER-2/neu amplification (P 0.002), pathologic stage (P < 0.0001), and node positivity (P < 0.0001) were all independent predictors of disease-related death."
Ngozi Okonjo-Iweala, Chair-elect of the Gavi Board (Photo credit: Gavi)
Gavi, the public-private partnership in charge of global immunization efforts, recently announced the unanimous approval of Ngozi Okonjo-Iweala as board chair-elect. The same day, Lazard announced that Okonjo-Iweala, the former finance minister of Nigeria, had joined its sovereignty practice. Recent Lazard clients have included countries receiving Gavi funding, potentially creating a conflict of interest.
In January, Gavi raised $7.5 billion to be disbursed to developing countries from 2016 through 2020.
Gavi knew of Okonjo-Iweala's Lazard appointment and believes it will not pose a problem. According to Gavi spokesperson Rob Kelly: "Financial oversight of programmes [is] the responsibility of the Gavi CEO and is managed on a day-to-day basis through teams within the Gavi Secretariat." Potential conflicts of interest, Kelly argues, won't compromise decisions about money because of how Gavi is structured. However, the CEO reports to the board, which has ultimate financial oversight of Gavi. The board chaired by Okonjo-Iweala is Gavi's "supreme governing body," according to its statutes.
Lazard’s sovereignty clients include Gavi grant recipients such as the Democratic Republic of Congo, Mauritania, Nicaragua and Ukraine, according to recent regulatory filings. Retaining Lazard, the so-called “Banker to the Broke,” might be seen by all Gavi-eligible countries as a way to, for example, win larger grants. Also, Gavi eligibility and criteria for graduating from Gavi support have less obvious but still significant financial implications for many countries in the world. A country paying Lazard might lead directly or indirectly to a financial benefit to Okonjo-Iweala, who, at least according to Gavi statutes, exerts considerable influence on Gavi decisions having financial consequences for countries seeking or receiving Gavi support.
In addition, Nigeria was found by Gavi to have misused vaccine grant money while Okonjo-Iweala was finance minister. After a 2014 audit, Gavi demanded repayment of $2.2 million, a figure which may understate the extent of fraud. As much as 87% of the amount audited might have been skimmed off. Okonjo-Iweala's signature, along with that of the health minister, is on Nigeria's status reports to Gavi for 2011, 2012 and 2013, the years examined by the Gavi audit. Gavi has announced a more far-reaching audit and requested that Nigeria conduct a criminal investigation. Okonjo-Iweala might play multiple, conflicting roles in these investigations.
Okonjo-Iweala is also embroiled in a controversy over an alleged $20 billion in missing Nigerian oil revenue. According to Rob Kelly, Gavi was aware of the matter and Okonjo-Iweala "was selected following an intensive and competitive search, which included a thorough due diligence process." Okonjo-Iweala has a reputation as an anti-corruption crusader. In 2012, she published a book on her experience entitled "Reforming the Unreformable: Lessons from Nigeria."
Nigeria has one of the worst immunization systems in the world, a fact that Kelly said "didn’t play a role" in Okonjo-Iweala's selection as Gavi board chair. Nigeria's system is so weak that it is difficult to ascertain immunization rates. According to Gavi, Nigeria reported 70% coverage for 2014, but a 2013 house-to-house survey found only 38% of children immunized.
Prior to the founding of Gavi in 2000, WHO and UNICEF ran global immunization. Gavi originated partly in reaction to the perception that WHO had been debilitated by politically and financially motivated appointments and staffing decisions. In contrast to WHO processes, the most recent selections of Gavi's CEO and board chairs have been tightly controlled.
The current CEO, Seth Berkley, won unanimous approval from the board on March 8, 2011. His nomination by the Governance Committee came earlier the same day, again unanimously. Board minutes record one member mentioning that this “short turnaround time” meant there was little opportunity to consult with board constituencies. Both meetings were by teleconference.
Berkley's selection was actually the work of a four-person subcommittee. Donor nations, who provide most Gavi funding, were placed in a "reference group" outside the four-person subcommittee with actual authority to choose a CEO. The countries Gavi is supposed to serve appear to have had no involvement in selecting the CEO: "Developing country voices need to be part of this process," noted Gavi meeting minutes, "however no volunteers from this constituency emerged." And although Gavi has board seats for developing countries, Gavi chooses who will "represent" those countries. WHO and UNICEF get only 2/3 of a seat each, squeezing in with the World Bank to share two seats total, the same number held by the vaccine industry.
Selection of the last two Gavi board chairs followed a ramrod process similar to the 2010 CEO decision. The Governance Committee appointed a smaller subcommittee. In both 2010 and 2015, this group was chaired by George Wellde Jr., a former partner at Goldman Sachs and one of nine "unaffiliated individuals" on Gavi's 28-member board. Wellde's subcommittee proposed Ngozi Okonjo-Iweala as nominee to the Governance Committee, which approved the choice on September 17. The board unanimously approved her selection the next day, September 18, according to Gavi's Rob Kelly.
The simultaneous announcements about Okonjo-Iweala raise the question of whether Lazard and Gavi coordinated their timing. Gavi has not yet said if the coordination extended to helping facilitate Okonjo-Iweala's joining Lazard. [Update 10/20/2015: Gavi's Rob Kelly says Gavi did not coordinate announcement timing with Lazard nor did Gavi facilitate Okonjo-Iweala's position at Lazard.]
The Gates Foundation, which started Gavi, and the US representative to Gavi, USAID's Katie Taylor, had not responded to requests for comment by publication time.