December 12, 2016
By James Lyons-Weiler, Bernadette Pajer, and Zoey O’Toole
Summary: We describe seven major flaws in the study by Zerbo et al. (2016). The combined tactic of “correcting for” interacting and functionally related risk factors leads to the non-sequitur conclusion and false generalization that vaccines in pregnancy are not related to increased risk of autism. We conclude that these design and analysis flaws make their recent study of vaccination during pregnancy irrelevant for those most at risk. Even if the results were valid (which they are not), they would be relevant only for the mother/fetus pairs in the general population who are least likely to see autism develop as a result of vaccination during pregnancy.
A recent study of 196,929 children born between 2000 and 2010 was conducted to determine whether an association exists between influenza infection and/or vaccination during pregnancy and subsequent ASD diagnosis in the child. Our review of the study found serious design and analysis flaws.
In the initial, unadjusted analysis, the proportion of ASD was higher among children of vaccinated women.
According to the authors:
The unadjusted proportion of ASD was slightly higher throughout follow-up among children of women who received influenza vaccinations during pregnancy compared with children of unvaccinated women.
After initially finding an association, they adjusted the data and ultimately dismissed the result as likely due to chance.
Here are the flaws we identified in the study:
- Use of Bonferroni Correction for Multiple Hypothesis Testing
The Bonferroni correction is widely recognized as overly conservative. It works as follows: if you test one hypothesis, the accepted type I error risk, called alpha, is usually 0.05. At alpha = 0.05, you expect to reject a true null hypothesis by chance only 5 times out of 100. The Bonferroni correction divides that risk among all of the hypotheses, so if you test two hypotheses, you can accept a risk of only 0.025, or 2.5%, for each. For three hypotheses, it’s 0.05/3, and so on. This study applied Bonferroni to the main hypothesis while considering multiple supposed “confounders”, variables that might (emphasis on might) explain a spurious correlation. A statistically significant but unwanted positive association is easy to hide by adding several proposed confounders and treating them as independent “hypotheses”, when they are not, to justify correcting the type I error risk for multiple comparisons. One can make a statistically significant finding vanish in a sea of statistical shamwizardry (Lyons-Weiler, 2016).
There are in fact less stringent methods to correct for multiple hypothesis testing, including adaptive methods. But the larger problem is that use of the Bonferroni correction in this manner is also unorthodox. Model selection criteria should be used, not multiple hypothesis testing, because the covariates used are not clearly independent of each other, and model fit is not assessed by corrections for multiple comparisons because the model terms may in fact interact. At the least, variance inflation factors should be reported.
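The arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used in the study: the p-values are hypothetical, the function names are our own, and Holm’s step-down procedure is shown only as one well-known example of a less stringent correction (it is not an adaptive method).

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni: reject a hypothesis when its p-value is <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value (k = 0, 1, ...)
    to alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is also retained
    return reject


# Hypothetical p-values, for illustration only (not taken from any study).
p_values = [0.010, 0.015, 0.030, 0.400]

print(bonferroni_reject(p_values))  # threshold is 0.05 / 4 = 0.0125 for every test
print(holm_reject(p_values))        # thresholds are 0.0125, 0.0167, 0.025, 0.05
```

With these made-up inputs, Bonferroni rejects only the first hypothesis, while Holm also rejects the second, which shows how the choice of correction alone can change which findings survive.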
- Over-Correction Using Collinear Variables without Objective Model Selection Criteria
The data were “adjusted” with many covariates because they are “known” to be associated with increased risk of autism. In some cases, they treated variables as separate risk factors for autism on their own, when they are obviously associated with risk for autism due to susceptibility to environmental exposures. For example, it has been acknowledged that a mitochondrial genetic variant may increase susceptibility to the toxins in vaccines, eliciting a severe adverse reaction, including autism (e.g., Hannah Poling). In the overall analysis, Zerbo et al. misinterpret what those variables are telling them by ignoring their combined utility in predicting which mothers would have children with autism. Instead, they use the variables to merely assess association. Prediction model performance evaluation should include the model’s sensitivity, specificity, and accuracy, and machine-learning techniques exist that allow one to optimize such predictions using training sets (sets of patients used to learn the model) and test sets (independent sets of patients used to evaluate the model’s performance, and its generalizability).
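The performance measures named above are simple to compute once a model’s predictions on an independent test set are in hand. The following is a minimal sketch; the labels and predictions are hypothetical, and the function name `classification_metrics` is our own.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = case, 0 = non-case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the model catches
        "specificity": tn / (tn + fp),  # share of non-cases the model clears
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical held-out test set: outcomes and the model's predictions for them.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The key design point is that `y_true` and `y_pred` come from patients who were not used to fit the model; metrics computed on the training set would overstate how well the model generalizes.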
- Failure to Consider and Report Interaction Terms
This flaw is related to #2. Many of the variables treated as “confounders” by Zerbo et al. may instead be useful predictor variables to understand specifically who may be at risk of serious adverse events after vaccination, when used in combination. The variables are potential risk factors or risk indicators of environmental and/or genetic susceptibility that may interact with vaccination. For example, maleness increases susceptibility to mercury injury (Vahter et al., 2007; Camsari C et al., 2016), and thus gender could be a useful predictor variable (in general). Other variables may be even more informative, such as having allergies, asthma, or another autoimmune condition. There is a great deal of overlap between these three conditions, and all indicate an overactive immune system (Chan et al., 2015), and thus each is a potentially useful indicator of increased risk of overreaction to vaccine components (IOM, 2012). Methods have long existed to assess the degree of independent and combined contribution of such variables, rather than treat them as competing hypotheses. Factoring out variation from covariates, leaving residual variation to be “explained” by other covariates, is “descriptive statistics”. Testing the utility of the covariates in combination is “prediction modeling”. Both require consideration of the interaction terms, and the interaction terms, when significant, point to (in the case of adverse events or drug efficacy) potentially useful indications that important subgroups may exist within the clinical population being studied. A focus on descriptive statistics is identifiable by the reporting of p-values (significance) of covariates only, without the accuracy, sensitivity, and specificity of any of the prediction models. Interpretation here is key; integrative prediction modeling yields increased understanding of the contribution of each covariate.
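The difference between a single pooled estimate and an interaction can be sketched with a small stratified calculation. The counts below are made up purely to show the arithmetic and are not data from any study; the subgroup names and the helper `risk_ratio` are our own.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    # Cross-multiplied form of (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total).
    return (exposed_cases * unexposed_total) / (unexposed_cases * exposed_total)


# Made-up counts for illustration only; they are NOT data from any study.
# Each tuple: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "subgroup_a": (10, 1000, 10, 1000),
    "subgroup_b": (30, 1000, 10, 1000),
}

per_stratum = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Pooling the strata collapses the counts into a single 2x2 table.
pooled_counts = [sum(column) for column in zip(*strata.values())]
pooled = risk_ratio(*pooled_counts)

# Ratio of stratum-specific risk ratios; 1.0 would mean no multiplicative interaction.
interaction_ratio = per_stratum["subgroup_b"] / per_stratum["subgroup_a"]

print(per_stratum)
print(pooled, interaction_ratio)
```

In this toy example the pooled risk ratio sits between the two stratum-specific values and describes neither subgroup well; only the stratified comparison reveals that the association is concentrated in one subgroup.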
We need studies that reveal potentially useful indicators of which patients are least able to tolerate a vaccine.
In 2010, a study entitled “What’s the best statistic for a simple test of genetic association in a case-control study” (Kuo and Feingold, 2010) concluded “that the most commonly used approach to handle covariates–modeling covariate main effects but not interactions–is almost never a good idea”.
The same is true of correlative studies of health outcomes, even when they do not consider genetics, especially when the variables considered may be proxies for genetic variation. Both asthma and autoimmunity are thought to have, at least in part, a genetic basis, and that genetic contribution may be related to the genetic component of the familial risk seen in autism. In other words, we need to start looking for the genetic susceptibility subgroup(s) in all future studies on causes of autism. Zerbo et al. failed to objectively compare the predictive performance of alternative models, including those with interactions, and used a weak analysis framework that seems designed to indemnify vaccination during pregnancy. By doing so, they have severely impeded understanding rather than improving it.
The length the analysts went to in torturing the data is telling:
Maternal and infant characteristics, obtained from KPNC prenatal and pediatric electronic medical records and state vital statistics databases, adjusted the measure of association between maternal influenza infection and vaccination during pregnancy and ASD risk in multivariate analyses. Covariates included the child’s sex, calendar conception year (categorical variable), gestational age, maternal prepregnancy body mass index (BMI, calculated as weight in kilograms divided by height in meters squared) (BMI < 18.5 = underweight; 18.5 ≤ BMI < 25 = normal weight; 25 ≤ BMI < 30 = overweight; BMI ≥ 30 = obese), maternal age at delivery (younger than 20, 20 to 24, 25 to 29, 30 to 34, and ≥ 35 years), maternal education at delivery (≤ high school graduate, some college education, college graduate, postgraduate, or unknown), maternal race/ethnicity (Asian, black, white, or other), and gestational diabetes (yes/no). Additional covariates included maternal asthma (yes/no), hypertension (yes/no), autoimmune disease (yes/no), and allergies (yes/no) recorded in the electronic medical record before the conception date. All covariates were chosen a priori because of their association with ASD in previous studies. . . or because they are indications for influenza vaccination.
We find a major flaw in the use of the concept of “correcting for” maternal asthma and autoimmunity. In addition to the fact that vaccines can cause both asthma and autoimmunity, maternal asthma and autoimmunity are associated with autism. Autism is clearly a complex medical condition that involves both the innate and the adaptive immune system. Because it would be expected that pregnant women with these conditions have high risk, treating these conditions only as independent hypotheses is a serious flaw.
Based on background knowledge, the study is deeply flawed in “correcting for” certain variables, specifically maternal asthma and autoimmunity. First, they are not independent variables. Both maternal asthma and autoimmunity are intimately associated with autism and immune activation (see, for example, Lyall et al., 2016). Given the background knowledge supported by science on this question, one would expect to find that the vaccine association is higher in mothers with asthma and autoimmunity. The authors also mention that risk is highest for children conceived in influenza-heavy months. Their mothers are the most likely to be vaccinated in the first trimester. Those pregnancies should have been compared separately, and, if that were done, separate power calculations should have been conducted to ensure that an association could be found if it did, in fact, exist.
In technical terms, the use of highly collinear variables (variables that are highly correlated and may not be independent of each other) is called multicollinearity, and adjusting for many such variables over-adjusts, or “overfits,” the model. When used in the extreme, such as was done with past, flawed studies on the question of vaccines and autism, the analyst can “pull out” all of the variance (differences) among the patients using multiple, non-independent (redundant) variables, thus leaving only residual, random difference (noise) to be “explained” by the final independent variable (vaccination). The study did not report any of the objective measures of model overfit, and yet each of these factors was used to change variance in the model, and the authors used those changes in the data to dismiss the original connection as “possibly due to chance.”
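One objective measure of collinearity is the variance inflation factor (VIF). The sketch below covers only the simplest case, assuming exactly two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them; with more predictors, r² is replaced by the R² from regressing one predictor on all the others. All data and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors that move together, as in a highly collinear model.
predictor_1 = [10, 20, 30, 40, 50]
predictor_2 = [12, 19, 33, 38, 52]

# Hypothetical predictor that is uncorrelated with predictor_1.
predictor_3 = [4, 1, 0, 1, 4]

vif_collinear = vif_two_predictors(predictor_1, predictor_2)
vif_independent = vif_two_predictors(predictor_1, predictor_3)
print(vif_collinear, vif_independent)
```

A commonly cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity; the first pair here far exceeds that range, while the uncorrelated pair gives a VIF of 1.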
Dr. Lyons-Weiler’s “Ice Cream Consumption Corrected For Cone Sales” Analogy
An apt analogy would be a study that sought to examine whether there was an association between daytime temperatures and ice cream consumption at ice cream parlors across cities in the US. After collecting both variables (daytime temperature being an independent, or predictor, variable; ice cream consumption being the dependent variable), the data analyst decides to “correct for” the variables “consumption of cones” and “trips to ice cream parlors” because they are already suspected to be able to “explain” the volume of ice cream consumption across cities in the US. They clearly could use the sale of cones to predict ice cream consumption, but it would be pointless. Better indicators might be the median daytime temperature, the average amount of expendable income, or the frequency of lactose intolerance. These variables would tend to lead to increased understanding of “ice cream consumption”, and would empower the analyst to predict ice cream sales. “Correcting” the model for the sale of cones, or for the number of trips to ice cream parlors, would leave only random variation among customers. They are dependent and collinear variables: as trips to ice cream parlors increase, both the volume of consumed ice cream and the number of cones consumed increase. Use of other, known dependent variables as independent variables, use of highly collinear variables, and model overfit are examples of statistical shamwizardry.
These are clever techniques in this setting because the non-specialist (even a peer reviewer) might think they seem appropriate. A give-away is that the functional relationship among variables (dependent/independent) has been perturbed. Other studies have been warped using the same strategy. It took the CDC four years to figure this out for the Verstraeten et al. study, to the point that the lead author sent an email complaining that the association just would not “go away” (Verstraeten, 1999).
- Lack of Power Analysis
The study does not report whether its sample sizes were sufficient to achieve rejection of the null hypothesis of no association, if one did, in fact, exist. And the fact that the authors studied no interactions is telling: such studies require much larger sample sizes. Because power for interactions is not reported, the suitability of the sample size of the study cannot be assessed.
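A power calculation of the kind described above can be approximated in a few lines. This is a rough sketch under stated assumptions: it uses the normal approximation for a two-sided comparison of two proportions with unpooled variances, it ignores the negligible contribution of the opposite tail, and every rate and sample size shown is hypothetical.

```python
import math
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power to detect a difference between two true proportions."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return NormalDist().cdf(abs(p1 - p2) / standard_error - z_crit)


# Hypothetical outcome rates in two groups, for illustration only.
rate_group_1 = 0.015
rate_group_2 = 0.020

power_small = power_two_proportions(rate_group_1, rate_group_2, 5_000, 5_000)
power_large = power_two_proportions(rate_group_1, rate_group_2, 20_000, 20_000)
print(round(power_small, 3), round(power_large, 3))
```

With these made-up inputs, the same difference in rates is detected less than half the time with 5,000 per group but more than nine times out of ten with 20,000 per group. Subgroup and interaction analyses split the sample further, which is why their power needs to be reported separately.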
- Biased Eligibility
The study’s eligibility criteria restrict the potential relevance of the Zerbo et al. study to parents whose children do not have later-diagnosed conditions on the spectrum. The authors report:
Eligibility was restricted to singleton children who were born at a gestational age of at least 24 weeks and who remained health plan members until at least 2 years of age (n = 196,929).
Many cases of ASD are not diagnosed until a child enters school; only severe cases of autism are usually diagnosed at age 2 or 3. Asperger’s and other higher-functioning ASDs and PDDs are diagnosed much later, often not until age 4 or 5, or even later. The Zerbo et al. study results are diluted by including children who left the health plan between the ages of 2 and 5, but who may have received an ASD diagnosis after leaving the health plan.
There may therefore be a large ascertainment bias, especially if leaving Kaiser is related to the likelihood of later diagnosis, as may be the case if parents determine they need a “better” insurance plan, or if a parent loses their job due to having to care for a child with autism.
This study appears to have winnowed the study population down to people with lower risk. That isn’t science; it’s marketing. When assessing risk meant to have implications for the entire population, studies should never deliberately exclude individuals with indicators that might include potentially related risk. That includes genetic mutations.
These people matter.
Anyone who would be given a vaccine in real-world conditions should qualify for inclusion in a study, and it is essential to determine whether, and more importantly how, special risk exists for vulnerable population groups. For example, if women with asthma and other autoimmune conditions who are homozygous for MTHFR C677 are removed from such studies, and an association disappears, that points toward, not away from, causality for a subset of individuals. As one such woman told us, “I shouldn’t be ‘controlled for,’ I should be front and center in the analysis!”
Excluding individuals with mutations that might confer risk should be avoided, too.
These descriptive studies use the weakest of possible multivariate statistical approaches, and badly at that. We need to do predictive science and see whether these factors contribute predictive power jointly or independently, and we must remain constantly diligent in seeking to understand for which population(s) the results of studies like these may be irrelevant as a direct result of the design or analysis. The public health policies emerging from Zerbo et al. must be restricted to a narrow population, defined both by the eligibility criteria and by the incorrect “correction for” collinear predictive and outcome variables. The study is, as conducted, irrelevant for women with asthma and autoimmunity, and for patients who change health care practices when the child is around the age of 2.5-5 years.
- Extreme Cohort Effect
A cohort effect arises when a variable not included in the study changes over time, and changes differently between clinical groups, in a way that can be expected to directly influence the dependent variable; it is a type of confounding. We find evidence of an extreme cohort effect in the Zerbo et al. study. No major effort existed before 2009 to vaccinate women during pregnancy. This study covers the time period 2000-2010. Most of the vaccinated women are therefore from the later years in the study:
. . . vaccination among pregnant women in this cohort (45,231, 23%). Vaccination rates increased from a low of 6% in 2000 to a high of 58% in 2010, and most vaccinated women were older, educated women.
Flu shots during pregnancy are now accompanied by pressure to accept other vaccines as well. For example, during the study time period, 14% received Tdap during pregnancy. By 2013, Tdap was administered during pregnancy in 41.7% of live births (Kharbanda et al., 2016).
- Failure to Report all Serious Adverse Events (SAEs)
The study was narrow in focus and did not consider other SAEs. Rates of spontaneous termination of pregnancies, for example, were not considered; other studies have reported (without note) dramatically increased rates of spontaneous terminations in pregnant women who received flu vaccines (England, 2012).
In sum, the very design of the study, and both the design and the implementation of the analysis approach used in the study, severely limit the utility of the conclusions for the general population.
Factoring out the variation in risk contributed by such variables effectively removes from the study the women who might be at most risk of serious adverse events due to vaccines. This shifts the focus away from the general population to the subset of the population who are not likely to have SAEs. In the roll-out of news stories about this study, this critical detail is completely lost, and in clinical translation, the unwarranted, non-sequitur conclusion that “vaccines during pregnancy do not contribute to autism risk” is grossly misleading. Moreover, even if the results were not flawed, the changes in the vaccination protocol during and since the study make the results irrelevant for the current population of pregnant women in the US.
This study’s initial results showing an increase in autism in the vaccinated group should be heeded as a warning to prospective parents. The logic of vaccine safety science is contorted and twisted. Several flu vaccine formulations still contain thimerosal. We know from first principles that thimerosal is a toxin; it shuts down the expression of the protein ERAP1, and causes widespread problems with proper protein truncation (Stamogiannos et al., 2016). We call for an independent reanalysis of the findings using appropriate modeling, and a retraction of the current publication.
Camsari C et al., 2016. Effects of periconception cadmium and mercury co-administration to mice on indices of chronic disease in male offspring at maturity. Environ Health Perspect. doi: 10.1289/EHP481.
Chan SK, Gelfand EW. 2015. Primary immunodeficiency masquerading as allergic disease. Immunol Allergy Clin North Am. 35(4):767-78. doi: 10.1016/j.iac.2015.07.008.
England, C. 2012. 4,250% Increase in Fetal Deaths Reported to VAERS After Flu Shot Given to Pregnant Women. https://vactruth.com/2012/11/23/flu-shot-spikes-fetal-death/
Kharbanda EO et al., 2016. Maternal Tdap vaccination: Coverage and acute safety outcomes in the vaccine safety datalink, 2007-2013. Vaccine. 34(7):968-73. doi: 10.1016/j.vaccine.2015.12.046.
Kuo CL, Feingold E. 2010. What’s the best statistic for a simple test of genetic association in a case-control study? Genet Epidemiol. 34(3):246-53. doi: 10.1002/gepi.20455.
Lyall K et al., 2016. Maternal immune-mediated conditions, autism spectrum disorders, and developmental delay. J Autism Dev Disord. 44(7):1546-55. doi: 10.1007/s10803-013-2017-2.
Lyons-Weiler, J. 2016. Cures vs. Profits: Successes in Translational Medicine. World Scientific. http://amzn.to/1qeo9Ef
Stamogiannos A et al., 2016. Screening Identifies Thimerosal as a Selective Inhibitor of Endoplasmic Reticulum Aminopeptidase 1. ACS Med Chem Lett. 7(7):681-5. doi: 10.1021/acsmedchemlett.6b00084.
Vahter M et al., 2007. Gender differences in the disposition and toxicity of metals. Environ Res. 104(1):85-95.
Verstraeten, T. 1999. “It just won’t go away” email to Robert Davis and Frank DeStefano, Dec 17, 1999.
James Lyons-Weiler, PhD is the CEO and Director of the Institute for Pure and Applied Knowledge, and author of three books: The Environmental and Genetic Causes of Autism (Skyhorse), Cures vs. Profits, and Ebola: An Evolving Story. He has directed the analysis of data from over 100 biomedical research studies.
Bernadette Pajer is a novelist, citizen journalist, and informed-consent advocate. She has a BA in Interdisciplinary Arts and Science from the University of Washington, Bothell. Her mystery novels have been peer-reviewed by the Washington Academy of Sciences and earned their science seal-of-approval.
Zoey O’Toole is the Editor-in-Chief of the Blog at the Thinking Moms’ Revolution and is currently serving on the TMR board as VP of Communications. She has two children, a 17-year-old girl and a 10-year-old boy, who inspire her to take children’s health concerns extremely seriously.