We would like to respond to several incorrect and misleading claims made by the IAAF in its Briefing Notes released on 7 May 2019. These claims relate primarily to a 2017 research paper that provided the evidence used for the selection of specific track and field events covered by the IAAF’s DSD Regulations (Bermon and Garnier BJSM, hereafter BG17).
Last year, at our request, the IAAF shared with us 25% of the data used in that study. As far as we know, we are the only researchers in the world outside of IAAF who have had a chance to reproduce a segment of their dataset and replicate parts of their analysis.
Further, the IAAF analysis of this data remains the only performance data that IAAF cites as the basis for the selection of its restricted events under its DSD Regulations. The IAAF continues to assert the validity of this data even though we have shown conclusively that the data suffered from systematic errors rendering any analysis and conclusions unreliable.
We are concerned because the IAAF representations of BG17 made in their recent statement are factually inaccurate and misleading. This matters not just because the flawed IAAF data and the research using that data are a key element upholding the new IAAF DSD Regulations but, more broadly, because this is a matter of fundamental scientific integrity.
In its new Briefing Note, IAAF writes:
[T]he 2017 Bermon & Garnier BJSM paper was criticised for its statistical approach. A new set of statistics were provided on a modified database (taking into account some of the criticisms raised).
This is incomplete at best and highly misleading at worst. It is true that some scholars criticized the statistical approach of the IAAF (see Sönksen et al; Menier; Franklin et al). However, the concerns about BG17 go well beyond those statistical concerns.
They also include methodological considerations that no reanalysis can overcome and, perhaps of most concern, the possible persistence of erroneous data in the database. Our reanalysis of BG17 using data provided by the IAAF, and published in ISLJ, found that in the four restricted events covered by the DSD Regulations, between 17% and 33% of the data were erroneous, including duplicate data and ‘phantom’ data that did not exist during the competitions analyzed. This was highly concerning and led us to call for the publisher to retract the paper in accordance with the journal’s own retraction policy.
The presence of erroneous data is not even in question – the IAAF subsequently acknowledged that more than 20% of the data in BG17 was flawed, and had been dropped from its subsequent analysis. This is what IAAF means above when they use the phrase “modified database.” It is noteworthy that this modified database has never been reviewed or evaluated in the same way that we were enabled to do for a portion of the original database. We thus have no assurances of whether the same number or types of errors may be present in this modified database.
Further, when the IAAF states that ‘other criticisms of this paper are misplaced’, they sidestep the most damaging criticism of all, that documented in our paper in ISLJ, which represents the only available external and independent analysis of their data. It has been peer reviewed, is available to anyone and is certainly known to IAAF.
If the IAAF scientists disagree with our findings, why do they not enter a scientific discussion about the issue? It is scientifically dishonest to act as if our reanalysis of, and our criticisms and concerns about, the research do not exist.
The IAAF further states:
All published papers have been peer-reviewed.
This too is untrue. In response to numerous criticisms of its original study, including the highlighting of significant data errors, the IAAF re-evaluated the data and submitted a paper as a ‘Discussion’ (in effect, a short letter) that was published in the BJSM in 2018, seeking to re-do the flawed BG17 study. It is explicitly noted in that paper that the IAAF Discussion was not sent out for peer review, but instead was reviewed solely by the BJSM Editor. Internal, editorial review is not what anyone in the scientific community would characterize as ‘peer-reviewed’.
In our critique of the IAAF’s original study (BG17), we also looked at the analysis in the follow-up Discussion in BJSM (which we call BHKE18). We found that analysis also to be unreliable. Here is what we concluded in our paper:
Clearly and unambiguously, the results reported in BG17 change quantitatively in BHKE18 upon removal of 220 data points and introduction of new methods. The results of BG17 are clearly unreliable, and those of BHKE18 are of unknown validity. Further, without access to the medical data and all linked performances used in BG17, it is impossible to know how or why certain athletes/results were removed and others not. What is unequivocal is that BG17 used unreliable data, and thus, its results are also unreliable. Different data and methods were used in BHKE18, leading to significantly different results, based on the almost certain use of flawed data, leading consequently to unreliable results. The bottom line is that the use of flawed data makes it impossible to know what, if any, relationship exists between the variables of BG17 and BHKE18 or to verify the reported results.
The fact that IAAF themselves performed the research and analysed the data on which it has based its controversial regulations is non-transparent and problematic. They have failed to respond to our criticism, to explain what data errors have been detected and possibly corrected or to release their data for independent verification. These facts highlight the nature of the deep conflict of interest that the IAAF researchers have in this case. Here is what we stated in our paper:
The IAAF set itself up for problems by conducting research on performance effects associated with testosterone using in-house researchers. This creates at a minimum a perception of a conflict of interest that could have been mitigated to some degree by allowing independent researchers access to data and evidence, in order to replicate findings. In this case, such access was not allowed, except for the small amount of data shared with us, which was subsequently found to contain numerous errors. The unwillingness of the IAAF to correct or acknowledge errors highlights its conflict of interest.
An alternative to the approach to science and evidence employed by the IAAF would have been to provide research funding to an independent body which could request proposals from researchers unaffiliated with the IAAF to address the scientific questions at issue. We would not find it appropriate for cigarette companies to provide the scientific basis for the regulation of smoking or oil companies to provide the scientific basis for regulation of fossil fuels. Sport regulation should be held to the same high standards that we expect of researchers in other settings where science informs regulation and policy.
We believe that a comparison of the statements made by the IAAF with our analysis can only lead to the conclusion that the IAAF is failing to uphold basic standards of scientific integrity that should be expected in such an important matter that affects global sport and individuals’ lives. We should all expect better.
Prof Roger Pielke Jr.
Prof Erik Boye
• This Open Letter was published on Sports Scientists, Ross Tucker’s internet site, on 9 May 2019.