It is a great privilege to become an associate editor at Cortex.
Cortex was one of the first journals I published in, and I have reviewed at the journal for many years now. I’m particularly humbled to join such a distinguished editorial board.
As delighted as I am to join Cortex, I think we need to be doing more than editing submissions according to standard practices. In most journals, the traditional approach for handling empirical articles is archaic and demonstrably flawed. I believe we should be using our editorial positions to institute reforms that are long overdue.
1. General proposal and rationale
I would therefore like to propose a new form of empirical article at Cortex, called Registered Reports. I hope to start a discussion among the editorial board and wider scientific community about the merits and drawbacks of such a proposal. In addition to emailing this document to the editorial board, I have also published it on my blog for open discussion, so please feel free to reply either confidentially (via email) or publicly (on the blog). This proposal is very much a working document so any edits or comments on the document itself are most welcome.
I need to make one point clear at the outset. At this stage I am not proposing that we drop any of the existing article formats at Cortex. Rather, I am suggesting an additional option for authors.
The cornerstone of Registered Reports is that a large part of the manuscript would be reviewed prior to the experiments being conducted. Initial manuscripts would be submitted before a study has been undertaken and would include a description of the key background literature, hypotheses, experimental procedures, analysis pipeline, a statistical power analysis, and pilot data (where applicable). Following peer review, the article would then be either rejected or accepted in principle for publication.
Once in principle acceptance (IPA) has been obtained, the authors would then proceed to conduct the study, adhering exactly to their peer-reviewed procedures. When the study is complete the authors would submit their finalised manuscript for re-review and would upload their raw data and laboratory log via Figshare for full public access. Pending quality checks and a sensible interpretation of the findings, the manuscript would be published – and, crucially, independently of what the results actually look like.
This form of article has a number of advantages over the traditional publishing model. First and foremost, it is immune to publication bias because the decision to accept or reject manuscripts will be based on the significance of the research question and methodological validity, never on whether results are statistically significant.
Second, by requiring prospective authors to adhere to a preapproved methodology and analysis pipeline, it will eliminate a host of suspect but common practices that increase false discoveries, including p value fishing (i.e. adding subjects to an experiment until statistical significance is obtained, a practice admitted to by 71% of recently surveyed psychologists) and selective reporting of experiments to reveal manipulations that “work”. Currently, many authors partake in these practices because doing so helps convince editors and reviewers that their research is worthy of publication. By providing IPA prior to data collection, the incentive to engage in these practices will be largely eliminated.
Third, by requiring an a priori power analysis, including a stringent minimum power level (see below), false negatives will be greatly reduced compared with standard empirical reports. This will increase the veracity of non-significant effects.
Taken together, these practices will ensure that articles published as Registered Reports have a substantially higher truth value than regular studies. Such articles can therefore be expected to be more replicable and have a greater impact on the field.
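The inflation of false discoveries caused by p value fishing can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions, not part of the proposal itself: it assumes a one-sample z test on normally distributed data with known unit variance, a true effect of exactly zero, and a test after every added subject between 10 and 50 subjects. The function names and parameter values are hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, sd 1


def p_value(total, n):
    """Two-sided p value of a z test on n draws with known sd = 1.

    `total` is the running sum of the data, so z = mean * sqrt(n) = total / sqrt(n).
    """
    z = total / math.sqrt(n)
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def false_positive_rate(peek, n_start=10, n_max=50, sims=2000, alpha=0.05, seed=1):
    """Share of simulated null experiments that end with p < alpha.

    peek=False: test once, at the planned sample size n_max.
    peek=True:  test after every subject from n_start onwards and stop
                as soon as p < alpha (p value fishing).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # the true effect is zero
            look = (peek and n >= n_start) or n == n_max
            if look and p_value(total, n) < alpha:
                hits += 1
                break
    return hits / sims


if __name__ == "__main__":
    print("fixed sample size:", false_positive_rate(peek=False))
    print("p value fishing:  ", false_positive_rate(peek=True))
```

Under these assumptions the fixed-sample rate should sit near the nominal 5%, whereas the rate with repeated testing comes out considerably higher. That gap is what a pre-registered sample size or stopping rule is meant to remove.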
Why should we want to make this change? The life sciences, in general, suffer from a number of serious problems including publication bias [1, 2], low statistical power [3, 4], undisclosed post-hoc analytic flexibility [5, 6, 7], and a lack of data transparency. By valuing findings that are novel and eye-catching over those that are likely to be true, we have incentivised a range of questionable practices at individual and group levels. What’s more, a worryingly high percentage of psychologists admit to engaging in dubious practices such as selectively reporting experiments that produced desirable outcomes (67%) and p value fishing (71%).
So why should we change now? After all, these problems are far from new [10, 11]. My instinctive response to this question is, why haven't we changed already? In addition, there are several reasons why advances in scientific publishing are especially timely. The culture of science is evolving quickly under heightened funding pressure, with an increasing emphasis on transparency and reproducibility, open access publication, and the rising popularity of the PLoS model and other alternative publication avenues. Furthermore, retractions are at a record high, and recent high-profile fraud cases (e.g. Stapel, Smeesters, Sanna, Hauser) are casting a long shadow over our discipline as a whole.
The ideas outlined here are not new and I certainly can’t claim credit for them. I formulated this proposal after a year of discussion with scientists in multiple disciplines (including journal editors), science policy makers, science journalists and writers, and the Science Media Centre, and after reading a number of key blog articles.
I hope I can convince you that Registered Reports would provide an important innovation in scientific publishing and would position Cortex as a leader in the field. If you agree, in principle, then our next step will be to decide on the details. Then, finally, we would need to convince Elsevier to take this journey with us.
If we succeed then it will bring the scientific community one step closer to a system in which the incentive to discover something true, however small, outweighs the incentive to produce ‘good results’. Call me a shameless idealist, but I find that possibility hugely exciting.
2. The proposed mechanism
Registered Reports would work as follows.
(a) Stage 1: Registration review
Authors submit their initial manuscript prior to commencing their experiment(s). The initial submission would include the following sections:
· Background and Hypotheses
o A review of the relevant literature that motivates the research question, and a full description of the aims and experimental hypotheses.
· Methods
o Full description of proposed sample characteristics, including criteria for subject inclusion and exclusion, and detailed description of procedures for defining outliers. Procedures for objectively defining exclusion criteria due to technical errors (e.g. defining what counts as ‘excessive’ head movement during fMRI) or for any other reasons (where applicable) must be documented, including details of how and under what conditions subjects would be replaced.
o A description of experimental procedures in sufficient detail to allow another researcher to repeat the methodology exactly, without requiring any further information.
o Proposed analysis pipeline, including all preprocessing steps, and a precise description of every analysis that will be undertaken and appropriate correction for multiple comparisons. Any covariates or regressors must be stated. Consistent with the guidelines of Simmons et al. (2011; see 5), proposed analyses involving covariates must be reported with and without the covariate(s) included. Neuroimaging studies must document in advance, and in precise detail, the complete pipeline from raw data onwards.
o Where analysis decisions or follow-up experiments are contingent on the outcome of prior analyses, these contingencies must be detailed and adhered to.
o A statistical power analysis. Estimated effect sizes should be justified with reference to the existing literature. To account for existing publication bias, which leads to overestimation of true effect sizes [15, 16], power analysis must be based on the lowest available estimate of the effect size. Moreover, the a priori power (1 − β) must be 0.9 or higher. Setting a high power criterion for discovery of minimal effect sizes is paramount given that this model will lead to the publication of non-significant effects.
o In the case of very uncertain effect sizes, a variable sample size and interim data analysis would be permissible but with inspection points stated in advance, appropriate Type I error correction for ‘peeking’ employed, and a final stopping rule for data collection outlined.
o Full description of any outcome-neutral criteria that are required for successful testing of the study hypotheses. Such ‘reality checks’ might include the absence of floor or ceiling effects, or other appropriate baseline measures. Editors must ensure that such criteria are not used by reviewers to enforce dogma about accepted ‘truths’. That is, we must allow for the possibility that failure to show evidence for a critical ‘reality check’ can raise doubt about the truth of that accepted reality in the first place.
o Timeline for completion of the study and proposed resubmission date if registration review is successful. Extensions to this deadline can be arranged with the action editor.
· Pilot Data
o Optional. Can be included to establish reality checks, feasibility, or proof of principle. Any pilot data would be published with the final version of the manuscript and will be clearly distinguished from data obtained for the main experiment(s).
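The power requirement listed above can be made concrete with a worked calculation. The following sketch is offered as one possible illustration rather than a prescribed method: it assumes a two-sided comparison of two independent groups, a standardised effect size (Cohen's d), α = 0.05, and a normal approximation, which slightly underestimates the sample size that an exact t-test calculation would give. The function name and the example effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_estimates, power=0.9, alpha=0.05):
    """Approximate subjects per group for a two-sided, two-group comparison.

    Uses the smallest available effect size estimate, as the proposal
    requires, and the normal approximation
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    """
    d = min(abs(e) for e in effect_estimates)  # most conservative estimate
    if d == 0:
        raise ValueError("effect size estimates must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    published_estimates = [0.8, 0.65, 0.5]  # hypothetical values of Cohen's d
    print(n_per_group(published_estimates))             # power 0.9
    print(n_per_group(published_estimates, power=0.8))  # conventional power
```

With these hypothetical estimates the smallest effect (d = 0.5) drives the calculation, giving 85 subjects per group at power 0.9 compared with 63 at the more conventional 0.8.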
In considering papers in the registration stage, reviewers will be asked to assess:
- The significance of the research question(s)
- The logic, rationale, and plausibility of the proposed hypotheses
- The soundness and feasibility of the methodology and analysis pipeline
- Whether the level of methodological detail provided would be sufficient to duplicate exactly the proposed experimental procedures and analytic approach
Attempted replications of high profile studies would be welcomed. For replication attempts to be accepted, they must be regarded by the reviewers as significant and important regardless of outcome (i.e. having a high replication value, as was the case in the recent attempted replication of precognition effects).
Manuscripts that pass registration review will be issued an in principle acceptance (IPA). This means that the manuscript is accepted for publication pending successful completion of the study according to the exact methods and analytic procedures outlined, as well as a defensible and evidence-based interpretation of the results.
Upon receiving IPA, authors will be informed that any deviation from the stated methods, regardless of how minor it may seem, will lead to summary rejection of the manuscript. If the authors wish to alter the experimental procedures following IPA but still wish to publish the study as a Registered Report in Cortex then the manuscript must be withdrawn and resubmitted as a new Stage 1 submission.
(b) Stage 2: Full manuscript review
Once the study is complete, the authors then prepare and resubmit their manuscript for full review, with the following additions:
· Submission of raw data and laboratory log
o Raw data must be made freely available via the website Figshare (or an alternative free service). Data files must be appropriately time stamped to show that the data were collected after IPA and not before. Other than pre-registered and approved pilot data, no data acquired prior to the date of IPA is admissible in the final submission. Raw data must be accompanied by guidance notes, where required, to assist other scientists in replicating the analysis pipeline.
o The authors must collectively certify that all non-pilot data was collected after the date of IPA. A simple laboratory log will be provided outlining the range of dates during which data collection took place.
· Revisions to the Background and Rationale
o The stated hypotheses cannot be altered or appended. However, it is perfectly reasonable for the tone and content of an Introduction to be shaped by the results of a study. Moreover, depending on the timeframe of data collection, new relevant literature may have appeared between registration review and full manuscript review. Therefore, authors will be allowed to update at least part of the Introduction.
· Results & Discussion
o This will be included as per standard submissions. With one exception, all registered analyses must be included in the manuscript. The exception would be (very) rare instances where a registered and approved analysis is subsequently shown to be logically flawed or unfounded in the first place (i.e. the authors, reviewers, and editor made a collective error of judgment and must collectively agree that the analysis is, in fact, inappropriate). In such cases the analysis would still be mentioned in the Method but omitted from the Results (with the omission justified).
o It is sensible that authors may occasionally wish to include additional analyses that were not included in the registered submission; for instance, a new analytic approach might emerge between IPA and full review, or a particularly interesting and unexpected finding may emerge. Such analyses are admissible but must be clearly justified in the text, caveated, and reported in a separate section of the Results titled “Post hoc analyses”. Editors must ensure that authors do not base their conclusions entirely on the outcome of significant post hoc analyses.
o Authors will be required to report exact p values and effect sizes for all inferential tests.
The resubmission will ideally be considered by the same reviewers as in the registration stage, but could also be assessed by fresh reviewers. In considering papers at the full manuscript stage, reviewers will be asked to appraise:
- Whether the data are able to test the authors’ proposed hypotheses by passing the approved outcome-neutral criteria (such as absence of floor and ceiling effects)
- Whether any changes to the Introduction are reasonable and do not alter the rationale or hypotheses
- Whether the authors adhered precisely to the registered experimental procedures
- Whether any post-hoc analyses are justified, robust, and add to the informational content of the paper
- Whether the authors’ conclusions are justified given the data
Crucially, reviewers will be informed that editorial decisions will not be based on the perceived importance or clarity of the data. Thus while reviewers are free to enter such comments on the record, they will not influence editorial decisions.
Reviews will be anonymous. To maximise transparency, however, the anonymous reviews and authors’ response to reviewers will be published alongside the full paper in an online supplement.
It is possible that authors with IPA may seek to withdraw their manuscripts following or during data collection. Possible reasons could include technical error or an inability to complete the study due to other unforeseen circumstances. In all such cases, manuscripts can of course be withdrawn. However, the journal will publicly record each case in a section called Retracted Registrations. This will include the authors, proposed title, an abstract briefly outlining the original aim of the study, and brief reason(s) for the failure to complete the study. Partial retractions are not possible; i.e. authors cannot publish part of a registered study by selectively retracting one of the planned experiments. Such cases must lead to retraction of the entire paper.
3. Concerns, Responses and Discussion Points
Here follows a paraphrased Q & A, including some actual and hypothetical discussions about the proposal with colleagues.
1. Won’t Registered Reports just become a dumping ground for inconclusive null effects?
a. No. The required power level will increase the chances of detecting statistical significance when it reflects reality. Average power in psychology/cognitive neuroscience is low whereas IPA will be contingent on power of 0.9 or above. Thus, any non-significant findings will, by definition, be more conclusive than typically observed in the literature.
b. It is crucial that we provide a respected outlet for well-powered non-significant findings. This will help combat the file drawer effect and reduce the publication of false discoveries. Moreover, authors are welcome to propose superior alternatives to conventional null hypothesis testing, such as Bayesian approaches.
c. By guaranteeing publication prior to data being collected, this model would encourage authors to propose large scale studies for more definitive hypothesis testing – studies which investigators would otherwise be reluctant to pursue given the risk of yielding unpublishable null effects.
d. Registration review will be stringent, with reviewers asked to consider the methodology in detail for possible oversights and flaws that could prevent the study from testing the proposed hypotheses.
2. It all sounds too strict. Why would authors submit to this scheme when they can’t change even one small aspect of their experimental procedure without being ‘summarily rejected’? Even grant applications are not so demanding.
a. Yes it is stringent, and so it should be. This format of article is primarily intended for well-prepared scientists who have carefully considered their methodology and hypotheses in advance. And isn’t that how we ought to be doing science most of the time anyway?
b. Note that the strict methodological stringency is coupled with a complete lack of expectation of how the results should look. Whether an experiment supports the stated hypothesis is the one aspect of science that scientists (should) have no control over – yet the traditional publishing model encourages a host of dodgy practices to exert such control. This new model replaces the artificial and counterproductive ‘data stringency’ with constructive ‘methodological stringency’, and so would largely eliminate the pressure for scientists to submit data that perfectly fit their predictions or confirm someone’s theory. I believe many scientists would approach this model with relief rather than trepidation.
3. Authors could game the system by running a complete study as per usual and submitting the methodology for registration review after the fact.
a. No, raw data must be made freely available at the full review stage and time stamped for inspection, along with a laboratory log indicating that data collection took place between dates X and Y. Final submission must also be accompanied by a certification from each author that no data (other than approved pilot data) was collected prior to the date of IPA. Any violation of this rule would be considered misconduct; the article would be retracted by Cortex and referred to Retraction Watch.
4. What’s to stop unscrupulous reviewers stealing my ideas at the registration stage, running the experiments faster than I can (or rejecting my registration submission outright to buy time), and then publishing their own study?
a. This is a legitimate worry, and it is true that there is no perfect defense against bad practice. But we shouldn’t overstate this concern. Gazumping is rare and, in any case, is present in many areas of science. Fear of being scooped doesn’t stop us presenting preliminary data at conferences or writing grant applications. So why should we be so afraid of registration review?
b. Even if an unscrupulous reviewer decided to run a similar/identical experiment following IPA, the decision to publish would not be influenced. So being scooped would not cost the authors a publication once the authors pass IPA.
c. Unlike existing protocol journals, such as BMC Protocols, the IPA submission would not be published in advance of the main paper. So only the reviewers and editors would see it. This will reduce the chances of being gazumped.
5. A lot of the most interesting discoveries in science are serendipitous. Your approach will stifle creativity and data exploration.
a. No, it won’t. Authors will be allowed to include “post-hoc analyses” in the manuscript that were not in the registered submission. They simply won’t be able to pretend that such analyses were planned in advance or adjust their hypotheses to predict unexpected outcomes. And, sensibly, they won’t be able to base the conclusions of their study on the outcome of unplanned analyses – the original registered analyses would take precedence and must also be reported.
b. It should also be noted that a priori analyses in the registration stage could include exploration of possible serendipitous findings.
c. Serendipitous findings are, by their nature, rare. A far greater problem is the proliferation of false positives due to excessive post-hoc flexibility in analysis approaches. So let’s deal with the big problem first.
6. You propose allowing authors to alter the Introduction to include new literature. Doesn’t this create a slippery slope for changing the rationale or hypotheses too?
a. No, but we must be vigilant on this point. I think it is entirely sensible to allow revisions to the Introduction to contextualise the literature based on the findings and to focus on the most recent publications that emerged following IPA. After all, we want readers to be engaged as well as informed. However, we must also ensure that such changes are reasonable. Monitoring this aspect in particular would be one of the central reviewing criteria at Stage 2 (see above). In a revised Introduction, the authors would not be permitted to alter the rationale for the study, to state new hypotheses, or to alter the existing hypotheses. The rationale and hypotheses could be flagged in distinct sections of the Introduction that are untouchable following IPA.
7. What if the authors never submit a final manuscript because the results disagree with some desired outcome (such as supporting their preferred explanation)? How can you prevent publication bias on the part of the authors?
a. We can’t stop authors censoring themselves. As noted above, however, if a study is withdrawn following IPA then this will be noted in a Retracted Registrations section of the journal. So there would at least be a public record of the withdrawal and some explanation for why it happened.
b. Note also that if the authors have not submitted by their own stated deadline then the manuscript will be automatically withdrawn, considered retracted, and noted in the Retracted Registrations section. Extensions to the deadline are permissible following prior agreement with the action editor.
8. What would stop authors getting IPA, then running many more subjects than proposed and selectively including only the ones that support their desired hypothesis?
a. Nothing. But doing so is outright fraud, similar to the conduct of Dirk Smeesters. No mechanism can fully guard against fraud, and regular submissions under the traditional publishing route are equally vulnerable to such misbehaviour. Note also that the proposed model requires submission of raw data, which will help protect against such eventualities. Selective exclusion of subjects to attain statistical significance can be detected using the statistical methods developed by Uri Simonsohn. This alone will act as a significant deterrent to fraudsters.
9. How can IPA be guaranteed without knowing the author’s interpretation of the findings?
a. It isn’t. IPA ensures that the article cannot, and will not, be rejected based on the results themselves (with the exception of failing outcome-neutral reality checks, such as floor or ceiling effects, which prevent the stated hypotheses being appropriately tested). Manuscripts can still be rejected if the reviewers and editor believe the author’s interpretation is unreasonable given the data. And they will be rejected summarily if the authors change their experimental procedures in any way following IPA.
10. What if the authors obtain IPA but then realise (after data collection commenced) that part of their proposed methods or analyses were incorrect or suboptimal?
a. In the case of changes to the experimental procedures, the manuscript would have to be fully withdrawn but could be returned to Stage 1 for fresh registration review.
b. In the case of changes to the analysis approach, depending on the nature of the proposed change, Stage 2 may be able to proceed following a phase of interim review and discussion with the editor and reviewers (if all agree that a different form of analysis is preferable). In such cases, the originally proposed analysis would still be described in the final article but its results may not be reported, and the reasons for excluding it would be acknowledged.
11. Cortex already has a long backlog of in-press articles. Adding yet another article format could make this problem worse.
a. I propose that each article published as a Registered Report takes the place of a standard research report, thus requiring similar journal space to the current model.
b. If Registered Reports become increasingly popular and well cited, the journal could gradually phase the standard report format out altogether, making Registered Reports the norm.
I hope I can convince you that Registered Reports would be a useful and valid initiative at Cortex. And even if not, I look forward to the ensuing discussion. Below is a list of key supporting references.
 Rosenthal R (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86: 638–641.
Thornton A & Lee P (2000). Publication bias in meta-analysis: its causes and consequences. Journal of Clinical Epidemiology, 53: 207–216.
 Chase, LJ & Chase, RB (1976). A statistical power analysis of applied psychological research. Journal of Applied Psychology, 61: 234-237.
 Tressoldi, PE (2012). Replication unreliability in psychology: elusive phenomena or "elusive" statistical power? Frontiers in Psychology, 3: 218.
Simmons JP, Nelson LD, and Simonsohn U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22: 359-66.
 Wagenmakers, EJ (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14: 779–804.
 Masicampo, EJ & Lalande, DR (in press). A peculiar prevalence of p values just below .05. Quarterly Journal of Experimental Psychology.
 Ioannidis JPA (2005). Why Most Published Research Findings Are False. PLoS Medicine 2(8): e124. doi:10.1371/journal.pmed.0020124
 John, L, Loewenstein, G, & Prelec, D (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23: 524-532 DOI: 10.1177/0956797611430953
 Smith MB (1956). Editorial. Journal of Abnormal & Social Psychology, 52:1-4.
 Cohen, J (1962). The statistical power of abnormal – social psychological research: A review. Journal of Abnormal & Social Psychology, 65, 145‐153.
Fang, FC, Steen, RG & Casadevall, A. (2012) Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences USA: 10.1073/pnas.1212247109
 Lane, DM & Dunlap, WP (1978). Estimating effect size: Bias resulting from the significance criterion in editorial decisions. British Journal of Mathematical and Statistical Psychology, 31: 107‐112.
 Hedges LV & Vevea, JL (1996). Estimating effect size under publication bias: Small sample properties and robustness of a random effects selection model. Journal of Educational and Behavioral Statistics, 21: 299-332.
 Strube, MJ (2006). SNOOP: A program for demonstrating the consequences of premature and repeated null hypothesis testing. Behavior Research Methods, 38: 24-27. Software available from here: http://www.artsci.wustl.edu/~socpsy/Snoop.7z
 Nosek, B. A., Spies, J. R., & Motyl, M. (in press). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science. arxiv.org/pdf/1205.4251
 Ritchie SJ, Wiseman R, French CC (2012) Failing the Future: Three Unsuccessful Attempts to Replicate Bem’s ‘Retroactive Facilitation of Recall’ Effect. PLoS ONE 7(3): e33423. doi:10.1371/journal.pone.0033423
Kruschke, JK (in press). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General. www.indiana.edu/~kruschke/BEST/BEST.pdf