Not solely about that Bayes: Interview with Prof. Eric-Jan Wagenmakers

Last summer saw the publication of the most important work in psychology in decades: the Reproducibility Project (Open Science Collaboration, 2015; see here and here for context). It stirred up the community, resulting in many constructive discussions but also in verbally violent disagreement. What unites all parties, however, is the call for more transparency and openness in research.

Eric-Jan “EJ” Wagenmakers has argued for pre-registration of research (Wagenmakers et al., 2012; see also here) and direct replications (e.g., Boekel et al., 2015; Wagenmakers et al., 2015), for a clearer demarcation of exploratory and confirmatory research (de Groot, 1954/2013), and for a change in the way we analyze our data (Wagenmakers et al., 2011; Wagenmakers et al., in press).

Concerning the latter point, EJ is a staunch advocate of Bayesian statistics. With his many collaborators, he writes the clearest and wittiest expositions of the topic (e.g., Wagenmakers et al., 2016; Wagenmakers et al., 2010). Crucially, he is also a key player in opening Bayesian inference up to social and behavioral scientists more generally; in fact, the software JASP is EJ’s brainchild (see also our previous interview).

In sum, psychology is changing rapidly, not only in how researchers communicate and do science, but increasingly also in how they analyze their data. This makes it nearly impossible for university curricula to keep up; courses in psychology are often years, if not decades, behind. Statistics classes in particular are usually boringly cookbook oriented and often fraught with misconceptions (Wagenmakers, 2014). At the University of Amsterdam, Wagenmakers succeeds in doing things differently. He has previously taught a class called “Good Science, Bad Science”, discussing novel developments in methodology as well as supervising students in preparing and conducting direct replications of recent research findings (cf. Frank & Saxe, 2012).

Now, at the end of the day, testing undirected hypotheses using p values or Bayes factors only gets you so far – even if you preregister the heck out of it. To move the field forward, we need formal models that instantiate theories and make precise quantitative predictions. Together with Michael Lee, Eric-Jan Wagenmakers has written an amazing practical cognitive modeling book, harnessing the power of computational Bayesian methods to estimate arbitrarily complex models (for an overview, see Lee, submitted). More recently, he has co-edited a book on model-based cognitive neuroscience on how formal models can help bridge the gap between brain measurements and cognitive processes (Forstmann & Wagenmakers, 2015).

Long-term readers of the JEPS bulletin will note that openness of research, pre-registration and replication, research methodology, and Bayesian statistics are recurring themes. It was thus only a matter of time before we interviewed Eric-Jan Wagenmakers and asked him questions concerning all of the areas above. In addition, we ask: how does he stay so immensely productive? What tips does he have for students interested in an academic career? And what can instructors learn from “Good Science, Bad Science”? Enjoy the ride!


Bobby Fischer, the famous chess player, once said that he does not believe in psychology. You actually switched from playing chess to pursuing a career in psychology; tell us how this came about. Was it a good move?

It was an excellent move, but I have to be painfully honest: I simply did not have the talent and the predisposition to make a living out of playing chess. Several of my close friends did have that talent and went on to become international grandmasters; they play chess professionally. But I was actually lucky. For players outside of the world top-50, professional chess is a career trap. The pay is poor, the work insanely competitive, and the life is lonely. And society has little appreciation for professional chess players. In terms of creativity, hard work, and intellectual effort, an international chess grandmaster easily outdoes the average tenured professor. People who do not play chess themselves do not realize this.

Your list of publications gets updated so frequently, it should have its own RSS feed! How do you grow and cultivate such an impressive network of collaborators? Do you have specific tips for early career researchers?

At the start of my career I did not publish much. For instance, when I finished my four years of grad studies I think I had two papers. My current publication rate is higher, and part of that is due to an increase in expertise. It is just easier to write papers when you know (or think you know) what you’re talking about. But the current productivity is mainly due to the quality of my collaborators. First, at the psychology department of the University of Amsterdam we have a fantastic research master program. Many of my graduate students come from this program, having been tried and tested in the lab as RAs. When you have, say, four excellent graduate students, and each publishes one article a year, that obviously helps productivity. Second, the field of Mathematical Psychology has several exceptional researchers that I have somehow managed to collaborate with. In the early stages I was a graduate student with Jeroen Raaijmakers, and this made it easy to start work with Rich Shiffrin and Roger Ratcliff. So I was privileged and I took the opportunities that were given. But I also work hard, of course.

There is a lot of advice that I could give to early career researchers but I will have to keep it short. First, in order to excel in whatever area of life, commitment is key. What this usually means is that you have to enjoy what you are doing. Your drive and your enthusiasm will act as a magnet for collaborators. Second, you have to take initiative. So read broadly, follow the latest articles (I remain up to date through Twitter and Google Scholar), get involved with scientific organizations, coordinate a colloquium series, set up a reading group, offer to review papers together with your advisor, attend summer schools, etc. For example, when I started my career I had seen a new book on memory and asked the editor of Acta Psychologica whether I could review it for them. Another example is Erik-Jan van Kesteren, an undergraduate student from a different university who had attended one of my talks about JASP. He later approached me and asked whether he could help out with JASP. He is now a valuable member of the JASP team. Third, it helps if you are methodologically strong. When you are methodologically strong –in statistics, mathematics, or programming– you have something concrete to offer in a collaboration.

Considering all projects you are involved in, JASP is probably the one that will have most impact on psychology, or the social and behavioral sciences in general. How did it all start?

In 2005 I had a conversation with Mark Steyvers. I had just shown him a first draft of a paper that summarized the statistical drawbacks of p-values. Mark told me “it is not enough to critique p-values. You should also offer a concrete alternative”. I agreed and added a section about BIC (the Bayesian Information Criterion). However, the BIC is only a rough approximation to the Bayesian hypothesis test. Later I became convinced that social scientists will only use Bayesian tests when these are readily available in a user-friendly software package. About 5 years ago I submitted an ERC grant proposal “Bayes or Bust! Sensible hypothesis tests for social scientists” that contained the development of JASP (or “Bayesian SPSS” as I called it in the proposal) as a core activity. I received the grant and then we were on our way.

I should acknowledge that many of the Bayesian computations in JASP depend on the R BayesFactor package developed by Richard Morey and Jeff Rouder. I should also emphasize the contribution by JASP’s first software engineer, Jonathon Love, who suggested that JASP ought to feature classical statistics as well. In the end we agreed that by including classical statistics, JASP could act as a Trojan horse and boost the adoption of Bayesian procedures. So the project started as “Bayesian SPSS”, but the scope was quickly broadened to include p-values.

JASP is already game-changing software, but it is under continuous development and improvement. More concretely, what do you plan to add in the near future? What do you hope to achieve in the long-term?

In terms of the software, we will shortly include several standard procedures that are still missing, such as logistic regression and chi-square tests. We also want to upgrade the popular Bayesian procedures we have already implemented, and we are going to create new modules. Before too long we hope to offer a variable views menu and a data-editing facility. When all this is done it would be great if we could make it easier for other researchers to add their own modules to JASP.

One of my tasks in the next years is to write a JASP manual and JASP books. In the long run, the goal is to have JASP be financially independent of government grants and university support. I am grateful for the support that the psychology department at the University of Amsterdam offers now, and for the support they will continue to offer in the future. However, the aim of JASP is to conquer the world, and this requires that we continue to develop the program “at break-neck speed”. We will soon be exploring alternative sources of funding. JASP will remain free and open-source, of course.

You are a leading advocate of Bayesian statistics. What do researchers gain by changing the way they analyze their data?

They gain intellectual hygiene, and a coherent answer to questions that makes scientific sense. A more elaborate answer is outlined in a paper that is currently submitted to a special issue for Psychonomic Bulletin & Review: https://osf.io/m6bi8/ (Part I).

The Reproducibility Project used different metrics to quantify the success of a replication – none of them really satisfactory. How can a Bayesian perspective help illuminate the “crisis of replication”?

As a theory of knowledge updating, Bayesian statistics is ideally suited to address questions of replication. However, the question “did the effect replicate?” is underspecified. Are the effect sizes comparable? Does the replication provide independent support for the presence of the effect? Does the replication provide support for the position of the proponents versus the skeptics? All these questions are slightly different, but each receives the appropriate answer within the Bayesian framework. Together with Josine Verhagen, I have explored a method –the replication Bayes factor– in which the prior distribution for the replication test is the posterior distribution obtained from the original experiment (e.g., Verhagen & Wagenmakers, 2014). We have applied this intuitive procedure to a series of recent experiments, including the multi-lab Registered Replication Report of Fritz Strack’s Facial Feedback hypothesis. In Strack’s original experiment, participants who held a pen with their teeth (causing a smile) judged cartoons to be funnier than participants who held a pen with their lips (causing a pout). I am not allowed to tell you the result of this massive replication effort, but the paper will be out soon.
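To make the logic of "yesterday's posterior is today's prior" concrete, here is a minimal sketch under a normal approximation. It is not the t-test based procedure from Verhagen & Wagenmakers (2014); the function name `replication_bf` and all numbers are hypothetical, and the original posterior is assumed to be normal, centred on the original estimate with its standard error.

```python
import math
from statistics import NormalDist

def replication_bf(d_orig, se_orig, d_rep, se_rep):
    """Bayes factor for 'effect as in the original study' vs. 'no effect'.

    Assumption: the posterior from the original study is N(d_orig, se_orig^2)
    and serves as the prior for the replication test.
    """
    # Marginal likelihood of the replication estimate under the proponent's
    # hypothesis: prior uncertainty and sampling error add up.
    m_r = NormalDist(d_orig, math.sqrt(se_orig**2 + se_rep**2)).pdf(d_rep)
    # Likelihood of the replication estimate under the skeptic's hypothesis
    # that the true effect is exactly zero.
    m_0 = NormalDist(0.0, se_rep).pdf(d_rep)
    return m_r / m_0

# Hypothetical original study: d = 0.5 with standard error 0.2.
print(replication_bf(0.5, 0.2, d_rep=0.5, se_rep=0.1))  # replication matches
print(replication_bf(0.5, 0.2, d_rep=0.0, se_rep=0.1))  # replication finds nothing
```

A value above 1 favours the proponents' position, a value below 1 favours the skeptics' position. This addresses only one of the questions listed above; comparing effect sizes would call for a different analysis.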

You have recently co-edited a book on model-based cognitive neuroscience. What is the main idea here, and what developments in this area are most exciting to you?

The main idea is that much of experimental psychology, mathematical psychology, and the neurosciences pursue a common goal: to learn more about human cognition. So ultimately the interest is in latent constructs such as intelligence, confidence, memory strength, inhibition, and attention. The models that have been developed in mathematical psychology are able to link these latent constructs to specific model parameters. These parameters may in turn be estimated by behavioral data, by neural data, or by both data sets jointly. Brandon Turner is one of the early career mathematical psychologists who has made great progress in this area. So the mathematical models are a vehicle to achieve an integration of data from different sources. Moreover, insights from neuroscience can provide important constraints that help inform mathematical modeling. The relation is therefore mutually beneficial. This is summarized in the following paper: http://www.ejwagenmakers.com/2011/ForstmannEtAl2011TICS.pdf

One thing that distinguishes science from sophistry is replication; yet it is not standard practice. In “Good Science, Bad Science”, you had students prepare a registered replication plan. What was your experience teaching this class? What did you learn from the students?

This was a great class to teach. The students were highly motivated and oftentimes it felt more like lab-meeting than like a class. The idea was to develop four Registered Report submissions. Some time has passed, but the students and I still intend to submit the proposals for publication.

The most important lesson this class has taught me is that our research master students want to learn relevant skills and conduct real research. In the next semester I will teach a related course, “Good Research Practices”, and I hope to attain the same high levels of student involvement. For the new course, I plan to have students read a classic methods paper that identifies a fallacy; next the students will conduct a literature search to assess the current prevalence of the fallacy. I have done several similar projects, but never with master students (e.g., http://www.ejwagenmakers.com/2011/NieuwenhuisEtAl2011.pdf and http://link.springer.com/article/10.3758/s13423-015-0913-5).

What tips and tricks can you share with instructors planning to teach a similar class?

The first tip is to set your aims high. For a research master class, the goal should be publication. Of course this may not always be realized, but it should be the goal. It helps if you can involve colleagues or graduate students. If you set your aims high, the students know that you take them seriously, and that their work matters. The second tip is to arrange the teaching so that the students do most of the work. The students need to develop a sense of ownership about their projects, and they need to learn. This will not happen if you treat the students as passive receptacles. I am reminded of a course that I took as an undergraduate. In this course I had to read chapters, deliver presentations, and prepare questions. It was one of the most enjoyable and inspiring courses I had ever taken, and it took me decades to realize that the professor who taught the course actually did not have to do much at all.

Many scholarly discussions these days take place on social media and blogs. You joined Twitter yourself over a year ago. How do you navigate the social media jungle, and what resources can you recommend to our readers?

I am completely addicted to Twitter, but I also feel it makes me a better scientist. When you are new to Twitter, I recommend that you start by following a few people that have interesting things to say. Coming from a Bayesian perspective, I recommend Alexander Etz (@AlxEtz) and Richard Morey (@richarddmorey). And of course it is essential to follow JASP (@JASPStats). As is the case for all social media, the most valuable resource you have is the “mute” option. Prevent yourself from being swamped by holiday pictures and exercise it ruthlessly.

Publishing a Registered Report as a Postgraduate Researcher

Registered Reports (RRs) are a new publishing format pioneered by the journal Cortex (Chambers 2013). This publication format emphasises the process of rigorous research, rather than the results, in an attempt to avoid questionable research practices such as p-hacking and HARK-ing, which ultimately reduce the reproducibility of research and contribute to publication bias in cognitive science (Chambers et al. 2014). A recent JEPS post by Dablander (2016) and JEPS’ own editorial for adopting RRs (King et al. 2016) have given a detailed explanation of the RR process. However, you may have thought that publishing a RR is reserved for only senior scientists, and is not a viable option for a postgraduate student. In fact, 5 out of 6 of the first RRs published by Cortex have had post-graduate students as authors, and publishing by RR offers postgraduates and early career researchers many unique benefits.

In the following article you will hear about the experience of Dr. Hannah Hobson, who published a RR in the journal Cortex as a part of her PhD project. I spoke to Hannah about the planning that was involved, the useful reviewer comments she received, and asked her what tips she has for postgraduates interested in publishing a RR. Furthermore, there are some comments from Professor Chris Chambers who is a section editor for Cortex on how postgraduates can benefit from using this publishing format.

Interview with Dr. Hannah Hobson

Hannah completed her PhD project on children’s behavioural imitation skills, and potential neurophysiological measures of the brain systems underlying imitation. Her PhD was based at the University of Oxford, under the supervision of Professor Dorothy Bishop. During her studies, Hannah became interested in mu suppression, an EEG measure purported to reflect the activity of the human mirror neuron system. However, she was concerned that much of the research on mu suppression suffered from methodological problems, despite this measure being widely used in social cognitive neuroscience. Hannah and Dorothy thought it would be appropriate to publish a RR to focus on some of these issues. This study was published in the journal Cortex, and investigated whether mu suppression is a good measure of the human mirror neuron system (Hobson and Bishop 2016). I spoke to Hannah about her project and what her experience of publishing a RR was like during her PhD.

 

As you can hear from Hannah’s experience, publishing a RR was beneficial in ways that would not be possible with standard publishing formats. However, they are not suitable for every study. Drawing from Hannah’s experience and Chris Chambers’ role in promoting RRs, the main strengths and concerns for postgraduate students publishing a RR are summarised below.

Strengths

Reproducible findings

It has been highlighted that the majority of psychological studies suffer from low power. As well as limiting the chances of finding an effect, low-powered studies are more likely to lack reproducibility because they overestimate the effect size (Button et al. 2013). As a part of the stage one submission, a formal power analysis needs to be performed to identify the number of participants required for a high-powered study (>90%). Therefore, PhD studies published as RRs will have greater power and reproducibility in comparison to the average unregistered study (Chambers et al. 2014).
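As a rough illustration of what such a power analysis involves, the sketch below uses the textbook normal approximation for a two-sided comparison of two independent groups. The effect size of d = 0.5 and the function name `n_per_group` are assumptions for illustration only; a real stage one submission would base the calculation on the planned design and a justified effect size, usually with dedicated software.

```python
import math
from statistics import NormalDist

def n_per_group(d, power=0.90, alpha=0.05):
    """Approximate participants per group for a two-sided two-sample test.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.5))              # 90% power -> 85 per group
print(n_per_group(0.5, power=0.80))  # 80% power -> 63 per group
```

Moving from 80% to 90% power for the same assumed effect raises the required sample by roughly a third, which is worth factoring into the timeline of a PhD project.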

More certainty over publications

The majority of published PhD studies begin to emerge during the final year or during your first post-doctoral position. As the academic job market becomes ever more competitive, publications are essential. As Professor Chambers notes, RRs “enable PhD students to list provisionally accepted papers on their CVs by the time they submit their PhDs”. Employers will see greater certainty in a RR with stage one approval than the ‘in preparation’ listed next to innumerable papers following the standard publishing format.

Lower rejection rate at stage two submission

Although reaching stage one approval is more difficult due to the strict methodological rigour required, there is greater certainty in the eventual outcome of the paper once you have in-principle acceptance. In Cortex, approximately 90% of unregistered reports are rejected upon submission, but only 10% of RRs which reach stage one review have been rejected, with none being rejected so far after in-principle acceptance.

“This means you are far more likely to get your paper accepted at the first journal you submit to, reducing the tedious and time-wasting exercise of submitting down a chain of journals after your work is finished and you may already be competing on the job market”. – Professor Chris Chambers

As Dorothy Bishop explains in her blog, once you have in-principle acceptance you are in control of the timing of the publication (Bishop 2016). This means that you will have a publication in print during your PhD, as opposed to starting to submit papers towards the end which may only be ‘in preparation’ by the time of your viva voce.

Constructive reviewer comments

As the rationale and methodology are peer-reviewed before the data-collection process, reviewers are able to make suggestions to improve the design of your study. In Hannah’s experience, a reviewer pointed out an issue with her control stimuli. If she had conducted the study following the standard format, reviewers would only have been able to point this out retrospectively, when there is no option to change it. This experience will also be invaluable during your viva voce. As you defend your work in front of the examiners, you know your study has already gone through several rounds of review, so you can be confident in how robust it is.

Things to consider

Time constraints

Recruiting and testing participants is a lengthy process, and you often encounter a series of setbacks. If you are already in the middle of your PhD, then you may not have time to go through stage one submission before collecting your data. In Hannah’s case, publishing a RR was identified early in the project which provided a sufficient amount of time to complete it during her PhD. If you are interested in RRs, it is advisable to start the submission process as early into your PhD as possible. You may even want to start the discussion during the interview process.

Ethics merry-go-round

During stage one submission, you need to provide evidence that you already have ethical approval. If the reviewers want you to make changes to the methodology, this may necessitate amending your ethics application. In busy periods, this process of going back and forth between the reviewers and your ethics committee can become time-consuming. As time constraints are the most pertinent concern for postgraduates publishing a RR, this is an additional hurdle that must be negotiated. Whilst there is no easy solution to this problem, the aim of publishing a RR must be identified early in your project to ensure you will have enough time, and you should have a back-up plan prepared in case things do not work out.

RRs are not available in every journal

Although there has been a surge in journals offering RRs, they are not available in every one. Your research might be highly specialised and the key journal in your area may not offer the option of a RR. If your research does not fit into the scope of a journal that offers RRs, you may not have the option to publish your study as a RR. Whilst there is no simple solution for this, a regularly updated list of journals offering RRs is available on the Open Science Framework (OSF).

Supervisor conflict

Although there are a number of prominent researchers behind the initiative (Guardian Open Letter 2013), there is not universal agreement, with some researchers voicing concerns (Scott 2013, although see Chambers et al. 2014 for a rebuttal to many common concerns). There have been some vocal critics of RRs, and one of these critics might end up being your supervisor. If you want to conduct a RR as a part of your PhD and your supervisor is against it, there may be some conflict. Again, it is best to identify early on in your PhD if you want to publish a RR, and make sure both you and your supervisor are on the same page.

Conclusion

Publishing a RR as a postgraduate researcher is a feasible option that provides several benefits, both to the individual student and to wider scientific progress. Research published as a RR is more likely to produce reproducible findings, due to the necessary high level of power, reviewers’ critique before data collection, and safeguards against questionable research practices such as p-hacking or HARK-ing. Providing the work is carried out as agreed, a study that has achieved stage one approval is likely to be published, allowing students the opportunity to publish their hard work, even if the findings are negative. Moreover, going through several rounds of peer-review on the proposed methodology provides an additional layer of rigour (good for science), that aids your defence in your viva voce (good for you). Of course, it is not all plain sailing and there are several considerations students will need to make before embarking on an RR. Nonetheless, despite these concerns, this publishing format is a step in the right direction for ensuring that robust research is being conducted right down to the level of postgraduate students.

If you like the idea but do not think formal pre-registration with a journal is suitable for your project, perhaps consider using the OSF. The OSF is a site where researchers can timestamp their hypotheses and planned analyses, allowing them to develop hypothesis-driven research habits. In one research group, it is necessary for all studies ranging from undergraduate projects to grant-funded projects to be registered on third-party websites such as the OSF (Munafò 2015). Some researchers such as Chris Chambers have even made it a requirement for applicants wanting to join their group to demonstrate a prior commitment to open science practices (Chambers 2016). Starting to pre-register your studies and publish RRs as a postgraduate student demonstrates this commitment, and will prove to be crucial as open science practices become an essential criterion in recruitment.

“To junior researchers I would say that pre-registration — especially as a Registered Report — is an ideal option for publishing high-quality, hypothesis-driven research that reflects an investment both in good science and your future career” – Professor Chris Chambers 

Pre-registration and RRs are both initiatives to improve the rigour and transparency of psychological science (Munafò et al. 2014). These initiatives are available to us as research students, and it is not just the responsibility of senior academics to fight against questionable research practices. We can join in too.

Acknowledgements

Thank you to Dr. Hannah Hobson who was happy to talk about her experience as a PhD student and for her expertise in recording the interview. Hannah also helped to write and revise the post. I would also like to thank Professor Chris Chambers for taking the time to provide some comments for the post.

Replicability and Registered Reports

Last summer saw the publication of a monumental piece of work: the reproducibility project (Open Science Collaboration, 2015). In a huge community effort, over 250 researchers directly replicated 100 experiments initially conducted in 2008. Only 39% of the replications were significant at the 5% level. Average effect size estimates were halved. The study design itself—conducting direct replications on a large scale—as well as its outcome are game-changing to the way we view our discipline, but students might wonder: what game were we playing before, and how did we get here?

In this blog post, I provide a selective account of what has been dubbed the “reproducibility crisis”, discussing its potential causes and possible remedies. Concretely, I will argue that adopting Registered Reports, a new publishing format recently also implemented in JEPS (King et al., 2016; see also here), increases scientific rigor, transparency, and thus replicability of research. Wherever possible, I have linked to additional resources and further reading, which should help you contextualize current developments within psychological science and the social and behavioral sciences more generally.

How did we get here?

In 2005, Ioannidis made an intriguing argument. Because the prior probability of any hypothesis being true is low, researchers continuously run low-powered experiments, and the current publishing system is biased toward significant results, most published research findings are false. Within this context, spectacular fraud cases like Diederik Stapel (see here) and the publication of a curious paper about people “feeling the future” (Bem, 2011) made 2011 a “year of horrors” (Wagenmakers, 2012), and toppled psychology into a “crisis of confidence” (Pashler & Wagenmakers, 2012). As argued below, Stapel and Bem are emblematic of two highly interconnected problems of scientific research in general.
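The core of this argument is a short application of Bayes' rule. The sketch below computes the probability that a significant finding reflects a true effect (the positive predictive value) from the prior probability, the power, and the significance level. It is a simplified version that leaves out the bias term Ioannidis also considers, and the specific priors and power values are assumptions chosen for illustration.

```python
def ppv(prior, power, alpha=0.05):
    """Probability that a significant result reflects a true effect."""
    true_positives = power * prior          # true effects that are detected
    false_positives = alpha * (1 - prior)   # null effects that reach p < alpha
    return true_positives / (true_positives + false_positives)

# Illustrative scenarios: (prior probability of a true effect, power)
for prior, power in [(0.5, 0.8), (0.1, 0.8), (0.1, 0.2)]:
    print(f"prior={prior:.1f}, power={power:.1f} -> PPV={ppv(prior, power):.2f}")
```

With a prior of 0.1 and power of 0.2, fewer than one in three significant findings would be true under these assumptions, and that is before publication bias or flexible analysis enter the picture.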

Publication bias

Stapel, who faked results of more than 55 papers, is the reductio ad absurdum of the current “publish or perish” culture[1]. Still, the gold standard to merit publication, certainly in a high impact journal, is p < .05, which results in publication bias (Sterling, 1959) and file-drawers full of nonsignificant results (Rosenthal, 1979; see Lane et al., 2016, for a brave opening; and #BringOutYerNulls). This leads to a biased view of nature, distorting any conclusion we draw from the published literature. In combination with low-powered studies (Cohen, 1962; Button et al., 2013; Fraley & Vazire, 2014), effect size estimates are seriously inflated and can easily point in the wrong direction (Yarkoni, 2009; Gelman & Carlin, 2014). A curious consequence is what Lehrer has titled “the truth wears off” (Lehrer, 2010). Initially high estimates of effect size attenuate over time, until nothing is left of them. Just recently, Kaplan and Irvin reported that the proportion of positive effects in large clinical trials shrank from 57% before 2000 to 8% after 2000 (Kaplan & Irvin, 2015). Even a powerful tool like meta-analysis cannot clear the view of a landscape filled with inflated and biased results (van Elk et al., 2015). For example, while meta-analyses concluded that there is a strong ego-depletion effect of Cohen’s d = .63, recent replications failed to find an effect (Lurquin et al., 2016; Sripada et al., in press)[2].
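A small simulation can show why a significance filter inflates effect sizes when power is low. The sketch below assumes a true standardized effect of 0.2, 20 participants per group, and a z-test with known variance; each study's estimate is drawn directly from its sampling distribution rather than from raw data. All of these settings are illustrative assumptions, not values taken from the papers cited above.

```python
import math
import random
from statistics import mean

def simulate_published_effects(true_d=0.2, n_per_group=20,
                               n_studies=20000, seed=1):
    rng = random.Random(seed)
    # Standard error of a standardized mean difference (sd assumed known, = 1).
    se = math.sqrt(2 / n_per_group)
    # Draw one effect size estimate per study from its sampling distribution.
    estimates = [rng.gauss(true_d, se) for _ in range(n_studies)]
    # "Publication bias": only studies with |z| > 1.96 make it into print.
    published = [d for d in estimates if abs(d / se) > 1.96]
    return mean(estimates), mean(published), len(published) / n_studies

all_mean, published_mean, share_published = simulate_published_effects()
print(f"mean estimate, all studies:       {all_mean:.2f}")
print(f"mean estimate, published studies: {published_mean:.2f}")
print(f"share of studies published:       {share_published:.2f}")
```

The full set of studies recovers the true effect on average, but the published subset cannot: with this sample size an estimate has to exceed roughly 0.62 in absolute value to be significant, so every published estimate is at least three times the true effect.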

Garden of forking paths

In 2011, Daryl Bem reported nine experiments on people being able to “feel the future” in the Journal of Personality and Social Psychology, the flagship journal of its field (Bem, 2011). Eight of them yielded statistical significance, p < .05. We could dismissively say that extraordinary claims require extraordinary evidence, and try to sail away as quickly as possible from this research area, but Bem would be quick to steal our thunder.

A recent meta-analysis of 90 experiments on precognition yielded overwhelming evidence in favor of an effect (Bem et al., 2015). Alan Turing, discussing research on psi related phenomena, famously stated that

“These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately, the statistical evidence, at least of telepathy, is overwhelming.” (Turing, 1950, p. 453; cf. Wagenmakers et al., 2015)

How is this possible? It’s simple: Not all evidence is created equal. Research on psi provides us with a mirror of “questionable research practices” (John, Loewenstein, & Prelec, 2012) and researchers’ degrees of freedom (Simmons, Nelson, & Simonsohn, 2011), obscuring the evidential value of individual experiments as well as whole research areas[3]. However, it would be foolish to dismiss this as being a unique property of obscure research areas like psi. The problem is much more subtle.

The main issue is that there is a one-to-many mapping from scientific to statistical hypotheses[4]. When doing research, there are many parameters one must set; for example, should observations be excluded? Which control variables should be measured? How to code participants’ responses? What dependent variables should be analyzed? By varying only a small number of these, Simmons et al. (2011) found that the actual false positive rate rose from the nominal 5% to over 60%. They conclude that the “increased flexibility allows researchers to present anything as significant.” These issues are exacerbated by insufficient methodological detail in research articles and by the low percentage of researchers sharing their data (Wicherts et al., 2006; Wicherts, Bakker, & Molenaar, 2011), and they are especially acute in fields that require complicated preprocessing steps, such as neuroimaging (Carp, 2012; Cohen, 2016; Luck and Gaspelin, in press).
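To see how even a single degree of freedom inflates the false positive rate, consider the following sketch. It is not a reproduction of the Simmons et al. (2011) simulations: it looks at one choice only (which of two correlated dependent variables, or their average, to report), uses z-tests with known variance instead of t-tests, and all settings are assumptions for illustration. There is no true effect in any simulated study.

```python
import math
import random

def false_positive_rates(n_per_group=20, r=0.5, n_studies=5000, seed=1):
    rng = random.Random(seed)
    crit = 1.96                          # two-sided 5% threshold for a z-test
    se = math.sqrt(2 / n_per_group)      # SE of a mean difference, unit variance
    sd_avg = math.sqrt((1 + r) / 2)      # sd of the average of the two DVs

    def group_means():
        """Means of two DVs (correlation r) for one group under the null."""
        s1 = s2 = 0.0
        for _ in range(n_per_group):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            s1 += z1
            s2 += r * z1 + math.sqrt(1 - r * r) * z2
        return s1 / n_per_group, s2 / n_per_group

    single = flexible = 0
    for _ in range(n_studies):
        a1, a2 = group_means()
        b1, b2 = group_means()
        z_dv1 = (a1 - b1) / se
        z_dv2 = (a2 - b2) / se
        z_avg = ((a1 + a2) / 2 - (b1 + b2) / 2) / (sd_avg * se)
        hits = [abs(z) > crit for z in (z_dv1, z_dv2, z_avg)]
        single += hits[0]        # analysis fixed in advance: DV 1 only
        flexible += any(hits)    # report whichever analysis "worked"
    return single / n_studies, flexible / n_studies

single_rate, flexible_rate = false_positive_rates()
print(f"pre-specified analysis: {single_rate:.3f}")
print(f"flexible analysis:      {flexible_rate:.3f}")
```

The pre-specified analysis stays close to the nominal 5%, while the flexible one ends up well above it. Stacking further choices on top (optional stopping, covariates, exclusions) pushes the rate higher still.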

An important amendment is that researchers need not be aware of this flexibility; a p value might be misleading even when there is no “p-hacking” and the hypothesis was posited ahead of time (i.e., was not changed after the fact, a practice known as HARKing; Kerr, 1998). When decisions that are contingent on the data are made in an environment in which different data would lead to different decisions, even when these decisions “just make sense,” there is a hidden multiple comparison problem lurking (Gelman & Loken, 2014). Usually, when conducting N statistical tests, we control for the number of tests in order to keep the false positive rate at, say, 5%. However, in the aforementioned setting, it is not clear what N should be exactly. Thus, results of statistical tests lose their meaning and carry little evidential value in such exploratory settings; they retain their meaning only in confirmatory settings (de Groot, 1954/2014; Wagenmakers et al., 2012). This distinction is at the heart of the problem, and it gets obscured because many results in the literature are reported as confirmatory when in fact they may very well be exploratory. Most frequently, because of the way scientific reporting is currently done, there is no way for us to tell the difference.
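For the textbook case in which N is known, the arithmetic is simple. The following is a small sketch, assuming the N tests are independent and each is run at α = .05; the function names are mine and serve only as illustration.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


for n_tests in (1, 5, 10, 20):
    uncorrected = familywise_error_rate(n_tests)
    corrected = familywise_error_rate(n_tests, bonferroni_alpha(n_tests))
    print(f"N = {n_tests:2d}: uncorrected {uncorrected:.3f}, Bonferroni {corrected:.3f}")
```

The trouble with the garden of forking paths is precisely that N is unknown, so no such correction can be computed after the fact.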

To get a feeling for the many choices possible in statistical analysis, consider a recent paper in which data analysis was crowdsourced from 29 teams (Silberzahn et al., submitted). The question posited to them was whether dark-skinned soccer players are red-carded more frequently. The estimated effect size across teams ranged from .83 to 2.93 (odds ratios). Nineteen different analysis strategies were used in total, with 21 unique combinations of covariates; 69% found a significant relationship, while 31% did not.

A reanalysis of Berkowitz et al. (2016) by Michael Frank (2016; blog here) is another, more subtle example. Berkowitz and colleagues report a randomized controlled trial, claiming that solving short numerical problems increases children’s math achievement across the school year. The intervention was well designed and well conducted, but still, Frank found that, as he put it, “the results differ by analytic strategy, suggesting the importance of preregistration.”

Frequently, the issue is with measurement. Malte Elson, whose Twitter is highly germane to our topic, has created a daunting website that lists how researchers use the Competitive Reaction Time Task (CRTT), one of the most commonly used tools to measure aggressive behavior. It states that there are 120 publications using the CRTT, which in total analyze the data in 147 different ways!

This increased awareness of researchers’ degrees of freedom and the garden of forking paths is mostly a product of this century, although some authors have expressed this much earlier (e.g., de Groot, 1954/2014; Meehl, 1985; see also Gelman’s comments here). The next point considers an issue much older (e.g., Berkson, 1938), but which nonetheless bears repeating.

Statistical inference

In psychology and much of the social and behavioral sciences in general, researchers overly rely on null hypothesis significance testing and p values to draw inferences from data. However, the statistical community has long known that p values overestimate the evidence against H0 (Berger & Delampady, 1987; Wagenmakers, 2007; Nuzzo, 2014). Just recently, the American Statistical Association released a statement drawing attention to this fact (Wasserstein & Lazar, 2016); that is, in addition to it being easy to obtain p < .05 (Simmons, Nelson, & Simonsohn, 2011), it is also quite a weak standard of evidence overall.

The last point is quite pertinent because the statement that 39% of replications in the reproducibility project were “successful” is misleading. A recent Bayesian reanalysis concluded that the original studies themselves found weak evidence in support of an effect (Etz & Vandekerckhove, 2016), reinforcing all points I have made so far.
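One way to make “weak standard of evidence” concrete is a well-known upper bound on the odds in favor of H1 that a given p value can justify, 1 / (−e · p · ln p), often called the Vovk-Sellke maximum p-ratio. The bound is my addition for illustration, not something taken from the papers cited above; it holds for p < 1/e, and the function name is hypothetical.

```python
import math


def max_odds_against_null(p):
    """Upper bound on the odds for H1 over H0 implied by a p value (valid for p < 1/e)."""
    if not 0 < p < 1 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return 1 / (-math.e * p * math.log(p))


for p in (0.05, 0.01, 0.001):
    print(f"p = {p}: odds for H1 at most {max_odds_against_null(p):.2f} to 1")
```

Under this bound, p = .05 corresponds to odds of at most roughly 2.5 to 1 in favor of H1, which is far less impressive than “1 in 20” suggests; a full Bayesian analysis will typically report less than this best case.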

Notwithstanding the above, p < .05 is still the gold standard in psychology, and is so for intricate historical reasons (cf. Gigerenzer, 1993). At JEPS, we certainly do not want to echo calls or actions to ban p values (Trafimow & Marks, 2015), but we urge students and their instructors to bring more nuance to their use (cf. Gigerenzer, 2004).

Procedures based on classical statistics provide different answers from what most researchers and students expect (Oakes, 1986; Haller & Krauss, 2002; Hoekstra et al., 2014). To be sure, p values have their place in model checking (e.g., Gelman, 2006: are the data consistent with the null hypothesis?), but they are poorly equipped to measure the relative evidence for H1 or H0 brought about by the data; for this, researchers need to use Bayesian inference (Wagenmakers et al., in press). Because university curricula often lag behind current developments, students reading this are encouraged to advance their methodological toolbox by browsing through Etz et al. (submitted) and playing with JASP[5].

Teaching the exciting history of statistics (cf. Gigerenzer et al., 1989; McGrayne, 2012), or at least contextualizing the developments of currently dominating statistical ideas, is a first step away from their cookbook oriented application.

Registered reports to the rescue

While we can only point to the latter, statistical issue, we can actually eradicate the issue of publication bias and the garden of forking paths by introducing a new publishing format called Registered Reports. This format was initially introduced to the journal Cortex by Chris Chambers (Chambers, 2013), and it is now offered by more than two dozen journals in the fields of psychology, neuroscience, psychiatry, and medicine. Recently, we have also introduced this publishing format at JEPS (see King et al., 2016).

Specifically, researchers submit a document including the introduction, theoretical motivation, experimental design, data preprocessing steps (e.g., outlier removal criteria), and the planned statistical analyses prior to data collection. Peer review only focuses on the merit of the proposed study and the adequacy of the statistical analyses[6]. If there is sufficient merit to the planned study, the authors are guaranteed in-principle acceptance (Nosek & Lakens, 2014). Upon receiving this acceptance, researchers subsequently carry out the experiment and submit the final manuscript. Deviations from the first submission must be discussed, and additional statistical analyses are labeled exploratory.

In sum, by publishing regardless of the outcome of the statistical analysis, registered reports eliminate publication bias; by specifying the hypotheses and analysis plan beforehand, they make apparent the distinction between exploratory and confirmatory studies (de Groot, 1954/2014), avoid the garden of forking paths (Gelman & Loken, 2014), and guard against post-hoc theorizing (Kerr, 1998).

Even though registered reports are commonly associated with high power (80-95%), this is unfeasible for student research. However, note that a single study cannot be decisive in any case. Reporting sound, hypothesis-driven, not-cherry-picked research can be important fuel for future meta-analysis (for an example, see Scheibehenne, Jamil, & Wagenmakers, in press).
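To see why such power levels are out of reach for many student projects, consider a rough sample size sketch for a two-group comparison. It uses the normal approximation (which slightly underestimates what an exact t-test calculation gives), a two-sided α of .05, and effect sizes of d = 0.5 and d = 0.2 that I picked purely as illustrative assumptions; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, power=0.80, alpha=0.05):
    """Approximate participants needed per group for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


for d in (0.5, 0.2):
    for power in (0.80, 0.95):
        print(f"d = {d}, power = {power:.2f}: about {n_per_group(d, power)} per group")
```

For a small effect, the required sample runs into several hundred participants per group, which is rarely feasible for a thesis project.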

To avoid possible confusion, note that preregistration is different from Registered Reports: The former is the act of specifying the methodology before data collection, while the latter is a publishing format. You can preregister your study on several platforms such as the Open Science Framework or AsPredicted. Registered reports include preregistration but go further and have additional benefits such as peer review prior to data collection and in-principle acceptance.

Conclusion

In sum, there are several issues impeding progress in psychological science, most pressingly the failure to distinguish between exploratory and confirmatory research, and publication bias. A new publishing format, Registered Reports, provides a powerful means to address them both, and, to borrow a phrase from Daniel Lakens, enables us to “sail away from the seas of chaos into a corridor of stability” (Lakens & Evers, 2014).

Suggested Readings

  • Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
  • Wagenmakers, E. J., Wetzels, R., Borsboom, D., van der Maas, H. L., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632-638.
  • Gelman, A., & Loken, E. (2014). The Statistical Crisis in Science. American Scientist, 102(6), 460-465.
  • King, M., Dablander, F., Jakob, L., Agan, M., Huber, F., Haslbeck, J., & Brecht, K. (2016). Registered Reports for Student Research. Journal of European Psychology Students, 7(1), 20-23.
  • Twitter (or you might miss out)

Footnotes

[1] Incidentally, Diederik Stapel published a book about his fraud. See here for more.

[2] Baumeister (2016) is a perfect example of how not to respond to such a result. Michael Inzlicht shows how to respond adequately here.

[3] For a discussion of these issues with respect to the precognition meta-analysis, see Lakens (2015) and Gelman (2014).

[4] Another related, crucial point is the lack of theory in psychology. However, as this depends on whether you read the Journal of Mathematical Psychology or, say, Psychological Science, it is not addressed further. For more on this point, see for example Meehl (1978), Gigerenzer (1998), and a class by Paul Meehl which has been kindly converted to mp3 by Uri Simonsohn.

[5] However, it would be premature to put too much blame on p. More pressingly, the misunderstandings and misuse of this little fellow point towards a catastrophic failure in undergraduate teaching of statistics and methods classes (for the latter, see Richard Morey’s recent blog post). Statistics classes in psychology are often boringly cookbook oriented, and so students just learn the cookbook. If you are an instructor, I urge you to have a look at “Statistical Rethinking” by Richard McElreath. In general, however, statistics is hard, and there are many issues transcending the frequentist versus Bayesian debate (for examples, see Judd, Westfall, and Kenny, 2012; Westfall & Yarkoni, 2016).

[6] Note that JEPS already publishes research regardless of whether p < .05. However, this does not discourage us from drawing attention to this benefit of Registered Reports, especially because most other journals have a different policy.

This post was edited by Altan Orhon.


Meet the Authors

Do you wish to publish your work but don’t know how to get started? We asked some of our student authors, Janne Hellerup Nielsen, Dimitar Karadzhov, and Noelle Sammon, to share their experience of getting published.

Janne Hellerup Nielsen is a psychology graduate from Copenhagen University. Currently, she works in the field of selection and recruitment within the Danish Defence. She is the first author of the research article “Posttraumatic Stress Disorder among Danish Soldiers 2.5 Years after Military Deployment in Afghanistan: The Role of Personality Traits as Predisposing Risk Factors”. Prior to this publication, she had no experience with publishing or peer review but she decided to submit her research to JEPS because “it is a peer reviewed journal and the staff at JEPS are very helpful, which was a great help during the editing and publishing process.”

Dimitar Karadzhov moved to Glasgow, United Kingdom to study psychology (bachelor of science) at the University of Glasgow. He completed his undergraduate degree in 2014 and he is currently completing a part-time master of science in global mental health at the University of Glasgow. He is the author of “Assessing Resilience in War-Affected Children and Adolescents: A Critical Review”. Prior to this publication, he had no experience with publishing or peer review. Now having gone through the publication process, he recommends that fellow students submit their work because “it is a great research and networking experience.”

Noelle Sammon has an honors degree in business studies. She returned to university in 2010 and completed a higher diploma in psychology at the National University of Ireland, Galway. She is currently completing a master’s degree in applied psychology at the University of Ulster, Northern Ireland. She plans to pursue a career in clinical psychology. She is the first author of the research article “The Impact of Attention on Eyewitness Identification and Change Blindness”. Noelle had some experience with the publication process while previously working as a research assistant. She describes her experience with JEPS as follows: “[It was] very professional and a nice introduction to publishing research. I found the editors that I was in contact with to be really helpful in offering guidance and support. Overall, the publication process took approximately 10 months from start to finish but having had the opportunity to experience this process, I would encourage other students to publish their research.”

How did the research you published come about?

Janne: “During my psychology studies, I had an internship at a research center in the Danish Defence. Here I was a part of a big prospective study regarding deployed soldiers and their psychological well-being after homecoming. I was so lucky to get to use the data from the research project to conduct my own studies regarding personality traits and the development of PTSD. I’ve always been interested in differential psychology—for example, why people manage the same traumatic experiences differently. Therefore, it was a great opportunity to do research within the field of personality traits and the development of PTSD, and even to do so with some greatly experienced supervisors, Annie and Søren.”

Dimitar: “In my final year of the bachelor of science degree in psychology, I undertook a critical review module. My assigned supervisor was liberal enough and gave me complete freedom to choose the topic I would like to write about. I then browsed a few The Psychologist editions I had for inspiration and was particularly interested in the area of resilience from a social justice perspective. Resilience is a controversial and fluid concept, and it is key to recovery from traumatic events such as natural disasters, personal trauma, war, terrorism, etc. It originates from biomedical sciences and it was fascinating to explore how such a concept had been adopted and researched by the social and humanitarian sciences. I was intrigued to research the similarities between biological resilience of human and non-human animals and psychological resilience in the face of extremely traumatic experiences such as war. To add an extra layer of complexity, I was fascinated by how the most vulnerable of all, children and adolescents, conceptualize, build, maintain, and experience resilience. From a researcher’s perspective, one of the biggest challenges is to devise and apply methods of inquiry in order to investigate the concept of resilience in the most valid, reliable, and culturally appropriate manner. The quantitative–qualitative dyad was a useful organizing framework for my work and it was interesting to see how it would fit within the resilience discourse.”

Noelle: “The research piece was my thesis project for the higher diploma (HDIP). I have always had an interest in forensic psychology. Moreover, while attending the National University of Ireland, Galway as part of my HDIP, I studied forensic psychology. This got me really interested in eyewitness testimony and the overwhelming amount of research highlighting the problematic reliability with it.”

What did you enjoy most in your research and what did you find difficult?

Janne: “There is a lot of editing and so forth when you publish your research, but then again it really makes sense because you have to be able to communicate the results of your research out to the public. To me, that is one of the main purposes of research: to be able to share the knowledge that comes out of it.”

Dimitar: “[I enjoyed] my familiarization with conflicting models of resilience (including biological models), with the origins and evolution of the concept, and with the qualitative framework for investigation of coping mechanisms in vulnerable, deprived populations. In the research process, the most difficult part was creating a coherent piece of work that was very informative and also interesting and readable, and relevant to current affairs and sociopolitical processes in low- and middle-income countries. In the publication process, the most difficult bit was ensuring my work adhered to the publication standards of the journal and addressing the feedback provided at each stage of the review process within the time scale requested.”

Noelle: “I enjoyed developing the methodology to test the research hypothesis and then getting the opportunity to test it. [What I found difficult was] ensuring the methodology would manipulate the variables required.”

How did you overcome these difficulties?

Janne: “[By] staying focused on the goal of publishing my research.”

Dimitar: “With persistence, motivation, belief, and a love for science! And, of course, with the fantastic support from the JEPS publication staff.”

Noelle: “I conducted a pilot using a sample of students asking them to identify any problems with materials or methodology that may need to be altered.”

What did you find helpful when you were doing your research and writing your paper?

Janne: “It was very important for me to get competent feedback from experienced supervisors.”

Dimitar: “Particularly helpful was reading systematic reviews, meta-analyses, conceptual papers, and methodological critique.”

Noelle: “I found my supervisor to be very helpful when conducting my research. In relation to the write-up of the paper, I found that having peers and non-psychology friends read and review my paper helped ensure that it was understandable, especially for lay people.”

Finally, here are some words of wisdom from our authors.

Janne: “Don’t think you can’t do it. It requires some hard work, but the effort is worth it when you see your research published in a journal.”

Dimitar: “Choose a topic you are truly passionate about and be prepared to explore the problem from multiple perspectives, and don’t forget about the ethical dimension of every scientific inquiry. Do not be afraid to share your work with others, look for feedback, and be ready to receive feedback constructively.”

Noelle: “When conducting research it is important to pick an area of research that you are interested in and really refine the research question being asked. Also, if you are able to get a colleague or peer to review it for you, do so.”

We hope our authors have inspired you to go ahead and make that first step towards publishing your research. We welcome your submissions anytime! Our publication guidelines can be viewed here. We also prepared a manual for authors that we hope will make your life easier. If you do have questions, feel free to get in touch at journal@efpsa.org.

This post was edited by Altan Orhon.


How not to worry about APA style

If you have gone through the trouble of picking up a copy of the Publication Manual of the American Psychological Association (APA, 2010), I’m sure your first reaction was similar to mine: “Ugh! 272 pages of boredom.” Do people actually read this monster? I don’t know. I don’t think so. I know I haven’t read every last bit of it. You may be relieved to hear that your reaction resonates with some of the critique that has been voiced by senior researchers in Psychology, such as Henry L. Roediger III (2004). But let’s face it: APA style is not going anywhere. It is one of the major style regimes in academia and is used in many fields other than Psychology, including medical and other public health journals. And to be fair, standardizing academic documents is not a bad idea. It helps readers to efficiently access the desired information. It helps authors by making the journal’s expectations regarding style explicit, and it helps reviewers to concentrate on the content of a manuscript. Most importantly, the guidelines set a standard that is accepted by a large number of outlets. Imagine a world in which you had to familiarize yourself with a different style every time you chose a new outlet for your scholarly work.


Answering Frequently Asked Questions about JEPS

Is there anything you ever wanted to know about JEPS and the people behind it? Here are answers to our ten most frequently asked questions.

  1.  Who are we?

We are students from all over Europe and, as the Editorial Team of the Journal of European Psychology Students (check out our website here), we run JEPS. Together with a group of other people (Associate Editors, Reviewers, Copyeditors, and Proofreaders), we see students’ manuscripts through the publication process.


Most frequent APA mistakes at a glance

APA guidelines, don’t we all love them? As an example, take one simple black line used to separate words, the hyphen: not only do you have to check whether a term needs a hyphen or whether a blank space will suffice, you also have to think about the different types of dashes (em dash, en dash, minus sign, and hyphen). Yes, it is not that much fun. And at JEPS we often get the question: why do we even have to adhere to those guidelines?


Common APA Errors; Infographic taken from the EndNote Blog http://bit.ly/1uWDqnO

The answer is rather simple: The formatting constraints imposed by journals allow the emphasis to be placed on the manuscript’s content during the review process. Because all submitted manuscripts share the same format, reviewers can concentrate on the content without being distracted by unfamiliar and irregular formatting and reporting styles.

The Publication Manual runs to an impressive 286 pages and causes quite some confusion. At JEPS, we have counted the most frequent mistakes in manuscripts submitted to us, data that the EndNote blog has turned into this nice little graphic.

Here you can find some suggestions on how to avoid these mistakes in the first place.

References

American Psychological Association. (2009). Publication Manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.

Vainre, M. (2011). Common mistakes made in APA style. JEPS Bulletin. Retrieved from http://blog.efpsa.org/2011/11/20/common-mistakes-made-in-apa-style/


“Set the default to ‘Open'” – Impressions from the OpenCon2014

In November 2014, 150 early-career researchers and students met in Washington D.C. for OpenCon, organized by the Right to Research Coalition, to talk about the movement to open science up – be it through Open Access to published literature, Open Data, or Open Educational Resources. The three-day event offered lectures and panels on the state of the open today, but also served as an incubator for the future of the whole debate that spans universities, research funders, and publishers. It was an opportunity for the already experienced advocates and academics to interact with the younger generation of students and researchers interested in these issues.


Ethics – The Science of Morals, Rules and Behaviour

 

Ethical boards are in place to evaluate the ethical feasibility of a study by weighing the possible negative effects against the possible positive effects of the research project (Barret, 2006). When designing your research project, it could be that you need to apply for ethical approval. This is a challenging task, as there are strict guidelines to abide by when drawing up a proposal. This is where your supervisor can help: with their experience, they have a clear idea of what would be accepted for someone applying for ethical approval for an undergraduate or master’s study. Abiding by ethics is of great importance, in research and in practice. The importance lies in the fact that care is taken for the participant, the researcher, and wider society. It creates a filter for a good standard of research with as little harm done as possible.


Why Would Researchers Skip Peer-Review? Media Reports of Unpublished Findings

‘You love your iPhone. Literally.’ ‘This is your brain on politics.’ ‘Overclock your brain using transcranial Direct Current Stimulation (tDCS).’ There are many other claims in psychology which have been publicised by the media, yet remain unchecked by academic experts. Peer-reviewed publications – papers which have been checked by researchers of similar expertise to the authors – are produced very slowly and only occasionally make instant impacts outside the walls of academia. In contrast, media publications are produced very quickly and provoke immediate reactions from the general public.
