Not solely about that Bayes: Interview with Prof. Eric-Jan Wagenmakers

Last summer saw the publication of the most important work in psychology in decades: the Reproducibility Project (Open Science Collaboration, 2015; see here and here for context). It stirred up the community, resulting in many constructive discussions but also in verbally violent disagreement. What unites all parties, however, is the call for more transparency and openness in research.

Eric-Jan “EJ” Wagenmakers has argued for pre-registration of research (Wagenmakers et al., 2012; see also here) and direct replications (e.g., Boekel et al., 2015; Wagenmakers et al., 2015), for a clearer demarcation of exploratory and confirmatory research (de Groot, 1954/2013), and for a change in the way we analyze our data (Wagenmakers et al., 2011; Wagenmakers et al., in press).

Concerning the latter point, EJ is a staunch advocate of Bayesian statistics. With his many collaborators, he writes the clearest and wittiest exposures to the topic (e.g., Wagenmakers et al., 2016; Wagenmakers et al., 2010). Crucially, he is also a key player in opening Bayesian inference up to social and behavioral scientists more generally; in fact, the software JASP is EJ’s brainchild (see also our previous interview).

EJ

In sum, psychology is changing rapidly, both in how researchers communicate and do science, but increasingly also in how they analyze their data. This makes it nearly impossible for university curricula to keep up; courses in psychology are often years, if not decades, behind. Statistics classes in particular are usually boringly cookbook oriented and often fraught with misconceptions (Wagenmakers, 2014). At the University of Amsterdam, Wagenmakers succeeds in doing differently. He has previously taught a class called “Good Science, Bad Science”, discussing novel developments in methodology as well as supervising students in preparing and conducting direct replications of recent research findings (cf. Frank & Saxe, 2012).

Now, at the end of the day, testing undirected hypotheses using p values or Bayes factors only gets you so far – even if you preregister the heck out of it. To move the field forward, we need formal models that instantiate theories and make precise quantitative predictions. Together with Michael Lee, Eric-Jan Wagenmakers has written an amazing practical cognitive modeling book, harnessing the power of computational Bayesian methods to estimate arbitrarily complex models (for an overview, see Lee, submitted). More recently, he has co-edited a book on model-based cognitive neuroscience on how formal models can help bridge the gap between brain measurements and cognitive processes (Forstmann & Wagenmakers, 2015).

Long-term readers of the JEPS bulletin will note that topics ranging from openness of research, pre-registration and replication, and research methodology and Bayesian statistics are recurring themes. It has thus been only a matter of time for us to interview Eric-Jan Wagenmakers and ask him questions concerning all areas above. In addition, we ask: how does he stay so immensely productive? What tips does he have for students interested in an academic career; and what can instructors learn from “Good Science, Bad Science”? Enjoy the ride!


Bobby Fischer, the famous chess player, once said that he does not believe in psychology. You actually switched from playing chess to pursuing a career in psychology; tell us how this came about. Was it a good move?

It was an excellent move, but I have to be painfully honest: I simply did not have the talent and the predisposition to make a living out of playing chess. Several of my close friends did have that talent and went on to become international grandmasters; they play chess professionally. But I was actually lucky. For players outside of the world top-50, professional chess is a career trap. The pay is poor, the work insanely competitive, and the life is lonely. And society has little appreciation for professional chess players. In terms of creativity, hard work, and intellectual effort, an international chess grandmaster easily outdoes the average tenured professor. People who do not play chess themselves do not realize this.

Your list of publications gets updated so frequently, it should have its own RSS feed! How do you grow and cultivate such an impressive network of collaborators? Do you have specific tips for early career researchers?

At the start of my career I did not publish much. For instance, when I finished my four years of grad studies I think I had two papers. My current publication rate is higher, and part of that is due to an increase in expertise. It is just easier to write papers when you know (or think you know) what you’re talking about. But the current productivity is mainly due to the quality of my collaborators. First, at the psychology department of the University of Amsterdam we have a fantastic research master program. Many of my graduate students come from this program, having been tried and tested in the lab as RAs. When you have, say, four excellent graduate students, and each publishes one article a year, that obviously helps productivity. Second, the field of Mathematical Psychology has several exceptional researchers that I have somehow managed to collaborate with. In the early stages I was a graduate student with Jeroen Raaijmakers, and this made it easy to start work with Rich Shiffrin and Roger Ratcliff. So I was privileged and I took the opportunities that were given. But I also work hard, of course.

There is a lot of advice that I could give to early career researchers but I will have to keep it short. First, in order to excel in whatever area of life, commitment is key. What this usually means is that you have to enjoy what you are doing. Your drive and your enthusiasm will act as a magnet for collaborators. Second, you have to take initiative. So read broadly, follow the latest articles (I remain up to date through Twitter and Google Scholar), get involved with scientific organizations, coordinate a colloquium series, set up a reading group, offer your advisor to review papers with him/her, attend summer schools, etc. For example, when I started my career I had seen a new book on memory and asked the editor of Acta Psychologica whether I could review it for them. Another example is Erik-Jan van Kesteren, an undergraduate student from a different university who had attended one of my talks about JASP. He later approached me and asked whether he could help out with JASP. He is now a valuable member of the JASP team. Third, it helps if you are methodologically strong. When you are methodologically strong –in statistics, mathematics, or programming– you have something concrete to offer in a collaboration.

Considering all projects you are involved in, JASP is probably the one that will have most impact on psychology, or the social and behavioral sciences in general. How did it all start?

In 2005 I had a conversation with Mark Steyvers. I had just shown him a first draft of a paper that summarized the statistical drawbacks of p-values. Mark told me “it is not enough to critique p-values. You should also offer a concrete alternative”. I agreed and added a section about BIC (the Bayesian Information Criterion). However, the BIC is only a rough approximation to the Bayesian hypothesis test. Later I became convinced that social scientists will only use Bayesian tests when these are readily available in a user-friendly software package. About 5 years ago I submitted an ERC grant proposal “Bayes or Bust! Sensible hypothesis tests for social scientists” that contained the development of JASP (or “Bayesian SPSS” as I called it in the proposal) as a core activity. I received the grant and then we were on our way.

I should acknowledge that much of the Bayesian computations in JASP depend on the R BayesFactor package developed by Richard Morey and Jeff Rouder. I should also emphasize the contribution by JASPs first software engineer, Jonathon Love, who suggested that JASP ought to feature classical statistics as well. In the end we agreed that by including classical statistics, JASP could act as a Trojan horse and boost the adoption of Bayesian procedures. So the project started as “Bayesian SPSS”, but the scope was quickly broadened to include p-values.

JASP is already game-changing software, but it is under continuous development and improvement. More concretely, what do you plan to add in the near future? What do you hope to achieve in the long-term?

In terms of the software, we will shortly include several standard procedures that are still missing, such as logistic regression and chi-square tests. We also want to upgrade the popular Bayesian procedures we have already implemented, and we are going to create new modules. Before too long we hope to offer a variable views menu and a data-editing facility. When all this is done it would be great if we could make it easier for other researchers to add their own modules to JASP.

One of my tasks in the next years is to write a JASP manual and JASP books. In the long run, the goal is to have JASP be financially independent of government grants and university support. I am grateful for the support that the psychology department at the University of Amsterdam offers now, and for the support they will continue to offer in the future. However, the aim of JASP is to conquer the world, and this requires that we continue to develop the program “at break-neck speed”. We will soon be exploring alternative sources of funding. JASP will remain free and open-source, of course.

You are a leading advocate of Bayesian statistics. What do researchers gain by changing the way they analyze their data?

They gain intellectual hygiene, and a coherent answer to questions that makes scientific sense. A more elaborate answer is outlined in a paper that is currently submitted to a special issue for Psychonomic Bulletin & Review: https://osf.io/m6bi8/ (Part I).

The Reproducibility Project used different metrics to quantify the success of a replication – none of them really satisfactory. How can a Bayesian perspective help illuminate the “crisis of replication”?

As a theory of knowledge updating, Bayesian statistics is ideally suited to address questions of replication. However, the question “did the effect replicate?” is underspecified. Are the effect sizes comparable? Does the replication provide independent support for the presence of the effect? Does the replication provide support for the position of the proponents versus the skeptics? All these questions are slightly different, but each receives the appropriate answer within the Bayesian framework. Together with Josine Verhagen, I have explored a method –the replication Bayes factor– in which the prior distribution for the replication test is the posterior distribution obtained from the original experiment (e.g., Verhagen & Wagenmakers, 2014). We have applied this intuitive procedure to a series of recent experiments, including the multi-lab Registered Replication Report of Fritz Strack’s Facial Feedback hypothesis. In Strack’s original experiment, participants who held a pen with their teeth (causing a smile) judged cartoons to be funnier than participants who held a pen with their lips (causing a pout). I am not allowed to tell you the result of this massive replication effort, but the paper will be out soon.

You have recently co-edited a book on model-based cognitive neuroscience. What is the main idea here, and what developments in this area are most exciting to you?

The main idea is that much of experimental psychology, mathematical psychology, and the neurosciences pursue a common goal: to learn more about human cognition. So ultimately the interest is in latent constructs such as intelligence, confidence, memory strength, inhibition, and attention. The models that have been developed in mathematical psychology are able to link these latent constructs to specific model parameters. These parameters may in turn be estimated by behavioral data, by neural data, or by both data sets jointly. Brandon Turner is one of the early career mathematical psychologists who has made great progress in this area. So the mathematical models are a vehicle to achieve an integration of data from different sources. Moreover, insights from neuroscience can provide important constraints that help inform mathematical modeling. The relation is therefore mutually beneficial. This is summarized in the following paper: http://www.ejwagenmakers.com/2011/ForstmannEtAl2011TICS.pdf

One thing that distinguishes science from sophistry is replication; yet it is not standard practice. In “Good Science, Bad Science”, you had students prepare a registered replication plan. What was your experience teaching this class? What did you learn from the students?

This was a great class to teach. The students were highly motivated and oftentimes it felt more like lab-meeting than like a class. The idea was to develop four Registered Report submissions. Some time has passed, but the students and I still intend to submit the proposals for publication.

The most important lesson this class has taught me is that our research master students want to learn relevant skills and conduct real research. In the next semester I will teach a related course, “Good Research Practices”, and I hope to attain the same high levels of student involvement. For the new course, I plan to have students read a classic methods paper that identifies a fallacy; next the students will conduct a literature search to assess the current prevalence of the fallacy. I have done several similar projects, but never with master students (e.g., http://www.ejwagenmakers.com/2011/NieuwenhuisEtAl2011.pdf and http://link.springer.com/article/10.3758/s13423-015-0913-5).

What tips and tricks can you share with instructors planning to teach a similar class?

The first tip is to set your aims high. For a research master class, the goal should be publication. Of course this may not always be realized, but it should be the goal. It helps if you can involve colleagues or graduate students. If you set your aims high, the students know that you take them seriously, and that their work matters. The second tip is to arrange the teaching so that the students do most of the work. The students need to develop a sense of ownership about their projects, and they need to learn. This will not happen if you treat the students as passive receptacles. I am reminded of a course that I took as an undergraduate. In this course I had to read chapters, deliver presentations, and prepare questions. It was one of the most enjoyable and inspiring courses I had ever taken, and it took me decades to realize that the professor who taught the course actually did not have to do much at all.

Many scholarly discussions these days take place on social media and blogs. You’ve joined twitter yourself over a year ago. How do you navigate the social media jungle, and what resources can you recommend to our readers?

I am completely addicted to Twitter, but I also feel it makes me a better scientist. When you are new to Twitter, I recommend that you start by following a few people that have interesting things to say. Coming from a Bayesian perspective, I recommend Alexander Etz (@AlxEtz) and Richard Morey (@richarddmorey). And of course it is essential to follow JASP (@JASPStats). As is the case for all social media, the most valuable resource you have is the “mute” option. Prevent yourself from being swamped by holiday pictures and exercise it ruthlessly.

Fabian Dablander

Fabian Dablander just finished his Masters in Cognitive Science at the University of Tübingen. He is interested in innovative ways of data collection, Bayesian statistics, open science, and effective altruism. You can find him on Twitter @fdabl.

More Posts - Website

Facebooktwitterrss