Inventorizing Reproducibility

Network Nodes Meeting, 4th October 2024

What happens if you ask representatives from 11 different research-performing and research-supporting institutions to think about how reproducibility-ready their own institution is?
On 4th October, the contact persons of our node members met in Utrecht for NLRN’s first network meeting, to get to know each other and to learn from one another.

We used the framework of the Knowledge Exchange (KE) report on reproducibility at research-performing organizations to systematically think through enablers of and hindrances to reproducible research.

In small groups, we first categorized our own institutions by how reproducibility-ready they are. The KE report suggests three levels of readiness: 1) there are some pockets of excellence; 2) efforts are partially coordinated; and 3) there is organizational-level commitment with coordinated processes. We concluded that the level depends on which disciplines, departments and research methodologies you are considering. We also often encountered differences between management levels and researchers: university-wide management may set policies to foster reproducible research, but these don’t necessarily trickle down to an individual researcher’s work. This can even lead to window dressing or “open washing”, where institutions present themselves as committed to open research practices while the culture within the institution doesn’t actually change.

The second discussion exercise was about enablers of reproducible working, such as training, mentorship or recognition. In small groups, we tried to identify which enablers are already in place and at which level, and whether they indeed function as enablers. The KE report comes with an assessment worksheet for institutions, which some of the participants tested.

Visualization of enablers at a sample institution. Each enabler is shown with its current state and its target state.

During the final discussion, we tried to figure out how the tools of the report can be used to further reproducibility in the node institutions.

The general consensus was that the tools as presented would not work for all areas of scholarship and research. They would work largely for quantitative methodologies and would need tweaking for interpretative, qualitative, art-based and action-based research methodologies.

The idea was raised to find an institution that would use the framework to assess its current state and work towards becoming more reproducibility-ready. This process could be followed and presented as a concrete example of how the framework works in practice.

We didn’t have enough time to talk in detail about each of the enablers. One question that was raised was whether there is a hierarchy of enablers, and whether an institution should aim for the highest score on all of them or just on some.

We spent a lot of time discussing training for researchers as an enabler. Ideas included introducing reproducibility-related courses into the mandatory curriculum at graduate schools. Others remarked that a lot of training modules are already on offer, but that they don’t seem to reach everyone.

The budget cuts for higher education were also discussed during this meeting, with the conclusion that we need to work even more collaboratively and in a more coordinated way to make the most of the means that are already available. NLRN could play an important role in this.

In conclusion, the network event was a great opportunity to get to know each other, with a lot of engagement from all participants in the discussions on the current state of reproducible working and on future directions to strengthen it. In our next meeting, we will focus on concrete steps towards that future.

Striking a balance 

In June, I attended the World Conference on Research Integrity in Athens. I am still inspired by the many fruitful and fun encounters with colleagues from different places in the world and by thought-provoking presentations. One of these talks forms the basis of this blogpost: the keynote by Daniele Fanelli, entitled “Cautionary tales from metascience”, in a session on the effects of research integrity on innovation and policy making. Among his main messages were (1) that replication rates are not as bad as we make them out to be, (2) that reproducibility is related to the complexity of the research at hand, (3) that changing policies as a reaction to the reproducibility crisis might do more harm than good, and (4) that there is no one-size-fits-all solution. Below, I will discuss these issues and try to conclude what this means for the work within NLRN.

Let’s start with his premise that replication rates from the literature are actually not that bad. He mentioned rates of 60-90%, taking the higher values from ranges reported in the literature. I think a fairer representation would be a median rate between 50% and 60%. Whether that means we are in a crisis is a different question. Crises are usually associated with specific periods in time, and it is probably reasonable to assume that replication rates would not have been much different 20 or 30 years ago, had there been replication studies at that time. Fanelli went on to mention the large variance in replication rates across studies, and apparently also across (sub)disciplines, and there he has a good point.

Fanelli presented results from his own work, performed with data from the Brazilian reproducibility initiative, showing that complexity might indeed be related to replication. So there is at least some empirical evidence for his statement. It also seems logical that simpler, more straightforward research, or research in a mature field where there is a high degree of consensus on methods and procedures, would be easier to reproduce and its results easier to replicate.

Fanelli went on to argue that policies focusing on incentive structures are not effective in combating questionable research practices (QRPs). He showed that, overall, bias and questionable research practices are strongly related only to the country of the first author. Additionally, within countries in which QRPs are prevalent, incentive structures and publication pressure seem to be important drivers, but in other countries these factors do not seem to be related to QRPs. According to Fanelli, this would imply that policies focused on these factors would not be effective in many countries. Here I think Fanelli jumps to the conclusion a bit too quickly. All of his evidence comes from meta-research, which is by nature observational and aggregated, so there might be confounds underlying the relations he showed. Moreover, we would need intervention studies to explore whether intervening on these aspects changes outcomes, and such studies are scarce. In the field of reproducibility, there is some evidence that rigour-enhancing practices in both original studies and replication studies can lead to high replication rates and effect sizes that are virtually unchanged in the replications.1 These practices included confirmatory tests, large sample sizes, preregistration and methodological transparency. However, this multi-lab study was done in social psychology, and it is uncertain how results would turn out in other fields or (sub)disciplines.

All in all, there is not much evidence yet that policy interventions improve the reproducibility and replicability of studies, and there is probably no one-size-fits-all solution. Fanelli concludes that policy should be light and adaptive, and that makes sense. We will have to strike a balance between incorporating some generic principles and leaving enough room for discipline-, field- and country/region-specific differences. How do we know what works for whom? By developing interventions and policies together with academic and non-academic staff, piloting and evaluating them, and, when deemed viable, implementing them on a broader scale while evaluating and adapting where necessary. These efforts need continuous monitoring. The reproducibility networks are ideally suited to support these efforts through their network of research-performing institutions, communities of researchers and educators, and other relevant stakeholders.

Within the Dutch Reproducibility Network we acknowledge the specificity of reproducibility and replication across disciplines and fields, which is why one of our focus areas for the coming years is non-quantitative research. We are eager to work on these and other pressing issues with our partners, striving for evidence-informed implementation of interventions and policies on reproducibility.   

1 Protzko, J., Krosnick, J., Nelson, L. et al. High replicability of newly discovered social-behavioural findings is achievable. Nat Hum Behav 8, 311–319 (2024). https://doi.org/10.1038/s41562-023-01749-9

Find the slides of Daniele Fanelli’s talk here: https://az659834.vo.msecnd.net/eventsairwesteuprod/production-pcoconvin-public/e2e0a11ab8514551a3376e9b49af030d

Platform for Young Meta-Scientists (PYMS): Empowering the Future of Meta-Science 

Meta-science, the study of scientific practice itself, is a field crucial for fostering and monitoring research transparency, reproducibility, and integrity. Recognizing the need for a community among early-career meta-scientists in the Netherlands (and its neighbouring countries), the Platform for Young Meta-Scientists (PYMS) was formed in 2018. PYMS is dedicated to supporting and connecting young meta-scientists, providing them with a collaborative environment to share resources and discuss new research ideas.

NLRN and PYMS: strong bonds 

Collaborating with meta-scientists is one of the focus areas of the NLRN. Evidence-based interventions and monitoring strategies are pivotal to a successful move towards more reproducible science, and it is meta-science that can generate such evidence. We are therefore happy to work together with PYMS and wholeheartedly support their mission.

PYMS map of expertise

Highlights from the PYMS Meeting in Tilburg: May 31, 2024 

The recent PYMS meeting at the Meta-Research Center at Tilburg University was a testament to the vibrant and dynamic nature of the PYMS network. The program featured both formal presentations and informal networking opportunities. A crucial part of the meeting was the brainstorming session on the future of PYMS, where attendees provided valuable feedback and ideas for future events and organizational strategies. 

Here are a few examples of presentations to illustrate the wide range of expertise that participants brought to the meeting: 

  • Signe Glaesel (Leiden University): Discussed the challenges surrounding data sharing, including misinterpretations, intellectual property concerns, and the impact of data policies on participant willingness in sensitive studies. 
  • Michele Nuijten (Tilburg University): Shared a four-step robustness check for research replicability, highlighting the prevalence of reproducibility problems and strategies to improve research robustness. 
  • Ana Barbosa Mendez (Erasmus University and Promovendi Netwerk Nederland (PNN)): Spoke on best practices in Open Science, mapping the needs of PhD students, and the holistic approach required for effective science communication and community building. 

Looking Ahead: The Future of PYMS 

The enthusiasm and engagement at the Tilburg meeting underscored the need for regular PYMS gatherings. Participants expressed interest in a more holistic approach, broadening the scope to include researchers from other scientific fields. Formalizing PYMS through stronger links with organizations like PNN and NLRN was also a key takeaway. 

A concrete outcome of the afternoon’s brainstorming session is a plan for a satellite event on December 5th, the day before the NLRN symposium in Groningen. All early-career researchers in the field of meta-science are invited to join this event. More information will be shared via the NLRN newsletter and on social media.

For more information on upcoming PYMS events and how to get involved, visit metaresearch.nl and PYMS (metaphant.net).

The many dimensions of reproducible research: A call to find your fellow advocates

Blogpost by steering group member Tamarinde Haven.

Various definitions of reproducibility and its sister concepts (replicability, replication) are floating around [Royal Netherlands Academy of Arts and Sciences 2018; Goodman et al. 2016]. While there are relevant (disciplinary) differences between them, they generally share a focus on the technical parts of reproducible research.

 

With the technical focus increasingly taking centre stage [e.g., Piccolo & Frampton, 2016], one might assume technical solutions are the panacea. Call it techno-optimism in the reproducibility debate. The thinking goes something like this: provided that institutions facilitate the relevant infrastructure and tools, researchers employed at those institutions will carry out reproducible research [Knowledge Exchange, 2024].

To be clear, making reproducible practices possible is a necessary step. But it is one of many [Nosek, 2019]. Now that you have enabled more reproducible practices, how are you going to ensure they are picked up by researchers?


Back in 2021, I defended my thesis on fostering a responsible research climate. One of the key barriers to responsible science that researchers flagged was a lack of support. Our participants were bogged down by inefficient support systems in which they were connected to support staff who were generalists and could not efficiently help them [Haven et al., 2020].

Many Dutch research-performing organisations nowadays employ data stewards, code experts, and product owners of various types of software that have been recently developed to promote reproducibility. These experts maintain software packages and they select the relevant infrastructure to support a reproducible data flow. They advise on which tools are suitable for a given project and ensure these are up to date. They implement new standards through workshops and training. And they do so much more. 

During the launch of the Netherlands Reproducibility Network last October, we heard of various disconnects at institutions. We learned about meticulously trained data stewards sitting around, waiting for researchers to find them. Those who found them returned, but that’s only partially good news. I am not aware of their exact reward mechanisms, but many organisations follow the flawed principle that if there are no requests for a solution, there is no problem to be bothered by, which is false for many different reasons*. Part of this disconnect may simply be a matter of time: culture change is a process that typically moves much more slowly than its driving forces would like.

In my personal experience, these professionals have been highly knowledgeable. I found reproducibility advocates who were able to help me draft a coherent data management plan for my funding proposals, advised on relevant metadata standards, wrote the piece of Python code to connect my data with existing databases, and finally connected me with yet other specialists who maintain the newly created data archiving infrastructure in The Netherlands.  

As a network, it is in our DNA to exchange information but also – crucially – contacts for professional purposes. That is why we, as the Netherlands Reproducibility Network, want to focus on promoting connections between researchers and research support staff. What kind of strategies are currently being used to connect these groups? How can we learn from successful efforts or institutions where these parties seamlessly find one another? 

And no, we don’t plan on falling into the same trap that the participants in my thesis research talked about. General, one-size-fits-all solutions likely won’t cut it. That is why we hope to facilitate ongoing pilots and launch new ones to investigate such connections. But as so often with these efforts, the best possible world is one in which these kinds of pilots are not necessary. So, to all my research colleagues: please find your fellow reproducibility advocates in your institution. Acknowledge their help in your research products. And make sure to share their valuable expertise with your lab members; we should honour the crucial human dimension of reproducible research.
 

PS. Do you know of any ongoing pilots? Get in touch!

* Just because no one has reported any social safety issues to the confidential advisor, does that mean there are none and we do not need confidential advisors?

References: 

Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12.

Royal Netherlands Academy of Arts and Sciences, KNAW. (2018). Replication studies – Improving reproducibility in the empirical sciences. 

Haven, T., Pasman, H. R., Widdershoven, G., Bouter, L., & Tijdink, J. (2020). Researchers’ Perceptions of a Responsible Research Climate: A Multi Focus Group Study. Science and engineering ethics, 26(6), 3017–3036. https://doi.org/10.1007/s11948-020-00256-8 

Piccolo, S. R., & Frampton, M. B. (2016). Tools and techniques for computational reproducibility. GigaScience, 5(1), 30. https://doi.org/10.1186/s13742-016-0135-4 

 

Replication in philosophy, or replicating data-free studies  

Blog post by Hans Van Eyghen, Member of the NLRN steering group

The replication crisis, which arose primarily in the biomedical and psychological sciences, was both a blessing for replications and somewhat of a curse. Its lasting impact lies in the recognition of the need for replicability. Replicability is now generally seen as a way to make the affected disciplines better and more robust, allowing for quality control and independent confirmation of findings. The minor curse inflicted by the replication crisis is that replicability is sometimes regarded as a specific solution to a specific problem: disciplines without a replication crisis would not need increased replicability, and any push for it may create more problems than it solves. Such a sentiment is especially at work in the humanities. The humanities did not go through a replication crisis, which has left some in the humanities with the idea that replication is a fix for others. Furthermore, pushing for increased replicability in the humanities would mean importing a problem and a methodology that are not theirs.


While there are profound differences between the humanities and other disciplines, the key reasons for increased replicability remain the same. Quality control refers to checking whether studies are well conducted; it is key to weeding out mistakes (intentional or not) and other reasons why a study is not up to standard. Corroboration refers to finding the same or similar conclusions when redoing the same (or a similar) study. There is no reason why studies in the humanities would not benefit from increased quality control or corroboration. An argument in favor of increased replicability, which allows for more quality control and corroboration, is therefore quickly found.

Nonetheless, some suggest reasons why increased replicability might not be feasible for the humanities. While the details vary, the reasons center around the idea that the humanities are simply too different. More than other disciplines, the humanities involve interpretation. The objects of the humanities are also not just quantifiable data, but qualitative, meaningful objects or subjects. Finally, a considerable number of studies in the humanities do not involve data, analysis thereof, or anything like it at all. In those cases, it is not at all clear what replication would involve.

I will focus here on the final argument, i.e. that some humanities studies do not have data and therefore have no need for replicability. Replicability usually involves being clear about the data used, how it was analyzed and how conclusions were drawn. A clear example of such ‘data-less’ studies is a priori reasoning in philosophy. A considerable number of studies in philosophy consist of reflection on arguments or questions like ‘Is knowledge justified true belief?’ or ‘Is morality objectively true?’. Attempts at answers do not rely on surveys (unless one is engaging in experimental philosophy) or empirically collected facts. Instead, philosophers tend to rely on a priori processes, like reflective equilibrium, conceptual analysis or others.

Photo by Alex Block on Unsplash

Does the lack of data exclude replicability or replication of such studies? It does not. Data-less studies can benefit from increased transparency and detail as well. As most beginning PhD students in philosophy know, it is often highly opaque how conclusions are drawn in a priori philosophy. Philosophers do tend to clearly define terms and meticulously write down arguments for why a conclusion is valid. More often than not, however, philosophers are not upfront about how they analyze their concepts. Some method like conceptual analysis is usually at work, but the exact method used is often not made explicit. Philosophers also tend not to be transparent about how they arrived at their conclusions, how they came up with examples, or why they started thinking about the topic in the first place. The answers to these questions may be quite trivial and uninformative, like ‘I was thinking hard for a long time’. In many cases, however, philosophers rely heavily on input during paper discussions, presentations at conferences and peer review. Often this input goes unacknowledged, or acknowledgment is limited to a single note in the final paper.

What would a replicable data-less study look like? Like other replicable studies, it would need a methodology section in which the researcher lays out the methods she used and why these methods are appropriate. Such a section allows (younger) researchers to reconstruct how the study was carried out and to do the same study again. The replicable paper would also include information on how examples were found and how conclusions were reached, as well as on how the research topic was altered over the course of the research and why this was deemed necessary.


Increased replicability of data-less studies would help make the discipline more open to newcomers and other disciplines. This can help avoid gatekeeping and make research more available. More importantly, it would help make studies better by allowing for quality control and corroboration, which remains the central goal of replication studies.  

Perspectives on Reproducibility – looking back at the NLRN launch symposium

On 27th October 2023, we welcomed more than 100 researchers, policy makers and research facilitators to our Launch event. The aim of the day was to exchange perspectives on reproducibility and to work towards prioritizing actions for NLRN for the coming year(s). 

During the morning, we heard how the UK Reproducibility Network propelled changes in the UK research landscape, and we discussed how we can learn from each other across disciplines to improve research transparency in our own fields. In the afternoon, participants followed workshops on the topics of education, infrastructure, community building and research practices. The results of those workshops were discussed in the closing panel discussion of the day, where participants and panel members suggested topics and actions for the NLRN to focus on.

Marcus Munafò giving the keynote lecture on collaborative approaches to improving research culture in practice

The symposium brought together a diverse set of researchers and stakeholders, diverse in terms of roles in the research process but also in terms of disciplines. It was important to take stock of the current reproducibility landscape in the Netherlands. We noted that most people were very familiar with the current state of their own field, but that an overview of the entire landscape was lacking. The NLRN can act as a connector, enabling communities to learn from the challenges and advances in seemingly distant fields.

The interactions during the plenary sessions and workshops showed how research domains differ in their challenges and in the current status of reproducibility. The workshop hosts were asked to work towards three focus areas or agenda points for the NLRN to work on. Concrete ideas included creating training materials for researchers on how to use existing digital infrastructure or how to make executable figures. The community building workshop suggested that the NLRN should coordinate national codecheck events. During the infrastructure workshop, participants saw a need for determining at what level research infrastructures should be organized (local, national, international), and for discussing how research outputs and processes differ between research areas, which in turn influences the required reproducibility infrastructure. Participants from the education workshop suggested lobbying for teaching reproducible research practices from the bachelor level onwards and showcasing existing efforts in teaching team science.

The steering group is now tasked with seeing which suggestions fit best with the overall goals of the network and how to prioritize them. We will select a few agenda points first, while also extending and growing the network.

Stay tuned! We will share our progress on this blog, in our newsletter and on our social media (LinkedIn and X). 

You can find the presentation slides on Zenodo and watch the keynote lecture back on our website.

Welcome to the NLRN Blog!

Hi there, welcome to our blog! We are currently setting up this blog and planning our first posts.

Within the next few weeks, you can expect a post about our launch event last month and about our first network partners. Sign up for our newsletter for general news and follow us on social media (on LinkedIn and X).

Group Picture of the Steering Group and all present Advisory Board members during the Launch of the NLRN on 27 October 2023