Advancing reproducibility and Open Science one workshop at a time – community-building in the Netherlands

This is a crosspost from the CODECHECK website.

This blog post is the final one in our series of posts about the “CODECHECKing goes NL” project, which has been running since May 2024. We have been working hard to promote reproducibility and Open Science in the Netherlands through a series of workshops, and we are excited to share our experiences and future plans with you.

Find the full series of posts on our project page.

Why is this important?

A paper’s code rarely gets checked – everyone in academia knows about peer reviewing articles, but few people engage in reproducibility checks of the materials behind the paper. Reproducibility checks are less about vetting research (e.g., catching fraud, finding errors) and more about ensuring the reusability of research. It is an extension of the thought that if we want to stand on the shoulders of giants, those giants had better be standing on solid ground. And solid ground for computational workflows means good documentation that is understandable outside of the inner circle of the authors of a research article.

Reproducibility Checks at the Center for Digital Humanities at Leiden University on 14th February 2025.

A reproducibility check asks whether one can reproduce the results reported in a paper (i.e., the statistics, tables, figures, maps, or graphs) from the provided data, code, and other materials. The CHECK-NL project focuses on the computational reproducibility of published research and tries to answer the question: “Can someone else recreate the output on their hardware using the materials and documentation provided by the authors?” We call this type of reproduction a CODECHECK.
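To make this concrete, here is a minimal sketch of the mechanical core of such a check, assuming a hypothetical Python-based workflow in which analysis.py, figure1.png, and table1.csv stand in for the authors’ actual materials. A real CODECHECK is a human-led process that ends in a certificate, not an automated script:

    # Illustrative sketch only: file names below are hypothetical stand-ins
    # for the materials an author would actually provide.
    import subprocess
    from pathlib import Path

    # Step 1: run the authors' analysis exactly as documented in their README.
    subprocess.run(["python", "analysis.py"], check=True)

    # Step 2: verify that the outputs reported in the paper were regenerated.
    for name in ["figure1.png", "table1.csv"]:
        assert Path(name).exists(), f"missing output: {name}"
        print(f"Regenerated {name}; now compare it against the paper.")

The codechecker then manually compares the regenerated figures and tables against those in the manuscript and records the outcome in the certificate.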

Who did what?

A group of enthusiasts for Open Science and Reproducible Research from the University of Twente, TU Delft, and UMCG Groningen applied for funding from NWO via its Open Science funding scheme to organize four in-person events at their respective institutions and beyond. Through these events, they intended to jump-start a Dutch reproducibility checking community. The project proposal also included work on the CODECHECK website and registry to better present codechecks and the codecheckers behind them.

Along the way, the group of enthusiasts grew, and instead of the planned four events there were a total of six in-person events: one additional conference workshop (AGILE in Dresden) and another event (at TU Eindhoven) organized by attendees of the first event – exactly what this project was aiming for! At the events, we also connected with representatives of a data repository, diamond open access publishers, and digital competence centers who are considering their own versions of computational reproducibility checks.

The four events in Delft, Enschede, Rotterdam, and Leiden brought in a total of 40 researchers, many of whom opened up their own work to be assessed by others, and who together codechecked 15 papers. The additional events in Eindhoven and Dresden introduced an international crowd to the CODECHECK principles. Each event had a different topic and focused on a different part of the research landscape, which resulted in different challenges and learning opportunities. While the groups in Delft and Enschede mainly faced problems with computing environments, documentation, and high computational loads (too big for laptops or the workshop time), the group in Rotterdam raised the issue that reproducibility checks can be pretty dry at their core and may be almost trivial if only heavily summarized data can be shared. At the final event in Leiden, we brought linguists and digital humanists together. One of the questions raised was: how do we start a reproducibility crisis in the humanities? (Because maybe we need one to raise awareness about this important topic in the field?)
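Undocumented computing environments were among the most common stumbling blocks, so here is one possible mitigation, sketched in Python under the assumption of a Python-based workflow (the package names are illustrative, not a prescription): record the versions of the interpreter and key packages right next to the code.

    # Illustrative sketch: snapshot interpreter and package versions so that a
    # codechecker can recreate the computing environment. The package names
    # are hypothetical examples, not an endorsed list.
    import platform
    from importlib.metadata import version, PackageNotFoundError

    env_lines = [f"python {platform.python_version()}"]
    for pkg in ("numpy", "pandas", "matplotlib"):
        try:
            env_lines.append(f"{pkg} {version(pkg)}")
        except PackageNotFoundError:
            env_lines.append(f"{pkg} (not installed)")

    # Write the snapshot next to the analysis code.
    with open("ENVIRONMENT.txt", "w") as fh:
        fh.write("\n".join(env_lines) + "\n")

Dedicated tools (e.g., lock files or containers) do this more thoroughly, but even a simple snapshot like this makes it much easier for someone else to recreate the environment.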

What are the results? What did we learn?

One clear lesson was how different the crowds from different disciplines are – although the advertisement for the events and their setup and schedule were quite similar, they played out quite differently. Another important lesson is that you need a group of enthusiastic participants to drive such events – fortunately, we always had those!

There were people with a wide range of coding skills at the events. The wrap-up sessions always gave us the impression that all of them took something home and learned something. Working with someone else’s code and reproducing another researcher’s workflow requires craftsmanship and a hands-on, can-do attitude that is rarely taught in typical university classes. The workshops, with their experienced mentors, provided exactly such a setting.

The four main in-person events required attendees to invest an entire workday in this topic. In retrospect, this might have prevented interested people from joining. For raising awareness, shorter, more targeted events might be a suitable alternative.

Getting the certificates was a nice by-product but certainly not the only outcome. Authors whose projects didn’t pass the reproducibility check were given feedback so that they can still make their work reproducible. Participants got the chance to learn from other people’s workflows and software stacks.

Another surprise was how difficult it still is to convince colleagues to submit their work to a reproducibility check. The social layer of this otherwise rather technical question is the biggest challenge for the project team and for people working with reproducibility checks. The technological challenges are less exciting than the positive experiences and potential benefits; see, e.g., this blog post about an author’s experience of being “codechecked”.

From discussions, we distilled the notion that the best time to get a reproducibility check is at the preprint stage or during peer review – then people are still motivated to fix issues before publication. A certificate is also a positive signal towards peer reviewers (at least, that’s what we hope). If published work gets checked, authors need to be very motivated to improve documentation or fix bugs, certainly if those are hidden at some deeper level of the code.

Concrete outcomes:

What are the next steps?

The CODECHECK, or reproducibility check, community in the Netherlands is growing. We met with the wider community to evaluate the project and make new plans. 4TU.ResearchData is planning to offer codechecks as part of its services as a data repository and is working closely with the four technical universities.

The community in the Netherlands will continue to meet and work on topics such as reproducibility checks as a service, reproducibility checks as part of teaching curricula, and the academic culture around code checking. Internationally, we have reached out to colleagues in Bristol and Cologne.

Preregistration for Student Assignments

How can we integrate open science practices into our curricula? In this webinar, two lecturers told us how they are including preregistrations in their students’ curricula.

“When you preregister your research, you’re simply specifying your research plan in advance of your study and submitting it to a registry.” (Quote from the website of the Open Science Framework)

Ewout Meijer started the webinar and told us how he convinced thesis coordinators to include preregistration in their courses. As an Open Science Ambassador for his faculty, Ewout was backed by the dean to find out where open science practices could be integrated into the various courses. Preregistrations are just one example of those practices.

He shared several tips on how he convinced colleagues to integrate open science practices:

  • Make it easy for your colleagues – minimize the extra workload by sharing templates and offering introductory lecture materials
  • Make them want it – top-down mandates to include open science practices don’t work well, but if your colleagues are convinced that this is the right thing to do, they will follow
  • Don’t be overambitious – first the science, then open science. Finding the balance between what students need to know in terms of scientific content and what they need to know about the scientific system is difficult and will differ between courses and student populations.
  • Searching for “project proposals” as a required component for students to pass a course is a good way to find courses where preregistration can be taught. You just need to replace the proposal with a preregistration.
  • Including new (open science) content means kicking out some other content – see what can be replaced and what can be tossed

At Maastricht University, students are asked to use the AsPredicted template and submit it as a PDF (i.e., they don’t upload it to aspredicted.org). Ewout mentioned that not all internship projects are suited for this format, so students might have to adjust it or come up with a project just to fill in the template and pass this grading component.

Students get exposed to the idea of preregistration, and the same goes for workgroup tutors. Tutors come from a wide range of research groups and are learning about preregistrations themselves while helping students with their thesis work.

Elen Le Foll asked her seminar students to preregister their term paper analyses. Adding this component required some extra time investment to make sure students understood what was expected of them, and for extra feedback rounds. The preregistration adds at least one round of feedback to the term paper and requires students to plan ahead and submit their preregistration on time, so that enough time is left to incorporate the feedback into their final data analysis. On the positive side, students can learn from the feedback and include it in their work. For normal term papers, students only get feedback at the end, when they no longer need to use it and can no longer improve their work.

As Elen’s course is an advanced course for master’s students, some of her students want to turn their term paper into a research paper. For them, the preregistration is an excellent way to get a timestamp for their analysis.

In the discussion with Andrea and other attendees, we explored how the AsPredicted format can be used by students and whether a full registered report might be even more suitable. We briefly touched upon the difficulty of grading preregistrations and how much detail we should ask of students. Another point of discussion was how we can sell preregistration to students who are not interested in becoming researchers. This led to a discussion on how to balance academic training with content and applications outside of academia.

Thanks to our presenters:

Elen Le Foll is a post-doctoral researcher and lecturer in linguistics at the Department of Romance Studies at the University of Cologne. She likes to integrate Open Science principles and practices in her seminars and recently asked her students to preregister a study as part of a term paper assignment.

Ewout Meijer works at Maastricht University and coordinates the thesis module for the research master in psychology. He introduced preregistrations for thesis projects. 

Useful Links:

AsPredicted: aspredicted.org

OSF Preregistration Templates: www.cos.io/initiatives/prereg

A LEGO® Metadata Challenge Workshop 

Blog post by DCC Groningen and DCC UMCG

In this workshop, participants experienced reproducibility in action through an experiment using LEGO®. In the first part of the session, each group created a structure using LEGO® bricks and prepared accompanying documentation and instructions on how to recreate it. Afterwards, another group attempted to recreate the structure based on the provided documentation.
The exercise sparked lively discussions about the nuances of metadata and different documentation styles, the challenges of interpreting and applying them, and the critical role they play in ensuring that research can be reliably reproduced. One of the interesting conclusions was that keeping reproducibility in mind not only influences the approach to documentation and metadata but also shapes the design of the structure (i.e., the research project) to ensure its reproducibility.

Feedback from the participants was overwhelmingly positive. Many appreciated the innovative approach, noting that the hands-on activity helped solidify their understanding of the importance of proper documentation and the practical application of metadata. The workshops also fostered a sense of community and collaboration, as participants shared their experiences and insights.

Spot the difference 

Can you spot the difference (or lack thereof) between some of the original creations from our participants and the reconstructed versions from another team?  

Reproducible materials  

In line with the reproducibility theme, this workshop used materials from the University of Glasgow [1], and participants were encouraged to bring this workshop to their own networks.

We are grateful to the Netherlands Reproducibility Network for providing us with the opportunity to share our knowledge and engage with such a passionate group of individuals. We look forward to continuing our efforts to promote research reproducibility and hope to bring more creative and interactive workshops to future events! 

———————————————————

On December 6th, 2024, the Digital Competence Center (DCC) of the University of Groningen and the DCC of the University Medical Center Groningen had the pleasure of attending the Netherlands Reproducibility Network Symposium. This event brought together researchers, data managers, and enthusiasts from all over the country, all dedicated to improving the reproducibility of scientific research. As part of the symposium, the two DCCs collaboratively presented two engaging sessions of a workshop titled “A LEGO® Metadata Challenge.”

[1] Donaldson, M. and Mahon, M. (2019) LEGO® Metadata for Reproducibility game pack. Documentation. University of Glasgow. doi: 10.36399/gla.pubs.196477.