Best practices for building a reproducible workflow

There are many resources and tools available for making our science more open and reproducible. However, figuring out how to incorporate each of them into your daily workflow can be overwhelming!

Below we have outlined some of the tools we think are most useful to learn or become acquainted with, organized by the stage of your workflow at which you would want to start incorporating them. For many of these tools, we have linked handbook pages with tips or tutorials to help you get started.

Another great way to get started with several of these tools is by following along with our Pygers Workshop recordings and notes.

Setting up your computing environment

  1. Tips for using the command line (tmux, vim)

  2. Conda environments

  3. Git and GitHub

  4. Get started programming with Python (Jupyter notebooks, IPython debugger)

  5. Using Slurm

Data analysis

  1. fMRI analysis software packages (AFNI, FSL)

  2. Python packages (Nilearn, scikit-learn, NiBabel, BrainIAK)

  3. BrainIAK tutorials

Publishing and data sharing

  1. Overleaf

  2. Code Ocean

Tips for using the command line

The sooner you become comfortable using the command line the better! Here are some helpful tips for using the command line to navigate the PNI server.

tmux: When working on a remote server (like scotty at PNI), you can use tmux to create persistent remote sessions. If you get disconnected from the remote server, the tmux session will keep running. You can attach to and detach from these remote sessions, and create multiple windows or panes in a given session. You control tmux using key combinations: first type a prefix key combination (by default Ctrl + b), then an additional command key. See our tmux tip page for a tutorial on using tmux.

Vim:

Conda environments

Conda is a package manager, similar to brew, apt, or pip. This package manager keeps track of your Python installation, versions, and dependencies in an encapsulated environment that can be easily shared or reproduced. This is helpful if you have multiple projects that require different (potentially conflicting) software versions. You can easily switch between different conda environments.

Best practice recommendation: Set up a new conda environment for each of your projects, and always use that conda environment when working on that project. This way you can ensure your software versions are consistent within a project and you can easily report which specific versions you used.

See our conda tip page for instructions to set up a “pygers” conda environment with our recommended packages installed.
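
To make reporting the specific versions you used more concrete, here is a minimal Python sketch that records the Python version and the installed versions of a few packages in whichever environment is currently active. The function name report_versions and the package list are hypothetical and only for illustration; swap in the packages your own project depends on.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed in the active environment map to None.
    """
    # Record the interpreter version first, e.g. "3.10.4".
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical package list; replace it with the packages your project uses.
    for name, version in report_versions(["numpy", "nibabel", "nilearn"]).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Running this script inside a project's conda environment prints one line per package, which you could paste into a methods section or save alongside your analysis outputs.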

Git and GitHub

See our git tip page for a tutorial on using Git and GitHub.

Get started programming with Python

Jupyter notebooks:

See our Jupyter notebook tip page for a tutorial on using Jupyter notebook.

IPython debugger:

See our ipdb tip page for a tutorial on using the IPython debugger.

Using Slurm

See our Slurm tip page for a tutorial on using Slurm.

Preregister your study

Add description

Open Science Framework.

Design your scan sequences

Add description

Choosing your acquisition parameters.

Set up your program card using ReproIn

Add description

Using ReproIn.

Set up your protocol

Protocols.io.

Convert your raw data to BIDS

Add description

Using HeuDiConv.

Data quality assurance

Add description

Using MRIQC.

Preprocessing

Add description

Using fMRIPrep.

Data version control

Add description

Using DataLad.

Data visualization

Add description

fMRI analysis software packages

Add description

Python packages

Add description

BrainIAK tutorials

Add description

Link to BrainIAK tutorials

Overleaf

Add description

Using Overleaf at Princeton.

Code Ocean

Add description

Using Code Ocean.
