Hands-on reproducible analysis of neuroimaging data: Nov. 2-3, UCSD

These lectures and hands-on exercises are part of the training curriculum from the SfN 2018 training course run by the ReproNim (Reproducible Neuroimaging) Center. The materials selected here are tailored for this course and cover only some of its sections, fitted into a full-day but still very compressed schedule. Please visit ReproNim: Teach for more materials, which we also reference within specific lessons here.

Introduction

The term “reproducibility” conjures a mental image of dedicated systems conducting automated and repeatable computations. However, you can embrace reproducibility as a principle to apply to your day-to-day research activities. Neuroimaging is a heavily data- and software-driven field of science. As a result, by learning more tricks and techniques for the tools that you already use daily, you will discover ways not only to improve efficiency but also to increase the reproducibility of your research.

To some degree, reproducibility requires knowing what analysis was carried out, when, and how. The lessons in this module therefore focus on helping to answer those questions. They start with how the “black box” of the shell can provide a valuable record of your activities. They then move on to complete computational environments, in which the version and origin of each component is either exactly prescribed or can at least be identified. They end with an entire (simple but complete) data analysis from raw data that maintains complete and unambiguous provenance of all actions and access to all components of the study (code, data, computational environments).
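As a rough illustration of recording the "what, when, and how" of a single analysis step, here is a minimal Python sketch. It is not taken from the course materials: the function name `run_and_record` and the layout of the record are hypothetical, and it captures far less than the tools covered in the lessons.

```python
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone


def run_and_record(command):
    """Run a command and return a small record of what ran, when, and in which environment.

    Hypothetical helper for illustration only; the record layout is an assumption.
    """
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "what": command,                      # the exact command line that was executed
        "when": started,                      # UTC timestamp taken just before the run
        "how": {                              # a very coarse description of the environment
            "python": platform.python_version(),
            "platform": platform.platform(),
        },
        "exit_code": result.returncode,
        "stdout": result.stdout,
    }


if __name__ == "__main__":
    # Example: record a trivial command run with the current Python interpreter.
    record = run_and_record([sys.executable, "-c", "print('hello')"])
    print(json.dumps(record, indent=2))
```

Saving such a record next to the outputs it describes would make it possible to answer later which command produced a file and under which software versions, although only at the coarse level captured here.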

Schedule

08:30 Introduction
09:00 Reproducibility Basics
09:00 Shell: Getting around the “black box”
Why and how does using the command line/shell efficiently increase reproducibility of neuroimaging studies?
How can we ensure that our scripts do the right thing?
09:25 (Neuro)Debian/Git/GitAnnex/DataLad: Distributions and Version Control
What are the best ways to obtain and track information about software, code, and data used or produced in the study?
10:10 ReproEnv: Virtual machines/Containers, Neurodocker
How to encapsulate complete computational environments into redistributable/reusable containers?
10:45 Break
Where to find a bathroom?
11:00 FAIR Data
11:00 Overview: Data and the FAIR Principles
What is FAIR?
How does FAIR apply to me?
Towards FAIR neuroimaging data.
Overview of BIDS and data standardization
12:45 Lunch
What and where?
14:00 Data Processing
14:00 Neuroimaging Workflows
Principles of re-executable processing
14:20 ReproIn/DataLad: A complete portable and reproducible fMRI study from scratch
How to implement a basic neuroimaging study with complete and unambiguous provenance tracking of all actions?
14:50 ReproIn/DataLad: A Reproducible GLM Demo Analysis
How to implement a basic GLM demo analysis with complete and unambiguous provenance tracking of all actions?
15:25 Break
Where to find a bathroom?
15:40 Statistics
15:40 An introduction to the Statistics in reproducibility module
Who is this module for?
How can I get some help if I get stuck on an exercise or a question?
How can I validate this module?
16:55 Finish