Data Carpentry for Biologists: Introductory data science resources and course website

Zack Brym edited this page Nov 10, 2016 · 11 revisions

Fork our course: A semester-long Data Carpentry course for biologists

This post is co-authored by Zack Brym and Ethan White.

Over the last year and a half we have been actively developing a semester-long Data Carpentry course designed to be easily customized and integrated into existing graduate and undergraduate curricula.

Data Carpentry for Biologists contains course materials for teaching scientists how to work better with data. The course introduces best practices for data management and databasing, data manipulation and analysis, and data visualization. It covers the same general types of material as a two-day Data Carpentry workshop, but expands the materials and opportunities for practice around a full-length university course. The teaching material uses R and SQLite, with some corresponding materials for Python as well. The examples and exercises focus on biological questions and working with real data. The course emphasizes using best practices to produce reusable and reproducible data analysis.

Active-learning Teaching Materials

Computing is best learned through practice, by actively working through programming problems. Diving straight into computing is challenging for most scientists, so instruction combines short live-coding introductions to concepts, each followed immediately by students working on a related exercise. Additional exercises are assigned later for practice. This follows the "I do", "We do", "You do" approach to teaching, which leverages the benefits of active learning and flipped classrooms without leaving students who are less comfortable with the material feeling lost. The bulk of class time is spent working on assigned exercises, with the instructor moving around the room, guiding students through things they don't understand and engaging with students who are thinking about advanced applications of what they've learned.

This approach is the result of lots of reading about effective teaching methods and Ethan's experience teaching this and related courses over the last six years at Utah State University and the University of Florida. It seems to work well both for students who pick up the material easily and for those who find it more challenging. We've also tried to make these materials as useful as possible for self-guided students. The course website and materials are designed to make information easy to find and understand, with relatable and engaging examples for a broad range of biologists.

Open course development

Software Carpentry and Data Carpentry have shown how powerful collaborative lesson development can be, and we're interested in bringing that to the university classroom. We have designed the course materials to be modular and easy to modify, and the course website to be easy to clone and set up. All of the teaching materials and associated website files are openly available at the Data Carpentry for Biologists repository on GitHub under CC-BY and MIT licenses. The course materials are all written in Markdown and everything runs on Jekyll through GitHub Pages. Making your own version of the course should take less than an hour. We've developed documentation for how to create your own version of the course and how to contribute to development. Exercises and assignments are modular, and changing them simply involves reordering items in a list. Adding a new exercise involves creating a new Markdown file and then adding its title to the list of exercises for an assignment.
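As a rough illustration of that workflow, the front matter of an assignment page on a Jekyll site might look something like the sketch below. The key names (`title`, `exercises`) and the exercise titles are hypothetical placeholders chosen for this example, not necessarily what the course repository uses; the repository's own documentation describes the actual format.

```yaml
---
# Hypothetical assignment front matter; key names are assumptions for illustration.
title: 'Example Assignment'
# Reordering these titles would reorder the exercises shown for the assignment.
# A newly written exercise (its own Markdown file) is added by appending its title.
exercises: ['Example Exercise One', 'Example Exercise Two', 'New Exercise Title']
---
```

Under this assumed layout, swapping two titles in the `exercises` list is all it takes to change the order of an assignment.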

Get Involved

If you teach, or want to teach, a course like this, we'd love to get you involved. Here are some useful links for getting started.

We want to be sure getting involved is as easy as possible. We've worked hard to provide documentation and help resources for students and instructors. Students can find all they need to know at our student start guide. Instructors have access to course content and site design documentation.

If you're having trouble finding something or getting something to work, or simply have some feedback about the course, please open a new issue on GitHub or send us an email.