paper.md (+6 -2)
@@ -3,7 +3,9 @@ title: "Data Carpentry for Biologists: A semester long Data Carpentry course usi
tags:
- R
- ecology
+
- biology
- data manipulation
+
- data management
- programming
- databases
- geospatial data
@@ -53,18 +55,20 @@ bibliography: paper.bib
# Summary
-
Data Carpentry for Biologists is a semester-long course in best practices for storing, loading, manipulating, and visualizing data using R. The course material includes video demonstrations, lecture notes for live coding demonstrations, links to openly available reference readings, coding practice exercises, and the output expected from completed exercises. The course is structured in cohesive week-long sections combining sets of learning materials that address a single topic. The lessons and exercises focus on biological examples with a particular focus on ecological examples. The course material is designed to be used in two ways. First, it can be used in a self-paced online format for individual learners. This is achieved by having all of the necessary material to understand and complete the course present on the website along with instructions for self-guided learning. Second, the course is designed to be modified and remixed to be taught in college and university classrooms. This is achieved by a modular design that allows modifying all aspects of the course and by detailed documentation for course customization. The website is viewed by thousands of users each month and the material and infrastructure has been used in courses at multiple colleges and universities.
+
Data Carpentry for Biologists is a semester-long course in best practices for storing, loading, manipulating, and visualizing data using R. The course material includes video demonstrations, lecture notes for live coding demonstrations, links to openly available reference readings, coding practice exercises, and the output expected from completed exercises. The course is structured in topics that combine sets of learning materials covering a week of college level material on a single subject. The lessons and exercises focus on biological examples with a particular focus on ecological examples. The course material is designed to be used in two ways. First, it can be used in a self-paced online format for individual learners. This is achieved by having all of the necessary material to understand and complete the course present on the website along with instructions for self-guided learning. Second, the course is designed to be modified and remixed to be taught in college and university classrooms. This is achieved by a modular design that allows modifying all aspects of the course and by detailed documentation for course customization. The website is viewed by thousands of users each month and the material and infrastructure has been used in courses at multiple colleges and universities.
# Statement of Need
-
Being able to work within a computational environment to store, load, manipulate, analyze, and visualize data has become a key component of many areas of biology [@jones2006; @hampton2017]. Despite this, many biologists do not have access to formal training in computation and are expected to pick up these skills on their own [@williams2019]. When training is available it is often not focused on a particular research domain, creating significant barriers to novices learning these important skills [@teal2015]. This lack of training has significant costs for the progress of biology because researchers spend more time learning the necessary skills and often develop less optimal approaches to solving problems [@teal2015]. As a result, surveys have identified a major need for more computational training among biologists [@barone2017]. Part of the reason for a lack of domain specific courses at colleges and universities is a lack of faculty with either the time or expertise to develop a new course in scientific computing. For example, a recent survey identified lack of expertise, training, and time as important barriers to faculty for developing undergraduate training in bioinformatics [@williams2019].
+
Being able to work within a computational environment to store, load, manipulate, analyze, and visualize data has become a key component of many areas of biology [@jones2006; @hampton2017]. Despite this, many biologists lack access to formal training in computation and are expected to pick up these skills on their own [@williams2019]. When training is available it is often not focused on a particular research domain, creating significant barriers to novices learning these important skills [@teal2015]. This lack of training has significant costs for the progress of biology because researchers spend more time learning the necessary skills and often develop less optimal approaches to solving problems [@teal2015]. As a result, surveys have identified a major need for more computational training among biologists [@barone2017]. Part of the reason for a lack of domain specific courses at colleges and universities is a lack of faculty with either the time or expertise to develop a new course in scientific computing. For example, a recent survey identified lack of expertise, training, and time as important barriers to faculty for developing undergraduate training in bioinformatics [@williams2019].
The ‘Data Carpentry for Biologists course’ is designed to help overcome a variety of these training impediments. The materials can be used as is or modified by instructors developing a scientific computing course for either in-person or online courses. The website is also designed to be used by students as an independent self-guided course.
# Features
## General instructional design
+
The course is focused on teaching practical computational skills in a domain specific context that is directly applicable to ecologists and biologists more broadly. Many general computing courses start with the foundations of computer programming, but this can be demotivating for domain specialists who are learning computing to accomplish specific data management and analysis tasks and can also make learning more difficult since students learn by incorporating new ideas into their existing knowledge [@bada2015]. Therefore this course follows the broader Data Carpentry philosophy of focusing on teaching the skills students need using familiar data and common computational challenges encountered within their scientific domains [@teal2015]. To accomplish this the course uses a number of real ecological datasets and the coding demonstrations and exercises focus on common tasks in the analysis of biological data.
+

The course is built around the “I do, we do, you do” approach to teaching where first the instructor demonstrates how to do something, then the students work on an example with the instructor present to help and answer questions, and finally the students work on additional examples independently. This approach is based on explicit instruction principles, which leverage the benefits of active-learning without leaving students who are less comfortable with the material feeling lost [@rosenshine1987; @archer2010]. This approach, described as “a systematic method of teaching with emphasis on proceeding in small steps, checking for student understanding, and achieving active and successful participation by all students” [@rosenshine1987 p.34], is useful for teaching introductory computing to scientists because it gradually builds comfort and competence for all students in this essential foundation of research.
A standard lesson starts with a brief introduction to the concept being taught and why it is important. This is followed in the classroom by a live coding demonstration of the first piece of the material. During and following the demonstration students have the opportunity to ask questions. The class is then assigned a short exercise designed to reinforce and check understanding on the material. While working on the exercise students ask additional questions and the instructor looks for students who are struggling with the concept and engages them to help work through whatever challenge they are facing. The next small chunk of material is then presented. Additional exercises are provided for the students to work on outside of class to further reinforce the material and help the students engage with more complex and integrated approaches to what they are learning.