Welcome to the DIME Analytics training session on implementing a project workflow in GitHub. This interactive training is designed to help participants learn essential GitHub skills for version control and collaboration. You can read more about DIME Analytics Git and GitHub trainings here.
This project is designed to be led by an instructor while you follow along. You can work in either R or Stata: the repository contains a folder for each. Choose the one you prefer.
This project aims to teach participants how to use GitHub effectively through practical exercises covering:
- Creating a repository
- Cloning the repository
- Setting up a folder structure
- Creating branches
- Using a main script
- Creating a comprehensive README file
This repository is part of the "Integrating GitHub into Your Project Workflow: Best Practices and Hands-On Exercises" training. It includes key elements to help you establish a well-structured data project. Follow along and make modifications as we progress through the training.
Note: For this training, the repository has already been created. The steps below are for reference only.
- Go to GitHub and log in to your account.
- Click on the "New" button to create a new repository.
- Enter a name for your repository.
- Set the repository visibility to private if necessary.
- Click "Create repository".
- Go to the GitHub repository: GitHub-MockProject-jul22.
- Click on the green "Code" button and select "Open with GitHub Desktop".
- Follow the prompts to clone the repository to your local machine.
- Download the mock data from the provided link.
- Save the data files to the desired location on your local machine.
- See the two roots of your project: `data/` and `code/`.
- Arrange the folder structure intuitively as follows. Note: Again, this structure has already been set up for you, but it serves as a reference on good practices for your own projects.

code/
├── cleaning/
├── analysis/
├── visualization/
data/
├── raw/
├── intermediate/
├── analysis/
outputs/ - Folder for your outputs, if relevant.
README.md - Project documentation.
.gitignore - Specify files and folders to ignore in Git.
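For reference, the layout above could also be created from a terminal. This is a minimal sketch, not part of the training itself: it builds the folders in a temporary directory (the `project` variable is a hypothetical location) so it does not touch your real project.

```shell
#!/bin/sh
set -eu

# Hypothetical project location: a temporary folder, used only for this sketch.
project=$(mktemp -d)

# Create every folder from the structure above.
for dir in code/cleaning code/analysis code/visualization \
           data/raw data/intermediate data/analysis outputs; do
    mkdir -p "$project/$dir"
done

# Empty placeholders for the two top-level files.
touch "$project/README.md" "$project/.gitignore"

# List what was created at the top level (including hidden files).
ls -A "$project"
```

Note that Git does not track empty folders, so these directories only show up in the repository once they contain at least one file.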
- As we will be working collaboratively, create a branch named `workflow_` followed by your initials.
- Switch to that branch to start making changes to the project.
- In this training we will only work on one branch (each participant on their own branch). For your future projects, follow the principle: branch often, merge often. Create a branch for each task and merge it back into the main branch by creating a Pull Request (PR).
- After you click "New Branch", a pop-up will appear.
- After you create a branch, GitHub Desktop will switch you to it, but you can also move between branches at any time.
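The training uses GitHub Desktop, but the same branch steps can be sketched with command-line Git for reference. This is a minimal sketch, assuming Git is installed: it works in a throwaway repository, and the initials `ab` and the commit identity are placeholders, not values from the training.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"

# One initial commit, so that main exists before we branch off it.
echo "# Mock project" > README.md
git add README.md
git -c user.name="Participant" -c user.email="participant@example.com" \
    commit -q -m "Initial commit"

# Create the personal branch and switch to it in one step.
initials="ab"                           # placeholder initials
branch="workflow_${initials}"
git checkout -q -b "$branch"

# Moving between branches works the same way, without -b.
git checkout -q main
git checkout -q "$branch"

current_branch=$(git symbolic-ref --short HEAD)
echo "Now on: $current_branch"
```

Before making any change, it is worth checking the current branch (here with `git symbolic-ref --short HEAD`; in GitHub Desktop, with the branch selector) so that edits do not land on main by accident.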
Here you will have the option to work either with R or Stata.
- Open the `main.do`/`main.R` file in the mock project folder (make sure you are in your own branch).
**A note for R users**
If you are using R, open `main.R` from the blue box there. This links the code to its exact location on your computer, which is the recommended practice in R (rather than the well-known but discouraged `setwd`, which breaks code portability and can cause all sorts of headaches).
- Make the necessary modifications in the `main.do`/`main.R` file to match your project structure:
  - Add the paths to match your computer and structure.
  - See how the global paths are set dynamically.
  - If you are working in R, the R project file (.Rproj) will set your working directory properly.
- Use GitHub Desktop to commit your changes.
- Push/Publish your changes to GitHub.
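For reference, committing and pushing can also be sketched with command-line Git. This is only an illustration under stated assumptions: a local bare repository stands in for the GitHub remote so the example runs offline, and the branch name `workflow_ab`, the file contents, and the commit identity are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# A local bare repository stands in for the remote on GitHub (assumption).
git init -q --bare "$work/remote.git"

# A fresh working repository on a personal branch.
git init -q "$work/project"
cd "$work/project"
git remote add origin "$work/remote.git"
git checkout -q -b workflow_ab

# Stand-in for the edited main script.
echo 'message("paths set")' > main.R

# Stage the change, then commit it with a descriptive message.
git add main.R
git -c user.name="Participant" -c user.email="participant@example.com" \
    commit -q -m "Update paths in main script"

# Publish the branch; -u links the local branch to the remote one.
git push -q -u origin workflow_ab

# List the branches that now exist on the remote.
remote_branches=$(git ls-remote --heads origin)
echo "$remote_branches"
```

In GitHub Desktop, the "Publish branch" button does the work of the first `git push -u`; later pushes of the same branch only need "Push origin".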
- Open the `README.md` file in the repository, or the template linked here. The README:
  - Provides a summary of the project's purpose and objectives.
  - Includes setup instructions, key decisions, and usage instructions.
- Open the `.gitignore` file in the repository. This file:
  - Prevents tracking of sensitive or unnecessary files.
  - Keeps the repository clean and focused.
  - Avoids conflicts from environment-specific files.
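To make these points concrete, here is a minimal sketch of a `.gitignore` and a check of which files it excludes. The patterns are assumptions chosen for illustration (they are not copied from the training repository), and the file names are hypothetical; adapt both to your own project.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Illustrative patterns only (assumptions, not the training's actual file).
cat > .gitignore <<'EOF'
# Sensitive or large data stays out of version control
data/
# Environment-specific R files
.Rhistory
.RData
.Rproj.user/
# Stata log files
*.log
# Operating system files
.DS_Store
EOF

# Hypothetical project files to test the patterns against.
mkdir -p data/raw code/cleaning
touch data/raw/survey.csv code/cleaning/clean.R .Rhistory

# Print the paths that Git would ignore; code files should not appear.
ignored=$(git check-ignore data/raw/survey.csv .Rhistory code/cleaning/clean.R || true)
echo "$ignored"
```

`git check-ignore` is a quick way to confirm that a sensitive file is excluded before you commit. A pattern added after a file has already been committed does not remove that file from the repository's history.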