Michael Aydinbas edited this page Apr 16, 2023 · 1 revision

Get the code

Clone the code repository with git rather than downloading it as a zip archive. That way you a) get comfortable using git, b) can easily pull the latest version, and c) can easily contribute your changes.

git comes pre-installed on many systems. If you don't already have it, follow the official documentation to install it.
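A quick way to find out whether git is already installed is to ask the shell for it. The following is a minimal sketch for a POSIX shell; the variable name git_status is only illustrative.

```shell
#!/bin/sh
# Check whether a git executable is on the PATH.
if command -v git >/dev/null 2>&1; then
    # git --version prints the installed version string.
    git_status="installed ($(git --version))"
else
    git_status="not installed"
fi
echo "git is $git_status"
```

If the script reports that git is not installed, install it first and then run the check again.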

via command line

Once you have git installed, start your preferred command line tool or terminal. If you are on Windows, installing Git also gives you Git Bash, a console program that is well suited for this task.

There are many good resources on getting started with git, for example the one from earthdatascience.

In the end, run git clone https://github.com/CorrelAidSwitzerland/a4d.git if you want to authenticate over HTTPS with your username and a personal access token (GitHub no longer accepts account passwords for git operations), or git clone git@github.com:CorrelAidSwitzerland/a4d.git if you are comfortable using an SSH key.
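The two clone variants can be sketched as a small shell helper. This is only an illustration: a4d_url is a hypothetical function name, and the SSH address is written in GitHub's usual git@github.com: form.

```shell
#!/bin/sh
# a4d_url is a hypothetical helper that prints the clone URL
# for the requested protocol ("https" by default, or "ssh").
a4d_url() {
    case "${1:-https}" in
        https) echo "https://github.com/CorrelAidSwitzerland/a4d.git" ;;
        ssh)   echo "git@github.com:CorrelAidSwitzerland/a4d.git" ;;
        *)     echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}

# To clone, uncomment ONE of the following lines:
# git clone "$(a4d_url https)"
# git clone "$(a4d_url ssh)"

# Later, to fetch the latest version, run this inside the cloned directory:
# git pull
```

Either variant creates a directory named a4d in the current working directory.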

via GUI

If you prefer not to work on the command line, you can use one of the many Git GUIs, for example Sourcetree. It is free, but it requires you to create an Atlassian account.

Get the data

You should have access to Google Cloud Platform (GCP) by now.

  1. Open the Cloud Storage service (either via Quick Access from your dashboard or by searching for it)
  2. Open the bucket a4d-315220-dataimport01-zrh-2021
  3. Download the data you want by first selecting it and then hitting the download button.

Make sure that you download the data only to an encrypted drive, as required by the Volunteer Agreement you signed.
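If you prefer the command line over the web console, the same download can be sketched with gsutil from the Google Cloud SDK. This is not part of the documented workflow and rests on several assumptions: the SDK is installed, you are already authenticated with an account that has access to the bucket, and A4D_DATA_DIR is a hypothetical environment variable pointing to a folder on your encrypted drive. The helper names are illustrative as well.

```shell
#!/bin/sh
# Bucket name as given in the steps above.
bucket="gs://a4d-315220-dataimport01-zrh-2021"

# ASSUMPTION: A4D_DATA_DIR is a hypothetical environment variable that points
# to a folder on your encrypted drive; the fallback path is only a placeholder.
target_dir="${A4D_DATA_DIR:-/path/to/encrypted/drive/a4d-data}"

# List the top level of the bucket to see what is available.
list_a4d_data() {
    gsutil ls "$bucket"
}

# Copy one object or folder from the bucket to the encrypted drive.
# $1 is the path inside the bucket, as shown by list_a4d_data.
download_a4d_data() {
    if [ -z "$1" ]; then
        echo "usage: download_a4d_data <path-inside-bucket>" >&2
        return 1
    fi
    mkdir -p "$target_dir" && gsutil cp -r "$bucket/$1" "$target_dir/"
}
```

Call list_a4d_data first, then pass one of the listed paths (without the gs://bucket prefix) to download_a4d_data.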

Run the code

By now you should have both the code and the data somewhere locally on your computer. Now it is time to bring everything together.

For this, you need to install both R and RStudio. Although the latter is not a hard requirement (it is an IDE, so the choice is up to you), it is easier if we all use the same environment.

Important: Whatever IDE you choose, make sure that you configure it to not store ANY data, variables, or session objects outside your encrypted drive. To do so in RStudio, open the RStudio settings and apply the following:

  • Under General > Basic, in the Workspace section, uncheck "Restore .RData into workspace at startup" and set "Save workspace to .RData on exit" to "Never".
  • You can leave the option "Always save history (even when not saving .RData)" checked. This gives you a history of your previous commands, which is quite helpful.
(Screenshot: RStudio settings)

Once you have configured your IDE properly, you can start a new Project (a feature specific to RStudio).

  1. Go to File -> New Project
  2. Choose "Existing Directory"
  3. Browse to the cloned git repository
  4. Click "Create Project"

This opens a new session with the familiar RStudio layout, but the Files pane now shows the root directory of the repository. Because it is a Git project, you can activate the Git tab in the upper right corner and, for example, browse the History. You can now adjust and customize the layout of RStudio. The layout is saved per project, so your changes apply only to this project.

To actually run the code, do the following:

  1. Open the 3_Code directory in the Files pane
  2. Click on the script 10_Run_a4d_tracker_extraction.R
  3. Change line 9 so that example_tracker_path points to one of the data files you downloaded from Cloud Storage to your encrypted drive
  4. Execute the whole file by choosing "Run All" or execute each line by choosing "Run Selected Line(s)"
  5. The code should run through without errors and you should be able to inspect the created objects like df_cleaned (line 41) in the Environment pane.
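The steps above use RStudio. As a rough alternative, the script could also be started from a terminal with Rscript, which ships with R. Whether the script runs cleanly outside RStudio is an assumption, and run_extraction is a hypothetical helper that is not part of the repository. You still need to edit example_tracker_path in the script beforehand.

```shell
#!/bin/sh
# run_extraction is a hypothetical helper, not part of the repository.
# $1: path to the cloned a4d repository (defaults to the current directory).
run_extraction() {
    repo_dir="${1:-.}"
    script="3_Code/10_Run_a4d_tracker_extraction.R"
    if [ ! -f "$repo_dir/$script" ]; then
        echo "script not found: $repo_dir/$script" >&2
        return 1
    fi
    # Run from the repository root, which is also the working directory
    # RStudio uses when the project is opened.
    (cd "$repo_dir" && Rscript "$script")
}
```

For example, run_extraction ~/a4d would run the script if the repository was cloned into your home directory. Objects such as df_cleaned are not available for inspection afterwards, because the R session ends when the script finishes.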
