Welcome to the beginning of your journey to becoming an ML Engineer (MLE)! Follow these steps to get your development environment teed up! After you've finished this setup, feel free to go through the associated Whodunit?!
We will be using some terminal commands, so let's make sure you know what they are and what they do!
| Command | Stands For | Description |
|---|---|---|
| `ls` | list | lists all files and directories in the present working directory |
| `ls -a` | list all | lists hidden files as well |
| `cd {dirname}` | change directory | changes to a particular directory |
| `cd ~` | change directory home | navigates to the HOME directory |
| `cd ..` | change directory up | moves one level up |
| `cat {filename}` | concatenate | displays the file content |
| `sudo` | superuser do | allows regular users to run programs with the security privileges of the superuser or root |
| `mv {filename} {newfilename}` | move | renames the file to the new filename |
| `clear` | clear | clears the terminal screen |
| `mkdir {dirname}` | make directory | creates a new directory in the present working directory or at the specified path |
| `rm {filename}` | remove | removes the file with the given filename |
| `touch {filename}.{ext}` | touch | creates a new empty file |
| `rmdir {dirname}` | remove directory | deletes an empty directory |
| `ssh {username}@{ip-address or hostname}` | secure shell | logs in to a remote Linux machine using SSH |
| `CTRL + SHIFT + C` | copy | keyboard shortcut for copying from the terminal |
| `CTRL + SHIFT + V` | paste | keyboard shortcut for pasting into the terminal |
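
If you'd like to try several of these commands safely, here is a minimal sketch that works inside a throwaway directory. The names `practice`, `notes.txt`, and `renamed.txt` are made up for illustration and are not part of the course material.

```shell
set -eu

# Work inside a temporary directory so none of your real files are touched.
sandbox="$(mktemp -d)"
cd "$sandbox"

mkdir practice                   # make a new directory
cd practice                      # move into it
touch notes.txt                  # create an empty file
echo "hello" > notes.txt         # write one line of text into it
mv notes.txt renamed.txt         # rename the file

listing="$(ls -a)"               # list everything, including hidden entries
contents="$(cat renamed.txt)"    # read the file content back
echo "$listing"
echo "$contents"

rm renamed.txt                   # remove the file
cd ..                            # move one level up
rmdir practice                   # remove the now-empty directory
```

Note that `rmdir` only succeeds here because the file was removed first.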
We will also be using a few tools such as git, conda, and pip.
Git
Git is a free and open source distributed version control system designed to handle everything from small to very large projects. These are the commands we will be using with git:
git clone -> clone a remote repository to your local computer
git add -> stage files for the next commit
git commit -m {message} -> commit changes with a message
git push -> push commit to remote repository
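
To see how `git add` and `git commit` fit together, here is a small sketch you can run locally. It is an illustration under assumptions: it uses `git init` in a temporary directory instead of `git clone` (there is no remote to clone from), it skips `git push` for the same reason, and the email, name, and file contents are placeholders.

```shell
set -eu

# Hypothetical throwaway repository; `git init` stands in for `git clone`
# because this sketch has no remote repository.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Identity for this repository only (placeholder values).
git config user.email "you@example.com"
git config user.name "Your Name"

echo "# demo" > README.md
git add README.md                     # stage the new file
git commit --quiet -m "Add README"    # record the staged change with a message

# Count the commits reachable from the current branch.
commit_count="$(git rev-list --count HEAD)"
echo "Commits so far: $commit_count"
```

In a cloned repository, `git push` would then send that commit to the remote.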
Conda & Pip
Conda is an open-source, cross-platform, language-agnostic package manager and environment management system. pip is Python's package management system, and we will use it within conda environments to manage our package installations. conda comes with Anaconda, which is a convenient way to set up your Python programming environment: it includes an environment management tool (conda) and extra packages that are commonly used in data science and ML.
Some commands we will use in this lesson when it comes to conda and pip:
conda create --name mle-course python=3.8 pip -> This creates a virtual environment. A virtual environment is a Python environment in which the Python interpreter, libraries, and scripts are isolated from those installed in other environments and from any libraries installed on the system. This lets you keep each project's code, dependencies, and libraries separate from other projects. Here you are creating an environment named mle-course with Python version 3.8 and with pip installed in it. The conda command does the work of creating the virtual environment and manages your environments for you.
conda activate mle-course -> This activates the virtual environment you made with the above command for your current terminal session.
pip install numpy pandas matplotlib -> This installs the three packages mentioned - numpy, pandas, and matplotlib. numpy is used for scientific computing, pandas is used for data analysis, and matplotlib is used for data graphics. pip is the Python package manager and you are telling it to install the listed packages to your environment.
Jupyter Notebooks
Jupyter Notebooks are an incredibly useful tool for experimentation, iteration, exploration, and even production at some companies!
They have the file extension .ipynb (IPYthon NoteBook)
You can learn more about Jupyter and their notebooks here!
In order to use a notebook, you'll first want to make sure you've installed jupyter in your environment:

    conda activate <YOUR ENV NAME HERE>
    pip install jupyter

From here, you can navigate to any folder containing a .ipynb file, and run the command jupyter notebook. This should launch a server, and provide you with a link. Navigate to the link in your browser in order to get started in your notebook!
Be sure to terminate the server when you are done! Closing the webpage does not stop the server, so you'll need to stop it manually, either in the terminal (press Ctrl+C where the server is running) or from the Jupyter page itself before you close it.
Let's start off by setting up our environment! Review the environment setup instructions for the local environment that you'll be using in this course.
Windows
- Install Windows Subsystem for Linux using PowerShell:

      wsl --install -d Ubuntu-20.04

- Install Windows Terminal (you can even make it your default!)
- Install Ubuntu
(If you find yourself getting stuck on the WSL2 install, here is a link to video instructions)
Give it a test drive!
Continue by installing the following tools using Windows Terminal to set up your environment. When prompted, make sure to add conda to init.
| Tool | Purpose | Command |
|---|---|---|
| Anaconda | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| Git | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
Linux (Debian/Ubuntu)
Open a terminal using Ctrl+Shift+T. Enter the following commands in the terminal to set up your environment. When prompted, make sure to add conda to init.
| Tool | Purpose | Command |
|---|---|---|
| Anaconda | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| Git | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
macOS
To get started, we need to download the macOS package manager, Homebrew, so that we can download the tools we'll be using in the course. If you don't already have Homebrew installed, run the following commands:
- Open terminal using ⌘+Space and type `terminal`.
- Install Homebrew using the command below, following the command prompts:

      /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

- Update Homebrew (this may take a few minutes):

      git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core fetch --unshallow
      git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask fetch

- Install the `wget` command to continue following along:

      brew install wget

Enter the following commands in the terminal to set up your environment. When prompted, make sure to add conda to init.
| Tool | Purpose | Command |
|---|---|---|
| Anaconda | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`bash Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`source ~/.bashrc` |
| Git | Version Control | `brew install git` |
If you don't already have one, make an account on GitHub.
Github SSH Setup
Secure Shell Protocol (SSH) provides a secure communication channel over an unsecured network. Let's set it up!
- Generate a Private/Public SSH Key Pair.

      ssh-keygen -o -t rsa -C "your email address for github"

- Save the file pair. The default location `~/.ssh/id_rsa` is fine!
- At the prompt, type in a secure passphrase.
- Copy the contents of the public key that we will share with GitHub.
  - For WSL: `clip.exe < ~/.ssh/id_rsa.pub`
  - For macOS: `pbcopy < ~/.ssh/id_rsa.pub`
  - For Linux: `xclip -sel c < ~/.ssh/id_rsa.pub`
- Go to your GitHub account and go to `Settings`.
- Under `Access`, click on the `SSH and GPG keys` tab on the left.
- Click on the `New SSH Key` button.
- Name the key, and paste the public key that you copied. Click the `Add SSH Key` button.
Creating a New Repository
When viewing the repository page, click on New and proceed to create your repo.
Filling Repository Details
Create the repository by inputting the following:
- `Repo name`
- `Repo description`
- Make repo `public`
- Add a `README`
- Add `.gitignore` (Python template)
- Add `license` (choose MIT)
Then click Create Repository.
Clone Your Repo
- Open your terminal and navigate to a place where you would like to make a directory to hold all your files for this class using the command `cd`.

      cd {directory name}

- Once there, make a top level directory using `mkdir`.

      mkdir {directory name}

- `cd` into it and make another directory called `code`.

      cd {directory name}
      mkdir code

- `cd` into it and run your `git clone {your repo url}` command.

      cd code
      git clone {your repo url}

- Now let's get into our directory so we can access the contents of the repo!

      cd {your repo name}

Adding The FourthBrain Whodunit? Content to Your Repo
- Check your remote git.

      git remote -v

  At this point, you should just have access to your own repo, with an `origin` remote that has both fetch and push options.
- Let's set up our global configuration:

      git config --global user.email "your email address"
      git config --global user.name "your name"

- Let's add a local branch for development.

      git checkout -b LocalDev

  You can change anything here in this branch!

      git add .

  Commit the changes with the branch addition.

      git commit -m "Adding a LocalDev branch."

- Let's push our local changes to our remote repo.

      git checkout main
      git merge LocalDev
      git push origin main

- Add the Whodunit (WD) repo as an extra remote repo:

      git remote add WD git@github.com:FourthBrain/whodunit.git

  Let's check our remote repos:

      git remote -v

  At this point, you should have access to both your own repo and FourthBrain and should see something like this:

      WD git@github.com:FourthBrain/whodunit.git (fetch)
      WD git@github.com:FourthBrain/whodunit.git (push)
      origin git@github.com:rafatisina/TestRepo.git (fetch)
      origin git@github.com:rafatisina/TestRepo.git (push)

  Let's update our local repos:

      git fetch --all

  Make a new branch for the Whodunit material (WDBranch).

      git checkout --track -b WDBranch WD/main

  You should see something like this:

      Branch 'WDBranch' set up to track remote branch 'main' from 'WD'.

  You can visually check whether you are in that branch:

      git log --all --graph

  Now let's push our updated local repo to our remote repo!

      git checkout main
      git merge WDBranch --allow-unrelated-histories

  If there are any conflicts you'll need to resolve them.

      git add .
      git commit -m "message-here"
      git push origin main

From now on, after each release, follow these steps to update your repo with new content:

    git fetch --all
    git checkout WDBranch
    git merge --ff-only @{u}
    git add .
    git commit -m "branch is updated"
    git checkout main
    git merge WDBranch --allow-unrelated-histories

You will be asked to add a comment about why this change is necessary. Add a message, then push:

    git push origin main

Jupyter notebooks
- First, make sure that you are in your repo's main directory. Then navigate to the MLE-8 folder of your repo.
  HINT: You can use `pwd` to see the directory you're currently in.
- Navigate to the `notebooks` folder within the `software-dev-for-ml-101` folder.

      cd software-dev-for-ml-101/notebooks

- Activate the conda environment that you created above.

      conda activate <YOUR ENV NAME HERE>

- Run the `jupyter notebook` command.
- A new window should open in your browser with the Jupyter Server. If not, copy and paste the given link into your browser.
- Open the `unix-conda-pip.ipynb` notebook and go through the demo.

Note: JupyterLab is an acceptable alternative to Jupyter Notebooks if you prefer JupyterLab!
Now let's practice what you have learned by playing the Whodunit? game!








