Hey! You're here because you want to show your worth as a Site Reliability Engineer (a.k.a. SRE). You know what? I'm really happy you're here. If you want more information about what an SRE is, we recommend reading the books published by Google.
Let's go!
Fork this repository into your own GitHub account to complete the following challenges. Feel free to solve whichever challenges you want. If you have any doubts, don't hesitate to open an issue to ask about any challenge.
There are 6 basic challenges and 3 extra challenges. We recommend finishing the basic ones, and doing the extras only if you want to demonstrate more.
- Every challenge must have a SOLUTION.md in its directory.
- SOLUTION.md should describe how to obtain the result: the commands executed and a short explanation (if necessary).
And... that is all. The first step is to clone the repository and read it calmly.
NOTE: Go to the challenge-1 directory.
We've found a sample.log file with 3360 lines but we need some info. Can you help us?
- Count all lines with a 500 HTTP code.
- Count all GET requests from yoko to the /rrhh location that were OK (200).
- How many requests go to /?
- Count all lines without a 5XX HTTP code.
- Replace every 503 HTTP code with 500. How many requests then have a 500 HTTP code?
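As a starting point for these counts, here is a minimal sketch in shell. It assumes, without having checked sample.log, that the file follows the common access-log layout, so that splitting a line on whitespace puts the request path in field 7 and the HTTP status in field 9. Adjust the field numbers if the real format differs. All function names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: each line looks like the common/combined access-log format, e.g.
#   10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512
# Split on whitespace: $6 is '"GET', $7 is the path, $9 is the status code.

# count_status FILE CODE: lines whose status field equals CODE.
count_status() {
  awk -v code="$2" '$9 == code { n++ } END { print n + 0 }' "$1"
}

# count_not_5xx FILE: lines whose status field is not 500-599.
count_not_5xx() {
  awk '$9 !~ /^5[0-9][0-9]$/ { n++ } END { print n + 0 }' "$1"
}

# count_requests_to FILE PATH: lines whose request path is exactly PATH.
count_requests_to() {
  awk -v path="$2" '$7 == path { n++ } END { print n + 0 }' "$1"
}

# count_500_after_replace FILE: treat every 503 status as 500, then count 500s.
# Only the status field is rewritten, so a "503" elsewhere in the line is left alone.
count_500_after_replace() {
  awk '$9 == "503" { $9 = "500" } $9 == "500" { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_status sample.log 500` would print the number of lines with a 500 status, and `count_requests_to sample.log /` the number of requests to the root path. Matching on a specific field is a deliberate choice here: a plain `grep 500` would also match byte counts or timestamps that happen to contain 500.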
NOTE: Create the challenge-2 directory.
We would like to get some info about the server. Can you help us? Someone told us about the sysstat package.
- Check the distribution.
- Check CPU usage.
- Check RAM usage. Can you explain the difference between the free, used, shared and available stats?
- List block devices and file system disk usage.
- Obtain the TCP and UDP listening ports.
- Get only the PIDs of the top 10 processes by CPU usage.
- List all PIDs that have /dev/null open.
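One possible way to collect this information is sketched below in shell. It assumes a Linux host where procps (ps, free), iproute2 (ss), util-linux (lsblk), lsof and sysstat (mpstat) are installed. The function names are hypothetical, and other tools would do the job equally well.

```shell
#!/usr/bin/env bash
# Assumption: Linux host with procps, iproute2, util-linux, lsof and sysstat.

show_distribution()   { cat /etc/os-release; }   # distribution name and version
show_cpu()            { mpstat 1 1; }            # one 1-second CPU sample (sysstat)
show_ram()            { free -h; }               # free/used/shared/available columns
show_disks()          { lsblk; df -h; }          # block devices, then file system usage
show_ports()          { ss -tuln; }              # TCP and UDP listening sockets, numeric
pids_using_dev_null() { lsof -t /dev/null; }     # -t prints PIDs only

# Reads "PID %CPU" pairs on stdin and prints the PIDs of the N busiest
# processes (default 10), highest CPU first.
top_pids_by_cpu() {
  LC_ALL=C sort -k2,2 -nr | head -n "${1:-10}" | awk '{ print $1 }'
}

# "pid=" and "pcpu=" ask ps for those columns with the header suppressed.
top_cpu_pids() {
  ps -e -o pid= -o pcpu= | top_pids_by_cpu 10
}
```

Keeping the sorting in its own function, separate from the ps call, makes it easy to check the logic against a few hand-written lines before running it on a real server.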
NOTE: Create the challenge-3 directory.
We would like to use the challenge-2 commands from a simple menu (developed as a bash script). In my mind, -h (help) prints this:
Usage: myscript [options..]
Myscript description
Myscript options:
-d, --disk check disk stats
-c, --cpu check cpu stats
-p, --ports check listen ports
-r, --ram check ram stats
-o, --overview top 10 process with more CPU usage.
NOTE: Go to the challenge-4 directory.
We have the server.py code and we want to containerize this HTTP server (with Docker). Can you give us the Dockerfile? Ah! Can you check that everything is running? Our technical team told us we need to make a request with the Challenge: intelygenz.com header. Can you give us the result that the server prints?
NOTE: Go to the challenge-5 directory.
Oh, no! I don't know what happened to this binary! Can you help me? Whenever I execute the binary, it always tells me Ooooh, what's wrong? :(. How can we fix it? We expected the Congrats! :) message.
NOTE: Go to the challenge-6 directory.
NOTE 2: We recommend using a virtual machine with Debian (or your favorite flavour).
You found a playbook, but it is incomplete. Can you develop Ansible tasks to deploy challenge-4?
- Add the server to the inventory.
- Install docker.
- Build the image from the Dockerfile (challenge-4).
- Deploy the image on the server.
- Check that the HTTP server is running and responds properly.
- Save the output of the ansible-playbook execution in an ansible.log file and upload it.
- Group tasks with tags.
We've some modules to solve it:
NOTE: Create the challenge-extra-1 directory.
- Use the kreuzwerker/docker and hashicorp/http providers to replicate challenge-6 with Terraform.
- Upload all files when you have finished the task.
NOTE: Go to the challenge-extra-2 directory.
Prepare environment:
Get info:
- Get all namespaces.
- Get all pods from all namespaces.
- Get all resources from all namespaces.
- Get all services from the intelygenz namespace.
- Get all deployments from tools.
- Get the image from the nginx deployment in the intelygenz namespace.
- Create a port-forward to access the nginx pod in the intelygenz namespace.
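The queries above map fairly directly onto kubectl subcommands. The sketch below wraps them in small shell functions with hypothetical names. It assumes that tools is a namespace, that the nginx container listens on port 80, and that local port 8080 is free. The pod name is passed as an argument because it is not known in advance.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden, e.g. KUBECTL="echo kubectl" prints each command
# instead of running it (a simple dry run).
k() { ${KUBECTL:-kubectl} "$@"; }

list_namespaces()        { k get namespaces; }
list_all_pods()          { k get pods --all-namespaces; }
# Note: "get all" covers the most common resource types, not every type.
list_all_resources()     { k get all --all-namespaces; }
list_services()          { k get services --namespace intelygenz; }
# Assumption: "tools" is a namespace.
list_tools_deployments() { k get deployments --namespace tools; }

# Prints the container image(s) of the nginx deployment.
nginx_image() {
  k get deployment nginx --namespace intelygenz \
    --output "jsonpath={.spec.template.spec.containers[*].image}"
}

# forward_nginx POD [LOCAL_PORT] [POD_PORT]
# Assumption: defaults of 8080 (local) and 80 (pod) are illustrative only.
forward_nginx() {
  k port-forward --namespace intelygenz "pod/$1" "${2:-8080}:${3:-80}"
}
```

Running with `KUBECTL="echo kubectl"` first is a cheap way to review the exact commands before pointing them at a real cluster.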
NOTE: Create the challenge-extra-3 directory.
