Handbook: Job Description - DevOps Engineer #3870
* **Documentation Review**: Study existing infrastructure documentation and incident response procedures
* **Install FlowFuse**: Install FlowFuse in a variety of environments, and provide feedback on the experience and areas of improvement
* **Tool Familiarization**: Get hands-on experience with FlowFuse's current toolchain and development processes
* **Initial Improvements**: Implement quick wins to improve developer experience, system reliability and onboarding experience for FlowFuse
This is not a very ambitious plan. With a developer, I expect a PR to be merged in the first week, and the pace to pick up from there. The first 5 bullet points are studying and getting-to-know items. I fear we will hire someone who just studies for weeks without starting to iterate on the infra from day one.
I suspect that anyone can find a broken window in their first week: https://en.wikipedia.org/wiki/Broken_windows_theory
Impact from day 2 onwards I'd say.
Sensible, though I'd argue that in that case many of the other job descriptions we have already published need to be more aggressive to align with this too.
* **Initial Improvements**: Implement quick wins to improve developer experience, system reliability and onboarding experience for FlowFuse

* **Week 5-8: Infrastructure Enhancement & Automation**
* **CI/CD Optimization**: Enhance existing deployment pipelines with better testing, security scanning, and rollback capabilities
Currently the release seems very manual, with multiple engineers in a room together each time. Consider moving this up to week 1-4.
Happy to move it up. Release is only 2 people (one runs the scripts, and the second has to approve the PRs that get opened before they can be merged). It generally takes 45 minutes of active work (not accounting for 20 mins in the middle where tests run).
* **Week 5-8: Infrastructure Enhancement & Automation**
* **CI/CD Optimization**: Enhance existing deployment pipelines with better testing, security scanning, and rollback capabilities
* **Monitoring Implementation**: Deploy comprehensive monitoring and alerting for critical systems and customer-facing services
We have the foundation already, I'm not sure what the task is here.
* **Automation Development**: Build scripts and tools to automate common operational tasks and reduce manual intervention
* **Performance Optimization**: Establish performance benchmarks and implement optimizations to improve response times and resource utilization
* **Security Hardening**: Tackle security issues as they arise, and implement additional security measures and compliance controls across the infrastructure
* **Knowledge Sharing**: Begin documenting processes and sharing knowledge with the engineering team
"Knowledge is Power" plus contributions in the first week imply that this should be done in the first weeks.
navGroup: Job Descriptions
---

# DevOps Engineer
Now this is mostly a SysOps role, where DevOps would be a Developer + Operations. Has the Job Description just changed to be the same as old school sysops?
This is almost entirely based on the job description we posted when we hired previously.
### Must Have

* **Cloud Infrastructure Expertise**: 4-6 years of hands-on experience with AWS services including EC2, EKS, RDS, S3, CloudFront, and IAM. Strong understanding of cloud architecture patterns and best practices.
Do we want to allow Azure or other clouds? Most of the tools are about the same I think?
I had Azure/GCP as "Nice to have". Whilst I'm sure there is overlap, given we are an AWS shop, and I know having a grasp of AWS costing is an absolute art, it feels like AWS should be a "must have" and the others "nice to haves".
* **Security Best Practices**: Knowledge of security frameworks, vulnerability management, and compliance requirements (SOC 2, ISO 27001).
* **Multi-cloud Experience**: Experience with other cloud providers (Azure, GCP) or hybrid cloud environments.
* **Industrial/IIoT Background**: Understanding of industrial automation protocols, edge computing, or IoT device management.
* **Python/Go Development**: Additional programming language experience for building internal tools and automation scripts.
Why not use Node.JS for this? For internal tools I don't think it's worth adding a new language?
Yeah, we can scope this to NodeJS, but also, why be opinionated for internal tooling? If we're building utility apps and automations, I'd be hoping people use the best tools available, rather than forcing it into NodeJS?
### Nice to Have

* **Observability Tools**: Experience deploying and managing observability tools like DataDog, Sentry, or similar APM and Monitoring solutions.
* **Database Management**: Experience with PostgreSQL, MySQL, or other database systems including backup, recovery, and performance optimization.
So Git is a must have, and PG a nice to have? That's surprising in my mind
Fair, will pull it up into "Must Have"
Co-authored-by: Zeger-Jan van de Weg <[email protected]>
Description
As part of the best practice to document job descriptions for all existing FF employees, I'm starting to upload those for the Engineering Team. The first one here is the DevOps Engineer role.