FlowFuse · joepavitt · Sep 30, 2025
@@ -0,0 +1,87 @@
+---
+navTitle: DevOps Engineer
+navGroup: Job Descriptions
+---
+
+# DevOps Engineer
+
+## Job Description
+
+The DevOps Engineer at FlowFuse plays a critical role in building and maintaining the infrastructure that powers FlowFuse, ensuring reliability, scalability, and security for our customers' industrial automation and IIoT solutions. Additionally, a DevOps Engineer at FlowFuse should always be practicing our [Iterative Improvement](../../company/values.md#🔁-iterative-improvement) value, seeking areas for automation and refinement in our own internal practices and processes. 
+
+This role combines deep technical expertise with a customer-focused mindset to ensure reliable infrastructure and automation. It also involves close collaboration with our engineering teams to create robust, automated systems that enable them to deliver value quickly and reliably.
+
+As a DevOps Engineer, you'll be responsible for building and managing our cloud infrastructure, automating deployment processes, and building tools that improve both developer productivity and customer experience. You'll work closely with our engineering teams to understand their needs and translate them into ideas on how automation can help them. This position requires a balance of technical depth, automation expertise, and strong collaboration skills to support both FlowFuse customers and internal engineering teams.
+
+Key Responsibilities:
+
+* **Infrastructure Management & Automation**: Design, implement, and maintain AWS-based infrastructure using Infrastructure as Code (IaC) principles. Build and maintain CI/CD pipelines that enable rapid, reliable deployments while maintaining high security standards.
+* **Platform Reliability & Monitoring**: Keep your finger on the pulse of the platform's health at all times, and implement comprehensive monitoring, logging, and alerting systems to ensure platform stability and performance. Develop automated incident response procedures and conduct root cause analysis for production issues.
+* **Developer Experience & Tooling**: Build and maintain development tools, automation scripts, and internal services that improve developer productivity and reduce friction in the development process. This should be a key focus of the role: you should be an amplifier of those around you, removing friction with automation and always seeking areas for refinement in our own internal practices and processes.
+* **Security & Compliance**: Implement security best practices across all infrastructure components, ensuring compliance with industry standards, security policies, and customer requirements.
+* **Customer-Focused Operations**: Respond to incidents quickly and understand the needs of our customers. Iterate on onboarding processes for FlowFuse Self-Hosted and Dedicated to reduce friction and improve the customer experience.
+* **Collaboration & Knowledge Sharing**: Work closely with engineering teams to understand their infrastructure needs, provide technical guidance, and share knowledge through documentation and mentoring. Ensure we learn from incidents when they happen: document our learnings and implement changes to prevent similar incidents from recurring.
+
+## Skills
+
+The DevOps Engineer skill set includes:
+
+### Must Have
+
+* **Cloud Infrastructure Expertise**: 4-6 years of hands-on experience with AWS services including EC2, EKS, RDS, S3, CloudFront, and IAM. Strong understanding of cloud architecture patterns and best practices.
+* **Kubernetes Proficiency**: Experience with the Kubernetes API, container orchestration, and managing containerized applications in production environments.
+* **Node.js & JavaScript**: Solid experience with Node.js development and the JavaScript ecosystem, enabling effective collaboration with our engineering teams.
+* **CI/CD & Automation**: Proven experience building and maintaining CI/CD pipelines using tools like GitHub Actions, Jenkins, or similar platforms.
+* **Infrastructure as Code**: Experience with Terraform, CloudFormation, or similar IaC tools for managing cloud infrastructure.
+* **Monitoring & Observability**: Experience implementing monitoring solutions using tools like Prometheus, Grafana, ELK stack, or similar observability platforms.
+* **Linux System Administration**: Strong Linux skills including shell scripting, system configuration, and troubleshooting.
+* **Git & Version Control**: Proficiency with Git workflows and collaborative development practices.
+
+### Nice to Have
+
+* **Observability Tools**: Experience deploying and managing observability tools like Datadog, Sentry, or similar APM and monitoring solutions.
+* **Database Management**: Experience with PostgreSQL, MySQL, or other database systems including backup, recovery, and performance optimization.
+* **Security Best Practices**: Knowledge of security frameworks, vulnerability management, and compliance requirements (SOC 2, ISO 27001).
+* **Multi-cloud Experience**: Experience with other cloud providers (Azure, GCP) or hybrid cloud environments.
+* **Industrial/IIoT Background**: Understanding of industrial automation protocols, edge computing, or IoT device management.
+* **Python/Go Development**: Additional programming language experience for building internal tools and automation scripts.
+* **Team Leadership**: Experience mentoring junior engineers or leading infrastructure initiatives in larger teams.
+
+## 90-Day Plan
+
+* **Weeks 1-4: Foundation & FlowFuse Immersion**
+   * **Infrastructure Assessment**: Conduct a comprehensive review of existing AWS infrastructure, CI/CD pipelines, and monitoring systems
+   * **Team Integration**: Meet with engineering teams to understand their workflows, pain points, and infrastructure needs
+   * **Documentation Review**: Study existing infrastructure documentation and incident response procedures
+   * **Install FlowFuse**: Install FlowFuse in a variety of environments, and provide feedback on the experience and areas of improvement
+   * **Tool Familiarization**: Get hands-on experience with FlowFuse's current toolchain and development processes
+   * **Initial Improvements**: Implement quick wins to improve developer experience, system reliability, and the onboarding experience for FlowFuse
+
+* **Weeks 5-8: Infrastructure Enhancement & Automation**
+   * **CI/CD Optimization**: Enhance existing deployment pipelines with better testing, security scanning, and rollback capabilities
+   * **Monitoring Implementation**: Deploy comprehensive monitoring and alerting for critical systems and customer-facing services
+   * **Automation Development**: Build scripts and tools to automate common operational tasks and reduce manual intervention
+   * **Performance Optimization**: Establish performance benchmarks and implement optimizations to improve response times and resource utilization
+   * **Security Hardening**: Tackle security issues as they arise, and implement additional security measures and compliance controls across the infrastructure
+   * **Knowledge Sharing**: Begin documenting processes and sharing knowledge with the engineering team
+
+* **Weeks 9-13: Strategic Impact & Innovation**
+   * **Infrastructure Scaling**: Design and implement solutions to support FlowFuse's growth and increasing customer demands
+   * **Disaster Recovery**: Implement comprehensive backup and disaster recovery procedures
+   * **Cost Optimization**: Analyze cloud costs and implement strategies to optimize spending while maintaining performance
+   * **Advanced Monitoring**: Deploy advanced observability tools and create bespoke dashboards for better system visibility
+   * **Process Improvement**: Lead initiatives to improve operational processes and reduce mean time to recovery (MTTR)
+
+## Hiring Plan
+
+1. **Initial Screening**: Review resumes and cover letters to assess technical qualifications and experience alignment with FlowFuse's needs.
+
+2. **Technical Interview (Infrastructure & Automation)**: Video interview focusing on AWS expertise, Kubernetes knowledge, CI/CD experience, and problem-solving approach to infrastructure challenges.
+
+3. **System Design Interview**: Present candidates with real-world scenarios involving scaling challenges, incident response, or infrastructure optimization to assess their architectural thinking.
+
+4. **STAR Interview**: Behavioral interview focusing on past situations, tasks, actions, and results to understand problem-solving abilities and value alignment.
+
+5. **Final Interview**: A final interview with key stakeholders or other members of the leadership team.
+
+6. **Offer**: Extend an offer to the selected candidate.