Welcome to the AI Infrastructure Team Lead / Engineering Manager Learning Repository! This curriculum is designed to prepare experienced AI infrastructure engineers for technical leadership and people management roles.
By completing this curriculum, you will be able to:
Technical Leadership:
- Lead technical architecture and design decisions
- Conduct code and architecture reviews
- Guide technical strategy for ML infrastructure
- Evaluate and select technologies and vendors
- Drive technical standards and best practices
People Management:
- Build and scale high-performing engineering teams
- Conduct hiring interviews and onboarding
- Provide career development and mentorship
- Manage performance and give effective feedback
- Handle difficult conversations and conflict resolution
Project & Process Management:
- Plan and execute quarterly roadmaps
- Manage project timelines and resources
- Implement agile methodologies effectively
- Balance technical debt vs new features
- Drive cross-functional collaboration
Business & Strategy:
- Translate business requirements into technical solutions
- Communicate with non-technical stakeholders
- Manage budgets and resource allocation
- Align team goals with company objectives
- Measure and report on team performance
- Total Duration: 500 hours (12-14 weeks full-time, 25-30 weeks part-time)
- Difficulty: Advanced/Leadership
- Prerequisites: Senior AI Infrastructure Engineer experience (3+ years)
| Module | Topic | Duration | Focus |
|---|---|---|---|
| 101 | Leadership Fundamentals | 20 hours | Transition from IC to leader |
| 102 | People Management Essentials | 25 hours | 1:1s, feedback, performance |
| 103 | Hiring & Team Building | 20 hours | Recruiting, interviewing, onboarding |
| 104 | Technical Leadership | 25 hours | Architecture reviews, standards |
| 105 | Project Management for Engineers | 20 hours | Agile, roadmaps, execution |
| 106 | Communication & Stakeholder Management | 20 hours | Presentations, influence, negotiation |
| 107 | Strategic Thinking & Planning | 20 hours | OKRs, vision, strategy |
| 108 | Budget & Resource Management | 15 hours | Cost optimization, headcount |
| 109 | Building Team Culture | 20 hours | Culture, values, engagement |
| 110 | Crisis & Incident Management | 15 hours | Oncall, postmortems, escalations |
| Project | Description | Duration | Type |
|---|---|---|---|
| 01 | Team Process Implementation | 60 hours | Process Design |
| 02 | Technical Strategy & Roadmap | 60 hours | Strategic Planning |
| 03 | Hiring & Onboarding Pipeline | 60 hours | People Management |
| 04 | Cross-Functional Platform Project | 80 hours | Project Leadership |
| 05 | Leadership Capstone | 40 hours | Portfolio & Presentation |
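
The hours in the two tables above can be checked against the stated 500-hour total with a short script. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the 40-hour full-time week is an assumption rather than something the curriculum specifies.

```python
# Hours copied from the module and project tables above.
MODULE_HOURS = {
    101: 20, 102: 25, 103: 20, 104: 25, 105: 20,
    106: 20, 107: 20, 108: 15, 109: 20, 110: 15,
}
PROJECT_HOURS = {1: 60, 2: 60, 3: 60, 4: 80, 5: 40}


def total_hours() -> int:
    """Sum of all module and project hours."""
    return sum(MODULE_HOURS.values()) + sum(PROJECT_HOURS.values())


def weeks_needed(hours_per_week: float) -> float:
    """Weeks to finish the curriculum at a given weekly pace (assumed constant)."""
    return total_hours() / hours_per_week


if __name__ == "__main__":
    print(sum(MODULE_HOURS.values()))   # 200 hours of modules
    print(sum(PROJECT_HOURS.values()))  # 300 hours of projects
    print(total_hours())                # 500 hours in total
    print(weeks_needed(40))             # 12.5 weeks at an assumed 40 h/week
```

At the assumed 40 hours per week, the result falls inside the stated 12-14 week full-time range; a part-time pace can be explored by passing a smaller value to `weeks_needed`.
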
Before starting, you should have:
- 3+ years as an AI Infrastructure Engineer or similar role
- Technical expertise in ML infrastructure, Kubernetes, cloud platforms
- Senior-level skills in system design and architecture
- Some exposure to leadership (mentoring, tech lead roles, etc.)
- Desire to transition into people management or technical leadership
This curriculum is designed for:
- Senior engineers ready for team lead roles
- Staff engineers considering the management track
- Technical leads wanting formal leadership training
- New managers (0-12 months) seeking a structured curriculum
- Individual contributors exploring management as a career path
Months 1-2: Modules 101-103 (Leadership, People Management, Hiring)
Month 3: Project 01 (Team Process Implementation)
Months 4-5: Modules 104-106 (Technical Leadership, PM, Communication)
Month 6: Project 02 (Technical Strategy & Roadmap)
Month 7: Modules 107-109 (Strategy, Budget, Culture)
Month 8: Project 03 (Hiring & Onboarding Pipeline)
Months 9-10: Module 110 + Project 04 (Cross-Functional Project)
Months 11-12: Project 05 (Leadership Capstone)
ai-infra-team-lead-learning/
├── README.md # This file
├── CURRICULUM.md # Detailed curriculum guide
├── lessons/ # Leadership modules
│ ├── mod-101-leadership-fundamentals/
│ ├── mod-102-people-management/
│ ├── mod-103-hiring-team-building/
│ ├── mod-104-technical-leadership/
│ ├── mod-105-project-management/
│ ├── mod-106-communication/
│ ├── mod-107-strategic-thinking/
│ ├── mod-108-budget-resources/
│ ├── mod-109-team-culture/
│ └── mod-110-crisis-management/
├── projects/ # Leadership projects
│ ├── project-01-team-process/
│ ├── project-02-technical-strategy/
│ ├── project-03-hiring-onboarding/
│ ├── project-04-platform-project/
│ └── project-05-leadership-capstone/
├── assessments/ # Leadership assessments
├── resources/ # Books, articles, templates
└── community/ # Discussion, mentorship
- Module Quizzes (10 × 10 points = 100 points)
- Project Assessments (5 × 100 points = 500 points)
- Leadership Portfolio (100 points)
- 360° Feedback Simulation (100 points)
- Total: 800 points
- Overall Passing: 560/800 points (70%)
- Each Project: Minimum 70/100 points
- Leadership Portfolio: Minimum 70/100 points
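
The passing rules above combine an overall threshold with per-item minimums, so a high total alone is not enough. A minimal sketch of that check follows; the `Scores` class and function names are hypothetical, and it assumes the 360° Feedback Simulation has no individual minimum because none is listed.

```python
from dataclasses import dataclass

# Thresholds taken from the assessment breakdown above.
OVERALL_PASS = 560    # 70% of the 800 available points
PROJECT_MIN = 70      # minimum per project, out of 100
PORTFOLIO_MIN = 70    # minimum for the portfolio, out of 100


@dataclass
class Scores:
    quizzes: list[int]    # 10 quizzes, 0-10 points each
    projects: list[int]   # 5 projects, 0-100 points each
    portfolio: int        # 0-100 points
    feedback_360: int     # 0-100 points (no stated minimum; assumption)


def total_points(s: Scores) -> int:
    """Total out of 800."""
    return sum(s.quizzes) + sum(s.projects) + s.portfolio + s.feedback_360


def passes(s: Scores) -> bool:
    """True only if the overall and every per-item minimum are met."""
    return (
        total_points(s) >= OVERALL_PASS
        and all(p >= PROJECT_MIN for p in s.projects)
        and s.portfolio >= PORTFOLIO_MIN
    )


if __name__ == "__main__":
    ok = Scores([8] * 10, [75, 80, 70, 90, 72], portfolio=85, feedback_360=60)
    weak_project = Scores([8] * 10, [75, 80, 65, 90, 72], portfolio=85, feedback_360=60)
    print(total_points(ok), passes(ok))                      # 612 True
    print(total_points(weak_project), passes(weak_project))  # 607 False
```

The second example clears the 560-point overall bar but still fails, because one project falls below the 70-point minimum.
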
You're ready to lead when you can:
- Build quarterly roadmaps aligned with business goals
- Conduct effective 1:1s and performance reviews
- Make data-driven hiring and technical decisions
- Navigate difficult conversations with confidence
- Balance team happiness with business objectives
- Communicate technical concepts to non-technical audiences
- AI Infrastructure Team Lead
- Engineering Manager - ML Infrastructure
- Technical Program Manager - AI/ML
- Engineering Manager - MLOps
- Manager - ML Platform Engineering
- Team Lead (3-5 reports): $160,000 - $230,000
- Engineering Manager (5-8 reports): $180,000 - $280,000
- Senior EM (8+ reports): $220,000 - $350,000
- Director-level: $250,000 - $450,000+
From:
- Senior AI Infrastructure Engineer
- Staff Engineer
- Tech Lead
To:
- Engineering Manager → Senior EM → Director
- Technical Program Manager → Senior TPM
- Group Engineering Manager (multiple teams)
- Architecture Leadership: System design reviews, ADRs, technical standards
- Technology Strategy: Tool evaluation, vendor selection, technical roadmaps
- Code Quality: Code review excellence, mentoring through PRs
- Incident Management: Oncall processes, postmortems, SRE practices
- People Development: Mentoring, coaching, career growth planning
- Communication: Presentations, writing, influence, negotiation
- Decision Making: Data-driven, consensus-building, speed vs accuracy
- Conflict Resolution: Difficult conversations, mediation, team dynamics
- Hiring: Job descriptions, interviewing, offer negotiation
- Performance Management: Goal setting, feedback, PIPs, promotions
- Resource Planning: Headcount planning, budget management
- Process Design: Agile, standups, retrospectives, team rituals
- Vision & Strategy: OKRs, quarterly planning, long-term thinking
- Stakeholder Management: Exec communication, cross-functional alignment
- Business Acumen: ROI analysis, prioritization, trade-offs
- Culture Building: Values, psychological safety, team engagement
Design and implement team processes including:
- Sprint planning and execution framework
- Code review standards and guidelines
- Oncall rotation and incident response
- Retrospective and continuous improvement
- Documentation and knowledge sharing
Deliverables: Process documentation, team handbook, tooling setup
Create a 12-month technical strategy including:
- Current state assessment and gap analysis
- Technology evaluation and selection
- Quarterly OKRs
- Resource planning and timeline
- Risk mitigation strategies
Deliverables: Strategy document, roadmap presentation, exec briefing
Build a comprehensive hiring system, including:
- Job descriptions and leveling framework
- Interview process and question bank
- Candidate evaluation rubrics
- Onboarding 30-60-90 day plan
- Mentorship and buddy programs
Deliverables: Hiring playbook, interview guides, onboarding materials
Lead a simulated cross-functional project:
- Requirements gathering from stakeholders
- Technical design and architecture
- Resource allocation and timeline planning
- Risk management and mitigation
- Status reporting and communication
Deliverables: PRD, tech spec, project plan, status reports
Demonstrate leadership readiness through:
- Leadership philosophy statement
- Portfolio of work (projects 1-4)
- Presentation to senior leadership
- 360° feedback analysis and response
- Personal development plan
Deliverables: Portfolio, presentation, development plan
- The Manager's Path - Camille Fournier
- Radical Candor - Kim Scott
- High Output Management - Andy Grove
- An Elegant Puzzle - Will Larson
- The Making of a Manager - Julie Zhuo
- Will Larson's Staff Eng (staffeng.com)
- Charity Majors' blog
- Lara Hogan's blog
- Manager Tools podcast
- Rands in Repose
- 1:1 templates
- Performance review frameworks
- Hiring scorecards
- OKR templates
- Incident postmortem templates
- GitHub Discussions: Leadership Q&A, peer mentorship
- Office Hours: Monthly leadership AMA sessions
- Mentorship Program: Match with experienced engineering managers
- Slack Community: Real-time discussion and support
We welcome contributions from:
- Experienced engineering managers
- Leadership coaches
- Technical program managers
- Anyone passionate about engineering leadership
MIT License - See LICENSE for details
- Engineering leaders who reviewed the curriculum
- Practicing managers who shared their experiences
- Leadership coaches who provided guidance
- Open source engineering management community
- Principal AI Infrastructure Engineer (IC track)
- Principal AI Infrastructure Architect (Architecture track)
- Senior Engineering Manager (8+ reports)
- Director of Engineering (multiple teams)
- VP of Engineering (department leadership)
Ready to start your leadership journey? 🚀 Begin with Module 101: Leadership Fundamentals
Questions? Open an issue or join our community discussions!
- Last Updated: October 2025
- Version: 1.0.0
- Maintained by: AI Infrastructure Curriculum Team
- Contact: ai-infra-curriculum@joshua-ferguson.com