Senior Manager – DevOps Engineering at Beacon Power Services

Beacon Power Services
Full-time
On-site

How You'll Make an Impact


We are looking for a strategic, execution-focused Senior Manager – DevOps Engineering to define and scale the infrastructure strategy that powers our software platforms.
You will lead the evolution of our cloud architecture, DevSecOps practices, and operational resilience model across the organization. This role goes beyond pipeline management: you will establish standards, build scalable platform capabilities, and ensure our infrastructure enables rapid product innovation while maintaining enterprise-grade security, reliability, and compliance.
You will be accountable for infrastructure strategy, platform performance, reliability engineering, cost governance, and incident management maturity across environments.


What You'll Do

Infrastructure Strategy & Platform Leadership


Define and own the cloud and infrastructure strategy aligned with company growth and product scalability objectives
Architect resilient, secure, and cost-efficient multi-environment cloud platforms (AWS, Azure, or GCP)
Establish and enforce infrastructure standards, governance models, and best practices across teams
Lead capacity planning, disaster recovery strategy, and business continuity planning
Drive cloud cost optimization


DevOps & Platform Engineering Excellence


Build and scale CI/CD frameworks that enable reliable, secure, and high-frequency releases
Institutionalize Infrastructure as Code (Terraform, CloudFormation, Pulumi) and GitOps practices across teams
Establish platform engineering capabilities to improve developer productivity and reduce operational friction
Drive automation strategy to eliminate manual processes and increase deployment confidence
Define and monitor DevOps maturity metrics across the organization


Reliability, Observability & Operational Resilience


Lead Site Reliability Engineering (SRE) practices to improve uptime, performance, and scalability
Define and enforce SLAs, SLOs, and error budgets
Oversee monitoring, logging, and observability frameworks (Prometheus, Grafana, ELK, Datadog)
Lead major incident response processes and post-incident reviews
Drive root cause analysis discipline and continuous reliability improvement


Security & Compliance (DevSecOps)


Embed security-by-design principles across infrastructure and deployment pipelines
Partner with Cybersecurity leadership on vulnerability management and risk mitigation
Establish robust access control, secrets management, and compliance frameworks
Ensure infrastructure meets regulatory and industry standards


Managerial Responsibilities


Define and own the annual operating plan for the function, including workforce and financial planning
Manage and optimize departmental budgets
Make strategic resource allocation decisions
Build, scale, and structure teams to support long-term capability development
Establish performance frameworks, succession planning, and leadership development pathways
Ensure governance, risk management, and operational maturity across the function
Drive the creation of standardized, scalable systems that enable consistent execution across regions


What We're Looking For

Professional & Technical Strengths


Strong experience in cloud infrastructure, DevOps, or platform engineering roles
Proven experience defining infrastructure strategy in high-growth or enterprise environments
Deep expertise in cloud platforms (AWS, Azure, or GCP)
Strong experience with containerization and orchestration (Docker, Kubernetes, ECS)
Advanced proficiency in Infrastructure as Code tools (Terraform, CloudFormation, Pulumi)
Extensive experience designing and scaling CI/CD frameworks
Strong understanding of SRE principles, observability, and operational excellence
Experience managing cloud cost optimization initiatives


Leadership & Interpersonal Strengths


Ability to collaborate with multicultural and geographically distributed teams
High degree of autonomy and accountability (we operate within a hybrid work environment)
Systems thinker capable of balancing speed, reliability, security, and cost at scale
Ability to influence senior stakeholders and drive alignment across Engineering, Cybersecurity, and Product Management
Ability to lead high-severity incident response, coordinating cross-functional teams to restore service quickly and drive structured post-incident resolution
Strong ownership mindset with accountability for organizational-level outcomes