Lead Site Reliability Engineer
Board Intelligence is a technology and advisory firm that supercharges boards with the science of board effectiveness. We build better businesses and benefit society.
Through a suite of AI-powered software tools, evaluation frameworks, and advisory services that distil twenty years of boardroom experience, we improve the efficiency of board processes and the effectiveness of boards.
We work with over 70,000 leaders and 3,000 organisations across the world, with clients across the Fortune 500, FTSE 100, and OMX 30. In 2024 we received substantial backing from K1 Investment Management – the leading B2B Enterprise SaaS investors. We are at the beginning of significant growth, and we’re looking for superb talent to join us on this journey.
As we grow, we’re fiercely protective of our culture and values. Many of us, including our founders, have families and other priorities, so we know the value of a supportive company.
The team is diverse and friendly. We value fun: most days you’ll find a social event or learning opportunity to get involved with, including company socials, away days, philanthropic activities and lunch & learns.
Our Mission
We unleash the potential of organisations through the science of board effectiveness, building better businesses and benefiting society.
Our Engineering Team
We build, maintain, and improve the software that our clients rely on. Our work ensures that the Board Intelligence product suite is efficient, scalable, and capable of adapting to changing customer needs.
This role offers full-time working from our Central Stockholm office.
The Role
We are looking for a Lead SRE to enable the highest standards of availability, scalability, performance, and security for our SaaS environments across multiple cloud vendors and our private cloud infrastructure. Your team will deliver enabling infrastructure, pipelines, and tooling to support product development. Through collaboration with security, product development, and commercial teams, you’ll ensure the future suitability of our infrastructure. You’ll also set standards and methodologies for engineering work, proactively monitor our platform, and respond to incidents.
What will you be responsible for?
- Lead and mentor a team of SREs, fostering a collaborative and high-performing environment.
- Project manage key technical projects, ensuring timely delivery and adherence to quality standards.
- Maintain a strong technical understanding of our systems and contribute to their development and maintenance.
- Improve the security posture of our infrastructure and applications.
- Ensure the reliability and stability of our platform.
- Contribute to the design and implementation of a scalable, multi-tenant architecture.
- Implement and maintain monitoring solutions and build automation to reduce toil.
- Participate in on-call duties.
Requirements
We’re looking for someone who has a hunger to change our working environment for the better, driving performance from our people and protecting our culture and values to make sure we remain a caring, entrepreneurial and client-first workplace.
We’re open-minded on the background someone may have coming into this role, but things that could help a candidate to be successful would be:
- Proven experience leading and mentoring SRE or DevOps teams, with strong delegation, communication, and collaboration skills
- Extensive experience managing and maintaining on-premises infrastructure
- Deep understanding of cloud-native architectures and experience managing infrastructure solutions
- Expertise in IaC (Terraform), configuration management tools, and CI/CD pipelines
- Strong understanding of security best practices and experience implementing security controls
Desirable skills would be:
- Experience with service mesh technologies
- Familiarity with co-located physical infrastructure
- Experience with database administration
- Knowledge of Ruby, Java, or Go
Tech Stack
Our applications are written in Ruby (with Rails) or Java. Client-side web apps are written in React, and some services in Clojure, Java and Go.
Our platform consists of:
- Multiple Kubernetes clusters for container orchestration
- Apache Kafka and Redis (and shortly Postgres) for event messaging
- Postgres for data storage
- OpenStack Swift for object storage
- Juniper & Cisco networking devices
- A number of internally written Go tools for managing the platform
We run our own physical infrastructure, co-located in three datacentres across Europe. We also run a public cloud production environment on GCP for one of our products, and we’re moving towards more public cloud for production and pre-production environments and pipelines.
Benefits
We pride ourselves on our great working environment and package. Here’s some of what’s on offer:
- 25 days of vacation with advanced vacation days
- Wellness benefits including a wellness hour per week for exercise and up to SEK 2,000 per year for wellness activities
- Parental leave with 90% salary coverage for up to 6 months
- Comprehensive insurance including Group Life, TFA Safety and Travel
- Regular team lunches and fikas
- ITP1 + Flexpension
- Vision benefit with free eye test and computer (terminal) glasses
- Tax-free daily allowances for travel
- Activities such as conferences, kick-offs, and other events