We're looking to add a Site Reliability Engineer to our growing Operations team at Mediavine. If you've got extensive AWS experience and you love thinking about ways to streamline processes and build infrastructure as code, we'd love to have a conversation with you!
About Mediavine
Mediavine is a fast-growing advertising management company representing over 7500 websites in the food, lifestyle, DIY, and entertainment space. Founded by content creators, for content creators, Mediavine is a Top 20 Comscore property, exclusively reaching over 125 million monthly unique visitors. With best-in-class technology and a commitment to traffic quality and brand safety, we ensure optimal performance for our creators.
Mission & Culture
We help content creators build sustainable businesses. From educational tools and cutting-edge plugins to ad technology that maximizes earnings without slowing down your site, our motivation is ensuring your brand and business grow in every respect.
We are striving to build an inclusive and diverse team of highly talented individuals that reflects the industries we serve and the world we live in. We are committed to creating a culture where everyone feels welcomed. We are looking for individuals that will challenge us to continuously evolve and make Mediavine the employer of choice for people of all backgrounds. We strongly encourage minorities and individuals from underrepresented groups in technology to apply for this position.
Diversity and inclusion aren't platitudes to us; we take them seriously. Have a look at our team and read through our blog posts to learn more about our values and to discover if Mediavine is the place for you!
Position Title & Overview:
As a Site Reliability Engineer at Mediavine, you will be working toward optimizing the performance and availability of our services, collaborating with our engineering teams on their infrastructure needs, and working with other SREs on capacity planning, documentation, and developing processes for downtime, scalability, and more.
Essential Responsibilities:
- Building software and systems to manage our infrastructure and applications, including monitoring and detecting problems related to high usage, slow response times, and database failures.
- Owning the availability, performance, security, monitoring, documentation, and incident response of the applications and services that our company runs and owns.
- Capacity planning (new product launches, high-traffic seasons or holidays, etc.).
- Work with Product to understand useful metrics and alerts for each of our products.
- Prepare incident documentation when needed.
- Help develop and maintain a runbook for all of our products and services.
- Create self-healing scripts where possible, to automate recovery tasks.
- Work with other SREs to create deployment and rollback processes.
- Help develop on-call rotations for our products and services.
Requirements:
- Must currently live in the United States.
- 2+ years of experience running large-scale customer-facing web services.
- Experience with building & maintaining complex, scalable, and distributed systems.
- AWS experience.
- A knack for spotting potential problems, performance bottlenecks, and areas for improvement.
- Experience with CI/CD orchestration tools (CircleCI, AWS CodePipeline, etc.).
- Experience with container technologies (Kubernetes, Docker, AWS Fargate, AWS ECS).
- Experience with and knowledge of disaster recovery processes.
- Availability for a 24-hour on-call rotation for service-related issues.
Benefits:
- Remote work environment
- Travel opportunities (remember those?!)
- Comprehensive benefits including 401k, Health, Dental, and Vision insurance
- Learning allowance
- Access to experienced engineers with direct mentorship opportunities
- Generous Vacation/Time off policies
- Additional side benefits such as home-office upgrades, tuition reimbursement, paid gym memberships and wellness retreats, upgraded flights, cool swag and more
- Company-matched charitable donations
- Salary: $120k-135k