Sr. Cloud Application Engineer: Streaming – Los Angeles

Direct Hire | Entertainment | Los Angeles, CA | Apply Now

POSITION SUMMARY

As a Senior Software Engineer on our client’s Global Cloud Application Engineering team, you will design, code, scale and support the global cloud infrastructure for flagship OTT streaming platforms, along with the core applications and tooling that support it. You will create robust, repeatable processes and automation to safeguard our customer experience and increase the efficiency of our applications and team. Passion for learning, collaboration and fun is a must!

KEY RESPONSIBILITIES

  • Architect, codify and build out the international infrastructure platform for content streaming, expanding and adapting our existing platform to a global audience.
  • Champion best practices and uphold a culture that is committed to quality, test-driven development and repeatable processes through automation and infrastructure as code, influencing not only our team, but also our client development and API engineering teams.
  • Develop and support core functionality and components for traffic shaping/routing.
  • Build security into our systems and infrastructure to avert disruption and maintain uptime.
  • Design & develop tooling that ensures resiliency and redundancy for our infrastructure with an eye towards reducing mean time to recovery in failure scenarios. Passion for making production deployments and severity events boring and uneventful.
  • Review project objectives and determine best technologies for implementation. Review and evaluate emerging technologies.
  • Ensure clear, straightforward design and comprehensive documentation of code. Look for ways to continually improve the current codebase and test suite with each commit.
  • Collaborate with other world-class software engineers across the enterprise to deliver ground-breaking content and features for the future of streaming media. Be a trusted resource across our software development teams for Cloud and IaC best practices.
  • Share knowledge, mentor and grow more junior staff.
  • Develop tooling and services that provide real-time insight into service and system health for a large distributed system.
  • Deftly balance the architectural requirements of normal day-to-day operations against those of unprecedented global streaming events such as program premieres.
  • Strive for operational excellence by participating in an on-call rotation as well as contributing to our incident management and blameless post-mortem processes.

KEY QUALIFICATIONS AND EXPERIENCE

  • 5+ years of experience building and operating large-scale, highly available applications in a cloud environment with broad exposure to AWS architecture, networking and cloud security practices
  • Solid experience in designing, implementing and supporting container-based public cloud platforms with IaaS (AWS, Azure) and Kubernetes/Docker
  • Solid Linux experience and experience with DevOps/Infrastructure as Code tooling such as Terraform and configuration management tools like Ansible
  • Experience in systems engineering and operations, especially for systems that span multiple regions or datacenters and are designed for resiliency and scalability
  • Experience in managing the container lifecycle in a service-oriented infrastructure
  • Expert-level software development experience writing large, distributed applications/services in languages and runtimes such as Node.js, Python, Go or Java
  • Experience with monitoring and telemetry tooling: Telegraf, Grafana, InfluxDB, and Prometheus
  • Solid understanding of how the internet works and operates, particularly in client/server transactions with a keen knowledge of HTTP, DNS, REST, etc.
  • Experience creating automated tests as part of the development lifecycle. Passion for test-driven development.
  • Full working knowledge of Git version control
  • A passion for learning, sharing knowledge, mentoring, and working in a team setting with engineers of varying levels of experience
  • Experience with Amazon EKS architecture and operations preferred
  • Familiarity with automated infrastructure testing using tools such as Serverspec, AWSpec, or Terratest preferred
  • Experience with observability tooling such as log aggregation (Splunk/ELK), time-series databases (Prometheus/Graphite) and distributed tracing preferred
  • Experience creating SLAs, SLOs, and SLIs for web-based services preferred