Site Reliability Engineer III

  • Full-Time
  • Irving, TX
  • GM Financial
  • Posted 3 years ago – Accepting applications
Job Description
Overview: The Site Reliability Engineering (SRE) team provides leadership, direction, and accountability for building and running large-scale software systems. As a Site Reliability Engineer, you will identify and deliver automation solutions designed to ensure high availability and resiliency using your expertise in software development, complexity analysis, and scalable system design. Strong collaboration skills will be required to work closely with other engineering teams to ensure services/systems are highly stable and performant, meeting the expectations of our business partners and end users.

Responsibilities:
JOB DUTIES
  • Guide the architecture and development teams on how to make applications highly available, reliable, and performant at global scale
  • Partner with architecture team to ensure operability, measurability, and manageability are accounted for in business features and enablers
  • Collaborate with product owners and managers to establish service level objectives for applications and agreed consequences if the objectives are not being met
  • Collaborate with development team members to swarm, troubleshoot, and resolve problems
  • Drive the Root Cause Analysis of production issues and other failures within the product software, pipeline, or other DevOps support processes or technology
  • Design, build, and champion automated solutions to optimize application/service/platform uptime with minimal human intervention
  • Be available for an on-call rotation to participate in troubleshooting and communication efforts outside of normal business hours
  • Create and implement standards and best practices, driving adoption across development teams and external vendors as applicable
  • Perform other duties as assigned
  • Conform with all company policies and procedures
REPORTING RELATIONSHIP

AVP Software Solutions US

Qualifications:
Knowledge
  • Authority in defining, implementing, and evaluating Service Level Objectives (SLO) and Service Level Indicators (SLI), and associated consequences
  • Software development expertise in multiple high-level programming and scripting languages
  • Expert in evolutionary database design, query performance analysis, and indexing as a cornerstone for delivering scalable, performant products and services
  • Expert in designing, building, and optimizing automated pipelines with automated testing and automated security controls
  • Expert in performing Root Cause Analysis and Problem Management
  • Experience working in Agile Scrum teams with demonstrated success leading improvements (getting better/faster/happier)
Skills
  • Help establish and maintain a culture of learning through the development and sharing of skills, knowledge, processes, and tools; combat traditional silos that create “us and them” environments
  • A driving passion for finding solutions to hard problems at scale and operationalizing them
  • Exceptional critical thinking and communication skills, with a passion for leveraging documentation as a tool for constant improvement
Additional Knowledge Skills and Abilities
  • Pipeline Automation: Azure DevOps (YAML, ARM), Terraform, Jenkins, Chef, Octopus Deploy
  • Code Scanning: SonarQube, Checkmarx
  • Source Code repos: Git
  • Containerization: Azure Kubernetes Service, Kubernetes (open source), Docker
  • High-level programming languages: Java, C# (.NET MVC and .NET Core), Go
  • Scripting: PowerShell, Bash
  • Database: Oracle, Microsoft SQL Server, NoSQL (e.g., CosmosDB)
  • Test Automation: Xamarin.UITest, SpecFlow, DevTest, Selenium, Test Data Manager, Postman, Maven, TestNG, JMeter
  • Operating systems: Windows, Linux
  • Cloud Platforms: Azure
  • Metrics and Monitoring: Splunk
Education
  • Bachelor’s Degree in related field or equivalent work or military experience required
  • Master’s Degree in related field preferred
Experience
  • 5-7 years of experience in software development and test automation required
  • 5-7 years of web development experience strongly preferred
  • 3-5 years of site reliability engineering experience
Working Conditions
  • Normal office environment subject to stressful situations
  • Long hours, including weekends and holidays, may be required
  • Limited travel may be required to support business needs