Senior Cloud DevOps Engineer - Monitoring

  • Full-Time
  • Austin, TX
  • Teradata
  • Posted 2 years ago – Accepting applications
Job Description
Senior Cloud DevOps Engineer, Monitoring

Location: San Diego, CA

Teradata is growing its Cloud Operations team, and we’re looking for individuals who exemplify our principle of Customer Obsession through operational excellence, leadership, and a passion for continually being the voice of the customer. This is a unique opportunity to join our team in a period of fast growth and expansion. If you are interested in working in a dynamic, fast-paced environment where you can directly influence the future of cloud-based analytics solutions and services, then this is the place for you. You will actively develop and implement state-of-the-art technical solutions, including capabilities to support elastic scalability, on-demand self-service, disaster recovery, and usage-based consumption, to enable customers to solve their most complex data analytics challenges.

Teradata Cloud seeks a Sr Staff DevOps Engineer to lead in building and operating highly scalable, fault-tolerant, and secure systems in a highly distributed and dynamic Hybrid Cloud environment.

Responsibilities

Provide architectural leadership for developing and building highly available systems and software in large distributed and Hybrid Cloud environments

Promote a culture of continuous improvement for technology and processes

Lead in-depth analysis to improve the deployment of cloud-native applications and the monitoring, securing, and support of a large-scale public cloud environment

Analyze existing provisioning processes to identify automation opportunities and improve them

Drive the improvement of proactive alerting using modern monitoring tools such as Datadog, New Relic, and Nagios

Improve system monitoring and observability through log analysis, dashboard creation, and automated alerts based on established service level objectives (SLOs) and service level agreements (SLAs)

Mentor team members in the use of industry-standard best practices

Work with the development teams to clarify runtime infrastructure requirements

Collaborate with other teams to gather requirements and decompose large tasks into small, testable commits

Understand performance and security considerations for the code we deploy

Collaborate with distributed, global teams to achieve common goals

Build automation frameworks and systems to improve time to delivery through the use of modern CI/CD systems

Participate in on-call rotations for escalated support of production customers and systems

Perform and improve SRE / operational functions, such as monitoring and maintenance of production systems

Qualifications

5+ years of relevant job experience

Expert-level, hands-on system administration experience on at least one of the big three public cloud platforms: Google Cloud, Azure, or AWS (Google Cloud and Azure highly preferred)

Expert coding skills

Proven experience with configuration management tools such as Ansible, Puppet, and Chef

Strong experience with test and build systems such as Jenkins, Maven, and Ant

Experience with monitoring and reporting tools such as Datadog, New Relic, Nagios, and Graphite

Strong experience with Linux operating systems

Experience working with database systems, network topologies, and hardware

Experience working with virtualization software such as VMware and OpenStack preferred

Experience working in a hybrid environment preferred

Bachelor’s Degree in computer science or related field preferred