Site Reliability Engineer - OS & Cloud Native Applications

  • Location
    Feltham, United Kingdom
  • Area of Interest
    Engineer - Software
  • Job Type
  • Technology Interest
    Cloud and Data Center
  • Job Id
Site Reliability Engineer

Location: Bedfont Lakes, Feltham, UK

We at Cisco are looking for an SRE/Cloud Engineer to join our IT team and enable our IaaS strategy in support of next-generation cloud-native applications.

Why you will love Cisco:

For years, Cisco's vision has been to change the way the world works, lives, plays, and learns. Our vision is more relevant today than ever. We made the Internet what it is today. First, we focused on creating connectivity. Now, we're entering the Internet of Everything transition—an era where we'll help create unprecedented value by connecting the unconnected. The Internet of Everything is a global industry phenomenon that is driving the biggest market transition for Cisco and our customers. This includes the intelligent connection of people, process, data, and things. It's where everything is converged on the Internet, making networked connections more relevant and valuable than before. To help us bring this vision to life, join us in our exciting journey.

We celebrate the creativity and diversity that fuel our innovation. We are dreamers and we are doers.

What You’ll Do

You will join our Infrastructure Services (IS) team. You will help us move from scripts and playbooks to a fully automated pipeline running through our Continuous Integration/Continuous Delivery (CI/CD) system. We want to move to zero-touch automation of builds, patches, and upgrades across our hosting infrastructure, VMware, and OpenStack environments.

Responsibilities include:

Participate in Agile Scrum
Automate OS provisioning, lifecycle, and configuration management using Python, Ansible, Terraform, and other tools
Deliver releases through the CI/CD pipeline
Handle security patches and issues at the UCS hardware and RHEL OS layers
Diagnose and fix L3 support incidents
Serve as an Incident Management escalation point for major incidents
Undertake root cause analysis of major incidents
Work on monitoring and alerting requirements for our hosting platform
Deliver bug fixes, new features, and functionality as requested by our customers
Participate in the on-call rotation

Who You'll Work With

We are a DevOps team inside Cisco IT, maintaining and building the next-generation hybrid platform, which is hosted both on-premises and in cloud environments. Our services will be used by all Cisco business units as we move to cloud-native applications. This is a team of highly motivated individuals leveraging Agile Scrum. We move at a fast pace and are passionate about the cloud, automation, and security. Giving back and contributing to open-source projects is encouraged.

Who You Are

Ideally, you know servers, cloud, and storage; you have proven experience with Python, Git, Ansible, and Terraform; and you understand the needs of IT infrastructure customers.

Technical Expertise:

BS/MS and 10 years of relevant experience
Experience in leading large-scale infrastructure with more than 10k compute systems
Red Hat Enterprise Linux build, development, and operations
Experience with configuration management tools (Ansible and/or Puppet)
Automation of OS configuration, builds, upgrades, and patching
Software development lifecycle, including design, development, testing, packaging, deployment, upgrade, and support, with Git and Jenkins
Python, Ruby, or similar programming experience
QA and testing experience covering your own code and the entire platform
Understanding of security, including OS hardening, firewalls, and iptables, and experience working with Security teams
Understanding of Cisco UCS compute infrastructure
Availability for pager duty on a rotation basis, including weekends
Proven ability to respond to critical issues on a 24/7/365 basis and to own problems from beginning to end

Non-Technical Requirements:

Agile software development practices
Work with geographically distributed teams
Understand IT processes, including architecture, design, implementation, and operations
Open-source development experience
Self-motivated, able and willing to help where help is needed
Able to build and maintain relationships, with cultural sensitivity, goal alignment, and learning agility