Cloud Reliability & Operations Engineer

Amber People, New York

New York

Base: $160,000 - $170,000

Our client is looking for a senior cloud reliability and operations engineer to join its IT department. This individual will develop the operating model and support the firm’s cloud hosting zones across a number of providers.

This key role focuses on quality, availability, and performance so that the firm’s cloud applications and services meet the demands of the firm’s digital users today and in the future. The individual will need to be proficient in a variety of observability technologies, including availability and performance monitoring and tuning, and in automation, to help define and mature the firm’s cloud management and reporting capabilities.
This role will also help transition 24x7 operational responsibilities to the standard operations teams by enabling new tooling, capabilities, training, and documentation so that the traditional operations team can take on the new cloud-centric responsibilities.

After the support model is established, this role will serve as the L3 escalation point for cloud-based incidents and admin escalations from ops, appdev, and infrastructure teams.

Key Responsibilities
  • Ongoing production operations of AWS- and Azure-hosted infrastructure and applications
  • Drive the development and use of new and self-service tooling to support the operating model for the cloud
  • Improve resiliency for all cloud applications and infrastructure and ensure that HA, DR, and data protection requirements are appropriately engineered and implemented for each workload
  • Stand up cloud environments based on established standards and guard-rails
  • Use configuration management, orchestration and management tooling to ensure cloud environments meet operational and security standards
  • Be a subject matter expert in reducing and resolving production incidents by identifying preventive controls and driving proactive efforts
  • Act as the gatekeeper for all access escalations across all cloud environments
  • Drive to a new operating model – enable tooling and processes so that all L1/L2 operations can be handled by more traditional NOC teams, while remaining the L3 escalation point for cloud incidents and requests
  • Track system uptime and availability and promote incremental increases to change velocity
  • Drive the innovation, prioritization, and engineering of new cloud capabilities to bolster the operating model

Required:

  • 5+ years of reliability and operations experience – Linux, Windows, DevOps, Infrastructure, Network, Cyber
  • 3+ years of experience with cloud – AWS, Azure, VMware
  • Expertise in troubleshooting cloud environments – finding and fixing critical production issues
  • Practical experience with modern scripting languages – Python, PowerShell, Perl, PHP, Shell
  • Experience implementing Infrastructure as Code – Terraform, CloudFormation, Ansible, etc.
  • Expertise in cloud management and monitoring tooling – CloudWatch, Splunk, Datadog, ELK, Prometheus, CloudTrail, etc.
  • Experience with AIOps platforms to automate and shift-left operations functions
  • Experience supporting mission-critical applications and infrastructure on a 24x7 basis
  • Working knowledge of cloud security principles and best practices
  • Working knowledge of cloud networking – DNS, SG, NACL, firewalls
  • Expertise in driving good hygiene in cloud environments – in-place patching, immutability, compliance monitoring (AWS Config), cleanup of technical debt, IAM
  • Experience with a DevOps delivery model for infrastructure, applications, and configuration
  • Experience designing operational state to be policy- and automation-driven
  • Strong communication skills
  • Ability to multitask, work well under pressure, and prioritize work against competing deadlines and changing business priorities

Desired:

  • Experience with Google Cloud Platform
  • Software development experience
  • AWS Certifications