Site Reliability Engineer

TEK INSPIRATIONS, Dallas
Job Description

Urgent Role - 100% Offer!!

Site Reliability Engineer
100% Remote
Duration: 6+ months
Need Updated LinkedIn

Primary Purpose of Position

The Site Reliability Engineer will architect, develop, and maintain cloud environments in both the commercial and government cloud.

The role will work closely with software engineers, architects, and DevOps engineers to architect and maintain a secure, resilient, and high-performance cloud infrastructure.

Main must-have skills: Azure, Kubernetes, Terraform, and monitoring tools.

Must-Have Requirements:
• 6+ years of experience working within a cloud engineer/SRE role
• Expert knowledge of a cloud service provider
• Expert knowledge and hands-on production experience in Kubernetes (bare metal or managed) cluster setup and management
• Experience with infrastructure as code (IaC) tools like Terraform and Pulumi
• Experience with Kubernetes deployment tools like Helm, ArgoCD, and Flux
• Strong awareness of networking and internet protocols
• Understanding of identity and access management (IAM)
• Experience supporting infrastructure in production cloud environments
• Knowledge of encryption and Public Key Infrastructure (PKI), and an understanding of OWASP
• Experience working with RESTful services
• Some experience with monitoring tools (Azure Monitor, Splunk, Dynatrace, Grafana, Prometheus)
• Familiarity with IDEs and source control tools like Visual Studio Code and Git
• Bachelor's degree in Computer Science, Information Technology, Software Engineering, Math, or Physics
• Master's degree with coursework focused on advanced algorithms, mathematics in computing, data structures, or a related field
• Expert knowledge of Azure
• Demonstrated passion for infrastructure automation
• Ability to prioritize work in a fast-paced environment

Essential Functions Include:
• Build, maintain, and operate IaaS and PaaS infrastructure in Azure commercial and government clouds
• Work closely with dev teams to identify and measure SLOs, SLAs, and SLIs
• Act as a strong contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
• Perform integrations with central logging, metrics dashboards, instrumentation, and incident monitoring and management
• Build, integrate, and administer systems and tools that enable engineering teams to observe their applications in production with autonomy (dashboards, APMs)
• Support software and/or cloud infrastructure as part of an on-call rotation
• Assist with identifying and remediating technical problems at the root cause by continuously implementing automation, self-healing, and real-time monitoring in production systems
• Maintain and improve operational tooling and frameworks
• Build frameworks that test the performance and resiliency of our platform services/tools
• Automate alerts for metrics on performance, cost, vulnerabilities, risk, and compliance violations
• Improve processes and champion automation of any manual items around support

Ellofant Consulting, Dallas
or the employer. Overview: We are seeking a highly skilled and proactive Site Reliability Engineer (SRE) / Resiliency Engineer to join our team in support of a leading financial services client. As a leading technology strategy firm, we are dedicated to helping...
Login Consulting Services Inc., Plano, 6 mi from Dallas
have achieved mastery in one of the SecDevOps practices: IaC-Terraform, SONAR, Git-CI Cloud-AWS, CICD-Git/Argo, microservices architecture deployment (primarily Kubernetes) with awareness about site reliability engineering principles, coupled...
Urgent

Senior Maintenance Engineer

Adecco, Arlington, 33 mi from Dallas
in troubleshooting, diagnostics, and equipment optimization techniques. Reliability Engineering & Continuous Improvement  •  Utilize Reliability-Centered Maintenance (RCM) and Failure Modes and Effects Analysis (FMEA).  •  Implement Condition-Based Maintenance...