168456BR - Sr Engineer, Systems Reliability - Promotions at T-Mobile (Overland Park, KS)
Company: T-Mobile
Location: Overland Park, KS
Type: Full Time
Created: 2021-07-16 05:00:51
Job Description: The System Reliability Engineer (SRE) improves and protects the software and systems behind T-Mobile's Promotions Information Management systems and associated services. This includes managing scalability, availability, latency, performance, security, and capacity, and delivering software faster, better, and cheaper. From designing and maintaining CI/CD pipelines to building the next generation of T-Mobile applications on cloud-native platforms, SREs enable a great customer experience and product innovation through continuous improvement of operational support.
Responsibilities:
Technology and System
- Demonstrates fluency in emerging DevOps-centric automation tools and technologies for CI/CD, configuration management, etc. for non-production environments.
- Performs environment management and automated server provisioning (VMs).
- Delivers software to improve the availability, scalability, latency, and efficiency of T-Mobile’s services.
- Creates, manages, and uses dashboards for continuous monitoring and health checks of applications and the underlying infrastructure, and uses the monitoring feedback to improve the quality of services in non-production environments.
- Contributes to future improvements of software delivery processes and operations, e.g., cloud enablement and the use of microservices with containerization.
- Owns telemetry configuration and management: infrastructure, network, database, and application monitoring, including threshold definition and dashboard management, with the goal of maintaining a high-availability environment (HA/DR).
- Owns the disaster recovery strategy for infrastructure and application components.
- Manages incidents and conducts post-incident reviews.
- Coordinates change management activities and orchestrates release activities for lower-environment and production deployments.
- Defines SLAs for application and infrastructure performance.
Qualifications:
- Expertise with the following: AWS Management, EC2, ELB, S3, CloudFront, Route 53, Kubernetes, Docker, GitLab, Terraform, Windows Administration, DNS, network routing, and HA/DR practices.
- 4-7 years of relevant experience in a similar role.
- Proven experience in one or more of C, C#, Java, Perl, Python, or Go, or scripting experience in Shell and Perl.
- Experience with Continuous Integration/Continuous Delivery tools such as GitLab Pipelines and Octopus, and with other automation tools.
- Experience with APM tools such as AppDynamics and Splunk.
- Experience managing and operating a cloud environment (public/private).
- Experience migrating to cloud or cloud-native environments is preferred.
- Bachelor’s degree in Computer Science or a related field. In lieu of degree, equivalent industry experience may be considered.