Site Reliability Engineer - Private Cloud at JPMorgan Chase Bank, N.A. (Jersey City, NJ)
Location: Jersey City, NJ
Type: Full Time
Created: 2021-12-20 05:01:25
As a JPMorgan Chase & Co. Site Reliability Engineer (SRE), you will combine software and systems to help us build a world-class engineering function. Working with your team, you'll focus on improving our production applications and systems to creatively solve operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation.
Our culture of diversity, intellectual curiosity and problem solving is essential to our success. We bring people together with a wide variety of backgrounds, experiences and perspectives. We support teamwork, thinking big and taking risks in a blame-free environment. We promote self-direction to work on relevant projects, while building an environment that provides the support and mentorship needed to learn and grow. We are excited to see what you will bring to our team.
This role requires a wide variety of strengths and capabilities, including:
§ Expert knowledge of application, data and infrastructure architecture, design and business processes
§ Expertise in working in partnership with colleagues throughout the firm, and in leading collaborative teams to achieve common goals
§ Deep understanding of SRE philosophy, technologies, platforms and tools, SLI/SLO/SLA management, incident resolution, and automation.
§ Hands-on experience managing operations of large-scale internet-centric production environments for application or infrastructure services serving tens to millions of end users.
§ Prior experience in large-scale internet companies/technologies, where uptime and continuous availability were core to the business.
§ Work with Architecture to design reusable patterns to deploy to applications, provide governance around adoption, and influence application development teams on roadmaps and designs.
§ Identify and partner with Infrastructure teams and application development (AD) teams to implement automation opportunities to drive down toil and reduce technical debt.
§ Apply standards of cloud compliance to application design to achieve reliability
§ Good understanding of networking protocols and cybersecurity best practices in cloud environments, e.g., security, load balancing, and network routing protocols.
§ Leads a regional SRE team to implement SRE frameworks that support global multi-cloud environments, and ensures the highest level of SLA through operational excellence
§ Leads the development of product technology roadmap
§ Leads proof of concept research and development in line with emerging industry trends
§ Leads product design, development, and transition to operations
§ Leads failure analysis / root cause analysis when required
§ Provides support to develop & improve the quality of technical engineering documentation
§ Provides support to drive the maturity of the software development lifecycle
§ Provides technical supervision, oversight and problem resolution for engineering activities and quality control of engineering deliverables
§ Provides technical consultation to product management
§ Performs deployment, administration, management, configuration, testing, and integration tasks related to the cloud platforms
§ Helps to develop new cloud engineering strategies and implementations for the firm
§ Champions a DevOps model so that services are automated and elastic across all platforms
§ Supports management of relationships with technology vendors
§ Identifies, recruits, and staffs resources required for product design, development, and test
§ Responsible for coaching and mentoring less experienced team members.
§ Participates in 24x7 SRE on-call rotations and escalation workflows.
§ Bachelor's degree in Computer Science, Information Technology, or equivalent technical field
§ 5+ years of SRE or System Engineering experience with leadership roles.
§ Enterprise cloud infrastructure experience (e.g., AWS, Azure, GCP, Cloud Foundry) in a mission-critical environment
§ Proven experience in the area of people management on globally distributed operation teams
§ In-depth OS experience (e.g., RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills
§ Experience in site reliability engineering using at least one of the following: C, C++, the Java/J2EE technology stack, or web technologies (Python, Go, Perl, Ruby, or shell scripting)
§ Experience in developing and managing operations leveraging key event streaming, messaging and DB services, e.g., Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, Bigtable, DynamoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
§ Hands-on experience working with containers e.g., Docker, Kubernetes, Cloud Foundry, etc.
§ Strong experience in using industry standard monitoring tools, e.g., AppDynamics, Dynatrace, APICA, Splunk, ELK, Fluentd, Prometheus, Kibana, Elasticsearch, Grafana, Nagios, Datadog, New Relic, etc.
§ Strong working knowledge of modern development technologies and tools e.g., Agile, CI/CD, Git, Terraform and Jenkins.
§ Deep knowledge of Internet protocols and web services technologies e.g., HTTP, DNS, TCP/UDP, SOAP, JSON and REST
JPMorgan Chase & Co., one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.
The health and safety of our colleagues, candidates, clients and communities has been a top priority in light of the COVID-19 pandemic. JPMorgan Chase was awarded the "WELL Health-Safety Rating" for all of our 6,200 locations globally based on our operational policies, maintenance protocols, stakeholder engagement and emergency plans to address a post-COVID-19 environment. As a part of our commitment to health and safety, we have implemented various COVID-related health and safety requirements for our workforce. These requirements may include sharing information in the firm's vaccine record tool, vaccination or regular testing, mask wearing, social distancing and daily health checks. Requirements may change in the future with the evolving public health landscape. JPMorgan Chase will consider accommodation requests.
Equal Opportunity Employer/Disability/Veterans