Site Reliability Engineer (SRE) - Java / Python / Linux Production / Operations /... at JPMorgan Chase Bank, N.A. (Jersey City, NJ)
Location: Jersey City, NJ
Type: Full Time
Created: 2021-09-29 05:00:19
As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure, and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you'll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you'll be focused on running better production applications and systems.
As a Site Reliability Engineer, you are responsible for developing and implementing the processes needed to improve application and system reliability, along with operational support. Your expertise in application performance, analyzing complex data systems, anticipating problems, and finding ways to mitigate risk will be key to a high-performing team successfully designing and navigating the program roadmap.
By combining your hands-on knowledge of application development and mission-critical production environments, you will effect change, drive automation, and develop innovative improvements and world-class practices.
You will be responsible for both uplifting and maintaining our evolving technology platforms, infrastructure, and technology controls. This includes production operations of our systems, as well as developing and engineering solutions to improve observability and traceability, and DevOps tasks such as building CI/CD pipelines and maximizing system reliability. You may also be involved in defining Service Level Objectives (SLOs) and measuring performance by implementing Service Level Indicators (SLIs).
Your role also includes root cause analysis of incidents and proactive prevention of recurrence through the creative design and development of technical solutions and process improvements. You will partner with Infrastructure, Operations, and AD teams to identify and implement automation opportunities to drive down toil, reduce technical debt, and improve system stability.
Best of all, you'll be able to harness massive amounts of brainpower through our global network of technologists to tackle big challenges.
Responsibilities:
* Design, code, test, and deliver software to automate manual operational work
* Troubleshoot priority incidents, facilitate blameless post-mortems, and ensure permanent closure of incidents
* Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
* Identify application patterns and analytics in support of better service level objectives
* Design self-healing and resiliency patterns
* Design automated software and product upgrades, change management, and release management solutions
* Coach or manage teams as applicable
* Participate in the 24x7 support coverage as needed
Qualifications:
* Bachelor's degree or equivalent experience in a software engineering discipline
* Expertise in at least one technology stack, including designing, coding, testing, and delivering software
* Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex and mission-critical problems within a business or across the firm
* Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
* Excellent debugging and troubleshooting skills
* Demonstrated experience in systems engineering, software development, and production support in mission-critical environments
* Experience with modern agile software delivery practices such as scrum, continuous integration and delivery (CI/CD), and DevOps
* Experience automating tasks and writing tools in scripting languages (e.g. Python)
* Knowledge of relational database systems such as Oracle and the ability to read and write SQL
* OS and web server experience (RHEL, Apache/Tomcat/Spring Boot) with strong debugging, troubleshooting, and problem-solving skills
* Knowledge of industry practices such as microservices patterns
* Experience with or an understanding of data/event streaming technologies such as Kafka and Spark is a plus
Preferred Skills:
* Java development experience; understanding of data- and event-driven architecture; familiarity with cloud-based architectures and deployment practices
JPMorgan Chase & Co., one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.
Equal Opportunity Employer/Disability/Veterans