Junior Research Operations Engineer at King's College London, Faculty of Natural and Mathematical Sciences (London, UK)


Location: London, UK
Type: Full Time
Created: 2021-09-10 05:00:06


We are seeking a technically skilled graduate, post-graduate, post-doc or early-career ResOps/DevOps/SRE professional to join KCL's expanding e-Research team. The post presents a great opportunity to be one of the first hires into a team with significant investment planned over the next 3 years. KCL is at the start of a significant reboot of its e-Research function, having invested £2,000,000+ over the past 18 months in core platforms that provide Cloud, Storage and HPC capabilities to its world-leading researchers.

This role will provide front-line support to researchers using e-Research infrastructure platforms across KCL's vast multi-disciplinary software developer, computational and data-intensive research community. The post benefits from close mentorship from a highly skilled and experienced senior technical team that will empower the successful candidate to resolve issues and make long-term improvements to the platforms they support.

A non-exhaustive list of technologies that you will work with in this role:

  • Linux: Ubuntu, CentOS
  • Development: Python, Git, Ansible, Puppet, SQL, PHP/Laravel
  • Virtualisation and Cloud: OpenStack, Proxmox VE, Azure, AWS
  • HPC: Slurm, InfiniBand, CUDA
  • Build tools: EasyBuild/Spack/Nix
  • Containers: Docker, Kubernetes, Singularity
  • Monitoring and metrics: Icinga, Prometheus, Grafana, InfluxDB

Some examples of our work can be seen on our public GitHub page: https://github.com/kcl-eresearch

Benefits:

  • Flexible working arrangements (guideline of minimum 1 day on-site per week)
  • On-the-job training from highly skilled colleagues
  • 35-hour week
  • 27 days holiday + Christmas closure
  • 1 day in 10 dedicated to employee-led personal development (attend lectures or research a technology of your choice)

This post will be offered on an indefinite contract. This is a full-time post.

Key responsibilities

  • Respond to support requests from researchers using e-Research Cloud, Storage and HPC systems
  • Procure, install and configure new server hardware (including site visits to data centres located in Slough/Uxbridge)
  • Configure and tune monitoring systems for e-Research infrastructure
  • Diagnose and resolve performance issues and failures detected by monitoring systems
  • Compile and deploy scientific software packages for use on e-Research platforms
  • Meet research groups on campus to understand their workloads and technology requirements
  • Identify and execute on opportunities to eliminate repetitive operational tasks via automation

The above list of responsibilities may not be exhaustive, and the post holder will be required to undertake such tasks and responsibilities as may reasonably be expected within the scope and grading of the post.

Skills, knowledge, and experience

Essential criteria
  • Bachelor's degree in a science or engineering field
  • Confidence using common Linux command line utilities
  • Experience learning from open-source documentation sites, Linux man pages and built-in help commands
  • Competence in at least one high-level programming language and demonstrable ability to learn Python, modern web and configuration management technologies
  • Logical and methodical approach to problem solving and technical troubleshooting
  • Strong verbal and written communication skills
  • Strong customer service ethos with the ability to cater to clients that range from technical novices to subject matter experts
Desirable criteria
  • Strong Python development skills
  • Web development skills
  • Demonstrable Linux administration experience (taught, self-taught or professional)
  • Undergraduate-level exposure to machine learning techniques and technologies
  • Experience configuring and diagnosing computer networks (taught, self-taught or professional)
  • Experience configuring Linux-based storage systems (taught, self-taught or professional)