Job Summary
This position will provide systems administration support as applied to research in AI, bioinformatics, genetics, genomics, imaging, proteomics, and structural biology in a highly interdisciplinary, dynamic and team-based environment. The Systems Administrator will consult with users on hardware and software solutions, define systems scope, and provide recommendations for all systems supported infrastructure as part of regular operations, focused on high-performance computing (HPC). This position also provides systems infrastructure provisioning, configuration, and support in a timely, efficient manner. The Systems Administrator will work a fixed schedule, but flexibility is required because some work must be performed outside of regular business operating hours. The Systems Administrator may also be required to act in an on-call capacity in the event of major service disruptions.
Organizational Status
The position operates in the context of the Michael Smith Laboratories at the Point Grey Campus. The position’s focus is on satisfying the needs of faculty members with advanced compute requirements in design, administration and support of high-performance computing systems. They will work in close partnership with the IT Director, UBC Advanced Research Computing, and other UBC IT staff. The position will report directly to the IT Director with oversight from the Data Science IT advisory committee.
Work Performed
For the purposes of this position, HPC includes database servers, parallel computing clusters, compute servers, distributed and cloud computing, web applications and associated software, and the network and hardware infrastructure used for research. Other IT domains such as email and desktop support are addressed by other personnel.
- Deploys new hardware, software, or security updates, and provides issue resolution related to hardware or software used in HPC systems
- Implements new technologies and services and supports existing technologies and services
- Implements, manages and maintains industry standard infrastructure and services, largely centered on self-provisioning and automation
- Ensures appropriate security is maintained across all technologies and services
- Writes and maintains documentation in accordance with prescribed standards
- Trains and assists laboratory members on the use of the systems
- Configures, installs and maintains server and storage infrastructure, virtualization infrastructure, backup and disaster recovery infrastructure, and patch management
- Performs assessments, diagnostics and issue resolution
- Formulates and defines system scope and objectives and recommends a strategy, potential solution, or “work-around”
- Compiles costing statistics for evaluation and makes recommendations on the purchase of hardware, software, and network equipment
- Monitors and analyzes systems issues and provides recommendations for all systems supported infrastructure as part of regular operations
- Designs, provisions, and configures systems
- Acts as a liaison between technical groups and stakeholders to coordinate system installations and ensure technical compatibility and satisfaction
- Designs solutions to resolve system-related problems, meet user requirements, and streamline system work flows
- Provides recommendations for improving procedures and coordinating system implementation
- Integrates development of best practices, standards, procedures and quality objectives across systems infrastructure or platforms
- Maintains appropriate professional designations and up-to-date knowledge of current information technology techniques and tools
- Performs other related duties as required
Supervision Received
The position will work independently but receives direct supervision from the IT Director, Michael Smith Laboratories. The incumbent will work closely with other MSL and university IT staff. Work is reviewed in terms of achievement of specific task objectives.
Supervision Given
None
Consequence of Error/Judgement
Decisions and actions will have a direct impact on how efficiently and effectively applications will perform and function. Errors could result in serious damage to computing and networking equipment, loss of research time, computing and networking services, and/or loss of all data communications, as well as financial loss to departmental and/or researcher grants. Failure to act decisively could have a detrimental effect resulting in data loss or exposure of personal information and could significantly impact the educational and research mission of the department and damage UBC’s reputation.
Work will involve sensitive data, and the incumbent is expected to follow all UBC Information Security Standards and to exercise judgment, diplomacy, and tact in all interactions.
Minimum Qualifications
Undergraduate degree in a relevant discipline. Minimum of three years of related experience, or the equivalent combination of education and experience.
- Willingness to respect diverse perspectives, including perspectives in conflict with one’s own
- Demonstrates a commitment to enhancing one’s own awareness, knowledge, and skills related to equity, diversity, and inclusion
Preferred Qualifications
University degree in Computer Systems Technology, Computer Science, Mathematics, Statistics, Physics, or a similar discipline. Experience working in an academic research setting is an asset.
Skills
The Systems Administrator demonstrates strong technical, analytical and problem-solving skills in order to design, install, troubleshoot and maintain computational infrastructure. Experience supporting Linux server configurations is necessary. Must be able to move and lift a wide assortment of equipment. Expected to have the skills to plan and carry out multiple tasks and projects, to prioritize and organize effectively, to work under pressure, and to meet established timelines. Ability to work independently and in a team environment. Appropriate professional certifications are an asset. Demonstrated willingness to learn and continually upgrade skills is essential. As this position involves extensive interaction in person as well as by phone and email, excellent written and verbal English communication skills are critical.
The following is a more extensive list of skills that are desirable at time of hire, or that the candidate must demonstrate an ability to acquire:
- Advanced Linux shell programming
- Source-code management with Git
- Database administration/SQL
- MySQL database clustering/replication
- Systems programming in Python and Perl
- Administration of R and RStudio
- Server and configuration management and automation using SaltStack in both Ubuntu and Red Hat Linux environments
- Networking knowledge with the base understanding of TCP/IP, DNS, DHCP, and similar software/protocol stacks
- Web application configuration experience with PHP, Tomcat, and Apache
- Identity Management experience using 389 Directory and OpenLDAP
- NFS, CIFS, and iSCSI
- Juniper, Dell, and HP Switch Programming
- Linux filesystems such as XFS, EXT2/3/4, and ZFS
- Familiarity with physical server hardware, e.g. Dell, HP, Supermicro, and Sun Microsystems servers
- Server Rack, PDU, and UPS Backup Power Management
- Self-provisioning and automation using PXE, Kickstart, Cobbler, FOG, Foreman, and similar toolkits
- Commercial cloud computing environments (e.g. Google, Amazon, and Azure)
- Continuous integration systems such as Bamboo, Jenkins, and OpenShift
- Parallelization and job scheduling for HPC using software such as Rocks, OpenMPI, Torque, Maui, and other job schedulers
- Systems security practices such as SELinux enforcement, hardening of servers, and securing of web applications
Application link: https://ubc.wd10.myworkdayjobs.com/en-US/ubcstaffjobs/job/Systems-Administrator_JR19524
Posting End Date
January 7, 2025
Note: Applications will be accepted until 11:59 PM on the day prior to the Posting End Date.