HPC Systems Engineer - 114913
Extended Deadline: Thu 10/6/2022
For the safety and well-being of the entire university community, the University of California requires, with few exceptions, that all students, faculty and staff be vaccinated against the COVID-19 virus and influenza before they will be allowed on campus or in a facility or office. For more information visit: Flu Vaccine Mandate / COVID Vaccine Policy
UCSD Layoff from Career Appointment: Apply by 03/07/22 for consideration with preference for rehire. All layoff applicants should contact their Employment Advisor.
Special Selection Applicants: Apply by 03/17/22. Eligible Special Selection clients should contact their Disability Counselor for assistance.
Job posting will remain open until a suitable candidate has been identified.
The San Diego Supercomputer Center (SDSC) is a world leader in using, innovating and providing cyberinfrastructure to enable advances and new discovery in science and engineering. Focusing on data-oriented and computational science and engineering applications, SDSC serves as an international resource for data cyberinfrastructure through the provision of software, hardware and human resources in multi-disciplinary science and engineering, and is a leading national cyberinfrastructure center to the National Science Foundation (NSF) and broader community.
SDSC’s High-Performance Systems Group is responsible for and operates SDSC’s high-performance computing clusters and related systems. The group operates large-scale compute and storage systems funded by the National Science Foundation (currently the XSEDE program), the UCSD campus (e.g., the Triton Shared Compute Cluster) and other entities; these systems support users from campus, national, and international communities across a broad range of scientific disciplines. The group is part of SDSC’s Data-Enabled Scientific Computing (DESC) Division.
The incumbent will apply skills as a seasoned, experienced systems integration professional with a full understanding of systems and software integration concepts to evaluate, resolve and implement medium-sized projects or portions of large projects with moderate scope and complexity. S/he will resolve a wide range of business process, system functionality, implementation, and system and software integration issues; demonstrate competency in selecting tools, methods and techniques to obtain results; give technical presentations to associated teams and other technical units; evaluate new technologies, including performing moderate to complex cost/benefit analyses; and may lead a team of systems/infrastructure professionals.
The HPC Systems Engineer is responsible for the management of national and campus HPC clusters and their related storage systems, such as large parallel file systems, NFS file servers, and the underlying storage technologies. Responsibilities include but are not limited to systems administration (primarily Linux) with on-call duties, including management of hardware, OS, I/O, and software environment installation and maintenance. The incumbent will support resource managers, schedulers and client access to parallel and distributed file systems; conduct multi-faceted analysis, testing, scripting and benchmarking; work with very complex, advanced systems, data and networks in a research and performance evaluation environment; and provide technical expertise in parallel and high-performance filesystems (Lustre, Ceph, GPFS, etc.) and storage. Additionally, s/he will be responsible for system internals, data and storage, network and operating systems, emerging technologies, hardware, and architectures, and the interrelationships among them, and will contribute to the design, installation, management and upgrade of very large HPC clusters, filesystems, data and storage resources.
The incumbent will work closely with other groups to integrate the HPC systems and storage into the SDSC networking, cloud, and user environments; collaborate on security procedure development and implementation; and provide support to the user services and scientific applications group. S/he will also present at national meetings as necessary; work with the Operations group to train their staff; serve as liaison to the computational scientists; and work on multiple problems or tasks that are not necessarily well defined, making recommendations that have an impact on an entire project or system. The HPC Systems Engineer will also provide advanced technical guidance to others at the same or lower level on an ongoing basis; work well in group and collaborative settings, such as national projects like XSEDE and its constituent working groups; and communicate effectively and professionally.
For more information, please visit: https://www.sdsc.edu/
Advanced knowledge of systems integration and of deploying moderately complex systems integration solutions, specifically demonstrated through experience administering large-scale HPC clusters and their related filesystems.
Strong knowledge of administering Linux systems, primarily Red Hat and its derivatives, including services, networking, and file systems.
Experience with complex troubleshooting in a multi-platform environment. Experience troubleshooting and repairing HPC hardware including compute, GPU, storage and network equipment.
Ability to install, maintain, upgrade, and troubleshoot large (petabyte-scale) high-performance parallel and distributed filesystems such as Lustre, GPFS and Ceph.
Strong demonstrated experience with a major configuration management tool, including application packaging and installation.
- Job offer is contingent upon satisfactory clearance based on Background Check results.
- Occasional evenings and weekends may be required.
- Overtime and weekends may be required.
Job offer is contingent on successful engagement in the UC COVID-19 Vaccination program (fully vaccinated with documented proof or approved exception/deferral).
To foster the best possible working and learning environment, UC San Diego strives to cultivate a rich and diverse environment, inclusive and supportive of all students, faculty, staff and visitors. For more information, please visit UC San Diego Principles of Community.
The University of California is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, age, protected veteran status, gender identity or sexual orientation. For the complete University of California nondiscrimination and affirmative action policy see: http://www-hr.ucsd.edu/saa/nondiscr.html
UC San Diego is a smoke and tobacco free environment. Please visit smokefree.ucsd.edu for more information.