Enhancing High-Performance Computing (HPC) Security: A Comprehensive Review of Detection and Protection Strategies

Document Type : English Paper

Authors
1 Department of Hardware and Infrastructure, Islamic World Science & Technology Monitoring and Citation Institute (ISC), Shiraz, Iran (Corresponding Author)
2 School of Information and Cybersecurity, Technology University Dublin, Dublin, Ireland
Abstract
The escalating demand for High-Performance Computing (HPC) systems and data analysis across diverse scientific domains has amplified network security issues that need to be addressed urgently. This study provides an in-depth exploration of security challenges, prevalent threats, and exploitable vulnerabilities in HPC systems. The strategies reviewed here for fortifying HPC systems can be systematically classified into detection and protection categories, and are designed to counteract a diverse range of vulnerabilities and threats inherent in HPC systems. This paper offers a comprehensive review and summary of these strategies, encapsulated in a taxonomy diagram, and subsequent sections explore some of the major subjects within this taxonomy in depth. The paper presents two major approaches to software fault detection in HPC systems: static analysis and dynamic analysis. It also discusses how HPC system administrators use different monitoring systems to identify and stop malicious activities, and highlights that most software threats and insider attacks that aim to compromise the Confidentiality, Integrity, and Availability (CIA) of HPC systems can be detected by monitoring. DPEM and VARAN are given as examples of monitoring systems: the former was developed to combat runtime attacks, and the latter to provide the low performance overhead suitable for HPC systems. The paper then discusses protection strategies for HPC systems, emphasizing the importance of Access Control, Randomization, Control Flow Integrity, Multi-Execution, and Fault Tolerance. In addition to the challenges of traditional HPC systems, it discusses some issues in cloud-based HPC systems, such as virtualization overhead and multi-tenancy. The paper also explores the most recent and relevant research on enhancing HPC security from software and hardware perspectives, and summarizes some important case studies on HPC security conducted in different countries/regions in recent years. By offering a detailed taxonomy and a robust security management model, this study aims to empower researchers and system administrators with the knowledge and tools necessary to safeguard their HPC systems and the sensitive data they process.