UnifyFS is a user-level file system designed for burst buffers that federates the node-local storage layers of parallel computing systems into a unified namespace for parallel applications. This study details the integration of UnifyFS with the Slurm resource manager and highlights the synergistic effect of this integration on the performance of parallel computing systems. Evaluations on leading parallel computing systems show that the integrated system scales well for data-writing operations and achieves notable improvements in application read/write bandwidth, bringing higher efficiency to parallel computing applications. The paper analyzes the integration mechanisms of UnifyFS and Slurm and explores their potential for coordinating distributed storage with task scheduling, offering a viable solution for the broader parallel computing community. Future research will further optimize this integration framework and add more advanced features to enable wider application and technological innovation.
Parallel simulation tasks in computer-aided engineering (CAE) require software licenses when running on parallel computing clusters. However, most existing license management software lacks interfaces for interacting with cluster scheduling systems. When multiple Slurm clusters share licenses, the job scheduler of each cluster cannot obtain real-time counts of available licenses from the license server, so the same software license may be assigned to multiple jobs, resulting in abnormal job execution. This paper proposes a license management model called PMLP, designed to let multiple Slurm clusters share a common software license resource pool. PMLP monitors the license server port and parses the logs of the license management software to obtain real-time information on available licenses, which it synchronizes across all clusters. Experiments with a historical job dataset from a parallel computing center show that, compared with allocating licenses to each cluster proportionally, average license utilization improves by 18.47%; compared with periodically polling the license server for usage information, the proportion of abnormal jobs falls by 4.76%.
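The core of the approach above is replaying checkout/checkin events from the license manager's log to track real-time availability. A minimal sketch of such a parser is shown below; the log line format, the feature names, and the `available_licenses` helper are illustrative assumptions loosely modeled on FlexLM-style "OUT"/"IN" events, since the abstract does not specify the actual format PMLP parses.

```python
import re

# Hypothetical FlexLM-style checkout/checkin events; the real PMLP log
# format is not given in the abstract.
OUT_RE = re.compile(r'OUT: "(?P<feature>\w+)" (?P<user>\S+)')
IN_RE = re.compile(r'IN: "(?P<feature>\w+)" (?P<user>\S+)')

def available_licenses(log_lines, totals):
    """Replay checkout (OUT) and checkin (IN) events to compute
    per-feature license availability, given total counts per feature."""
    in_use = {feature: 0 for feature in totals}
    for line in log_lines:
        if m := OUT_RE.search(line):
            in_use[m.group("feature")] += 1
        elif m := IN_RE.search(line):
            in_use[m.group("feature")] -= 1
    return {f: totals[f] - used for f, used in in_use.items()}
```

A synchronizer could broadcast the resulting availability map to every cluster's scheduler, which is what lets PMLP avoid double-assigning the same license.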
In parallel computing applications, containerization provides an efficient, reliable, and scalable way to manage and deploy applications, while RDMA (Remote Direct Memory Access) meets their requirements for low-latency, high-bandwidth, high-performance network communication. This research investigates combining containerization with RDMA over IB (InfiniBand) networks: applications and their dependencies are packaged in independent, portable containers, and RDMA is used for fast data transfer and processing. The result is an efficient and flexible operating environment whose performance is comparable to that of physical machines. The study examines in depth how containerization, together with the advantages of RDMA combined with IB networks, can improve the performance and efficiency of massively parallel computing applications.
Slurm (Simple Linux Utility for Resource Management) is a popular open-source cluster job scheduling system. In scenarios with multiple queues and large numbers of scientific computing jobs, Slurm faces challenges such as improperly configured queue depths and uneven job loads across queues. This research optimizes Slurm's scheduling performance in scenarios where resources are shared among multiple queues. Future CPU load growth is forecast from each queue's historical CPU utilization to identify a window period, and the forecasts drive a dynamic queue-priority adjustment method that elevates job priorities in queues with significant backlogs, ensuring prompt execution. The study also introduces a dynamic queue-depth adjustment strategy for the case where an inappropriate queue-depth setting leaves queue resources idle while queued jobs face scheduling delays. Experimental results demonstrate that this approach improves Slurm's scheduling performance in multi-queue scenarios, raising cluster resource utilization and better meeting the diverse demands of users.
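The dynamic priority adjustment described above can be sketched as a simple rule that boosts a queue's priority in proportion to its backlog. The function below is a minimal illustration; the threshold, step size, cap, and linear boost rule are all assumptions, since the abstract does not give the exact adjustment formula used in the study.

```python
def adjusted_priority(base_priority, queued_jobs, threshold=50, step=10, cap=100):
    """Boost a queue's priority once its backlog exceeds a threshold.

    Illustrative sketch: the boost grows linearly with the backlog
    (one `step` per 10 jobs over `threshold`) and is capped at `cap`.
    """
    if queued_jobs <= threshold:
        return base_priority
    boost = min(cap, step * ((queued_jobs - threshold) // 10 + 1))
    return base_priority + boost
```

In a real deployment such a rule would feed into Slurm's partition priority settings (e.g. via `scontrol update`), re-evaluated each scheduling interval using the CPU load forecast.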
KEYWORDS: Data modeling, Education and training, Machine learning, Nomenclature, Random forests, Performance modeling, Mining, Semantics, Matrices, Detection and tracking algorithms
Predicting the resource consumption and completion status of jobs helps improve the scheduling performance of the system, and many studies have shown that the job name can effectively improve prediction accuracy. By mining the structural and semantic information of job names, this paper introduces new job-naming-habit features, including job name length, number of job name elements, and edit distance; it also analyzes each substructure of the job name and adds classification features obtained by clustering. The new features better characterize the similarity between jobs and provide strong support for model prediction. A model trained on the new feature data set achieves significantly higher prediction accuracy than a model that uses only the job name itself.
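The structural features named above (name length, element count, edit distance) can be computed directly from job-name strings. The sketch below is illustrative: the delimiter set used to split names into elements and the `name_features` helper are assumptions, while the edit distance is the standard Levenshtein distance.

```python
import re

def edit_distance(a, b):
    """Levenshtein distance via a rolling one-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            # prev holds dp[i-1][j-1]; dp[j] still holds dp[i-1][j]
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def name_features(name, reference):
    """Structural job-name features; the delimiter set is an assumption."""
    elements = re.split(r"[_\-.]", name)
    return {
        "length": len(name),
        "num_elements": len(elements),
        "edit_distance": edit_distance(name, reference),
    }
```

A small edit distance to a previously seen job name (e.g. `run_v1.sh` vs `run_v2.sh`) signals that the two jobs likely belong to the same workload, which is exactly the similarity the new features aim to capture.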
Cloud-native virtualization technology combines virtualization with cloud-native computing to provide a more efficient, flexible, and scalable cloud computing environment. Analysis and research in bioinformatics typically involve large-scale data sets and complex computing tasks, and the demand for computing power over the research and development cycle alternates between peaks and troughs. The elastic scalability of cloud-native virtualization allows computing resources to be expanded on demand, meeting data processing and analysis requirements throughout the entire research and development cycle. Integrating virtualized InfiniBand high-speed NICs accelerates data transfer and the execution of computational tasks, further shortening the research and development cycle. In summary, cloud-native virtualization technology has significant application value in bioinformatics, providing an efficient computing environment while saving time and costs.
Scientific progress requires large-scale simulations on high-performance computers, which incur significant communication overhead that constrains performance. Optimizing the topological mapping of application processes to computing nodes is therefore essential for enhancing the communication performance of high-performance computers, yet this topic has not been extensively explored in the literature. To reduce the communication overhead of high-performance applications, this study formulates the mapping of application processes to computing nodes as a quadratic assignment problem. The proposed method collects communication features to assess the communication affinity between processes and jointly considers the communication relationships among application processes and the network topology. To overcome the limitations of traditional genetic algorithms, this study introduces elite learning and adaptive selection into the mutation operator: individuals undergoing mutation learn from fragments of the best individuals in the current population, and three functions control the probability of selecting the elite-learning mutation during the mutation process, enhancing the algorithm's efficiency and accuracy. Experiments on the NPB test suites demonstrate that the proposed method yields a noteworthy improvement in communication performance over the widely adopted round-robin mapping, and that the enhanced genetic algorithm achieves superior optimization efficiency compared with conventional genetic algorithms and other heuristic approaches.
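The elite-learning mutation described above can be sketched for a permutation encoding, where each gene is the node assigned to a process. The sketch below copies a contiguous fragment from the elite individual into the mutated child and then repairs the permutation; the fragment length and the repair scheme are illustrative assumptions, not the paper's exact operator.

```python
import random

def elite_learning_mutation(individual, elite, frag_len=3, rng=random):
    """Mutate a permutation by learning a fragment from the elite individual.

    Copies a contiguous fragment of the elite's assignment into the child
    at the same position, then fills the remaining slots with the
    individual's own genes in their original relative order, so the
    result is still a valid permutation (each node used exactly once).
    """
    n = len(individual)
    start = rng.randrange(n - frag_len + 1)
    fragment = elite[start:start + frag_len]
    remaining = [g for g in individual if g not in fragment]
    return remaining[:start] + fragment + remaining[start:]
```

In the full algorithm, the probability of applying this operator (versus an ordinary random mutation) would itself be scheduled by one of the three selection-control functions mentioned in the abstract.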