CQUniversity

Optimizing job scheduling by using broad learning to predict execution times on HPC clusters

journal contribution
posted on 2024-09-10, 01:57 authored by Z Hou, Hong Shen, Q Feng, Z Lv, J Jin, X Zhou, J Gu
Small and medium-sized high-performance computing (HPC) clusters are widely used across many applications. How to use the log data they have accumulated to optimize job scheduling with machine learning techniques is an interesting problem. Most existing work uses common machine learning algorithms, such as multivariate linear regression and polynomial models, to predict job runtime and optimize job scheduling. These approaches either ignore the interference among job features or incur a high time overhead to improve prediction accuracy. In this paper, we propose implementing and improving the broad learning algorithm to predict the execution times of newly submitted jobs more accurately and efficiently. The experimental results show that the proposed method achieves high prediction accuracy with negligible time overhead, and that the predicted job execution times can help improve the efficiency of job scheduling and of HPC systems.
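This record does not describe the paper's improved algorithm, so the following is only a minimal sketch of a generic broad learning system applied to runtime prediction, not the authors' method. It assumes the usual three-part layout: randomly weighted feature nodes, tanh enhancement nodes built on top of them, and output weights obtained in closed form by ridge regression. The class name `BroadLearner`, the three job features (cores, requested walltime, input size), and the synthetic runtime relation are all hypothetical and chosen purely for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + rhs[:] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(n):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return [[v / m[i][i] for v in m[i][n:]] for i in range(n)]


class BroadLearner:
    """Hypothetical minimal broad learning regressor (illustration only)."""

    def __init__(self, n_feature=6, n_enhance=10, ridge=1e-2, seed=0):
        self.n_feature, self.n_enhance, self.ridge = n_feature, n_enhance, ridge
        self.rng = random.Random(seed)

    def _rand(self, rows, cols):
        return [[self.rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    def _expand(self, x):
        xb = [row + [1.0] for row in x]            # append a bias input
        z = matmul(xb, self.w_feat)                # linear random feature nodes
        zb = [row + [1.0] for row in z]
        h = [[math.tanh(v) for v in row] for row in matmul(zb, self.w_enh)]
        return [zr + hr for zr, hr in zip(z, h)]   # broad layer [Z | H]

    def fit(self, x, y):
        self.w_feat = self._rand(len(x[0]) + 1, self.n_feature)
        self.w_enh = self._rand(self.n_feature + 1, self.n_enhance)
        a = self._expand(x)
        at = [list(col) for col in zip(*a)]
        gram = matmul(at, a)
        for i in range(len(gram)):
            gram[i][i] += self.ridge               # ridge regularization
        # Closed-form output weights: (A^T A + ridge * I)^-1 A^T y
        self.w_out = solve(gram, matmul(at, [[v] for v in y]))
        return self

    def predict(self, x):
        return [row[0] for row in matmul(self._expand(x), self.w_out)]


# Synthetic stand-in for historical job logs; the runtime relation is made up.
rng = random.Random(42)


def synthetic_job():
    cores, walltime, size = rng.random(), rng.random(), rng.random()
    return [cores, walltime, size], 10 + 30 * walltime + 20 * cores * size


jobs = [synthetic_job() for _ in range(200)]
train, test = jobs[:150], jobs[150:]
x_train, y_train = [f for f, _ in train], [t for _, t in train]
x_test, y_test = [f for f, _ in test], [t for _, t in test]

pred = BroadLearner().fit(x_train, y_train).predict(x_test)
mean_runtime = sum(y_train) / len(y_train)
model_mae = sum(abs(p - t) for p, t in zip(pred, y_test)) / len(y_test)
baseline_mae = sum(abs(mean_runtime - t) for t in y_test) / len(y_test)
print(f"model MAE: {model_mae:.3f}  mean-baseline MAE: {baseline_mae:.3f}")
```

Because the only trained parameters are the output weights, fitting reduces to a single regularized linear solve, which is one plausible reading of why such a model could add little time overhead. In a scheduler, the predicted runtimes might replace user-supplied walltime estimates when ordering or backfilling jobs, although how the paper integrates them is not stated in this record.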

Volume

6

Issue

4

Start Page

365

End Page

377

Number of Pages

13

eISSN

2524-4930

ISSN

2524-4922

Publisher

Springer Science and Business Media LLC

Language

en

Peer Reviewed

  • Yes

Open Access

  • No

Acceptance Date

2023-01-14

Era Eligible

  • Yes

Journal

CCF Transactions on High Performance Computing
