CQUniversity
File(s) not publicly available

Fairness-aware scheduling of dynamic cross-job coflows in shared datacenters based on meta learning

journal contribution
posted on 2024-04-22, 02:11 authored by H Huang, Hong Shen
In today's shared datacenters, communication among tasks across jobs (applications) typically generates large numbers of coflows. Coflow scheduling that accounts for both system efficiency and job fairness is critical for improving system performance and user satisfaction at the application level. Because the fairness and efficiency metrics usually conflict with each other, scheduling coflows at a desired tradeoff between them is a challenging problem. Owing to the great variety of jobs, and of tasks within each job, communications among them have different characteristics, so their coflows exhibit different patterns. Existing coflow schedulers optimize either a single metric or a tradeoff of both only for static coflow patterns (within the same job), because they cannot capture the inherent patterns of coflows that change dynamically, especially for cross-job coflows. This paper proposes a novel fairness-aware coflow scheduling algorithm that combines a link-embedded multi-layer neural network with a meta-learning framework for unsupervised learning of dynamic coflows across different jobs, and hence for adaptive scheduling of the coflows to achieve the desired fairness–efficiency tradeoff. In our algorithm, the neural network mines the relationships among coflow allocations, while the meta-learning framework captures the dynamic patterns of a large number of sample coflows and trains the neural network to achieve the desired scheduling performance. Extensive experimental results demonstrate that our algorithm outperforms state-of-the-art coflow schedulers in achieving the desired fairness–efficiency tradeoff.

Funding

Category 2 - Other Public Sector Grants Category

History

Volume

100

Start Page

1

End Page

10

Number of Pages

10

eISSN

1879-0755

ISSN

0045-7906

Publisher

Elsevier BV

Language

en

Peer Reviewed

  • Yes

Open Access

  • No

Acceptance Date

2022-02-10

Era Eligible

  • Yes

Journal

Computers and Electrical Engineering

Article Number

107815
