CQUniversity

Can Language Models Help in System Security? Investigating Log Anomaly Detection using BERT

conference contribution
posted on 2024-02-09, 01:14 authored by Crispin Almodovar, Fariza Sabrina, Sarvnaz Karimi, Salahuddin Azad
The log files generated by networked computer systems contain valuable information that can be used to monitor system security and stability. Transformer-based natural language processing methods have proven effective in detecting anomalous activities from system logs. The current approaches, however, have limited practical application because they rely on log templates, which cannot handle variability in log content, or they require supervised training to be effective. We propose a novel log anomaly detection approach named LogFiT. It utilises a pretrained BERT-based language model and fine-tunes it to learn the linguistic structure of system logs. The LogFiT model is trained in a self-supervised manner using normal log data only. Using masked token prediction and centroid distance minimisation as training objectives, the LogFiT model learns to recognise the linguistic patterns associated with the normal log data. During inference, a discriminator function uses the LogFiT model’s top-k token prediction accuracy and computed centroid distance to determine whether the input is normal or anomalous. Our experiments on three different datasets show that LogFiT is effective.
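
The inference-time decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values (`min_accuracy`, `max_distance`), the use of Euclidean distance, and the rule that either signal alone flags an anomaly are all assumptions made for illustration, since the abstract does not specify them. The model's top-k predictions and the input embedding are passed in as plain lists so that the example stays self-contained.

```python
import math
from typing import Sequence


def top_k_accuracy(true_ids: Sequence[int],
                   topk_preds: Sequence[Sequence[int]]) -> float:
    """Fraction of masked positions whose true token id appears among
    the top-k token ids predicted for that position."""
    if not true_ids:
        return 1.0  # assumption: with no masked tokens there is nothing to flag
    hits = sum(1 for true_id, preds in zip(true_ids, topk_preds)
               if true_id in preds)
    return hits / len(true_ids)


def centroid_distance(embedding: Sequence[float],
                      centroid: Sequence[float]) -> float:
    """Euclidean distance between an input embedding and the centroid of
    normal training data (the distance metric is an assumption)."""
    return math.dist(embedding, centroid)


def is_anomaly(true_ids: Sequence[int],
               topk_preds: Sequence[Sequence[int]],
               embedding: Sequence[float],
               centroid: Sequence[float],
               min_accuracy: float = 0.9,   # hypothetical threshold
               max_distance: float = 1.0    # hypothetical threshold
               ) -> bool:
    """Flag the input when the model predicts its masked tokens poorly
    or when its embedding lies far from the normal-data centroid."""
    accuracy = top_k_accuracy(true_ids, topk_preds)
    distance = centroid_distance(embedding, centroid)
    return accuracy < min_accuracy or distance > max_distance


centroid = [0.0, 0.0]
# Both masked tokens are recovered and the embedding is near the centroid.
normal = is_anomaly([5, 7], [[5, 2, 9], [7, 1, 3]], [0.1, 0.2], centroid)
# Neither masked token is recovered and the embedding is far away.
anomalous = is_anomaly([5, 7], [[1, 2, 9], [4, 1, 3]], [3.0, 4.0], centroid)
```

In practice both thresholds would need to be tuned on held-out data, and the two signals could be combined in other ways than the simple "either one triggers" rule assumed here.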

History

Editor

Parameswaran P; Biggs J; Powers D

Start Page

140

End Page

149

Number of Pages

9

Start Date

2022-12-14

Finish Date

2023-01-16

Location

Adelaide, Australia

Publisher

Australasian Language Technology Association

Place of Publication

Online

Additional Rights

CC BY

Peer Reviewed

  • Yes

Open Access

  • Yes

External Author Affiliations

CSIRO

Era Eligible

  • Yes

Name of Conference

The 20th Annual Workshop of the Australasian Language Technology Association

Parent Title

Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association
