GitHub - LogIntelligence/log-analytics-chatgpt: Log Parsing: How Far Can ChatGPT Go? (ASE 2023)

Repository for the paper Log Parsing: How Far Can ChatGPT Go? (ASE 2023 - NIER Track)

Introduction: In this paper, we conduct a preliminary evaluation of ChatGPT for log parsing. We design appropriate prompts to guide ChatGPT to understand the log parsing task and extract the log event/template from the input log messages.

I. Study Design

1.1. Research questions

a) RQ1: Can ChatGPT effectively perform log parsing?

We provide a basic definition of log parsing (i.e., abstracting the dynamic variables in logs) and ask ChatGPT to extract the log template for one log message using the prompt:

You will be provided with a log message delimited by backticks. You must abstract variables with `{placeholders}` to extract the corresponding template.
Print the input log’s template delimited by backticks.

Log message: `[LOG]`
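
A minimal sketch of how the prompt above could be filled in and how a template could be pulled out of the model's reply. The names `build_prompt` and `extract_template` are hypothetical and are not taken from this repository; the rule of taking the last backtick-delimited span is an assumption about how replies are shaped, not the authors' parsing logic.

```python
import re

# Prompt text copied from the RQ1 prompt above; "[LOG]" marks the slot for the input log.
ZERO_SHOT_PROMPT = (
    "You will be provided with a log message delimited by backticks. "
    "You must abstract variables with `{placeholders}` to extract the corresponding template.\n"
    "Print the input log's template delimited by backticks.\n"
    "\n"
    "Log message: `[LOG]`"
)


def build_prompt(log_message: str) -> str:
    """Fill the [LOG] slot (hypothetical helper).

    str.replace is used instead of str.format because the prompt itself
    contains literal braces in `{placeholders}`.
    """
    return ZERO_SHOT_PROMPT.replace("[LOG]", log_message)


def extract_template(response: str) -> str:
    """Return the last backtick-delimited span of a reply (assumed reply shape).

    Falls back to the stripped reply when no backticks are present.
    """
    spans = re.findall(r"`([^`]+)`", response)
    return spans[-1].strip() if spans else response.strip()


if __name__ == "__main__":
    print(build_prompt("Connection from 10.0.0.1 closed"))
    print(extract_template("The template is `Connection from {ip} closed`."))
```

The reply text in the usage lines is invented for illustration; real replies may need additional handling.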

b) RQ2: How does ChatGPT perform with different prompting methods?

We evaluate the performance of ChatGPT on log parsing in two scenarios:

Few-shot scenarios: We use the following prompt to evaluate how well ChatGPT generalizes to a variety of log data.

You will be provided with a log message delimited by backticks. You must abstract variables with `{placeholders}` to extract the corresponding template.
For example:
The template of `[DEMO_LOG1]` is `[TEMPLATE1]`.
The template of `[DEMO_LOG2]` is `[TEMPLATE2]`.
...
Print the input log's template delimited by backticks.

Log message: `[LOG]`
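
The few-shot prompt above can be assembled from a list of (log, template) demonstrations. The sketch below is illustrative only: `build_few_shot_prompt` is a hypothetical helper, and the demonstration pair in the usage lines is made up rather than drawn from the datasets.

```python
from typing import List, Tuple


def build_few_shot_prompt(log_message: str, demos: List[Tuple[str, str]]) -> str:
    """Assemble the few-shot prompt from (demo_log, demo_template) pairs."""
    lines = [
        # Plain (non-f) strings here, so the literal braces are kept as-is.
        "You will be provided with a log message delimited by backticks. "
        "You must abstract variables with `{placeholders}` to extract the corresponding template.",
        "For example:",
    ]
    for demo_log, demo_template in demos:
        lines.append(f"The template of `{demo_log}` is `{demo_template}`.")
    lines.append("Print the input log's template delimited by backticks.")
    lines.append("")
    lines.append(f"Log message: `{log_message}`")
    return "\n".join(lines)


if __name__ == "__main__":
    demos = [("Took 5 ms", "Took {duration} ms")]  # invented demonstration
    print(build_few_shot_prompt("Took 9 ms", demos))
```

Passing an empty `demos` list yields a prompt with the "For example:" line but no demonstrations, so a caller would likely want to fall back to the zero-shot prompt in that case.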

Different prompts: We evaluate the impact of simple and enhanced prompts on log parsing with ChatGPT.

Simple:

You will be provided with a log message delimited by backticks. Please extract the log template from this log message:
`[LOG]`

Log template:

Enhanced:

You will be provided with a log message delimited by backticks. You must identify and abstract all the dynamic variables in logs with `{placeholders}` and output a static log template.
Print the input log's template delimited by backticks.

Log message: `[LOG]`
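
To compare the two variants on the same input, the prompts can be kept in one table and selected by name. This is a sketch under assumptions: the dictionary keys and the `render_prompt` helper are hypothetical and do not reflect how the repository organizes its prompts.

```python
# Prompt texts copied from the two variants above; "[LOG]" marks the input slot.
PROMPT_VARIANTS = {
    "simple": (
        "You will be provided with a log message delimited by backticks. "
        "Please extract the log template from this log message:\n"
        "`[LOG]`\n"
        "\n"
        "Log template:"
    ),
    "enhanced": (
        "You will be provided with a log message delimited by backticks. "
        "You must identify and abstract all the dynamic variables in logs with "
        "`{placeholders}` and output a static log template.\n"
        "Print the input log's template delimited by backticks.\n"
        "\n"
        "Log message: `[LOG]`"
    ),
}


def render_prompt(variant: str, log_message: str) -> str:
    """Fill the [LOG] slot of the chosen variant (hypothetical helper)."""
    return PROMPT_VARIANTS[variant].replace("[LOG]", log_message)


if __name__ == "__main__":
    for name in PROMPT_VARIANTS:
        print(f"--- {name} ---")
        print(render_prompt(name, "Session 42 opened for user root"))
```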

1.2. Datasets

We use 16 representative log datasets from a wide range of systems for the evaluation, including distributed systems (i.e., HDFS, Hadoop, Spark, Zookeeper, OpenStack), supercomputers (i.e., BGL, HPC, Thunderbird), operating systems (i.e., Windows, Linux, Mac), mobile systems (i.e., Android, HealthApp), server applications (i.e., Apache, OpenSSH), and standalone software (i.e., Proxifier). Each dataset contains 2,000 manually labelled log messages. The datasets originate from LogPAI; we use the corrected version from recent studies.

II. Benchmark

1. Set the OpenAI API key (OPEN_AI_KEY) in /chat/__init__.py.
2. Run python main.py to generate log templates with ChatGPT.
3. Set the output directory in /outputs/post_process.py, then run cd outputs && python post_process.py to apply common post-processing rules for log parsing.
4. Set the output directory in evaluate.py, then run python evaluate.py to compute the evaluation results.
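
The post-processing and evaluation steps can be pictured with the sketch below. It is not the code in post_process.py or evaluate.py. It assumes that post-processing maps `{placeholder}` variables to the `<*>` wildcard used in LogPAI-style ground-truth templates, and it shows two metrics that are common in the log parsing literature (message-level accuracy and group accuracy); which metrics the repository actually reports is not stated in this README. All function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def normalize_template(template: str) -> str:
    """Assumed post-processing rule: map {name} placeholders to <*> and collapse whitespace."""
    template = re.sub(r"\{[^{}]*\}", "<*>", template)
    return " ".join(template.split())


def _check(predicted: List[str], truth: List[str]) -> None:
    if not truth or len(predicted) != len(truth):
        raise ValueError("predicted and truth must be non-empty and of equal length")


def message_level_accuracy(predicted: List[str], truth: List[str]) -> float:
    """Fraction of messages whose predicted template equals the ground truth exactly."""
    _check(predicted, truth)
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def group_accuracy(predicted: List[str], truth: List[str]) -> float:
    """Fraction of messages whose predicted group matches a ground-truth group exactly.

    A group is the set of message indices that share one template.
    """
    _check(predicted, truth)
    pred_groups: Dict[str, Set[int]] = defaultdict(set)
    true_groups: Dict[str, Set[int]] = defaultdict(set)
    for i, (p, t) in enumerate(zip(predicted, truth)):
        pred_groups[p].add(i)
        true_groups[t].add(i)
    correct = 0
    for members in pred_groups.values():
        # Compare against the ground-truth group of any one member.
        reference = true_groups[truth[next(iter(members))]]
        if reference == members:
            correct += len(members)
    return correct / len(truth)


if __name__ == "__main__":
    truth = ["A <*>", "A <*>", "B <*>", "B <*>"]
    predicted = [normalize_template(t) for t in ["A {x}", "A {y}", "B 1", "B 2"]]
    print(message_level_accuracy(predicted, truth))
    print(group_accuracy(predicted, truth))
```

In the toy input, the two "B" messages keep their variables unabstracted, so they count as wrong under both metrics: they neither match the ground-truth template nor land in the same predicted group.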

III. Experimental results

3.1. RQ1: Can ChatGPT effectively perform log parsing?

3.2. RQ2: How does ChatGPT perform with different prompting methods?

a) Few-shot scenarios:

b) Different prompts:

Acknowledgement

We adopt the implementation for baselines and evaluation metrics from logparser and an empirical study.
We use the implementation provided by authors for SPINE.
We adopt the few-shot data sampling from LogPPT.

References:

Tools and Benchmarks for Automated Log Parsing. International Conference on Software Engineering (ICSE), 2019.
Guidelines for assessing the accuracy of log message template identification techniques. International Conference on Software Engineering (ICSE), 2022.
Log Parsing with Prompt-based Few-shot Learning. International Conference on Software Engineering (ICSE), 2023.

Citation:

If you find the code and models useful for your research, please cite the following paper:

@INPROCEEDINGS{10298390,
  author={Le, Van-Hoang and Zhang, Hongyu},
  booktitle={2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
  title={Log Parsing: How Far Can ChatGPT Go?},
  year={2023},
  pages={1699-1704},
  doi={10.1109/ASE56229.2023.00206}
}
