FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

FIRA is a learning-based commit message generation approach, which first represents code changes via fine-grained graphs and then learns to generate commit messages automatically. In this repository, we provide our code and the data we use.


  • Python == 3.8.5
  • Pytorch == 1.7.1
  • Numpy == 1.19.2
  • Scipy == 1.5.4
  • Nltk == 3.5
  • Sacrebleu == 1.5.1
  • Sumeval == 0.2.2


The folder DataSet contains all the data which was already preprocessed, and can be directly used to train or evaluate the model.

The folder PreProcess contains the scrips to preprocess data, and you can run

python num_processes num_tasks

to preprocess the data and run


to gather the data and the final dataset will be put in the folder DataSet. We use subprocess module of python to preprocess parallelly. The arguments num_processes and num_tasks are the number of parallel subprocesses and the number of tasks one subprocess executes. The two arguments should be set according to the capacity of the CPU.


We use GNN as encoder and transformer with dual copy mechanism as decoder. We define the model in file If you want to train the model, you can run

python train

and the model will be saved as

If you want to evaluate the model, you can run

python test

and the output commit messages will be saved in OUTPUT/output_fira.


The folder OUTPUT contains the commit messages generated by FIRA and other compared approaches.


The folder Metrics contains the scripts to compute the metrics we use to evaluate our approach, including BLEU, ROUGE-L, METEOR, and Penalty-BLEU. The commands to execute are as follows, and ref is the ground_truth commit message and gen is the generated commit message.,, and are from the scripts provided by Tao et al. [1], who conducted an experimental study on the evaluation of commit message generation models and found that B-Norm BLEU exhibits the most consistently with human judgements on the quality of commit messages.

python ref < gen

python --ref_path ref --gen_path gen

python --ref_path ref --gen_path gen

python ref < gen

Human Evaluation

The folder HumanEvaluation contains the scores of the six participants.


Tao W, Wang Y, Shi E, et al. On the Evaluation of Commit Message Generation Models: An Experimental Study[C]//2021 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2021: 126-136.