How do you Propose Your Code Changes? Empirical Analysis of Affect Metrics of Pull Requests on GitHub

Software engineering methodologies rely on version control systems such as git to store source code artifacts and manage changes to the codebase. Pull requests include chunks of source code, history of changes, log messages around a proposed change of the mainstream codebase, and much discussion on...


Bibliographic Details
Main Authors: Marco Ortu, Giuseppe Destefanis, Daniel Graziotin, Michele Marchesi, Roberto Tonelli
Format: Article
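Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Software engineering; behavioral software engineering; human aspects; sentiment analysis; software quality; version control systems
Online Access: https://ieeexplore.ieee.org/document/9117137/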
author Marco Ortu
Giuseppe Destefanis
Daniel Graziotin
Michele Marchesi
Roberto Tonelli
collection DOAJ
description Software engineering methodologies rely on version control systems such as git to store source code artifacts and manage changes to the codebase. Pull requests include chunks of source code, a history of changes, log messages around a proposed change to the mainstream codebase, and much discussion on whether to integrate such changes. A better understanding of what contributes to a pull request's fate and latency will allow us to build predictive models of what is going to happen and when. Several factors can influence the acceptance of pull requests, many of which are related to the individual aspects of software developers. In this study, we aim to understand how the affect (e.g., sentiment, discrete emotions, and valence-arousal-dominance dimensions) expressed in the discussion of pull request issues influences the acceptance of pull requests. We conducted a mining study of large git software repositories and analyzed more than 150,000 issues with more than 1,000,000 comments in them. We built a model to understand whether affect and politeness have an impact on the chance of issues and pull requests being merged, i.e., of the code that fixes the issue being integrated into the codebase. We built two logistic classifiers, one without affect metrics and one with them. By comparing the two classifiers, we show that the affect metrics improve the prediction performance. Our results show that valence (expressed in comments received and posted by a reporter) and joy expressed in the comments written by a reporter are linked to a higher likelihood of issues being merged. In contrast, sadness, anger, and arousal expressed in the comments written by a reporter, and anger, arousal, and dominance expressed in the comments received by a reporter, are linked to a lower likelihood of a pull request being merged.
first_indexed 2024-12-16T17:02:39Z
format Article
id doaj.art-07cfb3b47c8442049f9f96669b2c4d4d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-16T17:02:39Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-07cfb3b47c8442049f9f96669b2c4d4d
2022-12-21T22:23:41Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
Volume 8, pages 110897-110907
DOI 10.1109/ACCESS.2020.3002663
IEEE document 9117137
How do you Propose Your Code Changes? Empirical Analysis of Affect Metrics of Pull Requests on GitHub
Marco Ortu (Dipartimento di Ingegneria Elettrica ed Elettronica, University of Cagliari, Cagliari, Italy)
Giuseppe Destefanis, https://orcid.org/0000-0003-3982-6355 (Department of Computer Science, Brunel University London, Uxbridge, U.K.)
Daniel Graziotin (Institute of Software Technology, University of Stuttgart, Stuttgart, Germany)
Michele Marchesi (Dipartimento di Matematica e Informatica, University of Cagliari, Cagliari, Italy)
Roberto Tonelli (Dipartimento di Matematica e Informatica, University of Cagliari, Cagliari, Italy)
https://ieeexplore.ieee.org/document/9117137/
Software engineering; behavioral software engineering; human aspects; sentiment analysis; software quality; version control systems
title How do you Propose Your Code Changes? Empirical Analysis of Affect Metrics of Pull Requests on GitHub
topic Software engineering
behavioral software engineering
human aspects
sentiment analysis
software quality
version control systems
url https://ieeexplore.ieee.org/document/9117137/
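
The abstract describes comparing two logistic classifiers, one without affect metrics and one with them. The sketch below illustrates that kind of comparison on synthetic data only; it is not the authors' model, features, or dataset. The feature names (`size`, `valence`), the coefficients used to generate labels, and all helper functions are hypothetical, and the synthetic labels are constructed to depend on the affect feature, so the result demonstrates the comparison procedure rather than reproducing the paper's finding.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # w[0] is the bias; the remaining weights pair with the features in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights (bias first) by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def log_loss(w, X, y):
    """Mean negative log-likelihood; lower is better."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def make_synthetic(n=600, seed=0):
    # Hypothetical data: "merged" depends on a control feature and an affect feature.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        size = rng.gauss(0, 1)     # assumed control feature (e.g., scaled change size)
        valence = rng.gauss(0, 1)  # assumed affect feature
        p = sigmoid(-0.8 * size + 1.2 * valence)
        rows.append((size, valence))
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


rows, merged = make_synthetic()
train, test = slice(0, 400), slice(400, 600)
X_base = [[r[0]] for r in rows]          # baseline: control feature only
X_affect = [[r[0], r[1]] for r in rows]  # extended: control + affect feature

w_base = fit_logistic(X_base[train], merged[train])
w_affect = fit_logistic(X_affect[train], merged[train])

loss_base = log_loss(w_base, X_base[test], merged[test])
loss_affect = log_loss(w_affect, X_affect[test], merged[test])
print(f"held-out log-loss without affect: {loss_base:.3f}")
print(f"held-out log-loss with affect:    {loss_affect:.3f}")
```

Both models are scored on the same held-out rows, so any difference in log-loss is attributable to the added feature rather than to a different evaluation set.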