We conducted a systematic analysis of how the top 250 starred open-source projects on GitHub protect their source code. We defined a few metrics that allowed us to benchmark the effect that turning on security capabilities has on Pull Requests, repo interactions and quality outcomes. Based on the analysis, we note the positive impact that Branch Protection and CODEOWNERS have on Pull Request review quality.
The proliferation of software supply chain attacks highlights the need for better security of source code, CI/CD pipelines and the entire DevOps toolchain.
To better understand which DevOps hardening aspect will be most impactful, we researched the common practices of the top 250 starred projects on GitHub.
The data for this analysis was obtained through queries executed against open-source repositories on GitHub. The dataset contains the following collected metadata:
How do the teams behind these repos manage PRs? Over the period of a year, 42% of the repos had between 0 and 50 PRs. The second and third intervals, 50-200 and 200-800 PRs, account for an additional 19% and 17% of the repos, respectively.
Given this data, we posed a few key questions that will shed light on what development teams are facing. These questions explore the interplay between quality and security controls. The following dimensions were considered:
Let’s run through the combinations of these dimensions.
While the top 250 repos are evenly split between those that use Branch Protection and those that don’t, the repos that use it have, on average, twice as many contributors as those that don’t.
Based on this data, we can infer that Branch Protection policies are used to deal with scale.
The use of CODEOWNERS settings (a sub-option of Branch Protection) in the top 250 repos doesn’t seem to have much traction at the moment.
To analyze the other dimensions, we introduce 3 additional measurements.
TBI is the time elapsed between successive PR events in the lifecycle of a single PR, such as comments, reviews and requested changes. In the example below, the PR was created by ‘itsamyth’ at 2021-08-29 20:31:03.
The first CHANGES_REQUESTED event was entered by ‘dougwilson’ at 2021-08-29 21:01:48, so the first TBI is 30.75 minutes. The next COMMENTED event from ‘itsamyth’ incurs a TBI of about 2623 minutes (the time difference between 2021-08-31 16:44:30 and 2021-08-29 21:01:48).
Since the TBI is assumed to be proportional to the number of changed lines of code, we normalize TBI by lines of code.
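The TBI arithmetic above can be reproduced with a minimal Python sketch. The timestamps come from the example; the function names and the `lines_changed` value of 40 are hypothetical, and dividing each gap by the total lines changed is only one assumed reading of the normalization.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def tbi_minutes(timestamps):
    """Return the gaps, in minutes, between successive PR event timestamps."""
    parsed = [datetime.strptime(t, TIMESTAMP_FORMAT) for t in timestamps]
    return [
        (later - earlier).total_seconds() / 60.0
        for earlier, later in zip(parsed, parsed[1:])
    ]


def normalize_by_lines(gaps, lines_changed):
    """Divide each gap by lines changed (assumed form of the normalization)."""
    if lines_changed <= 0:
        raise ValueError("lines_changed must be positive")
    return [gap / lines_changed for gap in gaps]


# Event timestamps taken from the example PR in the text.
events = [
    "2021-08-29 20:31:03",  # PR created
    "2021-08-29 21:01:48",  # first CHANGES_REQUESTED event
    "2021-08-31 16:44:30",  # next COMMENTED event
]

gaps = tbi_minutes(events)
print([round(g, 2) for g in gaps])  # [30.75, 2622.7]

# Hypothetical PR size; the example in the text does not state it.
lines_changed = 40
print(normalize_by_lines(gaps, lines_changed))
```

The second gap, 2622.7 minutes, rounds to the 2623 minutes quoted above.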
This metric captures the qualitative aspects of PR comments, reviews and changes. PRIS is a score derived from the user association (member, contributor, etc.) and the specific review action (approved, commented, changes requested, etc.) of each PR event. The score for a PR is the sum over all of its events.
The following pseudo code is used to compute the initial basis for PRIS:
We have also normalized the score by the lines of code added and deleted and by the number of reviews in the PR. For the sake of the discussion, we loosely consider a higher PRIS as an indicator of a higher-quality PR review process. Because of Goodhart’s law, however, we don’t imply that it should be used as a definitive PR quality metric.
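To make the additive scoring concrete, here is a minimal Python sketch. It is an illustration and not the scoring table used in this analysis: the weights are placeholder values, multiplying the association weight by the action weight is an assumption, and so is normalizing by the product of lines changed and review count. The association and review-state strings follow the values GitHub reports for PR reviews.

```python
# Placeholder weights chosen for illustration only; they are NOT the
# values used in the analysis.
ASSOCIATION_WEIGHT = {
    "OWNER": 3,
    "MEMBER": 3,
    "COLLABORATOR": 2,
    "CONTRIBUTOR": 1,
}
ACTION_WEIGHT = {
    "CHANGES_REQUESTED": 3,
    "APPROVED": 2,
    "COMMENTED": 1,
}


def pris(events):
    """Sum a per-event score over all review events of one PR.

    Each event is a dict with 'association' and 'state' keys. Unknown
    associations or states contribute zero.
    """
    return sum(
        ASSOCIATION_WEIGHT.get(e["association"], 0)
        * ACTION_WEIGHT.get(e["state"], 0)
        for e in events
    )


def normalized_pris(score, lines_changed, review_count):
    """Normalize by lines changed and review count (assumed form)."""
    if lines_changed <= 0 or review_count <= 0:
        return 0.0
    return score / (lines_changed * review_count)


# Hypothetical review events for a single PR.
events = [
    {"association": "MEMBER", "state": "CHANGES_REQUESTED"},
    {"association": "CONTRIBUTOR", "state": "COMMENTED"},
    {"association": "MEMBER", "state": "APPROVED"},
]

raw = pris(events)
print(raw)  # 16 with the placeholder weights above
print(normalized_pris(raw, lines_changed=40, review_count=len(events)))
```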
To quantify the quality of the PR score, it is important to place it in context. Each contributor in each repository carries a different load. We compute a Mean Reviewer Load Factor that measures how “busy” the reviewers in a repo are, based on the actual lines of code (both additions and deletions) that they reviewed.
The Pull Request Review Quality Score is calculated by multiplying the Mean Reviewer Load Factor by the Mean PRIS. The Mean PRIS is based on the role of each reviewer, so that comments on a PR from its author or from an unknown user do not affect the score. The score therefore reflects a collaborative effort.
Based on the formula “Pull Request Review Quality Score” = (“Mean Reviewer Load Factor” * “Mean PRIS”), a higher score indicates better quality.
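The formula can be sketched in a few lines of Python. The product itself comes from the text; treating the Mean Reviewer Load Factor as the plain average of lines reviewed per reviewer is an assumption, and the reviewer names and numbers are hypothetical.

```python
import math
from statistics import mean


def mean_reviewer_load_factor(lines_reviewed):
    """Average lines (additions + deletions) reviewed per reviewer.

    Assumed definition: the text describes the factor only qualitatively.
    """
    if not lines_reviewed:
        return 0.0
    return mean(lines_reviewed.values())


def review_quality_score(lines_reviewed, pris_scores):
    """Mean Reviewer Load Factor multiplied by Mean PRIS."""
    if not pris_scores:
        return 0.0
    return mean_reviewer_load_factor(lines_reviewed) * mean(pris_scores)


# Hypothetical repo: lines reviewed per reviewer and per-PR PRIS values.
lines_reviewed = {"reviewer_a": 1200, "reviewer_b": 800}
pris_scores = [0.5, 0.3, 0.4]

score = review_quality_score(lines_reviewed, pris_scores)
print(math.isclose(score, 400.0))  # True: 1000 * 0.4
```

Because the score is a product, it rises if either factor rises, so scores are easiest to compare between repos of similar PR volume.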
Now that we have defined these metrics, we present a first cut exploratory data analysis below. For the analysis, we split PR volumes into bins across all the 250 repos.
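The binning step might look like the following Python sketch. The edges 50, 200, 800 and 2000 match the PR intervals mentioned in this article. Right-closed intervals, placing a repo with zero PRs in the first bin, and the "> 2000" overflow label are assumptions, and the repo names and counts are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Interval edges taken from the PR ranges quoted in the article.
BIN_EDGES = [0, 50, 200, 800, 2000]


def pr_bin(pr_count):
    """Return the right-closed interval label for a yearly PR count."""
    if pr_count < 0:
        raise ValueError("pr_count must be non-negative")
    if pr_count > BIN_EDGES[-1]:
        return f"> {BIN_EDGES[-1]}"  # assumed overflow bin
    # bisect_left returns the first edge >= pr_count, so each upper edge
    # is inclusive; max(..., 1) places a count of 0 in the first bin.
    idx = max(bisect_left(BIN_EDGES, pr_count), 1)
    return f"({BIN_EDGES[idx - 1]}, {BIN_EDGES[idx]}]"


# Hypothetical yearly PR counts per repo.
repo_pr_counts = {"repo_a": 12, "repo_b": 50, "repo_c": 430, "repo_d": 1500}

bins = Counter(pr_bin(count) for count in repo_pr_counts.values())
for label, repos in bins.items():
    print(label, repos)
```

Per-bin means of TBI, PRIS and the PR Review Quality Score can then be computed by grouping repos on the label returned by `pr_bin`.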
The above table captures a paradox: the more PRs in a repo, the lower the mean TBI. We believe the explanation is that repos which attract greater contributor interest and a greater scope of code changes, yet process those changes much more quickly, have more efficient processes. Also, because members are a limited resource, Mean PRIS decreases as the number of PRs increases, which means that the quality of the PR review process is moderately impacted.
In the previous section, we looked at the repos’ PR TBI regardless of their CODEOWNERS and/or Branch Protection settings.
Below, we examine the effect of enabling the Branch Protection setting:
In general, Branch Protection has a positive effect on the mean PR TBI and PR Review Quality Score.
Next, we look at the effects of turning on the CODEOWNERS setting. We’ll use two analyses of the PRs processed by organization: an overall approach and a binned approach. The overall PRs processed are below:
We see that introducing the CODEOWNERS setting has a negative impact on mean TBI but not on PR interactions. Thus, we infer that specifying a defined pool of reviewers might play a role here in trading off mean TBI for greater code review interactions and therefore quality.
First, analyzing using overall metrics with both settings available:
We can infer that, in all combinations above, enabling Branch Protection policies increased the overall PR Review Quality Score.
Second, we turn to a PR volume binned approach to the various settings – Branch Protection and CODEOWNERS.
Setting: Branch Protection – ON, CODEOWNERS – ON/OFF
As can be seen from the above table, with both the CODEOWNERS and Branch Protection settings turned on, the mean normalized PR TBI is higher than in repos that have only Branch Protection turned on. Again, we reason that the CODEOWNERS setting restricts the review process to specified individuals or teams, which we hypothesize results in a higher-quality PR process and better outcomes at the expense of a higher (that is, slower) mean TBI. The same holds for the normalized mean TBI.
However, the PR # Interval of (800, 2000] bears further investigation, because the PR Review Quality Score (as well as PRIS) degrades when the CODEOWNERS option is turned on.
Setting: Branch Protection – OFF, CODEOWNERS - OFF
Comparing the repos in Table 7, which have Branch Protection and CODEOWNERS turned off, with the repos in the same PR volume bins in Table 6 shows the positive effect that these settings have on mean TBI and PR Review Quality Score.
In conclusion, we investigated the Branch Protection landscape gleaned from the top 250 starred GitHub repos. The team analyzed the various Branch Protection strategies, configuration options and organizational strategies that are currently in use in the open-source space. We were able to quantify the effect of using the Branch Protection and CODEOWNERS settings across some of the top GitHub open-source repos. While these options represent static approaches to setting up and managing code protection, the Arnica team is looking at dynamic approaches that will supercharge automation efforts in both open and closed source projects that lie at the intersection of large contributor teams, DevOps governance and security. We see a tremendous opportunity in introducing tools and insights that will improve the efficiency of DevOps processes while simultaneously reaping the benefits of better code quality.