Software health is an elusive concept: there is no consensus on its constituents or metrics, and little work addresses it in the blockchain domain. Since the inception of Bitcoin, the proliferation of blockchain projects and tokens has left numerous victims harmed by scammers who quickly copy code, create phishing sites, and lure investors into unhealthy projects. We model blockchain software health using exploratory factor analysis to identify three latent constructs: general Public Interest in the software, Developer Engagement, and Software Robustness. Using publicly available trace data, we find that Public Interest is a combination of stars, forks, and text mentions in the repository, while Robustness is composed of a criticality score, last-updated time, rank, and geographic distribution. A confirmatory factor analysis model completes the project-level picture of health. Cross-validation on the dataset gives good support for the model.
Keywords: blockchain, software health, GitHub, factor analysis, trace data
ID | stars_tot | forks_tot | auth_tot | authors_ma3 | commits_ma3 | comments_ma3 | PR_open_ma3 | days_inactive |
---|---|---|---|---|---|---|---|---|
1 | 72112 | 59013 | 6242 | 150.33 | 360 | 2440.67 | 170.67 | 0.006 |
2 | 4904 | 10079 | 728 | 13.33 | 39 | 23 | 3 | 0.696 |
52 | 4799 | 3023 | 594 | 35.67 | 18.67 | 138 | 19.33 | 0.287 |
74 | 15491 | 4684 | 1663 | 28.33 | 14 | 91.67 | 9.33 | 0.031 |
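
The sample rows above can be loaded and standardized with a few lines of Python. This is a minimal sketch, assuming the excerpt is available as pipe-delimited text. The helper names `parse_pipe_table` and `zscores` are hypothetical and are not part of the project. Standardizing columns is a common preprocessing step before factor analysis, but it is shown here only as an illustration and is not necessarily the authors' pipeline. Four rows are far too few to fit a factor model, so the example stops at preprocessing.

```python
import statistics

# The four sample rows shown above, copied verbatim.
SAMPLE = """\
ID | stars_tot | forks_tot | auth_tot | authors_ma3 | commits_ma3 | comments_ma3 | PR_open_ma3 | days_inactive |
---|---|---|---|---|---|---|---|---|
1 | 72112 | 59013 | 6242 | 150.33 | 360 | 2440.67 | 170.67 | 0.006 |
2 | 4904 | 10079 | 728 | 13.33 | 39 | 23 | 3 | 0.696 |
52 | 4799 | 3023 | 594 | 35.67 | 18.67 | 138 | 19.33 | 0.287 |
74 | 15491 | 4684 | 1663 | 28.33 | 14 | 91.67 | 9.33 | 0.031 |
"""


def split_cells(line):
    """Split one pipe-delimited line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_pipe_table(text):
    """Parse a pipe-delimited table into a list of {column: value} dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = split_cells(lines[0])
    rows = []
    for line in lines[2:]:  # skip the header row and the ---|--- separator
        cells = split_cells(line)
        row = {"ID": int(cells[0])}
        # Every column other than ID is treated as a float (an assumption).
        row.update({name: float(v) for name, v in zip(header[1:], cells[1:])})
        rows.append(row)
    return rows


def zscores(rows, column):
    """Standardize one column to mean 0 and sample standard deviation 1."""
    values = [r[column] for r in rows]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


rows = parse_pipe_table(SAMPLE)
stars_z = zscores(rows, "stars_tot")
```

After running the block, `rows` holds one dictionary per repository keyed by column name, and `stars_z` holds the standardized star counts in the same row order.
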
Column name | Description |
---|---|
ID | tbd |
stars_tot | tbd |
forks_tot | tbd |
auth_tot | tbd |
authors_ma3 | tbd |
commits_ma3 | tbd |
comments_ma3 | tbd |
PR_open_ma3 | tbd |
days_inactive | tbd |
- DOI: to be posted
- pdf download: to be posted