The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

GitStats

lifecycle R-CMD-check Codecov test coverage

With GitStats you can pull git data in a uniform way from GitHub and GitLab. For the time-being you can get data on:

You can also prepare basic statistics with get_*_stats() functions for commits and issues.

Installation

From CRAN:

install.packages("GitStats")

From GitHub:

devtools::install_github("r-world-devs/GitStats")

Examples:

Setup your GitStats:

library(GitStats)

git_stats <- create_gitstats() |>
  set_gitlab_host(
    repos = "mbtests/gitstatstesting"
  ) |>
  set_github_host(
    orgs = "r-world-devs",
    repos = "openpharma/DataFakeR"
  ) 

Get commits:

commits <- git_stats |>
  get_commits(
    since = "2022-01-01"
  )

commits
#> # A tibble: 2,976 × 11
#>    id    committed_date      author author_login author_name additions deletions
#>    <chr> <dttm>              <chr>  <chr>        <chr>           <int>     <int>
#>  1 7f48… 2024-09-10 11:12:59 Macie… maciekbanas  Maciej Ban…         0         0
#>  2 9c66… 2024-09-10 10:35:37 Macie… maciekbanas  Maciej Ban…         0         0
#>  3 fca2… 2024-09-10 10:31:24 Macie… maciekbanas  Maciej Ban…         0         0
#>  4 e8f2… 2023-03-30 14:15:33 Macie… maciekbanas  Maciej Ban…         1         0
#>  5 7e87… 2023-02-10 09:48:55 Macie… maciekbanas  Maciej Ban…         1         1
#>  6 62c4… 2023-02-10 09:17:24 Macie… maciekbanas  Maciej Ban…         2        87
#>  7 55cf… 2023-02-10 09:07:54 Macie… maciekbanas  Maciej Ban…        92         0
#>  8 C_kw… 2023-05-08 09:43:31 Kryst… krystian8207 Krystian I…        18         0
#>  9 C_kw… 2023-04-28 12:30:40 Kamil… <NA>         Kamil Kozi…        18         0
#> 10 C_kw… 2023-03-01 15:05:10 Kryst… krystian8207 Krystian I…       296       153
#> # ℹ 2,966 more rows
#> # ℹ 4 more variables: repository <chr>, organization <chr>, repo_url <chr>,
#> #   api_url <glue>

commits |>
  get_commits_stats(
    time_aggregation = "month",
    group_var = author
  )
#> # A tibble: 307 × 4
#>    stats_date          githost author             stats
#>    <dttm>              <chr>   <chr>              <int>
#>  1 2022-01-01 00:00:00 github  Admin_mschuemi         1
#>  2 2022-01-01 00:00:00 github  Gowtham Rao            5
#>  3 2022-01-01 00:00:00 github  Krystian Igras         1
#>  4 2022-01-01 00:00:00 github  Martijn Schuemie       1
#>  5 2022-02-01 00:00:00 github  Hadley Wickham         3
#>  6 2022-02-01 00:00:00 github  Martijn Schuemie       2
#>  7 2022-02-01 00:00:00 github  Maximilian Girlich    13
#>  8 2022-02-01 00:00:00 github  Reijo Sund             1
#>  9 2022-02-01 00:00:00 github  eitsupi                1
#> 10 2022-03-01 00:00:00 github  Maximilian Girlich    14
#> # ℹ 297 more rows

Get repositories with specific code:

git_stats |>
  get_repos(
    with_code = "shiny",
    add_contributors = FALSE
  )
#> # A tibble: 8 × 16
#>   repo_id      repo_name         organization fullname platform repo_url api_url
#>   <chr>        <chr>             <chr>        <chr>    <chr>    <chr>    <chr>  
#> 1 R_kgDOHYNOFQ cohortBuilder     r-world-devs r-world… github   https:/… https:…
#> 2 R_kgDOHYNrJw shinyCohortBuild… r-world-devs r-world… github   https:/… https:…
#> 3 R_kgDOIvtxsg GitStats          r-world-devs r-world… github   https:/… https:…
#> 4 R_kgDOJAtHJA shinyTimelines    r-world-devs r-world… github   https:/… https:…
#> 5 R_kgDOJWYrCA hypothesis        r-world-devs r-world… github   https:/… https:…
#> 6 R_kgDOM0o1nA queryBuilder      r-world-devs r-world… github   https:/… https:…
#> 7 R_kgDOM0vVsA shinyQueryBuilder r-world-devs r-world… github   https:/… https:…
#> 8 R_kgDONLzQTw GitAI             r-world-devs r-world… github   https:/… https:…
#> # ℹ 9 more variables: created_at <dttm>, last_activity_at <dttm>,
#> #   last_activity <drtn>, default_branch <chr>, stars <int>, forks <int>,
#> #   languages <chr>, issues_open <int>, issues_closed <int>

Get files:

git_stats |>
  get_files(
    pattern = "\\.md",
    depth = 2L
  )
#> # A tibble: 58 × 8
#>    repo_name      repo_id organization file_path file_content file_size repo_url
#>    <chr>          <chr>   <chr>        <chr>     <chr>            <int> <chr>   
#>  1 GitStats Test… gid://… mbtests      README.md "# GitStats…       122 https:/…
#>  2 shinyGizmo     R_kgDO… r-world-devs NEWS.md   "# shinyGiz…      2186 https:/…
#>  3 shinyGizmo     R_kgDO… r-world-devs README.md "\n# shinyG…      2337 https:/…
#>  4 shinyGizmo     R_kgDO… r-world-devs cran-com… "## Test en…      1700 https:/…
#>  5 cohortBuilder  R_kgDO… r-world-devs NEWS.md   "# cohortBu…      1072 https:/…
#>  6 cohortBuilder  R_kgDO… r-world-devs README.md "\n# cohort…     15830 https:/…
#>  7 shinyCohortBu… R_kgDO… r-world-devs NEWS.md   "# shinyCoh…      2018 https:/…
#>  8 shinyCohortBu… R_kgDO… r-world-devs README.md "\n# shinyC…      3355 https:/…
#>  9 cohortBuilder… R_kgDO… r-world-devs README.md "\n# cohort…      3472 https:/…
#> 10 GitStats       R_kgDO… r-world-devs LICENSE.… "# MIT Lice…      1075 https:/…
#> # ℹ 48 more rows
#> # ℹ 1 more variable: api_url <chr>

Print GitStats to see what it stores:

git_stats
#> A GitStats object for 2 hosts: 
#> Hosts: https://gitlab.com/api/v4, https://api.github.com
#> Scanning scope: 
#>  Organizations: [1] r-world-devs
#>  Repositories: [2] mbtests/gitstatstesting, openpharma/DataFakeR
#> Storage: 
#>  Repositories: 8 
#>  Commits: 2976 [date range: 2022-01-01 - 2025-06-27]
#>  Files: 58 [file pattern: \.md]

See also

GitStats is used to facilitate workflow of the GitAI R package, a tool for gathering AI-based knowledge about git repositories: https://r-world-devs.github.io/GitAI/

Acknowledgement

Special thanks to James Black, Karolina Marcinkowska, Kamil Koziej, Matt Secrest, Krystian Igras, Kamil Wais, Adam Forys - for the support in the package development.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.