The akc package classifies keywords automatically through community detection, and different algorithms can be used for the clustering. This vignette provides a benchmark that shows how various algorithms differ on networks of several sizes.
First, we’ll load the needed packages.
Then, we prepare the needed data. The built-in data table biblio_data_table is used here.
Next, we define the combinations of network size and community detection algorithm to be tested:
100:300 -> topn_sample
ls("package:akc") %>%
  str_extract("^group.+") %>%
  na.omit() %>%
  setdiff(c("group_biconnected_component",
            "group_components",
            "group_optimal")) -> com_detect_fun_list
Finally, we’ll implement the computation and record the results.
all = tibble()
for(i in com_detect_fun_list){
  for(j in topn_sample){
    # time the grouping step for algorithm i on the top-j keywords
    system.time({
      clean_data %>%
        keyword_group(top = j, com_detect_fun = get(i)) %>%
        as_tibble -> grouped_network_table
    }) %>% na.omit -> time_info
    # network size and number of detected groups
    grouped_network_table %>% nrow -> node_no
    grouped_network_table %>% distinct(group) %>% nrow -> group_no
    # mean and standard deviation of the group sizes
    grouped_network_table %>%
      count(group) %>%
      summarise(mean(n)) %>%
      .[[1]] -> group_avg_node_no
    grouped_network_table %>%
      count(group) %>%
      summarise(sd(n)) %>%
      .[[1]] -> group_sd_node_no
    # append one result row; the first three timings are kept
    c(com_detect_fun = i,
      topn = j,
      node_no = node_no, group_no = group_no,
      avg = group_avg_node_no,
      sd = group_sd_node_no, time_info[1:3]) %>%
      bind_rows(all, .) -> all
  }
}
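Two details of how each result row is assembled may be easier to see in isolation: which timings system.time() returns, and what c() does when a function name is combined with numbers. The following is a minimal base-R sketch. The timed expression and the row values are illustrative stand-ins, not part of the benchmark.

```r
# Time a small, arbitrary computation (a stand-in for keyword_group)
timing <- system.time(for (k in 1:1e5) sqrt(k))
print(names(timing))

# Drop NA entries, then keep the first three timings as the loop does
time_info <- na.omit(timing)
kept_names <- names(time_info[1:3])
print(kept_names)

# Combining a character value with numbers in c() gives a character vector
# (illustrative values)
result_row <- c(com_detect_fun = "group_louvain", topn = 100, avg = 20.6,
                time_info[1:3])
print(class(result_row))

# Convert a numeric field back and round it to two decimals
avg_num <- round(as.numeric(result_row[["avg"]]), 2)
print(avg_num)
```

Since every row is stored as text, the numeric columns have to be converted with as.numeric() before they can be rounded or compared. The two timings whose names contain "self" can then be dropped to leave only the elapsed time.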
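res = all %>%
  # columns 2 to 9 were stored as text; convert them back and round
  mutate_at(2:9, function(x) as.numeric(x) %>% round(2)) %>%
  # keep one row per algorithm and network size
  distinct(com_detect_fun, node_no, .keep_all = TRUE) %>%
  # drop topn and the user/system timings, leaving the elapsed time
  select(-topn, -contains("self")) %>%
  setNames(c("com_detect_fun", "No. of total nodes", "No. of total groups",
             "Average node number in each group", "Standard deviation of node number",
             "Computer running time for keyword_group function"))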
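The results are displayed in the following table. The running time is the elapsed time in seconds reported by system.time. An NA in the standard deviation column means the algorithm returned a single group, for which the standard deviation is undefined.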
com_detect_fun | No. of total nodes | No. of total groups | Average node number in each group | Standard deviation of node number | Computer running time for keyword_group function |
---|---|---|---|---|---|
group_edge_betweenness | 103 | 36 | 2.86 | 9.17 | 0.50 |
group_edge_betweenness | 207 | 68 | 3.04 | 12.53 | 2.98 |
group_edge_betweenness | 326 | 89 | 3.66 | 13.12 | 10.03 |
group_fast_greedy | 103 | 5 | 20.60 | 8.17 | 0.17 |
group_fast_greedy | 207 | 5 | 41.40 | 24.36 | 0.18 |
group_fast_greedy | 326 | 6 | 54.33 | 34.77 | 0.19 |
group_infomap | 103 | 1 | 103.00 | NA | 0.17 |
group_infomap | 207 | 4 | 51.75 | 94.83 | 0.22 |
group_infomap | 326 | 6 | 54.33 | 114.98 | 0.34 |
group_label_prop | 103 | 1 | 103.00 | NA | 0.16 |
group_label_prop | 207 | 1 | 207.00 | NA | 0.17 |
group_label_prop | 326 | 1 | 326.00 | NA | 0.18 |
group_leading_eigen | 103 | 4 | 25.75 | 9.57 | 0.17 |
group_leading_eigen | 207 | 5 | 41.40 | 19.19 | 0.18 |
group_leading_eigen | 326 | 7 | 46.57 | 35.15 | 0.22 |
group_louvain | 103 | 5 | 20.60 | 12.14 | 0.16 |
group_louvain | 207 | 8 | 25.88 | 14.11 | 0.17 |
group_louvain | 326 | 9 | 36.22 | 19.08 | 0.18 |
group_spinglass | 103 | 5 | 20.60 | 5.13 | 1.66 |
group_spinglass | 207 | 8 | 25.88 | 13.38 | 4.04 |
group_spinglass | 326 | 8 | 40.75 | 12.07 | 7.30 |
group_walktrap | 103 | 103 | 1.00 | 0.00 | 0.16 |
group_walktrap | 207 | 207 | 1.00 | 0.00 | 0.17 |
group_walktrap | 326 | 326 | 1.00 | 0.00 | 0.17 |
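
The two extreme cases in the table (a single group containing every node, and one group per node) can be reproduced with a short base-R sketch. The group-size vectors below are built by hand from the 103-node rows and are not output of keyword_group.

```r
# One group holding all 103 nodes (as in the group_label_prop row)
single_group <- c(103)
sd_single <- sd(single_group)      # sd of a single value is NA
print(sd_single)

# 103 groups of one node each (as in the group_walktrap row)
singletons <- rep(1, 103)
mean_singletons <- mean(singletons)
sd_singletons <- sd(singletons)
print(c(mean_singletons, sd_singletons))

# Five groups over 103 nodes give the average size reported for
# group_fast_greedy
avg_five_groups <- round(103 / 5, 2)
print(avg_five_groups)
```

In both extreme cases the algorithm has not produced a useful partition, so the short running times in those rows say little about clustering quality.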
The session information is shown below:
sessionInfo()
#> R version 4.4.2 (2024-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26100)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=Chinese (Simplified)_China.utf8
#> [3] LC_MONETARY=Chinese (Simplified)_China.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=Chinese (Simplified)_China.utf8
#>
#> time zone: Asia/Shanghai
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.37 R6_2.5.1 fastmap_1.2.0 xfun_0.49
#> [5] cachem_1.1.0 knitr_1.49 htmltools_0.5.8.1 rmarkdown_2.29
#> [9] lifecycle_1.0.4 cli_3.6.3 sass_0.4.9 jquerylib_0.1.4
#> [13] compiler_4.4.2 rstudioapi_0.17.1 tools_4.4.2 evaluate_1.0.1
#> [17] bslib_0.8.0 yaml_2.3.10 rlang_1.1.4 jsonlite_1.8.9