Words used in Portuguese Wikipedia
This data package contains a dataset of the words used in a random sample of about 15,000 pages from the Portuguese Wikipedia.
It can be installed using:
devtools::install_github("dfalbel/ptwikiwords")
After installing the package, you can load the dataset using:
library(ptwikiwords)
data(ptwikiwords)
head(ptwikiwords)
#> # A tibble: 6 × 3
#> word count check
#> <chr> <int> <lgl>
#> 1 de 210954 TRUE
#> 2 a 109652 TRUE
#> 3 e 100028 TRUE
#> 4 o 87839 TRUE
#> 5 em 67040 TRUE
#> 6 do 59489 TRUE
The dataset contains 3 columns: word (character), count (integer), and check (logical).
Here is a wordcloud of those words:
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(wordcloud))
words_filter <- ptwikiwords %>%
  filter(check == TRUE) %>%
  slice(1:300)
wordcloud(words_filter$word, words_filter$count)
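The filtering step above can also be sketched in base R. This minimal example assumes a small stand-in data frame built from the six rows printed by head() earlier, so it runs without installing the package. The names toy and top_n_checked are hypothetical and are not part of ptwikiwords.

```r
# Stand-in for ptwikiwords: only the six rows shown by head() above.
toy <- data.frame(
  word  = c("de", "a", "e", "o", "em", "do"),
  count = c(210954L, 109652L, 100028L, 87839L, 67040L, 59489L),
  check = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where check is TRUE, then take the first n
# rows in their existing order (mirrors filter() followed by slice(1:n)).
top_n_checked <- function(df, n) {
  # which() drops NA values, as dplyr::filter() does.
  kept <- df[which(df$check), , drop = FALSE]
  head(kept, n)
}

top3 <- top_n_checked(toy, 3)
print(top3)
```

With the real dataset, the same helper could be called as top_n_checked(ptwikiwords, 300) to get the rows passed to wordcloud().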