charlatan makes fake data, inspired by and borrowing some code from Python's faker.
Why would you want to make fake data? Here are some possible use cases to give you a sense of what you can do with this package:
The package offers several interfaces:
- R6 objects that a user can initialize and then call methods on. These contain all the logic that the interfaces below use.
- ch_*() functions that wrap the low-level interfaces. They are meant to be easier to use and provide an easy way to make many instances of a thing.
- ch_generate() - generate a data.frame with fake data, choosing which columns to include from the data types provided in charlatan
- fraudster() - single interface to all fake data methods; returns vectors/lists of data; this function wraps the ch_*() functions described above
Stable version from CRAN
install.packages("charlatan")
Development version from GitHub
devtools::install_github("ropensci/charlatan")
library("charlatan")
… for all fake data operations
x <- fraudster()
x$job()
#> [1] "Pilot, airline"
x$name()
#> [1] "Kit Olson"
x$job()
#> [1] "Civil Service fast streamer"
x$color_name()
#> [1] "PaleGoldenRod"
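As a small illustration of how the interfaces relate, the sketch below fetches the same kind of value through the fraudster() object and through a standalone ch_*() function. It uses only calls that appear in this document; the variable names are made up for the example, and the values returned will differ on every run.

```r
library("charlatan")

x <- fraudster()

# The same kind of value, reached through two interfaces
job_from_object   <- x$job()   # method on the fraudster object
job_from_function <- ch_job()  # standalone ch_*() wrapper

# Both are length-one character vectors
str(job_from_object)
str(job_from_function)

# Call a method repeatedly to collect several values
several_colors <- vapply(1:3, function(i) x$color_name(), character(1))
several_colors
```

When many values are needed, passing n to the matching ch_*() function (as in the examples further down) is likely simpler than looping over a method.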
More locales are being added over time, e.g.,
Locale support for job data
ch_job(locale = "en_US", n = 3)
#> [1] "Community arts worker" "Exercise physiologist" "Bookseller"
ch_job(locale = "fr_FR", n = 3)
#> [1] "Auxiliaire de vie sociale" "Chef monteur"
#> [3] "Biologiste en environnement"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Pregledač vagona"
#> [2] "Vozač teretnog motornog vozila i autobusa"
#> [3] "Djelatnik koji obavlja poslove izvođenja glasnog pucnja"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Випробувач" "Мірошник" "Швачка"
ch_job(locale = "zh_TW", n = 3)
#> [1] "播音/配音人員" "麵包師" "產品維修人員"
For colors:
ch_color_name(locale = "en_US", n = 3)
#> [1] "RoyalBlue" "Pink" "Khaki"
ch_color_name(locale = "uk_UA", n = 3)
#> [1] "Колір засмаги" "Темно-кораловий" "Блаватний"
More coming soon …
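To compare several locales side by side, one option is to loop over locale codes and collect the results in a named list. This is only a sketch: it reuses the ch_job() and ch_color_name() calls shown above, and the locale vectors contain just the codes that already appear in this document.

```r
library("charlatan")

# Locale codes taken from the examples above
job_locales <- c("en_US", "fr_FR", "hr_HR", "uk_UA", "zh_TW")

# One character vector of three jobs per locale
jobs_by_locale <- lapply(job_locales, function(loc) ch_job(locale = loc, n = 3))
names(jobs_by_locale) <- job_locales
str(jobs_by_locale)

# The same pattern for color names, using the two locales shown above
color_locales <- c("en_US", "uk_UA")
colors_by_locale <- lapply(color_locales,
                           function(loc) ch_color_name(locale = loc, n = 3))
names(colors_by_locale) <- color_locales
str(colors_by_locale)
```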
ch_generate()
#> # A tibble: 10 x 3
#>    name                       job                           phone_number
#>    <chr>                      <chr>                         <chr>
#>  1 Dr. Rube Jenkins           Garment/textile technologist  03032473162
#>  2 Nicolas Rohan II           Community arts worker         832.223.2113x…
#>  3 Lisette Kunde              Chemical engineer             894-412-4188x…
#>  4 Tanika Bayer               Teacher, music                317-851-3598x…
#>  5 Wesley Paucek              Housing manager/officer       049.669.5051
#>  6 Casey Walter               Immunologist                  00838982753
#>  7 Nakisha DuBuque-Runolfsson Solicitor, Scotland           546-479-6195x…
#>  8 Ray Nolan                  Broadcast engineer            046-837-3872x…
#>  9 Cindy Rosenbaum-Zboncak    Development worker, community (342)135-9921
#> 10 Dr. Reino Romaguera        Best boy                      720-227-6908
ch_generate('job', 'phone_number', n = 30)
#> # A tibble: 30 x 2
#>    job                                      phone_number
#>    <chr>                                    <chr>
#>  1 Secretary, company                       (176)766-6539x98163
#>  2 Accommodation manager                    756-275-2881x5444
#>  3 Technical author                         1-238-563-1014x9137
#>  4 Diagnostic radiographer                  +25(9)2759353471
#>  5 Trade mark attorney                      632-364-4032x3837
#>  6 Fisheries officer                        03784472866
#>  7 Public affairs consultant                642-477-1823x9911
#>  8 Geographical information systems officer 590-683-8806x87386
#>  9 Medical laboratory scientific officer    747.240.4935
#> 10 Physicist, medical                       389.193.5038
#> # ... with 20 more rows
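Because ch_generate() returns an ordinary data frame (a tibble in the output above), it can be extended with base R. The sketch below adds two derived columns; the column names id and has_extension are invented for this example, and the pattern for extensions is an assumption based on the "x1234" style visible in the output above.

```r
library("charlatan")

people <- ch_generate("job", "phone_number", n = 30)

# Add a simple row identifier
people$id <- seq_len(nrow(people))

# Flag phone numbers that end in an extension such as "x5444"
people$has_extension <- grepl("x[0-9]+$", people$phone_number)

head(people)
table(people$has_extension)
```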
ch_name()
#> [1] "Miss Fiona Hettinger DVM"
ch_name(10)
#> [1] "Ronal McDermott" "Eric Hand"
#> [3] "Mr. Kasey Roberts Jr." "Mrs. Olivine Osinski"
#> [5] "Delma Dickinson DVM" "Milo Glover"
#> [7] "Abdiel Yost" "Latanya King"
#> [9] "Sabastian Zboncak" "Dr. Dora Hammes DVM"
ch_phone_number()
#> [1] "624.842.8023x0490"
ch_phone_number(10)
#> [1] "+48(4)9316992700" "+09(0)7272220687" "1-770-726-3350"
#> [4] "719.617.7928" "+25(1)4441957520" "1-601-264-1417"
#> [7] "192-090-5794x8002" "00622724252" "398-551-3270x449"
#> [10] "(210)784-5702x647"
ch_job()
#> [1] "Museum/gallery curator"
ch_job(10)
#> [1] "Nature conservation officer" "Engineer, mining"
#> [3] "Risk analyst" "Designer, furniture"
#> [5] "Education officer, museum" "Higher education lecturer"
#> [7] "Music tutor" "Research scientist (maths)"
#> [9] "Lexicographer" "Occupational hygienist"
ch_credit_card_provider()
#> [1] "Maestro"
ch_credit_card_provider(n = 4)
#> [1] "Mastercard" "Voyager" "VISA 16 digit"
#> [4] "American Express"
ch_credit_card_number()
#> [1] "3748464464469461"
ch_credit_card_number(n = 10)
#> [1] "4800004198534897" "54958287613799373" "676231076965784"
#> [4] "3337013655555534144" "4407106818436689" "3528609408271169955"
#> [7] "4108643086250" "3158593336043226738" "6011140503841684099"
#> [10] "4424599784454216"
ch_credit_card_security_code()
#> [1] "188"
ch_credit_card_security_code(10)
#> [1] "368" "7546" "8950" "422" "320" "387" "250" "804" "454" "712"
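The three credit card functions can be combined into one table of fake card records. This is a rough sketch using only the calls shown above; the column names are made up, and because each column is drawn independently, the number and security code in a row will not necessarily match the format of the provider in that row.

```r
library("charlatan")

n <- 5

# Each column is generated independently of the others
cards <- data.frame(
  provider      = ch_credit_card_provider(n = n),
  number        = ch_credit_card_number(n = n),
  security_code = ch_credit_card_security_code(n),
  stringsAsFactors = FALSE
)

cards

# Card numbers vary in length, as in the output above
nchar(cards$number)
```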
Real data is messy, right? charlatan makes it easy to create messy data. This feature is still in its early stages, so it is not yet available across most data types and languages, but we're working on it.
For example, create messy names:
ch_name(50, messy = TRUE)
#> [1] "Krystal Wilderman-Crist" "Mr Nevin Stehr"
#> [3] "Harris Kris" "Ilona Bergstrom"
#> [5] "Mr Tanner Sipes IV" "Arther Hegmann"
#> [7] "Sherilyn Lubowitz" "Dr Bee Harvey md"
#> [9] "Agusta Runte-Dickinson" "Alphonso Koepp"
#> [11] "Burr Gibson" "Soloman Murray"
#> [13] "Jerold Hamill" "Pattie Gorczany m.d."
#> [15] "Dr Vessie Herman" "Irvine Bradtke-Hackett"
#> [17] "Mr Tristen Schimmel" "Jack Carter"
#> [19] "Hurley Bauch" "Efrain Kirlin-Gleichner"
#> [21] "Ethan Boehm I" "Willaim Stokes"
#> [23] "Bev Kihn-Murazik" "Mauricio Rippin"
#> [25] "Dr Tilden Littel Jr" "Abie Erdman"
#> [27] "Austin Kuhlman" "Byron Hills"
#> [29] "Bernita Reichert Ph.D." "Kristi Hickle"
#> [31] "Leslee Bartell DVM" "Nia Connelly"
#> [33] "Mrs. Velda Dickens md" "Susanna VonRueden"
#> [35] "Garrick Langosh" "Davon Gerlach"
#> [37] "Dr Barbra Reynolds DVM" "Dr. Lupe Mitchell md"
#> [39] "Wilson Carter II" "Omari Kuvalis"
#> [41] "Bernice Bergnaum" "Raegan Braun-Lindgren"
#> [43] "Ardis Walter" "Miss Luciana Lynch DVM"
#> [45] "Emmitt Yundt" "Johnson Funk"
#> [47] "Ebert Spencer IV" "Earnest Cummerata"
#> [49] "Moody Koch Jr." "Lex Lehner Sr"
Right now, only prefixes and suffixes for names in the en_US locale are supported. Notice the variation in prefixes and suffixes above, for example "Dr" versus "Dr." and "md" versus "m.d.".
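One way to see the effect of messy = TRUE is to count how often a prefix appears without its trailing period in a messy sample versus a default one. The helper no_period() below is hypothetical and uses a deliberately crude pattern; the counts depend on the random draw, so none are shown.

```r
library("charlatan")

clean <- ch_name(50)
messy <- ch_name(50, messy = TRUE)

# TRUE when a name starts with a prefix written without a period,
# e.g. "Dr Bee Harvey" rather than "Dr. Bee Harvey"
# ("Miss" is left out because it never takes a period)
no_period <- function(x) grepl("^(Mr|Mrs|Ms|Dr) ", x)

sum(no_period(clean))
sum(no_period(messy))

# Inspect the messy names that matched
messy[no_period(messy)]
```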