charlatan
makes fake data. It is inspired by, and borrows some code from, Python’s faker.
Why would you want to make fake data? Here are some possible use cases to give you a sense of what you can do with this package:
The package provides the following interfaces:
- R6 objects that a user can initialize and then call methods on. These low level interfaces contain all the logic that the interfaces below use.
- ch_*() functions that wrap the low level interfaces. They are meant to be easier to use and provide an easy way to make many instances of a thing.
- ch_generate() - generates a data.frame with fake data, choosing which columns to include from the data types provided in charlatan.
- fraudster() - a single interface to all fake data methods. It returns vectors/lists of data and wraps the ch_*() functions described above.

Stable version from CRAN
install.packages("charlatan")
Development version from GitHub
devtools::install_github("ropensci/charlatan")
library("charlatan")
… for all fake data operations
x <- fraudster()
x$job()
#> [1] "Banker"
x$name()
#> [1] "Jo Dicki"
x$job()
#> [1] "Armed forces technical officer"
x$color_name()
#> [1] "SeaGreen"
More locales are being added over time. For example, locale support for job data:
ch_job(locale = "en_US", n = 3)
#> [1] "Architect" "Psychologist, forensic"
#> [3] "Television camera operator"
ch_job(locale = "fr_FR", n = 3)
#> [1] "Ingénieur"
#> [2] "Responsable de la collecte des déchets ménagers"
#> [3] "Fleuriste"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Osoba stručno osposobljena za uzgoj riba i drugih morskih organizama"
#> [2] "Prvostupnik medicinsko- laboratorijske dijagnostike"
#> [3] "Viši knjižničar"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Письменник" "Швачка" "Біолог"
ch_job(locale = "zh_TW", n = 3)
#> [1] "領隊" "外務/快遞/送貨" "砌磚工"
For colors:
ch_color_name(locale = "en_US", n = 3)
#> [1] "YellowGreen" "Chartreuse" "NavajoWhite"
ch_color_name(locale = "uk_UA", n = 3)
#> [1] "Малахітовий" "Жовто-персиковий" "Яскраво-зелений"
More coming soon …
ch_generate()
#> # A tibble: 10 x 3
#>    name              job
#>    <chr>             <chr>
#>  1 Joette Keeling    Development worker, community
#>  2 Wirt Rempel       Purchasing manager
#>  3 Cherilyn Terry    Higher education lecturer
#>  4 Garfield Torphy   Psychologist, prison and probation services
#>  5 Sharif Stehr      Recycling officer
#>  6 Alver Mraz        Health and safety adviser
#>  7 Cleva Thiel       Engineer, biomedical
#>  8 Red Skiles        Nurse, learning disability
#>  9 Alfreda Jacobi    Hotel manager
#> 10 Mrs. Karolyn Bode Data processing manager
#> # ... with 1 more variables: phone_number <chr>
ch_generate('job', 'phone_number', n = 30)
#> # A tibble: 30 x 2
#>    job                         phone_number
#>    <chr>                       <chr>
#>  1 Engineer, structural        230-418-4340x5974
#>  2 Furniture designer          (534)138-4659x98182
#>  3 Nature conservation officer 1-341-021-7161x8149
#>  4 Tax adviser                 (347)710-0498x04182
#>  5 Leisure centre manager      059.367.5785
#>  6 Scientist, biomedical       1-026-641-9920x73432
#>  7 Education administrator     +15(7)3523656979
#>  8 Naval architect             603-410-1279x66743
#>  9 Psychiatrist                664-458-5161
#> 10 Accommodation manager       1-438-457-6413x355
#> # ... with 20 more rows
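Because the generated values are random, the printed output above will differ from run to run. A minimal sketch of how one might check the shape of the result rather than its contents, assuming charlatan is installed and that ch_generate() returns the requested columns under the same names (as the output above suggests):

```r
library("charlatan")

# Request two columns and five rows. The values are random,
# so only the structure of the result is checked here.
df <- ch_generate("name", "job", n = 5)

stopifnot(
  is.data.frame(df),                    # a tibble is also a data.frame
  nrow(df) == 5,                        # one row per requested record
  all(c("name", "job") %in% names(df))  # assumed: columns keep the requested names
)
```

Checks like these can be useful when fake data feeds a test suite, where the exact names and jobs do not matter but the dimensions do.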
ch_name()
#> [1] "Cristina Heathcote-Hoeger"
ch_name(10)
#> [1] "Harper Cassin" "Arizona Feest" "Danyel Harber"
#> [4] "Dr. Evelin Jones" "Ms. Ally Reinger" "Destiney Grant"
#> [7] "Adan Macejkovic" "Nasir Abbott-Shields" "Cortney Thompson"
#> [10] "Zachariah Littel"
ch_phone_number()
#> [1] "495-062-1294x75034"
ch_phone_number(10)
#> [1] "482-502-9430x065" "160.030.6101" "895.692.4964x96989"
#> [4] "(857)282-4600" "03506676221" "1-045-259-2363x38019"
#> [7] "(864)729-1156x3038" "580-663-7118x052" "(629)372-6175x5956"
#> [10] "687.301.8353x530"
ch_job()
#> [1] "Exhibition designer"
ch_job(10)
#> [1] "Aeronautical engineer" "Engineer, civil (contracting)"
#> [3] "Health promotion specialist" "Newspaper journalist"
#> [5] "Science writer" "Conference centre manager"
#> [7] "Special effects artist" "Interpreter"
#> [9] "Location manager" "Podiatrist"
ch_credit_card_provider()
#> [1] "Mastercard"
ch_credit_card_provider(n = 4)
#> [1] "JCB 15 digit" "VISA 16 digit" "VISA 16 digit" "VISA 16 digit"
ch_credit_card_number()
#> [1] "180014495963802646"
ch_credit_card_number(n = 10)
#> [1] "4341333219296046" "3019347410625616" "676210961178778"
#> [4] "4101721569265605" "869976400881964937" "3337693345342041531"
#> [7] "675940381801501" "6011119366631562934" "3528100212370176513"
#> [10] "180069364073399725"
ch_credit_card_security_code()
#> [1] "064"
ch_credit_card_security_code(10)
#> [1] "181" "6582" "358" "021" "322" "538" "117" "013" "322" "4657"
Real data is messy, right? charlatan makes it easy to create messy data. This feature is still in its early stages, so it is not yet available for most data types and languages, but we’re working on it.
For example, create messy names:
ch_name(50, messy = TRUE)
#> [1] "Dr Sim Hodkiewicz DVM" "Clella Hills md"
#> [3] "Elam Dietrich DDS" "Miss Nedra Mann"
#> [5] "Candido Green" "Mrs Almedia Marquardt md"
#> [7] "Elenor Hyatt" "Jonnie Moore"
#> [9] "Fleda Anderson" "Lazaro Waelchi-Hackett"
#> [11] "Dr. Leslie Davis Sr." "Danny Ledner"
#> [13] "Blair Lindgren" "Hebert Hoeger DVM"
#> [15] "Mr Tollie Senger" "Audie Hamill-Hettinger"
#> [17] "Joanne Ziemann" "Mr Aden Moore"
#> [19] "Mr Boyce Champlin PhD" "Dallas Langosh"
#> [21] "Maynard Brown" "Mellisa Casper"
#> [23] "Luka Brekke" "Wilhelm Hills d.d.s."
#> [25] "Karma Hane" "Leonore Stehr"
#> [27] "Ms Margarita Rodriguez" "Dr. Melissia Simonis"
#> [29] "Darryl Olson Jr" "Jannette Krajcik"
#> [31] "Shante Becker PhD" "Randolph Borer"
#> [33] "Hermon Willms" "Belinda Rippin"
#> [35] "Arjun Pfannerstill" "Djuana Schamberger"
#> [37] "Miss Osa Terry" "Edie Conroy"
#> [39] "Greta Muller DDS" "Ms. Debora Sporer"
#> [41] "Aidyn Mayert" "Dr Alexa Russel"
#> [43] "Enos Eichmann" "Lular Bechtelar PhD"
#> [45] "Bess Hamill-Gleason" "Erin Wilderman"
#> [47] "Ambrose Rice" "Dr Mayo Hoeger"
#> [49] "Tabetha Schamberger" "Lanette Rodriguez"
Right now only prefixes and suffixes for names in the en_US locale are supported. Notice the variation in prefixes and suffixes above, for example "Dr" versus "Dr." and "DDS" versus "d.d.s.".