| Type: | Package | 
| Title: | Probability and Bayesian Modeling | 
| Version: | 1.1 | 
| Author: | Jim Albert <albert@bgsu.edu> | 
| Maintainer: | Jim Albert <albert@bgsu.edu> | 
| Depends: | LearnBayes, ggplot2, gridExtra, shiny | 
| Suggests: | knitr, rmarkdown | 
| URL: | https://github.com/bayesball/ProbBayes | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| Packaged: | 2020-02-27 13:44:56 UTC; jamesalbert | 
| Description: | Functions and datasets to accompany J. Albert and J. Hu, "Probability and Bayesian Modeling", CRC Press, (2019, ISBN: 1138492566). | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| NeedsCompilation: | no | 
| Repository: | CRAN | 
| Date/Publication: | 2020-03-06 09:40:07 UTC | 
Trend Estimates of Bird Populations
Description
Trend Estimates for 28 Grassland Bird Species
Usage
  BBS_survey
Format
A data frame with 28 observations on the following 4 variables.
- Species_Name
- name of bird species 
- Trend
- trend estimate 
- SE
- standard error of estimate 
- N_Site
- number of observations at site 
Source
North American Breeding Bird Survey
Expenditures of U.S. Households
Description
Expenditures of U.S. Households
Usage
  CEsample
Format
A data frame with 1000 observations on the following 3 variables.
- UrbanRural
- urban/rural status of the consumer unit (CU): 1 = urban, 2 = rural 
- TotalIncomeLastYear
- amount of CU income before taxes in the last 12 months 
- TotalExpLastQ
- CU's total expenditure in the last quarter 
Source
U.S. Bureau of Labor Statistics
Shiny App to Choose a Beta Curve
Description
Interactively choose beta curve by selecting the .5 and .9 quantiles
Usage
  ChooseBeta()
Value
None
Author(s)
Jim Albert
Personal Computer Data
Description
Variables on a sample of personal computers
Usage
  ComputerPriceSample
Format
A data frame with 500 observations on the following 5 variables.
- Price
- sales price 
- Speed
- clock speed in MHz 
- HardDrive
- size of hard drive in MB 
- Ram
- size of RAM in MB 
- Premium
- premium status of manufacturer 
Source
Unknown
Personality and Volunteering
Description
Data from study to learn about personality determinants of volunteering
Usage
  Cowles
Format
A data frame with 1421 observations on the following 5 variables.
- subject
- subject number 
- neuroticism
- measurement of neuroticism 
- extraversion
- measurement of extraversion 
- sex
- male or female 
- volunteer
- no or yes 
Source
Unknown.
Risk-adjusted mortality outcomes for all NYC hospitals
Description
Reported deaths from heart attack for hospitals in New York City
Usage
  DeathHeartAttackDataNYCfull
Format
A data frame with 45 observations on the following 5 variables.
- Hospital
- name of hospital 
- Borough
- borough in New York City 
- Type
- type of hospital 
- Cases
- number of heart attack cases 
- Deaths
- number of deaths 
Source
New York State Department of Health
Risk-adjusted mortality outcomes for Manhattan hospitals
Description
Reported deaths from heart attack for hospitals in Manhattan in New York City
Usage
  DeathHeartAttackManhattan
Format
A data frame with 13 observations on the following 4 variables.
- Hospital
- name of hospital 
- Type
- type of hospital 
- Cases
- number of heart attack cases 
- Deaths
- number of deaths 
Source
New York State Department of Health
Graduate School Admission
Description
Study to see what variables are helpful in determining admission to Graduate School
Usage
  GradSchoolAdmission
Format
A data frame with 400 observations on the following 3 variables.
- Admission
- student was admitted (1) or not admitted (0) 
- GRE
- GRE score 
- GPA
- grade point average 
Source
Unknown.
Homework Hours for Five Schools
Description
Weekly hours spent on homework for students from five schools
Usage
  HWhours5schools
Format
A data frame with 116 observations on the following 2 variables.
- school
- school number of student 
- hours
- weekly hours spent on homework 
Source
Unknown.
Frequency of use of "can" in Federalist Papers
Description
Frequency of use of "can" in Federalist Papers written by Alexander Hamilton
Usage
  Hamilton_can
Format
A data frame with 49 observations on the following 6 variables.
- Name
- name of Federalist paper 
- Total
- total number of words 
- word
- word that is counted 
- N
- frequency of the word 
- Rate
- fraction of words with that word 
- Authorship
- author of paper 
Source
http://www.gutenberg.org/ebooks/18
JAGS Script for Common Models
Description
Model script for JAGS to fit a particular Bayesian model. Currently the possible models are "beta_binomial", "hier_normal", "hier_trajectory", "normal", "regression", "regression_cond_means", and "trajectory".
Usage
  JAGS_script(model)
Arguments
| model | name of the model | 
Value
A character string containing the model script
Korean Drama Ratings
Description
Ratings of Korean dramas broadcast on different days of the week and by different producers
Usage
  KDramaData
Format
A data frame with 101 observations on the following 5 variables.
- Drama
- name of drama 
- Schedule
- indicator of what day the drama was broadcast 
- Producer
- indicator of the producer of the drama 
- Rating
- rating of the drama 
- Date
- date of rating 
Source
AGB Nielsen Media Research Group
U.S. Women Labor Participation
Description
U.S. women labor participation and family income
Usage
  LaborParticipation
Format
A data frame with 753 observations on the following 2 variables.
- Participation
- labor participation of the wife 
- FamilyIncome
- family income exclusive of wife's income in $1000 
Source
University of Michigan Panel Study of Income Dynamics
Frequency of use of "can" in Federalist Papers
Description
Frequency of use of "can" in Federalist Papers written by James Madison
Usage
  Madison_can
Format
A data frame with 49 observations on the following 6 variables.
- Name
- name of Federalist paper 
- Total
- total number of words 
- word
- word that is counted 
- N
- frequency of the word 
- Rate
- fraction of words with that word 
- Authorship
- author of paper 
Source
http://www.gutenberg.org/ebooks/18
Professor Salary Study
Description
Study of factors that affect a professor's salary
Usage
  ProfessorSalary
Format
A data frame with 397 observations on the following 7 variables.
- subject
- subject id 
- rank
- professor rank 
- discipline
- A is theoretical and B is applied 
- yrs.since.phd
- number of years since receipt of doctorate 
- yrs.service
- number of years of service 
- sex
- Female or Male 
- salary
- nine-month salary in dollars 
Source
Unknown.
Scores on Achievement Exam
Description
Scores on a 20-question T/F exam
Usage
  ScoreData
Format
A data frame with 30 observations on the following 2 variables.
- Person
- subject id 
- Score
- number correct in 20-question exam 
Source
Data randomly generated.
Movie Ratings
Description
Ratings for a set of 2010 animation movies
Usage
  animation_ratings
Format
A data frame with 55 observations on the following 6 variables.
- userId
- user ID 
- movieId
- movie ID 
- rating
- numerical rating 
- timestamp
- time when the rating was recorded 
- title
- name of the movie 
- Group_Number
- numerical ID of movie 
Source
MovieLens by GroupLens Research
Arm span and height measurements
Description
Arm span and height measurements for a sample of students
Usage
  arm_height
Format
A data frame with 20 observations on the following 2 variables.
- arm
- length of arm span in cm 
- height
- height in cm 
Source
Sample of college students
Bar plot of numeric or character data
Description
Constructs frequency bar plot of a vector of numeric data or a vector of character data
Usage
  bar_plot(y, ...)
Arguments
| y | vector of outcomes | 
| ... | title of the graph | 
Value
A ggplot2 object containing the bar graph.
Author(s)
Jim Albert
Examples
  s <- spinner_data(c(1, 2, 2, 1), nsim=100)
  bar_plot(s, "Spinner Data")
  y <- c(rep("a", 10), rep("b", 5),
         rep("c", 8), rep("d", 4))
  bar_plot(y)
Batting Statistics for 2018 Season
Description
Batting statistics collected for all players during the first month and the remainder of the 2018 baseball season
Usage
  batting_2018
Format
A data frame with 549 observations on the following 5 variables.
- Name
- name of player 
- AB.x
- number of at bats in first month 
- H.x
- number of hits in first month 
- AB.y
- number of at bats in remainder of season 
- H.y
- number of hits in remainder of season 
Source
Data collected from Retrosheet.org.
Computes Posterior Probabilities for Discrete Models
Description
Given a data table with columns Prior and Likelihood, computes posterior probabilities
Usage
  bayesian_crank(d)
Arguments
| d | data frame with columns Prior and Likelihood | 
Value
data frame with new columns Product and Posterior
Author(s)
Jim Albert
Examples
  df <- data.frame(p=c(.1, .3, .5, .7, .9),
                   Prior=rep(1/5, 5))
  y <- 5
  n <- 10
  df$Likelihood <- dbinom(y, prob=df$p, size=n)
  df <- bayesian_crank(df)
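The calculation behind the two new columns can be sketched in base R, assuming the posterior is the normalized product of prior and likelihood (Bayes' rule on a discrete grid). The helper name `crank_sketch` is hypothetical and is not part of the package.

```r
# Hypothetical helper: an assumption about the calculation,
# not the package source.
crank_sketch <- function(d) {
  d$Product <- d$Prior * d$Likelihood        # unnormalized posterior
  d$Posterior <- d$Product / sum(d$Product)  # rescale so it sums to 1
  d
}

df <- data.frame(p = c(.1, .3, .5, .7, .9),
                 Prior = rep(1/5, 5))
df$Likelihood <- dbinom(5, size = 10, prob = df$p)
out <- crank_sketch(df)
```

With 5 successes in 10 trials and a flat prior, the largest posterior probability falls on p = 0.5.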
Displays Areas Under a Beta Curve
Description
Computes and Displays Areas Under a Beta Curve
Usage
  beta_area(lo, hi, shape_par, Color = "orange")
Arguments
| lo | lower bound of interval | 
| hi | upper bound of interval | 
| shape_par | vector of shape parameters of the beta curve | 
| Color | color of shading in the graph | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  lo <- .2
  hi <- .4
  shape_par <- c(2, 5)
  beta_area(lo, hi, shape_par)
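The shaded area is a difference of two beta CDF values, so it can be checked without the plot. A minimal sketch using base R's `pbeta()` with the same inputs as the example:

```r
lo <- 0.2
hi <- 0.4
shape_par <- c(2, 5)

# P(lo < X < hi) for X ~ Beta(shape_par[1], shape_par[2])
area <- pbeta(hi, shape_par[1], shape_par[2]) -
  pbeta(lo, shape_par[1], shape_par[2])

# Independent check by numerical integration of the density
check <- integrate(dbeta, lo, hi,
                   shape1 = shape_par[1], shape2 = shape_par[2])$value
```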
Simulate random data from a beta curve
Description
Simulate random data from a beta curve
Usage
  beta_data(shape_par, nsim=1000)
Arguments
| shape_par | vector of shape parameters of the beta curve | 
| nsim | number of simulations | 
Value
A vector of random draws from the beta distribution
Author(s)
Jim Albert
Examples
  shape_par <- c(12, 8)
  beta_data(shape_par, 10)
Draw a Beta Curve
Description
Draw a Beta Curve
Usage
  beta_draw(shape_pars)
Arguments
| shape_pars | vector of shape parameters of the beta curve | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  shape_pars <- c(2, 5)
  beta_draw(shape_pars)
Probability Interval for a Beta Curve
Description
Computes Probability Interval for a Beta Curve
Usage
  beta_interval(prob, shape_par, Color = "orange")
Arguments
| prob | value of coverage probability | 
| shape_par | vector of shape parameters of the beta curve | 
| Color | color of shading in the graph | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  shape_par <- c(2, 5)
  beta_interval(.5, shape_par)
Plot of Two Beta Curves
Description
Plot of Prior and Posterior Beta Curves
Usage
  beta_prior_post(prior_shapes, post_shapes)
Arguments
| prior_shapes | vector of shape parameters of the beta prior | 
| post_shapes | vector of shape parameters of the beta posterior | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
 prior_shapes <- c(4, 6)
 post_shapes <- c(19, 16)
 beta_prior_post(prior_shapes, post_shapes)
Displays a Quantile of a Beta Curve
Description
Displays a Quantile of a Beta Curve
Usage
  beta_quantile(prob, shape_par, Color = "orange")
Arguments
| prob | probability value of interest | 
| shape_par | vector of shape parameters of the beta curve | 
| Color | color of shading in the graph | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  # find the .50 quantile (the median)
  prob <- 0.5
  shape_par <- c(2, 5)
  beta_quantile(prob, shape_par)
  # find the .90 quantile (90th percentile)
  prob <- 0.9
  beta_quantile(prob, shape_par)
Text Statistics for Books
Description
Text statistics for a collection of books sold at Amazon.com
Usage
  book_stats
Format
A data frame with 21 observations on the following 3 variables.
- Book
- name of book 
- Complex.Words
- percentage of words in the book with three or more syllables 
- Fog.Index
- number of years of formal education required to read and understand a passage of text 
Source
Data collected from Amazon.com website.
Buffalo snowfall data
Description
Total snowfall in inches for 20 Januarys in Buffalo, New York
Usage
  buffalo_jan
Format
A data frame with 20 observations on the following 2 variables.
- SEASON
- Season 
- JAN
- inches of total snowfall 
Source
National Weather Service, www.weather.gov
Career Trajectory Data for Baseball Players
Description
Season on-base statistics for a collection of MLB players who were born in 1978
Usage
  career_1978
Format
A data frame with 399 observations on the following 6 variables.
- nameLast
- last name of player 
- Player
- id of player 
- Age
- age of player 
- AgeD
- deviation of age from 30 
- PA
- number of plate appearances 
- OB
- number of on-base events 
Source
Data collected from Lahman database.
Centers title in a ggplot2 graphic
Description
Centers and increases font size of a ggplot2 graphic title
Usage
centertitle(Color = "blue")
Arguments
| Color | color of the text in the ggplot2 title | 
Value
ggplot2 theme code to center the title
Author(s)
Jim Albert
Examples
df <- data.frame(p=c(.1, .3, .5, .7, .9),
                 Prior=rep(1/5, 5))
ggplot(df, aes(p, Prior)) +
  geom_point() +
  ggtitle("My Prior") +
  centertitle()
Plot of Distribution of Two Proportions
Description
Constructs a graph of the probability distribution of two proportions
Usage
  draw_two_p(prob_matrix, ...)
Arguments
| prob_matrix | matrix of probabilities of two proportions with the rows and columns labeled by the values | 
| ... | other arguments such as the title of the plot | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  prob_matrix <- testing_prior()
  draw_two_p(prob_matrix, title="Testing Prior")
Hypergeometric sampling density
Description
Hypergeometric sampling density
Usage
  dsampling(sample_b, pop_N, pop_B, sample_n)
Arguments
| sample_b | number of black balls in sample | 
| pop_N | number of balls in population | 
| pop_B | number of black balls in population | 
| sample_n | number of balls in sample | 
Value
Value of hypergeometric sampling probability
Author(s)
Jim Albert
Examples
  pop_N <- 10
  pop_B <- 4
  sample_n <- 3
  sample_b <- 2
  dsampling(sample_b, pop_N, pop_B, sample_n)
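The hypergeometric probability can be written directly with binomial coefficients and compared with base R's `dhyper()`, which uses a different argument layout (black balls in sample, black balls in population, non-black balls in population, sample size). The name `hyper_sketch` is a hypothetical stand-in, not the package function.

```r
# Hypothetical re-derivation of the hypergeometric probability
hyper_sketch <- function(sample_b, pop_N, pop_B, sample_n) {
  choose(pop_B, sample_b) *
    choose(pop_N - pop_B, sample_n - sample_b) /
    choose(pop_N, sample_n)
}

prob <- hyper_sketch(sample_b = 2, pop_N = 10, pop_B = 4, sample_n = 3)

# Same quantity via stats::dhyper(x, m, n, k)
prob_dhyper <- dhyper(2, m = 4, n = 10 - 4, k = 3)
```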
Computes likelihoods for spinner outcomes
Description
Computes likelihoods for spinner outcomes
Usage
  dspinner(x, Prob)
Arguments
| x | vector of spinner observations | 
| Prob | matrix of spinner probabilities where each row corresponds to a different spinner | 
Value
column vector consisting of the likelihoods for the different spinners
Author(s)
Jim Albert
Examples
  Prob <- matrix(c(.25, .25, .25, .25,
                   .50, .125, .125, .25,
                   .25, .5, .25, 0), 3, 4, byrow=TRUE)
  x <- c(1, 2, 1, 3, 4)
  dspinner(x, Prob)
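A minimal sketch of the likelihood calculation, assuming the spins are independent so that each spinner's likelihood is the product of its probabilities for the observed spins. The helper `spinner_lik_sketch` is hypothetical and returns a plain vector rather than a column vector.

```r
# Hypothetical helper: one likelihood per row (spinner) of Prob
spinner_lik_sketch <- function(x, Prob) {
  apply(Prob, 1, function(row) prod(row[x]))
}

Prob <- matrix(c(.25, .25, .25, .25,
                 .25, .50, .25, 0), 2, 4, byrow = TRUE)
x <- c(1, 2, 1, 3, 4)
lik <- spinner_lik_sketch(x, Prob)
```

The second spinner cannot produce a 4, so its likelihood for these spins is zero.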
Electricity Bills
Description
Electricity bills collected for all months for five years
Usage
  electricbills
Format
A data frame with 62 observations on the following 3 variables.
- Year
- year 
- Month
- number of month 
- Amount
- electricity bill in dollars 
Source
Data collected for one household in Ohio
Frequency of use of words in Federalist Papers
Description
Frequency of use of words in Federalist Papers written by either Alexander Hamilton or James Madison
Usage
  federalist_word_study
Format
A data frame with 56853 observations on the following 7 variables.
- Name
- name of Federalist paper 
- Total
- total number of words 
- word
- word that is counted 
- N
- frequency of the word 
- Rate
- fraction of words with that word 
- Authorship
- author of paper 
- Disputed
- is authorship disputed? 
Source
http://www.gutenberg.org/ebooks/18
Times to Serve for Roger Federer
Description
Measurements of time to serve for 20 serves of the tennis player Roger Federer
Usage
  federer_time_to_serve
Format
A data frame with 20 observations on the following one variable.
- time
- time to serve in seconds 
Source
https://github.com/JeffSackmann
Fire Calls for Zip Code Areas
Description
The number of fire calls and building fires for ten zip codes in Montgomery County, Pennsylvania
Usage
  fire_calls
Format
A data frame with 10 observations on the following 3 variables.
- Zip_Code
- zip code 
- Fire_Calls
- number of fire calls 
- Building_Fires
- number of building fires 
Source
kaggle.com
Football Field Goals Dataset
Description
Field goal attempt data for three seasons of professional football
Usage
  football_field_goals
Format
A data frame with 3025 observations on the following 5 variables.
- Team
- name of team 
- Year
- football season 
- Kicker
- last name of kicker 
- Distance
- distance in feet of attempt 
- Success
- attempt was successful (1) or not (0) 
Source
Data collected by Michael Lopez.
Gas bill data
Description
Measurements of average temperature and natural gas bill for each month in 2017
Usage
  gas2017
Format
A data frame with 12 observations on the following 3 variables.
- Month
- abbreviation of month 
- Temp
- average temperature 
- Bill
- natural gas bill in dollars 
Source
Personal data collected by a homeowner in Ohio
Gibbs sampling of the beta-binomial distribution
Description
Implements Gibbs sampling of the beta-binomial distribution
Usage
  gibbs_betabin(n, a, b, p = 0.5, iter = 1000)
Arguments
| n | binomial sample size | 
| a | first beta shape parameter | 
| b | second beta shape parameter | 
| p | starting value of proportion in algorithm | 
| iter | number of iterations | 
Value
matrix of simulated draws from the algorithm
Author(s)
Jim Albert
Examples
# name iter explicitly; the fourth positional argument is p
sp <- gibbs_betabin(20, 5, 5, iter = 100)
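One way such a sampler can be written, assuming the standard beta-binomial conditionals (y given p is Binomial(n, p); p given y is Beta(y + a, n - y + b)). The function `gibbs_betabin_sketch` and its column names are hypothetical; the package's internals may differ.

```r
# Hypothetical Gibbs sampler alternating between the two conditionals
gibbs_betabin_sketch <- function(n, a, b, p = 0.5, iter = 1000) {
  draws <- matrix(0, iter, 2, dimnames = list(NULL, c("y", "p")))
  for (k in seq_len(iter)) {
    y <- rbinom(1, size = n, prob = p)   # y | p ~ Binomial(n, p)
    p <- rbeta(1, y + a, n - y + b)      # p | y ~ Beta(y + a, n - y + b)
    draws[k, ] <- c(y, p)
  }
  draws
}

set.seed(1)
sp <- gibbs_betabin_sketch(20, 5, 5, iter = 100)
```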
Gibbs sampling of a bivariate discrete distribution
Description
Implements Gibbs sampling for an arbitrary bivariate discrete distribution
Usage
  gibbs_discrete(p, i = 1, iter = 1000)
Arguments
| p | matrix defining the probability distribution | 
| i | starting row of the matrix | 
| iter | number of cycles of algorithm | 
Value
matrix of simulated draws from algorithm
Author(s)
Jim Albert
Examples
p <- matrix(c(4, 3, 2, 1,
              3, 4, 3, 2,
              2, 3, 4, 3,
              1, 2, 3, 4) / 40, 4, 4, byrow = TRUE)
out <- gibbs_discrete(p, 1, 100)
Gibbs sampling of the normal sampling posterior
Description
Implements Gibbs sampling for normal sampling with independent priors on the mean and precision
Usage
  gibbs_normal(s, P = 0.002, iter = 1000)
Arguments
| s | a list with components y, the observed data, mu0, the prior mean of mu, sigma0, the prior standard deviation of mu, a, the shape parameter of the gamma prior on P, b, the rate parameter of the gamma prior on P | 
| P | starting value of the precision parameter | 
| iter | number of iterations | 
Value
matrix of simulated draws of (mu, P) from the algorithm
Author(s)
Jim Albert
Examples
s <- list(y = rnorm(20, 5, 2),
  mu0 = 10, sigma0 = 3, a = 1, b = 1)
out <- gibbs_normal(s, P = 0.01, iter=100)
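The two conditional draws can be sketched as follows, assuming normal data with mean mu and precision P, a normal(mu0, sigma0) prior on mu, and a gamma(a, b) prior on P with b a rate. The function `gibbs_normal_sketch` is a hypothetical illustration, not the package source.

```r
# Hypothetical Gibbs sampler for (mu, P) under the stated assumptions
gibbs_normal_sketch <- function(s, P = 0.002, iter = 1000) {
  n <- length(s$y)
  prior_prec <- 1 / s$sigma0^2
  draws <- matrix(0, iter, 2, dimnames = list(NULL, c("mu", "P")))
  for (k in seq_len(iter)) {
    # mu | P, y is normal with precision-weighted mean
    post_prec <- prior_prec + n * P
    post_mean <- (prior_prec * s$mu0 + P * sum(s$y)) / post_prec
    mu <- rnorm(1, post_mean, 1 / sqrt(post_prec))
    # P | mu, y is gamma with updated shape and rate
    P <- rgamma(1, shape = s$a + n / 2,
                rate = s$b + sum((s$y - mu)^2) / 2)
    draws[k, ] <- c(mu, P)
  }
  draws
}

set.seed(1)
s <- list(y = rnorm(20, 5, 2),
          mu0 = 10, sigma0 = 3, a = 1, b = 1)
out <- gibbs_normal_sketch(s, P = 0.01, iter = 100)
```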
House price data
Description
Measurements of house size and selling price for a collection of homes in a city in Ohio
Usage
  house_prices
Format
A data frame with 24 observations on the following 2 variables.
- price
- selling price in $1000 
- size
- square footage of house 
Source
Zillow.com
Increases font size of text
Description
Increases font size on all text in a ggplot2 graphic
Usage
  increasefont(Size = 18)
Arguments
| Size | font size of all textual elements in a ggplot2 graphic | 
Value
ggplot2 theme code to increase the font size
Author(s)
Jim Albert
Examples
df <- data.frame(p=c(.1, .3, .5, .7, .9),
                 Prior=rep(1/5, 5))
ggplot(df, aes(p, Prior)) +
  geom_point() +
  increasefont()
Graph of several normal curves
Description
Graph of several normal curves
Usage
  many_normal_plots(list_normal_par)
Arguments
| list_normal_par | list of vectors, where each vector is a mean and standard deviation for a normal distribution | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
 list_normal_par <- list(c(100, 15),
     c(110, 15), c(120, 15))
 many_normal_plots(list_normal_par)
Graphs a collection of spinners
Description
Graphs a collection of spinners
Usage
  many_spinner_plots(list_regions)
Arguments
| list_regions | list of vectors of integer areas for the spins 1, 2, ... | 
Value
A ggplot2 object containing the spinner displays
Author(s)
Jim Albert
Examples
  regions1 <- c(1, 1, 1)
  regions2 <- c(2, 1, 2, 1)
  many_spinner_plots(list(regions1, regions2))
Annual Marriage Counts in Italy
Description
Annual marriage counts per 1000 of the population in Italy from 1936 to 1951
Usage
  marriage_counts
Format
A data frame with 16 observations on the following 2 variables.
- Year
- year 
- Count
- count of marriages per 1000 people 
Source
Unknown.
Nutritional data for McDonalds Sandwiches
Description
Serving size and calories for a selection of sandwiches from McDonalds
Usage
  mcdonalds
Format
A data frame with 11 observations on the following 3 variables.
- Sandwich
- name of sandwich 
- Size
- serving size in grams 
- Calories
- calories of sandwich 
Source
McDonalds restaurant
Metropolis sampling of a continuous distribution
Description
Implements Metropolis sampling for an arbitrary continuous probability distribution
Usage
  metropolis(logpost, current, C, iter, ...)
Arguments
| logpost | function definition of the log probability function | 
| current | starting value of algorithm | 
| C | half-width of proposal interval | 
| iter | number of iterations | 
| ... | other inputs needed in logpost function | 
Value
| S | vector of simulated values | 
| accept_rate | acceptance rate of algorithm | 
Author(s)
Jim Albert
Examples
lpost <- function(theta, s){
  dnorm(s$ybar, theta, s$se, log = TRUE) +
    dcauchy(theta, s$loc, s$scale, log = TRUE)
}
s <- list(ybar = 20,
          se = 0.4,
          loc = 10,
          scale = 2)
post <- metropolis(lpost, 10, 20, 100, s)
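A minimal sketch of a random-walk Metropolis step consistent with the arguments above, assuming the candidate is drawn uniformly within C of the current value and accepted with probability given by the ratio of posterior densities. The function `metropolis_sketch` is hypothetical.

```r
# Hypothetical Metropolis sampler with a uniform proposal of half-width C
metropolis_sketch <- function(logpost, current, C, iter, ...) {
  S <- numeric(iter)
  n_accept <- 0
  for (j in seq_len(iter)) {
    candidate <- runif(1, current - C, current + C)
    log_ratio <- logpost(candidate, ...) - logpost(current, ...)
    if (log(runif(1)) < log_ratio) {   # accept with prob min(1, ratio)
      current <- candidate
      n_accept <- n_accept + 1
    }
    S[j] <- current                    # store current value either way
  }
  list(S = S, accept_rate = n_accept / iter)
}

lpost <- function(theta, s) {
  dnorm(s$ybar, theta, s$se, log = TRUE) +
    dcauchy(theta, s$loc, s$scale, log = TRUE)
}
s <- list(ybar = 20, se = 0.4, loc = 10, scale = 2)
set.seed(1)
post <- metropolis_sketch(lpost, 10, 20, 100, s)
```

Working on the log scale avoids numerical underflow when the densities are very small.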
Movies Sales Data
Description
Weekend and gross sales for a selection of movies released in 2017
Usage
  movies2017
Format
A data frame with 10 observations on the following 3 variables.
- Movie
- name of movie 
- Weekend
- opening weekend sales in millions of dollars 
- Gross
- gross sales in millions of dollars 
Source
Internet Movie Database
Basketball Shooting Data for Point Guards
Description
Field goal and free throw shooting data for a collection of great NBA point guards
Usage
  nba_guards
Format
A data frame with 230 observations on the following 6 variables.
- Player
- name of player 
- Age
- age of player 
- FG
- field goals 
- FGA
- field goal attempts 
- FT
- free throws 
- FTA
- free throw attempts 
Source
Data collected from Basketball-Reference.com.
Displays Area Under a Normal Curve
Description
Computes and Displays Area Under a Normal Curve
Usage
  normal_area(lo, hi, normal_pars, Color = "orange")
Arguments
| lo | lower bound of interval | 
| hi | upper bound of interval | 
| normal_pars | vector of mean and standard deviation of the normal curve | 
| Color | color of shading in plot | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  lo <- 10
  hi <- 20
  normal_pars <- c(25, 10)
  normal_area(lo, hi, normal_pars)
Draws a Normal Curve
Description
Draws a Normal Curve
Usage
  normal_draw(normal_pars, Color = "red")
Arguments
| normal_pars | vector of mean and standard deviation of the normal curve | 
| Color | color of line in plot | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  normal_pars <- c(2, 1)
  normal_draw(normal_pars)
Probability Interval for a Normal Curve
Description
Computes "equal-tails" probability interval for a normal curve
Usage
  normal_interval(prob, normal_pars, Color = "orange")
Arguments
| prob | value of coverage probability | 
| normal_pars | vector of mean and standard deviation of the normal curve | 
| Color | color of shading in plot | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  normal_pars <- c(2, 0.5)
  prob <- 0.5
  normal_interval(prob, normal_pars)
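An equal-tails interval leaves (1 - prob) / 2 in each tail, so the endpoints are two normal quantiles. A short base R check with the example's inputs:

```r
normal_pars <- c(2, 0.5)
prob <- 0.5

tail_prob <- (1 - prob) / 2
interval <- qnorm(c(tail_prob, 1 - tail_prob),
                  mean = normal_pars[1], sd = normal_pars[2])

# Probability content of the interval, recovered from the CDF
coverage <- diff(pnorm(interval, normal_pars[1], normal_pars[2]))
```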
Displays a Quantile of a Normal Curve
Description
Displays a Quantile of a Normal Curve
Usage
  normal_quantile(prob, normal_pars, Color = "orange")
Arguments
| prob | probability value of interest | 
| normal_pars | vector of mean and standard deviation of the normal curve | 
| Color | color of shading in plot | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
  normal_pars <- c(100, 10)
  prob <- 0.7
  normal_quantile(prob, normal_pars)
Updates a Normal Prior with Normal Data
Description
Finds the parameters of the normal posterior with normal data and a normal prior
Usage
  normal_update(prior, data, teach=FALSE)
Arguments
| prior | vector with components mean and sd of the normal prior | 
| data | vector with components the sample mean and the standard error of the estimate | 
| teach | logical variable indicating the form of the output | 
Value
If teach = TRUE, returns data frame that displays the mean, precision, and standard deviation for the prior, data, and posterior. If teach = FALSE, returns a vector with mean and standard deviation of the posterior.
Author(s)
Jim Albert
Examples
  prior <- c(100, 10)
  data <- c(110, 15)
  normal_update(prior, data)
  normal_update(prior, data, teach=TRUE)
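The update follows the usual normal-normal conjugate result: precisions (reciprocal variances) add, and the posterior mean is the precision-weighted average of the prior mean and the sample mean. A sketch with the hypothetical helper `normal_update_sketch`:

```r
# Hypothetical helper implementing the conjugate normal update
normal_update_sketch <- function(prior, data) {
  prec <- c(1 / prior[2]^2, 1 / data[2]^2)   # prior and data precisions
  post_prec <- sum(prec)
  post_mean <- sum(c(prior[1], data[1]) * prec) / post_prec
  c(mean = post_mean, sd = 1 / sqrt(post_prec))
}

post <- normal_update_sketch(prior = c(100, 10), data = c(110, 15))
```

For these inputs the posterior mean is 1340/13 (about 103.08) and the posterior standard deviation is 30/sqrt(13) (about 8.32), closer to the prior because the prior is the more precise of the two sources.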
Winning Times in the 100 Meter Butterfly Race
Description
Winning times in seconds for the men's and women's 100m butterfly race for the Olympics from 1964 through 2016.
Usage
  olympic_butterfly
Format
A data frame with 28 observations on the following 3 variables.
- Year
- year of Olympics 
- Gender
- gender 
- Time
- winning time in seconds 
Source
https://www.olympic.org/swimming/
Graphs prior and posterior probabilities
Description
Graphs prior and posterior probabilities from a discrete Bayesian model
Usage
  prior_post_plot(d, Color = "orange")
Arguments
| d | data frame whose first column contains the model values, with additional columns named Prior and Posterior | 
| Color | fill color for the bars | 
Value
ggplot2 object containing the graphical display.
Author(s)
Jim Albert
Examples
d <- data.frame(p=c(.1, .3, .5, .7, .9),
                 Prior=rep(1/5, 5))
y <- 5
n <- 10
d$Likelihood <- dbinom(y, prob=d$p, size=n)
d <- bayesian_crank(d)
prior_post_plot(d, "red")
Constructs a graph of a probability distribution
Description
Constructs a graph of a discrete probability distribution
Usage
  prob_plot(d, Color = "red", Size = 1.5)
Arguments
| d | data frame where the first two columns are the variable and associated probabilities | 
| Color | color of line in plot | 
| Size | width of line in plot | 
Value
A ggplot2 object containing the plot display
Author(s)
Jim Albert
Examples
  d <- data.frame(x=1:5,
         Probability=c(.1, .2, .3, .3, .1))
  prob_plot(d)
Prices of One Carat Diamonds
Description
Prices of a sample of one carat diamonds
Usage
  pt100price
Format
A data frame with 25 observations on the following 2 variables.
- diamond
- index of diamond 
- price
- price divided by 100 
Source
Unknown.
Prices of 0.99 Carat Diamonds
Description
Prices of a sample of 0.99 carat diamonds
Usage
  pt99price
Format
A data frame with 23 observations on the following 2 variables.
- diamond
- index of diamond 
- price
- price divided by 100 
Source
Unknown.
Baseball Win-Loss Records
Description
Final standings of the MLB baseball teams in the 2018 season
Usage
  pythag2018
Format
A data frame with 30 observations on the following 7 variables.
- Team
- team abbreviation 
- League
- league abbreviation 
- W
- number of wins 
- L
- number of losses 
- Pct
- proportion of wins 
- R
- average runs scored 
- RA
- average runs allowed 
Source
Lahman database
Metropolis sampling of a discrete distribution
Description
Implements Metropolis sampling for an arbitrary discrete probability distribution
Usage
  random_walk(pd, start, num_steps)
Arguments
| pd | function containing discrete probability function on the integers 1, 2, ... | 
| start | starting value of algorithm | 
| num_steps | number of iterations of algorithm | 
Value
A vector of simulated values
Author(s)
Jim Albert
Examples
# random walk through a binomial distribution
pd <- function(x){
  dbinom(x, size = 10, prob = 0.5)
}
start <- 4
num_steps <- 50
out <- random_walk(pd, start, num_steps)
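A sketch of a discrete random-walk Metropolis algorithm, assuming the candidate is one step to the left or right with equal probability and is accepted with probability pd(candidate) / pd(current). The function `random_walk_sketch` is hypothetical.

```r
# Hypothetical discrete Metropolis sampler on the integers
random_walk_sketch <- function(pd, start, num_steps) {
  y <- numeric(num_steps)
  current <- start
  for (j in seq_len(num_steps)) {
    candidate <- current + sample(c(-1, 1), 1)   # step left or right
    if (runif(1) < pd(candidate) / pd(current)) {
      current <- candidate                       # accept the move
    }
    y[j] <- current
  }
  y
}

pd <- function(x) dbinom(x, size = 10, prob = 0.5)
set.seed(1)
out <- random_walk_sketch(pd, start = 4, num_steps = 50)
```

Candidates outside 0 to 10 have probability zero under `pd`, so the walk never leaves the support as long as `start` lies inside it.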
Sleeping Times
Description
Sleeping times for a single night for a sample of college students
Usage
  sleeping_times
Format
A data frame with 14 observations on the following single variable.
- hours
- number of hours of sleep 
Source
Personal collection
Implements Bayes' rule for a spinner problem
Description
Computes and plots the posterior distribution of spinners given a sequence of spins
Usage
  spinner_bayes(list_regions,
                prior,
                data,
                plot=TRUE)
Arguments
| list_regions | list of vectors of integer areas for the spins 1, 2, ... | 
| prior | a vector containing the prior probabilities for the spinners | 
| data | a vector containing the spin values where 1, 2, 3, ... are the possible spins | 
| plot | if plot=TRUE, a comparative graph of the prior and posterior probabilities is displayed | 
Value
A data frame with variables Spinner, Prior, Likelihood, Product, and Posterior
Author(s)
Jim Albert
Examples
  regions1 <- c(1, 1, 1)
  regions2 <- c(2, 1, 2, 1)
  data <- c(1, 1, 1, 2)
  spinner_bayes(list(regions1, regions2),
                prior=c(0.5, 0.5),
                data)
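The returned table can be reproduced by hand, assuming each spinner's spin probabilities are proportional to its region areas and the spins are independent. The helper `spinner_bayes_sketch` is hypothetical and omits the plot.

```r
# Hypothetical helper: Bayes' rule over a set of candidate spinners
spinner_bayes_sketch <- function(list_regions, prior, data) {
  max_spin <- max(data, lengths(list_regions))
  likelihood <- sapply(list_regions, function(regions) {
    # pad with zeros for spins this spinner cannot produce
    probs <- c(regions / sum(regions),
               rep(0, max_spin - length(regions)))
    prod(probs[data])
  })
  product <- prior * likelihood
  data.frame(Spinner = seq_along(list_regions),
             Prior = prior,
             Likelihood = likelihood,
             Product = product,
             Posterior = product / sum(product))
}

out <- spinner_bayes_sketch(list(c(1, 1, 1), c(2, 1, 2, 1)),
                            prior = c(0.5, 0.5),
                            data = c(1, 1, 1, 2))
```

Here the likelihoods are 1/81 and 1/162, so the posterior probabilities are 2/3 and 1/3.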
Simulate random data from a spinner
Description
Simulate random data from a spinner
Usage
  spinner_data(regions, nsim=1000)
Arguments
| regions | vector of integer values for the spins 1, 2, ... | 
| nsim | number of spins | 
Value
A vector of random spins from the spinner
Author(s)
Jim Albert
Examples
  regions <- c(2, 1, 1, 2)
  spinner_data(regions, nsim=20)
Computes likelihood matrix for many spinners
Description
Computes likelihood matrix for many spinners
Usage
  spinner_likelihoods(regions)
Arguments
| regions | list of vectors of integer areas for the spins 1, 2, ... | 
Value
A matrix where each row corresponds to the outcome probabilities for one spinner.
Author(s)
Jim Albert
Examples
  sp1 <- c(2, 1, 1)
  sp2 <- c(1, 1, 1, 1)
  regions <- list(sp1, sp2)
  spinner_likelihoods(regions)
Constructs a spinner
Description
Constructs a spinner with different regions
Usage
  spinner_plot(probs, ...)
Arguments
| probs | vector of probabilities for the spins 1, 2, ... | 
| ... | optional vector of values and title | 
Value
A ggplot2 object containing the spinner display
Author(s)
Jim Albert
Examples
  probs <- rep(.2, 5)
  spinner_plot(probs,
         values=c("A", "B", "C", "D", "E"),
         title="My Spinner")
  # probs does not need to be normalized
  spinner_plot(c(1, 2, 1, 2))
Display probability distribution for a spinner
Description
Display probability distribution for a spinner
Usage
  spinner_probs(regions)
Arguments
| regions | vector of positive values for the spins 1, 2, ... | 
Value
Data frame with variables Region and Prob
Author(s)
Jim Albert
Examples
  regions <- c(2, 1, 1, 2)
  spinner_probs(regions)
Taxi Fares
Description
Sample of taxi fares from a particular city
Usage
  taxi_fares
Format
A data frame with 20 observations on the following single variable.
- fare
- taxi cab fare 
Source
Personal collection
Tennis Times to Serve
Description
Data on time to serve for six professional tennis players
Usage
  tennis_serve
Format
A data frame with 6 observations on the following 3 variables.
- Player
- last name of player 
- n
- number of serves 
- ybar
- mean time to serve 
Source
https://github.com/JeffSackmann
Testing prior for two proportions
Description
Constructs a discrete distribution for two proportions under either a testing hypothesis or a uniform hypothesis
Usage
  testing_prior(lo=.1, hi=.9, n_values=9,
        pequal=0.5, uniform=FALSE)
Arguments
| lo | minimum value of each proportion | 
| hi | maximum value of each proportion | 
| n_values | number of values of each proportion | 
| pequal | probability of the equality of the two proportions | 
| uniform | indicates if a uniform prior is desired | 
Value
matrix of probabilities where the rows and columns are labeled by the values of the proportions
Author(s)
Jim Albert
Examples
  # testing prior where each proportion is
  # .1, .3, .5, .7, .9
  Prob <- testing_prior(.1, .9, 5)
  # uniform prior over same proportion values
  Prob <- testing_prior(.1, .9, 5, uniform=TRUE)
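One plausible construction of such a matrix, assuming the testing prior spreads `pequal` evenly over the diagonal cells (equal proportions) and the remaining probability evenly over the off-diagonal cells. This layout is an assumption, and `testing_prior_sketch` is a hypothetical name.

```r
# Hypothetical construction of a testing or uniform prior matrix
testing_prior_sketch <- function(lo = .1, hi = .9, n_values = 9,
                                 pequal = 0.5, uniform = FALSE) {
  p <- seq(lo, hi, length.out = n_values)
  if (uniform) {
    prior <- matrix(1 / n_values^2, n_values, n_values)
  } else {
    n_off <- n_values^2 - n_values
    prior <- matrix((1 - pequal) / n_off, n_values, n_values)
    diag(prior) <- pequal / n_values
  }
  dimnames(prior) <- list(as.character(p), as.character(p))
  prior
}

Prob <- testing_prior_sketch(.1, .9, 5)
Unif <- testing_prior_sketch(.1, .9, 5, uniform = TRUE)
```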
Mike Trout Statcast Data
Description
Launch speed and distance traveled for a sample of balls hit by the baseball player Mike Trout
Usage
  trout20
Format
A data frame with 25 observations on the following 2 variables.
- launch_speed
- launch speed in mph 
- hit_distance_sc
- distance in feet 
Source
Major League Baseball Advanced Media
Summaries of a probability matrix
Description
Computes posterior of difference P2 - P1 of a probability matrix of two proportions
Usage
  two_p_summarize(prob_matrix)
Arguments
| prob_matrix | probability matrix where the rows and columns are labeled with the values of the proportions | 
Value
data frame with variables diff21 and Prob where diff21 = P2 - P1
Author(s)
Jim Albert
Examples
  # use uniform prior over values .2, .3, .4
  prob_matrix <- testing_prior(.2, .4, 3, uniform=TRUE)
  two_p_summarize(prob_matrix)
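The summary amounts to collapsing the matrix over cells that share the same difference. A base R sketch, assuming rows are labeled by P1 and columns by P2; the uniform matrix is built by hand so the block does not depend on the package.

```r
p <- c(.2, .3, .4)
prob_matrix <- matrix(1 / 9, 3, 3,
                      dimnames = list(as.character(p), as.character(p)))

p1 <- as.numeric(rownames(prob_matrix))
p2 <- as.numeric(colnames(prob_matrix))

# Difference P2 - P1 for every cell; rounding merges values that
# differ only by floating-point error
diff21 <- round(outer(p1, p2, function(a, b) b - a), 10)

cells <- data.frame(diff21 = as.vector(diff21),
                    Prob = as.vector(prob_matrix))
summ <- aggregate(Prob ~ diff21, data = cells, FUN = sum)
```

Under this uniform prior the five possible differences (-0.2 through 0.2) carry probabilities 1/9, 2/9, 3/9, 2/9, and 1/9.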
Posterior updating of two proportions
Description
Computes posterior distribution of two proportions with a discrete prior
Usage
  two_p_update(prior, s1f1, s2f2)
Arguments
| prior | prior probability matrix where the rows and columns are labeled with the values of the proportions | 
| s1f1 | number of successes and number of failures from first sample | 
| s2f2 | number of successes and number of failures from second sample | 
Value
posterior probability matrix
Author(s)
Jim Albert
Examples
  prior <- testing_prior()
  s1f1 <- c(3, 10)
  s2f2 <- c(8, 20)
  two_p_update(prior, s1f1, s2f2)
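A sketch of the update, assuming two independent binomial samples with rows of the prior labeled by the first proportion and columns by the second. The helper `two_p_update_sketch` is hypothetical, and the uniform prior is built by hand for self-containment.

```r
# Hypothetical helper: posterior is prior times the joint likelihood
two_p_update_sketch <- function(prior, s1f1, s2f2) {
  p1 <- as.numeric(rownames(prior))
  p2 <- as.numeric(colnames(prior))
  lik1 <- p1^s1f1[1] * (1 - p1)^s1f1[2]   # first sample
  lik2 <- p2^s2f2[1] * (1 - p2)^s2f2[2]   # second sample
  post <- prior * outer(lik1, lik2)
  post / sum(post)
}

p <- as.character(seq(.1, .9, by = .2))
prior <- matrix(1 / 25, 5, 5, dimnames = list(p, p))
post <- two_p_update_sketch(prior, s1f1 = c(3, 10), s2f2 = c(8, 20))
```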
Times to Serve for Two Tennis Players
Description
Measurements of time to serve for serves by the tennis players Roger Federer and Rafael Nadal
Usage
  two_players_time_to_serve
Format
A data frame with 100 observations on the following 2 variables.
- Player
- last name of player 
- time
- time to serve in seconds 
Source
https://github.com/JeffSackmann
Website tracking data
Description
Number of visits to a blog website for different weeks and days of the week
Usage
  web_visits
Format
A data frame with 28 observations on the following 3 variables.
- Week
- week number 
- Day
- day of the week 
- Count
- number of website visits 
Source
Personal data collected from Wordpress.com