The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
In this vignette we will discuss about how to customize the summary
statistics using ExpCustomStat function from
SmartEDA
. The output of this function returns matrix object
containing descriptive information on all input variables for each
level/combination of levels in categorical/group variable. Also, while
running the analysis we can filter row/cases of the data. We can apply
the filters at individual variable level or complete data like base
subsetting.
Function definition:
ExpCustomStat(data,Cvar=NULL,Nvar=NULL,stat=NULL,gpby=TRUE,filt=NULL,dcast=FALSE)
Key functionalities of ExpCustomStat
are:
ExpCustomStat
functionWill open the carseats data from ISLR package and drive different
types of use cases using ExpCustomStat
function.
In this vignette, we will be using a simulated data set containing sales of child car seats at 400 different stores.
Data Source ISLR package.
Function source SmartEDA package
Carseats data from ISLR package:
options(width = 150)
= ISLR::Carseats
CData head(CData,5)
## Sales CompPrice Income Advertising Population Price ShelveLoc Age Education Urban US
## 1 9.50 138 73 11 276 120 Bad 42 17 Yes Yes
## 2 11.22 111 48 16 260 83 Good 65 10 Yes Yes
## 3 10.06 113 35 10 269 80 Medium 59 12 Yes Yes
## 4 7.40 117 100 4 466 97 Medium 55 14 Yes Yes
## 5 4.15 141 64 3 340 128 Bad 38 13 Yes No
Categorical summaries to describe the distribution for a qualitative variables.
The number of observations for particular category
"US", "Urban" and "ShelveLoc"
variableExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc","Education"),gpby=FALSE)
## Level Group_by Count Prop
## 1: Yes US 258 64.50
## 2: No US 142 35.50
## 3: Yes Urban 282 70.50
## 4: No Urban 118 29.50
## 5: Bad ShelveLoc 96 24.00
## 6: Good ShelveLoc 85 21.25
## 7: Medium ShelveLoc 219 54.75
## 8: 17 Education 49 12.25
## 9: 10 Education 48 12.00
## 10: 12 Education 49 12.25
## 11: 14 Education 40 10.00
## 12: 13 Education 43 10.75
## 13: 16 Education 47 11.75
## 14: 15 Education 36 9.00
## 15: 18 Education 40 10.00
## 16: 11 Education 48 12.00
OR we can use similar analysis using ExpCTable
function
from same package, this functions includes cumulative percentages and
Total
ExpCTable(Carseats,Target=NULL,clim=5,nlim=15,round=2,bin=NULL,per=F)
## Variable Valid Frequency Percent CumPercent
## 1 ShelveLoc Bad 96 24.00 24.00
## 2 ShelveLoc Good 85 21.25 45.25
## 3 ShelveLoc Medium 219 54.75 100.00
## 4 ShelveLoc TOTAL 400 NA NA
## 5 Urban No 118 29.50 29.50
## 6 Urban Yes 282 70.50 100.00
## 7 Urban TOTAL 400 NA NA
## 8 US No 142 35.50 35.50
## 9 US Yes 258 64.50 100.00
## 10 US TOTAL 400 NA NA
## 11 Education 10 48 12.00 12.00
## 12 Education 11 48 12.00 24.00
## 13 Education 12 49 12.25 36.25
## 14 Education 13 43 10.75 47.00
## 15 Education 14 40 10.00 57.00
## 16 Education 15 36 9.00 66.00
## 17 Education 16 47 11.75 77.75
## 18 Education 17 49 12.25 90.00
## 19 Education 18 40 10.00 100.00
## 20 Education TOTAL 400 NA NA
"US", "Urban" and "ShelveLoc"
variableExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=FALSE)
## Level Group_by Count Prop
## 1: Yes US 258 64.50
## 2: No US 142 35.50
## 3: Yes Urban 282 70.50
## 4: No Urban 118 29.50
## 5: Bad ShelveLoc 96 24.00
## 6: Good ShelveLoc 85 21.25
## 7: Medium ShelveLoc 219 54.75
To produce cross tables which calculate counts and proportions for each combination of categorical variables we can use ExpCustomStat
NOTE: For crosstabulation change input gpby
=TRUE
"US" Vs "Urban"
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt=NULL)
## US Urban Count Prop
## 1: Yes Yes 186 46.5
## 2: No Yes 96 24.0
## 3: Yes No 72 18.0
## 4: No No 46 11.5
We can also produce multidimensional tables based on three or more categorical variables
"US" Vs "Urban" Vs "ShelveLoc"
ExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=TRUE,filt=NULL)
## US Urban ShelveLoc Count Prop
## 1: Yes Yes Bad 51 12.75
## 2: Yes Yes Good 39 9.75
## 3: Yes Yes Medium 96 24.00
## 4: No Yes Bad 23 5.75
## 5: Yes No Bad 11 2.75
## 6: No Yes Medium 55 13.75
## 7: No No Medium 29 7.25
## 8: Yes No Medium 39 9.75
## 9: No Yes Good 18 4.50
## 10: Yes No Good 22 5.50
## 11: No No Good 6 1.50
## 12: No No Bad 11 2.75
If we want to understand the number of stores in US
and
location is Urban
for Population
size greater
than 150
Population
is > 150"US" Vs "Urban"
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt="Population>150")
## US Urban Filter Count Prop
## 1: Yes Yes Population>150 138 48.25
## 2: No Yes Population>150 63 22.03
## 3: Yes No Population>150 53 18.53
## 4: No No Population>150 32 11.19
Population
> 150 and Urban
==“Yes”"US" Vs "ShelveLoc"
ExpCustomStat(Carseats,Cvar=c("US","ShelveLoc"),gpby=TRUE,filt="Urban=='Yes' & Population>150")
## US ShelveLoc Filter Count Prop
## 1: Yes Bad Urban=='Yes' & Population>150 40 19.90
## 2: Yes Good Urban=='Yes' & Population>150 31 15.42
## 3: Yes Medium Urban=='Yes' & Population>150 67 33.33
## 4: No Bad Urban=='Yes' & Population>150 13 6.47
## 5: No Medium Urban=='Yes' & Population>150 37 18.41
## 6: No Good Urban=='Yes' & Population>150 13 6.47
Numerical summaries to describe the distribution for quantitative variables.
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','sd','min','max','IQR'))
## Attribute Count mean sum var sd min max IQR
## 1: Population 400 264.840000 105936.00 21719.813935 147.376436 10 509.00 259.50
## 2: Sales 400 7.496325 2998.53 7.975626 2.824115 0 16.27 3.93
## 3: CompPrice 400 124.975000 49990.00 235.147243 15.334512 77 175.00 20.00
## 4: Income 400 68.657500 27463.00 783.218239 27.986037 21 120.00 48.25
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('min','p0.25','median','p0.75','max'))
## Attribute min p0.25 median p0.75 max
## 1: Population 10 139.00 272.00 398.50 509.00
## 2: Sales 0 5.39 7.49 9.32 16.27
## 3: CompPrice 77 115.00 125.00 135.00 175.00
## 4: Income 21 42.75 69.00 91.00 120.00
Filter rows/cases of complete dataset where conditions are true
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','min','median','max'),filt="Urban=='Yes'")
## Attribute Filter Count mean sum var min median max
## 1: Population Urban=='Yes' 282 259.886525 73288.00 20913.488857 12.00 274.00 508.00
## 2: Sales Urban=='Yes' 282 7.468191 2106.03 8.044141 0.37 7.42 16.27
## 3: CompPrice Urban=='Yes' 282 125.634752 35429.00 246.545834 77.00 125.00 175.00
## 4: Income Urban=='Yes' 282 69.343972 19555.00 743.706885 21.00 70.00 120.00
options(width=150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','median','IQR'),filt="Urban=='Yes' & Population>150")
## Attribute Filter Count mean sum median IQR
## 1: Population Urban=='Yes' & Population>150 201 332.641791 66861.00 331.00 165.00
## 2: Sales Urban=='Yes' & Population>150 201 7.573383 1522.25 7.49 3.74
## 3: CompPrice Urban=='Yes' & Population>150 201 125.134328 25152.00 125.00 19.00
## 4: Income Urban=='Yes' & Population>150 201 68.223881 13713.00 69.00 47.00
This will be useful when we need to exclude redundant values like ‘999’ or ‘9999’ or ‘-9’ or ‘-1111’, or ‘888’ etc from each selected variable.
Eg:dat = data.frame(x = c(23,24,34,999,12,12,23,999,45), y = c(1,3,4,999,0,999,0,8,999,0)
Exclude 999:
x = c(23,24,34,12,12,23,45) y = c(1,3,4,0,0,8,0)
= Carseats[,]
data_sam sample(1:400,30),"Sales"] <- 999
data_sam[sample(1:400,20),"CompPrice"] <- -9
data_sam[sample(1:400,45),"Income"] <- 999
data_sam[ExpCustomStat(data_sam,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','min'),filt="All %ni% c(999,-9)")
## Attribute Filter Count mean sum min
## 1: Population Population%ni% c(999,-9) 400 264.840000 105936.00 10.00
## 2: Sales Sales%ni% c(999,-9) 370 7.535865 2788.27 0.16
## 3: CompPrice CompPrice%ni% c(999,-9) 380 125.150000 47557.00 77.00
## 4: Income Income%ni% c(999,-9) 355 69.025352 24504.00 21.00
Different filters for each numeric variable. For example, below are the conditions (logic) for each variable summary analysis.
"Population" - Consider only Good ShelveLoc (the quality of the shelving location for the car seats at each site) ShelveLoc=='Good'
"Sales" - Inculde only those store belongs to Urban location (Urban==Yes)
"CompPrice" - Exclude Price is greater than 150
"Education" - All stores
"Income" - Inculde only stores in US (US==Yes)
Table: Descriptive summary for Price, Population, Sales, CompPrice, Income based on the filters.
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Education","Income"),stat = c('Count','mean','sum','var','sd','IQR','median'),filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150^All^US=='Yes'"))
## Attribute Filter Count mean sum var sd IQR median
## 1: Population ShelveLoc=='Good' 85 267.047059 22699.00 16193.473950 127.253581 177.0000 272.00
## 2: Sales Urban=='Yes' 282 7.468191 2106.03 8.044141 2.836219 3.9175 7.42
## 3: CompPrice Price>=150 32 141.875000 4540.00 181.596774 13.475785 18.2500 142.50
## 4: Education All 400 13.900000 5560.00 6.867168 2.620528 4.0000 14.00
## 5: Income US=='Yes' 258 70.515504 18193.00 782.849953 27.979456 48.0000 70.00
Descriptive summary for numerical variable by group level.
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","ShelveLoc"), Nvar=c("Population","Sales"), stat = c('Count','Prop','mean','min','P0.25','median','p0.75','max'),gpby=FALSE)
## Level Attribute Group_by Count Prop mean min P0.25 median p0.75 max
## 1: Yes Population Urban 282 70.50 259.886525 12.00 139.0000 274.00 376.7500 508.00
## 2: No Population Urban 118 29.50 276.677966 10.00 144.0000 271.00 408.0000 509.00
## 3: Yes Sales Urban 282 70.50 7.468191 0.37 5.3750 7.42 9.2925 16.27
## 4: No Sales Urban 118 29.50 7.563559 0.00 5.4400 7.67 9.3350 14.90
## 5: Bad Population ShelveLoc 96 24.00 275.291667 10.00 145.5000 296.00 400.5000 501.00
## 6: Good Population ShelveLoc 85 21.25 267.047059 14.00 176.0000 272.00 353.0000 503.00
## 7: Medium Population ShelveLoc 219 54.75 259.401826 12.00 124.0000 261.00 405.0000 509.00
## 8: Bad Sales ShelveLoc 96 24.00 5.522917 0.37 4.0525 5.21 7.4625 11.67
## 9: Good Sales ShelveLoc 85 21.25 10.214000 3.58 8.3300 10.50 11.9600 16.27
## 10: Medium Sales ShelveLoc 219 54.75 7.306575 0.00 5.6250 7.38 8.7750 13.36
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','min','max','IQR','sd'), gpby = TRUE)
## Urban US ShelveLoc Attribute Count Prop mean sum PS min max IQR sd
## 1: Yes Yes Bad CompPrice 51 12.75 124.88235 6369 12.74 93 161 17.00 14.768408
## 2: Yes Yes Good CompPrice 39 9.75 127.00000 4953 9.91 89 156 22.00 15.998355
## 3: Yes Yes Medium CompPrice 96 24.00 125.05208 12005 24.01 85 175 20.00 15.410434
## 4: Yes No Bad CompPrice 23 5.75 125.52174 2887 5.78 86 157 21.50 17.367340
## 5: No Yes Bad CompPrice 11 2.75 122.45455 1347 2.69 88 149 8.50 17.494934
## 6: Yes No Medium CompPrice 55 13.75 126.00000 6930 13.86 77 159 20.00 16.611018
## 7: No No Medium CompPrice 29 7.25 121.86207 3534 7.07 89 154 21.00 16.012772
## 8: No Yes Medium CompPrice 39 9.75 126.33333 4927 9.86 96 150 18.50 14.411862
## 9: Yes No Good CompPrice 18 4.50 126.94444 2285 4.57 95 157 21.00 16.049454
## 10: No Yes Good CompPrice 22 5.50 122.36364 2692 5.39 107 147 16.25 11.986284
## 11: No No Good CompPrice 6 1.50 126.50000 759 1.52 96 142 13.75 16.610238
## 12: No No Bad CompPrice 11 2.75 118.36364 1302 2.60 106 131 15.00 9.080449
## 13: Yes Yes Bad Income 51 12.75 76.31373 3892 14.17 28 119 40.50 26.657449
## 14: Yes Yes Good Income 39 9.75 66.89744 2609 9.50 21 117 49.00 28.163802
## 15: Yes Yes Medium Income 96 24.00 68.83333 6608 24.06 25 120 41.25 27.070538
## 16: Yes No Bad Income 23 5.75 67.52174 1553 5.65 25 114 34.00 25.414508
## 17: No Yes Bad Income 11 2.75 72.72727 800 2.91 21 120 52.50 33.400871
## 18: Yes No Medium Income 55 13.75 66.14545 3638 13.25 22 118 42.50 26.857248
## 19: No No Medium Income 29 7.25 62.44828 1811 6.59 22 120 60.00 32.077569
## 20: No Yes Medium Income 39 9.75 69.05128 2693 9.81 21 119 55.00 30.884291
## 21: Yes No Good Income 18 4.50 69.72222 1255 4.57 24 113 62.25 31.765326
## 22: No Yes Good Income 22 5.50 72.31818 1591 5.79 22 115 35.00 27.425026
## 23: No No Good Income 6 1.50 53.83333 323 1.18 30 78 40.25 22.542552
## 24: No No Bad Income 11 2.75 62.72727 690 2.51 25 105 17.00 23.524842
## Urban US ShelveLoc Attribute Count Prop mean sum PS min max IQR sd
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','median','IQR'), gpby = TRUE,filt="Urban=='Yes'")
## Urban US ShelveLoc Attribute Filter Count Prop mean sum PS median IQR
## 1: Yes Yes Bad CompPrice Urban=='Yes' 51 18.09 124.88235 6369 17.98 125.0 17.00
## 2: Yes Yes Good CompPrice Urban=='Yes' 39 13.83 127.00000 4953 13.98 128.0 22.00
## 3: Yes Yes Medium CompPrice Urban=='Yes' 96 34.04 125.05208 12005 33.88 125.0 20.00
## 4: Yes No Bad CompPrice Urban=='Yes' 23 8.16 125.52174 2887 8.15 125.0 21.50
## 5: Yes No Medium CompPrice Urban=='Yes' 55 19.50 126.00000 6930 19.56 127.0 20.00
## 6: Yes No Good CompPrice Urban=='Yes' 18 6.38 126.94444 2285 6.45 122.0 21.00
## 7: Yes Yes Bad Income Urban=='Yes' 51 18.09 76.31373 3892 19.90 81.0 40.50
## 8: Yes Yes Good Income Urban=='Yes' 39 13.83 66.89744 2609 13.34 69.0 49.00
## 9: Yes Yes Medium Income Urban=='Yes' 96 34.04 68.83333 6608 33.79 69.0 41.25
## 10: Yes No Bad Income Urban=='Yes' 23 8.16 67.52174 1553 7.94 67.0 34.00
## 11: Yes No Medium Income Urban=='Yes' 55 19.50 66.14545 3638 18.60 69.0 42.50
## 12: Yes No Good Income Urban=='Yes' 18 6.38 69.72222 1255 6.42 73.5 62.25
options(width = 150)
= Carseats[,]
data_sam sample(1:400,30),"Sales"] <- 888
data_sam[sample(1:400,20),"CompPrice"] <- 999
data_sam[sample(1:400,45),"Income"] <- 999
data_sam[ExpCustomStat(data_sam,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("Sales","CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS'), gpby = TRUE,filt="All %ni% c(888,999)")
## Urban US ShelveLoc Attribute Filter Count Prop mean sum PS
## 1: Yes Yes Bad Sales Sales%ni% c(888,999) 47 12.70 5.655106 265.79 9.56
## 2: Yes Yes Good Sales Sales%ni% c(888,999) 37 10.00 10.774054 398.64 14.35
## 3: Yes Yes Medium Sales Sales%ni% c(888,999) 93 25.14 7.569785 703.99 25.33
## 4: Yes No Bad Sales Sales%ni% c(888,999) 22 5.95 5.403636 118.88 4.28
## 5: No Yes Bad Sales Sales%ni% c(888,999) 11 2.97 5.959091 65.55 2.36
## 6: Yes No Medium Sales Sales%ni% c(888,999) 49 13.24 7.094082 347.61 12.51
## 7: No No Medium Sales Sales%ni% c(888,999) 25 6.76 6.415600 160.39 5.77
## 8: No Yes Medium Sales Sales%ni% c(888,999) 35 9.46 7.811429 273.40 9.84
## 9: Yes No Good Sales Sales%ni% c(888,999) 16 4.32 9.144375 146.31 5.27
## 10: No Yes Good Sales Sales%ni% c(888,999) 20 5.41 9.998000 199.96 7.20
## 11: No No Good Sales Sales%ni% c(888,999) 5 1.35 9.386000 46.93 1.69
## 12: No No Bad Sales Sales%ni% c(888,999) 10 2.70 5.133000 51.33 1.85
## 13: Yes Yes Bad CompPrice CompPrice%ni% c(888,999) 50 13.16 124.900000 6245.00 13.14
## 14: Yes Yes Medium CompPrice CompPrice%ni% c(888,999) 91 23.95 125.428571 11414.00 24.01
## 15: Yes No Bad CompPrice CompPrice%ni% c(888,999) 22 5.79 124.954545 2749.00 5.78
## 16: No Yes Bad CompPrice CompPrice%ni% c(888,999) 11 2.89 122.454545 1347.00 2.83
## 17: Yes No Medium CompPrice CompPrice%ni% c(888,999) 52 13.68 126.057692 6555.00 13.79
## 18: Yes Yes Good CompPrice CompPrice%ni% c(888,999) 35 9.21 127.742857 4471.00 9.41
## 19: No No Medium CompPrice CompPrice%ni% c(888,999) 28 7.37 121.928571 3414.00 7.18
## 20: No Yes Medium CompPrice CompPrice%ni% c(888,999) 37 9.74 126.729730 4689.00 9.87
## 21: Yes No Good CompPrice CompPrice%ni% c(888,999) 18 4.74 126.944444 2285.00 4.81
## 22: No Yes Good CompPrice CompPrice%ni% c(888,999) 20 5.26 121.550000 2431.00 5.11
## 23: No No Good CompPrice CompPrice%ni% c(888,999) 5 1.32 125.400000 627.00 1.32
## 24: No No Bad CompPrice CompPrice%ni% c(888,999) 11 2.89 118.363636 1302.00 2.74
## 25: Yes Yes Bad Income Income%ni% c(888,999) 50 14.08 76.180000 3809.00 15.67
## 26: Yes Yes Good Income Income%ni% c(888,999) 33 9.30 67.636364 2232.00 9.18
## 27: Yes Yes Medium Income Income%ni% c(888,999) 83 23.38 68.843373 5714.00 23.51
## 28: Yes No Bad Income Income%ni% c(888,999) 23 6.48 67.521739 1553.00 6.39
## 29: No Yes Bad Income Income%ni% c(888,999) 9 2.54 67.888889 611.00 2.51
## 30: No Yes Medium Income Income%ni% c(888,999) 34 9.58 67.058824 2280.00 9.38
## 31: Yes No Medium Income Income%ni% c(888,999) 50 14.08 67.800000 3390.00 13.95
## 32: No No Medium Income Income%ni% c(888,999) 25 7.04 59.520000 1488.00 6.12
## 33: Yes No Good Income Income%ni% c(888,999) 16 4.51 67.937500 1087.00 4.47
## 34: No Yes Good Income Income%ni% c(888,999) 18 5.07 74.611111 1343.00 5.53
## 35: No No Good Income Income%ni% c(888,999) 6 1.69 53.833333 323.00 1.33
## 36: No No Bad Income Income%ni% c(888,999) 8 2.25 59.000000 472.00 1.94
## Urban US ShelveLoc Attribute Filter Count Prop mean sum PS
Different base for each numeric variable.
"Population" - Consider only Good ShelveLoc (the quality of the shelving location for the car seats at each site) ShelveLoc=='Good'
"Sales" - Inculde only those store belongs to Urban location (Urban==Yes)
"CompPrice" - Exclude Price is greater than 150
ExpCustomStat(Carseats,Cvar = c("Urban","US"), Nvar=c("Population","Sales","CompPrice"), stat = c('Count','Prop','mean','sum','var','IQR'), filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150"))
## Urban US Attribute Filter Count Prop mean sum var IQR
## 1: Yes Yes Population ShelveLoc=='Good' 39 45.88 277.538462 10824.00 18259.676113 176.0000
## 2: Yes No Population ShelveLoc=='Good' 18 21.18 219.888889 3958.00 13625.633987 155.5000
## 3: No Yes Population ShelveLoc=='Good' 22 25.88 283.318182 6233.00 15122.512987 205.5000
## 4: No No Population ShelveLoc=='Good' 6 7.06 280.666667 1684.00 13183.066667 125.2500
## 5: Yes Yes Sales Urban=='Yes' 186 65.96 7.710968 1434.24 8.734665 4.0775
## 6: Yes No Sales Urban=='Yes' 96 34.04 6.997813 671.79 6.445127 3.4175
## 7: Yes Yes CompPrice Price>=150 16 50.00 142.187500 2275.00 223.362500 17.5000
## 8: No Yes CompPrice Price>=150 7 21.88 138.428571 969.00 180.952381 16.0000
## 9: Yes No CompPrice Price>=150 7 21.88 143.285714 1003.00 156.571429 14.5000
## 10: No No CompPrice Price>=150 2 6.25 146.500000 293.00 112.500000 7.5000
Reshapes a grouped data
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban"), Nvar=c("Population","Sales"), stat = c('Count','Prop'),gpby=TRUE,dcast=TRUE)
## Row value : Attribute
## Column value: Urban
## Statistics : Count + Prop
## Attribute Count_No Count_Yes Prop_No Prop_Yes
## 1: Population 118 282 29.5 70.5
## 2: Sales 118 282 29.5 70.5
##Frequency table for categorical variables
ExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=FALSE)
##Crosstabulation between categorical variables
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt=NULL)
ExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=TRUE,filt=NULL)
##Adding filters for custom tables
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt="Population>150")
ExpCustomStat(Carseats,Cvar=c("US","ShelveLoc"),gpby=TRUE,filt="Urban=='Yes' & Population>150")
## Numeric variable summary
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','min','max'))
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('min','p0.25','median','p0.75','max'))
## Adding filters for complete data (like base Subset)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var'),filt="Urban=='Yes'")
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum'),filt="Urban=='Yes' & Population>150")
## Filter unique value from all the numeric variables
ExpCustomStat(data_sam,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','min'),filt="All %ni% c(999,-9)")
## Adding filters at variable level
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Education","Income"),stat = c('Count','mean','sum','var','sd','IQR','median'),filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150^ ^US=='Yes'"))
##Numerical summaries by category
##Variable summary report (One group variable)
ExpCustomStat(Carseats,Cvar = c("Urban","ShelveLoc"), Nvar=c("Population","Sales"), stat = c('Count','Prop','mean','min','P0.25','median','p0.75','max'),gpby=FALSE)
##Variable summary report (More than One group variable)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','min','max','IQR','sd'), gpby = TRUE)
##Variable summary report (More than One group variable) with filter
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','P0.25','median','p0.75'), gpby = TRUE,filt="Urban=='Yes'")
ExpCustomStat(data_sam,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("Sales","CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS'), gpby = TRUE,filt="All %ni% c(888,999)")
ExpCustomStat(Carseats,Cvar = c("Urban","US"), Nvar=c("Population","Sales","CompPrice"), stat = c('Count','Prop','mean','sum','var','min','max'), filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150"))
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.