Who needs the package?

This package implements a specific method for generating n-dimensional random vectors with given marginal distributions and a given correlation matrix. The method combines two parts: the NORTA (NORmal To Anything) approach, which generates a standard normal random vector and then transforms it into a random vector with the specified marginal distributions, and the RA (Retrospective Approximation) algorithm, a generic stochastic root-finding algorithm. The marginals can be continuous or discrete. If you need to generate a data set with specified marginals and correlations for research, the package is a good choice.
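The NORTA transformation described above can be sketched in a few lines of base R. This is only an illustration, not the package's internal code: the variable names, the two marginals and the value 0.5 in `normal_cor` are assumptions chosen for the example.

```r
# Minimal NORTA-style sketch in base R (illustrative, not the package's code).
set.seed(1)
n <- 5000

# Correlation matrix of the underlying standard normal vector (assumed input).
normal_cor <- matrix(c(1.0, 0.5,
                       0.5, 1.0), 2, 2)

# Step 1: correlated standard normals via the Cholesky factor.
z <- matrix(rnorm(n * 2), n, 2) %*% chol(normal_cor)

# Step 2: map each column to uniforms on (0, 1) with the normal CDF.
u <- pnorm(z)

# Step 3: apply each marginal's inverse CDF (one continuous, one discrete).
x <- cbind(qexp(u[, 1], rate = 1),
           qpois(u[, 2], lambda = 3))

# Correlation actually obtained between the two transformed columns.
cor(x)[1, 2]
```

The correlation of the transformed columns is generally not equal to the correlation fed into the normal vector. Finding the normal correlation that yields the target correlation is the root-finding problem the RA algorithm addresses.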

Where can I learn more about the details?

Read the following paper: Huifen Chen (2001). Initialization for NORTA: Generation of Random Vectors with Specified Marginals and Correlations. INFORMS Journal on Computing 13(4):312-331.

Does the package implement the above algorithm exactly?

No. The package deviates slightly from the paper: for example, the initial sample size is set to 60 rather than 40, and the random seeds follow the third choice rather than the first of the choices introduced in the paper's Appendix. In practice, the package has worked well in many situations.

How can I use the package?

You can see examples by running ?functionname after library(NORTARA). Here is a typical workflow. Suppose you want to generate a sample of size 10000 from 4 marginals:

  1. Give your marginals (the inverse CDF functions):
   # Functions from the base packages: use their names directly or assign a new name
   qt
   qnorm
   b <- qpois
   # Functions from other packages: you must assign a new name.
   # Replace package::functionname as needed.
   f <- stats::qweibull
   invcdfnames <- c("qt","qnorm","qpois","f")
  # or invcdfnames <- c("qt","qnorm","b","f"), but never
  # invcdfnames <- c("qt","qnorm","qpois","stats::qweibull")
  2. Give the marginals' arguments:
  # This form always works. The names in each inner list must match the
  # argument names of the corresponding function above.
  paramslists <- list(
             m1 = list(df = 5 ),
             m2 = list(mean = 0, sd = 1),
             m3 = list(lambda = 3),
             m4 = list(shape = 1, scale = 1)                 
             )
  # If a marginal uses its default arguments (e.g. qnorm here), you can omit its list
  # and give its position in defaultindex instead:
 paramslists2 <- list(
             m1 = list(df = 5 ),             
             m3 = list(lambda = 3),
             m4 = list(shape = 1, scale = 1)                 
             )
 defaultindex <- c(2)
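
Steps 1 and 2 pair each inverse CDF name with a named argument list. The sketch below shows, in base R only, why a plain name works where "stats::qweibull" does not, and why the inner list names must match the function's argument names. It assumes the names are resolved by an ordinary lookup such as `exists()` or `do.call()`; the source does not confirm how the package resolves them internally.

```r
# Illustrative only: how a function name given as a string can be resolved in base R.
f <- stats::qweibull

# "f" is an ordinary object name, so it can be found.
exists("f", mode = "function")

# No object is literally named "stats::qweibull", so a lookup by string fails.
exists("stats::qweibull", mode = "function")

# A named argument list is matched against the function's formal arguments.
params <- list(shape = 1, scale = 1)
median_via_name <- do.call("f", c(list(p = 0.5), params))

# Same result as calling the function directly.
all.equal(median_via_name, stats::qweibull(0.5, shape = 1, scale = 1))
```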

3. Give the other arguments for the bounding RA algorithm:

 # If you are familiar with the bounding RA algorithm, you can set the function's
 # arguments as needed, e.g. m1 = 80, sigma0 = 0.001. Note that a smaller sigma0
 # costs more running time. If you are not familiar with the algorithm, it is best
 # to keep the default values.

4. Give the target correlation matrix:

cor_matrix <- matrix(c( 1.0, -0.4, 0.1, -0.2,
                       -0.4,  1.0, 0.8,  0.6,
                        0.1,  0.8, 1.0,  0.5,
                       -0.2,  0.6, 0.5,  1.0
                       ),4,4)
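
Before generating samples, it can help to run a basic sanity check on the target matrix. The helper below is hypothetical (it is not part of NORTARA) and uses only base R. It checks symmetry, a unit diagonal, entries within [-1, 1] and positive definiteness, which is slightly stricter than a correlation matrix requires. Passing the check does not guarantee that the target is reachable for the chosen marginals.

```r
# Hypothetical helper (not part of NORTARA): basic sanity check for a
# target correlation matrix.
is_valid_cor <- function(m, tol = 1e-8) {
  is.matrix(m) &&
    nrow(m) == ncol(m) &&
    isSymmetric(m) &&
    all(abs(diag(m) - 1) < tol) &&
    all(abs(m) <= 1 + tol) &&
    min(eigen(m, symmetric = TRUE, only.values = TRUE)$values) > tol
}

# The target matrix from step 4.
cor_matrix <- matrix(c( 1.0, -0.4, 0.1, -0.2,
                       -0.4,  1.0, 0.8,  0.6,
                        0.1,  0.8, 1.0,  0.5,
                       -0.2,  0.6, 0.5,  1.0), 4, 4)
is_valid_cor(cor_matrix)

# A symmetric matrix with unit diagonal that is not positive definite.
bad <- matrix(c( 1.0, 0.9, -0.9,
                 0.9, 1.0,  0.9,
                -0.9, 0.9,  1.0), 3, 3)
is_valid_cor(bad)
```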

5. Generate the samples:

  f <- stats::qweibull
  invcdfnames <- c("qt","qnorm","qpois","f")
  paramslists <- list(
             m1 = list(df = 5 ),
             m2 = list(mean = 0, sd = 1),
             m3 = list(lambda = 3),
             m4 = list(shape = 1, scale = 1)                 
             )
  cor_matrix <- matrix(c( 1.0, -0.4, 0.1, -0.2,
                         -0.4,  1.0, 0.8,  0.6,
                          0.1,  0.8, 1.0,  0.5,
                         -0.2,  0.6, 0.5,  1.0
                         ),4,4)
  cor_matrix
##      [,1] [,2] [,3] [,4]
## [1,]  1.0 -0.4  0.1 -0.2
## [2,] -0.4  1.0  0.8  0.6
## [3,]  0.1  0.8  1.0  0.5
## [4,] -0.2  0.6  0.5  1.0
  res <- NORTARA::genNORTARA(10000,cor_matrix,invcdfnames,paramslists)
  head(res,5)
##            [,1]        [,2] [,3]       [,4]
## [1,]  1.3725517 -1.88567582    1 0.02951737
## [2,] -1.9223006  0.26371059    2 1.71821173
## [3,]  1.0177562  0.19177707    4 0.77355479
## [4,] -0.8192868  0.02579475    3 0.88599746
## [5,] -1.8099858  1.05776347    4 3.36731987
  cor(res)
##             [,1]       [,2]       [,3]       [,4]
## [1,]  1.00000000 -0.4064268 0.09803377 -0.2127361
## [2,] -0.40642683  1.0000000 0.79889080  0.6061054
## [3,]  0.09803377  0.7988908 1.00000000  0.5025336
## [4,] -0.21273607  0.6061054 0.50253359  1.0000000
  paramslists2 <- list(
             m1 = list(df = 5 ),             
             m3 = list(lambda = 3),
             m4 = list(shape = 1, scale = 1)                 
             )
  defaultindex <- c(2)
  res2 <- NORTARA::genNORTARA(10000,cor_matrix,invcdfnames,paramslists2,defaultindex)
  head(res2,5)
##              [,1]       [,2] [,3]      [,4]
## [1,] -0.689036285 -1.5664905    0 1.0524013
## [2,] -0.561602277  1.2449831    5 3.4239488
## [3,] -1.459169972  0.4980389    2 0.6915617
## [4,]  0.001371613  0.6301018    4 0.5581841
## [5,] -0.411591075 -1.4503019    0 0.1353949
  cor(res2)
##            [,1]       [,2]      [,3]       [,4]
## [1,]  1.0000000 -0.4022837 0.1027877 -0.2128540
## [2,] -0.4022837  1.0000000 0.7977321  0.6107353
## [3,]  0.1027877  0.7977321 1.0000000  0.5046156
## [4,] -0.2128540  0.6107353 0.5046156  1.0000000
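
The sample correlations above are close to the target but not identical. One way to summarize the gap is the largest absolute entrywise difference. The helper below is hypothetical (not part of NORTARA); the `achieved` matrix is the cor(res) output printed above, rounded to four decimals.

```r
# Hypothetical helper (not part of NORTARA): largest absolute entrywise gap
# between an achieved correlation matrix and the target.
max_cor_error <- function(achieved, target) {
  stopifnot(identical(dim(achieved), dim(target)))
  max(abs(achieved - target))
}

target <- matrix(c( 1.0, -0.4, 0.1, -0.2,
                   -0.4,  1.0, 0.8,  0.6,
                    0.1,  0.8, 1.0,  0.5,
                   -0.2,  0.6, 0.5,  1.0), 4, 4)

# cor(res) as printed above, rounded to four decimals.
achieved <- matrix(c( 1.0000, -0.4064, 0.0980, -0.2127,
                     -0.4064,  1.0000, 0.7989,  0.6061,
                      0.0980,  0.7989, 1.0000,  0.5025,
                     -0.2127,  0.6061, 0.5025,  1.0000), 4, 4)

err <- max_cor_error(achieved, target)
err
```

For the output shown above, the largest deviation is about 0.013, between marginals 1 and 4. On a live sample the same check would be max_cor_error(cor(res), cor_matrix).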

What else should I be careful about?

How can I contact the author if I have problems using the package?

You can send an email to desolator@sjtu.edu.cn