The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

How to use the estimation procedures

library(PCBN)

Data simulation

DAG = create_empty_DAG(7)
DAG = bnlearn::set.arc(DAG, 'U1', 'U2')
DAG = bnlearn::set.arc(DAG, 'U1', 'U3')
DAG = bnlearn::set.arc(DAG, 'U1', 'U4')
DAG = bnlearn::set.arc(DAG, 'U2', 'U4')
DAG = bnlearn::set.arc(DAG, 'U3', 'U4')
DAG = bnlearn::set.arc(DAG, 'U4', 'U6')
DAG = bnlearn::set.arc(DAG, 'U5', 'U6')
DAG = bnlearn::set.arc(DAG, 'U4', 'U7')
DAG = bnlearn::set.arc(DAG, 'U6', 'U7')
DAG |> bnlearn::as.igraph() |>
  igraph::plot.igraph(size = 20, label.cex = 2
                      # , layout = igraph::layout_as_tree
  )

order_hash = r2r::hashmap()
order_hash[['U4']] = c("U2", "U1", "U3")
order_hash[['U6']] = c("U4", "U5")
order_hash[['U7']] = c("U4", "U6")
complete_and_check_orders(DAG, order_hash)

fam = matrix(c(0, 1, 1, 1, 0, 0, 0,
               0, 0, 0, 1, 0, 0, 0,
               0, 0, 0, 1, 0, 0, 0,
               0, 0, 0, 0, 0, 1, 1,
               0, 0, 0, 0, 0, 1, 0,
               0, 0, 0, 0, 0, 0, 1,
               0, 0, 0, 0, 0, 0, 0), 
             byrow = TRUE, ncol = 7)
tau = 0.2 * fam

my_PCBN = new_PCBN(
  DAG, order_hash,
  copula_mat = list(tau = tau, fam = fam))

N = 100
mydata = PCBN_sim(my_PCBN, N = N)

Estimation

Copula of U1 and U2

We prepare the environment for storing the results.

e = default_envir()

We start by fitting the copula of \((U_1, U_2)\).

BiCopCondFit(data = mydata, DAG = DAG, v = "U1", w = "U2",
             cond_set = c(), familyset = 1, order_hash = order_hash, e = e,
             method = "mle")
#> Estimating the copula of  U1  and  U2
#> Bivariate copula: Gaussian (par = 0.27, tau = 0.17)

This is stored in e$copula_hash. To access it, we can do:

copula_key = e$keychain[[list(margins = c("U1", "U2"), cond = character(0))]]

e$copula_hash[[copula_key]]
#> Bivariate copula: Gaussian (par = 0.27, tau = 0.17)

There is a level of indirection, because the copula_key actually stores the whole computation tree. Therefore, if the statistician decides to use a different PCBN for the same structure, some estimated parts can be reused if the computation path is the same.

Let’s see what is actually the copula_key.

print(data.tree::FromListSimple(copula_key))
#>   levelName
#> 1    U1, U2
#> 2     ¦--U1
#> 3     °--U2

Corresponding computation trees of the copulas

The computation trees of these copulas can be found as before.

e$keychain[[list(margins = c("U2", "U4"), cond = character(0))]] |>
  data.tree::FromListSimple() |>
  print()
#>   levelName
#> 1    U2, U4
#> 2     ¦--U2
#> 3     °--U4
e$keychain[[list(margins = c("U1", "U4"), cond = c("U2"))]] |>
  data.tree::FromListSimple() |>
  print()
#>        levelName
#> 1 U1, U4 | U2   
#> 2  ¦--U1 | U2   
#> 3  ¦   °--U1, U2
#> 4  ¦       ¦--U1
#> 5  ¦       °--U2
#> 6  °--U4 | U2   
#> 7      °--U2, U4
#> 8          ¦--U2
#> 9          °--U4
e$keychain[[list(margins = c("U3", "U4"), cond = c("U1", "U2"))]] |>
  data.tree::FromListSimple() |>
  print()
#>                 levelName
#> 1  U3, U4 | U1, U2       
#> 2   ¦--U3 | U1           
#> 3   ¦   °--U1, U3        
#> 4   ¦       ¦--U1        
#> 5   ¦       °--U3        
#> 6   °--U4 | U1, U2       
#> 7       °--U1, U4 | U2   
#> 8           ¦--U1 | U2   
#> 9           ¦   °--U1, U2
#> 10          ¦       ¦--U1
#> 11          ¦       °--U2
#> 12          °--U4 | U2   
#> 13              °--U2, U4
#> 14                  ¦--U2
#> 15                  °--U4

Corresponding computation trees of the margins

In the same way, we obtain the computation tree of the margin \(U_4 \, | \, U_1, U_2\) by:

e$keychain[[list(margin = c("U4"), cond = c("U1", "U2"))]] |>
  data.tree::FromListSimple() |>
  print()
#>             levelName
#> 1  U4 | U1, U2       
#> 2   °--U1, U4 | U2   
#> 3       ¦--U1 | U2   
#> 4       ¦   °--U1, U2
#> 5       ¦       ¦--U1
#> 6       ¦       °--U2
#> 7       °--U4 | U2   
#> 8           °--U2, U4
#> 9               ¦--U2
#> 10              °--U4

You can remark that this is a sub-tree of the computation tree of the copula of \((U_3, U_4) \, | \, U_1, U_2\). This is coherent, because we needed the margin \(U_4 \, | \, U_1, U_2\) to estimate this copula. Nevertheless, note that

e$keychain[[list(margin = c("U3"), cond = c("U1", "U2"))]]
#> NULL

This is because the margin \(U_3 \, | \, U_1, U_2\) is actually the same as \(U_3 \, | \, U_1\) by conditional independence. This can be seen using this function:

remove_CondInd(DAG = DAG, node = "U3", cond_set = c("U1", "U2"))
#> [1] "U1"

Conditional marginal pseudo-observations

The conditional marginal pseudo-observations can be found in e$margin_hash. For example, the conditional pseudo-observations \(\hat U_{i, \, 3|1}\), \(i=1, \dots, n\), can be obtained by:

e$margin_hash[[ e$keychain[[list(margin = c("U3"), cond = c("U1"))]] ]]  |>
  head()
#> [1] 0.7016929 0.3503202 0.2715592 0.9891115 0.1999063 0.1960950

These conditional margins are internally computed by the function ComputeCondMargin.

Fitting the rest of the DAG

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.