The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

POP-Inf

This repository hosts the R package that implements the POP-Inf method described in the paper: Assumption-lean and data-adaptive post-prediction inference.

POP-Inf provides valid and powerful inference based on ML predictions for parameters defined through estimation equations.

Installation

# install.packages("devtools")
devtools::install_github("qlu-lab/POPInf")

Useful examples

Here are examples of POP-Inf for M-estimation tasks including: mean estimation, linear regression, logistic regression, and Poisson regrssion. The main function is pop_M(), where the argument method indicates which task to do.

# Load the package
library(POPInf)

# Load the simulated data
set.seed(999)
data <- sim_data()
X_lab = data$X_lab ## Covariates in the labeled data
X_unlab = data$X_unlab ## Covariates in the unlabeled data
Y_lab = data$Y_lab ## Observed outcome in the labeled data
Yhat_lab = data$Yhat_lab ## Predicted outcome in the labeled data
Yhat_unlab = data$Yhat_unlab ## Predicted outcome in the unlabeled data

Mean estimation

# Run POP-Inf mean estimation
fit_mean <- pop_M(Y_lab = Y_lab, Yhat_lab = Yhat_lab, Yhat_unlab = Yhat_unlab,
                  alpha = 0.05, method = "mean")

print(fit_mean)

#   Estimate  Std.Error Lower.CI Upper.CI       P.value    Weight
# 1 1.623601 0.05514429  1.51552 1.731682 1.557956e-190 0.9226747

Linear regression

# Run POP-Inf linear regression
fit_ols <- pop_M(X_lab = X_lab, X_unlab = X_unlab,
           Y_lab = Y_lab, Yhat_lab = Yhat_lab, Yhat_unlab = Yhat_unlab,
           alpha = 0.05, method = "ols")

print(fit_ols)

#     Estimate  Std.Error  Lower.CI Upper.CI       P.value    Weight
#    1.6181357 0.05351775 1.5132429 1.723029 8.093485e-201 0.8811378
# X1 0.8716172 0.07335443 0.7278452 1.015389  1.463365e-32 1.0000000

Logistic regression

# Load the simulated data
set.seed(999)
data <- sim_data(binary = T)
X_lab = data$X_lab
X_unlab = data$X_unlab
Y_lab = data$Y_lab
Yhat_lab = data$Yhat_lab
Yhat_unlab = data$Yhat_unlab

# Run POP-Inf logistic regression
fit_logistic <- pop_M(X_lab = X_lab, X_unlab = X_unlab,
                      Y_lab = Y_lab, Yhat_lab = Yhat_lab, Yhat_unlab = Yhat_unlab,
                      alpha = 0.05, method = "logistic")

print(fit_logistic)

#      Estimate  Std.Error   Lower.CI   Upper.CI      P.value    Weight
#    -0.1355928 0.08443198 -0.3010764 0.02989085 1.082868e-01 0.4218688
# X1  0.5876862 0.08938035  0.4125039 0.76286842 4.861518e-11 0.5340878

Poisson regression

# Load the simulated data
set.seed(999)
data <- sim_data()
X_lab = data$X_lab
X_unlab = data$X_unlab
Y_lab = round(data$Y_lab - min(data$Y_lab))
Yhat_lab = round(data$Yhat_lab - min(data$Yhat_lab))
Yhat_unlab = round(data$Yhat_unlab - min(Yhat_unlab))

# Run POP-Inf Poisson regression
fit_poisson <- pop_M(X_lab = X_lab, X_unlab = X_unlab,
                     Y_lab = Y_lab, Yhat_lab = Yhat_lab, Yhat_unlab = Yhat_unlab,
                     alpha = 0.05, method = "poisson")

print(fit_poisson)

#     Estimate  Std.Error  Lower.CI  Upper.CI      P.value    Weight
#    1.2227700 0.01730779 1.1888473 1.2566926 0.000000e+00 0.8460699
# X1 0.2568325 0.02437762 0.2090532 0.3046118 5.921326e-26 0.9171068

Analysis script

We provide the script for analysis in the POP-Inf paper here.

Contact

Please submit an issue or contact Jiacheng (jiacheng.miao@wisc.edu) or Xinran (xinran.miao@wisc.edu) for questions.

Reference

Assumption-lean and Data-adaptive Post-Prediction Inference

Valid inference for machine learning-assisted GWAS

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.