Type: Package
Title: Privacy-Preserving Distributed Algorithms
Version: 1.3.0
Date: 2025-11-11
Description: A collection of privacy-preserving distributed algorithms (PDAs) for conducting federated statistical learning across multiple data sites. The PDA framework includes models for various tasks such as regression, trial emulation, causal inference, design-specific analysis, and clustering. The PDA algorithms run on a lead site and only require summary statistics from collaborating sites, with one or a few iterations. The package can be used together with the online data transfer system (https://pda-ota.pdamethods.org/) for safe and convenient collaboration. For more information, please visit our software websites: https://github.com/Penncil/pda and https://pdamethods.org/.
Maintainer: Chongliang Luo <luocl3009@gmail.com>
License: Apache License 2.0
Suggests: lme4
Depends: R (≥ 4.1.0)
Imports: Rcpp (≥ 0.12.19), stats, httr, rvest, jsonlite, data.table, cobalt, EmpiricalCalibration, survival, minqa, glmnet, MASS, numDeriv, metafor, Matrix, ordinal, plyr, tidyr, tibble, dplyr, geex, data.tree
LinkingTo: Rcpp, RcppArmadillo, RcppEigen
RoxygenNote: 7.3.3
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
Packaged: 2025-11-16 17:36:26 UTC; chongliang
Author: Chongliang Luo [cre], Rui Duan [aut], Mackenzie Edmondson [aut], Jiayi Tong [aut], Xiaokang Liu [aut], Kenneth Locke [aut], Jie Hu [aut], Bingyu Zhang [aut], Yicheng Shen [aut], Yudong Wang [aut], Yiwen Lu [aut], Lu Li [aut], Yong Chen [aut], Penn Computing Inference Learning (PennCIL) lab [cph]
Repository: CRAN
Date/Publication: 2025-11-17 21:50:52 UTC

ADAP derivatives

Description

ADAP derivatives

Usage

ADAP.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


ADAP surrogate estimation

Description

ADAP surrogate estimation

Usage

ADAP.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

Step 3: construct and solve the surrogate objective function at the master/lead site.

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
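
Examples

The following is a minimal sketch of what "construct and solve a surrogate objective" can look like, assuming an unpenalized logistic regression and the generic first-order surrogate log-likelihood (lead-site log-likelihood plus a gradient correction from all sites). It is not the ADAP objective itself, which may include further terms such as a penalty. The simulated data, the helper names `logL` and `logL_D1`, and the choice of `bbar` are illustrative assumptions; only base R and `stats::optim` are used.

```r
set.seed(7)
K <- 3; n_k <- 300; p <- 3
beta <- c(-0.5, 1, -1)
# Simulated data for K sites (hypothetical, for illustration only)
sites <- lapply(seq_len(K), function(k) {
  X <- cbind(1, matrix(rnorm(n_k * (p - 1)), n_k))
  list(X = X, y = rbinom(n_k, 1, plogis(drop(X %*% beta))))
})

# Average logistic log-likelihood and its gradient at one site
logL <- function(b, s) {
  eta <- drop(s$X %*% b)
  mean(s$y * eta - log1p(exp(eta)))
}
logL_D1 <- function(b, s) {
  drop(crossprod(s$X, s$y - plogis(drop(s$X %*% b)))) / nrow(s$X)
}

# Initial value: here simply the lead site's own estimate (an assumption)
lead <- sites[[1]]
bbar <- coef(glm(lead$y ~ lead$X - 1, family = binomial()))

# Gradients at bbar from all sites, combined with site-size weights
N <- sum(sapply(sites, function(s) nrow(s$X)))
grad_all <- Reduce(`+`, lapply(sites, function(s) nrow(s$X) * logL_D1(bbar, s))) / N

# Surrogate: lead-site log-likelihood plus a first-order correction term
surrogate <- function(b) {
  -(logL(b, lead) + sum((grad_all - logL_D1(bbar, lead)) * b))
}
sol <- optim(bbar, surrogate, method = "BFGS", hessian = TRUE)
btilde <- sol$par
Htilde <- sol$hessian
```

Only the gradients evaluated at `bbar` cross site boundaries in this sketch; the optimization itself uses the lead site's individual-level data.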


ADAP initialize

Description

ADAP initialize

Usage

ADAP.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


ADAP simulated data

Description

A simulated data set for ADAP demonstration

Usage

ADAP_data

Format

A list containing the following elements:

sites

site id, 300 'site1', 300 'site2', 300 'site3'

status

binary outcome of length 900

x

900 by 49 matrix generated by standard normal distribution, representing the covariates


PDA COLA estimation

Description

PDA COLA estimation

Usage

COLA.estimate(ipdata=NULL,control,config)

Arguments

ipdata

not needed (defaults to NULL)

control

PDA control

config

cloud configuration

Details

COLA estimation: (1) COLA-GLM (2) COLA-GLM-H (3) COLA-GLMM

Value

list(est, se)
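
Examples

As a hedged illustration of why a one-shot, lossless fit is possible for a GLM with categorical covariates, the sketch below shows that a logistic regression fitted to per-site counts by covariate pattern gives the same coefficients as a fit on the pooled individual-level data. This is a general property of binomial GLMs, shown with base R only; it is not the COLA implementation, and the simulated data (which only mimics the layout of COLA_covid) and the object names are assumptions.

```r
set.seed(6)
n <- 1500
dat <- data.frame(
  site = rep(c("site1", "site2", "site3"), times = c(600, 500, 400)),
  age = rbinom(n, 1, 0.5),
  sex = rbinom(n, 1, 0.5),
  medical_condition = rbinom(n, 1, 0.3)
)
dat$status <- rbinom(n, 1, plogis(-1 + 0.5 * dat$age - 0.3 * dat$sex +
                                    0.8 * dat$medical_condition))

# Each site shares only event counts and totals per covariate pattern
site_counts <- lapply(split(dat, dat$site), function(d) {
  d$total <- 1
  aggregate(cbind(status, total) ~ age + sex + medical_condition,
            data = d, FUN = sum)
})

# Lead site stacks the count tables and fits a binomial GLM on them
counts <- do.call(rbind, site_counts)
fit_agg <- glm(cbind(status, total - status) ~ age + sex + medical_condition,
               family = binomial(), data = counts)

# Reference fit on pooled individual-level data (not available in practice)
fit_ipd <- glm(status ~ age + sex + medical_condition,
               family = binomial(), data = dat)
```

Comparing `coef(fit_agg)` with `coef(fit_ipd)` shows agreement up to the convergence tolerance of `glm`.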


COLA initialize

Description

COLA initialize

Usage

COLA.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Qiong Wu, et al. (2025) COLA-GLM: Collaborative One-shot and Lossless Algorithms of Generalized Linear Models for Decentralized Observational Healthcare Data. npj Digital Medicine.
Bingyu Zhang, et al. (2025) A Lossless One-shot Distributed Algorithm for Addressing Heterogeneity in Multi-Site Generalized Linear Models. Journal of the American Medical Informatics Association (under revision).
Jiayi Tong, et al. (2025) Unlocking Efficiency in Real-world Collaborative Studies: A Multi-site International Study with Collaborative One-shot Lossless Algorithm for Generalized Linear Mixed Model. npj Digital Medicine.


COLA simulated data

Description

A simulated COVID-19 data set for Collaborative One-shot and Lossless Algorithms of generalized linear models (COLA-GLM)

Usage

COLA_covid

Format

A data frame with 1500 rows and 6 variables:

site

site, 600 'site1', 500 'site2', 400 'site3'

age

binary age

sex

binary sex

medical_condition

binary medical condition

status

binary outcome, COVID-19 status. This is the binary outcome for COLA logistic regression

visits

Poisson outcome, number of visits. This is the count outcome for COLA Poisson regression


PDA DLM estimation

Description

PDA DLM estimation

Usage

DLM.estimate(ipdata=NULL,control,config)

Arguments

ipdata

not needed (defaults to NULL)

control

PDA control

config

cloud configuration

Details

DLM estimation: (1) Linear model, (2) Linear model with fixed effects, (3) Linear model with random effects (Linear mixed model)

Value

list(bhat, sebhat, sigmahat, uhat, seuhat)
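
Examples

For case (1), the plain linear model, the lossless property can be checked directly in base R: if every site shares only X'X, X'y, y'y and its sample size, the lead site recovers exactly the pooled least-squares fit. The sketch below uses simulated data; the object names (`site_stats`, `pooled`, and so on) are illustrative and are not part of the package.

```r
set.seed(1)
n <- 300
dat <- data.frame(
  site = rep(c("site1", "site2", "site3"), each = 100),
  x1 = rnorm(n),
  x2 = rbinom(n, 1, 0.4)
)
dat$y <- 1 + 0.5 * dat$x1 - 0.8 * dat$x2 + rnorm(n)

# Each site computes its aggregated data; no patient-level rows leave the site
site_stats <- lapply(split(dat, dat$site), function(d) {
  X <- cbind(1, d$x1, d$x2)
  list(XtX = crossprod(X), XtY = crossprod(X, d$y),
       YtY = sum(d$y^2), n = nrow(d))
})

# Lead site sums the aggregates and solves the normal equations
XtX <- Reduce(`+`, lapply(site_stats, `[[`, "XtX"))
XtY <- Reduce(`+`, lapply(site_stats, `[[`, "XtY"))
YtY <- Reduce(`+`, lapply(site_stats, `[[`, "YtY"))
N <- sum(sapply(site_stats, `[[`, "n"))

bhat <- drop(solve(XtX, XtY))
# Residual sum of squares from aggregates: y'y - bhat' X'y
sigma2 <- (YtY - sum(bhat * XtY)) / (N - length(bhat))
sebhat <- sqrt(diag(solve(XtX)) * sigma2)

# Reference fit on the pooled data
pooled <- lm(y ~ x1 + x2, data = dat)
```

Cases (2) and (3), with site-specific fixed or random effects, need additional bookkeeping per site and are not covered by this sketch.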


DLM initialize

Description

DLM initialize

Usage

DLM.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation. IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


DPQL derive

Description

DPQL derive

Usage

DPQL.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

This step calculates the intermediate aggregated data (XtWX, XtWY, and YtWY) for each site. It may need to be iterated several times until the prespecified number of rounds is reached.

Value

list(SiX, SiXY, SiY, ni)
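
Examples

A minimal sketch of how one site could compute the aggregated data XtWX, XtWY and YtWY, assuming a logit link and the standard IRLS/PQL working response and weights. The mapping of these quantities to `SiX`, `SiXY` and `SiY` is an assumption based on the names in the Value section; the simulated data and the current iterate (`bhat`, `uhat`) are hypothetical.

```r
set.seed(2)
# Hypothetical data for one site: binary outcome, intercept and two covariates
ni <- 200
X <- cbind(1, rnorm(ni), rbinom(ni, 1, 0.5))
y <- rbinom(ni, 1, plogis(drop(X %*% c(-0.5, 0.8, 0.3))))

# Current iterate broadcast by the lead site (assumed values)
bhat <- c(0, 0, 0)   # fixed effects
uhat <- 0            # this site's random intercept

# Standard working quantities for a logit link
eta <- drop(X %*% bhat) + uhat
mu <- plogis(eta)
w <- mu * (1 - mu)               # working weights
ystar <- eta + (y - mu) / w      # working (pseudo) response

# Aggregated data that the site would share
SiX  <- crossprod(X, w * X)      # X' W X
SiXY <- crossprod(X, w * ystar)  # X' W Y*
SiY  <- sum(w * ystar^2)         # Y*' W Y*
derived <- list(SiX = SiX, SiXY = SiXY, SiY = SiY, ni = ni)
```

Because the weights and working response depend on the current estimates, these aggregates change from round to round, which is why the step is repeated.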

References

Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


PDA DPQL estimation

Description

PDA DPQL estimation

Usage

DPQL.estimate(ipdata=NULL,control,config)

Arguments

ipdata

not needed (defaults to NULL)

control

PDA control

config

cloud configuration

Details

DPQL estimation: (iterative) weighted DLMM using aggregated data (AD) from all sites

Value

list(risk_factor, risk_factor_heterogeneity, bhat, sebhat, uhat, seuhat, Vhat)

References

Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


DPQL initialize

Description

DPQL initialize

Usage

DPQL.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

To initialize, fit a GLM at each individual site and send the estimated effect sizes and variances to the lead site. This step can be skipped if zeros are used as the initial effect sizes to start the PQL algorithm.

Value

init
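
Examples

The initialization described in Details can be sketched in base R: each site fits its own logistic regression and shares only the coefficient estimates and their variances. Combining them at the lead site by inverse-variance weighting is an assumption made here for illustration, as are the simulated data and the element names `bhat_i` and `Vhat_i`.

```r
set.seed(3)
n <- 1500
dat <- data.frame(
  site = rep(c("site1", "site2", "site3"), times = c(600, 500, 400)),
  age = rbinom(n, 1, 0.5),
  sex = rbinom(n, 1, 0.5)
)
dat$status <- rbinom(n, 1, plogis(-1 + 0.6 * dat$age - 0.4 * dat$sex))

# Each site fits a GLM locally and shares estimates and variances only
init <- lapply(split(dat, dat$site), function(d) {
  fit <- glm(status ~ age + sex, family = binomial(), data = d)
  list(bhat_i = coef(fit), Vhat_i = diag(vcov(fit)), site_size = nrow(d))
})

# Lead site: inverse-variance weighted average as a starting value
# (assumed pooling rule; rows are coefficients, columns are sites)
b <- sapply(init, `[[`, "bhat_i")
w <- sapply(init, function(s) 1 / s$Vhat_i)
bbar <- rowSums(w * b) / rowSums(w)
```

Starting the iterations from a vector of zeros instead of `bbar` corresponds to skipping this step.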


DisC2o AIPW estimate of the ATE at each site

Description

DisC2o AIPW estimate of the ATE at each site

Usage

DisC2o.AIPWestimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Value

list(btilde=btilde, Vtilde=Vtilde)
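
Examples

For orientation, the textbook augmented inverse probability weighted (AIPW) estimator of the average treatment effect at a single site can be written in a few lines of base R. This sketch assumes locally fitted logistic propensity score and outcome models and no adjustment for covariate shift, so it is not the DisC2o estimator itself; the simulated data and variable names are hypothetical, and `btilde` and `Vtilde` are named after the Value section only for readability.

```r
set.seed(4)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))              # treatment indicator
y <- rbinom(n, 1, plogis(-0.5 + a + 0.7 * x))   # binary outcome

# Propensity score (PS) and outcome model (OM)
ps_fit <- glm(a ~ x, family = binomial())
om_fit <- glm(y ~ a + x, family = binomial())

e  <- fitted(ps_fit)
m1 <- predict(om_fit, newdata = data.frame(a = 1, x = x), type = "response")
m0 <- predict(om_fit, newdata = data.frame(a = 0, x = x), type = "response")

# AIPW influence-function values: OM contrast plus weighted residual correction
phi <- m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)

btilde <- mean(phi)      # estimated ATE
Vtilde <- var(phi) / n   # plug-in variance of the estimate
```

The estimator is doubly robust: it remains consistent if either the PS model or the OM is correctly specified.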


DisC2o_OM derivatives

Description

DisC2o_OM derivatives

Usage

DisC2o.OMderive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


DisC2o outcome model surrogate estimation

Description

DisC2o outcome model surrogate estimation

Usage

DisC2o.OMestimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


DisC2o_OM initialize

Description

DisC2o_OM initialize

Usage

DisC2o.OMinitialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


DisC2o_PS derivatives

Description

DisC2o_PS derivatives

Usage

DisC2o.PSderive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


DisC2o propensity score (PS) surrogate estimation

Description

DisC2o propensity score (PS) surrogate estimation

Usage

DisC2o.PSestimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


DisC2o PS initialize

Description

DisC2o PS initialize

Usage

DisC2o.PSinitialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Tong J, et al. (2025) DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data. Journal of Machine Learning Research, 26(3), pp.1-50.


DisC2o AIPW estimate of the ATE, synthesizing all sites

Description

DisC2o AIPW estimate of the ATE, synthesizing all sites

Usage

DisC2o.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Value

list(btilde=btilde, Vtilde=Vtilde)


LATTE estimate

Description

LATTE conditional log-likelihood reconstruction at Lead Site

Usage

LATTE.estimate(init_data, control, config)

Arguments

init_data

initialization data from LATTE.initialize

control

pda control data

config

local site configuration

Value

analysis results


LATTE initialize

Description

LATTE (Lossless Aggregation for Treatment effect estimation) initialization: Propensity Score stratification/matching at Lead site

Usage

LATTE.initialize(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init object containing prepared data and PS model
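
Examples

Propensity score stratification can be sketched as follows, assuming a logistic PS model and five strata defined by PS quintiles. The covariates, the number of strata and the simulated data are illustrative assumptions and do not reflect the defaults of LATTE.initialize.

```r
set.seed(5)
n <- 1000
d <- data.frame(age = rnorm(n, 70, 8), sex = rbinom(n, 1, 0.5))
d$treatment <- rbinom(n, 1, plogis(-0.2 + 0.03 * (d$age - 70) + 0.4 * d$sex))

# Fit the propensity score model and attach the fitted scores
ps_model <- glm(treatment ~ age + sex, family = binomial(), data = d)
d$ps <- fitted(ps_model)

# Five strata defined by propensity score quintiles
breaks <- quantile(d$ps, probs = seq(0, 1, by = 0.2))
d$stratum <- cut(d$ps, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Treatment-by-stratum counts: an example of a stratum-level summary
tab <- table(treatment = d$treatment, stratum = d$stratum)
```

Within each stratum, treated and untreated patients have similar propensity scores, so stratum-level summaries can be compared without sharing patient-level rows.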


LATTE simulated data

Description

A simulated ADRD data set for Lossless Oneshot Algorithm for Target Trial Emulation (LATTE)

Usage

LATTE_ADRD

Format

A data frame with 1224 rows and 326 variables:

ID

Unique patient identifier

Stroke...Transient.Ischemic.Attack

History of Stroke or Transient Ischemic Attack, binary indicator

Acquired.Hypothyroidism

History of Acquired Hypothyroidism, binary indicator

Fibromyalgia..Chronic.Pain.and.Fatigue

History of Fibromyalgia, Chronic Pain and Fatigue, binary indicator

RA.OA..Rheumatoid.Arthritis..Osteoarthritis.

History of RA/OA (Rheumatoid Arthritis / Osteoarthritis), binary indicator

Hypertension

History of Hypertension, binary indicator

Anxiety.Disorders

History of Anxiety Disorders, binary indicator

Chronic.Obstructive.Pulmonary.Disease.and.Bronchiectasis

History of Chronic Obstructive Pulmonary Disease (COPD) and Bronchiectasis, binary indicator

Asthma

History of Asthma, binary indicator

Traumatic.Brain.Injury.and.Nonpsychotic.Mental.Disorders.due.to.Brain.Damage

History of Traumatic Brain Injury and Nonpsychotic Mental Disorders due to Brain Damage, binary indicator

Sleep.disorders

History of Sleep disorders, binary indicator

ADHD..Conduct.Disorders..and.Hyperkinetic.Syndrome

History of ADHD, Conduct Disorders, and Hyperkinetic Syndrome, binary indicator

Cataract

History of Cataract, binary indicator

Migraine.and.Chronic.Headache

History of Migraine and Chronic Headache, binary indicator

Depressive.Disorders

History of Depressive Disorders, binary indicator

Hyperlipidemia

History of Hyperlipidemia, binary indicator

Sensory...Deafness.and.Hearing.Impairment

History of Sensory Deafness and Hearing Impairment, binary indicator

Female...Male.Breast.Cancer

History of Female/Male Breast Cancer, binary indicator

Personality.Disorders

History of Personality Disorders, binary indicator

Anemia

History of Anemia, binary indicator

Chronic.Kidney.Disease

History of Chronic Kidney Disease, binary indicator

Schizophrenia.and.Other.Psychotic.Disorders

History of Schizophrenia and Other Psychotic Disorders, binary indicator

Glaucoma

History of Glaucoma, binary indicator

Peripheral.Vascular.Disease..PVD.

History of Peripheral Vascular Disease (PVD), binary indicator

Heart.Failure

History of Heart Failure, binary indicator

Pressure.and.Chronic.Ulcers

History of Pressure and Chronic Ulcers, binary indicator

Obesity

History of Obesity, binary indicator

Diabetes

History of Diabetes, binary indicator

Mobility.Impairments

History of Mobility Impairments, binary indicator

Benign.Prostatic.Hyperplasia

History of Benign Prostatic Hyperplasia, binary indicator

Drug.Use.Disorders

History of Drug Use Disorders, binary indicator

Alcohol.Use.Disorders

History of Alcohol Use Disorders, binary indicator

Post.Traumatic.Stress.Disorder..PTSD.

History of Post Traumatic Stress Disorder (PTSD), binary indicator

Atrial.Fibrillation

History of Atrial Fibrillation, binary indicator

Tobacco.Use

History of Tobacco Use, binary indicator

Ischemic.Heart.Disease

History of Ischemic Heart Disease, binary indicator

Liver.Disease..Cirrhosis.and.Other.Liver.Conditions..except.Viral.Hepatitis.

History of Liver Disease (Cirrhosis and Other Liver Conditions except Viral Hepatitis), binary indicator

Sensory...Blindness.and.Visual.Impairment

History of Sensory Blindness and Visual Impairment, binary indicator

Bipolar.Disorder

History of Bipolar Disorder, binary indicator

Depression

History of Depression, binary indicator

Prostate.Cancer

History of Prostate Cancer, binary indicator

Acute.Myocardial.Infarction

History of Acute Myocardial Infarction, binary indicator

Hip.Pelvic.Fracture

History of Hip/Pelvic Fracture, binary indicator

Other.Developmental.Delays

History of Other Developmental Delays, binary indicator

Viral.Hepatitis..General.

History of Viral Hepatitis (General), binary indicator

Sickle.Cell.Disease

History of Sickle Cell Disease, binary indicator

Multiple.Sclerosis.and.Transverse.Myelitis

History of Multiple Sclerosis and Transverse Myelitis, binary indicator

Leukemias.and.Lymphomas

History of Leukemias and Lymphomas, binary indicator

Opioid.Use.Disorder

History of Opioid Use Disorder, binary indicator

Colorectal.Cancer

History of Colorectal Cancer, binary indicator

Epilepsy

History of Epilepsy, binary indicator

Osteoporosis

History of Osteoporosis, binary indicator

Intellectual.Disabilities.and.Related.Conditions

History of Intellectual Disabilities and Related Conditions, binary indicator

Spinal.Cord.Injury

History of Spinal Cord Injury, binary indicator

Endometrial.Cancer

History of Endometrial Cancer, binary indicator

Spina.Bifida.and.Other.Congenital.Anomalies.of.the.Nervous.System

History of Spina Bifida and Other Congenital Anomalies of the Nervous System, binary indicator

Learning.Disabilities

History of Learning Disabilities, binary indicator

Periodontitis

History of Periodontitis, binary indicator

Lung.Cancer

History of Lung Cancer, binary indicator

Cystic.Fibrosis.and.Other.Metabolic.Developmental.Disorders

History of Cystic Fibrosis and Other Metabolic Developmental Disorders, binary indicator

Cerebral.Palsy

History of Cerebral Palsy, binary indicator

Human.Immunodeficiency.Virus.and.or.Acquired.Immunodeficiency.Syndrome..HIV.AIDS.

History of Human Immunodeficiency Virus and/or Acquired Immunodeficiency Syndrome (HIV/AIDS), binary indicator

Menopause

History of Menopause, binary indicator

Muscular.Dystrophy

History of Muscular Dystrophy, binary indicator

Autism.Spectrum.Disorders

History of Autism Spectrum Disorders, binary indicator

atorvastatin

History of atorvastatin use, binary indicator

hydrochlorothiazide

History of hydrochlorothiazide use, binary indicator

amlodipine

History of amlodipine use, binary indicator

aspirin

History of aspirin use, binary indicator

metoprolol

History of metoprolol use, binary indicator

levothyroxine

History of levothyroxine use, binary indicator

metformin

History of metformin use, binary indicator

lisinopril

History of lisinopril use, binary indicator

simvastatin

History of simvastatin use, binary indicator

sodium.chloride

History of sodium chloride use, binary indicator

omeprazole

History of omeprazole use, binary indicator

albuterol

History of albuterol use, binary indicator

potassium.chloride

History of potassium chloride use, binary indicator

sertraline

History of sertraline use, binary indicator

lidocaine

History of lidocaine use, binary indicator

furosemide

History of furosemide use, binary indicator

losartan

History of losartan use, binary indicator

donepezil

History of donepezil use, binary indicator

cholecalciferol

History of cholecalciferol use, binary indicator

ondansetron

History of ondansetron use, binary indicator

oxycodone

History of oxycodone use, binary indicator

lorazepam

History of lorazepam use, binary indicator

glucose.oxidase

History of glucose oxidase use, binary indicator

prednisone

History of prednisone use, binary indicator

fluticasone

History of fluticasone use, binary indicator

gabapentin

History of gabapentin use, binary indicator

rosuvastatin

History of rosuvastatin use, binary indicator

tamsulosin

History of tamsulosin use, binary indicator

fentanyl

History of fentanyl use, binary indicator

carbidopa

History of carbidopa use, binary indicator

pantoprazole

History of pantoprazole use, binary indicator

escitalopram

History of escitalopram use, binary indicator

insulin.aspart..human

History of insulin aspart (human) use, binary indicator

clonazepam

History of clonazepam use, binary indicator

carvedilol

History of carvedilol use, binary indicator

heparin

History of heparin use, binary indicator

bupropion

History of bupropion use, binary indicator

polyethylene.glycol.3350

History of polyethylene glycol 3350 use, binary indicator

pravastatin

History of pravastatin use, binary indicator

docusate

History of docusate use, binary indicator

vitamin.B12

History of vitamin B12 use, binary indicator

clopidogrel

History of clopidogrel use, binary indicator

atenolol

History of atenolol use, binary indicator

glipizide

History of glipizide use, binary indicator

ascorbic.acid

History of ascorbic acid use, binary indicator

glucose

History of glucose use, binary indicator

insulin.glargine

History of insulin glargine use, binary indicator

zolpidem

History of zolpidem use, binary indicator

warfarin

History of warfarin use, binary indicator

azithromycin

History of azithromycin use, binary indicator

trazodone

History of trazodone use, binary indicator

esomeprazole

History of esomeprazole use, binary indicator

sennosides..USP

History of sennosides (USP) use, binary indicator

propofol

History of propofol use, binary indicator

iopamidol

History of iopamidol use, binary indicator

diltiazem

History of diltiazem use, binary indicator

ibuprofen

History of ibuprofen use, binary indicator

dexamethasone

History of dexamethasone use, binary indicator

tramadol

History of tramadol use, binary indicator

amoxicillin

History of amoxicillin use, binary indicator

midazolam

History of midazolam use, binary indicator

lansoprazole

History of lansoprazole use, binary indicator

citalopram

History of citalopram use, binary indicator

valsartan

History of valsartan use, binary indicator

ciprofloxacin

History of ciprofloxacin use, binary indicator

famotidine

History of famotidine use, binary indicator

calcium.carbonate

History of calcium carbonate use, binary indicator

finasteride

History of finasteride use, binary indicator

levofloxacin

History of levofloxacin use, binary indicator

sulfamethoxazole

History of sulfamethoxazole use, binary indicator

duloxetine

History of duloxetine use, binary indicator

alprazolam

History of alprazolam use, binary indicator

naproxen

History of naproxen use, binary indicator

levetiracetam

History of levetiracetam use, binary indicator

ranitidine

History of ranitidine use, binary indicator

triamcinolone

History of triamcinolone use, binary indicator

salmeterol

History of salmeterol use, binary indicator

folic.acid

History of folic acid use, binary indicator

nifedipine

History of nifedipine use, binary indicator

ferrous.sulfate

History of ferrous sulfate use, binary indicator

morphine

History of morphine use, binary indicator

hydralazine

History of hydralazine use, binary indicator

montelukast

History of montelukast use, binary indicator

magnesium.sulfate

History of magnesium sulfate use, binary indicator

methylphenidate

History of methylphenidate use, binary indicator

hydrocortisone

History of hydrocortisone use, binary indicator

latanoprost

History of latanoprost use, binary indicator

quetiapine

History of quetiapine use, binary indicator

metronidazole

History of metronidazole use, binary indicator

diphenhydramine

History of diphenhydramine use, binary indicator

memantine

History of memantine use, binary indicator

methylprednisolone

History of methylprednisolone use, binary indicator

doxycycline

History of doxycycline use, binary indicator

fluoxetine

History of fluoxetine use, binary indicator

paroxetine

History of paroxetine use, binary indicator

gadobenate

History of gadobenate use, binary indicator

propranolol

History of propranolol use, binary indicator

ramipril

History of ramipril use, binary indicator

ezetimibe

History of ezetimibe use, binary indicator

allopurinol

History of allopurinol use, binary indicator

enoxaparin

History of enoxaparin use, binary indicator

apixaban

Binary indicator for history of apixaban

sildenafil

Binary indicator for history of sildenafil

oxybutynin

Binary indicator for history of oxybutynin

insulin.lispro

Binary indicator for history of insulin lispro

melatonin

Binary indicator for history of melatonin

tadalafil

Binary indicator for history of tadalafil

hydromorphone

Binary indicator for history of hydromorphone

ipratropium

Binary indicator for history of ipratropium

cephalexin

Binary indicator for history of cephalexin

lovastatin

Binary indicator for history of lovastatin

mirtazapine

Binary indicator for history of mirtazapine

venlafaxine

Binary indicator for history of venlafaxine

fenofibrate

Binary indicator for history of fenofibrate

guaifenesin

Binary indicator for history of guaifenesin

estradiol

Binary indicator for history of estradiol

nitroglycerin

Binary indicator for history of nitroglycerin

pregabalin

Binary indicator for history of pregabalin

lamotrigine

Binary indicator for history of lamotrigine

enalapril

Binary indicator for history of enalapril

ergocalciferol

Binary indicator for history of ergocalciferol

ketoconazole

Binary indicator for history of ketoconazole

spironolactone

Binary indicator for history of spironolactone

cyclobenzaprine

Binary indicator for history of cyclobenzaprine

meloxicam

Binary indicator for history of meloxicam

cetirizine

Binary indicator for history of cetirizine

alendronate

Binary indicator for history of alendronate

nortriptyline

Binary indicator for history of nortriptyline

bisacodyl

Binary indicator for history of bisacodyl

sitagliptin

Binary indicator for history of sitagliptin

salmon.oil

Binary indicator for history of salmon oil

olmesartan

Binary indicator for history of olmesartan

timolol

Binary indicator for history of timolol

nitrofurantoin

Binary indicator for history of nitrofurantoin

celecoxib

Binary indicator for history of celecoxib

glimepiride

Binary indicator for history of glimepiride

iohexol

Binary indicator for history of iohexol

clonidine

Binary indicator for history of clonidine

valacyclovir

Binary indicator for history of valacyclovir

ropinirole

Binary indicator for history of ropinirole

bupivacaine

Binary indicator for history of bupivacaine

benzonatate

Binary indicator for history of benzonatate

carbamazepine

Binary indicator for history of carbamazepine

meclizine

Binary indicator for history of meclizine

azelastine

Binary indicator for history of azelastine

diclofenac

Binary indicator for history of diclofenac

clobetasol

Binary indicator for history of clobetasol

ketorolac

Binary indicator for history of ketorolac

amphetamine

Binary indicator for history of amphetamine

budesonide

Binary indicator for history of budesonide

diazepam

Binary indicator for history of diazepam

repaglinide

Binary indicator for history of repaglinide

omega.3.acid.ethyl.esters..USP.

Binary indicator for history of omega-3-acid ethyl esters (USP)

methadone

Binary indicator for history of methadone

clindamycin

Binary indicator for history of clindamycin

rivaroxaban

Binary indicator for history of rivaroxaban

vitamin.E

Binary indicator for history of vitamin E

valproate

Binary indicator for history of valproate

tiotropium

Binary indicator for history of tiotropium

temazepam

Binary indicator for history of temazepam

chlorhexidine

Binary indicator for history of chlorhexidine

hydroxychloroquine

Binary indicator for history of hydroxychloroquine

nystatin

Binary indicator for history of nystatin

olanzapine

Binary indicator for history of olanzapine

nicotine

Binary indicator for history of nicotine

mometasone

Binary indicator for history of mometasone

prednisolone

Binary indicator for history of prednisolone

estrogens..conjugated..USP.

Binary indicator for history of conjugated estrogens (USP)

mupirocin

Binary indicator for history of mupirocin

loratadine

Binary indicator for history of loratadine

fexofenadine

Binary indicator for history of fexofenadine

solifenacin

Binary indicator for history of solifenacin

irbesartan

Binary indicator for history of irbesartan

ephedrine

Binary indicator for history of ephedrine

isosorbide

Binary indicator for history of isosorbide

gadoterate.meglumine

Binary indicator for history of gadoterate meglumine

ubidecarenone

Binary indicator for history of ubidecarenone

verapamil

Binary indicator for history of verapamil

promethazine

Binary indicator for history of promethazine

doxazosin

Binary indicator for history of doxazosin

fluconazole

Binary indicator for history of fluconazole

haloperidol

Binary indicator for history of haloperidol

rivastigmine

Binary indicator for history of rivastigmine

labetalol

Binary indicator for history of labetalol

eszopiclone

Binary indicator for history of eszopiclone

insulin.detemir

Binary indicator for history of insulin detemir

aripiprazole

Binary indicator for history of aripiprazole

vancomycin

Binary indicator for history of vancomycin

thiamine

Binary indicator for history of thiamine

epinephrine

Binary indicator for history of epinephrine

amitriptyline

Binary indicator for history of amitriptyline

tolterodine

Binary indicator for history of tolterodine

cefazolin

Binary indicator for history of cefazolin

lisdexamfetamine

Binary indicator for history of lisdexamfetamine

risperidone

Binary indicator for history of risperidone

mirabegron

Binary indicator for history of mirabegron

magnesium.oxide

Binary indicator for history of magnesium oxide

niacin

Binary indicator for history of niacin

pramipexole

Binary indicator for history of pramipexole

zoledronic.acid

Binary indicator for history of zoledronic acid

sex

Binary sex

age

Age (continuous)

race

Race

days

Time from index to event (continuous)

index_date

Index date of diagnosis

treatment

Binary treatment indicator

outcome_AD_value

Outcome binary indicator: AD

outcome_AD_time

Outcome time: AD

outcome_ADRD_value

Outcome binary indicator: ADRD

outcome_ADRD_time

Outcome time: ADRD

outcome_acute_conjunctivitis_value

Outcome binary indicator: acute conjunctivitis

outcome_acute_conjunctivitis_time

Outcome time: acute conjunctivitis

outcome_acute_tonsillitis_value

Outcome binary indicator: acute tonsillitis

outcome_acute_tonsillitis_time

Outcome time: acute tonsillitis

outcome_adhesive_capsulitis_of_shoulder_value

Outcome binary indicator: adhesive capsulitis of shoulder

outcome_adhesive_capsulitis_of_shoulder_time

Outcome time: adhesive capsulitis of shoulder

outcome_allergic_rhinitis_value

Outcome binary indicator: allergic rhinitis

outcome_allergic_rhinitis_time

Outcome time: allergic rhinitis

outcome_blepharitis_value

Outcome binary indicator: blepharitis

outcome_blepharitis_time

Outcome time: blepharitis

outcome_carpal_tunnel_syndrome_value

Outcome binary indicator: carpal tunnel syndrome

outcome_carpal_tunnel_syndrome_time

Outcome time: carpal tunnel syndrome

outcome_chalazion_value

Outcome binary indicator: chalazion

outcome_chalazion_time

Outcome time: chalazion

outcome_contact_dermatitis_value

Outcome binary indicator: contact dermatitis

outcome_contact_dermatitis_time

Outcome time: contact dermatitis

outcome_dental_caries_value

Outcome binary indicator: dental caries

outcome_dental_caries_time

Outcome time: dental caries

outcome_deviated_nasal_septum_value

Outcome binary indicator: deviated nasal septum

outcome_deviated_nasal_septum_time

Outcome time: deviated nasal septum

outcome_foreign_body_in_ear_value

Outcome binary indicator: foreign body in ear

outcome_foreign_body_in_ear_time

Outcome time: foreign body in ear

outcome_gout_value

Outcome binary indicator: gout

outcome_gout_time

Outcome time: gout

outcome_hemorrhoids_value

Outcome binary indicator: hemorrhoids

outcome_hemorrhoids_time

Outcome time: hemorrhoids

outcome_impacted_cerumen_value

Outcome binary indicator: impacted cerumen

outcome_impacted_cerumen_time

Outcome time: impacted cerumen

outcome_influenza_value

Outcome binary indicator: influenza

outcome_influenza_time

Outcome time: influenza

outcome_ingrowing_nail_value

Outcome binary indicator: ingrowing nail

outcome_ingrowing_nail_time

Outcome time: ingrowing nail

outcome_low_back_pain_value

Outcome binary indicator: low back pain

outcome_low_back_pain_time

Outcome time: low back pain

outcome_menieres_disease_value

Outcome binary indicator: Meniere's disease

outcome_menieres_disease_time

Outcome time: Meniere's disease

outcome_osteoarthritis_of_knee_value

Outcome binary indicator: osteoarthritis of knee

outcome_osteoarthritis_of_knee_time

Outcome time: osteoarthritis of knee

outcome_osteoporosis_value

Outcome binary indicator: osteoporosis

outcome_osteoporosis_time

Outcome time: osteoporosis

outcome_foot_drop_value

Outcome binary indicator: foot drop

outcome_foot_drop_time

Outcome time: foot drop

outcome_hearing_problem_value

Outcome binary indicator: hearing problem

outcome_hearing_problem_time

Outcome time: hearing problem

outcome_intra_abdominal_and_pelvic_swelling_mass_and_lump_value

Outcome binary indicator: intra-abdominal and pelvic swelling, mass and lump

outcome_intra_abdominal_and_pelvic_swelling_mass_and_lump_time

Outcome time: intra-abdominal and pelvic swelling, mass and lump

outcome_irritability_and_anger_value

Outcome binary indicator: irritability and anger

outcome_irritability_and_anger_time

Outcome time: irritability and anger

outcome_wristdrop_value

Outcome binary indicator: wristdrop

outcome_wristdrop_time

Outcome time: wristdrop

site

Study site identifier


Length of Stay data

Description

A simulated data set of hospitalization Length of Stay (LOS) from 3 sites

Usage

LOS

Format

A data frame with 1000 rows and 5 variables:

site

site id, 500 'site1', 400 'site2' and 100 'site3'

age

3 categories, 'young', 'middle', and 'old'

sex

2 categories, 'M' for male and 'F' for female

lab

lab test results, continuous value ranging from 0 to 100

los

LOS in days, ranging from 1 to 28. Treated as a continuous outcome in DLM
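As a rough illustration of why a continuous outcome such as los suits a lossless distributed linear model, the sketch below simulates a data set shaped like LOS and shows that summing per-site cross-products reproduces the pooled least-squares fit. The simulated values, the object sim, and the helper site_summary are assumptions made for this example; they are not the package's data or its DLM code.

```r
# Simulate a data set shaped like LOS (all values are made up for illustration).
set.seed(1)
n <- 1000
sim <- data.frame(
  site = rep(c("site1", "site2", "site3"), times = c(500, 400, 100)),
  age  = factor(sample(c("young", "middle", "old"), n, replace = TRUE)),
  sex  = factor(sample(c("M", "F"), n, replace = TRUE)),
  lab  = runif(n, 0, 100)
)
sim$los <- 5 + 0.05 * sim$lab + 2 * (sim$age == "old") + rnorm(n)

# Hypothetical helper: each site shares only X'X and X'y, never patient rows.
site_summary <- function(d) {
  X <- model.matrix(los ~ age + sex + lab, data = d)
  list(XtX = crossprod(X), Xty = crossprod(X, d$los))
}
parts <- lapply(split(sim, sim$site), site_summary)

# The lead site adds the summaries and solves the normal equations.
XtX <- Reduce(`+`, lapply(parts, `[[`, "XtX"))
Xty <- Reduce(`+`, lapply(parts, `[[`, "Xty"))
beta_distributed <- drop(solve(XtX, Xty))

# Pooled fit for comparison (not available in a real federated analysis).
beta_pooled <- coef(lm(los ~ age + sex + lab, data = sim))
```

Because the normal equations depend on the data only through these sums, the two coefficient vectors agree up to numerical error.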


Generate pda UWZ derivatives

Description

Generate pda UWZ derivatives

Usage

ODAC.derive(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

Calculate and broadcast the 1st- and 2nd-order derivatives at the initial bbar for ODAC. This requires 2 substeps: first calculate the summary statistics (U, W, Z), then calculate the derivatives (logL_D1, logL_D2).

Value

list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)


Generate pda UWZ summary statistics before calculating derivatives

Description

Generate pda UWZ summary statistics before calculating derivatives

Usage

ODAC.deriveUWZ(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAC.estimate(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

step-4: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAC initialize

Description

ODAC initialize

Usage

ODAC.initialize(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(T_i = T_i, bhat_i = fit_i$coef, Vhat_i = summary(fit_i)$coef[,2]^2, site=control$mysite, site_size= nrow(ipdata))

References

Rui Duan, et al. "Learning from local to global: An efficient distributed algorithm for modeling time-to-event data". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocaa044
Chongliang Luo, et al. "ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data". medRxiv, 2021, https://doi.org/10.1101/2021.04.18.21255694


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODAC.synthesize(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)
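The manual does not say how the per-site surrogate estimates are combined. One common choice is inverse-variance weighting, sketched below under that assumption. The helper synthesize_ivw and its inputs are hypothetical and use only the diagonal variances; the package's own synthesis may differ.

```r
# Hypothetical helper: combine per-site estimates by inverse-variance weighting.
# btilde_list: list of coefficient vectors, one per site
# Vtilde_list: list of matching variance vectors (diagonal variances only)
synthesize_ivw <- function(btilde_list, Vtilde_list) {
  B <- do.call(cbind, btilde_list)        # p x K matrix of estimates
  W <- 1 / do.call(cbind, Vtilde_list)    # p x K matrix of weights
  Vtilde <- 1 / rowSums(W)                # variance of the combined estimate
  btilde <- rowSums(W * B) * Vtilde       # weighted average per coefficient
  list(btilde = btilde, Vtilde = Vtilde)
}

# Two sites, two coefficients (numbers are made up).
res <- synthesize_ivw(
  btilde_list = list(c(0.10, -0.50), c(0.30, -0.70)),
  Vtilde_list = list(c(0.04, 0.01), c(0.04, 0.03))
)
```

Sites with smaller variances pull the combined estimate toward their own value, which is why the second coefficient lands closer to the first site's -0.50.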


ODACAT derivatives

Description

ODACAT derivatives

Usage

ODACAT.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODACAT.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODACAT initialize

Description

ODACAT initialize

Usage

ODACAT.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACAT.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)


ODACATH derivatives

Description

ODACATH derivatives

Usage

ODACATH.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = n, S_site=S_site, eta=eta_mat[site,])


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODACATH.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate efficient score at the master/lead site

Value

list(btilde=betanew, btilde.se=beta_SE,eta_mat=eta_mat,eta_mat_theta=NULL,site=config$site_id, site_size=n_site)


ODACATH initialize

Description

ODACATH initialize

Usage

ODACATH.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACATH.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)


ODACAT simulated data with nominal outcome

Description

A simulated data set for ODACAT demonstration

Usage

ODACAT_nominal

Format

A list containing the following elements:

id.site

site id, 100 'site1', 100 'site2', 100 'site3'

outcome

nominal outcome taking values 1,2,3

X1

a continuous covariate

X2

a binary covariate

X3

a binary covariate


ODACAT simulated data with ordinal outcome

Description

A simulated data set for ODACAT demonstration

Usage

ODACAT_ordinal

Format

A data frame with 300 rows and 5 variables:

id.site

site id, 105 'site1', 105 'site2', 90 'site3'

outcome

3-category outcome, possible values are 1,2,3. Category 3 will be used as reference

X1

the first covariate, continuous

X2

the second covariate, binary

X3

the third covariate, binary


Generate pda derivatives

Description

Generate pda derivatives

Usage

ODACH_CC.derive(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

Calculate and broadcast 1st and 2nd order derivative at initial bbar

Value

list(bbar=bbar, site=control$mysite, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODACH_CC.estimate(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

step-4: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODACH_CC initialize

Description

ODACH_CC initialize

Usage

ODACH_CC.initialize(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(bhat_i = fit_i$coef, Vhat_i = summary(fit_i)$coef[,2]^2, site=control$mysite, site_size= nrow(ipdata))

References

Chongliang Luo, et al. "ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data". medRxiv, 2021, https://doi.org/10.1101/2021.04.18.21255694


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACH_CC.synthesize(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)


Generate pda ODACT derivatives

Description

Generate pda ODACT derivatives

Usage

ODACT.derive(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

Calculate and broadcast 1st and 2nd order derivative at initial bbar for ODACT

Value

list(b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


PDA ODACT surrogate estimation

Description

PDA ODACT surrogate estimation

Usage

ODACT.estimate(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

step-4: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODACT initialize

Description

ODACT initialize

Usage

ODACT.initialize(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(bhat_i, Vhat_i, site, site_size)

References

Liang CJ, Luo C, Kranzler HR, Bian J, Chen Y. Communication-efficient federated learning of temporal effects on opioid use disorder with data from distributed research networks. J Am Med Inform Assoc. 2025 Apr 1;32(4):656-664. doi: 10.1093/jamia/ocae313. PMID: 39864407; PMCID: PMC12005629.


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACT.synthesize(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)


ODAH derivatives

Description

ODAH derivatives

Usage

ODAH.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1_zero = logL_D1_zero, logL_D1_count = logL_D1_count, logL_D2_zero = logL_D2_zero, logL_D2_count = logL_D2_count)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAH.estimate(ipdata,control,config)

Arguments

ipdata

local data in a list(ipdata, X_count, X_zero)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAH initialize

Description

ODAH initialize

Usage

ODAH.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


ODAL derivatives

Description

ODAL derivatives

Usage

ODAL.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
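To make the logL_D1 and logL_D2 entries concrete, the following sketch computes the gradient and Hessian of a logistic log-likelihood at a given value bbar using base R. The helper logistic_derivatives, the simulated data, and the choice to average over observations are assumptions for illustration; the package may scale or store these quantities differently.

```r
# Hypothetical helper: averaged log-likelihood derivatives of a logistic
# model, evaluated at a common value bbar.
logistic_derivatives <- function(X, y, bbar) {
  p  <- drop(plogis(X %*% bbar))                     # fitted probabilities
  D1 <- drop(crossprod(X, y - p)) / nrow(X)          # gradient
  D2 <- -crossprod(X * (p * (1 - p)), X) / nrow(X)   # Hessian
  list(site_size = nrow(X), logL_D1 = D1, logL_D2 = D2)
}

# Simulated single-site data (values are made up).
set.seed(2)
X <- cbind(1, rnorm(200), rbinom(200, 1, 0.5))
y <- rbinom(200, 1, plogis(drop(X %*% c(-0.5, 1, 0.3))))

# At the local maximum likelihood estimate the gradient is close to zero.
fit <- glm(y ~ X - 1, family = binomial)
d <- logistic_derivatives(X, y, coef(fit))
```

Only these low-dimensional summaries, not the rows of X and y, would need to leave the site.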


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAL.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
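The surrogate log-likelihood step can be sketched as follows, assuming the first-order construction of Jordan et al. (2019): the lead site's own log-likelihood plus a linear correction built from the gradients that all sites report at bbar. Everything here (simulated data, avg_logL, avg_grad, using the lead site's local fit as bbar) is an illustrative assumption rather than the package's implementation.

```r
set.seed(3)
n <- c(site1 = 400, site2 = 300, site3 = 300)
beta_true <- c(-0.5, 1, 0.5)
sites <- lapply(n, function(k) {
  X <- cbind(1, rnorm(k), rbinom(k, 1, 0.4))
  list(X = X, y = rbinom(k, 1, plogis(drop(X %*% beta_true))))
})

# Average logistic log-likelihood and its gradient for one site.
avg_logL <- function(b, s) {
  eta <- drop(s$X %*% b)
  mean(s$y * eta - log1p(exp(eta)))
}
avg_grad <- function(b, s) {
  drop(crossprod(s$X, s$y - plogis(drop(s$X %*% b)))) / nrow(s$X)
}

# Initial value bbar: here the lead site's local estimate (an assumption).
lead <- sites$site1
bbar <- unname(coef(glm(lead$y ~ lead$X - 1, family = binomial)))

# Sample-size-weighted average of the site gradients at bbar.
grad_all <- Reduce(`+`, Map(function(s, k) k * avg_grad(bbar, s), sites, n)) / sum(n)

# Surrogate: lead log-likelihood plus a linear gradient correction.
surrogate <- function(b) {
  avg_logL(b, lead) + sum((grad_all - avg_grad(bbar, lead)) * b)
}
sol <- optim(bbar, function(b) -surrogate(b), method = "BFGS", hessian = TRUE)
btilde <- sol$par

# Pooled estimate for comparison (not available in a real federated run).
X_all <- do.call(rbind, lapply(sites, `[[`, "X"))
y_all <- unlist(lapply(sites, `[[`, "y"))
b_pooled <- unname(coef(glm(y_all ~ X_all - 1, family = binomial)))
```

The correction term makes the surrogate's gradient at bbar equal to the all-site gradient, so btilde is expected to track the pooled estimate more closely than a lead-only fit would.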


ODAL initialize

Description

ODAL initialize

Usage

ODAL.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Rui Duan, et al. "Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocz199


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODAL.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast

Value

list(btilde=btilde, Vtilde=Vtilde)


ODAP derivatives

Description

ODAP derivatives

Usage

ODAP.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAP.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame (generated in pda)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAP initialize

Description

ODAP initialize

Usage

ODAP.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


ODAPB derivatives

Description

ODAPB derivatives

Usage

ODAPB.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAPB.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame (generated in pda)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAPB initialize

Description

ODAPB initialize

Usage

ODAPB.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


COLA-GLMM

Description

Fits a generalized linear mixed model with site-level random intercepts using only one-shot per-site summaries (Ck, Sk, S2k, X0). Each iteration constructs weighted LMM summary statistics, which are then solved by lmm.fit (from DLMM), yielding updated fixed effects and random intercepts until convergence.

Usage

cola_glmm(
  summary_by_site,
  family = "poisson",
  intercept = TRUE,
  beta_init = NULL,
  u_init = NULL,
  max_iter = 50,
  tol = 1e-06,
  verbose = TRUE
)

Arguments

summary_by_site

Named list of site summaries. Each element must contain Ck, Sk, S2k, and X0 as returned by generate_CSU_site. The list names should be site IDs.

family

Character; one of "poisson" or "binomial" (canonical links).

intercept

Logical; whether the fixed-effect design includes an intercept (affects how X0 was constructed). Default TRUE.

beta_init

Optional named numeric vector of initial fixed effects. Defaults to zeros.

u_init

Optional named numeric vector of initial site random effects (one per site). Defaults to zeros.

max_iter

Integer maximum number of IRLS iterations. Default 50.

tol

Convergence tolerance on relative squared parameter change. Default 1e-6.

verbose

Logical; print iteration progress. Default TRUE.

Details

Uses canonical links: log for Poisson and logit for binomial. The fixed-effect covariates in X0 are assumed binary (plus optional Intercept). For numerically extreme logits, a small weight floor is used internally. Requires lmm.fit from dlmm.R to be on the search path.

Value

A list with elements:

Examples

# fit <- cola_glmm(summary_by_site, family = "poisson")
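The IRLS iterations mentioned above rest on working weights and a working response for the canonical links. The sketch below shows that building block for a plain GLM without random effects, including a small weight floor. The helper irls_working, the floor value, the stopping rule, and the simulated data are assumptions; cola_glmm itself operates on site summaries and calls lmm.fit, which is not reproduced here.

```r
# Hypothetical helper: working weights and response for one IRLS step
# under the canonical link (log for Poisson, logit for binomial).
irls_working <- function(eta, y, family = c("poisson", "binomial"), w_floor = 1e-8) {
  family <- match.arg(family)
  mu <- if (family == "poisson") exp(eta) else plogis(eta)
  w  <- if (family == "poisson") mu else mu * (1 - mu)
  w  <- pmax(w, w_floor)                # guard against vanishing weights
  list(w = w, z = eta + (y - mu) / w)   # working response
}

# Simulated Poisson data with binary covariates (values are made up).
set.seed(4)
X <- cbind(Intercept = 1, age = rbinom(300, 1, 0.5), sex = rbinom(300, 1, 0.5))
y <- rpois(300, exp(drop(X %*% c(0.2, 0.5, -0.3))))

# Repeat weighted least squares until the coefficients stop moving.
beta <- rep(0, ncol(X))
for (iter in seq_len(50)) {
  wk  <- irls_working(drop(X %*% beta), y, "poisson")
  XtW <- t(X * wk$w)                                   # X'W
  beta_new <- drop(solve(XtW %*% X, XtW %*% wk$z))
  converged <- sum((beta_new - beta)^2) < 1e-12
  beta <- beta_new
  if (converged) break
}

# Reference fit from stats::glm for comparison.
beta_glm <- coef(glm(y ~ X - 1, family = poisson))
```

Each pass is an ordinary weighted least-squares problem, which is what allows the mixed-model version to reuse a linear mixed-model solver.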

COVID-19 LOS and mortality data

Description

A simulated data set of hospitalization Length of Stay (LOS) and mortality from 6 sites

Usage

covid

Format

A data frame with 2100 rows and 6 variables:

site

site id, 600 'site1', 500 'site2', 400 'site3', 300 'site4', 200 'site5', 100 'site6'

age

continuous age in years, ranging from 3 to 97

sex

2 categories, '1' for male and '0' for female

lab

lab test results, continuous value ranging from 2.3 to 97.4

los

LOS in days, ranging from 1 to 29

death

mortality status, '1' for death and '0' for alive.


CrabSatellites data

Description

A data set modified from the CrabSatellites data in countreg package (see demo(ODAH)).

Usage

cs

Format

A data frame containing 173 observations on 4 variables.

site

Simulated site id, 85 'site1' and 88 'site2'.

satellites

Number of satellites. Treated as (zero-inflated) count outcome in ODAH

width

Carapace width (cm).

weight

Weight (kg).

Source

https://rdrr.io/rforge/countreg/man/CrabSatellites.html


dGEM hospital-specific effect derivation

Description

dGEM hospital-specific effect derivation

Usage

dGEM.derive(ipdata,control,config,hosdata)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

hosdata

hospital-level data

Value

hospital_effect


dGEM standardized event rate estimation

Description

dGEM standardized event rate estimation

Usage

dGEM.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3:

Value

event rate


dGEM initialize

Description

dGEM initialize

Usage

dGEM.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

NA


PDA dGEM synthesize

Description

PDA dGEM synthesize

Usage

dGEM.synthesize(control,config)

Arguments

control

pda control

config

pda cloud configuration

Details

Synthesis to get the standardized mortality rate

Value

list(final_event_rate=final_event_rate)


Pooled estimation for COLA-GLM and COLA-GLM-H

Description

Performs pooled estimation for Generalized Linear Models (GLMs) using either the standard COLA-GLM approach or the heterogeneous intercept extension (COLA-GLM-H). This function supports binary and Poisson outcomes.

Usage

estimatePool(KSiteIPD, formula, family = "binomial", 
outcome_name, heter_intercept = FALSE)

Arguments

KSiteIPD

glm data

formula

glm formula

family

glm family

outcome_name

outcome name

heter_intercept

Logical; if TRUE, includes site-specific intercepts to model heterogeneity across sites.

Value

A data frame with point estimates and standard errors for each coefficient
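A pooled benchmark of this kind can be sketched with stats::glm, adding a site factor when heterogeneous intercepts are wanted. The helper pool_fit, the column names, and the simulated data are assumptions made for this example and do not reflect the internals of estimatePool.

```r
# Simulated stacked data from three sites (column names are hypothetical).
set.seed(5)
KSiteIPD <- data.frame(
  site = factor(rep(c("site1", "site2", "site3"), each = 200)),
  x1   = rbinom(600, 1, 0.5),
  x2   = rbinom(600, 1, 0.3)
)
site_shift <- c(site1 = -0.5, site2 = 0, site3 = 0.5)
eta <- -0.2 + site_shift[as.character(KSiteIPD$site)] +
  0.8 * KSiteIPD$x1 - 0.4 * KSiteIPD$x2
KSiteIPD$y <- rbinom(600, 1, plogis(eta))

# Hypothetical helper: pooled GLM, optionally with site-specific intercepts.
pool_fit <- function(data, formula, family = "binomial", heter_intercept = FALSE) {
  if (heter_intercept) formula <- update(formula, . ~ . + site)
  fit <- glm(formula, family = family, data = data)
  est <- summary(fit)$coefficients
  data.frame(term = rownames(est),
             estimate = est[, "Estimate"],
             se = est[, "Std. Error"],
             row.names = NULL)
}

homo   <- pool_fit(KSiteIPD, y ~ x1 + x2)
hetero <- pool_fit(KSiteIPD, y ~ x1 + x2, heter_intercept = TRUE)
```

With heter_intercept = TRUE the output gains one row per non-reference site, while the covariate rows remain directly comparable between the two fits.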


One-shot site summaries for COLA-GLMM

Description

Produces the lossless, pattern-level sufficient statistics for a single site: pattern counts Ck, outcome sums Sk = sum(y), squared sums S2k = sum(y^2), and the corresponding pattern matrix X0. Works for both binomial and Poisson outcomes (for Bernoulli, S2k == Sk).

Usage

generate_CSU_site(df_site, x_names, intercept = TRUE)

Arguments

df_site

Data frame for one site. Must include outcome column named 'y' and the fixed-effect covariates in 'x_names'. If 'intercept = TRUE', the function will add an 'Intercept' column when missing.

x_names

Character vector of fixed-effect names (binary covariates; may include '"Intercept"' if 'intercept = TRUE').

intercept

Logical; include a fixed intercept in the pattern matrix.

Value

A list with elements:

Examples

# df_site$y must exist; x_names are binary
# out <- generate_CSU_site(df_site, c("Intercept","age","sex"), intercept = TRUE)
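The pattern-level summaries can be illustrated with base R's aggregate. The helper site_pattern_summary and the simulated data are assumptions for this sketch; unlike the description above, it returns only the patterns observed at the site rather than a full pattern matrix X0.

```r
# Hypothetical re-implementation for illustration; not the package's code.
site_pattern_summary <- function(df_site, x_names) {
  stats <- data.frame(Ck = 1, Sk = df_site$y, S2k = df_site$y^2)
  # One row per observed covariate pattern: count, sum of y, sum of y^2.
  aggregate(stats, by = df_site[x_names], FUN = sum)
}

# Simulated site with two binary covariates and a binary outcome.
set.seed(7)
df_site <- data.frame(age = rbinom(120, 1, 0.5), sex = rbinom(120, 1, 0.5))
df_site$y <- rbinom(120, 1, plogis(-0.3 + 0.6 * df_site$age))

out <- site_pattern_summary(df_site, c("age", "sex"))
```

With a binary outcome, y^2 equals y, so the Sk and S2k columns coincide, as noted in the description.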

gather cloud settings into a list

Description

gather cloud settings into a list

Usage

getCloudConfig(site_id,dir=NULL,uri=NULL,secret=NULL,silent_message=T)

Arguments

site_id

site identifier

dir

shared directory path if flat files

uri

web uri if web service

secret

web token if web service

silent_message

logical, if the message will be muted

Value

A list of cloud parameters: site_id, secret and uri

See Also

pda


DisC2o simulated data

Description

A simulated long-covid data set for Distributed causal inference with covariates shift (DisC2o). It contains only 5 covariates; additional noisy covariates can be added when running the demo example.

Usage

long_covid

Format

A data frame with 900 rows and 53 variables:

PASC_features

number of Post Acute Sequelae of COVID (PASC, or long covid) features

covid_vaccination

treatment of covid vaccination, 1=vaccinated

site

site id, 300 participants each for 'site1', 'site2', and 'site3'

X1

a binary covariate

X2

a binary covariate

X3

a continuous covariate

X4

a continuous covariate

X5

a continuous covariate


Lung cancer survival time data

Description

A data set modified from the lung data in survival package (see demo(ODAC)).

Usage

lung2

Format

A data frame with 228 rows and 5 variables:

site

simulated site id, 86 'site1', 83 'site2' and 59 'site3'

time

survival time in days

status

censoring status 0=censored, 1=dead

age

age in years

sex

1 for female and 0 for male

Source

https://CRAN.R-project.org/package=survival
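A data set with the same coding can be approximated from survival::lung, as in the sketch below, which then fits a pooled Cox model of the kind a distributed estimate would be compared against. The random site assignment and the object names are assumptions; the actual site labels in lung2 are not reproduced.

```r
library(survival)

# Recode survival::lung to match the coding described above.
d <- survival::lung[, c("time", "status", "age", "sex")]
d$status <- d$status - 1   # survival::lung codes 1 = censored, 2 = dead
d$sex    <- d$sex - 1      # survival::lung codes 1 = male, 2 = female

# Hypothetical site labels with the documented site sizes.
set.seed(6)
d$site <- sample(rep(c("site1", "site2", "site3"), times = c(86, 83, 59)))

# Pooled Cox proportional hazards fit on the stacked data.
fit <- coxph(Surv(time, status) ~ age + sex, data = d)
```

If baseline hazards were thought to differ by site, adding strata(site) to the formula would be the usual pooled counterpart.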


Construct binary covariate pattern matrix

Description

Builds the full design grid of binary fixed-effect patterns used by COLA to aggregate counts and outcome sums. When 'intercept = TRUE', an 'Intercept' column of ones is included and all other variables are expanded over {0,1}.

Usage

make_patterns(x_names, intercept = TRUE)

Arguments

x_names

Character vector of fixed-effect names. If 'intercept = TRUE', it may include '"Intercept"'; otherwise it must not.

intercept

Logical; include a fixed intercept column. Default: 'TRUE'.

Value

A tibble of all binary patterns over 'x_names' (with/without 'Intercept'), one row per pattern.
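The full grid of binary patterns can be sketched with base R's expand.grid. The helper pattern_grid is hypothetical and returns a plain data frame rather than a tibble; it is meant only to show the shape of the result.

```r
# Hypothetical helper: all 0/1 combinations of the named covariates,
# with an optional leading Intercept column of ones.
pattern_grid <- function(x_names, intercept = TRUE) {
  vars <- setdiff(x_names, "Intercept")
  grid <- expand.grid(rep(list(0:1), length(vars)), KEEP.OUT.ATTRS = FALSE)
  names(grid) <- vars
  if (intercept) grid <- cbind(Intercept = 1, grid)
  grid
}

pats <- pattern_grid(c("age", "sex", "smoker"))
```

With p binary covariates the grid has 2^p rows, so the number of patterns, and hence the size of the shared summaries, grows quickly with p.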


A flexible version of MASS::glmmPQL

Description

A flexible version of MASS::glmmPQL

Usage

myglmmPQL(formula.glm, formula, offset=NULL, family, data, fixef.init = NULL, 
                 weights=NULL, REML=T, niter=10, verbose=T)

Arguments

formula.glm

formula used to fit glm for initial fixed effects

formula

formula used to fit iterative lmer in PQL algorithm

offset

glm offset term

family

glm family

data

glm data

fixef.init

initial fixed effects estimates, set to zeros if NULL

weights

glm weights

REML

lmer logical scalar - Should the estimates be chosen to optimize the REML criterion (as opposed to the log-likelihood)?

niter

glmmPQL maximum number of iterations.

verbose

glmmPQL logical: print out record of iterations?

Details

Use lme4::lmer instead of nlme::varFixed in PQL iteration to allow REML

Value

An object with the same format as lmer.


ODACATH simulated data with nominal outcome

Description

A simulated data set for ODACATH demonstration

Usage

nominal_data_hetero

Format

A list containing the following elements:

id.site

site id, 100 'site1', 100 'site2', 100 'site3'

y

nominal outcome taking values 1,2,3

X1_cont

a continuous covariate

X2_bin

a binary covariate


ODACH_CC simulated data

Description

A simulated data set for ODACH with case-cohort design demonstration

Usage

odach_cc

Format

A data frame with 413 rows and 8 variables:

site

site id, 187 'site1', 133 'site2', 93 'site3'. The full_cohort_size are 800, 600 and 400 respectively

subcohort

1=subcohort, e.g. uniformly subsampled from full_cohort_size, 0=case

time

survival time

status

censoring status 0=censored, 1=dead

X1

the first covariate, continuous

X2

the second covariate, continuous

Category

the third covariate, categorical

Group

the fourth covariate, categorical


ODACATH simulated data with ordinal outcome

Description

A simulated data set for ODACATH demonstration

Usage

ordinal_data

Format

A list containing the following elements:

id.site

site id, 100 'site1', 100 'site2', 100 'site3'

y

ordinal outcome taking values 1,2,3

X1_cont

a continuous covariate

X2_bin

a binary covariate


PDA: Privacy-preserving Distributed Algorithm

Description

Fit Privacy-preserving Distributed Algorithms for linear, logistic, Poisson and Cox PH regression with possibly heterogeneous data across sites.

Usage

pda(ipdata=NULL,site_id,control=NULL,dir=NULL,uri=NULL,secret=NULL,
upload_without_confirm=F, silent_message=F, digits=4,hosdata=NULL)

Arguments

ipdata

Local IPD data in data frame, should include at least one column for the outcome and one column for the covariates

site_id

Character site name

control

pda control data

dir

directory for shared flat file cloud

uri

Uniform Resource Identifier for this run

secret

password to authenticate as site_id on uri

upload_without_confirm

logical. TRUE to upload silently, without interactive confirmation

silent_message

logical. TRUE to mute messages

digits

digits after decimal points in the output json files

hosdata

(for dGEM) hospital-level data, should include the same name as defined in the control file

Value

control

control

References

Michael I. Jordan, Jason D. Lee & Yun Yang (2019) Communication-Efficient Distributed Statistical Inference,
Journal of the American Statistical Association, 114:526, 668-681
doi:10.1080/01621459.2018.1429274.
(DLM) Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation. IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.
(DPQL) Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling.
medRxiv, doi:10.1101/2021.05.03.21256561.
(ODAL) Rui Duan, et al. (2020) Learning from electronic health records across multiple sites:
A communication-efficient and privacy-preserving distributed algorithm.
Journal of the American Medical Informatics Association, 27.3:376–385,
doi:10.1093/jamia/ocz199.
(ODAC) Rui Duan, et al. (2020) Learning from local to global: An efficient distributed algorithm for modeling time-to-event data.
Journal of the American Medical Informatics Association, 27.7:1028–1036,
doi:10.1093/jamia/ocaa044.
(ODACH) Chongliang Luo, et al. (2021) ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data.
medRxiv, doi:10.1101/2021.04.18.21255694.
(ODAH) Mackenzie J. Edmondson, et al. (2021) An Efficient and Accurate Distributed Learning Algorithm for Modeling Multi-Site Zero-Inflated Count Outcomes. medRxiv, pp.2020-12.
doi:10.1101/2020.12.17.20248194.
(ADAP) Xiaokang Liu, et al. (2021) ADAP: multisite learning with high-dimensional heterogeneous data via A Distributed Algorithm for Penalized regression.
(dGEM) Jiayi Tong, et al. (2022) dGEM: Decentralized Generalized Linear Mixed Effects Model
(COLA) Wu, Q., Reps, J.M., Li, L. et al. COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data. npj Digit. Med. 8, 442 (2025). https://doi.org/10.1038/s41746-025-01781-1.
(ODACT) Liang CJ, Luo C, Kranzler HR, Bian J, Chen Y. Communication-efficient federated learning of temporal effects on opioid use disorder with data from distributed research networks. J Am Med Inform Assoc. 2025 Apr 1;32(4):656-664. doi: 10.1093/jamia/ocae313. PMID: 39864407; PMCID: PMC12005629.
(DisC2o) Tong J, et al. 2025. DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data. Journal of Machine Learning Research. 2025;26(3):1-50.

See Also

pdaPut, pdaList, pdaGet, getCloudConfig and pdaSync.


Guide end-users step by step to identify the best pda models for their tasks and set up the control.

Description

Use this function to guide end-users step by step to identify the best pda models for their tasks and set up the control.

Usage

pdaCatalog(task=c('Regression', 'Survival', 'Trial_emulation', 
'Causal_inference', 'Design_analysis', 'Clustering'), 
write_json_file_path=getwd(), optim_maxit,optim_method,init_method)

Arguments

task

user-specified task, c('Regression', 'Survival', 'Trial_emulation', 'Causal_inference', 'Design_analysis', 'Clustering'). If not specified, all models are displayed

write_json_file_path

directory path to write the control file to

optim_maxit

option in the control file for the optimization in pda, default 100

optim_method

option in the control file for the optimization in pda, default "BFGS"

init_method

option in the control file for calculating the initial estimate in pda, default "meta"

Value

pda control

See Also

pda


Function to download json and return as object

Description

Function to download json and return as object

Usage

pdaGet(name,config)

Arguments

name

name of the file

config

cloud configuration

Value

A list of data objects from the json file on the cloud

See Also

pda


Function to list available objects

Description

Function to list available objects

Usage

pdaList(config)

Arguments

config

a list of variables for cloud configuration

Value

A list of (json) files on the cloud

See Also

pda


Function to upload object to cloud as json

Description

Function to upload object to cloud as json

Usage

pdaPut(obj,name,config,upload_without_confirm=F,silent_message=F,digits=4)

Arguments

obj

R object to encode as json and upload to the cloud

name

name of the file

config

a list of variables for cloud configuration

upload_without_confirm

logical. TRUE to upload silently, without interactive confirmation

silent_message

logical. TRUE to mute messages

digits

digits after decimal points in the output json files

Value

NONE

See Also

pda


pda control synchronize

Description

update pda control if ready (run by lead)

Usage

pdaSync(config,upload_without_confirm,silent_message, digits)

Arguments

config

cloud configuration

upload_without_confirm

logical. TRUE to upload silently, without interactive confirmation

silent_message

logical. TRUE to mute messages

digits

digits after decimal points in the output json files

Value

control

See Also

pda


Function to perform all data processing and pooled stratified analysis

Description

Function to perform all data processing and pooled stratified analysis

Usage

run_pooled_analysis(data, outcome_id, outcome_time, sites)

Arguments

data

The combined LATTE_ADRD data.

outcome_id

The name of the primary outcome column.

outcome_time

The name of the outcome time column (for Poisson).

sites

A vector of site identifiers.

Value

A list containing the results of the standard logistic and Poisson pooled analysis.
