| Type: | Package |
| Title: | Privacy-Preserving Distributed Algorithms |
| Version: | 1.3.0 |
| Date: | 2025-11-11 |
| Description: | A collection of privacy-preserving distributed algorithms (PDAs) for conducting federated statistical learning across multiple data sites. The PDA framework includes models for various tasks such as regression, trial emulation, causal inference, design-specific analysis, and clustering. The PDA algorithms run on a lead site and require only summary statistics from collaborating sites, with one or a few iterations. The package can be used together with the online data transfer system (https://pda-ota.pdamethods.org/) for safe and convenient collaboration. For more information, please visit our software websites: https://github.com/Penncil/pda and https://pdamethods.org/. |
| Maintainer: | Chongliang Luo <luocl3009@gmail.com> |
| License: | Apache License 2.0 |
| Suggests: | lme4 |
| Depends: | R (≥ 4.1.0) |
| Imports: | Rcpp (≥ 0.12.19), stats, httr, rvest, jsonlite, data.table, cobalt, EmpiricalCalibration, survival, minqa, glmnet, MASS, numDeriv, metafor, Matrix, ordinal, plyr, tidyr, tibble, dplyr, geex, data.tree |
| LinkingTo: | Rcpp, RcppArmadillo, RcppEigen |
| RoxygenNote: | 7.3.3 |
| Encoding: | UTF-8 |
| LazyData: | true |
| NeedsCompilation: | yes |
| Packaged: | 2025-11-16 17:36:26 UTC; chongliang |
| Author: | Chongliang Luo [cre], Rui Duan [aut], Mackenzie Edmondson [aut], Jiayi Tong [aut], Xiaokang Liu [aut], Kenneth Locke [aut], Jie Hu [aut], Bingyu Zhang [aut], Yicheng Shen [aut], Yudong Wang [aut], Yiwen Lu [aut], Lu Li [aut], Yong Chen [aut], Penn Computing Inference Learning (PennCIL) lab [cph] |
| Repository: | CRAN |
| Date/Publication: | 2025-11-17 21:50:52 UTC |
ADAP derivatives
Description
ADAP derivatives
Usage
ADAP.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
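As an illustration of the kind of quantities returned here, the sketch below computes the first and second derivatives of an average logistic log-likelihood at a common starting value. The logistic model, the averaging by site size, and the helper name site_derivatives are assumptions made for this example; they are not taken from the pda source.

```r
# Hypothetical helper (not part of the pda API): gradient and Hessian of the
# average logistic log-likelihood at a shared starting value bbar.
site_derivatives <- function(X, y, bbar) {
  p_hat <- as.vector(plogis(X %*% bbar))                          # fitted probabilities
  logL_D1 <- crossprod(X, y - p_hat) / nrow(X)                    # gradient, p x 1
  logL_D2 <- -crossprod(X, X * (p_hat * (1 - p_hat))) / nrow(X)   # Hessian, p x p
  list(site_size = nrow(X), logL_D1 = logL_D1, logL_D2 = logL_D2)
}

# Simulated stand-in for one site's data (intercept plus two covariates).
set.seed(1)
X <- cbind(1, matrix(rnorm(200 * 2), 200, 2))
y <- rbinom(200, 1, plogis(as.vector(X %*% c(-0.5, 1, 0.5))))

out <- site_derivatives(X, y, bbar = rep(0, 3))
str(out)
```

Only the p x 1 gradient and the p x p Hessian leave the site in this sketch; the individual rows of X and y do not.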
ADAP surrogate estimation
Description
ADAP surrogate estimation
Usage
ADAP.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Details
Step 3: construct and solve the surrogate objective function at the lead (master) site.
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
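The surrogate step can be sketched as follows, assuming an unpenalized logistic model and two sites. The lead site keeps its own log-likelihood and adds a linear correction built from the gradients shared by the sites. All names (avg_logL, avg_grad, make_site) are hypothetical, and any penalty that ADAP may apply is omitted, so this is an outline of the general surrogate-likelihood idea rather than the package's implementation.

```r
# Average logistic log-likelihood and its gradient (hypothetical helpers).
avg_logL <- function(b, X, y) {
  eta <- as.vector(X %*% b)
  mean(y * eta - log1p(exp(eta)))
}
avg_grad <- function(b, X, y) {
  as.vector(crossprod(X, y - plogis(as.vector(X %*% b)))) / nrow(X)
}

# Simulated stand-ins for the lead site and one collaborating site.
set.seed(2)
make_site <- function(n) {
  X <- cbind(1, rnorm(n), rnorm(n))
  list(X = X, y = rbinom(n, 1, plogis(as.vector(X %*% c(-0.3, 0.8, -0.5)))))
}
lead <- make_site(300)
other <- make_site(300)

# Initial value: here simply the lead-site estimate (an assumption).
bbar <- unname(coef(glm(lead$y ~ lead$X - 1, family = binomial())))

# Sample-size weighted average of the site gradients at bbar.
g_lead <- avg_grad(bbar, lead$X, lead$y)
g_all <- (300 * g_lead + 300 * avg_grad(bbar, other$X, other$y)) / 600

# Negative surrogate: lead-site log-likelihood plus a linear gradient correction.
neg_surrogate <- function(b) {
  -(avg_logL(b, lead$X, lead$y) + sum((g_all - g_lead) * b))
}
sol <- optim(bbar, neg_surrogate, method = "BFGS", hessian = TRUE)
btilde <- sol$par
Htilde <- sol$hessian
print(btilde)
```

The collaborating site contributes only its gradient at bbar, which is why a single round of communication is enough in this sketch.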
ADAP initialize
Description
ADAP initialize
Usage
ADAP.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
ADAP simulated data
Description
A simulated data set for ADAP demonstration
Usage
ADAP_data
Format
A list containing the following elements:
- sites
site id, 300 'site1', 300 'site2', 300 'site3'
- status
binary outcome of length 900
- x
900 by 49 matrix of covariates, generated from a standard normal distribution
PDA COLA estimation
Description
PDA COLA estimation
Usage
COLA.estimate(ipdata=NULL,control,config)
Arguments
ipdata |
not needed |
control |
PDA control |
config |
cloud configuration |
Details
COLA estimation: (1) COLA-GLM (2) COLA-GLM-H (3) COLA-GLMM
Value
list(est, se)
COLA initialize
Description
COLA initialize
Usage
COLA.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
Qiong Wu, et al. (2025) COLA-GLM: Collaborative One-shot and Lossless Algorithms of Generalized Linear Models for Decentralized Observational Healthcare Data.
npj Digital Medicine.
Bingyu Zhang, et al. (2025) A Lossless One-shot Distributed Algorithm for Addressing Heterogeneity in Multi-Site Generalized Linear Models.
Journal of the American Medical Informatics Association (under revision).
Jiayi Tong, et al. (2025) Unlocking Efficiency in Real-world Collaborative Studies: A Multi-site International Study with Collaborative One-shot Lossless Algorithm for Generalized Linear Mixed Model.
npj Digital Medicine.
COLA simulated data
Description
A simulated COVID-19 data set for Collaborative One-shot and Lossless Algorithms of generalized linear models (COLA-GLM)
Usage
COLA_covid
Format
A data frame with 1500 rows and 6 variables:
- site
site, 600 'site1', 500 'site2', 400 'site3'
- age
binary age
- sex
binary sex
- medical_condition
binary medical condition
- status
binary outcome, COVID-19 status. This is the binary outcome for COLA logistic regression
- visits
Poisson outcome, number of visits. This is the count outcome for COLA Poisson regression
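Because every covariate in a data set of this shape is binary, a logistic regression can be fitted from counts per covariate pattern without losing information. The sketch below demonstrates that general fact on simulated stand-in data with the same column names; it does not load COLA_covid, and it is not claimed to be how COLA-GLM is implemented in the package.

```r
# Simulated stand-in with three binary covariates and a binary outcome.
set.seed(3)
n <- 1500
dat <- data.frame(
  age = rbinom(n, 1, 0.4),
  sex = rbinom(n, 1, 0.5),
  medical_condition = rbinom(n, 1, 0.3)
)
dat$status <- rbinom(
  n, 1,
  plogis(-1 + 0.6 * dat$age + 0.3 * dat$sex + 0.9 * dat$medical_condition)
)

# Individual-level fit, as a pooled analysis would do.
fit_ipd <- glm(status ~ age + sex + medical_condition,
               family = binomial(), data = dat)

# Aggregate to one row per covariate pattern: number of events and totals.
dat$total <- 1
agg <- aggregate(cbind(status, total) ~ age + sex + medical_condition,
                 data = dat, FUN = sum)

# Fit the same model from the aggregated counts only.
fit_agg <- glm(cbind(status, total - status) ~ age + sex + medical_condition,
               family = binomial(), data = agg)

print(cbind(ipd = coef(fit_ipd), aggregated = coef(fit_agg)))
```

The two columns agree up to the convergence tolerance of glm, while the aggregated table has at most 8 rows regardless of the number of patients.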
PDA DLM estimation
Description
PDA DLM estimation
Usage
DLM.estimate(ipdata=NULL,control,config)
Arguments
ipdata |
not needed |
control |
PDA control |
config |
cloud configuration |
Details
DLM estimation: (1) Linear model, (2) Linear model with fixed effects, (3) Linear model with random effects (Linear mixed model)
Value
list(bhat, sebhat, sigmahat, uhat, seuhat)
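For the plain linear model, case (1), the lossless property can be checked directly: the sums of X'X, X'y and y'y over sites determine the pooled least-squares fit. The sketch below uses simulated data and hypothetical names (make_site, ad); it covers neither the fixed-effects nor the random-effects variants, and it is not the package's code.

```r
# Simulated stand-ins for three sites of different sizes.
set.seed(4)
make_site <- function(n, shift) {
  X <- cbind(1, rnorm(n, mean = shift), rbinom(n, 1, 0.5))
  list(X = X, y = as.vector(X %*% c(2, 1.5, -1)) + rnorm(n))
}
sites <- list(make_site(500, 0), make_site(400, 0.5), make_site(100, -0.5))

# Each site shares only a p x p matrix, a p x 1 vector and two scalars.
ad <- lapply(sites, function(s) list(
  XtX = crossprod(s$X), XtY = crossprod(s$X, s$y),
  YtY = sum(s$y^2), n = length(s$y)
))

# Lead site: sum the summaries and solve the normal equations.
XtX <- Reduce(`+`, lapply(ad, `[[`, "XtX"))
XtY <- Reduce(`+`, lapply(ad, `[[`, "XtY"))
YtY <- sum(sapply(ad, `[[`, "YtY"))
N <- sum(sapply(ad, `[[`, "n"))

bhat <- solve(XtX, XtY)
rss <- YtY - 2 * sum(bhat * XtY) + as.numeric(t(bhat) %*% XtX %*% bhat)
sigmahat <- sqrt(rss / (N - length(bhat)))
sebhat <- sigmahat * sqrt(diag(solve(XtX)))

# Pooled individual-level fit, for comparison only.
X_all <- do.call(rbind, lapply(sites, `[[`, "X"))
y_all <- unlist(lapply(sites, `[[`, "y"))
fit <- lm(y_all ~ X_all - 1)
print(cbind(distributed = as.vector(bhat), pooled = unname(coef(fit))))
```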
DLM initialize
Description
DLM initialize
Usage
DLM.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation.
IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data.
medRxiv, doi:10.1101/2020.11.16.20230730.
DPQL derive
Description
DPQL derive
Usage
DPQL.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Details
This step calculates the intermediate aggregated data (XtWX, XtWY, and YtWY) for each site. It may need to be repeated until the prespecified number of rounds is reached.
Value
list(SiX, SiXY, SiY, ni)
References
Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model
with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data.
medRxiv, doi:10.1101/2020.11.16.20230730.
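A minimal sketch of the weighted aggregation idea, assuming a logistic model with fixed effects only (the random effects that PQL handles are left out). Each round, every site computes X'WX, X'Wz and z'Wz from the current coefficients, and the lead site solves a weighted least-squares problem from the sums. The helper weighted_ad is hypothetical, and the mapping of the names SiX, SiXY and SiY to these three quantities is an assumption.

```r
# Hypothetical helper: weighted aggregated data for one site at coefficients b.
weighted_ad <- function(X, y, b) {
  eta <- as.vector(X %*% b)
  mu <- plogis(eta)
  w <- mu * (1 - mu)                  # working weights
  z <- eta + (y - mu) / w             # working response
  list(SiX = crossprod(X, X * w),     # X' W X
       SiXY = crossprod(X, w * z),    # X' W z
       SiY = sum(w * z^2),            # z' W z
       ni = nrow(X))
}

# Simulated stand-ins for three sites.
set.seed(5)
make_site <- function(n) {
  X <- cbind(1, rnorm(n))
  list(X = X, y = rbinom(n, 1, plogis(as.vector(X %*% c(-0.5, 1)))))
}
sites <- list(make_site(400), make_site(300), make_site(200))

# Iterate: sites send summaries, the lead site updates the coefficients.
b <- c(0, 0)
for (iter in 1:25) {
  ad <- lapply(sites, function(s) weighted_ad(s$X, s$y, b))
  SX <- Reduce(`+`, lapply(ad, `[[`, "SiX"))
  SXY <- Reduce(`+`, lapply(ad, `[[`, "SiXY"))
  b <- as.vector(solve(SX, SXY))
}
print(b)
```

With fixed effects only, these rounds reproduce iteratively reweighted least squares, so b converges to the pooled logistic regression estimate.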
PDA DPQL estimation
Description
PDA DPQL estimation
Usage
DPQL.estimate(ipdata=NULL,control,config)
Arguments
ipdata |
not needed |
control |
PDA control |
config |
cloud configuration |
Details
DPQL estimation: (iterative) weighted DLMM using aggregated data (AD) from all sites
Value
list(risk_factor, risk_factor_heterogeneity, bhat, sebhat, uhat, seuhat, Vhat)
References
Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model
with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data.
medRxiv, doi:10.1101/2020.11.16.20230730.
DPQL initialize
Description
DPQL initialize
Usage
DPQL.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Details
To initialize, fit a GLM at each site and send the estimated effect sizes and variances to the lead site. This step is optional if zeros are used as the initial effect sizes to start the PQL algorithm.
Value
init
DisC2o AIPW estimate of the ATE at each site
Description
DisC2o AIPW estimate of the ATE at each site
Usage
DisC2o.AIPWestimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
pda cloud configuration |
Value
list(btilde=btilde, Vtilde=Vtilde)
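For orientation, a textbook augmented inverse probability weighting (AIPW) estimate of the average treatment effect at a single site can be sketched as below. The low-dimensional models, the continuous outcome, and all variable names are assumptions for illustration; the package's estimator uses propensity score and outcome models obtained through the DisC2o steps and may differ in detail.

```r
# Simulated single-site data with a true ATE of 2.
set.seed(6)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))      # treatment indicator
y <- 1 + 2 * a + x + rnorm(n)           # continuous outcome

# Propensity score model and outcome model.
ps_fit <- glm(a ~ x, family = binomial())
e_hat <- fitted(ps_fit)
om_fit <- lm(y ~ a + x)
m1 <- predict(om_fit, newdata = data.frame(a = 1, x = x))
m0 <- predict(om_fit, newdata = data.frame(a = 0, x = x))

# AIPW score for each subject, then its mean and the variance of the mean.
phi <- m1 - m0 + a * (y - m1) / e_hat - (1 - a) * (y - m0) / (1 - e_hat)
btilde <- mean(phi)
Vtilde <- var(phi) / n
print(c(btilde = btilde, se = sqrt(Vtilde)))
```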
DisC2o_OM derivatives
Description
DisC2o_OM derivatives
Usage
DisC2o.OMderive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
DisC2o outcome model surrogate estimation
Description
DisC2o outcome model surrogate estimation
Usage
DisC2o.OMestimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
DisC2o_OM initialize
Description
DisC2o_OM initialize
Usage
DisC2o.OMinitialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
DisC2o_PS derivatives
Description
DisC2o_PS derivatives
Usage
DisC2o.PSderive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
DisC2o propensity score (PS) surrogate estimation
Description
DisC2o propensity score (PS) surrogate estimation
Usage
DisC2o.PSestimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
DisC2o PS initialize
Description
DisC2o PS initialize
Usage
DisC2o.PSinitialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
Tong J, et al. (2025) DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data. Journal of Machine Learning Research, 26(3), pp.1-50.
DisC2o AIPW estimate of the ATE, synthesizing all sites
Description
DisC2o AIPW estimate of the ATE, synthesizing all sites
Usage
DisC2o.synthesize(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
pda cloud configuration |
Value
list(btilde=btilde, Vtilde=Vtilde)
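One common way to combine site-specific estimates and variances into a single estimate is inverse-variance weighting, sketched below. Whether DisC2o.synthesize uses this rule is an assumption, since the page does not state the weighting scheme, and the numbers are made up purely for illustration.

```r
# Illustrative (made-up) site-level AIPW estimates and variances.
site_est <- list(
  site1 = list(btilde = 1.9, Vtilde = 0.01),
  site2 = list(btilde = 2.1, Vtilde = 0.02),
  site3 = list(btilde = 2.4, Vtilde = 0.04)
)
b <- sapply(site_est, `[[`, "btilde")
v <- sapply(site_est, `[[`, "Vtilde")

# Inverse-variance weights: more precise sites count more.
w <- (1 / v) / sum(1 / v)
btilde <- sum(w * b)
Vtilde <- 1 / sum(1 / v)
print(list(weights = w, btilde = btilde, Vtilde = Vtilde))
```

Under this rule the combined variance is never larger than the smallest site variance, assuming the site estimates are independent.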
LATTE estimate
Description
LATTE conditional log-likelihood reconstruction at Lead Site
Usage
LATTE.estimate(init_data, control, config)
Arguments
init_data |
initialization data from LATTE.initialize |
control |
pda control data |
config |
local site configuration |
Value
analysis results
LATTE initialize
Description
LATTE (Lossless Aggregation for Treatment effect estimation) initialization: Propensity Score stratification/matching at Lead site
Usage
LATTE.initialize(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init object containing prepared data and PS model
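Propensity score stratification in general can be sketched as follows: fit a logistic model for treatment, then cut the fitted scores into strata. The covariates, the use of quintiles, and the variable names are assumptions for this example; the actual PS model and stratification settings are determined by the control object and are not shown on this page.

```r
# Simulated stand-in data with a binary treatment.
set.seed(8)
n <- 1200
dat <- data.frame(age = rnorm(n, 70, 8), sex = rbinom(n, 1, 0.5))
dat$treatment <- rbinom(
  n, 1, plogis(-0.5 + 0.03 * (dat$age - 70) + 0.4 * dat$sex)
)

# Propensity score model.
ps_model <- glm(treatment ~ age + sex, family = binomial(), data = dat)
dat$ps <- fitted(ps_model)

# Five strata by propensity score quintile.
breaks <- quantile(dat$ps, probs = seq(0, 1, by = 0.2))
dat$stratum <- cut(dat$ps, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Treated and untreated counts within each stratum.
print(table(stratum = dat$stratum, treatment = dat$treatment))
```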
LATTE simulated data
Description
A simulated ADRD data set for Lossless Oneshot Algorithm for Target Trial Emulation (LATTE)
Usage
LATTE_ADRD
Format
A data frame with 1224 rows and 326 variables:
- ID
Unique patient identifier
- Stroke...Transient.Ischemic.Attack
History of Stroke or Transient Ischemic Attack, binary indicator
- Acquired.Hypothyroidism
History of Acquired Hypothyroidism, binary indicator
- Fibromyalgia..Chronic.Pain.and.Fatigue
History of Fibromyalgia, Chronic Pain and Fatigue, binary indicator
- RA.OA..Rheumatoid.Arthritis..Osteoarthritis.
History of RA/OA (Rheumatoid Arthritis / Osteoarthritis), binary indicator
- Hypertension
History of Hypertension, binary indicator
- Anxiety.Disorders
History of Anxiety Disorders, binary indicator
- Chronic.Obstructive.Pulmonary.Disease.and.Bronchiectasis
History of Chronic Obstructive Pulmonary Disease (COPD) and Bronchiectasis, binary indicator
- Asthma
History of Asthma, binary indicator
- Traumatic.Brain.Injury.and.Nonpsychotic.Mental.Disorders.due.to.Brain.Damage
History of Traumatic Brain Injury and Nonpsychotic Mental Disorders due to Brain Damage, binary indicator
- Sleep.disorders
History of Sleep disorders, binary indicator
- ADHD..Conduct.Disorders..and.Hyperkinetic.Syndrome
History of ADHD, Conduct Disorders, and Hyperkinetic Syndrome, binary indicator
- Cataract
History of Cataract, binary indicator
- Migraine.and.Chronic.Headache
History of Migraine and Chronic Headache, binary indicator
- Depressive.Disorders
History of Depressive Disorders, binary indicator
- Hyperlipidemia
History of Hyperlipidemia, binary indicator
- Sensory...Deafness.and.Hearing.Impairment
History of Sensory Deafness and Hearing Impairment, binary indicator
- Female...Male.Breast.Cancer
History of Female/Male Breast Cancer, binary indicator
- Personality.Disorders
History of Personality Disorders, binary indicator
- Anemia
History of Anemia, binary indicator
- Chronic.Kidney.Disease
History of Chronic Kidney Disease, binary indicator
- Schizophrenia.and.Other.Psychotic.Disorders
History of Schizophrenia and Other Psychotic Disorders, binary indicator
- Glaucoma
History of Glaucoma, binary indicator
- Peripheral.Vascular.Disease..PVD.
History of Peripheral Vascular Disease (PVD), binary indicator
- Heart.Failure
History of Heart Failure, binary indicator
- Pressure.and.Chronic.Ulcers
History of Pressure and Chronic Ulcers, binary indicator
- Obesity
History of Obesity, binary indicator
- Diabetes
History of Diabetes, binary indicator
- Mobility.Impairments
History of Mobility Impairments, binary indicator
- Benign.Prostatic.Hyperplasia
History of Benign Prostatic Hyperplasia, binary indicator
- Drug.Use.Disorders
History of Drug Use Disorders, binary indicator
- Alcohol.Use.Disorders
History of Alcohol Use Disorders, binary indicator
- Post.Traumatic.Stress.Disorder..PTSD.
History of Post Traumatic Stress Disorder (PTSD), binary indicator
- Atrial.Fibrillation
History of Atrial Fibrillation, binary indicator
- Tobacco.Use
History of Tobacco Use, binary indicator
- Ischemic.Heart.Disease
History of Ischemic Heart Disease, binary indicator
- Liver.Disease..Cirrhosis.and.Other.Liver.Conditions..except.Viral.Hepatitis.
History of Liver Disease (Cirrhosis and Other Liver Conditions except Viral Hepatitis), binary indicator
- Sensory...Blindness.and.Visual.Impairment
History of Sensory Blindness and Visual Impairment, binary indicator
- Bipolar.Disorder
History of Bipolar Disorder, binary indicator
- Depression
History of Depression, binary indicator
- Prostate.Cancer
History of Prostate Cancer, binary indicator
- Acute.Myocardial.Infarction
History of Acute Myocardial Infarction, binary indicator
- Hip.Pelvic.Fracture
History of Hip/Pelvic Fracture, binary indicator
- Other.Developmental.Delays
History of Other Developmental Delays, binary indicator
- Viral.Hepatitis..General.
History of Viral Hepatitis (General), binary indicator
- Sickle.Cell.Disease
History of Sickle Cell Disease, binary indicator
- Multiple.Sclerosis.and.Transverse.Myelitis
History of Multiple Sclerosis and Transverse Myelitis, binary indicator
- Leukemias.and.Lymphomas
History of Leukemias and Lymphomas, binary indicator
- Opioid.Use.Disorder
History of Opioid Use Disorder, binary indicator
- Colorectal.Cancer
History of Colorectal Cancer, binary indicator
- Epilepsy
History of Epilepsy, binary indicator
- Osteoporosis
History of Osteoporosis, binary indicator
- Intellectual.Disabilities.and.Related.Conditions
History of Intellectual Disabilities and Related Conditions, binary indicator
- Spinal.Cord.Injury
History of Spinal Cord Injury, binary indicator
- Endometrial.Cancer
History of Endometrial Cancer, binary indicator
- Spina.Bifida.and.Other.Congenital.Anomalies.of.the.Nervous.System
History of Spina Bifida and Other Congenital Anomalies of the Nervous System, binary indicator
- Learning.Disabilities
History of Learning Disabilities, binary indicator
- Periodontitis
History of Periodontitis, binary indicator
- Lung.Cancer
History of Lung Cancer, binary indicator
- Cystic.Fibrosis.and.Other.Metabolic.Developmental.Disorders
History of Cystic Fibrosis and Other Metabolic Developmental Disorders, binary indicator
- Cerebral.Palsy
History of Cerebral Palsy, binary indicator
- Human.Immunodeficiency.Virus.and.or.Acquired.Immunodeficiency.Syndrome..HIV.AIDS.
History of Human Immunodeficiency Virus and/or Acquired Immunodeficiency Syndrome (HIV/AIDS), binary indicator
- Menopause
History of Menopause, binary indicator
- Muscular.Dystrophy
History of Muscular Dystrophy, binary indicator
- Autism.Spectrum.Disorders
History of Autism Spectrum Disorders, binary indicator
- atorvastatin
History of atorvastatin use, binary indicator
- hydrochlorothiazide
History of hydrochlorothiazide use, binary indicator
- amlodipine
History of amlodipine use, binary indicator
- aspirin
History of aspirin use, binary indicator
- metoprolol
History of metoprolol use, binary indicator
- levothyroxine
History of levothyroxine use, binary indicator
- metformin
History of metformin use, binary indicator
- lisinopril
History of lisinopril use, binary indicator
- simvastatin
History of simvastatin use, binary indicator
- sodium.chloride
History of sodium chloride use, binary indicator
- omeprazole
History of omeprazole use, binary indicator
- albuterol
History of albuterol use, binary indicator
- potassium.chloride
History of potassium chloride use, binary indicator
- sertraline
History of sertraline use, binary indicator
- lidocaine
History of lidocaine use, binary indicator
- furosemide
History of furosemide use, binary indicator
- losartan
History of losartan use, binary indicator
- donepezil
History of donepezil use, binary indicator
- cholecalciferol
History of cholecalciferol use, binary indicator
- ondansetron
History of ondansetron use, binary indicator
- oxycodone
History of oxycodone use, binary indicator
- lorazepam
History of lorazepam use, binary indicator
- glucose.oxidase
History of glucose oxidase use, binary indicator
- prednisone
History of prednisone use, binary indicator
- fluticasone
History of fluticasone use, binary indicator
- gabapentin
History of gabapentin use, binary indicator
- rosuvastatin
History of rosuvastatin use, binary indicator
- tamsulosin
History of tamsulosin use, binary indicator
- fentanyl
History of fentanyl use, binary indicator
- carbidopa
History of carbidopa use, binary indicator
- pantoprazole
History of pantoprazole use, binary indicator
- escitalopram
History of escitalopram use, binary indicator
- insulin.aspart..human
History of insulin aspart (human) use, binary indicator
- clonazepam
History of clonazepam use, binary indicator
- carvedilol
History of carvedilol use, binary indicator
- heparin
History of heparin use, binary indicator
- bupropion
History of bupropion use, binary indicator
- polyethylene.glycol.3350
History of polyethylene glycol 3350 use, binary indicator
- pravastatin
History of pravastatin use, binary indicator
- docusate
History of docusate use, binary indicator
- vitamin.B12
History of vitamin B12 use, binary indicator
- clopidogrel
History of clopidogrel use, binary indicator
- atenolol
History of atenolol use, binary indicator
- glipizide
History of glipizide use, binary indicator
- ascorbic.acid
History of ascorbic acid use, binary indicator
- glucose
History of glucose use, binary indicator
- insulin.glargine
History of insulin glargine use, binary indicator
- zolpidem
History of zolpidem use, binary indicator
- warfarin
History of warfarin use, binary indicator
- azithromycin
History of azithromycin use, binary indicator
- trazodone
History of trazodone use, binary indicator
- esomeprazole
History of esomeprazole use, binary indicator
- sennosides..USP
History of sennosides (USP) use, binary indicator
- propofol
History of propofol use, binary indicator
- iopamidol
History of iopamidol use, binary indicator
- diltiazem
History of diltiazem use, binary indicator
- ibuprofen
History of ibuprofen use, binary indicator
- dexamethasone
History of dexamethasone use, binary indicator
- tramadol
History of tramadol use, binary indicator
- amoxicillin
History of amoxicillin use, binary indicator
- midazolam
History of midazolam use, binary indicator
- lansoprazole
History of lansoprazole use, binary indicator
- citalopram
History of citalopram use, binary indicator
- valsartan
History of valsartan use, binary indicator
- ciprofloxacin
History of ciprofloxacin use, binary indicator
- famotidine
History of famotidine use, binary indicator
- calcium.carbonate
History of calcium carbonate use, binary indicator
- finasteride
History of finasteride use, binary indicator
- levofloxacin
History of levofloxacin use, binary indicator
- sulfamethoxazole
History of sulfamethoxazole use, binary indicator
- duloxetine
History of duloxetine use, binary indicator
- alprazolam
History of alprazolam use, binary indicator
- naproxen
History of naproxen use, binary indicator
- levetiracetam
History of levetiracetam use, binary indicator
- ranitidine
History of ranitidine use, binary indicator
- triamcinolone
History of triamcinolone use, binary indicator
- salmeterol
History of salmeterol use, binary indicator
- folic.acid
History of folic acid use, binary indicator
- nifedipine
History of nifedipine use, binary indicator
- ferrous.sulfate
History of ferrous sulfate use, binary indicator
- morphine
History of morphine use, binary indicator
- hydralazine
History of hydralazine use, binary indicator
- montelukast
History of montelukast use, binary indicator
- magnesium.sulfate
History of magnesium sulfate use, binary indicator
- methylphenidate
History of methylphenidate use, binary indicator
- hydrocortisone
History of hydrocortisone use, binary indicator
- latanoprost
History of latanoprost use, binary indicator
- quetiapine
History of quetiapine use, binary indicator
- metronidazole
History of metronidazole use, binary indicator
- diphenhydramine
History of diphenhydramine use, binary indicator
- memantine
History of memantine use, binary indicator
- methylprednisolone
History of methylprednisolone use, binary indicator
- doxycycline
History of doxycycline use, binary indicator
- fluoxetine
History of fluoxetine use, binary indicator
- paroxetine
History of paroxetine use, binary indicator
- gadobenate
History of gadobenate use, binary indicator
- propranolol
History of propranolol use, binary indicator
- ramipril
History of ramipril use, binary indicator
- ezetimibe
History of ezetimibe use, binary indicator
- allopurinol
History of allopurinol use, binary indicator
- enoxaparin
History of enoxaparin use, binary indicator
- apixaban
Binary indicator for history of apixaban
- sildenafil
Binary indicator for history of sildenafil
- oxybutynin
Binary indicator for history of oxybutynin
- insulin.lispro
Binary indicator for history of insulin lispro
- melatonin
Binary indicator for history of melatonin
- tadalafil
Binary indicator for history of tadalafil
- hydromorphone
Binary indicator for history of hydromorphone
- ipratropium
Binary indicator for history of ipratropium
- cephalexin
Binary indicator for history of cephalexin
- lovastatin
Binary indicator for history of lovastatin
- mirtazapine
Binary indicator for history of mirtazapine
- venlafaxine
Binary indicator for history of venlafaxine
- fenofibrate
Binary indicator for history of fenofibrate
- guaifenesin
Binary indicator for history of guaifenesin
- estradiol
Binary indicator for history of estradiol
- nitroglycerin
Binary indicator for history of nitroglycerin
- pregabalin
Binary indicator for history of pregabalin
- lamotrigine
Binary indicator for history of lamotrigine
- enalapril
Binary indicator for history of enalapril
- ergocalciferol
Binary indicator for history of ergocalciferol
- ketoconazole
Binary indicator for history of ketoconazole
- spironolactone
Binary indicator for history of spironolactone
- cyclobenzaprine
Binary indicator for history of cyclobenzaprine
- meloxicam
Binary indicator for history of meloxicam
- cetirizine
Binary indicator for history of cetirizine
- alendronate
Binary indicator for history of alendronate
- nortriptyline
Binary indicator for history of nortriptyline
- bisacodyl
Binary indicator for history of bisacodyl
- sitagliptin
Binary indicator for history of sitagliptin
- salmon.oil
Binary indicator for history of salmon oil
- olmesartan
Binary indicator for history of olmesartan
- timolol
Binary indicator for history of timolol
- nitrofurantoin
Binary indicator for history of nitrofurantoin
- celecoxib
Binary indicator for history of celecoxib
- glimepiride
Binary indicator for history of glimepiride
- iohexol
Binary indicator for history of iohexol
- clonidine
Binary indicator for history of clonidine
- valacyclovir
Binary indicator for history of valacyclovir
- ropinirole
Binary indicator for history of ropinirole
- bupivacaine
Binary indicator for history of bupivacaine
- benzonatate
Binary indicator for history of benzonatate
- carbamazepine
Binary indicator for history of carbamazepine
- meclizine
Binary indicator for history of meclizine
- azelastine
Binary indicator for history of azelastine
- diclofenac
Binary indicator for history of diclofenac
- clobetasol
Binary indicator for history of clobetasol
- ketorolac
Binary indicator for history of ketorolac
- amphetamine
Binary indicator for history of amphetamine
- budesonide
Binary indicator for history of budesonide
- diazepam
Binary indicator for history of diazepam
- repaglinide
Binary indicator for history of repaglinide
- omega.3.acid.ethyl.esters..USP.
Binary indicator for history of omega-3-acid ethyl esters (USP)
- methadone
Binary indicator for history of methadone
- clindamycin
Binary indicator for history of clindamycin
- rivaroxaban
Binary indicator for history of rivaroxaban
- vitamin.E
Binary indicator for history of vitamin E
- valproate
Binary indicator for history of valproate
- tiotropium
Binary indicator for history of tiotropium
- temazepam
Binary indicator for history of temazepam
- chlorhexidine
Binary indicator for history of chlorhexidine
- hydroxychloroquine
Binary indicator for history of hydroxychloroquine
- nystatin
Binary indicator for history of nystatin
- olanzapine
Binary indicator for history of olanzapine
- nicotine
Binary indicator for history of nicotine
- mometasone
Binary indicator for history of mometasone
- prednisolone
Binary indicator for history of prednisolone
- estrogens..conjugated..USP.
Binary indicator for history of conjugated estrogens (USP)
- mupirocin
Binary indicator for history of mupirocin
- loratadine
Binary indicator for history of loratadine
- fexofenadine
Binary indicator for history of fexofenadine
- solifenacin
Binary indicator for history of solifenacin
- irbesartan
Binary indicator for history of irbesartan
- ephedrine
Binary indicator for history of ephedrine
- isosorbide
Binary indicator for history of isosorbide
- gadoterate.meglumine
Binary indicator for history of gadoterate meglumine
- ubidecarenone
Binary indicator for history of ubidecarenone
- verapamil
Binary indicator for history of verapamil
- promethazine
Binary indicator for history of promethazine
- doxazosin
Binary indicator for history of doxazosin
- fluconazole
Binary indicator for history of fluconazole
- haloperidol
Binary indicator for history of haloperidol
- rivastigmine
Binary indicator for history of rivastigmine
- labetalol
Binary indicator for history of labetalol
- eszopiclone
Binary indicator for history of eszopiclone
- insulin.detemir
Binary indicator for history of insulin detemir
- aripiprazole
Binary indicator for history of aripiprazole
- vancomycin
Binary indicator for history of vancomycin
- thiamine
Binary indicator for history of thiamine
- epinephrine
Binary indicator for history of epinephrine
- amitriptyline
Binary indicator for history of amitriptyline
- tolterodine
Binary indicator for history of tolterodine
- cefazolin
Binary indicator for history of cefazolin
- lisdexamfetamine
Binary indicator for history of lisdexamfetamine
- risperidone
Binary indicator for history of risperidone
- mirabegron
Binary indicator for history of mirabegron
- magnesium.oxide
Binary indicator for history of magnesium oxide
- niacin
Binary indicator for history of niacin
- pramipexole
Binary indicator for history of pramipexole
- zoledronic.acid
Binary indicator for history of zoledronic acid
- sex
Binary sex
- age
Age (continuous)
- race
Race
- days
Time from index to event (continuous)
- index_date
Index date of diagnosis
- treatment
Binary treatment indicator
- outcome_AD_value
Outcome binary indicator: AD
- outcome_AD_time
Outcome time: AD
- outcome_ADRD_value
Outcome binary indicator: ADRD
- outcome_ADRD_time
Outcome time: ADRD
- outcome_acute_conjunctivitis_value
Outcome binary indicator: acute conjunctivitis
- outcome_acute_conjunctivitis_time
Outcome time: acute conjunctivitis
- outcome_acute_tonsillitis_value
Outcome binary indicator: acute tonsillitis
- outcome_acute_tonsillitis_time
Outcome time: acute tonsillitis
- outcome_adhesive_capsulitis_of_shoulder_value
Outcome binary indicator: adhesive capsulitis of shoulder
- outcome_adhesive_capsulitis_of_shoulder_time
Outcome time: adhesive capsulitis of shoulder
- outcome_allergic_rhinitis_value
Outcome binary indicator: allergic rhinitis
- outcome_allergic_rhinitis_time
Outcome time: allergic rhinitis
- outcome_blepharitis_value
Outcome binary indicator: blepharitis
- outcome_blepharitis_time
Outcome time: blepharitis
- outcome_carpal_tunnel_syndrome_value
Outcome binary indicator: carpal tunnel syndrome
- outcome_carpal_tunnel_syndrome_time
Outcome time: carpal tunnel syndrome
- outcome_chalazion_value
Outcome binary indicator: chalazion
- outcome_chalazion_time
Outcome time: chalazion
- outcome_contact_dermatitis_value
Outcome binary indicator: contact dermatitis
- outcome_contact_dermatitis_time
Outcome time: contact dermatitis
- outcome_dental_caries_value
Outcome binary indicator: dental caries
- outcome_dental_caries_time
Outcome time: dental caries
- outcome_deviated_nasal_septum_value
Outcome binary indicator: deviated nasal septum
- outcome_deviated_nasal_septum_time
Outcome time: deviated nasal septum
- outcome_foreign_body_in_ear_value
Outcome binary indicator: foreign body in ear
- outcome_foreign_body_in_ear_time
Outcome time: foreign body in ear
- outcome_gout_value
Outcome binary indicator: gout
- outcome_gout_time
Outcome time: gout
- outcome_hemorrhoids_value
Outcome binary indicator: hemorrhoids
- outcome_hemorrhoids_time
Outcome time: hemorrhoids
- outcome_impacted_cerumen_value
Outcome binary indicator: impacted cerumen
- outcome_impacted_cerumen_time
Outcome time: impacted cerumen
- outcome_influenza_value
Outcome binary indicator: influenza
- outcome_influenza_time
Outcome time: influenza
- outcome_ingrowing_nail_value
Outcome binary indicator: ingrowing nail
- outcome_ingrowing_nail_time
Outcome time: ingrowing nail
- outcome_low_back_pain_value
Outcome binary indicator: low back pain
- outcome_low_back_pain_time
Outcome time: low back pain
- outcome_menieres_disease_value
Outcome binary indicator: menieres disease
- outcome_menieres_disease_time
Outcome time: menieres disease
- outcome_osteoarthritis_of_knee_value
Outcome binary indicator: osteoarthritis of knee
- outcome_osteoarthritis_of_knee_time
Outcome time: osteoarthritis of knee
- outcome_osteoporosis_value
Outcome binary indicator: osteoporosis
- outcome_osteoporosis_time
Outcome time: osteoporosis
- outcome_foot_drop_value
Outcome binary indicator: foot drop
- outcome_foot_drop_time
Outcome time: foot drop
- outcome_hearing_problem_value
Outcome binary indicator: hearing problem
- outcome_hearing_problem_time
Outcome time: hearing problem
- outcome_intra_abdominal_and_pelvic_swelling_mass_and_lump_value
Outcome binary indicator: intra abdominal and pelvic swelling mass and lump
- outcome_intra_abdominal_and_pelvic_swelling_mass_and_lump_time
Outcome time: intra abdominal and pelvic swelling mass and lump
- outcome_irritability_and_anger_value
Outcome binary indicator: irritability and anger
- outcome_irritability_and_anger_time
Outcome time: irritability and anger
- outcome_wristdrop_value
Outcome binary indicator: wristdrop
- outcome_wristdrop_time
Outcome time: wristdrop
- site
Study site identifier
Length of Stay data
Description
A simulated data set of hospitalization Length of Stay (LOS) from 3 sites
Usage
LOS
Format
A data frame with 1000 rows and 5 variables:
- site
site id, 500 'site1', 400 'site2' and 100 'site3'
- age
3 categories, 'young', 'middle', and 'old'
- sex
2 categories, 'M' for male and 'F' for female
- lab
lab test results, continuous value ranging from 0 to 100
- los
LOS in days, ranging from 1 to 28. Treated as a continuous outcome in DLM
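The note that los is treated as a continuous outcome in DLM can be illustrated with a minimal base-R sketch of lossless distributed linear regression. The data frame `toy` below is simulated to mimic the column layout of LOS, and the helper `site_summary` is hypothetical; neither is package code. Each site shares only X'X and X'y, and the lead site solves the normal equations.

```r
set.seed(42)
# Hypothetical stand-in with the same columns as LOS (values are simulated here)
n <- 1000
toy <- data.frame(
  site = rep(c("site1", "site2", "site3"), times = c(500, 400, 100)),
  age  = factor(sample(c("young", "middle", "old"), n, replace = TRUE)),
  sex  = factor(sample(c("M", "F"), n, replace = TRUE)),
  lab  = runif(n, 0, 100)
)
toy$los <- 5 + 0.05 * toy$lab + 2 * (toy$age == "old") + rnorm(n)

# Each site shares only the aggregated matrices X'X and X'y (hypothetical helper)
site_summary <- function(d) {
  X <- model.matrix(~ age + sex + lab, data = d)
  list(XtX = crossprod(X), Xty = crossprod(X, d$los))
}
summaries <- lapply(split(toy, toy$site), site_summary)

# Lead site: add the summaries and solve the normal equations
XtX <- Reduce(`+`, lapply(summaries, `[[`, "XtX"))
Xty <- Reduce(`+`, lapply(summaries, `[[`, "Xty"))
beta_dist <- drop(solve(XtX, Xty))

# Reference fit that needs all patient-level rows in one place
beta_pooled <- coef(lm(los ~ age + sex + lab, data = toy))
```

In this sketch `beta_dist` agrees with `beta_pooled` up to numerical error, which is what "lossless" refers to for the linear model.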
Generate pda UWZ derivatives
Description
Generate pda UWZ derivatives
Usage
ODAC.derive(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Details
Calculate and broadcast the 1st and 2nd order derivatives at the initial estimate bbar for ODAC. This requires 2 substeps: first calculate the summary statistics (U, W, Z), then calculate the derivatives (logL_D1, logL_D2)
Value
list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)
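The two substeps can be sketched for a single covariate as follows. This is a minimal illustration, assuming the Breslow form of the Cox partial likelihood and assuming that U and W denote the risk-set sums of exp(x b) and x exp(x b) at the pooled event times; Z, the second-moment sum needed for logL_D2, is omitted for brevity. The simulated data and the helper `site_UW` are hypothetical and are not the package's implementation.

```r
set.seed(7)
n <- 90
toy <- data.frame(
  site   = rep(c("site1", "site2", "site3"), each = 30),
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, 1, 0.7),
  x      = rnorm(n)
)
bbar  <- 0.3                                        # initial estimate (assumed given)
T_all <- sort(unique(toy$time[toy$status == 1]))    # pooled event times

# Substep 1: each site evaluates its risk-set sums at every time in T_all
site_UW <- function(d, T_all, b) {
  w <- exp(d$x * b)
  list(
    U  = sapply(T_all, function(t) sum(w[d$time >= t])),
    W  = sapply(T_all, function(t) sum((d$x * w)[d$time >= t])),
    dN = sapply(T_all, function(t) sum(d$status == 1 & d$time == t)),  # events at t
    sx = sum(d$x[d$status == 1])                                       # sum of x over events
  )
}
parts <- lapply(split(toy, toy$site), site_UW, T_all = T_all, b = bbar)

# Substep 2: sum across sites, then form the first derivative of the log partial likelihood
U  <- Reduce(`+`, lapply(parts, `[[`, "U"))
W  <- Reduce(`+`, lapply(parts, `[[`, "W"))
dN <- Reduce(`+`, lapply(parts, `[[`, "dN"))
sx <- sum(sapply(parts, `[[`, "sx"))
grad_dist <- sx - sum(dN * W / U)

# Reference: the same derivative computed directly from the pooled rows
ev <- which(toy$status == 1)
grad_pooled <- sum(sapply(ev, function(i) {
  risk <- toy$time >= toy$time[i]
  w <- exp(toy$x[risk] * bbar)
  toy$x[i] - sum(toy$x[risk] * w) / sum(w)
}))
```

Because the risk-set sums are additive across sites, the gradient built from aggregated quantities equals the pooled gradient in this sketch.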
Generate pda UWZ summary statistics before calculating derivatives
Description
Generate pda UWZ summary statistics before calculating derivatives
Usage
ODAC.deriveUWZ(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODAC.estimate(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
step-4: construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODAC initialize
Description
ODAC initialize
Usage
ODAC.initialize(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(T_i = T_i, bhat_i = fit_i$coef, Vhat_i = summary(fit_i)$coef[,2]^2, site=control$mysite, site_size= nrow(ipdata))
References
Rui Duan, et al. "Learning from local to global: An efficient distributed algorithm for modeling time-to-event data". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocaa044
Chongliang Luo, et al. "ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data". medRxiv, 2021, https://doi.org/10.1101/2021.04.18.21255694
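The initialize step returns each site's estimate bhat_i and its variance Vhat_i. One common way to combine such values into a single initial estimate is fixed-effect inverse-variance weighting. The sketch below uses made-up numbers; whether the package's "meta" initial estimate uses exactly this weighting is an assumption.

```r
# Hypothetical per-site output: coefficient estimates and their variances
init <- list(
  site1 = list(bhat_i = c(age = 0.020, sex = -0.50), Vhat_i = c(age = 0.0001, sex = 0.04)),
  site2 = list(bhat_i = c(age = 0.030, sex = -0.30), Vhat_i = c(age = 0.0004, sex = 0.04)),
  site3 = list(bhat_i = c(age = 0.010, sex = -0.70), Vhat_i = c(age = 0.0004, sex = 0.02))
)
bhat <- sapply(init, `[[`, "bhat_i")   # rows: coefficients, columns: sites
Vhat <- sapply(init, `[[`, "Vhat_i")

w      <- 1 / Vhat                     # inverse-variance weights
b_meta <- rowSums(w * bhat) / rowSums(w)
V_meta <- 1 / rowSums(w)
```

Sites with smaller variances pull the combined estimate toward their own values; here `b_meta` is 0.02 for age and -0.55 for sex.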
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODAC.synthesize(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
ODACAT derivatives
Description
ODACAT derivatives
Usage
ODACAT.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODACAT.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Details
step-3: construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODACAT initialize
Description
ODACAT initialize
Usage
ODACAT.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODACAT.synthesize(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
pda cloud configuration |
Details
Optional step-4: synthesize the surrogate estimates btilde from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
ODACATH derivatives
Description
ODACATH derivatives
Usage
ODACATH.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = n, S_site=S_site, eta=eta_mat[site,])
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODACATH.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Details
step-3: construct and solve surrogate efficient score at the master/lead site
Value
list(btilde=betanew, btilde.se=beta_SE,eta_mat=eta_mat,eta_mat_theta=NULL,site=config$site_id, site_size=n_site)
ODACATH initialize
Description
ODACATH initialize
Usage
ODACATH.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODACATH.synthesize(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
pda cloud configuration |
Details
Optional step-4: synthesize the surrogate estimates btilde from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
ODACAT simulated data with nominal outcome
Description
A simulated data set for ODACAT demonstration
Usage
ODACAT_nominal
Format
A list containing the following elements:
- id.site
site id, 100 'site1', 100 'site2', 100 'site3'
- outcome
nominal outcome taking values 1,2,3
- X1
a continuous covariate
- X2
a binary covariate
- X3
a binary covariate
ODACAT simulated data with ordinal outcome
Description
A simulated data set for ODACAT demonstration
Usage
ODACAT_ordinal
Format
A data frame with 300 rows and 5 variables:
- id.site
site id, 105 'site1', 105 'site2', 90 'site3'
- outcome
3-category outcome, possible values are 1,2,3. Category 3 will be used as reference
- X1
the first covariate, continuous
- X2
the second covariate, binary
- X3
the third covariate, binary
Generate pda derivatives
Description
Generate pda derivatives
Usage
ODACH_CC.derive(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Details
Calculate and broadcast the 1st and 2nd order derivatives at the initial estimate bbar
Value
list(bbar=bbar, site=control$mysite, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODACH_CC.estimate(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
step-4: construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODACH_CC initialize
Description
ODACH_CC initialize
Usage
ODACH_CC.initialize(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(bhat_i = fit_i$coef, Vhat_i = summary(fit_i)$coef[,2]^2, site=control$mysite, site_size= nrow(ipdata))
References
Chongliang Luo, et al. "ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data". medRxiv, 2021, https://doi.org/10.1101/2021.04.18.21255694
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODACH_CC.synthesize(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
Generate pda ODACT derivatives
Description
Generate pda ODACT derivatives
Usage
ODACT.derive(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Details
Calculate and broadcast the 1st and 2nd order derivatives at the initial estimate bbar for ODACT
Value
list(b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
PDA ODACT surrogate estimation
Description
PDA ODACT surrogate estimation
Usage
ODACT.estimate(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
step-4: construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODACT initialize
Description
ODACT initialize
Usage
ODACT.initialize(ipdata, control, config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(bhat_i, Vhat_i, site, site_size)
References
Liang CJ, Luo C, Kranzler HR, Bian J, Chen Y. Communication-efficient federated learning of temporal effects on opioid use disorder with data from distributed research networks. J Am Med Inform Assoc. 2025 Apr 1;32(4):656-664. doi: 10.1093/jamia/ocae313. PMID: 39864407; PMCID: PMC12005629.
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODACT.synthesize(ipdata, control, config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
cloud config |
Details
Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
ODAH derivatives
Description
ODAH derivatives
Usage
ODAH.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1_zero = logL_D1_zero, logL_D1_count = logL_D1_count, logL_D2_zero = logL_D2_zero, logL_D2_count = logL_D2_count)
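The separate zero and count derivatives in the returned list reflect a two-part model. As a minimal sketch, assume a hurdle model with a logistic zero part and a zero-truncated Poisson count part (the package's exact count distribution is not stated here, so this is an assumption). The code computes the analytic first derivatives of each part and checks them against finite differences. All names and the toy data are hypothetical.

```r
set.seed(11)
n <- 150
X <- cbind(1, rnorm(n))
y <- ifelse(rbinom(n, 1, 0.6) == 1, rpois(n, 2) + 1, 0)   # toy counts with extra zeros

# Zero part: logistic model for the event y > 0
logL_zero <- function(g) {
  eta <- drop(X %*% g)
  sum((y > 0) * eta - log1p(exp(eta)))
}
D1_zero <- function(g) drop(crossprod(X, (y > 0) - plogis(drop(X %*% g))))

# Count part: zero-truncated Poisson, using the positive counts only
pos <- y > 0
logL_count <- function(b) {
  mu <- exp(drop(X[pos, ] %*% b))
  sum(y[pos] * log(mu) - mu - log1p(-exp(-mu)) - lgamma(y[pos] + 1))
}
D1_count <- function(b) {
  mu <- exp(drop(X[pos, ] %*% b))
  drop(crossprod(X[pos, ], y[pos] - mu / (1 - exp(-mu))))
}

# Central finite differences, used only to check the analytic gradients
num_grad <- function(f, p, h = 1e-6) {
  sapply(seq_along(p), function(j) {
    e <- replace(numeric(length(p)), j, h)
    (f(p + e) - f(p - e)) / (2 * h)
  })
}
g0 <- c(0.2, -0.1)
b0 <- c(0.8, 0.1)
```

Because the two parts share no parameters, each gradient can be computed and transmitted on its own, which matches the `_zero` and `_count` naming of the returned elements.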
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODAH.estimate(ipdata,control,config)
Arguments
ipdata |
local data in a list(ipdata, X_count, X_zero) |
control |
PDA control |
config |
cloud configuration |
Details
construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODAH initialize
Description
ODAH initialize
Usage
ODAH.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
TBD
ODAL derivatives
Description
ODAL derivatives
Usage
ODAL.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODAL.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Details
step-3: construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
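The surrogate log-likelihood step can be sketched in base R for logistic regression. This is a minimal illustration of the first-order surrogate of Jordan, Lee and Yang (2019), in which the lead site's log-likelihood is corrected by the difference between the global and local gradients at bbar. The simulated data, the helper names, and the choice of bbar as the lead site's local estimate are assumptions; this is not the package's code.

```r
set.seed(123)
n_k <- c(site1 = 300, site2 = 300, site3 = 300)
beta_true <- c(-0.5, 1, -1)
sim_site <- function(n) {
  X <- cbind(1, rnorm(n), rbinom(n, 1, 0.5))
  y <- rbinom(n, 1, plogis(drop(X %*% beta_true)))
  list(X = X, y = y)
}
sites <- lapply(n_k, sim_site)

# Average log-likelihood and its gradient for logistic regression
logL  <- function(b, d) mean(d$y * drop(d$X %*% b) - log1p(exp(drop(d$X %*% b))))
gradL <- function(b, d) drop(crossprod(d$X, d$y - plogis(drop(d$X %*% b)))) / length(d$y)

# Initial value: here simply the lead site's local estimate (an assumption)
lead <- sites$site1
bbar <- unname(coef(glm(lead$y ~ lead$X - 1, family = binomial())))

# Sites send only gradients at bbar; combine them weighted by site size
g_sites  <- sapply(sites, function(d) gradL(bbar, d))   # coefficients x sites
g_global <- drop(g_sites %*% n_k) / sum(n_k)
g_lead   <- gradL(bbar, lead)

# Surrogate: lead-site log-likelihood plus a first-order gradient correction
surrogate      <- function(b) logL(b, lead) + sum((g_global - g_lead) * b)
surrogate_grad <- function(b) gradL(b, lead) + (g_global - g_lead)
sol <- optim(bbar, function(b) -surrogate(b), function(b) -surrogate_grad(b),
             method = "BFGS", hessian = TRUE)
btilde <- sol$par

# Reference: pooled fit, which needs all patient-level rows in one place
X_all <- do.call(rbind, lapply(sites, `[[`, "X"))
y_all <- unlist(lapply(sites, `[[`, "y"))
b_pooled <- unname(coef(glm(y_all ~ X_all - 1, family = binomial())))
```

By construction the surrogate's gradient at bbar equals the global gradient, so maximizing it moves the lead site's estimate toward the pooled solution without the lead site seeing other sites' rows.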
ODAL initialize
Description
ODAL initialize
Usage
ODAL.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
Rui Duan, et al. "Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocz199
PDA synthesize surrogate estimates from all sites, optional
Description
PDA synthesize surrogate estimates from all sites, optional
Usage
ODAL.synthesize(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
pda control |
config |
pda cloud configuration |
Details
Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if the step-3 results from all sites are broadcast
Value
list(btilde=btilde, Vtilde=Vtilde)
ODAP derivatives
Description
ODAP derivatives
Usage
ODAP.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODAP.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame (generated in |
control |
PDA control |
config |
cloud configuration |
Details
construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODAP initialize
Description
ODAP initialize
Usage
ODAP.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
TBD
ODAPB derivatives
Description
ODAPB derivatives
Usage
ODAPB.derive(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)
PDA surrogate estimation
Description
PDA surrogate estimation
Usage
ODAPB.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame (generated in |
control |
PDA control |
config |
cloud configuration |
Details
construct and solve surrogate logL at the master/lead site
Value
list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
ODAPB initialize
Description
ODAPB initialize
Usage
ODAPB.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
TBD
COLA-GLMM
Description
Fits a generalized linear mixed model with site-level random intercepts using
only one-shot per-site summaries (Ck, Sk, S2k, X0). Each of the
iterations constructs weighted LMM summary statistics which are then solved
by lmm.fit (from DLMM), yielding updated fixed effects and random
intercepts until convergence.
Usage
cola_glmm(
summary_by_site,
family = "poisson",
intercept = TRUE,
beta_init = NULL,
u_init = NULL,
max_iter = 50,
tol = 1e-06,
verbose = TRUE
)
Arguments
summary_by_site |
Named list of site summaries. Each element must contain
|
family |
Character; one of |
intercept |
Logical; whether the fixed-effect design includes an
intercept (affects how |
beta_init |
Optional named numeric vector of initial fixed effects. Defaults to zeros. |
u_init |
Optional named numeric vector of initial site random effects (one per site). Defaults to zeros. |
max_iter |
Integer maximum number of IRLS iterations. Default 50. |
tol |
Convergence tolerance on relative squared parameter change.
Default 1e-06. |
verbose |
Logical; print iteration progress. Default TRUE. |
Details
Uses canonical links: log for Poisson and logit for binomial. The fixed-effect
covariates in X0 are assumed binary (plus optional Intercept).
For numerically extreme logits, a small weight floor is used internally.
Requires lmm.fit from dlmm.R to be on the search path.
Value
A list with elements:
- beta: named fixed-effect estimates
- u: named site random-intercept BLUPs
- V: variance component matrix for the random intercept
- s2: residual scale from the working LMM
- iter: number of iterations performed
- SiXYZ_last: last iteration's sufficient statistics (by site)
Examples
# fit <- cola_glmm(summary_by_site, family = "poisson")
COVID-19 LOS and mortality data
Description
A simulated data set of hospitalization Length of Stay (LOS) and mortality from 6 sites
Usage
covid
Format
A data frame with 2100 rows and 6 variables:
- site
site id, 600 'site1', 500 'site2', 400 'site3', 300 'site4', 200 'site5', 100 'site6'
- age
continuous age in years, minimum 3, maximum 97
- sex
2 categories, '1' for male and '0' for female
- lab
lab test results, continuous value ranging from 2.3 to 97.4
- los
LOS in days, ranging from 1 to 29
- death
mortality status, '1' for death and '0' for alive.
CrabSatellites data
Description
A data set modified from the CrabSatellites data in the countreg package (see demo(ODAH)).
Usage
cs
Format
A data frame containing 173 observations on 4 variables.
- site
Simulated site id, 85 'site1' and 88 'site2'.
- satellites
Number of satellites. Treated as (zero-inflated) count outcome in ODAH
- width
Carapace width (cm).
- weight
Weight (kg).
Source
https://rdrr.io/rforge/countreg/man/CrabSatellites.html
dGEM hospital-specific effect derivation
Description
dGEM hospital-specific effect derivation
Usage
dGEM.derive(ipdata,control,config,hosdata)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
hosdata |
hospital-level data |
Value
hospital_effect
dGEM standardized event rate estimation
Description
dGEM standardized event rate estimation
Usage
dGEM.estimate(ipdata,control,config)
Arguments
ipdata |
local data in data frame |
control |
PDA control |
config |
cloud configuration |
Details
step-3:
Value
event rate
dGEM initialize
Description
dGEM initialize
Usage
dGEM.initialize(ipdata,control,config)
Arguments
ipdata |
individual participant data |
control |
pda control data |
config |
local site configuration |
Value
init
References
NA
PDA dGEM synthesize
Description
PDA dGEM synthesize
Usage
dGEM.synthesize(control,config)
Arguments
control |
pda control |
config |
pda cloud configuration |
Details
Synthesis to get the standardized mortality rate
Value
list(final_event_rate=final_event_rate)
Pooled estimation for COLA-GLM and COLA-GLM-H
Description
Performs pooled estimation for Generalized Linear Models (GLMs) using either the standard COLA-GLM approach or the heterogeneous intercept extension (COLA-GLM-H). This function supports binary and Poisson outcomes.
Usage
estimatePool(KSiteIPD, formula, family = "binomial",
outcome_name, heter_intercept = FALSE)
Arguments
KSiteIPD |
|
formula |
|
family |
|
outcome_name |
outcome name |
heter_intercept |
Logical; if |
Value
A data frame with point estimates and standard errors for each coefficient
One-shot site summaries for COLA-GLMM
Description
Produces the lossless, pattern-level sufficient statistics for a single site:
pattern counts 'Ck', outcome sums 'Sk = \sum y', squared sums 'S2k = \sum y^2', and the
corresponding pattern matrix 'X0'. Works for both binomial and Poisson outcomes
(for Bernoulli, 'S2k == Sk').
Usage
generate_CSU_site(df_site, x_names, intercept = TRUE)
Arguments
df_site |
Data frame for one site. Must include outcome column named 'y' and the fixed-effect covariates in 'x_names'. If 'intercept = TRUE', the function will add an 'Intercept' column when missing. |
x_names |
Character vector of fixed-effect names (binary covariates; may include '"Intercept"' if 'intercept = TRUE'). |
intercept |
Logical; include a fixed intercept in the pattern matrix. |
Value
A list with elements:
'Ck' (integer vector) pattern counts
'Sk' (numeric vector) sums of y per pattern
'S2k' (numeric vector) sums of y^2 per pattern
'X0' (matrix) pattern design matrix aligned to 'Ck/Sk/S2k'
Examples
# df_site$y must exist; x_names are binary
# out <- generate_CSU_site(df_site, c("Intercept","age","sex"), intercept = TRUE)
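The pattern-level summaries described above can be sketched in base R. This is a minimal illustration with simulated single-site data, not the package's implementation; the objects `agg` and `X0` are hypothetical names. It also checks the "lossless" property for a Poisson model: the log-likelihood kernel computed from the summaries equals the one computed from individual rows.

```r
set.seed(1)
# Hypothetical single-site data: count outcome y and two binary covariates
df_site <- data.frame(
  y   = rpois(200, 2),
  age = rbinom(200, 1, 0.4),
  sex = rbinom(200, 1, 0.5)
)
x_names <- c("age", "sex")

# One row per observed covariate pattern: count, sum of y, sum of y^2
d   <- transform(df_site, Ck = 1L, Sk = y, S2k = y^2)
agg <- aggregate(d[c("Ck", "Sk", "S2k")], by = d[x_names], FUN = sum)
X0  <- cbind(Intercept = 1, as.matrix(agg[x_names]))   # pattern design matrix

# Poisson log-likelihood kernel at an arbitrary beta, computed two ways
beta   <- c(0.5, -0.2, 0.1)
X      <- cbind(1, as.matrix(df_site[x_names]))
eta    <- drop(X %*% beta)
ll_ipd <- sum(df_site$y * eta - exp(eta))              # from individual rows
eta0   <- drop(X0 %*% beta)
ll_pat <- sum(agg$Sk * eta0 - agg$Ck * exp(eta0))      # from pattern summaries
```

With only binary covariates the number of patterns is at most 2^p, so the summaries stay small no matter how many rows the site holds.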
gather cloud settings into a list
Description
gather cloud settings into a list
Usage
getCloudConfig(site_id,dir=NULL,uri=NULL,secret=NULL,silent_message=T)
Arguments
site_id |
site identifier |
dir |
shared directory path if flat files |
uri |
web uri if web service |
secret |
web token if web service |
silent_message |
logical, if the message will be muted |
Value
A list of cloud parameters: site_id, secret and uri
See Also
pda
DisC2o simulated data
Description
A simulated long COVID data set for Distributed Causal inference with Covariates shift (DisC2o). It contains only 5 covariates; more noisy covariates can be added when running the demo example.
Usage
long_covid
Format
A data frame with 900 rows and 53 variables:
- PASC_features
number of Post Acute Sequelae of COVID (PASC, or long covid) features
- covid_vaccination
treatment of covid vaccination, 1=vaccinated
- site
site id, 300 participants each for 'site1', 'site2', and 'site3'
- X1
a binary covariate
- X2
a binary covariate
- X3
a continuous covariate
- X4
a continuous covariate
- X5
a continuous covariate
Lung cancer survival time data
Description
A data set modified from the lung data in the survival package (see demo(ODAC)).
Usage
lung2
Format
A data frame with 228 rows and 5 variables:
- site
simulated site id, 86 'site1', 83 'site2' and 59 'site3'
- time
survival time in days
- status
censoring status 0=censored, 1=dead
- age
age in years
- sex
1 for female and 0 for male
Source
https://CRAN.R-project.org/package=survival
Construct binary covariate pattern matrix
Description
Builds the full design grid of binary fixed-effect patterns used by COLA to aggregate counts and outcome sums. When 'intercept = TRUE', an 'Intercept' column of ones is included and all other variables are expanded over {0,1}.
Usage
make_patterns(x_names, intercept = TRUE)
Arguments
x_names |
Character vector of fixed-effect names. If 'intercept = TRUE', it may include '"Intercept"'; otherwise it must not. |
intercept |
Logical; include a fixed intercept column. Default: 'TRUE'. |
Value
A tibble of all binary patterns over 'x_names' (with/without 'Intercept'), one row per pattern.
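The full grid of binary patterns can be sketched with `expand.grid`. The function `binary_patterns` below is a hypothetical base-R stand-in that returns a data frame rather than a tibble; it only illustrates the shape of the result and is not the package's code.

```r
# Hypothetical stand-in for the pattern grid (the package returns a tibble)
binary_patterns <- function(x_names, intercept = TRUE) {
  vars <- setdiff(x_names, "Intercept")
  # Every variable takes the values 0 and 1; expand over all combinations
  grid <- expand.grid(setNames(rep(list(0:1), length(vars)), vars))
  if (intercept) grid <- cbind(Intercept = 1, grid)
  grid
}

pat <- binary_patterns(c("Intercept", "age", "sex"))
```

For p binary covariates the grid has 2^p rows, so `pat` has 4 rows and the columns Intercept, age and sex.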
A flexible version of MASS::glmmPQL
Description
A flexible version of MASS::glmmPQL
Usage
myglmmPQL(formula.glm, formula, offset=NULL, family, data, fixef.init = NULL,
weights=NULL, REML=T, niter=10, verbose=T)
Arguments
formula.glm |
formula used to fit |
formula |
formula used to fit iterative |
offset |
|
family |
|
data |
|
fixef.init |
initial fixed effects estimates, set to zeros if NULL |
weights |
|
REML |
|
niter |
|
verbose |
|
Details
Use lme4::lmer instead of nlme::varFixed in PQL iteration to allow REML
Value
An object with the same format as lmer.
ODACATH simulated data with nominal outcome
Description
A simulated data set for ODACATH demonstration
Usage
nominal_data_hetero
Format
A list containing the following elements:
- id.site
site id, 100 'site1', 100 'site2', 100 'site3'
- y
nominal outcome taking values 1,2,3
- X1_cont
a continuous covariate
- X2_bin
a binary covariate
ODACH_CC simulated data
Description
A simulated data set for ODACH with case-cohort design demonstration
Usage
odach_cc
Format
A data frame with 413 rows and 8 variables:
- site
site id, 187 'site1', 133 'site2', 93 'site3'. The full cohort sizes (full_cohort_size) are 800, 600 and 400, respectively
- subcohort
1=subcohort, e.g. uniformly subsampled from full_cohort_size, 0=case
- time
survival time
- status
censoring status 0=censored, 1=dead
- X1
the first covariate, continuous
- X2
the second covariate, continuous
- Category
the third covariate, categorical
- Group
the fourth covariate, categorical
ODACATH simulated data with ordinal outcome
Description
A simulated data set for ODACATH demonstration
Usage
ordinal_data
Format
A list containing the following elements:
- id.site
site id, 100 'site1', 100 'site2', 100 'site3'
- y
ordinal outcome taking values 1,2,3
- X1_cont
a continuous covariate
- X2_bin
a binary covariate
PDA: Privacy-preserving Distributed Algorithm
Description
Fit Privacy-preserving Distributed Algorithms for linear, logistic, Poisson and Cox PH regression with possibly heterogeneous data across sites.
Usage
pda(ipdata=NULL,site_id,control=NULL,dir=NULL,uri=NULL,secret=NULL,
upload_without_confirm=F, silent_message=F, digits=4,hosdata=NULL)
Arguments
ipdata |
Local individual participant data (IPD) in a data frame; should include at least one column for the outcome and one column for the covariates |
site_id |
Character site name |
control |
pda control data |
dir |
directory for shared flat file cloud |
uri |
Uniform Resource Identifier for this run |
secret |
password to authenticate as site_id on uri |
upload_without_confirm |
logical. TRUE to upload silently, without interactive confirmation |
silent_message |
logical. TRUE to mute messages |
digits |
digits after decimal points in the output json files |
hosdata |
(for dGEM) hospital-level data, should include the same name as defined in the control file |
Value
control
References
Michael I. Jordan, Jason D. Lee & Yun Yang (2019) Communication-Efficient Distributed Statistical Inference,
Journal of the American Statistical Association, 114:526, 668-681
doi:10.1080/01621459.2018.1429274.
(DLM) Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation.
IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data.
medRxiv, doi:10.1101/2020.11.16.20230730.
(DPQL) Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling.
medRxiv, doi:10.1101/2021.05.03.21256561.
(ODAL) Rui Duan, et al. (2020) Learning from electronic health records across multiple sites:
A communication-efficient and privacy-preserving distributed algorithm.
Journal of the American Medical Informatics Association, 27.3:376–385,
doi:10.1093/jamia/ocz199.
(ODAC) Rui Duan, et al. (2020) Learning from local to global: An efficient distributed algorithm for modeling time-to-event data.
Journal of the American Medical Informatics Association, 27.7:1028–1036,
doi:10.1093/jamia/ocaa044.
(ODACH) Chongliang Luo, et al. (2021) ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data.
medRxiv, doi:10.1101/2021.04.18.21255694.
(ODAH) Mackenzie J. Edmondson, et al. (2021) An Efficient and Accurate Distributed Learning Algorithm for Modeling Multi-Site Zero-Inflated Count Outcomes.
medRxiv, pp.2020-12.
doi:10.1101/2020.12.17.20248194.
(ADAP) Xiaokang Liu, et al. (2021) ADAP: multisite learning with high-dimensional heterogeneous data via A Distributed Algorithm for Penalized regression.
(dGEM) Jiayi Tong, et al. (2022) dGEM: Decentralized Generalized Linear Mixed Effects Model
(COLA) Wu, Q., Reps, J.M., Li, L. et al. COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data. npj Digit. Med. 8, 442 (2025). https://doi.org/10.1038/s41746-025-01781-1.
(ODACT) Liang CJ, Luo C, Kranzler HR, Bian J, Chen Y. Communication-efficient federated learning of temporal effects on opioid use disorder with data from distributed research networks. J Am Med Inform Assoc. 2025 Apr 1;32(4):656-664. doi: 10.1093/jamia/ocae313. PMID: 39864407; PMCID: PMC12005629.
(DisC2o) Tong J, et al. 2025. DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data. Journal of Machine Learning Research. 2025;26(3):1-50.
See Also
pdaPut, pdaList, pdaGet, getCloudConfig and pdaSync.
Guide end-users step by step to identify the best pda models for their tasks and set up the control
Description
Use this function to guide end-users step by step to identify the best pda models for their tasks and to set up the control.
Usage
pdaCatalog(task=c('Regression', 'Survival', 'Trial_emulation',
'Causal_inference', 'Design_analysis', 'Clustering'),
write_json_file_path=getwd(), optim_maxit,optim_method,init_method)
Arguments
task |
user-specified task, one of c('Regression', 'Survival', 'Trial_emulation', 'Causal_inference', 'Design_analysis', 'Clustering'). If not specified, all models are displayed |
write_json_file_path |
directory path to write the control file to |
optim_maxit |
option in the control file for the optimization in pda, default 100 |
optim_method |
option in the control file for the optimization in pda, default "BFGS" |
init_method |
option in the control file for calculating the initial estimate in pda, default "meta" |
Value
pda control
See Also
pda
Function to download json and return as object
Description
Function to download json and return as object
Usage
pdaGet(name,config)
Arguments
name |
name of the file |
config |
cloud configuration |
Value
A list of data objects from the json file on the cloud
See Also
pda
Function to list available objects
Description
Function to list available objects
Usage
pdaList(config)
Arguments
config |
a list of variables for cloud configuration |
Value
A list of (json) files on the cloud
See Also
pda
Function to upload object to cloud as json
Description
Function to upload object to cloud as json
Usage
pdaPut(obj,name,config,upload_without_confirm=F,silent_message=F,digits=4)
Arguments
obj |
R object to encode as JSON and upload to the cloud |
name |
name of the file |
config |
a list of variables for cloud configuration |
upload_without_confirm |
logical. TRUE to upload silently, without interactive confirmation |
silent_message |
logical. TRUE to mute messages |
digits |
digits after decimal points in the output json files |
Value
NONE
See Also
pda
pda control synchronize
Description
update pda control if ready (run by lead)
Usage
pdaSync(config,upload_without_confirm,silent_message, digits)
Arguments
config |
cloud configuration |
upload_without_confirm |
logical. TRUE to upload silently, without interactive confirmation |
silent_message |
logical. TRUE to mute messages |
digits |
digits after decimal points in the output json files |
Value
control
See Also
pda
Function to perform all data processing and pooled stratified analysis
Description
Function to perform all data processing and pooled stratified analysis
Usage
run_pooled_analysis(data, outcome_id, outcome_time, sites)
Arguments
data |
The combined LATTE_ADRD data. |
outcome_id |
The name of the primary outcome column. |
outcome_time |
The name of the outcome time column (for Poisson). |
sites |
A vector of site identifiers. |
Value
A list containing the results of the standard logistic and Poisson pooled analysis.