BayLum provides a collection of R functions for the Bayesian analysis of luminescence data. This includes, amongst others, data import and export, and the application of age models and palaeodose models.
It is possible to process data for several samples simultaneously and to consider more than one BIN file per sample. Single-grain and multi-grain OSL measurements can be analysed. Stratigraphic constraints and systematic errors can be included in the analysis.
For those who already know how to use R, BayLum won’t be difficult to use.
To install BayLum, you first need to install R and RStudio, which provides a convenient desktop environment for working with R. In addition, you need to install JAGS, which is used for the Bayesian analysis of the models. Once in R (or in RStudio), you can type:
install.packages('BayLum')
at the R command prompt to install BayLum. If you then type:
library(BayLum)
it will load in all the BayLum functions.
The first step is to read the information from the BIN file.
Let us consider the sample named GDB3. All information concerning this sample (environmental dose, source dose, disc and position of grains, …; see the “What are the required files in each subfolder?” section in the help of the Generate_DataFile function) is contained in a subfolder named after the sample, GDB3, located in path.
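Before importing, it can help to confirm that the subfolder holds the expected files. The following is a minimal sketch in base R, assuming the single-grain file set named in this document (bin.BIN, DiscPos.csv, DoseEnv.csv, DoseSource.csv and rule.csv). The helper `missing_files` and the temporary demo folder are hypothetical and are not part of BayLum.

```r
# File set for a single-grain sample, as listed in this document
required_files <- c("bin.BIN", "DiscPos.csv", "DoseEnv.csv",
                    "DoseSource.csv", "rule.csv")

# Hypothetical helper (not part of BayLum): returns the names of the
# required files that are absent from <path><folder>/
missing_files <- function(path, folder) {
  present <- file.exists(paste0(path, folder, "/", required_files))
  required_files[!present]
}

# Demo on a throwaway folder that deliberately contains only two files
demo_path <- paste0(tempfile("baylum_demo_"), "/")
dir.create(paste0(demo_path, "GDB3"), recursive = TRUE)
file.create(paste0(demo_path, "GDB3/", c("bin.BIN", "rule.csv")))
still_missing <- missing_files(demo_path, "GDB3")
print(still_missing)
```

For multi-grain data, the document states that Disc.csv replaces DiscPos.csv, so `required_files` would need to be adapted accordingly.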
Then, you can use the function Generate_DataFile as follows:
path="path" # path (in quotes) to the folder containing the sample subfolder with bin.BIN and the associated .csv files
# Note that the character string given to the parameter "Path" must end with "/"
folder=c("GDB3") # give the name of the data folder
nbsample=1 # give the number of samples
DATA1=Generate_DataFile(Path=path,FolderNames=folder,Nb_sample=nbsample)
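Because the path must end with "/", a small guard can avoid a silent mistake. This is a minimal sketch in base R; `ensure_trailing_slash` is a hypothetical helper, not a BayLum function, and the folder name is a placeholder.

```r
# Hypothetical helper (not part of BayLum): append "/" only when it is missing
ensure_trailing_slash <- function(p) {
  if (grepl("/$", p)) p else paste0(p, "/")
}

path_without <- ensure_trailing_slash("my/data/folder")   # placeholder path
path_with    <- ensure_trailing_slash("my/data/folder/")  # already terminated
print(c(path_without, path_with))
```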
The user can save the output \(DATA1\) with the function save as follows:
save(DATA1,file=c(paste(path,"DATA1.RData",sep="")))
The user can then access this file with the function load:
load(file=c(paste(path,"DATA1.RData",sep="")))
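The save and load round trip can be tried without any BIN file. The sketch below uses a temporary directory and a small stand-in list; `example_data` and the file name are illustrative assumptions, not BayLum output.

```r
# Stand-in object; a real run would save the output of Generate_DataFile
example_data <- list(J = 101, K = 5)

# Temporary folder, terminated by "/" as in the convention used in this document
tmp_path <- paste0(tempdir(), "/")
rdata_file <- paste0(tmp_path, "example_data.RData")

save(example_data, file = rdata_file)
rm(example_data)                  # remove it to show that load() restores it

# load() returns the names of the restored objects
loaded_names <- load(rdata_file)
print(loaded_names)
```

Note that load restores the object under the name it had when it was saved, so no assignment is needed.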
To check that there is no problem, the user can inspect the information read from the BIN file and the associated .csv files as follows:
str(DATA1)
List of 9
$ LT :List of 1
..$ : num [1:101, 1:6] 5.66 6.9 4.05 3.43 4.97 ...
$ sLT :List of 1
..$ : num [1:101, 1:6] 0.373 0.315 0.245 0.181 0.246 ...
$ ITimes :List of 1
..$ : num [1:101, 1:5] 160 160 160 160 160 160 160 160 160 160 ...
$ dLab : num [1:2, 1] 1.53e-01 5.89e-05
$ ddot_env : num [1:2, 1] 2.26 0.0617
$ regDose :List of 1
..$ : num [1:101, 1:5] 24.6 24.6 24.6 24.6 24.6 ...
$ J : num 101
$ K : num 5
$ Nb_measurement: num 14
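The dimensions in this output can be cross-checked programmatically. The sketch below is built on a mock list that reproduces only the shapes printed above (the zeros are placeholders, not measurements). That the row counts equal \(J\) and that regDose and ITimes have \(K\) columns is read off this one output and is an assumption rather than a documented guarantee. `check_dims` is a hypothetical helper.

```r
# Mock object with the shapes shown in the str() output; values are placeholders
mock_data <- list(
  LT      = list(matrix(0, nrow = 101, ncol = 6)),
  sLT     = list(matrix(0, nrow = 101, ncol = 6)),
  ITimes  = list(matrix(0, nrow = 101, ncol = 5)),
  regDose = list(matrix(0, nrow = 101, ncol = 5)),
  J = 101,
  K = 5
)

# Hypothetical helper (not part of BayLum): compares matrix shapes with J and K
# for the sample with index s
check_dims <- function(d, s = 1) {
  mats <- list(d$LT[[s]], d$sLT[[s]], d$ITimes[[s]], d$regDose[[s]])
  c(
    rows_match_J         = all(sapply(mats, nrow) == d$J[s]),
    regDose_cols_match_K = ncol(d$regDose[[s]]) == d$K[s],
    ITimes_cols_match_K  = ncol(d$ITimes[[s]]) == d$K[s]
  )
}

dims_ok <- check_dims(mock_data)
print(dims_ok)
```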
DATA1 contains the information for the sample named GDB3. For a description of each element, we refer to the “Value” section in the help of the Generate_DataFile function.
You can also see the Lx/Tx values as a function of regenerative dose using the function LT_RegenDose:
LT_RegenDose(DATA=DATA1,Path=path,FolderNames=folder,
SampleNames=folder,Nb_sample=nbsample)
Note that here we consider only one sample, and the name of the folder is the name of the sample; that is why \(FolderNames=\) folder and \(SampleNames=\) folder.
For multi-grain OSL measurements, the user uses the Generate_DataFile_MG function instead of Generate_DataFile, with similar parameters. The difference lies in the associated file: Disc.csv is used instead of the DiscPos.csv file required for single-grain OSL measurements (we refer to the help of the Generate_DataFile_MG function).
The user can also plot the Lx/Tx values as a function of regenerative dose, for every selected aliquot and for each sample, using LT_RegenDose with the option \(SG=\)FALSE.
To compute the age of the sample GDB3, you can run the following code:
priorage=c(10,100) # GDB3 is an old sample
Age=Age_Computation(DATA=DATA1,SampleName="GDB3",
PriorAge=priorage,
distribution="cauchy", # Option on equivalent dose distribution
LIN_fit=TRUE,Origin_fit=FALSE, # Option on growth curves
Taille=10000)
If \(DATA1\) is the output of Generate_DataFile_MG, then the user can use Age_Computation exactly as in the previous example.
If the MCMC trajectories did not converge, you can increase the number of iterations with the parameter \(Taille\) in the function Age_Computation, for example \(Taille=20000\) or \(Taille=50000\).
Alternatively, you can specify a narrower prior distribution if this was not done before (for example \(PriorAge=\)c(0.01,10) for a young sample, or \(PriorAge=\)c(10,100) for an old sample).
If the trajectories still have not converged, you can check whether the choices of \(distribution\) and growth curve are appropriate.
By default, a saturating exponential plus linear growth curve is considered, but the user can choose other growth curves by changing the options \(LIN\_fit\) and \(Origin\_fit\) in the function. We refer to the user manual for more detail.
What we can recommend is to use … if you have…
By default, a Cauchy distribution is considered, but the user can choose another distribution by replacing cauchy with gaussian, lognormal_A or lognormal_M in the \(distribution\) option.
The difference between the lognormal_A and lognormal_M models is that the equivalent dose dispersion is distributed according to:
If you are in one of these cases: ….. , you can consider …..
The following two options save results directly to file.
\(SavePdf=TRUE\) saves a pdf file with the MCMC trajectories of the parameters \(A\) (age), \(D\) (palaeodose) and \(sD\) (equivalent dose dispersion). The user must specify \(OutputFileName\) and \(OutputFilePath\) to define the name and path of the pdf file.
\(SaveEstimates=TRUE\) saves a csv file containing the Bayes estimates, the credible intervals at 68% and 95%, and the Gelman and Rubin convergence diagnostic for the parameters \(A\), \(D\) and \(sD\). The user must specify \(OutputTableName\) and \(OutputTablePath\) to define the name and path of the csv file.
By default, an age between 0.01 ka and 100 ka is expected. If the user has more information on the sample age, \(PriorAge\) can be changed accordingly.
For example, if you know that the sample is old, you can choose \(PriorAge=c(10,120)\). In contrast, if you know that it is a young sample, you can choose \(PriorAge=c(0.001,10)\). Note that 0 cannot be used as the lower bound; you can use 0.001 instead.
Take care with this option, because \(PriorAge\) gives the lower and upper bounds of the estimated age. If the \(PriorAge\) interval is too narrow, it can bias the result.
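The constraints on \(PriorAge\) described above (two values, a strictly positive lower bound, lower bound below upper bound) can be checked before launching a long MCMC run. This is a minimal sketch; `check_prior_age` is a hypothetical helper, not a BayLum function.

```r
# Hypothetical helper (not part of BayLum): validates a single-sample PriorAge
check_prior_age <- function(prior_age) {
  stopifnot(is.numeric(prior_age), length(prior_age) == 2)
  if (prior_age[1] <= 0) {
    stop("the lower bound must be strictly positive, e.g. 0.001")
  }
  if (prior_age[1] >= prior_age[2]) {
    stop("the lower bound must be below the upper bound")
  }
  invisible(prior_age)
}

priorage <- check_prior_age(c(0.001, 10))   # accepted: young sample, in ka

# A lower bound of 0 is rejected
zero_bound_rejected <- inherits(
  try(check_prior_age(c(0, 10)), silent = TRUE),
  "try-error"
)
print(zero_bound_rejected)
```

The check says nothing about whether the interval is too narrow; that remains a judgment based on what is known about the sample.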
In the previous example we considered the simplest case: one sample, and one BIN file for this sample. But we can also consider several BIN files for one sample. To do this, you write in \(Names\) the names of the subfolders, each corresponding to a specific BIN file and all located in \(path\). For example \(Names=c("name\_binfile1","name\_binfile2")\). Then, you can call Generate_DataFile (or Generate_DataFile_MG) as follows:
nbsample=1
nbbinfile = length(Names)
Binpersample = c(length(Names))
DATA_BF=Generate_DataFile(Path=path,
FolderNames=Names,
Nb_sample=nbsample,
Nb_binfile=nbbinfile,
BinPerSample=Binpersample)
# Computation of age
Age=Age_Computation(DATA=DATA_BF,
SampleName="Names",
BinPerSample=Binpersample)
More precisely, the function Generate_DataFile (or Generate_DataFile_MG) can consider several samples simultaneously, and can consider more than one BIN file per sample.
Assume that we are interested in two samples named sample1 and sample2. In addition, we have two BIN files for the first sample, named sample1-1 and sample1-2, and one BIN file for sample2, named sample2-1. Then we must have 3 subfolders named sample1-1, sample1-2 and sample2-1; each subfolder contains one BIN file named bin.BIN and its associated files DiscPos.csv, DoseEnv.csv, DoseSource.csv and rule.csv. These 3 subfolders are located in path.
\(BinPerSample\) must then be filled in as follows: \[binpersample=c(\underbrace{2}_{\text{sample 1: 2 bin files}},\underbrace{1}_{\text{sample 2: 1 bin file}})\]
Names=c("sample1-1","sample1-2","sample2-1") # give the names of the data folders
nbsample=2 # give the number of samples
nbbinfile=3 # give the number of BIN files
binpersample=c(2,1) # give the number of BIN files for each sample
DATA=Generate_DataFile(Path=path,FolderNames=Names,
Nb_sample=nbsample,
Nb_binfile=nbbinfile,
BinPerSample=binpersample)
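The bookkeeping between the folder names, the number of BIN files and the number of BIN files per sample is easy to get wrong. The following is a minimal sketch in base R that derives the counts from \(binpersample\) and shows which folders are attributed to which sample. It assumes that the folders of one sample are listed consecutively, as in the example above.

```r
Names <- c("sample1-1", "sample1-2", "sample2-1")
binpersample <- c(2, 1)   # sample 1: 2 BIN files, sample 2: 1 BIN file

# Derive the counts instead of typing them by hand
nbsample  <- length(binpersample)
nbbinfile <- sum(binpersample)

# The total number of BIN files must match the number of folders
stopifnot(nbbinfile == length(Names))

# Group the folder names by sample index (assumes consecutive listing)
folders_by_sample <- split(Names, rep(seq_len(nbsample), times = binpersample))
print(folders_by_sample)
```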
If the user has already saved the information obtained with the Generate_DataFile function (or the Generate_DataFile_MG function) in .RData files, the data can be concatenated with the function Concat_DataFile.
For example, if DATA1 is the output for the sample named “GDB3”, and DATA2 is the output for “GDB5”:
data("DATA1",envir = environment())
data("DATA2",envir = environment())
DATA3=Concat_DataFile(L1=DATA2,L2=DATA1)
str(DATA3)
List of 9
$ LT :List of 2
..$ : num [1:188, 1:6] 4.54 2.73 2.54 2.27 1.48 ...
..$ : num [1:101, 1:6] 5.66 6.9 4.05 3.43 4.97 ...
$ sLT :List of 2
..$ : num [1:188, 1:6] 0.333 0.386 0.128 0.171 0.145 ...
..$ : num [1:101, 1:6] 0.373 0.315 0.245 0.181 0.246 ...
$ ITimes :List of 2
..$ : num [1:188, 1:5] 40 40 40 40 40 40 40 40 40 40 ...
..$ : num [1:101, 1:5] 160 160 160 160 160 160 160 160 160 160 ...
$ dLab : num [1:2, 1:2] 1.53e-01 5.89e-05 1.53e-01 5.89e-05
$ ddot_env : num [1:2, 1:2] 2.512 0.0563 2.26 0.0617
$ regDose :List of 2
..$ : num [1:188, 1:5] 6.14 6.14 6.14 6.14 6.14 6.14 6.14 6.14 6.14 6.14 ...
..$ : num [1:101, 1:5] 24.6 24.6 24.6 24.6 24.6 ...
$ J : num [1:2] 188 101
$ K : num [1:2] 5 5
$ Nb_measurement: num [1:2] 14 14
DATA3 contains the information for the samples named GDB5 and GDB3, so each element is expected to hold two entries, one per sample.
It is possible to concatenate data from single-grain OSL measurements and from multi-grain OSL measurements.
You can also see the Lx/Tx values as a function of the regenerative doses for each sample as follows:
LT_RegenDose(DATA=DATA3,Path=path,FolderNames=Names,
Nb_sample=nbsample,SG=rep(TRUE,nbsample))
As DATA3 contains information on the samples GDB5 and GDB3, which are both single-grain OSL measurements, \(SG[1]=SG[2]=TRUE\). If one sample (whose ID number is equal to \(i\)) is a multi-grain OSL measurement, the user sets \(SG[i]=\)FALSE in the \(SG\) option.
If there is no stratigraphic constraint, you can run the following code to analyse the ages of the samples GDB5 and GDB3 simultaneously.
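As a small illustration of how the \(SG\) vector is built, the sketch below starts from the all single-grain case and then flags one sample as multi-grain. The choice \(i=2\) is purely hypothetical; in this document both samples are single-grain.

```r
nbsample <- 2

# All samples single-grain: one TRUE per sample
SG <- rep(TRUE, nbsample)

# Hypothetical case: suppose the sample with ID number i were multi-grain
i <- 2
SG[i] <- FALSE
print(SG)
```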
priorage=c(1,10,10,100) # see remark to have more information on this parameter
Age=AgeS_Computation(DATA=DATA3,Nb_sample=2,SampleNames=c("GDB5","GDB3"),
PriorAge=priorage,
distribution="cauchy", # Option on equivalent dose distribution
LIN_fit=TRUE,Origin_fit=FALSE, # Option on growth curves
Iter=10000)
If MCMC trajectories did not converge, we refer to remark 0 in Section 1.3.
The options detailed in remarks 1, 2 and 3 of Section 1.3 are still available for this function.
As for the function Age_Computation, the age of each sample is by default between 0.01 ka and 100 ka. If you have more information on your samples, it is possible to change the \(PriorAge\) parameter. \(PriorAge\) is a vector of size 2*\(Nb\_sample\): the first two values of \(PriorAge\) concern the first sample, and so on.
For example, if you know that the sample named GDB5 is a young sample whose age is between 0.01 ka and 10 ka, and that GDB3 is an old sample whose age is between 10 ka and 100 ka, \[PriorAge=c(\underbrace{0.01,10}_{GDB5\ prior\ age},\underbrace{10,100}_{GDB3\ prior\ age})\]
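One way to keep the pairs of bounds aligned with the sample names is to store them per sample and flatten them afterwards. This is a minimal sketch in base R; the list `prior_ranges` is an illustrative construct, not something BayLum requires.

```r
# One (lower, upper) pair per sample, in ka, in the same order as SampleNames
prior_ranges <- list(
  GDB5 = c(0.01, 10),   # young sample
  GDB3 = c(10, 100)     # old sample
)

SampleNames <- names(prior_ranges)

# Flatten to the vector layout described above: lower1, upper1, lower2, upper2
priorage <- unlist(prior_ranges, use.names = FALSE)

# The vector must hold exactly two values per sample
stopifnot(length(priorage) == 2 * length(SampleNames))
print(priorage)
```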
With the function AgeS_Computation it is possible to take into account stratigraphic relations between samples.
For example, we know that the age of GDB5 is expected to be lower than the age of GDB3.
To take stratigraphic constraints into account, the information on the samples must be ordered.
Either you enter the sample names (corresponding to the subfolder names) in \(Names\), passed to the function Generate_DataFile, in order of increasing age; or you enter the saved .RData information of each sample in Concat_DataFile, in order of increasing age.
# using Generate_DataFile function
Names=c("GDB5","GDB3")
nbsample=2
DATA3=Generate_DataFile(Path=path,FolderNames=Names,Nb_sample=nbsample)
# using Concat_DataFile function
data(DATA1,envir = environment()) # .RData on sample GDB3
data(DATA2,envir = environment()) # .RData on sample GDB5
DATA3=Concat_DataFile(L1=DATA2,L2=DATA1) # GDB5 (younger) first, then GDB3
Let SC be the matrix containing all information on the stratigraphic relations between these two samples. This matrix is defined as follows:
the size of the matrix: the number of rows of the \(StratiConstraints\) matrix is equal to \(Nb\_sample+1\), and the number of columns is equal to \(Nb\_sample\).
the first row of the matrix: for all i in {1,…,\(Nb\_sample\)}, \(StratiConstraints[1,i]=1\), which means that the lower bound of the sample age, given in \(PriorAge[2i-1]\) for the sample whose ID number is equal to \(i\), is taken into account.
the sample relations: for all j in {2,…,\(Nb\_sample\)+1} and all i in {j,…,\(Nb\_sample\)}, \(StratiConstraints[j,i]=1\) if the age of the sample whose ID number is equal to j-1 is lower than the age of the sample whose ID number is equal to i. Otherwise, \(StratiConstraints[j,i]=0\).
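The three rules above can be turned into a few lines of base R. The sketch below covers only the special case in which all samples are fully ordered by increasing age, so that every sample is younger than all samples listed after it. `build_strati_matrix` is a hypothetical helper, not a BayLum function.

```r
# Hypothetical helper (not part of BayLum): stratigraphic matrix for samples
# that are fully ordered by increasing age
build_strati_matrix <- function(nb_sample) {
  sc <- matrix(0, nrow = nb_sample + 1, ncol = nb_sample)
  sc[1, ] <- 1                      # first row: prior lower bounds are used
  for (s in seq_len(nb_sample)) {   # s is the sample ID number, stored in row s + 1
    if (s < nb_sample) {
      sc[s + 1, (s + 1):nb_sample] <- 1   # sample s is younger than samples s+1, ...
    }
  }
  sc
}

SC_two <- build_strati_matrix(2)
print(SC_two)
```

For partially ordered samples, individual entries would have to be set by hand according to the third rule.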
The user can use the function SCMatrix to define this matrix:
SC=SCMatrix(Nb_sample=2,SampleNames=c("GDB5","GDB3"))
GDB5 is younger than sample GDB3? 1 for TRUE or 0 for FALSE -->
1
[1] 1
In our case, with 2 samples, SC is a matrix with 3 rows and 2 columns. The first row contains \(c(1,1)\) (because we take the prior ages into account), the second row contains \(c(0,1)\) (because sample 2, named GDB3, is supposed to be older than sample 1, named GDB5) and the third row contains \(c(0,0)\) (because sample 2, named GDB3, is not younger than sample 1, named GDB5). We can therefore also fill the matrix of stratigraphic relations as follows:
(SC=matrix(data=c(1,1,0,1,0,0),ncol=2,nrow = (2+1),byrow = TRUE))
[,1] [,2]
[1,] 1 1
[2,] 0 1
[3,] 0 0
Age=AgeS_Computation(DATA=DATA3,Nb_sample=2,SampleNames=c("GDB5","GDB3"),
PriorAge=priorage,
distribution="cauchy", # Option on equivalent dose distribution
LIN_fit=TRUE,Origin_fit=FALSE, # Option on growth curves
StratiConstraints=SC, # Stratigraphic relations
Taille=10000)
The models used in this package are detailed in the following publication:
For more details on the convergence diagnostics of Markov chains:
For more details on the data used in the examples, we refer to the following publication: