1. Optimum Contribution Selection

Robin Wellmann

2017-03-23

The aim of optimum contribution selection is to find the optimum number of offspring for each breeding animal and to determine if a young animal (a selection candidate) should be selected for breeding or not. This is done in an optimal way, i.e. in a way that ensures that genetic gain is achieved, and that genetic diversity and genetic originality of the population are maintained or recovered. It can be based either on pedigree data or on marker data, whereby the latter approach is recommended. It requires that this data is available for all selection candidates, or, at least from a large sample of selection candidates.

Even if the frequency of use of breeding animals is not regulated by the breeding organization, running the optimization still provides valuable information for a breeder, as the animals with highest optimum contributions are most valuable for a breeding program.

This vignette is organized as follows:

Example Data Set

All evaluations using marker data are demonstrated at the example of cattle data included in the package. This multi-breed data has already been described in the companion vignette for basic marker-based evaluations.

Data frame phen includes the phenotypic information and has columns Indiv (individual IDs), Born (year of birth), Breed (breed name), BV (breeding values), and Sex (sexes). Sexes are recoded as 1 for males and 2 for females.

library("optiSel")
data(Cattle)
head(Cattle)
##                           Indiv Born  Breed         BV Sex
## 276000101676415 276000101676415 1991 Angler -1.0706066   1
## 276000108612636 276000108612636 1994 Angler -0.3362574   2
## 276000102372349 276000102372349 1986 Angler -2.0735649   2
## 276000102379430 276000102379430 1987 Angler  1.5968307   1
## 276000108826036 276000108826036 1994 Angler  1.0023969   1
## 276000111902076 276000111902076 1998 Angler -0.2426676   1

The data frame contains information on the 4 breeds Angler, Fleckvieh, Holstein, Rotbunt. The “Angler” is an endangered German cattle breed, which had been upgraded with Red Holstein (also called “Rotbunt”). The Rotbunt cattle are a subpopulation of the “Holstein” breed. The “Fleckvieh” or Simmental breed is unrelated to the Angler. The Angler cattle are the selection candidates.

This small example data set contains only genotypes from the first parts of the first two chromosomes. Vector GTfiles defined below contains the names of the genotype files. There is one file for each chromosome. Data frame map contains the marker map and has columns Name (marker name), Chr (chromosome number), Position, kb (position in kilo base pairs), and cM (position in centiMorgan):

data(map)
dir     <- system.file("extdata", package="optiSel")
GTfiles <- file.path(dir, paste("Chr", unique(map$Chr), ".phased", sep=""))
head(map)
##                                    Name Chr Position      kb cM
## ARS-BFGL-NGS-16466   ARS-BFGL-NGS-16466   1   267940 267.940  0
## ARS-BFGL-NGS-98142   ARS-BFGL-NGS-98142   1   471078 471.078  0
## ARS-BFGL-NGS-114208 ARS-BFGL-NGS-114208   1   533815 533.815  0
## ARS-BFGL-NGS-65067   ARS-BFGL-NGS-65067   1   883895 883.895  0
## ARS-BFGL-BAC-32722   ARS-BFGL-BAC-32722   1   929617 929.617  0
## ARS-BFGL-BAC-34682   ARS-BFGL-BAC-34682   1   950841 950.841  0

Introductory Example: Traditional OCS

As an introductory example you may run a traditional OCS with marker based kinship matrices. All alternative approaches involve the same steps, so it is recommended to read this section even if you want to minimize inbreeding instead of maximizing genetic gain. The following steps are involved:

Compute the kinships that are to be managed and put them into a list with function kinlist. Below, the kinship is named sKin, which is a shorthand for segment based kinship.

Kin  <- kinlist(sKin=segIBD(GTfiles, map))

Define a data frame containing only the phenotypes of the selection candidates. Make sure that there is one column for each trait that should be improved.

phen <- Cattle[Cattle$Breed=="Angler",]

Display the objective functions and constraints that are available for your data with function help.opticont:

help.opticont(Kin, phen)
## Available objective functions: 
##        min.sKin  
##        min.BV  
##        max.BV  
## 
## Available constraints: 
##        ub.sKin  
##        lb.BV  
##        ub.BV  
##        eq.BV  
##        ub  lb
## 
## Attention: Probably not all these objective functions and
##            constraints make sense in animal breeding

For numeric columns in data frame phen the possibility is provided to define an upper bound (prefix ub), a lower bound (prefix lb), an equality constraint (prefix eq), or to minimize (prefix min) or to maximize (prefix max) the weighted sum of the values, whereby the weights are the contributions of the selection candidates. If the column contains breeding values, then this is the expected mean breeding value in the offspring.

For each kinship named by function kinlist, the possibility is provided to define an upper bound for the expected mean value of the kinship in the offspring (prefix ub), or to minimize the value (prefix min).

Constraints ub and lb allow to define upper bounds and lower bounds for the contributions of the selection candidates.

Now choose the parameters you want to restrict and the parameters you want to optimize. For traditional OCs the objective is to maximize genetic gain with method max.BV, and to restrict the mean kinship in the offspring by defining constraint ub.sKin.

Create an empty list for the constraints:

con <- list()

and put the constraints into the list. To define upper bounds for the contributions of the selection candidates you may define component ub as

con$ub <- c(M=NA, F=-1)

In this case, ub is a named vector with two components, which define the upper bounds for the contributions of males and females. The value F=-1 means that females have equal contributions, so optimization will be done only for the males, and M=NA means that there is no upper bound for the contributions of males. Alternatively, different lower bounds and upper bounds could be defined for each selection candidate (see the help page of function opticont).

To define an upper bound for the mean kinship sKin in the offspring, put component ub.sKin into the list. In general, if an upper bound for a kinship \(K\) should be defined, it is recommended to derive the threshold value from the desired effective size \(N_e\) of the population by the formula \[ub.K=\overline{K}+(1-\overline{K})\Delta F,\] where \(\overline{K}\) is the mean kinship in the actual population, and \(\Delta F=\frac{1}{2 N_e}\). The critical effective size, i. e. the size below which the fitness of the population steadily decreases, depends on the population and is usually between 50 and 100. But there seems to be a consensus that 50-100 is a long-term viable effective size. To be on the safe side, an effective size of \(N_e=100\) should be envisaged (T H E Meuwissen 2009). Thus, the constraint is defined as

Ne <- 100
meanKin     <- mean(Kin$sKin[phen$Indiv, phen$Indiv])
con$ub.sKin <- meanKin + (1-meanKin)*(1/(2*Ne))

Now the optimum contributions of the selection candidates can be calculated:

maxBV <- opticont(method="max.BV", K=Kin, phen=phen, con=con, trace=FALSE)
## Objective: maximizing mean BV of the offspring.
## Constraints:
##   Mean kinship sKin in the offspring is not exceeding ub.sKin.
##   Minimum contribution of males to offspring is defined.
##   Minimum contribution of females to offspring is defined
##   Number of offspring of males is not limited.
##     (Thus, the maximum contribution per male to the offspring is 0.5).
##   All females have equal contributions to the offspring.
##     (Thus, no optimization is done for the females).
##   The total genetic contribution of   males to offspring is 0.5.
##   The total genetic contribution of females to offspring is 0.5.
## Computing Cholesky decomposition for  sKin ...finished
## 
## Using solver cccp with parameters: trace=0, abstol=1e-05, feastol=1e-05, stepadj=0.9, maxiters=100, reltol=1e-06, beta=0.5

Check if the results fulfill the constraints and look at the value of the objective function:

maxBV.s <- summary(maxBV)
## Checking constraints:
##   min(oc) >= 0           : TRUE
##   total male cont   = 0.5: TRUE
##   total female cont = 0.5: TRUE
##   females have equal cont: TRUE
##   all male cont <= ub    : TRUE
##   mean sKin <= ub.sKin       : TRUE
## 
maxBV.s$obj.fun
## [1] 1.039011

The results look OK. If they are not OK, then try to use another solver. The solver can be specified in parameter solver of function opticont. Available solvers are "alabama", "cccp", "cccp2", "csdp", and "slsqp". The default is "cccp". The solvers are described in the help page of function opticont. Alternatively you may use the same solver but with different tuning parameters. The available paramters are displayed if the function opticont is called (as shown above).

The optimum contributions of the selection candidates are in component parent:

Candidate <- maxBV$parent[,  c("Indiv", "Sex", "oc")]
head(Candidate[rev(order(Candidate$oc)),])
##                           Indiv Sex         oc
## 276000121507437 276000121507437   1 0.06638186
## 276000121243787 276000121243787   1 0.06534406
## 276000120949468 276000120949468   1 0.06239349
## 276000120061822 276000120061822   1 0.06043548
## 276000110948577 276000110948577   1 0.05580201
## 276000113913566 276000113913566   1 0.04061777

The optimum numbers of offspring can be obtained from the optimum contributions and the size N of the offspring population with function noffspring:

Candidate$nOff <- noffspring(Candidate, N=250)$nOff
head(Candidate[rev(order(Candidate$oc)),])
##                           Indiv Sex         oc nOff
## 276000121507437 276000121507437   1 0.06638186   33
## 276000121243787 276000121243787   1 0.06534406   33
## 276000120949468 276000120949468   1 0.06239349   31
## 276000120061822 276000120061822   1 0.06043548   30
## 276000110948577 276000110948577   1 0.05580201   28
## 276000113913566 276000113913566   1 0.04061777   20

Defining the Objective of a Breeding Program

The objective of a breeding program depends on several factors. These are the intended use of the breed, the presence of historic bottlenecks, and the importance being placed on the maintenance of genetic originality. In most livestock breeds the focus is on increasing the economic merit, so the objective of the breeding program is to maximize genetic gain. In contrast, companion animals often suffer from historic bottlenecks due to an overuse of popular sires. Hence, in these breeds the objective is to minimize inbreeding. In endangered breeds, which get subsidies for conservation, the focus may be on increasing their conservation values by recovering the native genetic background or by increasing the diversity between breeds.

However, these are conflicting objectives: To maximize genetic gain, the animals with highest breeding values would be used for breeding, which may create a new bottleneck and contribute to inbreeding depression. Maximizing genetic gain would also favor the use of animals with high genetic contributions from commercial breeds because these animals often have the highest breeding values. But this would reduce the genetic originality of the breed. Minimizing inbreeding in the offspring favors the use of animals with high contributions from other breeds because they have low kinship with the population and it may require the use of outcross animals with breeding values below average.

Thus, focussing on only one aspect automatically worsens the other ones. This can be avoided by imposing constraints on the aspects that are not optimized.

In general, best practice is genotying all selection candidates to enable marker based evaluations. A breeding program based on marker information is more efficient than a breeding program based only on pedigree information, provided that the animals are genotyped for a sufficient number of markers. For several species, however, genotyping is still too expensive, so the breeding programs rely only on pedigree information.

Depending on what the objective of your breeding program is, you may continue reading at the appropriate section:

Marker-based OCS

The required genotype file format, the marker map, the parameters minSNP, minL, unitL, unitP, and ubFreq, which are used for estimating the segment based kinship, the kinships at native haplotype segments, and the breed composition, have been described in the companion vignette for basic marker-based evaluations.

A matrix containing the segment based kinship between all pairs of individuals can be computed with function segIBD, whereas the kinships at native haplotype segments can be calculated from the results of function segIBDatN. Both kinships are computed below and combined into a single R-object with function kinlist. Below, the kinship at native haplotype segments is named sKinatN:

Kin  <- kinlist(
            sKin    = segIBD(GTfiles, map, minSNP=20, minL=1000), 
            sKinatN = segIBDatN(GTfiles, Cattle, map, thisBreed="Angler", ubFreq=0.01, minL=1000)
            )

The breed composition of individuals can be estimated with function segBreedComp. The migrant contributions MC of the Angler cattle are added as an additional column to data frame Cattle.

wdir  <- file.path(tempdir(), "HaplotypeEval")
wfile <- haplofreq(GTfiles, Cattle, map, thisBreed="Angler", minSNP=20, minL=1000, w.dir=wdir)
Comp  <- segBreedComp(wfile$match, map)
Cattle[rownames(Comp), "MC"] <- 1 - Comp$native
head(Cattle[,-1])
##                 Born  Breed         BV Sex        MC
## 276000101676415 1991 Angler -1.0706066   1 0.4194310
## 276000108612636 1994 Angler -0.3362574   2 0.4312843
## 276000102372349 1986 Angler -2.0735649   2 0.2581902
## 276000102379430 1987 Angler  1.5968307   1 0.5954559
## 276000108826036 1994 Angler  1.0023969   1 0.7760017
## 276000111902076 1998 Angler -0.2426676   1 0.7541481

There are two functions that can now be used to perform the optimization, which are opticont and opticont4mb. The latter is used especially if the aim is to decrease the average kinship in a multi-breed population. The parameters that can be constrained or optimized can be viewed with functions help.opticont

help.opticont(Kin, phen=Cattle)
## Available objective functions: 
##        min.sKin  min.segIBDandN  min.sKinatN  
##        min.BV  min.MC  
##        max.BV  max.MC  
## 
## Available constraints: 
##        ub.sKin  ub.segIBDandN  ub.sKinatN  
##        lb.BV  lb.MC  
##        ub.BV  ub.MC  
##        eq.BV  eq.MC  
##        ub  lb
## 
## Attention: Probably not all these objective functions and
##            constraints make sense in animal breeding

Compared to the introductory example the possibility to restrict or to minimize migrant contributions becomes available because column MC was added to data frame Cattle. Additionally, the possibility to minimize or to restrict the kinship at native segments sKinatN becomes available since this kinship was also defined in function kinlist.

The options available for function opticont4mb are displayed with help.opticont4mb:

help.opticont4mb(Kin, phen=Cattle)
## Available objective functions: 
##        min.sKin.acrossBreeds  min.sKin  min.segIBDandN  min.sKinatN  
##        min.BV  min.MC  
##        max.BV  max.MC  
## 
## Available constraints: 
##        ub.sKin.acrossBreeds  ub.sKin  ub.segIBDandN  ub.sKinatN  
##        lb.BV  lb.MC  
##        ub.BV  ub.MC  
##        eq.BV  eq.MC  
##        ub  lb
## 
## Attention: Probably not all these objective functions and
##            constraints make sense in animal breeding

Here, the additional possibility becomes available to minimize or to restrict the mean kinship sKin in a core set consisting of individuals from several breeds. The offspring of the selection candidates is included in the core set. This can be done by using method min.sKin.acrossBreeds in function opticont4m or by defining the constraint ub.sKin.acrossBreeds.

Depending on what the objective of your breeding program is, you may continue reading at the appropriate section:

Maximize Genetic Gain

Fist we define a data frame containing only the phenotypes of the selection candidates and create a list of constraints:

phen        <- Cattle[Cattle$Breed=="Angler",]
con         <- list(ub=c(M=NA, F=-1))
meanKin     <- mean(Kin$sKin[phen$Indiv, phen$Indiv])
con$ub.sKin <- meanKin + (1-meanKin)*(1/(2*Ne))

Again, equal contributions are assumed for the females and only the contributions of males are to be optimized. The upper bound for the mean segment based kinship was derived from the effective population size as explained above. Now the optimum contributions of the selection candidates can be calculated:

maxBV   <- opticont(method="max.BV", K=Kin, phen=phen, con=con, trace=FALSE)
maxBV.s <- summary(maxBV)
maxBV.s[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##       valid   meanBV    meanMC      sKin    sKinatN
## maxBV  TRUE 1.039011 0.6398001 0.0615719 0.08159881

The results are the same as in the introductory example (as expected). This approach may be apppropriate for a population without introgression, but for populations with historic introgression, the kinship at native alleles should be restricted as well in accordance with the desired effective size, and the migrant contributions should be restricted in order not to increase. Otherwise the genetic originality of the breed may get lost in the long term.

meanKinatN     <- mean(Kin$segIBDandN)/mean(Kin$segN)
con$ub.sKinatN <- meanKinatN +(1-meanKinatN)*(1/(2*Ne))
con$ub.MC      <- mean(phen$MC)
maxBV2         <- opticont(method="max.BV", K=Kin, phen=phen, con=con, solver="slsqp")
maxBV2.s       <- summary(maxBV2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(maxBV.s, maxBV2.s)
Results[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##        valid    meanBV    meanMC       sKin    sKinatN
## maxBV   TRUE 1.0390112 0.6398001 0.06157190 0.08159881
## maxBV2  TRUE 0.9208328 0.6146568 0.05913621 0.07161696

Since migrant contributions and breeding values are positively correlated, the genetic gain decreases slightly when migrant contributions are constrained not to increase.

Minimize Inbreeding

Minimizing inbreeding means to minimize the average kinship of the population in order to enable breeders to avoid inbreeding. This is the appropriate approach e.g. for companion animals suffering from a historic bottleneck. It can be done with or without accounting for breeding values. In the example below no breeding values are considered since accurate breeding values are not available for most of these breeds.

First we define a data frame containing only the phenotypes of the selection candidates and create a list of constraints:

phen <- Cattle[Cattle$Breed=="Angler",]
con  <- list(ub=c(M=NA, F=-1))

Again, equal contributions are assumed for the females and only the contributions of males are to be optimized. The segment based kinship is not constrained in this example because it should be minimized.

minKin   <- opticont(method="min.sKin", K=Kin, phen=phen, con=con, trace=FALSE)
minKin.s <- summary(minKin)
minKin.s[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##        valid     meanBV    meanMC       sKin    sKinatN
## minKin  TRUE -0.3326149 0.5445139 0.04216327 0.05524461

Minimizing kinship without constraining the mean breeding value decreases the mean breeding value in the offspring slightly because the individuals with high breeding values are related. For this breed, it also decreases the migrant contribution because individuals from other breeds were related.

While in livestock breeds the migrant contributions should be restricted in order to maintain the genetic originality of the breeds, in several companion breeds the opposite is true. Several companion breeds have high inbreeding coefficients and descend from only very few (e.g. 3) founders (Wellmann and Pfeiffer 2009), and purging seems to be not feasible. Hence, a sufficient genetic diversity of the population cannot be achieved in the population even if marker data is used to minimize inbreeding. For these breeds it may be appropriate to use unrelated individuals from a variety of other breeds in order to increase the genetic diversity. However, only a small contribution from other breeds is needed, so the migrant contributions should be restricted also for these breeds in order to preserve their genetic originality. Hence, the difference between a breed with high diversity and a breed with low diversity suffering from inbreeding depression is, that the optimum value for the migrant contribution is larger than 0 for the latter.

For such a breed it is advisable to allow the use of individuals from other breeds but to restrict the admissible mean contribution from other breeds in the population. The mean kinship at native alleles should be restricted as well to require only a small amount of introgression:

con$ub.MC      <- 0.50
meanKinatN     <- mean(Kin$segIBDandN)/mean(Kin$segN)
con$ub.sKinatN <- meanKinatN +(1-meanKinatN)*(1/(2*Ne))
minKin2        <- opticont(method="min.sKin", K=Kin, phen=phen, con=con, solver="slsqp", trace=FALSE)
minKin2.s      <- summary(minKin2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(minKin.s, minKin2.s)
Results[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##         valid     meanBV    meanMC       sKin    sKinatN
## minKin   TRUE -0.3326149 0.5445139 0.04216327 0.05524461
## minKin2  TRUE -0.5032611 0.5000000 0.04465374 0.06431074

Recover the Native Genetic Background

For endangered breeds the priority of a breeding program could be to recover the original genetic background by minimizing migrant contributions. However, since the individuals with smallest migrant contributions are related, this may considerably increase the inbreeding coefficients if the diversity at native alleles is not preserved. Hence, constraints are defined below not only for the segment based kinship but also for the kinship at native segments in accordance with the desired effective size:

phen           <- Cattle[Cattle$Breed=="Angler",]
con            <- list(ub=c(M=NA, F=-1))
meanKin        <- mean(Kin$sKin[phen$Indiv, phen$Indiv])
meanKinatN     <- mean(Kin$segIBDandN)/mean(Kin$segN)
con$ub.sKin    <- meanKin    + (1-meanKin)*(1/(2*Ne))
con$ub.sKinatN <- meanKinatN + (1-meanKinatN)*(1/(2*Ne))
minMC   <- opticont(method="min.MC", K=Kin, phen=phen, con=con, solver="slsqp", trace=FALSE)
minMC.s <- summary(minMC)
minMC.s[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##       valid     meanBV    meanMC       sKin    sKinatN
## minMC  TRUE -0.5571579 0.4807247 0.04941089 0.07161696

For this breed, minimizing migrant contributions results in negative genetic gain because migrant contributions and breeding values are positively correlated. This can be avoided by adding an additional constraint for the breeding values:

con$lb.BV <- mean(phen$BV)
minMC2    <- opticont(method="min.MC", K=Kin, phen=phen, con=con, solver="cccp", trace=FALSE)
minMC2.s  <- summary(minMC2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(minMC.s, minMC2.s)
Results[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN")]
##        valid      meanBV    meanMC       sKin    sKinatN
## minMC   TRUE -0.55715788 0.4807247 0.04941089 0.07161696
## minMC2  TRUE  0.02702478 0.4997468 0.04941060 0.07143108

Increase Diversity Between Breeds

While removing introgressed genetic material from the population is one possibility to increase the conservation value of an endangered breed, an alternative approach is to increase the genetic distance between the endangered breed and commercial breeds. In this case we do not care about whether alleles are native or not. We just want to accumulate haplotype segments which are rare in commercial breeds. This can be done with a core set approach.

In the core set approach, a hypothetical subdivided population is considered, consisting of individuals from various breeds. This population is called the core set. The individuals from the breed of interest are the selection candidates. The contributions of each breed to the core set are such that the genetic diversity of the core set is maximized. The parameter to be minimized is the mean kinship of individuals from the core set in the next generation. Thereby, it is assumed that the contributions of the selection candidates from the breed of interest are optimized, whereas individuals from all other breeds have equal contributions.

If the contributions of the selection candidates minimize the mean kinship in the core set, then they maximize the genetic diversity of the core set. This could be achieved by increasing the gentic diversity within the breed or by increasing the genetic distance between the breed of interest and the other breeds.

First, the contributions of the breeds to the core set are calculated with function opticomp. A lower bound is specified for the breed of interest:

CoreSet <- opticomp(Kin$sKin, Cattle$Breed, lb=c(Angler=0.1))
CoreSet$bc
##     Angler  Fleckvieh   Holstein    Rotbunt 
## 0.47908531 0.42425265 0.04434691 0.05231513

Second, the constraints are defined:

con         <- list(ub=c(M=NA, F=-1))
meanKin     <- mean(Kin$sKin[phen$Indiv, phen$Indiv])
con$ub.sKin <- meanKin  + (1-meanKin)*(1/(2*Ne))

Equal contributions are assumed for the females and only the contributions of males are to be optimized. The upper bound for the mean segment based kindship was derived from the effective population size as explained above. Now the optimum contributions of the selection candidates can be calculated with fumction opticont4mb:

minKin4mb <- opticont4mb("min.sKin.acrossBreeds", Kin, phen=Cattle, CoreSet$bc, 
                       thisBreed="Angler", con=con, trace=FALSE)
minKin4mb.s <- summary(minKin4mb)
minKin4mb.s[,c("valid", "meanBV", "meanMC", "sKin", "sKinatN", "sKin.acrossBreeds")]
##           valid     meanBV    meanMC      sKin    sKinatN
## minKin4mb  TRUE -0.2754905 0.5364331 0.0429423 0.05805913
##           sKin.acrossBreeds
## minKin4mb        0.03014492

Note that parameter phen of function opticont4mb is a data frame containing individuals from all genotyped breeds. In contrast, parameter phen of function opticont contains only the selection candidates.

Pedigree-based OCS

All evaluations using pedigree data are demonstrated at the example of the Hinterwald cattle. A pedigree is contained in the package. The pedigree and the functions dealing with pedigree data have already been described in the companion vignette for basic pedigree-based evaluations.

The pedigree completeness is an important factor to get reliable results. If an animal has many missing ancestors, then it would falsely considered to be unrelated to other animals, so it will falsely obtain high optimum contributions. There are several approaches to overcome this problem:

Of course, all 3 approaches can be followed simultaneously. First, we prepare the pedigree and classify the breed of founders born after 1970 to be "unknown":

data(PedigWithErrors)
Pedig <- prePed(PedigWithErrors, thisBreed="Hinterwaelder", lastNative=1970)
head(Pedig)
##                             Indiv Sire  Dam Sex   Breed Born BV
## S276000890888469 S276000890888469 <NA> <NA>   1 unknown   NA NA
## S276000891160479 S276000891160479 <NA> <NA>   1 unknown   NA NA
## S276000811209491 S276000811209491 <NA> <NA>   1 unknown   NA NA
## S276000811063904 S276000811063904 <NA> <NA>   1 unknown   NA NA
## S276000892147965 S276000892147965 <NA> <NA>   1 unknown   NA NA
## S276000891895524 S276000891895524 <NA> <NA>   1 unknown   NA NA

Then we define the individuals with offspring born between 2006 and 2007 with breeding values to be selection candidates if their number of equivalent complete generations is at least 3.0. These are the individuals contained in vector keep. In a real application the individuals would be used as selection candidates that could become parents of the forthcomming birth cohorts.

Summary <- summary(Pedig, keep=Pedig$Born %in% (2006:2007) & !is.na(Pedig$BV))
keep    <- Summary[Summary$equiGen>=5.0, "Indiv"]
table(Pedig[keep, "Sex"])
## 
##   1   2 
## 106 309

A matrix containing the pedigree based kinship between all pairs of individuals can be computed with function pedIBD. It is half the additive relationship matrix. The pedigree based kinship at native alleles can be calculated from the results of function pedIBDatN. Both kinships are computed below and combined into a single R-object with function kinlist. Below, the pedigree based kinship is named pKin, and the kinship at native alleles is named pKinatN:

Kin <- kinlist(
    pKin    = as(pedIBD(Pedig, keep.only=keep), "matrix"),
    pKinatN = pedIBDatN(Pedig, thisBreed="Hinterwaelder", keep.only=keep)
)
## Number of Migrant Founders: 335
## Number of Native  Founders: 166
## Individuals in Pedigree   : 2811
## Mean kinship at native alleles:  0.0774

The breed composition of individuals can be estimated with function pedBreedComp. The migrant contributions MC of the Angler cattle are added as an additional column to the pedigree.

cont     <- pedBreedComp(Pedig, thisBreed="Hinterwaelder")
Pedig$MC <- 1-cont$native
head(cont[keep, 2:6])
##                    native   unknown      unbek0  Fleckvieh Vorderwaelder
## 276000813616211 0.7685547 0.0312500 0.041503906 0.08837891    0.07031250
## 276000813369196 0.5747070 0.0156250 0.048583984 0.11108398    0.25000000
## 276000813151807 0.6916504 0.0156250 0.035888672 0.09667969    0.16015625
## 276000813677682 0.4113770 0.1093750 0.008544922 0.06054688    0.16015625
## 276000813609151 0.6544189 0.0234375 0.011596680 0.08593750    0.22460938
## 276000813315987 0.5856934 0.1875000 0.030395508 0.09680176    0.09960938

Data frame phen defined below contains the individual IDs in Colmumn 1 (Indiv), sexes in Column 2 (Sex), breed names (Breed), years of birth (Born), breeding values (BV), and the migrant contributions (MC) of the selection candidates. The breeding values are simulated such that breeding values and migrant contributions are positively correlated. This mimics historic introgression from a high-yielding commercial breed:

phen <- Pedig[Pedig$Indiv %in% keep, c("Indiv", "Sex", "Breed", "Born", "BV", "MC")]
phen$BV <- phen$BV + 4*(phen$MC-mean(phen$MC))
head(phen[,-1])
##                 Sex         Breed Born         BV        MC
## 276000813616211   1 Hinterwaelder 2006 -1.6308321 0.2314453
## 276000813369196   2 Hinterwaelder 2006  0.1929867 0.4252930
## 276000813151807   2 Hinterwaelder 2006  0.1055514 0.3083496
## 276000813677682   2 Hinterwaelder 2007 -0.1521481 0.5886230
## 276000813609151   1 Hinterwaelder 2006 -1.2954090 0.3455811
## 276000813315987   2 Hinterwaelder 2007 -2.3918885 0.4143066

The parameters that can be constrained or optimized can be viewed with function help.opticont

help.opticont(Kin, phen)
## Available objective functions: 
##        min.pKin  min.pedIBDandN  min.pKinatN  
##        min.BV  min.MC  
##        max.BV  max.MC  
## 
## Available constraints: 
##        ub.pKin  ub.pedIBDandN  ub.pKinatN  
##        lb.BV  lb.MC  
##        ub.BV  ub.MC  
##        eq.BV  eq.MC  
##        ub  lb
## 
## Attention: Probably not all these objective functions and
##            constraints make sense in animal breeding

Compared to the introductory example the possibility to restrict or to minimize migrant contributions becomes available because column MC is now included in data frame phen. Additionally, there is the possibility to minimize or to restrict the kinship at native alleles pKinatN and the pedigree based kinship pKin.

Depending on what the objective of your breeding program is, you may continue reading at the appropriate section:

Maximize Genetic Gain

This is the traditional approach proposed by T. H. E. Meuwissen (1997). First we create a list of constraints:

con         <- list(ub=c(M=NA, F=-1))
meanKin     <- mean(Kin$pKin[phen$Indiv, phen$Indiv])
con$ub.pKin <- meanKin + (1-meanKin)*(1/(2*Ne))

Here, equal contributions are assumed for the females and only the contributions of males are to be optimized. The upper bound for the mean segment based kinship was derived from the effective population size as explained above. Now the optimum contributions of the selection candidates can be calculated:

maxBV   <- opticont(method="max.BV", K=Kin, phen=phen, con=con, trace=FALSE)
maxBV.s <- summary(maxBV)
maxBV.s[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##       valid   meanBV    meanMC       pKin   pKinatN
## maxBV  TRUE 1.000721 0.5855356 0.03665814 0.0930294

This approach may be apppropriate for a population without introgression and complete pedigrees, but for populations with historic introgression, the kinship at native alleles should be restricted as well in accordance with the desired effective size, and the migrant contributions should be restricted in order not to increase. Otherwise the genetic originality of the breed may get lost in the long term.

meanKinatN     <- mean(Kin$pedIBDandN)/mean(Kin$pedN)
con$ub.pKinatN <- meanKinatN +(1-meanKinatN)*(1/(2*Ne))
con$ub.MC      <- mean(phen$MC)
maxBV2         <- opticont(method="max.BV", K=Kin, phen=phen, con=con, solver="slsqp")
maxBV2.s       <- summary(maxBV2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(maxBV.s, maxBV2.s)
Results[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##        valid    meanBV    meanMC       pKin    pKinatN
## maxBV   TRUE 1.0007207 0.5855356 0.03665814 0.09302940
## maxBV2  TRUE 0.7549783 0.4956402 0.03469981 0.08201541

Genetic gain in Method 2 is only slightly below the genetic gain in Method 1, but the migrant contributions do not increase and the kinship at native alleles increases at a lower rate.

Minimize Inbreeding

Minimizing inbreeding means to minimize the average kinship of the population in order to enable breeders to avoid inbreeding. This is the appropriate approach e.g. for companion animals suffering from a historic bottleneck. It can be done with or without accounting for breeding values. In the example below no breeding values are considered since accurate breeding values are not available for most of these breeds.

First we create a list of constraints:

con  <- list(ub=c(M=NA, F=-1))

Again, equal contributions are assumed for the females and only the contributions of males are to be optimized. The segment based kinship is not constrained in this example because it should be minimized.

minKin   <- opticont(method="min.pKin", K=Kin, phen=phen, con=con, trace=FALSE)
minKin.s <- summary(minKin)
minKin.s[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##        valid    meanBV    meanMC      pKin    pKinatN
## minKin  TRUE 0.2687059 0.5470033 0.0258729 0.07268504

The approach shown above has the disadvantage that kinships between individuals are less reliable if ancestors are missing in the pedigree. The alternative approach, shown below, is to minimize the kinship at native alleles and to restrict pedigree based kinship.

While in livestock breeds the migrant contributions should be diminished in order to maintain the genetic originality of the breeds, in several companion breeds the opposite is true. Several companion breeds have high inbreeding coefficients and descend from only very few (e.g. 3) founders. Hence, a sufficient genetic diversity cannot be achieved in the population. For these breeds it may be appropriate to use unrelated individuals from a variety of other breeds in order to increase the genetic diversity. However, only a small contribution from other breeds is needed, so the migrant contributions should be restricted also for these breeds in order to preserve their genetic originality. The difference between a breed with high diversity and a breed with low diversity suffering from inbreeding depression is, that the optimum value for the migrant contribution is larger than 0 for the latter. For such a breed it is advisable to allow the use of individuals from other breeds but to restrict the admissible mean contribution from other breeds.

In summary, the alternative approach is to minimize the kinship at native alleles and to restrict pedigree based kinship and migrant contributions:

con  <- list(ub=c(M=NA, F=-1))
con$ub.MC   <- 0.50
meanKin     <- mean(Kin$pKin[phen$Indiv, phen$Indiv])
con$ub.pKin <- meanKin + (1-meanKin)*(1/(2*Ne))

minKin2     <- opticont(method="min.pKinatN", K=Kin, phen=phen, con=con, solver="slsqp")
minKin2.s   <- summary(minKin2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(minKin.s, minKin2.s)
Results[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##         valid     meanBV    meanMC       pKin    pKinatN
## minKin   TRUE 0.26870592 0.5470033 0.02587290 0.07268504
## minKin2  TRUE 0.08188994 0.4817010 0.02880479 0.06857486

The pedigree based kinship is slightly higher in the second approach, but the kinship at native alleles is lower. Since pedigree based kinships are less reliable due to missing ancestors in the pedigree, the second approach is recommended. However, the use of pedigree data has the disadvantage that only the expected kinships can be minimized. The expected kinships deviate from the realized kinships due to mendelian segregation. Hence, for breeds with serious inbreeding problems it is recommended to genotype the selection candidates and to perform marker-based optimum contribution selection.

Recover the Native Genetic Background

For endangered breeds the priority of a breeding program could be to recover the original genetic background by minimizing migrant contributions. However, since the individuals with smallest migrant contributions are related, this may considerably increase the inbreeding coefficients if the diversity at native alleles is not preserved. Hence, constraints are defined below not only for the pedigree based kinship, but also for the kinship at native alleles in accordance with the desired effective size:

con            <- list(ub=c(M=NA, F=-1))
meanKin        <- mean(Kin$pKin[phen$Indiv, phen$Indiv])
meanKinatN     <- mean(Kin$pedIBDandN)/mean(Kin$pedN)
con$ub.pKin    <- meanKin    + (1-meanKin)*(1/(2*Ne))
con$ub.pKinatN <- meanKinatN + (1-meanKinatN)*(1/(2*Ne))
minMC   <- opticont(method="min.MC", K=Kin, phen=phen, con=con, trace=FALSE)
minMC.s <- summary(minMC)
minMC.s[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##       valid     meanBV    meanMC       pKin   pKinatN
## minMC  TRUE -0.1595658 0.4230985 0.03665832 0.0754081

For some breeds, migrant contributions and breeding values are positively correlated, so minimizing migrant contributions results in negative genetic. This can be avoided by adding an additional constraint for the breeding values:

con$lb.BV <- mean(phen$BV)
minMC2    <- opticont(method="min.MC", K=Kin, phen=phen, con=con, trace=FALSE)
minMC2.s  <- summary(minMC2)

For comparison, the summaries of both scenarios are combined into a single data frame with rbind:

Results <- rbind(minMC.s, minMC2.s)
Results[,c("valid", "meanBV", "meanMC", "pKin", "pKinatN")]
##        valid     meanBV    meanMC       pKin    pKinatN
## minMC   TRUE -0.1595658 0.4230985 0.03665832 0.07540810
## minMC2  TRUE  0.1018865 0.4269057 0.03665823 0.07581931

References

Meuwissen, T H E. 2009. “Genetic Management of Small Populations: A Review.” Acta Agriculturae Scand Section A 59: 71–79.

Meuwissen, T. H. E. 1997. “Maximising the Response of Selection with a Predefined Rate of Inbreeding.” J. Animal Sci 75: 934–40.

Wellmann, R., and I. Pfeiffer. 2009. “Pedigree Analysis for Conservation of Genetic Diversity and Purging.” Genetical Research (Camb) 91 (34).