In this exercise, you will simulate a base population and look at the randomness of DNA inheritance between parents and progeny (the DNA lottery) and how this process drives variation between genetic and phenotype values of relatives in the following steps:
For each step, we have given you instructions and you have to write
AlphaSimR
code yourself (replace ??? with an appropriate
code).
Use the runMacs()
function to simulate 6 individuals
(nInd = 6
), 2 chromosomes (nChr = 2
), and
capture 100 loci per chromosome (segSites = 100
). Simulate
a general population, so you can omit the species
argument.
# Clean the working environment
rm(list = ls())
# Set the default plot layout
par(mfrow = c(1, 1))
# Load AlphaSimR and simulate founder genomes
library(AlphaSimR)
## Loading required package: R6
founderGenomes = runMacs(nInd = 6,
nChr = 2,
segSites = 100)
Create the SP
object holding simulation parameters. Add
a trait with simple additive genetic architecture, controlled by 50 loci
on each chromosome, and with the mean of 100 units and genetic variance
of 100 unit\(^2\).
SP = SimParam$new(founderGenomes)
SP$addTraitA(nQtlPerChr = 50, mean = 100, var = 100)
Use the newPop()
function to create a base population of
individuals from the founding genomes. Save heritability of 0.3 in a
variable heritability
. Then generate phenotypes for these
individuals by using the setPheno()
function by using the
variable heritability
.
# Base population
basePop = newPop(founderGenomes)
# Phenotype base population individuals
heritability = 0.3
basePop = setPheno(pop = basePop,
h2 = heritability)
Explore genetic and phenotype values of the base population using
gv()
and pheno()
functions.
# Genetic values
gv(basePop)
## Trait1
## [1,] 100.04260
## [2,] 104.32673
## [3,] 80.79562
## [4,] 113.95685
## [5,] 97.49886
## [6,] 103.37934
# Phenotype values
pheno(basePop)
## Trait1
## [1,] 116.38216
## [2,] 103.16936
## [3,] 55.69816
## [4,] 116.06678
## [5,] 109.43900
## [6,] 75.01508
To show the extent of variation in progeny from given parents, you
will now create three crosses, each with 100 progeny. The first cross
will be between individuals 1 and 2, the second cross will be between
individuals 1 and 4, and the third cross will be between individuals 5
and 6. Therefore you will create 2 half-sib crosses and 1 unrelated
cross. To make a cross use the function makeCross()
. Also,
phenotype the progeny using setPheno
and saved
heritability.
# First cross - between founders 1 and 2
cross12 = makeCross(pop = basePop,
crossPlan = matrix(c(1, 2), ncol = 2),
nProgeny = 100)
cross12 = setPheno(pop = cross12,
h2 = heritability)
# Second cross - between founders 1 and 4
cross14 = makeCross(pop = basePop,
crossPlan = matrix(c(1, 4), ncol = 2),
nProgeny = 100)
cross14 = setPheno(pop = cross14,
h2 = heritability)
# Third cross - between founders 5 and 6
cross56 = makeCross(pop = basePop,
crossPlan = matrix(c(5, 6), ncol = 2),
nProgeny = 100)
cross56 = setPheno(pop = cross14,
h2 = heritability)
Now analyse the variation in genetic and phenotype values within and across crosses. First, save genetic and phenotype values of parents and their progeny.
# Save genetic and phenotype values of parents
gvPar = gv(basePop)
gvPar1 = gvPar[1]
gvPar2 = gvPar[2]
gvPar4 = gvPar[4]
gvPar5 = gvPar[5]
gvPar6 = gvPar[6]
phenoPar = pheno(basePop)
phenoPar1 = phenoPar[1]
phenoPar2 = phenoPar[2]
phenoPar4 = phenoPar[4]
phenoPar5 = phenoPar[5]
phenoPar6 = phenoPar[6]
# Save genetic and phenotype values of progeny in each cross
gvCross12 = gv(cross12)
gvCross14 = gv(cross14)
gvCross56 = gv(cross56)
phenoCross12 = pheno(cross12)
phenoCross14 = pheno(cross14)
phenoCross56 = pheno(cross56)
Calculate the range of genetic and phenotype values across parents and all their progeny.
# Total range to set x-axis in the histograms below
(rangeGv = range(c(gvPar, gvCross12, gvCross14, gvCross56)))
## [1] 80.79562 122.68084
(rangePheno = range(c(phenoPar, phenoCross12, phenoCross14, phenoCross56)))
## [1] 55.69816 158.27209
(rangeVal = range(c(rangeGv, rangePheno)))
## [1] 55.69816 158.27209
To visualise the variation of genetic and phenotype values in parents and among their progeny, plot a histogram of genetic values and then another histogram of phenotype values. Add vertical lines for the values in parents and the average value between the two parents. Colour the parent 1 line as blue, parent 2 line as red, parent 4 line as green, parent 5 line as orange, and parent 6 line as purple. Also, colour the parent average line as black.
by = sqrt(SP$varA) / 2
bins = seq(from = floor(rangeVal[1]) - by, to = ceiling(rangeVal[2]) + by, by = by)
# First cross - between founders 1 and 2
par(mfrow = c(2, 1),
mar = c(4, 4, 2, 1))
hist(gvCross12,
xlim = rangePheno, breaks = bins,
xlab = "Genetic value")
abline(v = gvPar1, col = "blue", lwd = 3)
abline(v = gvPar2, col = "red", lwd = 3)
abline(v = (gvPar1 + gvPar2) / 2, col = "black", lwd = 3, lty = 2)
hist(phenoCross12,
xlim = rangePheno, breaks = bins,
xlab = "Phenotype value")
abline(v = phenoPar1, col = "blue", lwd = 3)
abline(v = phenoPar2, col = "red", lwd = 3)
abline(v = (phenoPar1 + phenoPar2) / 2, col = "black", lwd = 3, lty = 2)
# Second cross - between founders 1 and 4
par(mfrow = c(2, 1),
mar = c(4, 4, 2, 1))
hist(gvCross14,
xlim = rangePheno, breaks = bins,
xlab = "Genetic value")
abline(v = gvPar1, col = "blue", lwd = 3)
abline(v = gvPar4, col = "green", lwd = 3)
abline(v = (gvPar1 + gvPar4) / 2, col = "black", lwd = 3, lty = 2)
hist(phenoCross14,
xlim = rangePheno, breaks = bins,
xlab = "Phenotype value")
abline(v = phenoPar1, col = "blue", lwd = 3)
abline(v = phenoPar4, col = "green", lwd = 3)
abline(v = (phenoPar1 + phenoPar4) / 2, col = "black", lwd = 3, lty = 2)
# Third cross - between founders 5 and 6
par(mfrow = c(2, 1),
mar = c(4, 4, 2, 1))
hist(gvCross56,
xlim = rangePheno, breaks = bins,
xlab = "Genetic value")
abline(v = gvPar5, col = "orange", lwd = 3)
abline(v = gvPar6, col = "darkgray", lwd = 3)
abline(v = (gvPar5 + gvPar6) / 2, col = "black", lwd = 3, lty = 2)
hist(phenoCross56,
xlim = rangePheno, breaks = bins,
xlab = "Phenotype value")
abline(v = phenoPar5, col = "orange", lwd = 3)
abline(v = phenoPar6, col = "darkgray", lwd = 3)
abline(v = (phenoPar5 + phenoPar6) / 2, col = "black", lwd = 3, lty = 2)
# All crosses together
par(mfrow = c(2, 1),
mar = c(4, 4, 2, 1))
hist(c(gvCross12, gvCross14, gvCross56),
xlim = rangePheno, breaks = bins,
xlab = "Genetic value")
abline(v = gvPar1, col = "blue", lwd = 3)
abline(v = gvPar2, col = "red", lwd = 3)
abline(v = gvPar4, col = "green", lwd = 3)
abline(v = gvPar5, col = "orange", lwd = 3)
abline(v = gvPar6, col = "darkgray", lwd = 3)
hist(c(phenoCross12, phenoCross14, phenoCross56),
xlim = rangePheno, breaks = bins,
xlab = "Phenotype value")
abline(v = phenoPar1, col = "blue", lwd = 3)
abline(v = phenoPar2, col = "red", lwd = 3)
abline(v = phenoPar4, col = "green", lwd = 3)
abline(v = phenoPar5, col = "orange", lwd = 3)
abline(v = phenoPar6, col = "darkgray", lwd = 3)
What can you conclude about within and across crosses variation based on these histograms? Variation of genetic values within each family is substantial compared to the genetic values of parents. This large variation among progeny is caused by the DNA lottery of parental genomes, as we have seen before when we looked only in variation of progeny genomes. However, here we are looking at the variation of traits, which is in addition to the variation in genomes, influenced by QTL effects. That is, some QTL effects are positive, some are negative, they can be large or small, and possibly involved in complex interactions. While genetic values are not observable, phenotype values are. As shown in the above histograms, phenotype values vary even more than genetic values because of the additional source of variation from the environment. Sometimes, we can even see that a parent with a higher genetic value has a lower phenotype value than the other parent, due to environmental effects.
Extend the above exercise by increasing the number of chromosomes
(say, use 10), increasing the number of sites per chromosomes (say, use
1000), and trait complexity - simulate a polygenic trait with additive,
dominance, and epistatic genetic effects (see the
addTraitADE()
function in the SimParam()
help
page; say, use
SimParam$addTraitADE(nQtlPerChr = 1000, mean = 100, var = 100, meanDD = 0.2, varDD = 0.5, relAA = 0.2)
).