In this notebook, we will show you the basic usages of rTASSEL
. rTASSEL
can be utilized in genetic analysis pipelines in several ways:
rTASSEL
can simply be loaded using the library()
method once installed:
library(rTASSEL)
In the following code block, we will load example phenotype and genotype data from a file path within the rTASSEL
package. Once loaded, we can use rTASSEL
's readGenotypePhenotype
method to create an rTASSEL
data object from the file paths:
# Load gentoype info
genoPathHMP <- system.file("extdata", "mdp_genotype.hmp.txt", package = "rTASSEL")
# Load phenotype info
phenoPath <- system.file("extdata", "mdp_traits.txt", package = "rTASSEL")
# Create rTASSEL data object
tasGenoPheno <- readGenotypePhenotype(
genoPathOrObj = genoPathHMP,
phenoPathDFOrObj = phenoPath
)
# View data object
tasGenoPheno
In TASSEL, for mixed linear model analyses, a kinship matrix calculated from genotype data is necessary. This can be accomplished by calculating a kinship TASSEL object using the function kinshipMatrix()
from rTASSEL
. The main parameter input is a TasselGenotypePhenotype
class object that contains a genotype data set (i.e. our prior data object):
tasKin <- kinshipMatrix(tasObj = tasGenoPheno)
# Show summary object
tasKin
Since this object is essentially a TASSEL reference object, it is relatively not of use until association analysis. We can coerce this TASSEL object by using the following function, as.matrix()
from base R to return a matrix
object:
# Get full R matrix
tasKinRMat <- as.matrix(tasKin)
# Inspect the first 5 rows and columns
tasKinRMat[1:5, 1:5]
One of TASSEL's most powerful functionalities is its capability of performing a variety of different association modeling techniques. If you have started reading the walkthrough here it is strongly suggested that you read the other components of this walkthrough since the following parameters require what we have previously created!
If you are not familar with these methods, more information about how these operate in base TASSEL can be found at following links:
The assocModelFitter()
function has several primary components:
tasObj
: a TasselGenotypePhenotype
class R objectformula
: an R-based linear model formulafitMarkers
: a boolean parameter to differentiate between BLUE and GLM
analyseskinship
: a TASSEL kinship objectfastAssociation
: a boolean parameter for data sets that have many traitsProbably the most important concept of this function is formula
parameter.
If you are familar with standard R linear model functions, this concept is
fairly similar. In TASSEL, a linear model is composed of the following scheme:
Where $y$ is any TASSEL data
type and $A_{n}$ is any TASSEL covariate
and / or factor
types. In the following example, we will fit a mixed linear model (MLM) with our rTASSEL
data object. In addition to the prior parameters, we will also need the TASSEL kinship object:
# Calculate MLM
tasMLM <- assocModelFitter(
tasObj = tasGenoPheno, # <- our prior TASSEL object
formula = EarHT ~ ., # <- run only EarHT
fitMarkers = TRUE, # <- set this to TRUE for GLM
kinship = tasKin, # <- our prior kinship object
fastAssociation = FALSE
)
# Return names of objects returned
names(tasMLM)
For more information about the extent of these methods, please read this section of our introductory vignette.
rTASSEL
supports automated visualizations for Manhattan plots and linkage disequilibrium (LD) analyses. In this example, we will generate a Manhattan plot for our prior MLM object:
# Generate Manhattan plot for ear height trait
manhattanEH <- manhattanPlot(
assocStats = tasMLM$MLM_Stats,
trait = "EarHT",
threshold = 5
)
# View visualization
manhattanEH
Similarly, we can also visualize LD using automated methods. Like most LD plots, it is wise to filter your genotype information to a specific region of interest:
# Filter genotype table by position
tasGenoPhenoFilt <- filterGenotypeTableSites(
tasObj = tasGenoPheno,
siteRangeFilterType = "position",
startPos = 228e6,
endPos = 300e6,
startChr = 2,
endChr = 2
)
# Generate and visualize LD
myLD <- ldPlot(
tasObj = tasGenoPhenoFilt,
ldType = "All",
plotVal = "r2",
verbose = FALSE
)
# View LD
myLD
rTASSEL
also allows for phenotypic prediction through genotype information via genomic best linear unbiased predictors (gBLUPs):
tasCV <- genomicPrediction(
tasPhenoObj = tasGenoPheno,
kinship = tasKin,
doCV = TRUE,
kFolds = 5,
nIter = 1
)
head(tasCV)