#' @title Exclude markers depending on proportions of homo/heterozygous
#' @title Exclude markers depending on Mendelian proportions
#'
#' @description This function uses the data frame produced by the tab_mark function and fills the "exclude" column for all markers that have too many missing genotypes or unexpected genotype proportions.
#' You can define these proportions with the arguments of the function.
#' Specific arguments handle genotype proportions for markers on the X chromosome.
#'
#' For autosomes, use either the homo and hetero arguments or the pval argument. With homo and hetero, mark_prop() excludes a marker if the proportion of homozygous or heterozygous individuals is lower than the value assigned to the corresponding argument.
#' With pval, mark_prop() tests each marker with a chi-squared test. If the p-value of the test is lower than the value assigned to the pval argument, the marker is excluded.
#' For the X chromosome, the arguments homo1X, homo2X and heteroX are vectors of two values giving the lower and upper limits of the proportion of each genotype.
#' homo1X corresponds to the homozygous genotype with the highest expected proportion, while homo2X corresponds to the homozygous genotype with the lowest expected proportion. If a genotype is not expected to occur, use c(0,0) as limits.
#'
#'
#' @description This function uses the data frame produced by the tab_mark function and fills the "exclude" column for all markers that have too many missing genotypes or unexpected genotype proportions. You can define these proportions with the arguments of the function. The filter on genotype proportions applies to autosomes only (and not to the chromosomes encoded as "X", "Y" and "M").
#' @param tab data frame obtained with tab_mark function.
#' @param cross F2 or N2.
#' @param homo proportion of homozygous individuals under which the marker is excluded. Applies to both homozygous genotypes for an F2, but to only one for an N2.
...
@@ -18,7 +26,7 @@
#### mark_prop ####
## excludes markers depending on proportions of homo/heterozygous
\title{Exclude markers depending on proportions of homo/heterozygous}
\title{Exclude markers depending on Mendelian proportions}
\usage{
\usage{
mark_prop(tab, cross, homo = NA, hetero = NA, pval = NA, na = 0.5)
mark_prop(
  tab,
  cross,
  homo = NA,
  hetero = NA,
  pval = NA,
  homo1X = NULL,
  homo2X = NULL,
  heteroX = NULL,
  na = 0.5
)
}
}
\arguments{
\arguments{
\item{tab}{data frame obtained with tab_mark function.}
...
@@ -15,8 +25,21 @@ mark_prop(tab, cross, homo = NA, hetero = NA, pval = NA, na = 0.5)
\item{hetero}{proportion of heterozygous individuals under which the marker is excluded.}
\item{homo1X}{a vector of two numbers: the lower and upper limits for the proportion of homozygous individuals for markers on the X chromosome. This argument applies to the homozygous genotype with the highest expected proportion.}
\item{homo2X}{a vector of two numbers: the lower and upper limits for the proportion of homozygous individuals for markers on the X chromosome. This argument applies to the homozygous genotype with the lowest expected proportion.}
\item{heteroX}{a vector of two numbers: the lower and upper limits for the proportion of heterozygous individuals for markers on the X chromosome.}
\item{na}{proportion of non-genotyped individuals above which the marker is excluded.}
}
}
\description{
\description{
This function uses the data frame produced by the tab_mark function and fills the "exclude" column for all markers that have too many missing genotypes or unexpected genotype proportions. You can define these proportions with the arguments of the function. The filter on genotype proportions applies to autosomes only (and not to the chromosomes encoded as "X", "Y" and "M").
This function uses the data frame produced by the tab_mark function and fills the "exclude" column for all markers that have too many missing genotypes or unexpected genotype proportions.
You can define these proportions with the arguments of the function.
Specific arguments handle genotype proportions for markers on the X chromosome.
For autosomes, use either the homo and hetero arguments or the pval argument. With homo and hetero, mark_prop() excludes a marker if the proportion of homozygous or heterozygous individuals is lower than the value assigned to the corresponding argument.
With pval, mark_prop() tests each marker with a chi-squared test. If the p-value of the test is lower than the value assigned to the pval argument, the marker is excluded.
For the X chromosome, the arguments homo1X, homo2X and heteroX are vectors of two values giving the lower and upper limits of the proportion of each genotype.
homo1X corresponds to the homozygous genotype with the highest expected proportion, while homo2X corresponds to the homozygous genotype with the lowest expected proportion. If a genotype is not expected to occur, use c(0,0) as limits.
The `mark_prop()` function can be used to filter markers depending on the proportion of each genotype. Here, we have an F2 and we use `homo=0.1, hetero=0.1`, so the function will exclude all markers with less than 10% of each genotype. This function can also filter markers depending on the proportion of non-genotyped animals. By default, markers for which more than 50% of individuals were not genotyped are excluded.
The `mark_prop()` function can be used to filter markers depending on the proportion of each genotype. Here, we have an F2 and we use `homo=0.1, hetero=0.1`, so the function will exclude all markers with less than 10% of each genotype. This function can also filter markers depending on the proportion of non-genotyped animals. By default, markers for which more than 50% of individuals were not genotyped are excluded. For chromosome X, we use the `homo=0.1, hetero=0.1`
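
The filtering rule described above can be illustrated with a small base R sketch. It is not the implementation of `mark_prop()`: the `counts` data frame, its column names and the helper `flag_by_proportion()` are hypothetical, and computing genotype proportions among genotyped individuals only is an assumption made for this example.

```r
# Hypothetical genotype counts per marker (not taken from the stuart data).
counts <- data.frame(
  marker   = c("m1", "m2", "m3"),
  n_homo1  = c(25, 2, 10),
  n_hetero = c(50, 60, 12),
  n_homo2  = c(25, 38, 8),
  n_na     = c(0, 0, 70),
  stringsAsFactors = FALSE
)

# Flag a marker when too many individuals are not genotyped, or when any
# genotype class falls below its minimum proportion.
# Assumption: proportions are computed among genotyped individuals.
flag_by_proportion <- function(counts, homo = 0.1, hetero = 0.1, na = 0.5) {
  genotyped <- counts$n_homo1 + counts$n_hetero + counts$n_homo2
  total <- genotyped + counts$n_na
  too_many_na <- counts$n_na / total > na
  low_homo <- counts$n_homo1 / genotyped < homo |
    counts$n_homo2 / genotyped < homo
  low_hetero <- counts$n_hetero / genotyped < hetero
  too_many_na | low_homo | low_hetero
}

counts$exclude <- flag_by_proportion(counts)
```

In this toy table, `m2` is flagged because one homozygous class represents only 2% of the genotyped individuals, and `m3` is flagged because 70% of individuals were not genotyped.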
We could also use the `pval` argument, which excludes markers by performing a Chi2 test comparing the observed genotype distribution with Mendelian proportions. By using `pval=0.05` we would exclude all markers with a Chi2 p-value lower than 0.05. However, for some markers, the Chi2 approximation may be incorrect.
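
As an illustration of the test described above, the sketch below applies base R's `chisq.test()` to hypothetical F2 genotype counts, using the Mendelian 1:2:1 proportions as the expected distribution. The `observed` matrix and the 0.05 threshold are assumptions chosen for the example, and this is not necessarily how `mark_prop()` performs the test internally.

```r
# Hypothetical F2 genotype counts per marker:
# homozygous 1, heterozygous, homozygous 2.
observed <- rbind(
  m1 = c(26, 49, 25),
  m2 = c(10, 45, 45)
)

# Expected Mendelian proportions for an F2 (1:2:1).
expected_f2 <- c(0.25, 0.5, 0.25)

# Goodness-of-fit test for each marker (one row per marker).
p_values <- apply(observed, 1, function(n) {
  chisq.test(n, p = expected_f2)$p.value
})

# Markers whose p-value is below the chosen threshold would be excluded.
exclude <- p_values < 0.05
```

Here `m1` is close to the expected proportions and is kept, while `m2` deviates strongly and is flagged. When expected counts are small, `chisq.test()` warns that the Chi-squared approximation may be incorrect, which is the situation the caveat above refers to.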
...
@@ -178,6 +175,10 @@ The cross object was saved in stuart. Here we can load it as well as the newmap