# Input format
- **Genotypes bim file**

This is a PLINK file with tab-separated columns and no header line. It contains one line per variant with the following six fields: chromosome, variant identifier, position in morgans or centimorgans, base-pair coordinate, allele 1 and allele 2.

Example:

*(chromosome)* | *(variant identifier)* | *(position)* | *(base-pair coordinate)* | *(A1)* | *(A2)*
:---: | :-------: | :----: | :-----: | :---: | :---:
1 | rs123456 | 7568 | 15411 | A | T
5 | rs6715 | 89863 | 41347 | G | A
21 | rs75354 | 148962 | 305716 | C | A
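
The six-field layout described above can be checked before running an analysis. The following is a minimal Python sketch, not part of the tool itself; the function name `read_bim` and the field names in `BIM_FIELDS` are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical field names for the six .bim columns, in file order.
BIM_FIELDS = ("chrom", "variant_id", "cm_position", "bp_position", "allele1", "allele2")


def read_bim(handle):
    """Return one dict per variant from a tab-separated, header-less .bim stream."""
    variants = []
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(BIM_FIELDS):
            raise ValueError(f"line {line_no}: expected 6 fields, found {len(row)}")
        variants.append(dict(zip(BIM_FIELDS, row)))
    return variants


# First two rows of the example table, written as tab-separated text.
example = "1\trs123456\t7568\t15411\tA\tT\n5\trs6715\t89863\t41347\tG\tA\n"
variants = read_bim(io.StringIO(example))
print(variants[0]["variant_id"])  # rs123456
```

To read a real file, pass an open file object instead of the in-memory example, e.g. `open("data.bim", newline="")` (the file name is a placeholder).
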
- **Genotypes raw file**

This is a PLINK file with space-separated columns and a header line. It contains one line per sample with V+6 fields, where V is the number of variants.

To recode bed/bim/fam files to a raw file, use this PLINK command:

```bash
plink --bfile $inputFile --recodeA --out $outputFile
```

Example:

FID | IID | PAT | MAT | SEX | PHENOTYPE | SNP1 | SNP2 | SNP3 | ..........
:---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---:
1 | 1 | 0 | 0 | 2 | 0 | 0 | 1 | 2 | ..........
2 | 2 | 0 | 0 | 1 | 0 | 1 | 0 | 2 | ..........
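
The V+6 rule above can be verified from the header line alone. This is a small Python sketch under the assumption that the first six header columns are named exactly as in the example table; `count_variants` and `check_row` are hypothetical helper names.

```python
# Leading columns as shown in the example header above.
LEADING_COLUMNS = ["FID", "IID", "PAT", "MAT", "SEX", "PHENOTYPE"]


def count_variants(header_line):
    """Return V, the number of variant columns in a .raw header line."""
    columns = header_line.split()
    if columns[:6] != LEADING_COLUMNS:
        raise ValueError(f"unexpected leading columns: {columns[:6]}")
    return len(columns) - 6


def check_row(line, n_variants):
    """Return True if a sample line has exactly V+6 whitespace-separated fields."""
    return len(line.split()) == n_variants + 6


header = "FID IID PAT MAT SEX PHENOTYPE SNP1 SNP2 SNP3"
n_variants = count_variants(header)
print(n_variants)                                   # 3
print(check_row("1 1 0 0 2 0 0 1 2", n_variants))   # True
```
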
- **Phenotypes file**

This is a text file with tab-separated columns and a header line. It contains one line per individual. The first column must be the individual ID.

Example:

ID | Sex | Age | LDL-C | HDL-C | HDL-D | HDL-TG | ..........
:---: | :---: | :---: | :---: | :---: | :---: | :---: | :---:
1 | 1 | 45 | 0.1 | 0.48 | 0.85 | 0.89 | ..........
2 | 1 | 32 | 0.2 | 0.65 | 0.1 | 0.41 | ..........
3 | 2 | 47 | 0.8 | 0.21 | 0.5 | 0.3 | ..........
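
As an illustration of the layout above, the sketch below loads a phenotypes file into a dictionary keyed by individual ID. It is only a sketch: `read_phenotypes` is a hypothetical name, and rejecting duplicate IDs is an assumption made here, not a documented rule.

```python
import csv
import io


def read_phenotypes(handle):
    """Return (variable names, {individual ID: {variable: value}})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # header line; the first column holds the ID
    table = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} fields, found {len(row)}")
        individual_id = row[0]
        if individual_id in table:
            # Assumption: each individual appears on one line only.
            raise ValueError(f"line {line_no}: duplicate ID {individual_id}")
        table[individual_id] = dict(zip(header[1:], row[1:]))
    return header[1:], table


# A shortened version of the example table, written as tab-separated text.
example = "ID\tSex\tAge\tLDL-C\n1\t1\t45\t0.1\n2\t1\t32\t0.2\n"
variables, phenotypes = read_phenotypes(io.StringIO(example))
print(variables)               # ['Sex', 'Age', 'LDL-C']
print(phenotypes["2"]["Age"])  # 32
```

Values are kept as strings; converting them to numbers is left to the caller.
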
- **Summary file**

This is a CSV file with comma-separated columns and a header line. It describes the role of each variable contained in the phenotypes file. For each selected variable, the user must provide a label and a binary indicator (0 or 1) for each of three roles: confounding factor (i.e. a variable systematically included as a covariate), outcome (i.e. a variable that will be treated as a primary outcome) and candidate covariate (i.e. a variable that will be assessed by CMS for inclusion as a covariate).

`Note that a variable classified as a confounding factor cannot be used as either an outcome or a covariate; any such combination will be flagged as an error.`

By default, all variables flagged in the "Covariate" column will be included as covariates in each outcome analysis. The "Excluded" column makes it possible to exclude specific variables from the covariates for a given outcome. These variables must be separated by ";" without any spaces. If no variables need to be excluded, simply leave the column empty. In the example, we exclude the other "HDL" variables when analysing one of them.

Example:

Label | Conf | Outcome | Covariate | Excluded
:---: | :---: | :---: | :---: | :---:
Sex | 1 | 0 | 0 |
Age | 1 | 0 | 0 |
LDL-C | 0 | 1 | 1 |
HDL-C | 0 | 1 | 1 | HDL-D;HDL-TG
HDL-D | 0 | 1 | 1 | HDL-C;HDL-TG
HDL-TG | 0 | 1 | 1 | HDL-C;HDL-D
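
The rules above can be sketched in Python as follows. This is not the CMS implementation, only an illustration of the stated rules: `plan_analyses` is a hypothetical name, and dropping an outcome from its own covariate list is an assumption made here.

```python
import csv
import io


def plan_analyses(handle):
    """Return (confounders, {outcome: candidate covariates}) from a summary file."""
    confounders, outcomes, candidates, excluded = [], [], [], {}
    for row in csv.DictReader(handle):
        label = row["Label"]
        is_conf, is_out, is_cov = (
            row[key].strip() == "1" for key in ("Conf", "Outcome", "Covariate")
        )
        if is_conf and (is_out or is_cov):
            raise ValueError(f"{label}: a confounding factor cannot be an outcome or covariate")
        if is_conf:
            confounders.append(label)
        if is_out:
            outcomes.append(label)
        if is_cov:
            candidates.append(label)
        # "Excluded" holds labels separated by ";" with no spaces; it may be empty.
        raw = (row["Excluded"] or "").strip()
        excluded[label] = set(raw.split(";")) if raw else set()
    plan = {}
    for outcome in outcomes:
        # Assumption: an outcome is never used as its own covariate.
        plan[outcome] = [
            c for c in candidates if c != outcome and c not in excluded[outcome]
        ]
    return confounders, plan


# The example table above, written as comma-separated text.
example = (
    "Label,Conf,Outcome,Covariate,Excluded\n"
    "Sex,1,0,0,\n"
    "Age,1,0,0,\n"
    "LDL-C,0,1,1,\n"
    "HDL-C,0,1,1,HDL-D;HDL-TG\n"
    "HDL-D,0,1,1,HDL-C;HDL-TG\n"
    "HDL-TG,0,1,1,HDL-C;HDL-D\n"
)
confounders, plan = plan_analyses(io.StringIO(example))
print(confounders)    # ['Sex', 'Age']
print(plan["HDL-C"])  # ['LDL-C']
print(plan["LDL-C"])  # ['HDL-C', 'HDL-D', 'HDL-TG']
```

With the example table, each HDL outcome keeps only LDL-C as a candidate covariate, while LDL-C keeps all three HDL variables.
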