Package 'mrMLM'

Title: Multi-Locus Random-SNP-Effect Mixed Linear Model Tools for GWAS
Description: Conduct multi-locus genome-wide association studies under the framework of the multi-locus random-SNP-effect mixed linear model (mrMLM). First, each marker on the genome is scanned, and the Bonferroni correction is replaced by a less stringent selection criterion for the significance test. Then, all the markers that are potentially associated with the trait are included in a multi-locus genetic model, their effects are estimated by empirical Bayes, and all the nonzero effects are further tested by likelihood ratio test to identify significant QTL. The program can run on a desktop or laptop computer. If marker genotypes in the association mapping population are almost all homozygous, the methods in this software are very effective. If there are many heterozygous marker genotypes, the IIIVmrMLM software is recommended. Wen YJ, Zhang H, Ni YL, Huang B, Zhang J, Feng JY, Wang SB, Dunwell JM, Zhang YM, Wu R (2018, <doi:10.1093/bib/bbw145>), and Li M, Zhang YW, Zhang ZC, Xiang Y, Liu MH, Zhou YH, Zuo JF, Zhang HQ, Chen Y, Zhang YM (2022, <doi:10.1016/j.molp.2022.02.012>).
Authors: Ya-Wen Zhang [aut], Jing-Tian Wang [aut], Pei Li [aut], Yuan-Ming Zhang [aut, cre]
Maintainer: Yuan-Ming Zhang <[email protected]>
License: GPL (>= 2)
Version: 5.0.1
Built: 2024-11-07 03:41:51 UTC
Source: https://github.com/cran/mrMLM

Help Index


process raw data

Description

process raw data for later use

Usage

DoData(genRaw,Genformat,pheRaw1q,kkRaw,psmatrixRaw,covmatrixRaw,trait,
type,PopStrType)

Arguments

genRaw

raw genotype matrix.

Genformat

genotype format.

pheRaw1q

raw phenotype matrix.

kkRaw

raw kinship matrix.

psmatrixRaw

raw population structure matrix.

covmatrixRaw

raw covariate matrix.

trait

which trait to analyze.

type

which type to transform.

PopStrType

The type of population structure.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
fileCov=NULL,Genformat=1)
result=DoData(readraw$genRaw,Genformat=1,readraw$pheRaw1q,readraw$kkRaw,
readraw$psmatrixRaw,readraw$covmatrixRaw,trait=1,type=2,PopStrType=NULL)

To perform GWAS with FASTmrEMMA method

Description

FAST multi-locus random-SNP-effect EMMA

Usage

FASTmrEMMA(gen,phe,outATCG,genRaw,kk,psmatrix,svpal,svmlod,Genformat,Likelihood,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

svpal

Critical P-value for selecting variable.

svmlod

Critical LOD score for significant QTN.

Genformat

Format for genotypic codes.

Likelihood

likelihood method, either "REML" (restricted maximum likelihood) or "ML" (maximum likelihood).

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="FASTmrEMMA",trait=1)
result=FASTmrEMMA(InputData$doFME$gen,InputData$doFME$phe,
InputData$doFME$outATCG,InputData$doFME$genRaw,
InputData$doFME$kk,InputData$doFME$psmatrix,0.005,
svmlod=3,Genformat=1,Likelihood="REML",CLO=1)

To perform GWAS with FASTmrMLM method

Description

FAST multi-locus random-SNP-effect Mixed Linear Model

Usage

FASTmrMLM(gen,phe,outATCG,genRaw,kk,psmatrix,svpal,svrad,svmlod,Genformat,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

svpal

Critical P-value for selecting variable.

svrad

Search Radius in search of potentially associated QTN.

svmlod

Critical LOD score for significant QTN.

Genformat

Format for genotypic codes.

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="FASTmrMLM",trait=1)
result=FASTmrMLM(InputData$doMR$gen,InputData$doMR$phe,
InputData$doMR$outATCG,InputData$doMR$genRaw,
InputData$doMR$kk,InputData$doMR$psmatrix,0.01,svrad=20,
svmlod=3,Genformat=1,CLO=1)

Genotype data

Description

Numeric format of genotype dataset.

Usage

data(Gen)

Details

Dataset input of Genotype for mrMLM function.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>


Genotype of real data

Description

Numeric format of genotype dataset.

Usage

data(Genotype)

Details

Dataset input of Genotype for mrMLM function.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>


Input data which have been transformed

Description

Input all the datasets that have been transformed.

Usage

inputData(readraw,Genformat,method,trait,PopStrType)

Arguments

readraw

raw data returned by ReadData.

Genformat

genotype format.

method

which method to use for the analysis.

trait

which trait to analyze.

PopStrType

The type of population structure.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
fileCov=NULL,Genformat=1)
result=inputData(readraw=Readraw,Genformat=1,method="mrMLM",trait=1,
PopStrType=NULL)

To perform GWAS with ISIS EM-BLASSO method

Description

Iterative Sure Independence Screening EM-Bayesian LASSO

Usage

ISIS(gen,phe,outATCG,genRaw,kk,psmatrix,svpal,svmlod,Genformat,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

svpal

Critical P-value for selecting variable.

svmlod

Critical LOD score for significant QTN.

Genformat

Format for genotypic codes.

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="ISIS EM-BLASSO",
trait=1)
result=ISIS(InputData$doMR$gen,InputData$doMR$phe,InputData$doMR$outATCG,
InputData$doMR$genRaw,InputData$doMR$kk,InputData$doMR$psmatrix,
0.01,svmlod=3,Genformat=1,CLO=1)

Multi-Locus Random-SNP-Effect Mixed Linear Model Tools for GWAS

Description

Conduct multi-locus genome-wide association studies under the framework of the multi-locus random-SNP-effect mixed linear model (mrMLM). First, each marker on the genome is scanned, and the Bonferroni correction is replaced by a less stringent selection criterion for the significance test. Then, all the markers that are potentially associated with the trait are included in a multi-locus genetic model, their effects are estimated by empirical Bayes, and all the nonzero effects are further tested by likelihood ratio test to identify true QTL. The program can run on a desktop or laptop computer. If marker genotypes in the association mapping population are almost all homozygous, the methods in this software are very effective. If there are many heterozygous marker genotypes, the IIIVmrMLM software is recommended. Wen YJ, Zhang H, Ni YL, Huang B, Zhang J, Feng JY, Wang SB, Dunwell JM, Zhang YM, Wu R (2018, <doi:10.1093/bib/bbw145>), and Li M, Zhang YW, Zhang ZC, Xiang Y, Liu MH, Zhou YH, Zuo JF, Zhang HQ, Chen Y, Zhang YM (2022, <doi:10.1016/j.molp.2022.02.012>).
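
The two-stage idea described above can be illustrated with a toy sketch in base R. This is not the mrMLM implementation: it assumes ordinary least squares in place of the mixed model and empirical Bayes, simulated genotypes coded -1/1, and a stage-one cutoff of 0.01 chosen only for illustration. All variable names are hypothetical.

```r
# Toy two-stage scan in base R; NOT the mrMLM algorithm itself.
set.seed(1)
n <- 200   # individuals
m <- 50    # markers

# Assumption: homozygous genotypes coded -1 / 1
gen <- matrix(sample(c(-1, 1), n * m, replace = TRUE), n, m)

# Simulated trait: markers 5 and 30 carry the true effects
y <- 1.0 * gen[, 5] - 0.8 * gen[, 30] + rnorm(n)

# Stage 1: scan each marker separately and keep those passing a lenient
# threshold (0.01 here) rather than a Bonferroni cutoff (0.05 / m)
pvals <- apply(gen, 2, function(g) summary(lm(y ~ g))$coefficients[2, 4])
candidates <- which(pvals < 0.01)

# Stage 2: fit all candidates jointly, test each by a likelihood ratio
# test, and express the statistic as a LOD score (LRT / (2 * ln 10))
full <- lm(y ~ gen[, candidates, drop = FALSE])
lod <- sapply(seq_along(candidates), function(j) {
  reduced <- lm(y ~ gen[, candidates[-j], drop = FALSE])
  lrt <- as.numeric(2 * (logLik(full) - logLik(reduced)))
  lrt / (2 * log(10))
})

# Markers whose LOD score reaches the critical value of 3
significant <- candidates[lod >= 3]
```

The lenient first stage lets more markers through than a Bonferroni cutoff would, and the joint model in the second stage then decides which of them remain significant.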

Usage

mrMLM(fileGen,filePhe,fileKin,filePS,PopStrType,fileCov,Genformat,
method,Likelihood,trait,SearchRadius,CriLOD,SelectVariable,Bootstrap,
DrawPlot,Plotformat,dir,PC,RAM)

Arguments

fileGen

File path and name of the genotype file on your computer, e.g., "D:/Users/Genotype_num.csv".

filePhe

File path and name of the phenotype file on your computer, e.g., "D:/Users/Phenotype.csv".

fileKin

File path and name of the kinship file on your computer, e.g., "D:/Users/Kinship.csv".

filePS

File path and name of the population structure file on your computer, e.g., "D:/Users/PopStr.csv".

PopStrType

The type of population structure, i.e., Q (Q matrix), PCA (principal components) or EvolPopStr (evolutionary population structure).

fileCov

File path and name of the covariate file on your computer, e.g., "D:/Users/Covariate.csv".

Genformat

Format for genotypic codes, Num (number), Cha (character) and Hmp (Hapmap).

method

Six multi-locus GWAS methods. Users may select one to six methods, including mrMLM, FASTmrMLM, FASTmrEMMA, pLARmEB, pKWmEB and ISIS EM-BLASSO.

Likelihood

This parameter is only for FASTmrEMMA, including REML (restricted maximum likelihood) and ML (maximum likelihood).

trait

Indices of the traits to analyze, e.g., 1:2 analyzes traits 1 through 2.

SearchRadius

This parameter is only for mrMLM and FASTmrMLM, indicating the search radius used in the search for potentially associated QTN; the default is 20.

CriLOD

Critical LOD score for significant QTN.

SelectVariable

This parameter is only for pLARmEB. SelectVariable=50 indicates that 50 potentially associated variables are selected from each chromosome. Users may change this number in real data analysis in order to obtain the best results; the default is 50.

Bootstrap

This parameter is only for pLARmEB, including FALSE and TRUE. Bootstrap=FALSE indicates the analysis of only the real dataset; Bootstrap=TRUE indicates the analysis of both the real dataset and four resampling datasets. The default is FALSE.

DrawPlot

This parameter is for all six methods, including FALSE and TRUE. DrawPlot=FALSE indicates no figure output; DrawPlot=TRUE indicates the output of the Manhattan and QQ figures. The default is TRUE.

Plotformat

This parameter is for all the figure files, including *.jpeg, *.png, *.tiff and *.pdf; the default is "tiff".

dir

Path of the directory where the results are saved, e.g., "D:/Users".

PC

This parameter specifies whether the mrMLM program is run on a device with little RAM, such as a desktop or laptop. The default value is PC=FALSE. PC=TRUE indicates running the program on a low-RAM desktop or laptop.

RAM

This parameter is the amount of RAM of your desktop or laptop, in GB. The default value is RAM=4, indicating that the device has 4 GB of RAM.
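
To relate the CriLOD threshold to a more familiar p-value scale, a LOD score can be converted as in the sketch below. It assumes the underlying likelihood ratio statistic (LOD multiplied by 2 ln 10) follows a chi-square distribution with one degree of freedom; lod_to_p is a hypothetical helper, not a function of this package.

```r
# Hypothetical helper: convert a LOD score to a p-value, assuming the
# likelihood ratio statistic is chi-square distributed with 1 df.
lod_to_p <- function(lod) {
  lrt <- 2 * log(10) * lod          # LOD = LRT / (2 * ln 10)
  pchisq(lrt, df = 1, lower.tail = FALSE)
}

p_at_lod3 <- lod_to_p(3)            # roughly 2e-4
```

Under this assumption, the default critical value CriLOD=3 corresponds to a p-value of about 0.0002, which is less stringent than a Bonferroni cutoff for a typical number of markers.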

Details

Package: mrMLM
Type: Package
Version: 5.0.1
Date: 2022-3-27
Depends: lars
Imports: methods,foreach,ncvreg,coin,sampling,data.table,doParallel,BEDMatrix
License: GPL version 2 or newer
LazyLoad: yes

Note

When mrMLM v5.0.1 finishes running, the result files appear in the directory that the user specified before the run. The results for each trait include "*_intermediate result.csv", "*_Final result.csv", a Manhattan plot and a QQ plot. If only the pLARmEB and ISIS EM-BLASSO methods are selected, no intermediate results or figures are output. Users can decompress the mrMLM package and find the User Manual (file name: Instruction.pdf) in the folder ".../mrMLM/inst".
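
A minimal sketch of collecting the final result files after a run is shown below. It assumes that tempdir() was passed as dir and that the file names end in "_Final result.csv" as stated in this note. The prefix before the underscore is not specified here, so the pattern accepts any prefix. The variable names are hypothetical.

```r
# Assumption: this is the directory that was passed as dir to mrMLM()
out_dir <- tempdir()

# Find every file whose name ends in "_Final result.csv"
final_files <- list.files(out_dir, pattern = "_Final result\\.csv$",
                          full.names = TRUE)

# Read each file into a data frame; yields an empty list if none exist
results <- lapply(final_files, read.csv)
names(results) <- basename(final_files)
```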

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

References

1. Zhang YM, Mao Y, Xie C, Smith H, Luo L, Xu S. Genetics 2005, 169: 2267-2275.
2. Wang SB, Feng JY, Ren WL, Huang B, Zhou L, Wen YJ, Zhang J, Dunwell JM, Xu S, Zhang YM. Sci Rep 2016, 6: 19444.
3. Tamba CL, Ni YL, Zhang YM. PLoS Comput Biol 2017, 13(1): e1005357.
4. Zhang J, Feng JY, Ni YL, Wen YJ, Niu Y, Tamba CL, Yue C, Song Q, Zhang YM. Heredity 2017, 118(6): 517-524.
5. Ren WL, Wen YJ, Dunwell JM, Zhang YM. Heredity 2018, 120(3): 208-218.
6. Wen YJ, Zhang H, Ni YL, Huang B, Zhang J, Feng JY, Wang SB, Dunwell JM, Zhang YM, Wu R. Brief Bioinform 2018, 19(4): 700-712.
7. Tamba CL, Zhang YM. bioRxiv, preprint first posted online Jun. 7, 2018. doi: 10.1101/341784.
8. Zhang YW, Tamba CL, Wen YJ, Li P, Ren WL, Ni YL, Gao J, Zhang YM. Genomics, Proteomics & Bioinformatics 2020, 18: 481-487.
9. Li M, Zhang YW, Zhang ZC, Xiang Y, Liu MH, Zhou YH, Zuo JF, Zhang HQ, Chen Y, Zhang YM. A compressed variance component mixed model for detecting QTNs, and QTN-by-environment and QTN-by-QTN interactions in genome-wide association studies. Molecular Plant 2022, online, S1674-2052(22)00060-0. doi: 10.1016/j.molp.2022.02.012.

Examples

Ge1=data(Genotype)
Ph1=data(Phenotype)
mrMLM(fileGen=Genotype,filePhe=Phenotype,Genformat="Num",
method=c("FASTmrMLM"),trait=1,CriLOD=3,DrawPlot=FALSE,
dir=tempdir(),PC=FALSE,RAM=4)

To perform GWAS with mrMLM method

Description

multi-locus random-SNP-effect Mixed Linear Model

Usage

mrMLMFun(gen,phe,outATCG,genRaw,kk,psmatrix,svpal,svrad,svmlod,Genformat,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

svpal

Critical P-value for selecting variable.

svrad

Search Radius in search of potentially associated QTN.

svmlod

Critical LOD score for significant QTN.

Genformat

Format for genotypic codes.

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="mrMLM",trait=1)
result=mrMLMFun(InputData$doMR$gen,InputData$doMR$phe,InputData$doMR$outATCG,
InputData$doMR$genRaw,InputData$doMR$kk,InputData$doMR$psmatrix,
0.01,svrad=20,svmlod=3,Genformat=1,CLO=1)

Drawing multi-locus Manhattan plot

Description

Using the results of the mrMLM software to draw a multi-locus Manhattan plot

Usage

MultiManhattan(ResultIntermediate,ResultFinal,mar=c(2.9,2.8,0.7,2.8), 
LabDistance=1.5,ScaleDistance=0.4,LabelSize=0.8,ScaleSize=0.7,
AxisLwd=5,TckLength=-0.03,LogTimes=2,LODTimes=1.2,lodline=3, 
dirplot=getwd(), PlotFormat="tiff", 
width=28000,height=7000,pointsize = 60,res=600,
MarkGene=FALSE,Pos_x=NULL,Pos_y=NULL,GeneName=NULL,
GeneNameColour=NULL,...)

Arguments

ResultIntermediate

Intermediate results obtained by the mrMLM software, e.g., "D:/Users/ResultIntermediate.csv".

ResultFinal

Final results obtained by the mrMLM software, e.g., "D:/Users/ResultFinal.csv".

mar

A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot, and the default is c(2.9, 2.8, 0.7, 2.8).

LabDistance

Distance between label and axis; the default is 1.5.

ScaleDistance

Distance between scale values and axis; the default is 0.4.

LabelSize

Size of all the three labels; the default is 0.8.

ScaleSize

Size of scale values; the default is 0.7.

AxisLwd

The width of axis, a positive number; the default is 5.

TckLength

The length of tick marks; the default is -0.03.

LogTimes

Magnification of -log10(P-value); the default is 2.

LODTimes

Magnification of LOD score; the default is 1.2.

lodline

The significant LOD score; the default is 3.

dirplot

Path to save the plot; the default is the current working directory.

PlotFormat

Format of the plot, i.e., *.tiff, *.png, *.jpeg or *.pdf; the default is "tiff".

width

Figure width; the default is 28000.

height

Figure height; the default is 7000.

pointsize

Point size of the plotted text, in units of 1/72 inch; the default is 60.

res

Figure resolution, with the unit of pixels per inch (ppi); the default is 600.

MarkGene

Whether to mark genes in the plot. If "TRUE" is selected, a file named "Reference information to mark gene.csv", which contains the x and y axis information of all the significant QTNs, is generated. The default is "FALSE", indicating that no candidate or known gene names are marked in the Manhattan plot.

Pos_x

Numeric vectors of x axis where the text labels should be written.

Pos_y

Numeric vectors of y axis where the text labels should be written.

GeneName

A character vector or expression specifying the text to be written.

GeneNameColour

The colour of gene names.

...

Arguments passed to points, axis, text.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, and Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

inter<-data(ResultIntermediate)
fin<-data(ResultFinal)
MultiManhattan(ResultIntermediate=ResultIntermediate,ResultFinal=ResultFinal,dirplot=tempdir())

Matrix multiplication acceleration algorithm.

Description

Matrix multiplication acceleration algorithm.

Usage

multiplication_speed(A,B)

Arguments

A

matrix A.

B

matrix B.

Author(s)

Zhang Ya-Wen, Wen Yang-Jun, Wang Shi-Bo, and Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

## Not run: 
A<-matrix(1:10,2,5)
B<-matrix(1:10,5,2)
result<-multiplication_speed(A,B)

## End(Not run)
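
For comparison, the same product can be computed with the base R operator %*%. The sketch below assumes that multiplication_speed returns the ordinary matrix product of conformable matrices, which this page does not state explicitly; it uses base R only.

```r
# Base R reference product, for checking dimensions and values.
# Assumption: multiplication_speed(A, B) is meant to equal A %*% B.
A <- matrix(1:10, 2, 5)   # 2 x 5
B <- matrix(1:10, 5, 2)   # 5 x 2, so A and B are conformable
reference <- A %*% B      # 2 x 2 result
```

A result from multiplication_speed(A,B) could then be compared against reference with all.equal.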

Phenotype dataset

Description

Phenotype dataset of multiple traits.

Usage

data(Phe)

Details

Dataset input of phenotype in mrMLM function.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>


Phenotype of real data

Description

Phenotype dataset of multiple traits.

Usage

data(Phenotype)

Details

Dataset input of phenotype in mrMLM function.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>


To perform GWAS with pKWmEB method

Description

Kruskal-Wallis test with empirical Bayes under polygenic background control

Usage

pKWmEB(gen,phe,outATCG,genRaw,kk,psmatrix,svpal,svmlod,Genformat,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

svpal

Critical P-value for selecting variable.

svmlod

Critical LOD score for significant QTN.

Genformat

Format for genotypic codes.

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="pKWmEB",trait=1)
result=pKWmEB(InputData$doMR$gen,InputData$doMR$phe,InputData$doMR$outATCG,
InputData$doMR$genRaw,InputData$doMR$kk,InputData$doMR$psmatrix,
0.05,svmlod=3,Genformat=1,CLO=1)

To perform GWAS with pLARmEB method

Description

polygene-background-control-based least angle regression plus Empirical Bayes

Usage

pLARmEB(gen,phe,outATCG,genRaw,kk,psmatrix,CriLOD,lars1,Genformat,Bootstrap,CLO)

Arguments

gen

genotype matrix.

phe

phenotype matrix.

outATCG

genotype for code 1.

genRaw

raw genotype.

kk

kinship matrix.

psmatrix

population structure matrix.

CriLOD

Critical LOD score for significant QTN.

lars1

No. of potentially associated variables selected by LARS.

Genformat

Format for genotypic codes.

Bootstrap

Bootstrap=FALSE indicates the analysis of only real dataset, Bootstrap=TRUE indicates the analysis of both real dataset and four resampling datasets.

CLO

number of CPUs.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
Readraw=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
Genformat=1)
InputData=inputData(readraw=Readraw,Genformat=1,method="pLARmEB",trait=1)
result=pLARmEB(InputData$doMR$gen,InputData$doMR$phe,InputData$doMR$outATCG,
InputData$doMR$genRaw,InputData$doMR$kk,InputData$doMR$psmatrix,
CriLOD=3,lars1=20,Genformat=1,Bootstrap=FALSE,CLO=1)

read raw data

Description

read raw data which have not been transformed

Usage

ReadData(fileGen,filePhe,fileKin,filePS,fileCov,Genformat)

Arguments

fileGen

genotype matrix.

filePhe

phenotype matrix.

fileKin

kinship matrix.

filePS

population structure matrix.

fileCov

Covariate matrix.

Genformat

genotype format.

Author(s)

Zhang Ya-Wen, Wang Jing-Tian, Li Pei, Zhang Yuan-Ming
Maintainer: Yuan-Ming Zhang<[email protected]>

Examples

G1=data(Gen)
P1=data(Phe)
result=ReadData(fileGen=Gen,filePhe=Phe,fileKin=NULL,filePS =NULL,
fileCov=NULL,Genformat=1)

Final result used to draw manhattan plot.

Description

Final result used to draw manhattan plot.

Usage

data(ResultFinal)

Details

Final result used to draw manhattan plot.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>


Intermediate result used to draw manhattan plot.

Description

Intermediate result used to draw manhattan plot.

Usage

data(ResultIntermediate)

Details

Intermediate result used to draw manhattan plot.

Author(s)

Maintainer: Yuan-Ming Zhang<[email protected]>