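Genomic selection (GS) is a method that uses markers covering the whole genome to evaluate and predict the genetic potential of breeding candidates. As a new-generation breeding technology, GS builds a prediction model and uses genomic estimated breeding values to predict and select individuals at an early stage, thereby shortening the generation interval, accelerating breeding progress, reducing costs, and moving modern breeding toward greater precision and efficiency.
From September 6 to 10, 2023, at the invitation of the Research Institute of Forestry, Chinese Academy of Forestry, and the State Key Laboratory of Forest Genetics and Breeding, Professor Zhiwu Zhang of Washington State University (USA) will visit the Academy and give a five-day workshop on genomic selection. Topics include simulation of genotype and phenotype data, genome-wide association studies, marker-assisted breeding, model fitting and cross validation, and the concepts, principles, and algorithmic implementation of classical statistical models (gBLUP, rrBLUP) and Bayesian genomic prediction models. The workshop admits 40 participants: 20 places are allocated to the Chinese Academy of Forestry, and the remaining 20 are selected nationwide by Professor Zhang based on research background, programming ability, and regional balance. No tuition is charged to any participant.
Professor Zhiwu Zhang is on the faculty of the Department of Crop and Soil Sciences at Washington State University. He is dedicated to developing innovative statistical methods and computing tools to advance the fight against human diseases and to accelerate the genetic improvement of economically important traits in plants and animals. In 2014 he was named Distinguished Professor by the Washington Grain Commission, a position that carries a special stipend. He has carried out extensive innovative research on applying genome-wide information to selection and breeding in plants and animals, proposed and developed the compressed mixed linear model for large-scale trait-marker association analysis as well as new techniques for linkage disequilibrium analysis, and is a developer and designer of the well-known genomic data analysis software TASSEL and GAPIT. His work has been published in journals including Nature Genetics, Science, Briefings in Bioinformatics, and Plant Cell, and has been cited more than 20,000 times (https://zzlab.net).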
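Participants
Graduate students, researchers, and industry R&D staff working in statistical genetics and related fields in plants, animals, forest trees, aquaculture, and medicine.
Dates
September 6-10, 2023; registration all day on September 5.
Venue
Room 115-116, first floor, Block B, State Key Laboratory of Forest Genetics and Breeding, Chinese Academy of Forestry, Haidian District, Beijing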
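Application
The workshop charges no fee. No station or airport pickup is arranged, and participants cover their own meals and lodging. To keep the learning effective, places are limited (no more than 40 participants).
Applicants must email their application materials to Dr. Miaomiao Zhang (mmzhang@caf.ac.cn) between August 1 and 10 (Beijing time), with the subject line “GS Workshop Beijing 2023”. The application materials (as a zip archive) include undergraduate and graduate transcripts, a CV, a written application (one page maximum), and a recommendation letter from the applicant's advisor. Admitted applicants will receive a notification and a confirmation form by August 20. They must return the completed confirmation form as an email attachment to mmzhang@caf.ac.cn by August 25, with the subject line “GS Workshop Beijing 2023-回执” (回执 means confirmation form).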
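Computer and software requirements
Bring your own laptop (Mac, Windows, or Linux) with R and RStudio installed in advance.
Contact
Dr. Miaomiao Zhang, Research Institute of Forestry, Chinese Academy of Forestry
Email: mmzhang@caf.ac.cn; Phone: 15010431326
Dr. Nan Lu, Research Institute of Forestry, Chinese Academy of Forestry
Email: ln_890110@163.com; Phone: 13522657300
Website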
https://zzlab.net/Beijing2023GS
Genomic Prediction Workshop
Chinese Academy of Forestry, Beijing, China
September 6-10, 2023
Instructor: Dr. Zhiwu Zhang (Zhiwu.Zhang@wsu.edu)
Teaching Assistants: Dr. Jiabo Wang (wangjiaboyifeng@163.com)
Dr. Miaomiao Zhang (mmzhang@caf.ac.cn)
Dr. Nan Lu (ln_890110@163.com)
Website: https://zzlab.net/Beijing2023GS
Classroom: Room 115-116, Block B, State Key Laboratory of Forest Genetics and Breeding, Chinese Academy of Forestry, Haidian District, Beijing
Lecture: 8:30 AM-Noon
Lab: 1:30-5:00 PM
Objective: Develop the concepts and analytical skills needed for modern breeding by using genomic prediction within the framework of mixed linear models and Bayesian approaches.
Assessments: Exam (40%) and Project (60%).
Grade and Certificate: A (93%-100%); A- (90%-93%); B+ (87%-90%); B (83%-87%); B- (80%-83%); C+ (77%-80%); C (73%-77%); C- (70%-73%); D+ (66%-70%); D (60%-66%); and F (0%-60%).
Note: A score that falls exactly on a cut point receives the higher grade, and scores are not rounded. For example, a score of 93.00% receives an “A” and a score of 92.99% receives an “A-”. A certificate is available only for grades of D and above.
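The cut-point rule can be read as a simple lookup. The sketch below is illustrative only: the names `GRADE_CUTOFFS`, `letter_grade`, `earns_certificate`, and `final_score` are hypothetical, and combining the exam and project as a 40/60 weighted sum is an assumption drawn from the Assessments line rather than a stated formula.

```python
# Inclusive lower bounds for each letter grade, copied from the grade table.
GRADE_CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (66, "D+"), (60, "D"),
]


def letter_grade(score):
    """Map a percentage score to a letter grade without any rounding.

    A score exactly on a cut point gets the higher grade because the
    comparison against the lower bound is inclusive.
    """
    for lower_bound, grade in GRADE_CUTOFFS:
        if score >= lower_bound:
            return grade
    return "F"


def earns_certificate(score):
    """Certificates are available for grades of D and above."""
    return letter_grade(score) != "F"


def final_score(exam_pct, project_pct):
    """Assumed weighting: exam 40% and project 60% (hypothetical helper)."""
    return (40 * exam_pct + 60 * project_pct) / 100


if __name__ == "__main__":
    for score in (93.00, 92.99, 60.00, 59.99):
        print(score, letter_grade(score), earns_certificate(score))
```

Keeping the cut points in one ordered list means the "no rounding" rule needs no special handling: the raw score is compared directly against each lower bound.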
Exam: September 10, 90 minutes (3:30-5:00 PM), 25 multiple-choice questions.
Project: Due at 5 pm on October 13, 2023 (Beijing time).
Workshop Schedule
| September 2023 | Content | CROPS545* |
|---|---|---|
| 6 (Wednesday) | GWAS | 8-20 |
| 7 (Thursday) | MAS and Cross Validation | 21 and 23 |
| 8 (Friday) | BLUP Alphabet GS | 22 and 27 |
| 9 (Saturday) | Bayesian Alphabet GS | 24-26 |
| 10 (Sunday) | GWAS Assisted GS | NA |
*The equivalent lecture numbers in CROPS545 (Statistical Genomics) at Washington State University in 2023. Slides and R code are available at https://zzlab.net/teaching.
Final Project
Due on October 13, 2023, 5PM (Beijing time)
Data files: Choose a dataset from the recommended list (http://zzlab.net/StaGen/2023/Data/PublicData.pdf), or a dataset outside the list (please specify source, 5 extra points), or your own data that are released to the public (please specify source, 10 extra points).
Hand in: Email Zhiwu.Zhang@WSU.edu with the subject “Beijing2023GS Project”. The email should contain two links: one to the GitHub repository hosting your R code and data source, and the other to preprint.org hosting your manuscript. The email should state your real name and indicate whether extra credit applies.
Marker-Assisted Selection (MAS) is the earliest form of molecular breeding and is based on linkage analysis or Genome-Wide Association Studies (GWAS). Genomic Prediction, called Genomic Selection (GS) in animal and plant breeding, uses all available markers, whether or not they are statistically significant, to predict the genetic merit of individuals, especially for complex traits. The literature claims that incorporating GWAS results into GS improves accuracy over conducting GS or MAS alone (e.g., the two-fold accuracy increase reported by Ravelombola et al., 2020). However, many of these reports rely on an invalid procedure. As reviewed by McGowan et al. (2021) in Plant Breeding Reviews (accessible through the links in the reference list), an invalid procedure can artificially create a spurious accuracy improvement. There is a critical need to use valid procedures to determine whether there are conditions under which the incorporation improves genomic prediction accuracy, and to provide guidance on applying it.
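One commonly discussed way such a procedure becomes invalid is to select markers by GWAS on all individuals, including those later used for validation. This is an assumption here, because the paragraph above does not spell out the mechanism. The sketch below contrasts that leaky variant with a fold-wise one in which the marker scan sees only the training individuals. It is a deliberately simplified stand-in, not a GWAS or GS model from the workshop: the scan is a plain single-marker correlation, the predictor is a correlation-weighted sum of the selected markers, and every function name is hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def marker_effects(genotypes, phenotypes, rows):
    """Single-marker scan using only the individuals listed in `rows`."""
    y = [phenotypes[i] for i in rows]
    n_markers = len(genotypes[0])
    return [pearson([genotypes[i][j] for i in rows], y) for j in range(n_markers)]


def top_markers(genotypes, phenotypes, rows, n_top):
    """Columns of the n_top markers with the largest absolute correlation."""
    effects = marker_effects(genotypes, phenotypes, rows)
    ranked = sorted(range(len(effects)), key=lambda j: abs(effects[j]), reverse=True)
    return ranked[:n_top]


def cross_validate(genotypes, phenotypes, k=5, n_top=5, leak=False, seed=0):
    """Mean prediction accuracy (correlation) over k held-out folds.

    leak=False: markers are selected from training individuals only.
    leak=True:  markers are selected from all individuals (the leaky variant).
    """
    n = len(phenotypes)
    accuracies = []
    for test_rows in k_fold_indices(n, k, seed):
        held_out = set(test_rows)
        train_rows = [i for i in range(n) if i not in held_out]
        scan_rows = list(range(n)) if leak else train_rows
        selected = top_markers(genotypes, phenotypes, scan_rows, n_top)
        # Marker weights are always estimated from the training rows.
        weights = marker_effects(genotypes, phenotypes, train_rows)
        predicted = [sum(weights[j] * genotypes[i][j] for j in selected)
                     for i in test_rows]
        observed = [phenotypes[i] for i in test_rows]
        accuracies.append(pearson(predicted, observed))
    return sum(accuracies) / k


if __name__ == "__main__":
    rng = random.Random(1)
    # Pure-noise phenotype: no marker has a real effect.
    geno = [[rng.choice((0, 1, 2)) for _ in range(300)] for _ in range(100)]
    pheno = [rng.gauss(0, 1) for _ in range(100)]
    print("fold-wise selection:", cross_validate(geno, pheno, leak=False))
    print("leaky selection    :", cross_validate(geno, pheno, leak=True))
```

Because the simulated phenotype has no genetic signal, any accuracy clearly above zero in the leaky run would come from the selection step having seen the validation individuals, which is the kind of artifact a valid procedure is meant to exclude.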
Your investigation should compare the incorporation with both GS and MAS and address at least one of the following topics.
- Marker density
  When marker density is too low, there is little chance of identifying associated markers in GWAS, and the value of the incorporation is dramatically reduced. An appropriate marker density is essential for the incorporation.
- Multiple real traits
  Traits differ in genetic architecture, including the number of genes and heritability. Knowing how the incorporation behaves for a specific trait benefits breeding for that trait.
- Simulated traits
  Simulation has the advantage that the genetic architecture underlying a trait is known, including the true breeding values of individuals. The number of genes and heritability can be varied to find the combinations for which incorporating GWAS into GS is appropriate.
- GWAS models
  Statistical power varies substantially among GWAS models. A model with high statistical power provides valuable prior knowledge and less noise for the incorporation into GS.
- GS models
  Some GS models also function as GWAS models. Therefore, the benefit of incorporating GWAS varies among GS models.
- Incorporation procedures
  There are multiple procedures for incorporating GWAS results into GS, including fitting the associated markers as covariates in BLUP and Bayesian models (Spindel et al., 2016) or building kinship from the associated markers in gBLUP (Zhang et al., 2014).
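For the marker density and simulated traits topics, the inputs can be generated along the following lines. This is a minimal sketch under stated assumptions, not the simulation used in the workshop lectures: the trait is purely additive, causal-marker effects are drawn from a standard normal distribution, density is reduced by keeping every k-th marker, and the names `thin_markers`, `variance`, and `simulate_trait` are hypothetical.

```python
import random


def thin_markers(genotypes, keep_every):
    """Lower marker density by keeping every `keep_every`-th marker column."""
    return [row[::keep_every] for row in genotypes]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate_trait(genotypes, n_qtn, h2, seed=0):
    """Simulate an additive trait with n_qtn causal markers and heritability h2.

    Returns (qtn_columns, true_breeding_values, phenotypes).
    """
    if not 0 < h2 <= 1:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)
    n_markers = len(genotypes[0])
    qtn = rng.sample(range(n_markers), n_qtn)   # causal marker columns
    effects = [rng.gauss(0, 1) for _ in qtn]    # assumed normal effect sizes
    breeding_values = [
        sum(effect * row[j] for j, effect in zip(qtn, effects))
        for row in genotypes
    ]
    # Scale the residual variance so that var_g / (var_g + var_e) equals h2.
    var_e = variance(breeding_values) * (1 - h2) / h2
    residual_sd = var_e ** 0.5
    phenotypes = [bv + rng.gauss(0, residual_sd) for bv in breeding_values]
    return qtn, breeding_values, phenotypes


if __name__ == "__main__":
    rng = random.Random(42)
    geno = [[rng.choice((0, 1, 2)) for _ in range(1000)] for _ in range(200)]
    # Vary the number of genes and the heritability, as the topic suggests.
    for n_qtn in (5, 50):
        for h2 in (0.2, 0.8):
            _, bv, pheno = simulate_trait(geno, n_qtn, h2, seed=1)
            realized = variance(bv) / variance(pheno)
            print(n_qtn, h2, round(realized, 2))
    print(len(thin_markers(geno, 10)[0]), "markers after thinning")
```

Since the true breeding values are returned alongside the phenotypes, prediction accuracy can be measured against the truth rather than against noisy observations. The realized heritability printed in the loop will differ somewhat from the target in finite samples.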
Your R code should be bug free (10 points) and contain comments to illustrate the purpose (10 points). Your manuscript should contain the following sections:
- Title (5 points)
- Abstract (10 points)
- Introduction (10 points)
- Method (15 points)
- Results (15 points)
- Discussion (10 points)
- Conclusion (10 points)
- References (5 points)
References
- Bernardo, R. 1994. Prediction of maize single-cross performance using RFLPs and information from related hybrids. Crop Sci. 34: 20–25. https://doi.org/10.2135/cropsci1994.0011183X003400010003x
- Brian Rice and Alexander E. Lipka. Evaluation of RR-BLUP Genomic Selection Models that Incorporate Peak Genome-Wide Association Study Signals in Maize and Sorghum. The Plant Genome, 2019. https://doi.org/10.3835/plantgenome2018.07.0052
- Bertrand C.Y. Collard and David J. Mackill. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philosophical Transactions of the Royal Society B, 2007. https://doi.org/10.1098/rstb.2007.2170
- Matthew McGowan, Jiabo Wang, Haixiao Dong, Xiaolei Liu, Yi Jia, Xianfeng Wang, Hiroyoshi Iwata, Yutao Li, Alexander E. Lipka, and Zhiwu Zhang*. Ideas in Genomic Selection with the Potential to Transform Plant Molecular Breeding: A Review. Plant Breeding Reviews, Vol. 45 (ed. Irwin Goldman), John Wiley & Sons, Inc., 2021, pp. 273-320. https://doi.org/10.1002/9781119828235.ch7
- Spindel, JE, H Begum, D Akdemir, B Collard, E Redoña, J-L Jannink and S McCouch. Genome-wide prediction models that incorporate de novo GWAS are a powerful new tool for tropical rice improvement. Heredity, 2016. https://www.nature.com/articles/hdy2015113
- VanRaden, P. M. 2008. Efficient methods to compute genomic predictions. J. Dairy Sci. 91: 4414–4423. https://doi.org/10.3168/jds.2007-0980
- Waltram Second Ravelombola, Jun Qin, Ainong Shi, Liana Nice, Yong Bao, Aaron Lorenz, James H. Orf, Nevin D. Young, Senyu Chen. Genome-wide association study and genomic selection for tolerance of soybean biomass to soybean cyst nematode infestation. PLOS One, 2020. https://doi.org/10.1371/journal.pone.0235089
- Yao Zhou, MI Vales, Aoxue Wang, Zhiwu Zhang. Systematic bias of correlation coefficient may explain negative accuracy of genomic prediction. Briefings in Bioinformatics, 2016. https://doi.org/10.1093/bib/bbx133
- Zhe Zhang, Ulrike Ober, Malena Erbe, Hao Zhang, Ning Gao, Jinlong He, Jiaqi Li. Improving the Accuracy of Whole Genome Prediction for Complex Traits Using the Results of Genome Wide Association Studies. PLOS One, 2014. https://doi.org/10.1371/journal.pone.0093017
- Zhang Z, Todhunter RJ, Buckler ES, Van Vleck LD. Technical note: Use of marker-based relationships with multiple-trait derivative-free restricted maximal likelihood. J Anim Sci, 2007, 85: 881–885. https://doi.org/10.2527/jas.2006-656