Rice, Oryza sativa L., is the staple food
for half the world’s population. By 2030, rice production must increase
by at least 25% to keep pace with population growth. Accelerated genetic
gains in rice improvement are needed to mitigate the effects of climate
change and loss of arable land and to ensure global food supply. On May
28, 2014, data from an international effort resequencing a core
collection of 3,024 rice accessions from 89 countries was released as a
global public good. This data provides a foundation for
large-scale discovery of novel alleles for important rice phenotypes
using various bioinformatics and/or genetic approaches. It also helps
to understand, at a higher level of detail, the genomic diversity within
O. sativa, and provides a foundation for establishing a global,
public rice genetic/genomic database and information platform for
advancing rice breeding technology for future rice improvement.

The initial publications on the dataset, namely the data note "The 3,000 rice genomes project" and the commentary "The 3,000 rice genomes project: new opportunities and challenges for future rice research", are published in GigaScience. The main publication of the 3K RG project in Nature is entitled: The complete list of rice accessions sequenced for the 3K RG project is available on this site and in GigaDB.

RAW SEQUENCING DATA AVAILABILITY

The 3,024 sequenced rice genomes had an average sequencing depth of 14X, with average genome coverage and mapping rates of 94.0% and 92.5%, respectively. Raw sequencing data are available from GigaDB, EBI, NCBI (accession PRJEB6180), and DDBJ (accession ERP005654).

To further enable the utilization of the 3K RG dataset by the global rice community, we also released the primary analysis results for variant discovery on the sequencing data, with 24 additional genomes included, resulting in over 120 terabytes of downloadable data. The dataset is released under the stipulations for data analysts and data users in the Toronto Statement, in the following resources:

1. SNP-Seek download area

2. Amazon Web Services (AWS) Public Data Set. Through a partnership with AWS, the 3000 Rice Genome data is freely available on Amazon S3. This enables anyone to use AWS on-demand computing resources to perform analysis and create new products. You can learn more about accessing and utilizing the data on AWS from the 3000 Rice Genome on AWS page. You can view IRIC resources that use the 3K RG dataset on AWS here.

3. Philippine DOST-ASTI COARE facility. IRRI is collaborating with the Philippines' Department of Science and Technology - Advanced Science and Technology Institute (DOST-ASTI) to use their data storage service, the COARE Data Catalog. The COARE Data Catalog is a web-based research repository that hosts a number of research datasets and offers a web interface to search and access them. The 3K RG dataset is hosted in the COARE Data Catalog. To access the dataset, visit https://asti.dost.gov.ph/coare/data and register for an account.

4. Web resources hosted by the Chinese Academy of Agricultural Sciences (CAAS)

5. Internally at IRRI, the sequence, alignment, and SNP call files, in FASTQ, BAM, and VCF formats, respectively, are available for copying; just send an email request to the IRIC coordinator. Be advised that these files are very large.

The following tables give more information about the re-sequenced accessions from the International Rice Genebank collection at IRRI (Table 1), and from the China National Crop Genebank and the CAAS working collections (Table 2).
Table 2. Information for the 534 rice accessions from the China National Crop Genebank and the CAAS working collections. *MC = the mini-core collection accessions established by China Agricultural University [9]; IRMBN = the parental lines used in the international rice molecular breeding network, selected previously to represent the mini-core collections based on the isozyme data.
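
For readers who want to script access to the raw reads described under RAW SEQUENCING DATA AVAILABILITY, the following is a minimal sketch, not an official IRIC tool. It assumes that the project accession PRJEB6180 quoted on this page can be queried through the ENA Portal API file report endpoint at EBI. The endpoint, the requested field names, and the helper functions build_filereport_url, parse_filereport, and fastq_urls are illustrative assumptions and do not come from this page.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the ENA Portal API "filereport" endpoint at EBI.
# This page only states that raw data are available from EBI.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_filereport_url(accession: str = "PRJEB6180") -> str:
    """Return a URL requesting one tab-separated row per sequencing run."""
    query = urlencode({
        "accession": accession,   # project accession quoted on this page
        "result": "read_run",     # assumed result type: one row per run
        "fields": "run_accession,sample_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text: str) -> list[dict[str, str]]:
    """Parse the tab-separated report into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def fastq_urls(row: dict[str, str]) -> list[str]:
    """Split the semicolon-separated fastq_ftp field into individual paths."""
    return [path for path in row.get("fastq_ftp", "").split(";") if path]
```

To use it, fetch the URL returned by build_filereport_url() with any HTTP client, pass the response text to parse_filereport, and call fastq_urls on each row. Given the size of the dataset, it is sensible to inspect the listing before downloading anything.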
Selected Varieties

We suggest using this list of 72 varieties in further studies. The selection was made based on diversity, availability of seeds in IRRI Genebank and sequencing coverage.

Publications
♦ 3K R.G.P. The 3,000 rice genomes project. GigaScience 2014, 3:7.
♦ Li, J., Wang, J. and Zeigler, R. S. The 3,000 rice genomes project: new opportunities and challenges for future rice research. GigaScience 2014, 3:8.