Rice SNP-Seek Datasets
Variety Datasets
The 3KRG consists of 3024 genotypes.
Publication | Metadata | Metadata
RICE_RP
The RICE_RP comprises 4591 combined samples drawn from those genotyped on the HDRA (1568) and/or sequenced in the 3KRG (3024). Complete SNP calls were obtained through imputation across the unique genotypes.
BAAP
The Bengal and Assam Aus Panel (BAAP) comprises 299 cultivars, with 2 million SNPs identified after imputation relative to the 3KRG 4.8M filtered dataset.
HDRA
The HDRA germplasm consists of 1,568 diverse rice lines genotyped using a high-density rice array (HDRA) of 700,000 SNPs.
GQ
The Grain Quality (GQ) panel comprises 92 breeding lines developed by IRRI between 1966 and 2015. Whole-genome sequencing of these 92 lines was performed to identify genetic variants in Head Rice Yield (HRY) and chalkiness genes/loci.
PUE
The Phosphorus Utilization Efficiency (PUE) Panel consists of nine entries that were extensively studied for PUE by Wissuwa et al.
Publication | Metadata
IRRI Elite Lines
The IRRI Elite Lines panel comprises 169 genotypes drawn from the former irrigated and rainfed breeding programs. They were chosen to represent the diversity found in those breeding programs, combined with high yield performance. These lines made up the core panels described in various publications as the Elite Core Panel (ECP), ICP, etc.
3K RG SNP Datasets
3KAll
Full 3K RG SNP dataset (v.5): ~32 million biallelic and multiallelic SNPs
Total SNPs: 32,064,217
Samples: 3024
3Kbase
A Base SNP set of ~18 million SNPs was created from the ~29 million biallelic SNPs by removing SNPs with an excess of heterozygous calls.
18,128,777 SNPs (the Base SNP set)
3Kcore
404k Core SNP dataset
The Core SNP set (v0.7) was obtained from the filtered SNP set (v0.7) by applying a two-step LD pruning procedure as follows:
1) LD pruning with window size 10kb, step 1 SNP, R2 threshold 0.8
2) LD pruning with window size 50 SNPs, step 1 SNP, R2 threshold 0.8
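The two pruning steps above can be sketched in a few lines of Python. This is only an illustration under several assumptions that the description does not state: SNPs are on a single chromosome and sorted by position, genotypes are coded as alternative-allele counts (0, 1, 2) with no missing calls, LD is measured as the squared Pearson correlation between genotype vectors, and within a pair exceeding the threshold the later SNP is the one dropped. The function names (`r_squared`, `prune`, `two_step_prune`) are hypothetical and do not refer to the tool actually used to build the Core SNP set.

```python
from typing import Callable, List, Sequence

R2_THRESHOLD = 0.8  # from the text: R2 threshold 0.8 in both steps


def r_squared(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def prune(
    genotypes: Sequence[Sequence[int]],
    in_window: Callable[[int, int], bool],
    r2_max: float = R2_THRESHOLD,
) -> List[int]:
    """Greedy pruning: slide the window start one SNP at a time and drop
    any later SNP in the window whose r^2 with the start SNP exceeds r2_max
    (assumption: the later SNP of the pair is the one removed)."""
    removed = set()
    n = len(genotypes)
    for i in range(n):
        if i in removed:
            continue
        j = i + 1
        while j < n and in_window(i, j):
            if j not in removed and r_squared(genotypes[i], genotypes[j]) > r2_max:
                removed.add(j)
            j += 1
    return [i for i in range(n) if i not in removed]


def two_step_prune(
    positions: Sequence[int], genotypes: Sequence[Sequence[int]]
) -> List[int]:
    """Return indices (into the input) of SNPs surviving both steps."""
    # Step 1: window defined by physical distance (10 kb).
    kept1 = prune(genotypes, lambda i, j: positions[j] - positions[i] <= 10_000)
    genotypes1 = [genotypes[i] for i in kept1]
    # Step 2: window defined by SNP count (50 SNPs, including the start SNP).
    kept2 = prune(genotypes1, lambda i, j: j - i < 50)
    return [kept1[i] for i in kept2]


positions = [100, 200, 5000, 20000, 20100]
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to the first SNP, 100 bp away
    [2, 0, 1, 1, 0, 2],
    [0, 2, 0, 2, 0, 2],
    [0, 2, 0, 2, 0, 2],  # identical to the fourth SNP, 100 bp away
]
survivors = two_step_prune(positions, genotypes)
```

In this toy input the second and fifth SNPs are perfect copies of their neighbours and are removed in the first step. The second step matters for pairs that are in high LD but more than 10 kb apart, which the distance-based window never compares.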
3K filtered
4.8 million filtered SNP dataset
The filtered SNP set was obtained from the Base SNP set by applying the following filtering criteria:
alternative allele frequency at least 0.01
proportion of missing calls per SNP at most 0.2
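The two criteria above can be expressed as a small per-SNP check. The sketch below is a minimal illustration, not the pipeline used to build the dataset. It assumes diploid genotypes coded as alternative-allele counts (0, 1, 2) with `None` for a missing call, and it assumes the allele frequency is computed over called genotypes only. The names `passes_filter` and `filter_snps` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

MIN_ALT_FREQ = 0.01  # from the text: alternative allele frequency at least 0.01
MAX_MISSING = 0.2    # from the text: proportion of missing calls per SNP at most 0.2


def passes_filter(calls: Sequence[Optional[int]]) -> bool:
    """Return True if one SNP's calls satisfy both filtering criteria."""
    if not calls:
        return False
    called = [g for g in calls if g is not None]
    missing_rate = (len(calls) - len(called)) / len(calls)
    if missing_rate > MAX_MISSING or not called:
        return False
    # Assumption: each called diploid sample contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    return alt_freq >= MIN_ALT_FREQ


def filter_snps(snps: Dict[str, Sequence[Optional[int]]]) -> List[str]:
    """Return the IDs of SNPs that pass, in input order."""
    return [snp_id for snp_id, calls in snps.items() if passes_filter(calls)]


example = {
    "snp_a": [0, 0, 1, 2, None],     # 20% missing, alt frequency 3/8
    "snp_b": [0, None, None, 1, 2],  # 40% missing
    "snp_c": [0, 0, 0, 0, 0],        # no alternative alleles
}
kept = filter_snps(example)
```

Here only `snp_a` is retained: `snp_b` has too many missing calls and `snp_c` has an alternative allele frequency of zero.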
1k-RiCA
The 1K-Rice Custom Amplicon, or 1k-RiCA, is a robust custom sequencing-based amplicon panel of ~1000 SNPs (version 3 = xxxx SNPs) that are uniformly distributed across the rice genome. It is designed to be highly informative within indica rice breeding pools and is tailored for genomic prediction in elite indica rice breeding programs.
Cornell_6K_Array_Infinium_Rice (C6AIR)
The Cornell_6K_Array_Infinium_Rice panel includes 4429 SNPs from re-sequencing data and 1571 SNP markers from previous BeadXpress 384-SNP sets, selected based on polymorphism rate and allele frequency within and between target germplasm groups.