Celebrating achievements in the IITA Year of Open Access
The Institute has made significant progress towards complying with management's decision to adopt Open Access, which makes its work available to the public for free. As a direct outcome of implementing Open Access, a new institutional repository called CG Space was introduced. It is a shared DSpace implementation hosted by the International Livestock Research Institute. CG Space stores not only scholarly publications (journal articles, books, etc.) but also other textual and multimedia content such as reports, field protocols, pictures, posters, and presentations. The repository fulfills all Open Access requirements: its content is permanently accessible without restriction or login, free of charge, described by sufficient metadata, and open for other machines or websites to harvest. It also offers easier and more flexible search and browsing, shows usage statistics on views and downloads, and provides useful metadata, all consistently across 8 CG Centers, 7 CRPs, 4 CG programs, and 9 other CG Space partners.
Furthermore, the Cassava Breeding Program of IITA developed an innovative method to securely capture cassava field data using electronic field book applications on tablets, which record data in milliseconds. A barcode reader in each tablet reads generated barcode labels, which are used, for example, for accurate and efficient plot identification. The tablets are then connected to a multifunction platform called Cassavabase, which makes the collected data readily available for downstream analysis, in compliance with the Institute’s Open Access policy.
The program uses Cassavabase as its primary data management tool for uploading both phenotyping and genotyping data. These data are useful for implementing genomic selection and will improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Currently, Cassavabase holds over 1500 phenotyping trials with ~8 million phenotypic observations and ~2 billion genotypic data points, and has more than 400 registered users.
IITA has also developed and implemented the Breeding Management System (BMS), a comprehensive and easy-to-use software suite designed to help breeders conduct their routine activities more efficiently. Developed by the Integrated Breeding Platform (IBP) based at IITA-Nairobi in Kenya, the BMS provides interconnected tools for breeding program management, data analysis, and decision support. It also provides a database that works seamlessly with these tools to manage pedigree information, phenotypic and molecular characterization, and germplasm evaluation.
The Bioinformatics Unit of the Institute has also begun using high-throughput sequencing, an emerging technology that allows fast and inexpensive sequencing of a whole genome. This makes the process affordable to many researchers and leads to the production of large amounts of data. However, the technology demands high computer processing power to efficiently store and analyze large data sets. IITA has been using these sequencing data for more than a year for gene discovery and genotyping to accelerate breeding cycles.
Currently, the unit holds more than 4 TB of compressed sequencing data from different crops. To visualize this amount of stored data: if just the text of these sequencing data were printed, the printout would stretch about 300 km end to end. For large-scale data processing, the Bioinformatics Unit is equipped with upgraded computing power consisting of 64 cores and a combined 900 gigabytes of RAM. The current capacity is set up to store 30 TB of data and to process 2 TB of compressed data in a single analysis run. This allows IITA to handle large-scale genotyping, gene expression, and whole-genome sequencing data for advanced research in plant genomics. This capacity enables IITA researchers to increase the precision of correlating traits, including complex traits, to markers, which in turn contributes towards faster and more efficient crop breeding.
Building on the success of Cassavabase, work is now in progress to develop two sister platforms, Musabase and Yambase. IITA is also an active contributor to the development of the CGIAR Consortium’s “Big Data Platform Project”. The data pool to be generated from this multi-center CGIAR platform could be used, for example, to feed agronomic information and advice directly to farmers through electronic or mobile technology.