Labaer

Annotation and Curation

RNA interference (RNAi) has emerged as an essential tool for loss-of-function studies in model organisms. The development of The RNAi Consortium (TRC) short hairpin RNA (shRNA) library, which contains 150,000 hairpins targeting about 30,000 human and mouse genes, has allowed for genome-scale loss-of-function screening in mammalian cells and systems. The utility of this important resource has been demonstrated through various studies, notably the high-content screening of mitotic progression in human cancer cells. However, the results obtained from screens are relevant only when the biological roles of the shRNA "hits" are identified and studied in the context of the biological question of interest. To achieve this, the gene targeted by each shRNA needs to be accurately identified. This is not necessarily straightforward because, although the RNAi sequences are fixed when the library is constructed, the reference sequences and gene annotations are updated constantly. Moreover, in a large-scale screen, checking each hairpin individually to validate its annotation is not feasible. Therefore, there is a need to maintain an updated annotation of this collection. We developed a multiple-step strategy to annotate the shRNA clones based on the current reference sequences and gene assignments. In addition, we developed the program iTARGET to distinguish clones that target all isoforms of a gene, clones that target a specific isoform, and clones that target multiple genes.
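The distinction among clones targeting all isoforms, a specific isoform, or multiple genes can be illustrated with a minimal Python sketch. This is not the iTARGET implementation, whose matching rules are not described here. It assumes exact substring matching of the hairpin target sequence against transcript sequences, and the function name classify_hairpin, the category labels, and the toy transcripts are all hypothetical.

```python
from collections import defaultdict

# Toy reference set (hypothetical): transcript ID -> (gene symbol, sequence).
# Real annotation would use current reference transcript sequences.
TRANSCRIPTS = {
    "TX_A1": ("GENE_A", "ATGGCCTTAGCGTACGATCGTTAGC"),
    "TX_A2": ("GENE_A", "ATGGCCTTAGCGTACGCCCAAATTT"),
    "TX_B1": ("GENE_B", "ATGCCCAAATTTGGGCATCGTTAGC"),
}


def classify_hairpin(target_seq, transcripts):
    """Classify a hairpin by which transcripts contain its target sequence.

    Assumption: a transcript counts as targeted only on an exact match.
    """
    isoforms_by_gene = defaultdict(set)  # every known isoform of each gene
    hits_by_gene = defaultdict(set)      # isoforms containing the target

    for transcript_id, (gene, sequence) in transcripts.items():
        isoforms_by_gene[gene].add(transcript_id)
        if target_seq in sequence:
            hits_by_gene[gene].add(transcript_id)

    if not hits_by_gene:
        return "no_target"
    if len(hits_by_gene) > 1:
        return "multiple_genes"

    # Exactly one gene is hit: compare matched isoforms with all its isoforms.
    gene, hits = next(iter(hits_by_gene.items()))
    if hits == isoforms_by_gene[gene]:
        return "all_isoforms"
    return "specific_isoforms"


if __name__ == "__main__":
    for target in ("GCCTTAGCGTACG", "GTACGATCG", "ATCGTTAGC", "TTTTTTTT"):
        print(target, classify_hairpin(target, TRANSCRIPTS))
```

Because the classification depends only on the transcript set passed in, rerunning it against an updated reference set re-annotates the same fixed hairpin sequences, which is the reason the annotation has to be refreshed as references change.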

With more genome sequencing and experimental data available, our knowledge of the mammalian genome and proteome is improving every day. The annotation and curation of biological reagents is therefore an ongoing effort. For reagents used in genome-scale functional studies in particular, scientists benefit from the most up-to-date and accurate annotation during experimental design and data interpretation.