Virus detection by high-throughput sequencing of small RNAs: Large-scale performance testing of sequence analysis strategies

Sebastien Massart, Michela Chiumenti, Kris De Jonghe, Rachel Clover, Annelies Haegeman, Igor Koloniuk, Petr Kominek, Jan Kreuze, Denis Kutnjak, Leonidas Lotos, François Maclot, Varvara Maliogka, Hano Maree, Thibaut Olivier, Antonio Olmos, Mikhail Pooggin, Jean-Sébastien Reynard, Anna Ruiz-Garcia, Dana Safarova, Pierre Schneeberger, Noa Sela, Sylvia Turco, Eeva Vainio, Eva Varallyay, Eric Verdin, Marcel Westenberg, Yves Brosteaux, Thierry Candresse

    Research output: Contribution to journal, A1: Web of Science article


    Recent developments in high-throughput, next-generation sequencing (NGS) technologies and bioinformatics have drastically changed research on viral pathogens and spurred growing interest in the field of virus diagnostics. However, the reliability of NGS-based virus detection protocols must be evaluated before they are adopted for diagnostics. Many different bioinformatics algorithms aimed at detecting viruses in NGS data have been reported, but little attention has been paid so far to their sensitivity and reliability for diagnostic purposes. We therefore compared the performance of existing bioinformatics pipelines through a double-blind, large-scale performance test involving 21 participants from 16 countries and using ten datasets of 21-24 nt small (s)RNA sequences from three different infected plants. The sensitivity of virus detection ranged between 35 and 100% among participants, with a marked negative effect when sequence numbers decreased. The false positive detection rate was very low and mainly related to the identification of host genome-integrated viral sequences or misinterpretation of the results. Reproducibility was high (91.6%). This work revealed that (i) virus detection is complex and new viruses can be discovered using sRNAs, (ii) it is difficult to detect viral agents when sRNA abundance is low, (iii) reference sequence databases can be inconsistent for virus detection, and (iv) scientific expertise is important when interpreting diagnostic results. Overall, this work brings valuable insights into the reliability of bioinformatics pipelines and the impact of the end user and of database completeness on the results. It also underlines key parameters and proposes recommendations for reliable sRNA-based detection of known and unknown viruses.
    Pages (from-to): 488-497
    Number of pages: 10
    Status: Published - 8 Feb 2019
