NGS and virus diagnostic: does sequence analysis strategies really matter? Results of an international proficiency testing on siRNA

COST-Divas working group 1: Sébastien Massart, Ian Adams, Kris De Jonghe, Annalisa Giampetruzzi, Igor Koloniuk, Petr Kominek, Jan Kreuze, Denis Kutnjak, Leonidas Lotos, Hans J. Maree, Thibaut Olivier, Mikhail Pooggin, Ana B. Ruiz-García, Dana Safarova, P. H. H. Schneeberger, Noa Sela, Eva Varallyay, Eeva Vainio, Eric Verdin, Marcel Westenberg, Yves Brostaux and Thierry Candresse

    Research output: Chapter in Book/Report/Conference proceeding, C3: Conference abstract


    Recent developments in high-throughput sequencing (also called Next Generation Sequencing, NGS) technologies and bioinformatics have drastically changed research on viral pathogens and are now raising growing interest for virus diagnostics. However, any diagnostic technique has to be included in standardized protocols. A huge diversity of bioinformatics protocols for virus discovery has been reported in the scientific literature but, to date, their reliability for diagnostic purposes has not been addressed. The objective of this work was therefore to compare the performance of existing bioinformatics pipelines and of result interpretation through a double-blinded, large-scale proficiency test based on a set of ten fastq files and involving 21 laboratories from 16 countries. The fastq files contained 50,000 (three files), 250,000 (four files) or 2.5 M (three files) sequences of 21-24 nt originating from 3 samples. The false positive rate was only 0.5% and was mainly related to the identification of integrated sequences or to misinterpretation of the results. The overall sensitivity of detection was 57% and ranged from 35% to 100% across laboratories, with a marked effect of rarefaction for some laboratories. A principal component analysis and correlation studies highlighted the most important parameters for appropriate diagnosis. The repeatability of detection was 73%. This work also underlined (i) the complexity of discovering new viruses by NGS, (ii) the difficulty of detecting viral pathogens with a low number of siRNA reads, and (iii) the inconsistencies of databases and their impact on results. Overall, this work brings key insights into the reliability of bioinformatics pipelines and underlines some key parameters for achieving reliable detection of viruses in a diagnostic setting using siRNA sequencing.

    Acknowledgement: This article is based upon work from COST Action FA1407, supported by COST (European Cooperation in Science and Technology).
    Title: Abstract book - Rencontre de virologie végétale 2017
    Number of pages: 1
    Status: Published - Jan 2017
    Event: "Rencontre de virologie végétale" Aussois (France) - Aussois, France
    Duration: 15 Jan 2017 to 19 Jan 2017
