Discipline IBI5071
Theoretical and practical approaches to metagenomics for virus discovery

Concentration area: 95131

Creation: 06/01/2016

Activation: 07/01/2016

Credits: 2

Workload:

Theory (weekly): 12 hours
Practice (weekly): 12 hours
Study (weekly): 6 hours
Duration: 1 week
Total: 30 hours

Professor:

Arthur Gruber

Objectives:

This course introduces fundamental concepts of metagenomics experimental design and data analysis using next-generation sequencing data. Topics covered include sequencing technologies, the rationale for the experimental design of metagenomics studies, the theoretical basis of the most common bioinformatics and statistical tools used for metagenome analysis, public databases, and practical approaches to data analysis using web-based and command-line tools. Both theoretical and practical classes will be provided. The course will also present examples of advances in microbial ecology and viral discovery derived from recent metagenomic studies, including some mainstream approaches and a practical session using the GenSeed-HMM program.

Rationale:

The advent of next-generation sequencing has made it possible to sequence not only a single genome but the genomes of a whole community of microorganisms in a biome. Metagenomics allows estimating the biological diversity of a sample and the whole set of enzymes and pathways present in the community, as well as detecting still-unknown organisms. We currently know only a small fraction of the immense diversity of viruses. The use of metagenomic data and the identification of emerging viruses represent a major bioinformatics challenge. In this course we intend to cover some methods and tools for processing metagenomic data and their use for viral discovery, including theoretical concepts and practical sessions.

Content:

1. Introduction to bioinformatics analysis (biological questions vs. bioinformatics analysis, main approaches in bioinformatics, pairwise sequence alignment, databases of sequences, metabolic pathways, and orthology)
2. DNA sequencing (Sanger sequencing and next-generation sequencing, advantages of each platform, data formats, quality, trimming, fragment assembly and its evaluation metrics)
3. What is metagenomics?
4. Types of metagenomic analysis: amplicon, shotgun, functional
5. Amplicon metagenomics (history, phylogenetic markers, experimental design, preliminary data analysis, OTUs, taxonomic assignment, multivariate statistical analysis, the QIIME package)
6. Shotgun metagenomics (assembly, experimental design, absolute vs. relative analysis, functional and taxonomic annotation)
7. Innovative methods for viral metagenomics and viral discovery (VirSorter, alternative methods of sequence reconstruction, databases of viral sequences, orthologous groups, and profile HMMs, vFam, POGs, and the viral section of eggNOG, viral genome reconstruction using profile HMMs as seeds)

Type of Assessment:

Graded in-class practical exercises and a theoretical exam.

Notes/Remarks:

The course will be taught in English.

Bibliography:

No textbook is required for this course. Some papers covering the main topics are listed below. Additional papers will be assigned and made available on the course’s web site in advance.

Alves, J.M., de Oliveira, A.L., Sandberg, T.O., Moreno-Gallego, J.L., de Toledo, M.A., de Moura, E.M., Oliveira, L.S., Durham, A.M., Mehnert, D.U., Zanotto, P.M., Reyes, A., and Gruber, A. (2016). GenSeed-HMM: A Tool for Progressive Assembly Using Profile HMMs as Seeds and its Application in Alpavirinae Viral Discovery from Metagenomic Data. Front Microbiol 7, 269.
Bexfield, N., and Kellam, P. (2011). Metagenomics and the molecular identification of novel viruses. Vet J 190, 191-198.
Bibby, K., and Peccia, J. (2013). Identification of viral pathogen diversity in sewage sludge by metagenome analysis. Environ Sci Technol 47, 1945-1951.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Pena, A.G., Goodrich, J.K., Gordon, J.I., Huttley, G.A., Kelley, S.T., Knights, D., Koenig, J.E., Ley, R.E., Lozupone, C.A., Mcdonald, D., Muegge, B.D., Pirrung, M., Reeder, J., Sevinsky, J.R., Turnbaugh, P.J., Walters, W.A., Widmann, J., Yatsunenko, T., Zaneveld, J., and Knight, R. (2010). QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7, 335-336.
Fancello, L., Raoult, D., and Desnues, C. (2012). Computational tools for viral metagenomics and their application in clinical research. Virology 434, 162-174.
Huerta-Cepas, J., Szklarczyk, D., Forslund, K., Cook, H., Heller, D., Walter, M.C., Rattei, T., Mende, D.R., Sunagawa, S., Kuhn, M., Jensen, L.J., Von Mering, C., and Bork, P. (2015). eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res.
Kristensen, D.M., Waller, A.S., Yamada, T., Bork, P., Mushegian, A.R., and Koonin, E.V. (2013). Orthologous gene clusters and taxon signature genes for viruses of prokaryotes. J Bacteriol 195, 941-950.
Mokili, J.L., Rohwer, F., and Dutilh, B.E. (2012). Metagenomics and future perspectives in virus discovery. Curr Opin Virol 2, 63-77.
Reyes, A., Haynes, M., Hanson, N., Angly, F.E., Heath, A.C., Rohwer, F., and Gordon, J.I. (2010). Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature 466, 334-338.
Roux, S., Enault, F., Hurwitz, B.L., and Sullivan, M.B. (2015). VirSorter: mining viral signal from microbial genomic data. PeerJ 3, e985.
Sharma, D., Priyadarshini, P., and Vrati, S. (2015). Unraveling the web of viroinformatics: computational tools and databases in virus research. J Virol 89, 1489-1501.
Skewes-Cox, P., Sharpton, T.J., Pollard, K.S., and Derisi, J.L. (2014). Profile hidden Markov models for the detection of viruses within metagenomic sequence data. PLoS One 9, e105067.
Smits, S.L., Bodewes, R., Ruiz-Gonzalez, A., Baumgartner, W., Koopmans, M.P., Osterhaus, A.D., and Schurch, A.C. (2015). Recovering full-length viral genomes from metagenomes. Front Microbiol 6, 1069.
Tang, P., and Chiu, C. (2010). Metagenomics for the discovery of novel human viruses. Future Microbiol 5, 177-189.
Yutin, N., Wolf, Y.I., Raoult, D., and Koonin, E.V. (2009). Eukaryotic large nucleo-cytoplasmic DNA viruses: clusters of orthologous genes and reconstruction of viral genome evolution. Virol J 6, 223.