Bioinformatics tools for metagenomic analysis of bacterial communities

Research output: Thesis › Doctoral thesis

Abstract

Metagenomics, the sequence analysis of DNA extracted directly from the environment, bypasses strain isolation and cultivation, and with them the limitations of conventional analyses that rely on these steps. It is a relatively recent field, enabled by the development of high-throughput sequencing techniques, and its data require extensive computational analysis to be usable. Software for this purpose is therefore still in need of development, which the present project aimed to address. To this end, we created two new methods, PaSiT and MAGISTA. PaSiT efficiently computes inter-genome distances, which can be used to assign a taxonomy to genomes recovered through metagenome analysis, without requiring extensive computational infrastructure. MAGISTA is a machine-learning approach that provides an alternative to marker-gene-based approaches for estimating the quality of these putative genomes. In addition to developing new tools, we evaluated existing sequencing technologies, and the tools that analyse their output, using a pre-defined mix of 227 bacterial strains, the most complex DNA mock community created so far. The sequencing platforms considered were those produced by Illumina, Oxford Nanopore Technologies, and Pacific Biosciences. We concluded that, overall, Oxford Nanopore Technologies provided the best value for metagenomics, although the other technologies had their own use cases.

Details

Original language: English
Qualification: Master of Science
Awarding Institution
  • UGent - Universiteit Gent
Supervisors/Advisors
  • Vandamme, Peter, Supervisor, External person
  • Van Houdt, Rob, SCK CEN Mentor
Award date: 9 Nov 2021
Publisher
  • UGent - Universiteit Gent
Publication status: Published - 9 Nov 2021

Keywords

  • Prokaryotes, Bioinformatics

ID: 7382514