sequence simulator for ancient DNA
gargammel is a pipeline to simulate ancient DNA (aDNA) from a set of known references. gargammel seeks to emulate the in vivo process that leads to the sequencing of aDNA fragments:
- First, fragments are collected from a set of reference sequences. These reference sequences are designed to represent the endogenous DNA, contamination from the same species (e.g. present-day humans), and/or microbial contamination.
- Second, aDNA damage is added to those fragments. Sequencing adapters are appended to form reads of a specific length.
- Finally, sequencing errors with corresponding quality scores are applied to the reads.
The misincorporation patterns of ancient DNA fragments can also be accounted for. To simulate Illumina sequencing errors, we use the ART package.
The resulting data is a set of Illumina reads that can be used to test certain hypotheses about aDNA. Here is a potential set of questions that could be answered using our pipeline:
- Impact of present-day human contamination on various statistics used in population genetics
- Influence of high levels of deamination on mapping
- Impact of a high number of microbial sequences on alignment to a specific reference
- Ability to infer the metagenomic profile of a sample depending on the aDNA fragment length distribution
- Accuracy of contamination estimates
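The first two stages of the process described in the overview (collecting fragments from a reference, adding damage, and appending an adapter to form a read) can be illustrated with a toy sketch. This is not gargammel's implementation: the function names, the geometric decay of the C-to-T rate from the 5' end, and the adapter string are all assumptions chosen for illustration, and sequencing errors (handled by ART in the real pipeline) are not modeled.

```python
import random

# Translation table used to reverse-complement a fragment.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sample_fragment(reference, length, rng):
    """Pick a random fragment of `length` bases from either strand.

    Hypothetical helper; requires length <= len(reference).
    """
    start = rng.randrange(len(reference) - length + 1)
    fragment = reference[start:start + length]
    if rng.random() < 0.5:
        # Take the fragment from the reverse strand half of the time.
        fragment = fragment.translate(COMPLEMENT)[::-1]
    return fragment


def add_deamination(fragment, rate_5p, decay, rng):
    """Apply C->T substitutions whose probability decays from the 5' end.

    Assumed toy model: position i is deaminated with probability
    rate_5p * decay**i. Real damage profiles are more complex.
    """
    damaged = []
    for i, base in enumerate(fragment):
        if base == "C" and rng.random() < rate_5p * decay ** i:
            damaged.append("T")
        else:
            damaged.append(base)
    return "".join(damaged)


def to_read(fragment, adapter, read_length):
    """Append the adapter and keep at most `read_length` bases.

    If fragment + adapter is shorter than read_length, the read is
    simply shorter; this sketch does not pad it.
    """
    return (fragment + adapter)[:read_length]


if __name__ == "__main__":
    rng = random.Random(42)          # fixed seed for reproducibility
    reference = "ACGTTGCACCGGTTAA" * 10   # made-up reference sequence
    adapter = "AGATCGGAAGAGC"        # illustrative adapter prefix
    fragment = sample_fragment(reference, 40, rng)
    damaged = add_deamination(fragment, rate_5p=0.3, decay=0.7, rng=rng)
    print(to_read(damaged, adapter, read_length=50))
```

Because the fragment here (40 bases) is shorter than the read length (50), the printed read ends with the first bases of the adapter, which is the situation adapter trimming has to handle for short aDNA fragments.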
gargammel has been developed in Ludovic Orlando's and Eske Willerslev's research groups at the Center for GeoGenetics at the University of Copenhagen. The code was implemented by Gabriel Renaud in collaboration with Kristian Hanghoej.
- February 28, 2021: We can now handle circular references.
- November 21, 2016: gargammel is published in Bioinformatics!
You can either type:
git clone --depth 1 https://github.com/grenaud/gargammel.git
Or click on the tar/zip links at the top of the page.
gargammel is composed of the following subprograms:
- fragSim: simulates DNA fragmentation due to degradation. It can also simulate the base composition at the 5'/3' ends of the fragments
- deamSim: adds in silico damage (deamination) to the DNA fragments
- adptSim: transforms the fragments (damaged or not) into raw Illumina reads
The three programs are written in C++. A driver script in Perl automates the process: it simulates the in vivo process that generates aDNA fragments and calls ART to add sequencing errors.
Documentation, requirements and examples of usage
For installation, usage and other questions please refer to the README.
Please cite our paper:
Renaud, G., Hanghoej, K., Willerslev, E. & Orlando, L. (2016). gargammel: a sequence simulator for ancient DNA. Bioinformatics, btw670.
This work was supported by the Danish Council for Independent Research, Natural Sciences (4002-00152B); the Danish National Research Foundation (DNRF94); the Villum Fonden (miGENEPI); and the Initiative d'Excellence Chaires d’attractivité, Université de Toulouse (OURASI).
Support/Feature requests/Pull requests
Please contact Gabriel Renaud (@grenaud) for further information: