Computational Genomics Platform
Overview
Questions
Please address your questions by e-mail to any of the contributors (e-mails at the bottom of this wiki).
Objectives
To design, organize and maintain the Computational Genomics Platform at IRCCS Azienda Ospedaliero-Universitaria di Bologna (AOUBO)
To develop, deploy and update a large range of bioinformatic tools for the analysis of genomic data, and to organize them in software environments and analysis workflows
To provide support for the development of bioinformatic software and for the design of genomic analyses
To curate a genomic variant database populated by the genetic variation identified by IRCCS AOUBO projects and/or collaborations
Last modification: Nov 22, 2022
The Computational Genomics Platform (https://www.aosp.bo.it/content/genomica-computazionale) is an integrated system of bioinformatic solutions designed and maintained by the Bologna Sant’Orsola Computational Genomics (BOSCO) team at IRCCS AOUBO. The platform offers diversified solutions for the analysis of genomic data:
- Command Line Interface (CLI) to run bioinformatic tools. The systems through which the users interact with the CLI are mainly:
- Galaxy to let users with little or no programming experience carry out computational genomics projects in a user-friendly web portal
- OpenCGA to organize genomic projects in a database for easily storing and querying variant datasets
- GitLab to support bioinformatic software development.
CLI
In the CLI, users can independently run any of the bioinformatic tools listed below; the table also indicates the conda environment(s) each tool belongs to.
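As a minimal sketch of what running a tool from its conda environment can look like, the Python snippet below builds a `conda run -n <environment> <tool> ...` command. The tool-to-environment pairs are copied from a few rows of the tools table; the helper name `conda_run_command` is hypothetical, and the sketch assumes that `conda` is on the `PATH` and that each executable has the same name as its table entry. It only builds and prints the command and does not execute it.

```python
import shlex

# Tool -> conda environments, copied from a few rows of the tools table.
TOOL_ENVS = {
    "samtools": ["erds", "svtools", "bamsurgeon1.2", "aligners"],
    "fastqc": ["aligners", "quality"],
    "bwa": ["aligners", "bamsurgeon1.2", "svtools"],
}


def conda_run_command(tool, env, args=()):
    """Return the argument list that runs `tool` inside conda environment `env`.

    Hypothetical helper: it checks the requested environment against the
    table excerpt above and raises ValueError if the tool is not listed there.
    """
    known_envs = TOOL_ENVS.get(tool, [])
    if env not in known_envs:
        raise ValueError(f"{tool!r} is not listed in environment {env!r}")
    return ["conda", "run", "-n", env, tool, *args]


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch has no side effects.
    print(shlex.join(conda_run_command("fastqc", "quality", ["--version"])))
```

An interactive alternative is to activate the environment first (`conda activate quality`) and then call the tool directly.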
Maintained CLI workflows
The BOSCO team also builds and maintains some general-purpose analysis workflows which can be launched in the CLI:
- Snakemake workflow for WES data pre-processing and germline short variant calling (SNVs and indels)
- Snakemake workflow for WGS data pre-processing and germline short variant calling (SNVs and indels) UNDER CONSTRUCTION
- Snakemake workflow for BAM conversion to CRAM UNDER CONSTRUCTION
Galaxy
In Galaxy, users can independently run any of the bioinformatic tools listed below; the table also indicates whether each tool is accessible in Galaxy.
To launch the Galaxy aosp instance, browse to galaxy.aosp.biodec.com. Click the Log in or register link (top panel) and enter your email and password.
Comment: Galaxy
Galaxy is an open-source, web-based portal for accessible, reproducible, and transparent computational research. To get started with Galaxy, visit https://galaxyproject.org/get-started/. A collection of tutorials developed and maintained by the Galaxy community is available at https://training.galaxyproject.org/training-material. To view the list of tools that can be used within the Galaxy instance, visit https://toolshed.g2.bx.psu.edu/.
Maintained Galaxy workflows
The BOSCO team also builds and maintains some general-purpose analysis workflows which can be launched in Galaxy:
- Galaxy workflow to run the Rabdomyzer tool
- [Galaxy workflow for the analysis of amplicon-based gene panel](https://git.aosp.biodec.com/aosp/piattaforma-bioinformatica/-/wikis/Galaxy-workflow-for-gene-pane
Bioinformatic tools
Tool | Version | Galaxy | Commandline | Conda Environment |
---|---|---|---|---|
ensembl-vep | 101.0 | ❌ | ✅ | vep |
bamkit | 16.07.26 | ❌ | ✅ | svtools |
bamsurgeon | 1.2 | ❌ | ✅ | bamsurgeon1.2 |
bcftools | 1.9 | ✅ | ✅ | aligners bamsurgeon1.2 |
bedtools | 2.27.1 2.30.0 | ✅ | ✅ | aligners sv2 |
blast | 2.10.1 | available in Tool Shed | ✅ | svtools |
bwa | 0.7.17 | ✅ | ✅ | aligners bamsurgeon1.2 svtools |
clump | 1.0.0 | ❌ | ✅ | clump |
cnvfilter | 1.6.0 | ❌ | ✅ | cnvfilter |
cnvkit | 0.9.7 | ❌ | ✅ | cnvkit |
cnvpytor | 1 | ❌ | ✅ | cnvpytor |
CoNIFER | 0.2.2 | ❌ | ✅ | conifer |
DECoN | 1.0.2 | ❌ | ✅ | decon |
delly | 0.8.5 | available in Tool Shed | ✅ | svtools |
ensembl-vep | 101 | available in Tool Shed | ✅ | vep |
erds | 1.1 | ❌ | ✅ | erds |
excavator2 | 2.0.0 | ❌ | ✅ | singularity |
exomiser | | ❌ | ✅ | exomiser |
exonerate | 2.4.0 | available in Tool Shed | ✅ | bamsurgeon1.2 |
fastp | 0.20.1 | ✅ | ✅ | quality |
fastqc | 0.11.8 | ✅ | ✅ | aligners quality |
gatk | 4.1.2.0 3.8 | ❌ | ✅ | gatk4 gatk3 |
gridss | 2.10.1 | ❌ | ✅ | svtools |
IntegrationSiteMapper | 1.3.8 | ❌ | ✅ | discvrseq |
kraken2 | 2.1.0 | available in Tool Shed | ✅ | svtools |
lumpy-sv | 0.3.1 | available in Tool Shed | ✅ | svtools |
manta | 1.6.0 | available in Tool Shed | ✅ | svtools |
mipgen | 4 | ❌ | ✅ | mipgen |
mosdepth | 0.3.1 | ❌ | ✅ | quality |
multiqc | 1.9 | ✅ | ✅ | aligners quality |
pear | 0.9.6 | available in Tool Shed | ✅ | pear |
picard | 2.18.14 | ✅ | ✅ | aligners |
pybedtools | 0.8.1 | available in Tool Shed | ✅ | sv2 |
python | 2.7.15 | ✅ | ✅ | bamsurgeon1.2 |
r | 3.5.1 | available on galaxy.eu.org | ✅ | rstudio |
r-exomedepth | 1.1.15 | available in Tool Shed | ✅ | decon |
sambamba | 0.7.1 | available in Tool Shed | ✅ | svtools |
samblaster | 0.1.26 | available in Tool Shed | ✅ | svtools |
samtools | 0.1.19 1.1 1.9 1.9 | ✅ | ✅ | erds svtools bamsurgeon1.2 aligners |
singularity | 3.6.3 | ❌ | ✅ | singularity |
snakemake | 7.3.8 | ❌ | ✅ | snakemake |
somatic-sniper | 1.0.5.0 | available in Tool Shed | ✅ | bamsurgeon1.2 |
survivor | 1.0.7 | ❌ | ✅ | svtools |
sv2 | 1.4.3.4 | ❌ | ✅ | sv2 |
svaba | 1.1.0 | ❌ | ✅ | svaba |
svtyper | 0.7.1 | available in Tool Shed | ✅ | svtools |
svviz2 | 2.0a3 | ❌ | ✅ | svviz2 |
t_coffee | 11.0.8 | available in Tool Shed | ✅ | vep |
uncoverapp | 1.6.0 | in progress | ✅ | uncoverapp |
VariantQC | 1.3.8 | ❌ | ✅ | discvrseq |
varscan | 2.4.3 | available in Tool Shed (iuc) | ✅ | bamsurgeon1.2 |
velvet | 1.2.10 | available in Tool Shed (devteams) | ✅ | bamsurgeon1.2 |
vt | 2015.11.10 | ❌ | ✅ | vt |
platypus-variant | 0.8.1.1 | ❌ | ✅ | platypus-variant |
h3m2 | ✅ | custom |
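Some rows of the table, such as gatk and samtools, list several versions and several conda environments in the same cell. The Python sketch below shows one way to read such a row. It rests on an assumption that the table does not state explicitly: versions and environments are paired by position (for gatk, 4.1.2.0 with gatk4 and 3.8 with gatk3), and a single version applies to every environment listed. The function names `parse_tool_row` and `versions_by_environment` are hypothetical.

```python
def parse_tool_row(row):
    """Split one pipe-delimited table row into a dict of its five columns."""
    cells = [cell.strip() for cell in row.strip().rstrip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 columns, got {len(cells)}: {row!r}")
    tool, versions, galaxy, commandline, envs = cells
    return {
        "tool": tool,
        "versions": versions.split(),
        "galaxy": galaxy,
        "commandline": commandline,
        "environments": envs.split(),
    }


def versions_by_environment(entry):
    """Pair versions with environments by position (an assumption, see above)."""
    versions, envs = entry["versions"], entry["environments"]
    if len(versions) == 1:
        # A single version is taken to apply to every listed environment.
        return {env: versions[0] for env in envs}
    if len(versions) == len(envs):
        return dict(zip(envs, versions))
    raise ValueError(f"cannot pair versions and environments for {entry['tool']!r}")


if __name__ == "__main__":
    row = "gatk | 4.1.2.0 3.8 | ❌ | ✅ | gatk4 gatk3 |"
    print(versions_by_environment(parse_tool_row(row)))
```

Rows with missing cells do not have five columns and are rejected with a `ValueError` rather than guessed at.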
OpenCGA
OpenCGA is the framework used to load and retrieve variants from a genomic database; it also provides a data visualization browser and both functional and clinical analysis modules.
Comment: OpenCGA
OpenCGA is an open-source project that implements a high-performance, scalable and secure platform for genomic data analysis and visualisation. It aims to cover all aspects of genomic analysis: metadata database, authentication and security, variant normalisation and aggregation, variant storage and annotation, a highly scalable NoSQL variant storage engine, alignment and coverage, big data variant analysis, RESTful web services, and visualisation. OpenCGA is developed and maintained at the University of Cambridge and is currently used by several big data projects such as GEL (Genomics England).
GitLab
With GitLab, the BOSCO team builds its own bioinformatic software and supports anyone who wants to do the same. GitLab also provides the issue-tracking system used to handle the problems encountered by users of the Computational Genomics Platform.
Comment: GitLab
GitLab is a DevOps software package that lets teams develop, secure, and operate software collaboratively in a single application.
Contributors
Bologna Sant’Orsola Computational Genomics (BOSCO) team
U.O.C. Genetica Medica, S.S. Genomica Computazionale, IRCCS Azienda Ospedaliero-Universitaria di Bologna (AOUBO)
Please address your questions to:
- Tania Giangregorio - tania.giangregorio@aosp.bo.it
- Federica Isidori - federica.isidori@aosp.bo.it
- Tommaso Pippucci - tommaso.pippucci@aosp.bo.it
Give us feedback on this content!
To give us feedback about these materials, or to get in touch with us, write an email to tania.giangregorio@aosp.bo.it.