Analysis of Genomic Variations with AVIA

Annotation, Visualization, and Impact Analysis

We have developed an interactive web-based tool, AVIA, to explore and interpret large sets of genomic variations (single nucleotide variations and insertions/deletions) to help guide and summarize genomic experiments. The tool couples a comprehensive annotation pipeline with a flexible visualization method. We leveraged the ANNOVAR (Wang et al., 2010) framework for assigning functional impact to genomic variations, extending its list of reference annotation databases (RefSeq, UCSC, SIFT, PolyPhen, etc.) with additional sources developed in-house (Non-B DB, PolyBrowse). Because many users also have their own annotation sources, they can supply their own files as well. The results can be obtained in tabular format or as tracks in whole-genome circular views generated by the Circos application (Krzywinski et al., 2009). Users can also select from sets of pre-computed tracks, including whole-genome distributions of different genomic features (genes, exons, repeats), as well as variation analysis tracks for the 69 CGI public genomes for reference.
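As a rough illustration of what annotating variations against a user-supplied file involves, the sketch below matches variant positions to labeled genomic intervals. It is not AVIA's or ANNOVAR's implementation: the function names `build_index` and `annotate`, the tuple layouts, and the half-open interval convention are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(intervals):
    """Group (chrom, start, end, label) records by chromosome, sorted by start.

    The record layout is hypothetical; a real annotation file would need parsing first.
    """
    index = defaultdict(list)
    for chrom, start, end, label in intervals:
        index[chrom].append((start, end, label))
    for chrom in index:
        index[chrom].sort()
    return index


def annotate(variants, index):
    """Map each (chrom, pos) variant to the labels of the intervals containing it.

    Intervals are treated as half-open, [start, end), which is an assumption.
    A linear scan per chromosome is used for clarity, not speed.
    """
    result = {}
    for chrom, pos in variants:
        hits = [label for start, end, label in index.get(chrom, [])
                if start <= pos < end]
        result[(chrom, pos)] = hits
    return result


if __name__ == "__main__":
    user_intervals = [
        ("chr1", 100, 200, "enhancerA"),
        ("chr1", 150, 300, "repeatB"),
    ]
    idx = build_index(user_intervals)
    print(annotate([("chr1", 160), ("chr1", 250)], idx))
```

For genome-scale inputs, an interval tree or a sorted sweep would replace the linear scan; the simple version is kept here only to show the matching logic.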

This version of AVIA focuses on gene-related impact assessment. It can produce tracks showing the distribution of genes with variations of specific functional effects, such as non-synonymous variations, frame shifts, variable miRNA target sites, or variations in G-quadruplexes in 5' UTRs. Additional modules that inspect the functional implications of variations in the non-coding regions of the genome are being developed. During exploratory work with AVIA, users can browse different tracks with their data and then re-generate signature plots to summarize the project. To our knowledge, this is the first web-based program that integrates annotation, visualization, and impact analysis.

We have also developed a variation detection pipeline for Sanger sequencing, in particular for PCR-directed re-sequencing. Using the open source programs phred (base calling) and polyphred (variation detection), in addition to in-house tools, we identify mutations in sequences as compared to an NCBI reference. Once variations have been identified, we determine each mutation's impact on the gene and retrieve related disease information. For diagnostic purposes, pictures of each variation can be obtained for use in publications or for patient files.

For our Sanger pipeline, we have implemented three methods of chromatogram retrieval: querying the NCBI Trace Archive, querying publicly available requests, or submitting by FTP. With the NCBI Trace Archive method, the user can query the database by project, center submission, gene, or trace_ids. Because of the size of the traces, the LIMS and Trace Archive retrieval methods are preferable, as the FTP method may have slow connection speeds. Like many variation detection programs for Sanger data (e.g. Mutation Surveyor, VarDetect, Genalys), our pipeline uses peak ratio differences between two fluorescence bases. The FTP site will become available for small sets of traces under 2 GB; however, multiple submissions can be made and then combined.
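The peak ratio idea can be sketched in a few lines. The example below is only an illustration of the general technique, not the pipeline's actual code or the algorithm used by polyphred: at each position it compares the second-highest fluorescence peak to the highest and flags the position as a candidate heterozygous site when the ratio reaches a threshold. The function name `call_heterozygous`, the per-position dictionary input, and the 0.3 cutoff are all assumptions chosen for the example.

```python
def call_heterozygous(peaks, min_ratio=0.3):
    """Flag positions whose secondary/primary peak height ratio is >= min_ratio.

    peaks: list of dicts mapping base ("A", "C", "G", "T") to peak height,
           one dict per trace position (hypothetical input layout).
    min_ratio: assumed cutoff; real tools tune this and also use base quality.
    Returns a list of (position, primary_base, secondary_base, ratio).
    """
    calls = []
    for pos, intensities in enumerate(peaks):
        # Rank the four channels from strongest to weakest signal.
        ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
        (base1, height1), (base2, height2) = ranked[0], ranked[1]
        if height1 <= 0:
            continue  # no usable signal at this position
        ratio = height2 / height1
        if ratio >= min_ratio:
            calls.append((pos, base1, base2, round(ratio, 3)))
    return calls


if __name__ == "__main__":
    trace = [
        {"A": 1000, "C": 20, "G": 10, "T": 5},   # clean homozygous A
        {"A": 600, "C": 10, "G": 540, "T": 0},   # two comparable peaks: A/G
    ]
    print(call_heterozygous(trace))
```

In practice, a comparison against the reference trace and base quality scores would be needed to separate true heterozygous sites from noise or dye blobs.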

If you already have mapped genomic data, please click here to find full descriptions of the tools available.