UGENE
UGENE is computer software for bioinformatics. It helps biologists analyze various kinds of biological data, such as sequences, annotations, multiple alignments, phylogenetic trees, NGS assemblies, and others. UGENE integrates dozens of well-known biological tools and algorithms, along with original tools, for use in genomics, evolutionary biology, virology, and other branches of life science.
UGENE runs on personal computer operating systems such as Windows, macOS, and Linux. It is released as free and open-source software under the GNU General Public License, version 2. Data can be stored both locally and on shared or networked storage. The graphical user interface provides access to pre-built tools, so users with no computer programming experience can use them easily. UGENE also has a command-line interface for executing workflows.
With UGENE Workflow Designer, it is possible to streamline a multi-step analysis. A workflow consists of blocks such as data readers, blocks executing embedded tools and algorithms, and data writers. Blocks can be created from command-line tools or scripts. A set of sample workflows is available in the Workflow Designer to annotate sequences, convert data formats, analyze NGS data, and so on.
To improve performance, UGENE uses multi-core processors and graphics processing units to accelerate a few algorithms.
Key features
The software supports the following features:
- Create, edit, and annotate nucleic acid and protein sequences
- Fast search in a sequence
- Multiple sequence alignment: Clustal W and O, MUSCLE, Kalign, MAFFT, T-Coffee
- Create and use shared storage, e.g., lab database
- Search through online databases: National Center for Biotechnology Information, Protein Data Bank, UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, DAS servers
- Local and NCBI GenBank BLAST search
- Open reading frame finder
- Restriction enzyme finder with integrated REBASE restriction enzymes list
- Integrated Primer3 package for PCR primer design
- Plasmid construction and annotation
- In silico cloning by designing cloning vectors
- Genome mapping of short reads with Bowtie, BWA, and UGENE Genome Aligner
- Visualize next-generation sequencing data using UGENE Assembly Browser
- Variant calling with SAMtools
- RNA-Seq data analysis with Tuxedo pipeline
- ChIP-seq data analysis with Cistrome pipeline
- Raw NGS data processing
- HMMER 2 and 3 packages integration
- Chromatogram viewer
- Search for transcription factor binding sites with weight matrix and algorithms
- Search for direct, inverted, and tandem repeats in DNA sequences
- Local sequence alignment with optimized Smith-Waterman algorithm
- Build and edit phylogenetic trees
- Combine various algorithms into custom workflows with UGENE Workflow Designer
- Contigs assembly with CAP3
- 3D structure viewer for files in Protein Data Bank and Molecular Modeling Database formats, anaglyph view support
- Predict protein secondary structure with GOR IV and PSIPRED algorithms
- Construct dot plots for nucleic acid sequences
- mRNA alignment with Spidey
- Search for complex signals with ExpertDiscovery
- Search for a pattern of various algorithms' results in a nucleic acid sequence with UGENE Query Designer
- PCR in silico for primer designing and mapping
- SPAdes de novo assembler
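The local sequence alignment feature in the list above is based on the Smith-Waterman algorithm. The following minimal Python sketch illustrates the general technique only; it is not UGENE's optimized implementation. The function name and the scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for illustration, and a simple linear gap penalty is used.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Illustrative scoring scheme (an assumption, not UGENE's defaults):
    linear gap penalty, fixed match and mismatch scores.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: this is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared substring "ACG" gives three matches.
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the matrix are kept in memory, which is enough to compute the score; recovering the aligned region itself would require storing the full matrix or traceback pointers.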
Sequence View
- 3D structure view
- Circular view
- Chromatogram view
- Graphs View: GC-content, AG-content, and others
- Dot plot view
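A GC-content graph of the kind listed above is typically computed over a sliding window along the sequence. The sketch below shows one way to do this in Python; the function name, the `window` and `step` parameters, and the handling of ambiguous bases (counted as non-GC) are assumptions for illustration and do not describe UGENE's own implementation.

```python
def gc_content_windows(seq: str, window: int, step: int = 1) -> list[float]:
    """Return the GC fraction of each window of length `window`.

    Windows start every `step` bases; a trailing partial window is skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(gc / window)
    return values


if __name__ == "__main__":
    # Windows: GGCC, CCAA, AATT
    print(gc_content_windows("GGCCAATT", window=4, step=2))
```

The same loop yields an AG-content graph if the counted bases are changed to A and G.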
Alignment Editor
Phylogenetic Tree Viewer
The Phylogenetic Tree Viewer helps to visualize and edit phylogenetic trees. It is possible to synchronize a tree with the corresponding multiple alignment used to build the tree.
Assembly Browser
The Assembly Browser project was started in 2010 as an entry for Illumina iDEA Challenge 2011. The browser allows users to visualize and browse large next-generation sequencing assemblies. It supports SAM, BAM, and ACE formats. Before assembly data can be browsed in UGENE, the input file is automatically converted to a UGENE database file. This approach has pros and cons. The advantage is that it allows viewing the whole assembly, navigating in it, and going to well-covered regions rapidly. The drawback is that conversion may take time for a large file and requires enough disk space to store the database.
Workflow Designer
UGENE Workflow Designer allows creating and running complex computational workflow schemas. The distinguishing feature of Workflow Designer, relative to other bioinformatics workflow management systems, is that workflows are executed on a local computer. This avoids the data transfer issues that can arise with tools relying on remote file storage and internet connectivity.
The elements of a workflow correspond to the bulk of the algorithms integrated into UGENE. Workflow Designer also allows creating custom workflow elements, which can be based on a command-line tool or a script.
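The reader, processing element, and writer structure of a workflow can be pictured with a small Python sketch. This is a conceptual illustration only: it does not use UGENE's workflow format or API, and every name in it (`Record`, `read_records`, `reverse_complement`, `run_workflow`) is hypothetical.

```python
from typing import Callable, Iterable, Iterator

Record = tuple[str, str]  # (name, sequence); hypothetical record type


def read_records(pairs: Iterable[Record]) -> Iterator[Record]:
    """Reader block: emit records one at a time."""
    yield from pairs


def reverse_complement(record: Record) -> Record:
    """Processing block: a stand-in for an embedded tool or script."""
    name, seq = record
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return name + "_rc", seq.translate(table)[::-1]


def run_workflow(source: Iterable[Record],
                 steps: list[Callable[[Record], Record]],
                 sink: Callable[[Record], None]) -> None:
    """Pass each record from the reader through every step to the writer."""
    for record in source:
        for step in steps:
            record = step(record)
        sink(record)


if __name__ == "__main__":
    results: list[Record] = []
    # The writer block here simply collects records in a list.
    run_workflow(read_records([("s1", "AACG")]),
                 [reverse_complement],
                 results.append)
    print(results)
```

Because each step is an ordinary function taking and returning a record, a custom element wrapping a command-line tool or script would fit into the same list of steps.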
Workflows are stored in a special text format. This allows them to be reused and transferred between users.
A workflow can be run using the graphical interface or launched from the command line. The graphical interface also allows controlling the workflow execution, storing the parameters, and so on.
There is an embedded library of workflow samples to convert, filter, and annotate data, with several pipelines to analyze NGS data developed in collaboration with NIH NIAID. A wizard is available for each workflow sample.
Supported biological data formats
- Sequences and annotations: FASTA, GenBank, EMBL, GFF
- Multiple sequence alignments: Clustal, MSF, Stockholm, Nexus
- 3D structures: PDB, MMDB
- Chromatograms: ABIF, SCF
- Short reads: SAM (Sequence Alignment/Map), BAM (binary version of SAM), ACE, FASTQ
- Phylogenetic trees: Newick, PHYLIP
- Other formats: Bairoch, HMM, PWM and PFM, SNP and VCF4
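As an illustration of the simplest format in the list above, the following Python sketch reads FASTA records from a text stream. It is a minimal example, not UGENE's parser; the function name `parse_fasta` is an assumption, and the sketch ignores edge cases such as comment lines or text before the first header.

```python
import io
from typing import Iterator, TextIO


def parse_fasta(handle: TextIO) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header = None
    chunks: list[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    text = ">seq1 example\nACGT\nAC\n>seq2\nGG\n"
    for name, seq in parse_fasta(io.StringIO(text)):
        print(name, len(seq))
```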
Release cycle
The features included in each release are mostly proposed by users.