Difference Between Homology and Similarity in Bioinformatics

The key difference between homology and similarity in bioinformatics is that homology refers to a statement about common evolutionary ancestry of two sequences whilst similarity refers to the degree of likeness between two sequences.

Bioinformatics is a field of science that combines biology, information engineering, computer science, mathematics and statistics to analyze and interpret biological data. Homology and similarity are two terms used in this field. Similarity can be calculated directly, as the percentage of similar residues over a given length of an alignment. Homology, by contrast, cannot be calculated: it is a qualitative statement that is either true or false, and it depends on the evolutionary hypothesis under consideration.

CONTENTS

1. Overview and Key Difference
2. What is Homology in Bioinformatics
3. What is Similarity in Bioinformatics
4. Similarities Between Homology and Similarity in Bioinformatics
5. Side by Side Comparison – Homology vs Similarity in Bioinformatics in Tabular Form
6. Summary

What is Homology in Bioinformatics?

Homology in bioinformatics refers to the biological homology between DNA, RNA or protein sequences, which is defined in terms of shared ancestry in the evolutionary tree of life. In other words, it is the common evolutionary ancestry of two sequences. Homologous sequences can arise through speciation events (orthologs), duplication events (paralogs) or horizontal gene transfer events (xenologs).

Figure 01: Multiple Sequence Alignment

It is possible to infer homology between DNA, RNA or protein sequences from the similarity of their nucleotide or amino acid sequences. Significant similarity is strong evidence that two sequences descend from a common ancestral sequence and have diverged through evolutionary changes. Multiple sequence alignments indicate which regions of each sequence are homologous.
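
To make the idea of "significant" similarity concrete, the sketch below estimates how often a shuffled version of one sequence scores at least as well as the real one. It is only an illustration: the function names `global_score` and `shuffle_p_value`, the scoring values, and the number of trials are assumptions chosen for this example. Search tools such as BLAST report E-values derived from statistical theory rather than from a shuffle test like this one.

```python
import random


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (simple linear gap cost)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def shuffle_p_value(a, b, trials=200, seed=0):
    """Fraction of shuffles of b that score at least as well against a as b itself."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = global_score(a, b)
    letters = list(b)
    at_least = 0
    for _ in range(trials):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        if global_score(a, "".join(letters)) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (trials + 1)


if __name__ == "__main__":
    # Hypothetical peptide sequences, used only for illustration.
    query = "MKTAYIAKQRQISFVKSHFSRQ"
    related = "MKTAYIAKQRQISFVKAHFSRQ"
    print(shuffle_p_value(query, related))
```

A small value suggests that the observed similarity is unlikely to arise by chance, which supports (but does not prove) the hypothesis that the two sequences are homologous.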

What is Similarity in Bioinformatics?

In bioinformatics, similarity measures the degree of likeness between two protein or nucleotide sequences. There are two main steps in assessing it. The initial step is pairwise alignment, which searches for the best-scoring alignment between two sequences (including gaps) using tools such as BLAST, FASTA, and LALIGN. After pairwise alignment, two quantitative parameters are obtained from each pairwise comparison: identity (the proportion of identical residues) and similarity (the proportion of identical or similar residues). In BLAST search output, similar positions are reported as positives.
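
The pairwise alignment step can be sketched with the classic Needleman-Wunsch dynamic programming method for global alignment. This is not how BLAST, FASTA or LALIGN work internally; it is a minimal teaching example, and the function name `needleman_wunsch` and the match, mismatch and gap values are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (aligned_a, aligned_b, score) for one optimal global alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("GATTACA", "GATACA")
    print(top)
    print(bottom)
    print(total)  # six matches and one gap: 6 * 1 + (-2) = 4
```

The two returned strings have equal length, with `-` marking gaps, so identity and similarity can then be counted column by column.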

Figure 02: Pairwise Alignment

A conservative mutation occurs when an amino acid mutates to a similar residue, preserving the physicochemical properties at that position. For example, a mutation from arginine to lysine is regarded as conservative because both residues carry a +1 positive charge, so the change is unlikely to alter the properties of the translated protein substantially. Hence, similarity measurements depend on the criteria used to decide how similar two amino acid residues are to each other.
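
As a rough illustration of how identity and similarity differ, the sketch below counts both over an existing alignment. The residue groups in `GROUPS` are a simplified assumption made for this example; real tools usually decide which substitutions count as similar from a scoring matrix such as BLOSUM62. The function names are hypothetical, and the alignment length (including gap columns) is used as the denominator, which is one common convention among several.

```python
# Simplified physicochemical groups (an assumption for illustration only).
GROUPS = [
    set("KRH"),    # positively charged
    set("DE"),     # negatively charged
    set("STNQ"),   # polar, uncharged
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
]


def is_conservative(x, y):
    """True if residues x and y fall in the same illustrative group."""
    return any(x in group and y in group for group in GROUPS)


def identity_and_similarity(aln_a, aln_b):
    """Return (percent identity, percent similarity) for two aligned strings."""
    if len(aln_a) != len(aln_b) or not aln_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = similar = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            continue  # gap columns count toward the length but never match
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif is_conservative(x, y):
            similar += 1
    length = len(aln_a)
    return 100 * identical / length, 100 * similar / length


if __name__ == "__main__":
    # Hypothetical aligned peptides: K/R, R/K and E/D are conservative changes.
    identity, similarity = identity_and_similarity("MKRLA-GE", "MRKLAVGD")
    print(identity, similarity)  # 50.0 87.5
```

Changing the groups changes the similarity value but not the identity value, which is why similarity depends on the chosen criteria.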

What are the Similarities Between Homology and Similarity in Bioinformatics?

What is the Difference Between Homology and Similarity in Bioinformatics?

Homology is a statement about the common evolutionary ancestry of two sequences, while similarity is the degree of likeness between two sequences. This is the key difference between homology and similarity in bioinformatics. In addition, homology cannot be calculated, since it is either true or false and depends on the evolutionary hypothesis, whereas similarity can easily be calculated as a percentage of similar residues over a given length of the alignment.

The following infographic lists the difference between homology and similarity in bioinformatics.

Summary – Homology vs Similarity in Bioinformatics

In brief, the key difference between homology and similarity in bioinformatics lies in their definitions. Homology is a statement of common evolutionary ancestry of two sequences, while similarity is the likeness between two sequences. Homologous sequences are classified as orthologs, paralogs, or xenologs according to the event that separated them. Similarity can be assessed with tools such as FASTA, BLAST, and LALIGN. Homology cannot be expressed as a calculation, but similarity can be expressed as a percentage of similar residues over a given length of the alignment.

Image Courtesy:

1. “An excerpt of a multiple sequence alignment of TMEM66 proteins” By Wikid25 – Open source ClustalW and sequences from public NCBI Protein database (CC0) via Commons Wikimedia
2. “Global Pairwise Alignment” By Alfat003 – Own work (CC BY-SA 4.0) via Commons Wikimedia