Compare the Difference Between Similar Terms

Difference Between

Home / Technology / IT / Database / Difference Between Semi Join and Bloom Join

Difference Between Semi Join and Bloom Join

June 9, 2011 Posted by Indika

Semi Join vs Bloom Join

Semi join and Bloom join are two joining methods used in query processing for distributed databases. When processing queries in distributed databases, data needs to be transferred between databases located in different sites. This could be an expensive operation depending on the amount of data that needs to be transferred. Therefore, when processing queries in a distributed database environment, it is important to optimize the queries to minimize the amount of data transferred between sites. Semi join and bloom join are two methods that can be used to reduce the amount of data transfer and perform efficient query processing.

What is Semi Join?

Semi join is a method used for efficient query processing in a distributed database environments. Consider a situation where an Employee database (holding information such as employee’s name, department number she is working for, etc) located at site 1 and a Department database (holding information such as department number, department name, location, etc) located at site 2. For example if we want to obtain the employee name and department name that she is working for (only of departments located in “New York”), by executing a query at a query processor located at site 3, there are several ways that data could be transferred between the three sites to achieve this task. But when transferring data, it is important to note that it is not necessary to transfer the whole database between the sites. Only some of the attributes (or tuples) that are required for the join need to be transferred between the sites to execute the query efficiently. Semi join is a method that can be used to reduce the amount of data shipped between the sites. In semi join, only the join column is transferred from one site to the other and then that transferred column is used to reduce the size of the shipped relations between the other sites. For the above example, you can just transfer the department number and department name of tuples with location=”New York” from site 2 to site 1 and perform the joining at site 1 and transfer the final relation back to site 3.

What is Bloom Join?

As mentioned earlier, bloom join is another method used to avoid transferring unnecessary data between sites when executing queries in a distributed database environments. In bloom join, rather than transferring the join column itself, a compact representation of the join column is transferred between the sites. Bloom join uses a bloom filter which employs a bit vector to execute membership queries. Firstly, a bloom filter is built using the join column and it is transferred between the sites and then the joining operations are performed.

What is the difference between Semi Join and Bloom Join?

Even though both semi join and bloom join methods are used to minimize the amount of data transferred between the sites when executing queries in a distributed database environment, bloom join reduces the amount of data (number of tuples) transferred compared to semi join by utilizing the concept of bloom filters, which employ a bit vector to determine set memberships. Therefore using bloom join will be more efficient than using semi join.

Related posts:

Difference Between DBMS and Database Difference Between SQL and PL SQL Difference Between DBMS and File System Difference Between KDD and Data mining Difference Between Distributed Database and Centralized Database

Filed Under: Database Tagged With: bloom filters, Bloom join, Distributed databases, query processing for distributed databases, Semi join

About the Author: Indika

Indika, BSc.Eng, MSECE Computer Engineering, PhD. Computer Science, is an Assistant Professor and has research interests in the areas of Bioinformatics, Computational Biology, and Biomedical Natural Language Processing.

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Request Article

Featured Posts

Difference Between Coronavirus and Cold Symptoms

Difference Between Coronavirus and Cold Symptoms

Difference Between Coronavirus and SARS

Difference Between Coronavirus and SARS

Difference Between Coronavirus and Influenza

Difference Between Coronavirus and Influenza

Difference Between Coronavirus and Covid 19

Difference Between Coronavirus and Covid 19

You May Like

Difference Between Netbook and Notebook

Difference Between Licence and License

Difference Between Licence and License

Difference Between Calcium Carbonate and Calcium Bicarbonate

Difference Between Calcium Carbonate and Calcium Bicarbonate

Difference Between Terrestrial and Aquatic Biomes

Difference Between Terrestrial and Aquatic Biomes

Difference Between LG Spectrum and Motorola Droid Razr

Latest Posts

  • What is the Difference Between Diphtheria and Whooping Cough
  • What is the Difference Between Polyester and Satin
  • What is the Difference Between Syphilis and Gonorrhea
  • What is the Difference Between Stainless Steel and Sterling Silver
  • What is the Difference Between Infrared and Induction Cooker
  • What is the Difference Between Polyester and Viscose
  • Home
  • Vacancies
  • About
  • Request Article
  • Contact Us

Copyright © 2010-2018 Difference Between. All rights reserved. Terms of Use and Privacy Policy: Legal.