Distributed Database vs Centralized Database
A centralized database is a database in which data is stored and maintained in a single location. This is the traditional approach for storing data in large enterprises. A distributed database is a database in which data is stored on storage devices that are not located in the same physical location, but the database is controlled using a central Database Management System (DBMS).
What is a Centralized Database?
In a centralized database, all the data of an organization is stored in a single place, such as a mainframe computer or a server. Users in remote locations access the data over a Wide Area Network (WAN), using the application programs provided for that purpose. The centralized database (the mainframe or the server) must be able to satisfy every request coming to the system, and can therefore easily become a bottleneck. However, since all the data resides in a single place, it is easier to maintain and back up. It is also easier to maintain data integrity, because once data is updated in a centralized database, no outdated copy of it remains available elsewhere.
What is a Distributed Database?
In a distributed database, the data is stored on storage devices that are located in different physical locations. These devices are not attached to a common CPU, but the database is controlled by a central DBMS. Users access the data in a distributed database over the WAN. To keep a distributed database up to date, it uses the replication and duplication processes. The replication process identifies changes in the distributed database and applies those changes so that all the distributed databases look the same. Depending on the number of distributed databases, this process can become very complex and time consuming. The duplication process identifies one database as a master database and duplicates that database. This process is not as complicated as the replication process, but it still makes sure that all the distributed databases have the same data.
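The contrast between the two processes can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: each site is modeled as an in-memory dictionary, and the function names `replicate` and `duplicate` and the sample records are hypothetical, not part of any particular DBMS.

```python
import copy

def replicate(sites, changes):
    """Replication (simplified): apply each recorded change to every site."""
    for site in sites:
        for key, value in changes:
            site[key] = value

def duplicate(master, sites):
    """Duplication (simplified): overwrite every site with a full copy of the master."""
    for site in sites:
        site.clear()
        site.update(copy.deepcopy(master))

# Three sites that start out with identical data (hypothetical records).
sites = [{"alice": 100}, {"alice": 100}, {"alice": 100}]

# Replication: the identified changes are propagated to all sites.
replicate(sites, [("alice", 120), ("bob", 50)])

# Duplication: one database is designated the master and copied to the others.
master = {"alice": 130, "bob": 50, "carol": 75}
replicas = [{"alice": 120}, {}]
duplicate(master, replicas)
```

In this sketch, replication has to track and apply individual changes at every site, which is where the complexity grows with the number of sites, while duplication simply replaces each copy with the master's contents.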
What is the difference between a Distributed Database and a Centralized Database?
While a centralized database keeps its data on storage devices in a single location connected to a single CPU, a distributed database system keeps its data on storage devices that may be located in different geographical locations and are managed using a central DBMS. A centralized database is easier to maintain and keep up to date because all the data is stored in a single location. It is also easier to maintain data integrity, and there is no need to duplicate data. However, all requests for data are processed by a single entity, such as a single mainframe, which can easily become a bottleneck. A distributed database can avoid this bottleneck because the databases operate in parallel, balancing the load across several servers. On the other hand, keeping the data up to date in a distributed database system requires additional work and additional software, which increases maintenance cost and complexity. Furthermore, designing a distributed database is more complex than designing a centralized one.
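The bottleneck argument can be made concrete with a small Python sketch. It assumes, purely for illustration, that every site holds a copy of the data and can serve any request, and that requests are handed out in round-robin order; the `dispatch` function and the server names are hypothetical.

```python
from collections import Counter
from itertools import cycle

def dispatch(requests, servers):
    """Assign each request to a server in round-robin order and count the load."""
    load = Counter({name: 0 for name in servers})
    for _request, server in zip(requests, cycle(servers)):
        load[server] += 1
    return load

requests = range(9)  # nine incoming requests

# Centralized: a single machine has to process every request.
centralized = dispatch(requests, ["mainframe"])

# Distributed: the same requests are spread across three sites.
distributed = dispatch(requests, ["site_a", "site_b", "site_c"])
```

With these inputs the single mainframe handles all nine requests, while each of the three sites handles three. The sketch leaves out the cost of keeping the sites in sync, which is the extra work the distributed design pays for this balance.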