HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
Abouzeid, Azza, Bajda-Pawlikowski, Kamil, Abadi, Daniel, Rasin, Alexander and Silberschatz, Avi (2009). ''HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads'', VLDB'09: Proceedings of the 2009 VLDB Endowment
Many proponents of relational databases see non-relational distributed data stores such as key-value stores (e.g., Google's database) as a step backward.
Abouzeid et al. therefore introduce HadoopDB, a hybrid approach that combines the strengths of parallel databases (performance, SQL compliance) with the advantages of the Hadoop MapReduce framework (ability to run in heterogeneous environments, fault tolerance).
HadoopDB combines the following technologies:
- the Hadoop MapReduce framework
- Hive (a data warehouse infrastructure built on top of Hadoop) as a translational layer
- a PostgreSQL or MySQL database
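
The division of labor among these components can be illustrated with a minimal sketch. It is not HadoopDB's actual code and rests on several assumptions: in-memory SQLite databases stand in for the per-node PostgreSQL or MySQL instances, plain Python functions stand in for Hadoop map and reduce tasks, and the query plan that Hive would generate from SQL is written by hand. The table name `visits` and the helper names `make_node`, `map_phase`, and `reduce_phase` are hypothetical.

```python
import sqlite3
from collections import defaultdict


def make_node(rows):
    """Create one in-memory SQLite database standing in for a node-local DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (url TEXT, revenue REAL)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    return conn


def map_phase(node):
    """Push the aggregation down into the node's database.

    Each node computes partial sums over its own partition and emits
    (key, partial_sum) pairs, as a map task would.
    """
    query = "SELECT url, SUM(revenue) FROM visits GROUP BY url"
    return node.execute(query).fetchall()


def reduce_phase(pairs):
    """Merge the partial sums from all nodes, as a reduce task would."""
    totals = defaultdict(float)
    for url, partial in pairs:
        totals[url] += partial
    return dict(totals)


# Two hypothetical nodes, each holding one partition of the data.
nodes = [
    make_node([("a.com", 1.0), ("b.com", 2.0)]),
    make_node([("a.com", 3.0), ("c.com", 4.0)]),
]

pairs = [pair for node in nodes for pair in map_phase(node)]
result = reduce_phase(pairs)
print(result)
```

The point of the sketch is the split of work: the SQL engine on each node handles the local part of the query, while the MapReduce layer only coordinates the nodes and combines their partial results.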