The Performance of SQL-on-Hadoop Systems - An Experimental Study

Bibliographic details
Title: The Performance of SQL-on-Hadoop Systems - An Experimental Study
Authors: Jun Chen, Yueguo Chen, Shuai Li, Huijie Zhang, Xiongpai Qin, Jiesi Liu
Source: BigData Congress
Publication info: IEEE, 2017.
Publication year: 2017
Subject terms: SQL, Database, business.industry, Computer science, Big data, InformationSystems_DATABASEMANAGEMENT, 020206 networking & telecommunications, Unstructured data, 02 engineering and technology, computer.software_genre, 020204 information systems, Server, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), Operating system, Query by Example, business, computer, Massively parallel, computer.programming_language, De facto standard
Description: Hadoop is now the de facto standard for storing and processing big data, not only unstructured data but also some structured data. As a result, providing SQL analysis functionality for big data residing in HDFS is becoming increasingly important. Hive is a pioneering system that supports SQL-like analysis of data in HDFS. However, the performance of early versions of Hive was not satisfactory. This led to the rapid emergence of dozens of SQL-on-Hadoop systems that try to support interactive SQL query processing over data stored in HDFS. This paper first gives a brief technical review of recent efforts on SQL-on-Hadoop systems. We then test and compare the performance of three representative SQL-on-Hadoop systems using the TPC-H benchmark. Based on the results, we show that such systems can benefit further from applying many of the parallel query processing techniques that have been widely studied in traditional massively parallel processing databases.
DOI: 10.1109/bigdatacongress.2017.68
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::0d38cda757be950762c077a08d951612
https://doi.org/10.1109/bigdatacongress.2017.68
Accession number: edsair.doi...........0d38cda757be950762c077a08d951612
Database: OpenAIRE