Dissertation/Thesis
Structured Data Processing on MapReduce in NoSQL Database
Title: | Structured Data Processing on MapReduce in NoSQL Database |
---|---|
Alternate Title: | MapReduce於非關聯式資料庫之結構化資料處理 |
Authors: | Lin, Hung-Pin, 林弘斌 |
Thesis Advisors: | Chung, Yeh-Ching, 鍾葉青 |
Publication Year: | 2012 |
Collection: | National Digital Library of Theses and Dissertations in Taiwan |
Description: | 100 With the rapid growth of data in recent years, data storage and processing have attracted increasing attention as a means of extracting important information. Finding a scalable solution for processing large-scale data is a critical issue both in relational database systems and in emerging NoSQL databases. Since Google published several techniques that it had operated successfully within the company, the literature on distributed data storage and processing has been strongly influenced, and a new paradigm, Cloud Computing, has advanced. MapReduce is one of the critical techniques for processing massive data in parallel. With its inherent scalability and fault tolerance, MapReduce is attractive for large-scale data processing. Several studies have used MapReduce to support SQL or SQL-like queries. Most of this previous work focuses on the Hadoop Distributed File System. However, from the viewpoint of some enterprises, data residing in a database may change frequently as updates occur. Accordingly, a flexible data store such as Bigtable or HBase is needed, not only to place the data on a scale-out storage system but also to manipulate changeable data transparently. In this thesis, we propose a systematic method that uses MapReduce for structured data processing in a NoSQL database. Using HBase as the underlying NoSQL database, we analyze some of the major data manipulation operations of ANSI SQL and provide corresponding queries to manipulate the data residing in the NoSQL database. To organize the data with less complexity, we also introduce a remapping strategy that translates the data model from the relational database to the NoSQL database. Experimental results show that our approaches outperform the conventional approach in efficiency and scalability on large-scale data sets. |
Original Identifier: | 100NTHU5392106 |
Document Type: | 學位論文 ; thesis |
File Description: | 26 |
Availability: | http://ndltd.ncl.edu.tw/handle/58850616709254764470 |
Accession Number: | edsndl.TW.100NTHU5392106 |
Database: | Networked Digital Library of Theses & Dissertations |