Dissertation/ Thesis

Highly Parallel Sequential Pattern Mining on a Heterogeneous Platform

التفاصيل البيبلوغرافية
العنوان: Highly Parallel Sequential Pattern Mining on a Heterogeneous Platform
Alternate Title: 於異質性平台上之高度平行化序列模式探勘
المؤلفون: Yu-Heng Hsieh, 謝友恆
Thesis Advisors: 陳銘憲
سنة النشر: 2018
المجموعة: National Digital Library of Theses and Dissertations in Taiwan
الوصف: 106
Sequential pattern mining can be applied to various fields such as disease prediction and stock analysis. Many different algorithms have been proposed for sequential pattern mining, together with acceleration methods. In this thesis, we argue that a heterogeneous platform with CPU and GPU is much more suitable for sequential pattern mining than traditional CPU-based approaches since the support counting process is inherently succinct and repetitive. Therefore, we propose the PArallel SequenTial pAttern mining algorithm, referred to as PASTA, to accelerate sequential pattern mining by combining the merits of CPU and GPU computing. Explicitly, PASTA adopts the vertical bitmap representation of database with an optimized expansion strategy to effectively explore and reduce the search space. In addition, a pipeline strategy is proposed to ensure that both CPU and CPU on the heterogeneous platform operate concurrently to fully utilize the computing power of the platform. Furthermore, we develop a dynamic swapping scheme to mitigate the limited memory problem of the GPU hardware without decreasing the performance. Finally, a comprehensive experiment and a sensitivity test are conducted to analyze PASTA with different baselines. The experiments show that PASTA outperforms the state-of-the-art algorithms by orders of magnitude on both synthetic and real datasets.
Original Identifier: 106NTU05442031
نوع الوثيقة: 學位論文 ; thesis
وصف الملف: 50
اللغة: en_US
الاتاحة: http://ndltd.ncl.edu.tw/handle/4vrf6s
رقم الانضمام: edsndl.TW.106NTU05442031
قاعدة البيانات: Networked Digital Library of Theses & Dissertations