Dissertation/Thesis

Investigating the scaleability of analyzing and processing RDBMS datasets with Apache Spark

Bibliographic Details
Title: Investigating the scaleability of analyzing and processing RDBMS datasets with Apache Spark
Authors: Bahceci, Ferhat
Publication Information: Uppsala universitet, Institutionen för informationsteknologi, 2018.
Publication Year: 2018
Collection: DiVA Archive at Upsalla University
Subject Terms: Engineering and Technology, Teknik och teknologier
Description: At the Uppsala Monitoring Centre (UMC), individual case safety reports (ICSRs) are managed, analyzed, and processed to publish statistics on adverse drug reactions. A data processing tool built on top of the UMC’s ICSR database is used to analyze the data. However, the current processing tool is subject to several constraints, and the amount of incoming data to be processed is growing rapidly. The UMC’s processing system must therefore be improved to handle future demands. Various parallelization frameworks can be used to improve performance. In this work, the in-memory computing framework Spark was used to parallelize one of the current data processing tasks. Local clusters for running the new implementation in parallel were also established.
Original Identifier: oai:DiVA.org:uu-348885
Document Type: Text
File Description: application/pdf
Language: English
Relation: IT ; 18004
Availability: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-348885
Rights: info:eu-repo/semantics/openAccess
Accession Number: edsndl.UPSALLA1.oai.DiVA.org.uu.348885
Database: Networked Digital Library of Theses & Dissertations