Bibliographic details
Title: |
Investigating the scaleability of analyzing and processing RDBMS datasets with Apache Spark |
Authors: |
Bahceci, Ferhat |
Publication information: |
Uppsala universitet, Institutionen för informationsteknologi, 2018. |
Publication year: |
2018 |
Collection: |
DiVA Archive at Upsalla University |
Subject terms: |
Engineering and Technology, Teknik och teknologier |
Description: |
At the Uppsala Monitoring Centre (UMC), individual case safety reports (ICSRs) are managed, analyzed and processed for publishing statistics of adverse drug reactions. On top of the UMC’s ICSR database there is a data processing tool used to analyze the data. Unfortunately, some constraints limit the current processing tool, and the amount of arriving data to be processed is growing rapidly. The UMC’s processing system must be improved in order to handle future demands. To improve performance, various frameworks for parallelization can be used. In this work, the in-memory computing framework Spark was used to parallelize one of the current data processing tasks. Local clusters for running the new implementation in parallel were also established. |
Original Identifier: |
oai:DiVA.org:uu-348885 |
Document type: |
Text |
File description: |
application/pdf |
Language: |
English |
Relation: |
IT ; 18004 |
Availability: |
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-348885 |
Rights: |
info:eu-repo/semantics/openAccess |
Accession number: |
edsndl.UPSALLA1.oai.DiVA.org.uu.348885 |
Database: |
Networked Digital Library of Theses & Dissertations |