Title description

There has been a tremendous increase in the amount of data collected every day by Internet companies like Google, Yahoo!, Facebook and Twitter. These companies use large Hadoop clusters with thousands of machines to analyze the collected data. The usage model for data-intensive computing platforms like Hadoop is challenging, as many users can be submitting jobs to a cluster at the same time. Therefore, there is a need to understand user behavior, cluster resource usage, and how data-intensive computing platforms react to multi-user workloads. In this thesis we describe HERDER, a multi-user benchmarking tool to execute synthetic workloads on a cluster. HERDER provides a flexible user model to simulate an actual environment with users working on a cluster and an extensible framework to execute jobs belonging to a variety of data-intensive computing platforms. The HERDER workload generator reports statistics for the steady-state and the overall execution time periods at the end of workload execution.
Monograph
Record: Rebiun37351457 https://catalogo.rebiun.org/rebiun/record/Rebiun37351457
Author: Ayyalasomayajula, Vandana
Title: Herder: a heterogeneous engine for running data-intensive experiments & reports / Vandana Ayyalasomayajula
Publication: Irvine, Calif.: University of California, Irvine, 2011
Physical description: 1 online resource (85 pages)
Content type: Text. Media type: computer. Carrier type: online resource
ISBN: 9781124521732, 1124521739
Language: eng
Notes: Title from PDF title page (ProQuest, viewed on October 26, 2011). Includes bibliographical references
Thesis: M.S., Computer Science, University of California, Irvine, 2011
Subjects: Dissertations, Academic - University of California, Irvine - Computer Science; Academic theses; Thèses et écrits académiques