Tags: analytics* + spark*

0 bookmark(s) - Sort by: Date ↓ / Title /

  1. Lambda architecture consists of a serving layer that gets updated infrequently from the batch layer and a speed layer that computes real time analytics to compensate for the slow batch layer. Essentially, Hadoop is computing analytics in batches and in between batch runs, the speed layer is incrementally updating metrics by examining events in a streaming fashion.Both Spark and Storm can operate in a Hadoop cluster and access Hadoop storage. Storm-YARN is Yahoo’s open source implementation of Storm and Hadoop convergence. Spark is providing native integration for Hadoop. Integration with Hadoop is achieved through YARN (NextGen MapReduce). Integrating real time analytics with Hadoop based systems allows for better utilization of cluster resources through computational elasticity and being in the same cluster means that network transfers can be minimal.
    2014-03-28 Tags: , , , , by klotz
  2. In order to create a cluster that can support Shark, we need to launch an Amazon EMR cluster with Hive installed and then use a bootstrap action to install Spark and Shark.
    2013-09-13 Tags: , , , , , by klotz

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: tagged with "analytics+spark"

About - Propulsed by SemanticScuttle