klotz: pig*

  1. 2014-07-23 by klotz
  2. 2014-07-23 by klotz
  3. 2014-07-23 by klotz
  4. A = LOAD '/path/to/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');
  5. 2014-07-18 by klotz
  6. 2014-06-25 by klotz
  7. 2014-06-16 by klotz
  8. 2014-05-23 by klotz
  9. To extract the unique values from a column in a relation, you can use either DISTINCT or GROUP BY/GENERATE. DISTINCT is the preferred method; it is faster and more efficient.

    Example using GROUP BY/GENERATE:

    A = load 'myfile' as (t, u, v);
    B = foreach A generate u;                     -- keep only column u
    C = group B by u;                             -- one group per distinct value of u
    D = foreach C generate group as uniquekey;    -- emit each group key once
    dump D;

    Example using DISTINCT:

    A = load 'myfile' as (t, u, v);
    B = foreach A generate u;    -- keep only column u
    C = distinct B;              -- drop duplicate rows
    dump C;

    2014-05-20 by klotz
  10. 2014-05-20 by klotz
