klotz: distinct*

  1. In Pig Latin, to extract the unique values from a column in a relation, you can use either DISTINCT or GROUP BY/GENERATE. DISTINCT is the preferred method; it is faster and more efficient.

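    Example using GROUP BY/GENERATE:

    -- Load the input with three fields, then keep only column u.
    A = load 'myfile' as (t, u, v);
    B = foreach A generate u;
    -- Group by u; each group key is one unique value of u.
    C = group B by u;
    D = foreach C generate group as uniquekey;
    dump D;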

    Example using DISTINCT:

    -- Load the input, keep only column u, then drop duplicate tuples.
    A = load 'myfile' as (t, u, v);
    B = foreach A generate u;
    C = distinct B;
    dump C;
    2014-05-20, by klotz
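
The two Pig Latin pipelines above should yield the same set of unique values. As a rough illustration only, here is a small Python sketch that mirrors each step on in-memory data. The sample rows and the function names `unique_via_group` and `unique_via_distinct` are hypothetical and not part of the original note; the results are sorted only to make them comparable, since Pig does not guarantee output order for either approach. The sketch says nothing about relative performance in Pig itself.

```python
from itertools import groupby

# Hypothetical sample rows standing in for 'myfile': (t, u, v) tuples.
rows = [
    ("a", 1, "x"),
    ("b", 2, "y"),
    ("c", 1, "z"),
    ("d", 3, "x"),
    ("e", 2, "y"),
]


def unique_via_group(rows):
    """Mirror GROUP BY/GENERATE: project u, group by u, keep each group key."""
    projected = [u for (_t, u, _v) in rows]      # B = foreach A generate u;
    grouped = groupby(sorted(projected))         # C = group B by u;
    return [key for key, _members in grouped]    # D = foreach C generate group;


def unique_via_distinct(rows):
    """Mirror DISTINCT: project u, then drop duplicate values."""
    projected = [u for (_t, u, _v) in rows]      # B = foreach A generate u;
    return sorted(set(projected))                # C = distinct B;


print(unique_via_group(rows))     # [1, 2, 3]
print(unique_via_distinct(rows))  # [1, 2, 3]
```

Both functions return the same values; the DISTINCT version simply gets there in fewer steps, which matches the shorter Pig script.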
