How to get distinct values in GROUP_CONCAT using Google BigQuery

I'm trying to get distinct values when using GROUP_CONCAT in BigQuery.

I'll recreate the situation using a simple static example:

EDIT: I changed the example to better represent my real situation: two columns aggregated with GROUP_CONCAT, both of which need to be distinct:

SELECT 
    category, 
    GROUP_CONCAT(id) as ids, 
    GROUP_CONCAT(product) as products 
FROM 
(SELECT "a" as category, "1" as id, "car" as product), 
(SELECT "a" as category, "2" as id, "car" as product), 
(SELECT "a" as category, "3" as id, "car" as product), 
(SELECT "b" as category, "4" as id, "car" as product), 
(SELECT "b" as category, "5" as id, "car" as product), 
(SELECT "b" as category, "6" as id, "bike" as product), 
(SELECT "a" as category, "1" as id, "truck" as product), 
GROUP BY 
    category

This example returns:

Row  category  ids      products
1    a         1,2,3,1  car,car,car,truck
2    b         4,5,6    car,car,bike

I'd like to strip out the duplicate values, so that the query returns something like:

Row  category  ids    products
1    a         1,2,3  car,truck
2    b         4,5,6  car,bike

In MySQL, GROUP_CONCAT has a DISTINCT option, but in BigQuery it doesn't.

Any ideas?


2015-02-20 Leonardo Naressi

Possible duplicate of [Syntax to run a distinct GROUP_CONCAT in Google BigQuery](http://stackoverflow.com/questions/28324533/syntax-to-run-a-distinct-group-concat-in-google-bigquery) – Pentium10

I think it's similar but not exactly the same. Thanks for pointing it out, @Pentium10.

Here is a solution that uses the UNIQUE scoped aggregation function to remove duplicates. Note that in order to use it, we first need to build a REPEATED field using the NEST aggregation:

SELECT 
    GROUP_CONCAT(UNIQUE(ids)) WITHIN RECORD, 
    GROUP_CONCAT(UNIQUE(products)) WITHIN RECORD 
FROM (
SELECT 
    category, 
    NEST(id) as ids, 
    NEST(product) as products 
FROM 
(SELECT "a" as category, "1" as id, "car" as product), 
(SELECT "a" as category, "2" as id, "car" as product), 
(SELECT "a" as category, "3" as id, "car" as product), 
(SELECT "b" as category, "4" as id, "car" as product), 
(SELECT "b" as category, "5" as id, "car" as product), 
(SELECT "b" as category, "6" as id, "bike" as product), 
(SELECT "a" as category, "1" as id, "truck" as product), 
GROUP BY 
    category 
)


2015-02-23 19:16:51

Perfect, Mosha! I had never heard of the UNIQUE function. It worked perfectly! Thanks!

I don't think you need the NEST subselect – Roman

Removing the duplicates before applying GROUP_CONCAT gives the desired result:

SELECT
    category,
    GROUP_CONCAT(id) as ids
FROM (
    SELECT category, id
    FROM
    (SELECT "a" as category, "1" as id),
    (SELECT "a" as category, "2" as id),
    (SELECT "a" as category, "3" as id),
    (SELECT "b" as category, "4" as id),
    (SELECT "b" as category, "5" as id),
    (SELECT "b" as category, "6" as id),
    (SELECT "a" as category, "1" as id),
    GROUP BY
        category, id
)
GROUP BY
    category


2015-02-21 00:41:31

Thanks Ahmed, it works for a single column, but in my real situation I need 2 different distinct columns. I edited the question to show the problem.
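
For anyone reading this on BigQuery standard SQL rather than the legacy SQL used throughout this thread, the two-column case can probably be handled in a single pass. The following is only a minimal sketch under that assumption: it relies on STRING_AGG accepting DISTINCT, and the UNNEST of STRUCT literals is just a stand-in for the static rows in the question (with id 6 on the bike row, to match the expected output shown there).

```sql
#standardSQL
-- Sketch only: assumes the standard SQL dialect, not legacy SQL.
-- The inline rows stand in for the question's static example.
SELECT
  category,
  STRING_AGG(DISTINCT id, ',') AS ids,           -- each id appears once per category
  STRING_AGG(DISTINCT product, ',') AS products  -- each product appears once per category
FROM UNNEST([
  STRUCT('a' AS category, '1' AS id, 'car' AS product),
  ('a', '2', 'car'),
  ('a', '3', 'car'),
  ('b', '4', 'car'),
  ('b', '5', 'car'),
  ('b', '6', 'bike'),
  ('a', '1', 'truck')
])
GROUP BY category
```

Each column is deduplicated independently, so no pre-grouping subquery is needed. The order of values inside each concatenated string is not guaranteed unless an ORDER BY is added inside STRING_AGG.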
