COLLECT_SET
描述
聚合函数,聚合指定列的所有唯一值,去除重复的元素,并返回一个集合类型的结果。
别名
- GROUP_UNIQ_ARRAY
语法
COLLECT_SET(<expr> [,<max_size>])
参数
参数 | 说明 |
---|---|
<expr> | 要聚合的列或表达式 |
<max_size> | 可选参数,通过设置该参数能够将结果数组的大小限制为 max_size 个元素 |
返回值
返回类型是 ARRAY,该数组包含去重后的所有值,特殊情况:
- 如果值为NULL,则会过滤
举例
select k1,k2,k3 from collect_set_test order by k1;
+------+------------+-------+
| k1 | k2 | k3 |
+------+------------+-------+
| 1 | 2023-01-01 | hello |
| 2 | 2023-01-01 | NULL |
| 2 | 2023-01-02 | hello |
| 3 | NULL | world |
| 3 | 2023-01-02 | hello |
| 4 | 2023-01-02 | doris |
| 4 | 2023-01-03 | sql |
+------+------------+-------+
select collect_set(k1),collect_set(k1,2) from collect_set_test;
+-------------------------+--------------------------+
| collect_set(`k1`) | collect_set(`k1`,2) |
+-------------------------+--------------------------+
| [4,3,2,1] | [1,2] |
+----------------------------------------------------+
select k1,collect_set(k2),collect_set(k3,1) from collect_set_test group by k1 order by k1;
+------+-------------------------+--------------------------+
| k1 | collect_set(`k2`) | collect_set(`k3`,1) |
+------+-------------------------+--------------------------+
| 1 | [2023-01-01] | [hello] |
| 2 | [2023-01-01,2023-01-02] | [hello] |
| 3 | [2023-01-02] | [world] |
| 4 | [2023-01-02,2023-01-03] | [sql] |
+------+-------------------------+--------------------------+