Today I Learned

hashrocket A Hashrocket project

Aggregate Arrays In Postgres

array_agg is a great aggregate function in Postgres but it gets weird when aggregating other arrays.

First let's look at what array_agg does on rows with integer columns:

select array_agg(x) from (values (1), (2), (3), (4)) x (x);
-- {1,2,3,4}

It puts each value into an array. What if are values are arrays?

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4])) x (x);
-- {{1,2},{3,4}}

Put this doesn't work when the arrays have different numbers of elements:

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- ERROR:  cannot accumulate arrays of different dimensionality

If you are trying to accumulate elements to process in your code, you can use jsonb_agg.

select jsonb_agg(x.x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- [[1, 2], [3, 4, 5]]

The advantage of using Postgres arrays however is being able to unnest those arrays downstream:

select unnest(array_agg(x))
from (values (Array[1, 2]), (Array[3, 4])) x (x);
--      1
--      2
--      3
--      4
See More #sql TILs
Looking for help? Hashrocket developers believe that data quality is as important as code quality. We enjoy all the challenges of relational databases, from finding the fastest index, to structuring data to fit the needs of an application. We're eager to share our experiences; check out PG Casts, our series of free PostgreSQL screencasts.