Share


PostgreSQL array_agg Function


By gobrain

Apr 15th, 2024

The PostgreSQL array_agg is an aggregate function that allows aggregating multiple values into an array. It is used when you want to group values from different rows into a single array.

In this article, we will explain the PostgreSQL array_agg function by providing examples for different use cases.

Syntax

The basic syntax of the postgresql array_agg function is as follows:

array_agg(expression)

Here, the expression parameter represents the column or value that you want to aggregate into an array. It can be any valid expression in PostgreSQL.

Basic Example

Let’s consider a table called employees with the following sample data:

idnamedepartment
1JohnHR
2JaneHR
3MikeHR
4SarahIT
5MichaelIT

Suppose, you want to aggregate the names of employees in the HR department into a single array. You can accomplish this using the arrray_agg function as shown below:

SELECT array_agg(name) AS hr_employees
FROM employees
WHERE department = 'HR';

The result is as follows:

hr_employees
{John, Jane}

As you can see, the array_agg function has aggregated the names of employees in the HR department into a single array.

PostgreSQL array_agg Function With GROUP BY

If you want to aggregate values of rows based on a column, then the postgresql array_agg function can also be used with the GROUP BY clause. Consider the following table called orders with the following sample data

idcustomer_idproduct_namequantity
11Laptop2
21Smartphone3
32TV1
42Laptop2
53Smartphone4

If you want to aggregate the names of products that an customer buy into arrays, you can use the array_agg function with the GROUP BY clause. This enable you to group products by customers that buy them.

SELECT customer_id, array_agg(product_name) AS purchased_products
FROM orders
GROUP BY customer_id;

The result of this query will be:

customer_idpurchased_products
1{Laptop, Smartphone}
2{TV, Laptop}
3{Smartphone}

In this example, the function grouped the orders by customer_id and aggregated the product names for each customer into arrays.

Order And Filter Values In Aggregated Array

In PostgreSQL, you can also order and filter values in an aggregated array using the ORDER BY and WHERE clauses in combination with the array_agg function.

Here’s the syntax of how you can order and filter values in ARRAY_AGG:

SELECT id, ARRAY_AGG(value ORDER BY value) AS sorted_array
FROM your_table
WHERE condition
GROUP BY id;

Now, suppose you have a table called orders with the following data:

idcustomer_idorder_datetotal_amount
112023-06-15100.00
212023-06-14150.00
322023-06-16200.00
422023-06-1350.00
532023-06-16300.00

To order the total_amount values within the aggregated array in descending order and filter for orders where the total_amount is greater than 100, you can use the following query:

SELECT customer_id, ARRAY_AGG(total_amount ORDER BY total_amount DESC) AS sorted_array
FROM orders
WHERE total_amount > 100
GROUP BY customer_id;

The result will be like:

customer_idpurchased_products
1{150, 100}
2{200}
3{300}

Handle Duplicate Values

Additionally, the array_agg function can handle duplicate values by using the DISTINCT keyword. This allows you to eliminate duplicate elements from the resulting array.

Consider the following table called students:

student_idsubject
1Math
1Science
1Math
2History
2Science

To aggregate the unique subjects of each student into an array, you can use the following query:

SELECT student_id, array_agg(DISTINCT subject) AS subjects
FROM students
GROUP BY student_id;

The result of this query will be:

student_idsubjects
1{Math,Science}
2{History,Science}

PostgreSQL array_agg Function With Many to Many Relation

When dealing with a many-to-many relationship, where one entity is related to multiple entities in another table, you can use array_agg to aggregate the related values into an array.

Let’s say you have two tables: users and roles, with a many-to-many relationship between them through an intermediary table called user_roles. Each user can have multiple roles, and each role can be assigned to multiple users.

Here’s an example schema:

user_idname
1John
2Sarah
3Michael
4Emily
5David

Table: roles

role_idname
1Admin
2Manager
3Employee
4Intern
5Analyst

Table: user_roles

user_idrole_id
11
13
22
33
44
53
55

To retrieve all users with their corresponding roles aggregated into an array, you can use the following query:

SELECT u.user_id, u.name, array_agg(r.name) AS roles
FROM users u
JOIN user_roles ur ON u.user_id = ur.user_id
JOIN roles r ON ur.role_id = r.role_id
GROUP BY u.user_id, u.name;

This query joins the three tables (users, user_roles, and roles) using the appropriate foreign key relationships. Then, it uses array_agg(r.name) to aggregate the role names into an array for each user.

Here is the result:

user_idnameroles
5David{Employee,Analyst}
4Emily{Intern}
2Sarah{Manager}
1John{Admin,Employee}
3Michael{Employee}

This allows you to obtain structured data in your applications.

PostgreSQL array_agg Function At A Glance

1. What is the array_agg function in PostgreSQL?
The array_agg function in PostgreSQL is an aggregate function that collects values from a set of input rows into an array. It’s particularly useful when you want to group values and store them in an array for further analysis or manipulation.

2. What’s the syntax of the array_agg function?
The basic syntax of the array_agg function is as follows:

array_agg(expression)

Where expression is the value you want to aggregate into an array.

3. What does the array_agg function return?
The array_agg function returns an array containing all the aggregated values from the input rows.

4. How does array_agg handle NULL values?
By default, array_agg includes NULL values in the resulting array. If you want to exclude NULLs, you can use the ARRAY_AGG(expression IGNORE NULLS) syntax.

5. Can I order the elements within the aggregated array?
Yes, you can specify the order using the ORDER BY clause within the array_agg function. For instance:

SELECT category_id, array_agg(product_name ORDER BY product_name) AS sorted_products
FROM products
GROUP BY category_id;

6. How can I get a distinct array using array_agg?
To obtain a distinct array, you can combine array_agg with the DISTINCT keyword:

SELECT order_id, array_agg(DISTINCT product_name) AS unique_products
FROM orders
GROUP BY order_id;

7. Is there a way to limit the number of elements in the aggregated array?
Yes, you can use the LIMIT clause to restrict the number of elements in the aggregated array:

SELECT order_id, array_agg(product_name ORDER BY order_date LIMIT 5) AS recent_products
FROM orders
GROUP BY order_id;

8.Can I nest the array_agg function within other functions?
Absolutely, you can use the array_agg function as an input to other functions. This allows for complex aggregation and transformations of data.

Conclusion

PostgreSQL array_agg function is a function for aggregating values of rows into arrays. It allows us to combine multiple values into a single array so that you can work with grouped data in your applications. The values in arrays generated are also can be handled by filtering, ordering and eliminating duplicate values.

Thank you for reading.