SQL subqueries enable reuse of the results from a selection query within another query.
CockroachDB supports two kinds of subqueries:
- Relational subqueries, which appear as operands in selection queries and table expressions.
- Scalar subqueries, which appear as operands in a scalar expression.
Data writes in subqueries
When a subquery contains a data-modifying statement (INSERT
,DELETE
, etc.), the data modification is always executed to
completion even if the surrounding query only uses a subset of the result rows.
This is true for subqueries defined using the (...)
notation and those defined using WITH
.
For example:
> WITH t AS (INSERT INTO t(x) VALUES (1), (2), (3) RETURNING x)
SELECT * FROM t LIMIT 1;
This query inserts 3 rows into t
, even though the surrounding
query only observes 1 row using LIMIT
.
Correlated subqueries
A subquery is said to be correlated when it uses table or column names defined in the surrounding query.
For example, to find every customer with at least one order, run:
> SELECT
c.name
FROM
customers AS c
WHERE
EXISTS (SELECT * FROM orders AS o WHERE o.customer_id = c.id);
The subquery is correlated because it uses c
defined in the surrounding query.
The cost-based optimizer supports most correlated subqueries,
with the exception of correlated subqueries that generate side effects inside a CASE
statement.
LATERAL
subqueries
A LATERAL
subquery is a correlated subquery that references another query or subquery in its SELECT
statement, usually in the context of a LEFT
join or an INNER
join. Unlike other correlated subqueries, LATERAL
subqueries iterate through each row in the referenced query for each row in the inner subquery, like a for loop.
To create a LATERAL
subquery, use the LATERAL
keyword directly before the inner subquery's SELECT
statement.
For example, the following statement performs an INNER
join of the users
table and a subquery of the rides
table that filters on values in the users
table:
> SELECT name, address FROM users, LATERAL (SELECT * FROM rides WHERE rides.start_address = users.address AND city = 'new york');
name | address
+------------------+-----------------------------+
Robert Murphy | 99176 Anderson Mills
James Hamilton | 73488 Sydney Ports Suite 57
Judy White | 18580 Rosario Ville Apt. 61
Devin Jordan | 81127 Angela Ferry Apt. 8
Catherine Nelson | 1149 Lee Alley
Nicole Mcmahon | 11540 Patton Extensions
(6 rows)
LATERAL
subquery joins are especially useful when the join table includes a computed column.
For example, the following query joins a subquery of the rides
table with a computed column (adjusted_revenue
), and a subquery of the vehicles
table that references columns in the rides
subquery:
> SELECT
ride_id, vehicle_id, type, adjusted_revenue
FROM
(
SELECT
id AS ride_id, vehicle_id, revenue - 0.25*revenue AS adjusted_revenue
FROM
rides
) AS r
JOIN
LATERAL (
SELECT
type
FROM
vehicles
WHERE
city = 'new york' AND vehicles.id = r.vehicle_id AND r.adjusted_revenue > 65 )
ON true;
ride_id | vehicle_id | type | adjusted_revenue
+--------------------------------------+--------------------------------------+------------+------------------+
049ba5e3-53f7-4ec0-8000-000000000009 | 11111111-1111-4100-8000-000000000001 | scooter | 71.2500
0624dd2f-1a9f-4e80-8000-00000000000c | 00000000-0000-4000-8000-000000000000 | skateboard | 70.5000
08b43958-1062-4e00-8000-000000000011 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000
0bc6a7ef-9db2-4d00-8000-000000000017 | 00000000-0000-4000-8000-000000000000 | skateboard | 68.2500
0d4fdf3b-645a-4c80-8000-00000000001a | 00000000-0000-4000-8000-000000000000 | skateboard | 67.5000
1ba5e353-f7ce-4900-8000-000000000036 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000
(6 rows)
In a LATERAL
subquery join, the rows returned by the inner subquery are added to the result of the join with the outer query. Without the LATERAL
keyword, each subquery is evaluated independently and cannot refer to objects defined in separate queries.
Performance best practices
The results of scalar subqueries are loaded entirely into memory when the execution of the surrounding query starts. To prevent execution errors due to memory exhaustion, ensure that subqueries return as few results as possible.