I also wrote a cheat sheet for SQL for myself. For reference, I used the material from the online course, “Introduction to Databases”, offered by Stanford (you can currently access the whole course from here).
Basic statements
Cross product of given relations. In the join/filtering condition; ‘and’, ‘or’, ‘between/and’, ‘like’, ‘exists’, ‘not’, ‘in’, ‘all’, ‘any’, ‘is null’, ‘is not null’, etc can be used.
select A1, A2, A3 from R1, R2, R3 where condition;
If only distinct values are needed.
select distinct A1, A2, A3 from R1, R2, R3 where condition;
Sorting by A1; ‘ascending’ is default.
select distinct A1, A2, A3 from R1, R2, R3 where condition order by A1 desc;
Only when A1 contains the string ‘abc’.
select A1 from R1 where A1 like '%abc%';
Table variables and set operators (Union, Intersect, Except)
Find the pairs with the same A1 values (assuming A2 is an index).
select distinct R11.A1, R11.A2 from R1 as R11, R1 as R12 where R11.A1 = R12.A1 and R11.A2 < R12.A2;
Returns the union; by default, union
eliminates duplicates, use union all
to allow duplicates.
select A1 as new_name from R1 union select A2 as new_name from R2;
Some SQL implementations don’t support intersect
, but it can be easily done without it.
select A1 from R1 where A2 = 'abc' intersect select A1 from R1 where A2 = 'def';
Some SQL implementations don’t support except
, and it cannot be easily done without it.
select A1 from R1 where A2 = 'abc' except select A1 from R1 where A2 = 'def';
Subqueries in where clause
select A1, A2 from R1 where A1 in (select A1 from R2 where A3 = 'abc');
is the same as
select distinct R1.A1, A2 from R1, R2 where R1.A1 = R2.A1 and A3 = 'abc';
assuming A1 is an index for both R1 and R2.
The except
example above can be written as this.
select A1 from R where A1 in (select A1 from R where A2 = 'abc') and A1 not in (select A1 from R where A2 = 'def');
Find all rows that have any row with the same A2 values when A1 is unique.
select A1, A2 from R as R1 where exists (select * from R as R2 where R1.A2 = R2.A2 and R1.A1 <> R2.A1);
The same as finding the max of A1, while duplicates are allowed.
select A1 from R as R1 where not exists (select * from R as R2 where R1.A1 < R2.A1);
This is the same as above.
select A1 from R where A1 >= all (select A1 from R);
Find the max of A2 when A1 is unique.
select A1, A2 from R as R1 where A2 > all (select A2 from R as R2 where R1.A1 <> R2.A1);
This is the same as above.
select A1, A2 from R as R1 where not A2 <= any (select A2 from R as R2 where R1.A1 <> R2.A1);
Find rows that don’t include the smallest A2 value (all
and any
may not be supported)
select A1, A2 from R where A2 > any (select A2 from R);
This is the same as above, but it’s not using any
.
select A1, A2 from R as R1 where exists (select * from R as R2 where R2.A2 < R1.A2);
Subqueries in from clause and select
By using a subquery, A3, the product of two existing columns, can be used in the condition.
select * from (select A1, A2, A1*A2 as A3 from R) as R1 where R1.A3 > 1;
For all A1, find the highest A2 value(s).
select distinct A1, A2 from R where A2 >= all (select A2 from R as R1 where R.A1 = R1.A1);
This is the same as above (select
subquery in select
clause should return only one row).
select distinct A1, (select distinct A2 from R as R2 where R1.A1 = R2.A1 and A2 >= all (select A2 from R as R3 where R2.A1 = R3.A1)) as A2 from R as R1;
The join family of operators
They can be expressed in other ways.
An example for inner join.
select R1.A1, R1.A2, R2.A3 from R1 join R2 on R1.A1 = R2.A1;
This is the same as above using using
. Both on
and using
cannot be used together.
select A1, A2 from R1 join R2 using (A1);
This is the same as above using natural join, but it’s not a good practice.
select A1, A2 from R1 natural join R2;
select R1.A1, R2.A2, A3 from R1, R2, R3 where R1.A1 = R2.A1 and R2.A2 = R3.A2;
is the same as
select R1.A1, R2.A2, A3 from R1 join R2 join R3 on R1.A1 = R2.A1 and R2.A2 = R3.A2;
Note that postgres only supports binary joins.
An examples for outer join. It can be expressed using union
and not in
.
select A1, A2 from R1 left/right/outer/full join R2 using (A1);
Note that outer joins are not associative, and left/right outer joins are not commutative.
Here is an interesting example with left outer join.
select R1.A1, R1.A2, R2.A3 from R1 left outer join R2 on R1.A2 = R2.A1 and R2.A3 > 1;
and
select R1.A1, R1.A2, R2.A3 from R1 left outer join R2 on R1.A2 = R2.A1 where R2.A3 > 1;
are different. If a condition is used with on
, that condition is used when finding matches in join; since this is left outer join, all rows of R1 will appear, but only values R2.A3 satisfying the condition will appear in the new table, giving NULL for all other rows. If the same condition is used with where
, only rows satisfying the condition will appear in the new table.
Aggregation (min, max, sum, avg, count)
select A1, A2, A3 from R1, R2, R3 where condition group by columns having condition;
Find the minimum of A1 in R.
select min(A1) from R;
Count the number for A2=’abc’ (only distinct A1).
select count(distinct A1) from R where A2 = 'abc';
Find the difference of averages of A2 for A1=’abc’ and A1 is not ‘abc’.
select R1.avgA2 - R2.avgA2 from (select avg(A2) as avgA2 from R where A1 = 'abc') as R1, (select avg(A2) as avgA2 from R where not A1 = 'abc') as R2;
This is the same as above.
select distinct (select avg(A2) from R where A1 = 'abc') - (select avg(A2) from R where not A1 = 'abc') from R;
An example for group by
.
select A2, count(*) from R1 group by A2;
An example for having
.
select A1 from R1 group by A2 having count(distinct A2) > 5;
count(*)
and count(column)
can be different because count does not count null values. count(nullif(exp1, exp2))
can be useful when counting rows with specific values of a column.
Expressions
1, case when
/else
select A1, case A2 when 1 then 'abc' when 2 then 'def' else ‘ghi' end from R1;
is the same as
select A1, case when A2 = 1 then 'abc' when A2 = 2 then 'def' else 'ghi' end as A2 from R;
2, coalesce
: returns the first non-null expression.
coalesce ( expression [ ,...n ] )
3, cast
and convert
: convert expression into given data_type.
cast ( expression AS data_type [ ( length ) ] ) convert ( data_type [ ( length ) ] , expression [ , style ] )
4, nullif
: return null if expression1 is equal to expression2
nullif ( expression1 , expression2 )
5, isnull
: return the alternative value if expression is null
isnull ( expression, alternative_value )
Miscellaneous
insert into R values (123, 'abc', null);
insert into R select …;
delete from R where A1 in (select A1 in R where A2 = 'abc');
update R set A1 = 123, A2 = 'abc', A3 = 456 where condition;
alter table R add col1 integer;
alter table R drop col1;