Basic SQL usages

I also wrote a cheat sheet for SQL for myself. For reference, I used the material from the online course, “Introduction to Databases”, offered by Stanford (you can currently access the whole course from here).

Basic statements

Cross product of given relations. In the join/filtering condition; ‘and’, ‘or’, ‘between/and’, ‘like’, ‘exists’, ‘not’, ‘in’, ‘all’, ‘any’, ‘is null’, ‘is not null’, etc can be used.

select A1, A2, A3
from R1, R2, R3
where condition;

If only distinct values are needed.

select distinct A1, A2, A3	
from R1, R2, R3
where condition;

Sorting by A1; ‘ascending’ is default.

select distinct A1, A2, A3
from R1, R2, R3
where condition
order by A1 desc;

Only when A1 contains the string ‘abc’.

select A1
from R1
where A1 like '%abc%'; 	

Table variables and set operators (Union, Intersect, Except)

Find the pairs with the same A1 values (assuming A2 is an index).

select distinct R11.A1, R11.A2
from R1 as R11, R1 as R12
where R11.A1 = R12.A1 and R11.A2 < R12.A2;

Returns the union; by default, union eliminates duplicates, use union all to allow duplicates.

select A1 as new_name from R1
union 
select A2 as new_name from R2;

Some SQL implementations don’t support intersect, but it can be easily done without it.

select A1 from R1 where A2 = 'abc'
intersect
select A1 from R1 where A2 = 'def';

Some SQL implementations don’t support except, and it cannot be easily done without it.

select A1 from R1 where A2 = 'abc'
except
select A1 from R1 where A2 = 'def';

Subqueries in where clause

select A1, A2
from R1
where A1 in (select A1 from R2 where A3 = 'abc');

is the same as

select distinct R1.A1, A2
from R1, R2
where R1.A1 = R2.A1 and A3 = 'abc';

assuming A1 is an index for both R1 and R2.

The except example above can be written as this.

select A1
from R
where A1 in (select A1 from R where A2 = 'abc')
    and A1 not in (select A1 from R where A2 = 'def');

Find all rows that have any row with the same A2 values when A1 is unique.

select A1, A2
from R as R1
where exists (select * from R as R2
                      where R1.A2 = R2.A2 and R1.A1 <> R2.A1);

The same as finding the max of A1, while duplicates are allowed.

select A1
from R as R1
where not exists (select * from R as R2
                      where R1.A1 < R2.A1);

This is the same as above.

select A1
from R
where A1 >= all (select A1 from R);

Find the max of A2 when A1 is unique.

select A1, A2
from R as R1
where A2 > all (select A2 from R as R2
                      where R1.A1 <> R2.A1);

This is the same as above.

select A1, A2
from R as R1
where not A2 <= any (select A2 from R as R2
                      where R1.A1 <> R2.A1);

Find rows that don’t include the smallest A2 value (all and any may not be supported)

select A1, A2
from R
where A2 > any (select A2 from R);

This is the same as above, but it’s not using any.

select A1, A2
from R as R1
where exists (select * from R as R2
                      where R2.A2 < R1.A2);

Subqueries in from clause and select

By using a subquery, A3, the product of two existing columns, can be used in the condition.

select *
from (select A1, A2, A1*A2 as A3 from R) as R1
where R1.A3 > 1;

For all A1, find the highest A2 value(s).

select distinct A1, A2
from R
where A2 >= all (select A2 from R as R1 where R.A1 = R1.A1);

This is the same as above (select subquery in select clause should return only one row).

select distinct A1, (select distinct A2 from R as R2 where R1.A1 = R2.A1 and 
                        A2 >= all (select A2 from R as R3 where R2.A1 = R3.A1)) as A2
from R as R1;

The join family of operators

They can be expressed in other ways.

An example for inner join.

select R1.A1, R1.A2, R2.A3
from R1 join R2
on R1.A1 = R2.A1;

This is the same as above using using. Both on and using cannot be used together.

select A1, A2
from R1 join R2 using (A1);

This is the same as above using natural join, but it’s not a good practice.

select A1, A2
from R1 natural join R2;
select R1.A1, R2.A2, A3
from R1, R2, R3
where R1.A1 = R2.A1 and R2.A2 = R3.A2;

is the same as

select  R1.A1, R2.A2, A3
from R1 join R2 join R3
on R1.A1 = R2.A1 and R2.A2 = R3.A2;

Note that postgres only supports binary joins.

An examples for outer join. It can be expressed using union and not in.

select A1, A2
from R1 left/right/outer/full join R2 using (A1);

Note that outer joins are not associative, and left/right outer joins are not commutative.

Here is an interesting example with left outer join.

select R1.A1, R1.A2, R2.A3
from R1 left outer join R2 on R1.A2 = R2.A1 and R2.A3 > 1;

and

select R1.A1, R1.A2, R2.A3
from R1 left outer join R2 on R1.A2 = R2.A1
where R2.A3 > 1;

are different. If a condition is used with on, that condition is used when finding matches in join; since this is left outer join, all rows of R1 will appear, but only values R2.A3 satisfying the condition will appear in the new table, giving NULL for all other rows. If the same condition is used with where, only rows satisfying the condition will appear in the new table.

Aggregation (min, max, sum, avg, count)

select A1, A2, A3
from R1, R2, R3	
where condition	
group by columns
having condition;

Find the minimum of A1 in R.

select min(A1)
from R;

Count the number for A2=’abc’ (only distinct A1).

select count(distinct A1)
from R
where A2 = 'abc';

Find the difference of averages of A2 for A1=’abc’ and A1 is not ‘abc’.

select R1.avgA2 - R2.avgA2
from (select avg(A2) as avgA2 from R
         where A1 = 'abc') as R1,
     (select avg(A2) as avgA2 from R
         where not A1 = 'abc') as R2;

This is the same as above.

select distinct (select avg(A2) from R where A1 = 'abc') -
        (select avg(A2) from R where not A1 = 'abc')
from R;

An example for group by.

select A2, count(*)
from R1
group by A2;

An example for having.

select A1
from R1
group by A2
having count(distinct A2) > 5;

count(*) and count(column) can be different because count does not count null values. count(nullif(exp1, exp2)) can be useful when counting rows with specific values of a column.

Expressions

1, case when/else

select A1,
case A2
  when 1 then 'abc'
  when 2 then 'def'
  else ‘ghi'
end
from R1;

is the same as

select A1,
case
  when A2 = 1 then 'abc'
  when A2 = 2 then 'def'
  else 'ghi'
end as A2
from R;

2, coalesce: returns the first non-null expression.

coalesce ( expression [ ,...n ] )

3, cast and convert: convert expression into given data_type.

cast ( expression AS data_type [ ( length ) ] )
convert ( data_type [ ( length ) ] , expression [ , style ] )

4, nullif: return null if expression1 is equal to expression2

nullif ( expression1 , expression2 )

5, isnull: return the alternative value if expression is null

isnull ( expression, alternative_value )

Miscellaneous

insert into R values (123, 'abc', null);
insert into R 
select …;
delete from R
where A1 in (select A1 in R where A2 = 'abc');
update R
set A1 = 123, A2 = 'abc', A3 = 456
where condition;
alter table R add col1 integer;
alter table R drop col1;

Leave a comment