Today I Learned

A Hashrocket project

139 posts about #sql

Show Tables That Match A Pattern In MySQL

An unfamiliar MySQL database with tons of tables can be a difficult thing to navigate. You may have an idea of the kind of table you are looking for based on a domain concept you’ve seen elsewhere.

You can pare down the results returned by show tables by including a like clause with a pattern. For example, this statement will show me only tables that have the word user in them:

> show tables like '%user%';
| Tables_in_jbranchaud (%user%) |
| admin_users                   |
| users                         |

Dump A MySQL Database To A File

The mysqldump client is a handy tool for creating a backup or snapshot of a MySQL database. The standard use of this command produces an alphabetical series of statements that comprise the structure and data of the specified database. It directs all of this to stdout. You’ll likely want to redirect it to a file.

$ mysqldump my_database > my_database_backup.sql

The output will include special comments with MySQL directives that disable things like constraint checking. This is what allows the output to be in alphabetical order without necessarily violating any foreign key constraints.

If you need to dump multiple databases, include the --databases flag with a space-separated list of database names. Or dump all of them with --all-databases.

See man mysqldump for more details.

Display Output In A Vertical Format In MySQL

Output for tables with lots of columns can be hard to read and sometimes overflow the terminal window. Consider the output from Show Indexes For A Table:

> show indexes in users;
| Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
| users |          0 | PRIMARY      |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| users |          0 | unique_email |            1 | email       | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |

We can vertically orient the output of a statement by terminating it with \G instead of ; (or \g).

> show indexes in users\G
*************************** 1. row ***************************
        Table: users
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 0
     Sub_part: NULL
       Packed: NULL
   Index_type: BTREE
*************************** 2. row ***************************
        Table: users
   Non_unique: 0
     Key_name: unique_email
 Seq_in_index: 1
  Column_name: email
    Collation: A
  Cardinality: 0
     Sub_part: NULL
       Packed: NULL
   Index_type: BTREE

Show Indexes For A Table In MySQL

When describing a table, such as users:

> describe users;
| Field      | Type                  | Null | Key | Default | Extra          |
| id         | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
| first_name | varchar(80)           | NO   |     | NULL    |                |
| last_name  | varchar(80)           | NO   |     | NULL    |                |
| email      | varchar(80)           | NO   | UNI | NULL    |                |

We can see in the Key column that there’s a primary key and a unique key for this table on id and email, respectively.

These keys are indexes. To get more details about each of the indexes on this table, we can use the show indexes command.

> show indexes in users;
| Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
| users |          0 | PRIMARY      |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |
| users |          0 | unique_email |            1 | email       | A         |           0 |     NULL | NULL   |      | BTREE      |

List Databases And Tables In MySQL

If you’ve started a mysql session, but haven’t connected to a particular database yet, you can list the available databases like so:

> show databases;
| Database                    |
| information_schema          |
| my_app_dev                  |

If you are curious about the tables in a particular database, you can list them by specifying the database’s name:

> show tables in my_app_dev;
| Tables_in_my_app_dev         |
| pokemons                     |
| trainers                     |

Alternatively, you can connect to the database of interest and then there is no need to specify the name of the database going forward.

> use my_app_dev;
> show tables;
| Tables_in_my_app_dev         |
| pokemons                     |
| trainers                     |

`any` and `all` special Postgres constructs

To determine whether a value exists in a subquery or scalar value list, you can use the in operator:

> select 1 in (1, 2)

any and all go a step further: they let you pair an operator with the construct, and that operator is applied against each value in the subquery or array (these constructs do not accept a scalar value list).

any examples:

> select 1 = any(Array[1, 2, 3])
> select 1 > any(Array[1, 2, 3])

all examples:

> select 17 <> all(Array[2, 4, 8])
> select 17 < all(Array[2, 4, 8])

some is a synonym for any.
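Python's built-in any() and all() paired with generator expressions make a handy mental model for these constructs. A rough analog (not from the original post):

```python
values = [1, 2, 3]

# 1 = any(Array[1, 2, 3]) -- true if the operator holds for at least one element
print(any(1 == v for v in values))  # True

# 1 > any(Array[1, 2, 3]) -- 1 is not greater than any element
print(any(1 > v for v in values))   # False

values = [2, 4, 8]

# 17 <> all(Array[2, 4, 8]) -- 17 differs from every element
print(all(17 != v for v in values))  # True

# 17 < all(Array[2, 4, 8]) -- 17 is not less than every element
print(all(17 < v for v in values))   # False
```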

Set foreign key to null on delete in #Postgres

Say you have two tables, posts and images, where each post references its primary image through a foreign key:

-- posts
id serial primary key
primary_image_id references images (id)

-- images
id serial primary key
post_id references posts (id)

If you attempt to delete an image from the images table while that image’s ID is referenced by one of the posts, you will receive an error preventing the delete. Postgres is protecting your data integrity by preventing foreign keys that point to records which don’t exist.

To prevent errors and allow deleting records from the images table freely you must define the on delete strategy on the posts table.

One of the options available to you is set null so if you are creating a new table it will look like this:

create table posts (
  id serial primary key,
  -- ...
  primary_image_id int references images (id) on delete set null
);

Now if the primary image is deleted it will set the primary_image_id to null.
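This isn't Postgres, but the behavior can be sketched with Python's stdlib sqlite3, which also supports on delete set null (table names mirror the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # sqlite enforces foreign keys only when asked
conn.execute("create table images (id integer primary key)")
conn.execute("""
    create table posts (
      id integer primary key,
      primary_image_id integer references images (id) on delete set null
    )
""")
conn.execute("insert into images (id) values (1)")
conn.execute("insert into posts (id, primary_image_id) values (10, 1)")

# deleting the referenced image nulls out the foreign key instead of erroring
conn.execute("delete from images where id = 1")
row = conn.execute("select primary_image_id from posts where id = 10").fetchone()
print(row[0])  # None
```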

This is an alternative to on delete cascade, which in this case would delete the post from the posts table and is not what we want.

Read the full documentation under ‘5.3.5. Foreign Keys’

Compute group proportions of a whole in SQL

It’s easy with SQL to use aggregates to count the members of a group. But what if you also want each group’s proportion of the whole? The answer is to combine a window function with the aggregate. The window function is evaluated after the normal aggregation, i.e. it can reference the values of other post-aggregation rows.

select type, count(*), count(*)::numeric / sum(count(*)) over ()
from pets
group by type;
  type  | count |        ?column?        
 dog    |    13 | 0.52000000000000000000
 rabbit |     3 | 0.12000000000000000000
 cat    |     9 | 0.36000000000000000000
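The same computation, sketched in Python to show what the window function sees: first aggregate per group, then divide each count by the sum over all the already-aggregated rows.

```python
from collections import Counter

# toy data matching the counts in the output above
pets = ["dog"] * 13 + ["rabbit"] * 3 + ["cat"] * 9

counts = Counter(pets)               # the group by / count(*) step
total = sum(counts.values())         # sum(count(*)) over () sees post-aggregation rows
proportions = {t: n / total for t, n in counts.items()}

print(proportions["dog"])  # 0.52
```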

Pretty Printing JSONB Rows in PostgreSQL

Who needs a document store when you can just use PostgreSQL’s JSONB data type? Viewing rows of JSONB output can be challenging though because it defaults to printing them as a single line of text.

> select '{"what": "is this", "nested": {"items 1": "are the best", "items 2": [1, 2, 3]}}'::jsonb;
 {"what": "is this", "nested": {"items 1": "are the best", "items 2": [1, 2, 3]}}
(1 row)

Fortunately, Postgres comes with a function for prettying up the format of the output of these rows — jsonb_pretty

> select jsonb_pretty('{"what": "is this", "nested": {"items 1": "are the best", "items 2": [1, 2, 3]}}'::jsonb);
 {                                 +
     "what": "is this",            +
     "nested": {                   +
         "items 1": "are the best",+
         "items 2": [              +
             1,                    +
             2,                    +
             3                     +
         ]                         +
     }                             +
 }
(1 row)
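If you'd rather pretty-print on the client side after the query, Python's json.dumps with an indent does the same job:

```python
import json

doc = {"what": "is this", "nested": {"items 1": "are the best", "items 2": [1, 2, 3]}}

# indent=4 mirrors jsonb_pretty's multi-line, indented layout
print(json.dumps(doc, indent=4))
```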

h/t Jack Christensen

Write A Query Result To File With Postgres

Generally when writing a query in psql a statement will be terminated with a semicolon. An alternative approach is to end it with a \g instead. This will also send the query to the Postgres server for execution.

select 1 \g

If a filename is included after the \g, then the result of the query will be written to that file instead of output to the psql session.

> select 1, 2, 3 \g query_result.txt

If we cat that file, we can see the query result.

Time: 4.293 ms
> \! cat query_result.txt
 ?column? | ?column? | ?column?
        1 |        2 |        3
(1 row)

See man psql for more details.

Age in Postgres

On this day every year I get curious about how old I actually am. Postgres has some utilities to help me out with that, like subtraction of dates.

> select now() - '1978-3-24';
14245 days 17:39:55.623866

Ah yes, 14245 days on this earth; it really makes you think. What got returned anyway? That looks weird:

> select pg_typeof(now() - '1978-3-24');
interval

Got it, that’s an interval. It doesn’t tell me much, though. Is this interval infinite?

> select isfinite(now() - '1978-3-24');
t

I can squint and read isfinite as infinite and suddenly my outlook changes, but only momentarily.

Postgres, just tell me how many years I’ve been alive ok? None of this days stuff.

> select age('1978-3-24'::date);
39 years

Create jsonb data structure from columns

You can create a jsonb object by casting a string version of the object to jsonb:

> select '{"a": 1, "b": 2}'::jsonb;
 {"a": 1, "b": 2}

When using values from columns to create a jsonb object, it’s not practical to construct a string. In this case you want to use jsonb_build_object:

> select jsonb_build_object('a', a.x) from (select x from generate_series(1, 10) as x) as a;
 {"a": 1}
 {"a": 2}
 {"a": 3}
 {"a": 4}
 {"a": 5}
 {"a": 6}
 {"a": 7}
 {"a": 8}
 {"a": 9}
 {"a": 10}

jsonb_build_object is a variadic function, so you can pass additional key/value pairs like this:

jsonb_build_object('a', 1, 'b', 2)
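The variadic pairing is the same trick as building a dict from alternating keys and values. A hypothetical Python equivalent (build_object is made up for illustration):

```python
import json

def build_object(*args):
    # alternating keys and values, like jsonb_build_object('a', 1, 'b', 2)
    if len(args) % 2 != 0:
        raise ValueError("expected an even number of arguments")
    return dict(zip(args[::2], args[1::2]))

print(json.dumps(build_object("a", 1, "b", 2)))  # {"a": 1, "b": 2}
```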

Prepare, Execute, and Deallocate Statements

In PostgreSQL, you can prepare a named statement to be executed later using prepare.

> prepare column_names (text) as
    select column_name from information_schema.columns where table_name = $1;

These statements are kept around for the duration of the session. To see the available statements, check out the pg_prepared_statements view.

> select * from pg_prepared_statements;
     name     |                                  statement                                  |         prepare_time          | parameter_types | from_sql
 column_names | prepare column_names (text) as                                             +| 2017-03-10 15:01:09.154528-06 | {text}          | t
              |   select column_name from information_schema.columns where table_name = $1; |                               |                 |

To run a prepared statement, use execute with the name of the statement and any arguments.

> execute column_names('users');

You can also delete a statement with deallocate if you’d like.

> deallocate column_names;

New PostgreSQL 9.6 slice syntax

PostgreSQL arrays support slice operations.

jack=# select ('{a,b,c,d,e,f}'::text[])[2:4];

New in PostgreSQL 9.6 is slice syntax with implicit lower and upper bounds.

jack=# select ('{a,b,c,d,e,f}'::text[])[:4];
jack=# select ('{a,b,c,d,e,f}'::text[])[3:];

Previously, array_lower and array_upper were needed to slice from a particular index to the bounds of an array.
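For readers more at home in Python, the same slices look like this (mind the indexing difference: Postgres arrays are 1-based with inclusive bounds, Python is 0-based and end-exclusive):

```python
letters = ["a", "b", "c", "d", "e", "f"]

# Postgres [2:4] is b,c,d; in Python that's the slice [1:4]
print(letters[1:4])  # ['b', 'c', 'd']

# the 9.6 shorthand [:4] and [3:] corresponds to omitting a Python bound
print(letters[:4])   # ['a', 'b', 'c', 'd']
print(letters[2:])   # ['c', 'd', 'e', 'f']
```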

Between Symmetric in PostgreSQL

PostgreSQL’s between construct allows you to make a comparison between two values (numbers, timestamps, etc.).

> select * from generate_series(1,10) as numbers(a)
    where numbers.a between 3 and 6;

If you supply an empty range by using the larger of the two values first, an empty set will result.

> select * from generate_series(1,10) as numbers(a)
    where numbers.a between 6 and 3;

Tacking symmetric onto the between construct is one way to avoid this issue.

> select * from generate_series(1,10) as numbers(a)
    where numbers.a between symmetric 6 and 3;

BETWEEN SYMMETRIC is the same as BETWEEN except there is no requirement that the argument to the left of AND be less than or equal to the argument on the right. If it is not, those two arguments are automatically swapped, so that a nonempty range is always implied.
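The bound-swapping behavior quoted above can be sketched in a few lines of Python (between_symmetric is a hypothetical helper, not a real API):

```python
def between_symmetric(value, a, b):
    # swap the bounds when they arrive in descending order, as BETWEEN SYMMETRIC does
    lo, hi = (a, b) if a <= b else (b, a)
    return lo <= value <= hi

# mirrors: select * from generate_series(1,10) where a between symmetric 6 and 3
print([n for n in range(1, 11) if between_symmetric(n, 6, 3)])  # [3, 4, 5, 6]
```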

group by 1, 2, 3 🔢

Postgres allows group by and order by to reference columns by their position in the select list. This is particularly useful when an aggregate result needs to be referenced in the group by or order by clause.

-- group by aggregate
select
  (created_at at time zone 'UTC' at time zone 'America/Chicago')::date,
  count(*)
from posts
group by (created_at at time zone 'UTC' at time zone 'America/Chicago')::date
order by (created_at at time zone 'UTC' at time zone 'America/Chicago')::date;


-- group by 1, 2, 3
select
  (created_at at time zone 'UTC' at time zone 'America/Chicago')::date,
  count(*)
from posts
group by 1
order by 1;

Generating UUIDs With pgcrypto

If you check out the docs for the uuid-ossp extension, you’ll come across the following message.

The OSSP UUID library… is not well maintained, and is becoming increasingly difficult to port to newer platforms.

A little bit later, it says:

If you only need randomly-generated (version 4) UUIDs, consider using the gen_random_uuid() function from the pgcrypto module instead.

So, if we are using the UUID data type and only need to generate random UUIDs, we can rely on the pgcrypto extension. It comes with the gen_random_uuid() function which generates random v4 UUIDs.

> create extension "pgcrypto";

> select gen_random_uuid();

> select gen_random_uuid();
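gen_random_uuid() produces version 4 (random) UUIDs, the same kind Python's stdlib generates with uuid.uuid4(), which is a quick way to see what "v4" means:

```python
import uuid

u = uuid.uuid4()   # a random (version 4) UUID, new on every call
print(u.version)   # 4
```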

Postgres Intersect

Today I got to see some great SQL debugging that included the use of intersect to compare two datasets.

intersect computes the set intersection of the rows returned by the involved select statements.

Here’s a simple implementation:

dev=# select generate_series(1,1000) intersect 
  select generate_series(10,15);

Like a lot of things in Postgres, the result order seems deterministic, but it isn’t guaranteed.
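The same set intersection expressed with Python sets (with a deterministic sort applied, since intersect itself guarantees no order):

```python
a = set(range(1, 1001))   # generate_series(1, 1000)
b = set(range(10, 16))    # generate_series(10, 15)

print(sorted(a & b))  # [10, 11, 12, 13, 14, 15]
```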



The RETURNING Clause In PostgreSQL

Today I learned that PostgreSQL’s INSERT/UPDATE/DELETE statements support a RETURNING clause that returns the computed values of the affected rows.

We can use it to get values that are generated by database such as sequences or defaults.

Check this out:



INSERT INTO users (email) VALUES ('');
-- INSERT 0 1

INSERT INTO users (email) VALUES ('') RETURNING *;
--  id |       email
-- ----+-------------------
--   2 |
-- (1 row)

h/t @joshuadavey

Aggregate Expressions with FILTER in PostgreSQL

Today I learned to use FILTER on aggregate expressions such as COUNT in PostgreSQL.

See this example:

SELECT u.login,
       COUNT(1) FILTER (WHERE r.language ilike 'ruby') as ruby,
       COUNT(1) FILTER (WHERE r.language ilike 'javascript') as js,
       COUNT(1) FILTER (WHERE r.language ilike 'elixir') as elixir,
       COUNT(1) as all_langs
FROM users u
LEFT JOIN repositories r ON (u.id = r.user_id)
GROUP BY u.login;

-- login | ruby | js | elixir | all_langs
-- bill  |    5 |  2 |      3 |         15
-- karen |    2 |  7 |      4 |         19
-- bob   |    9 |  1 |      2 |         23
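The filtered counts can be mimicked in Python by counting only the rows a predicate matches; the data below is hypothetical, standing in for the users/repositories join:

```python
repos = [
    {"login": "bill", "language": "Ruby"},
    {"login": "bill", "language": "Ruby"},
    {"login": "bill", "language": "Elixir"},
    {"login": "karen", "language": "JavaScript"},
]

def count_filter(rows, login, predicate=lambda r: True):
    # COUNT(1) FILTER (WHERE ...) counts only the rows matching the predicate
    return sum(1 for r in rows if r["login"] == login and predicate(r))

print(count_filter(repos, "bill", lambda r: r["language"].lower() == "ruby"))  # 2
print(count_filter(repos, "bill"))  # 3  (the all_langs column)
```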

h/t @joshuadavey

Migrate from MySQL to Postgres

Today I wanted to see if a large client app would see any performance improvement using Postgres vs. MySQL.

Luckily there is a great tool called pgloader that can quickly scan and import a MySQL DB right into Postgres.

Homebrew FTW!

$ brew install pgloader

Create an import script to define the connection strings to each of your databases and to configure the import options you want.

-- import_mysql.load
FROM mysql://root@localhost/db_name
INTO postgresql://localhost/db_name

 WITH include drop, create tables, no truncate,
      create indexes, reset sequences, foreign keys;

Run the import:

$ pgloader import_mysql.load

If you get this error:

An unhandled error condition has been signalled:
   MySQL Error [1055]: "Expression #6 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'information_schema.rc.UPDATE_RULE' which is not functionally dependent on columns in GROUP
 BY clause; this is incompatible with sql_mode=only_full_group_by"

Then create a MySQL configuration file (if you don’t have one already) and open it:

$ sudo cp $(brew --prefix mysql)/support-files/my-default.cnf /etc/my.cnf
$ sudo vim /etc/my.cnf

Make sure ONLY_FULL_GROUP_BY is not in the sql_mode list (the list below is MySQL 5.7’s default set minus that mode):

# /etc/my.cnf
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION

Now restart MySQL and try to import with pgloader again.

Two swings at Gaussian distribution in Postgres

I’ve come across two ways to replicate a bell curve.

The first is adding 3 random numbers together:

select
  floor(random()*3)+floor(random()*3)+floor(random()*3) as y,
  count(*)
from generate_series(1,10000000) group by y order by y;
 y |  count
 0 |  369905
 1 | 1111997
 2 | 2222763
 3 | 2593306
 4 | 2220452
 5 | 1111506
 6 |  370071
(7 rows)

Time: 18806.885 ms

Consistent distribution, but 18 seconds for 10 million rows!

The second method has a tail end of the curve that keeps going even as it tapers to 0. This uses the normal_rand function provided by Postgres’s tablefunc extension, with arguments of quantity, mean, and standard deviation.

select x, count(*)
from (select abs(ceil(normal_rand(10000000,2,1))) as x) as y
group by x order by x;
 x |  count
 0 |  214152
 1 | 1373200
 2 | 3414028
 3 | 3411424
 4 | 1359447
 5 |  214194
 6 |   13234
 7 |     319
 8 |       2
(9 rows)

Time: 3221.894 ms

I want the peak of the curve around 2 and 3, but with this method of Gaussian distribution I get results below 0 unless I use abs. And it’s fast: 3 seconds!
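The first method's bell shape is just the central limit theorem in miniature, easy to reproduce in Python (seeded here so the run is deterministic):

```python
import random
from collections import Counter

random.seed(42)

# summing three draws from {0, 1, 2} gives totals 0..6 that pile up in the middle
samples = [sum(random.randrange(3) for _ in range(3)) for _ in range(100_000)]
counts = Counter(samples)

print(max(counts, key=counts.get))  # 3, the peak of the bell
```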

Ever so slightly faster intervals in postgres

When working with dates in postgres I often construct an interval like so:

> select (2 || ' days')::interval;
2 days

Which is fine! But there’s an alternative that’s ever so slightly faster.

> select 2 * '1 day'::interval;
2 days

The time difference can be seen over large quantities of data, like 10 million.

> select (x || ' days')::interval from generate_series(1, 10000000) as x;
Time: 10437.314 ms
 > select x * '1 days'::interval from generate_series(1, 10000000) as x;
Time: 7831.568 ms

Microbenchmarking in action! Remember, take all microbenchmarks with a grain of salt.

Postgres age function #postgresql

If you want to select records according to a specific interval, like Rails’ ActiveSupport 1.year.ago, PostgreSQL has you covered.

The age function returns an interval type and can be used in queries like so:

select * from sometbl where age(created_at) > '1 year';

By default the age function will use the current system time to calculate the age. If you want to calculate the age relative to a different time you can simply pass in another argument:

psql> select age(timestamp '2016-08-28', timestamp '1957-06-13');
 59 years 2 mons 15 days
(1 row)

You can also use the make_interval function to create intervals using numeric parameters:

psql> select make_interval(1,2,3,4,5);
 1 year 2 mons 25 days 05:00:00
 (1 row)

In a query it can be used like so:

select * from sometbl where age(created_at) > make_interval(1);

to select rows with created_at older than one year from now.

To read more about the age function and other cool date/time manipulations, see the official documentation.

h/t Jack Christensen

Psql Command with Tuples Only

With psql, we can run any string command once and exit using the -c (command) flag. Here’s an example:

$ psql hr-til_development -c 'select title, slug from posts limit 5';
                  title                   |    slug
 Start rails server in production mode    | 97e26f5f68
 Enable ScreenSharing from a terminal     | 55dfe1a1a3
 Variable Hoisting in JavaScript          | eeedb8dda0
 Three ways to compile Elixir             | f2a423d1e1
 Indent and de-indent line in INSERT MODE | 8de0fb5d93
(5 rows)

If you are parsing the result, the headers can get in the way. This is where the -t (tuples only) flag is useful. Same query:

$ psql hr-til_development -tc 'select title, slug from posts limit 5';
 Start rails server in production mode    | 97e26f5f68
 Enable ScreenSharing from a terminal     | 55dfe1a1a3
 Variable Hoisting in JavaScript          | eeedb8dda0
 Three ways to compile Elixir             | f2a423d1e1
 Indent and de-indent line in INSERT MODE | 8de0fb5d93

Notice the headers and row count are gone. Toggle this in a psql session with the \t command.

Grep your Database 🔎

Wading through a database full of text (as might support a CMS) and feeling lost? Sometimes you gotta ditch the queries and just start grepping.

This Stack Overflow post sums up the idea. Here’s a version of the command we used today to pick through a text-rich database:

$ curl `heroku pg:backups public-url -rproduction` | pg_restore | egrep -o '[a-z]+\'

This found, via a regex match, all subdomains of our target URL, in any text field of the production database. Pretty awesome.

Treat null as if it is a known value #postgresql

When you query a table sometimes you want to check if a nullable field is not equal to a value. For example:

select * from sometable where afield != 'avalue';

However the query above will exclude rows where afield is null, so you would typically add that as an additional condition:

select * from sometable where afield is null or afield != 'avalue';

Doing this once may be OK, but as queries get bigger it makes them messy and harder to read. Fortunately Postgres offers a more idiomatic way to check if a value does not equal something, including null values: is distinct from and is not distinct from.

select * from sometable where afield is distinct from 'avalue';

This query will return all the rows where afield is null or anything but avalue. Conversely:

select * from sometable where afield is NOT distinct from (select x from y limit 1);

will return all the values that are equal to the result of the subquery above and is useful when the result of the subquery could be null.
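The null-aware comparison can be sketched in Python, treating None the way is distinct from treats null (is_distinct_from is a hypothetical helper):

```python
def is_distinct_from(a, b):
    # null-aware inequality: None counts as a known, comparable value
    if a is None or b is None:
        return (a is None) != (b is None)
    return a != b

rows = ["avalue", "other", None]
print([v for v in rows if is_distinct_from(v, "avalue")])  # ['other', None]
print(is_distinct_from(None, None))  # False: null is not distinct from null
```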

h/t Jack Christensen


`NULLIF` in Postgres

If there are multiple values in a database column that all mean the column for that row has no data, the NULLIF function can be used in queries to normalize the values in that column.

Set the null display to 😵 and watch nullif at work:

chriserin=# \pset null 😵
Null display is "😵".
chriserin=# select nullif('NOTHING', 'NOTHING');
 😵
(1 row)

chriserin=# select nullif('something', 'NOTHING');
 something
(1 row)

Nulls Last, Nulls First

Postgres query sorting has a great feature, nulls first/nulls last, that helps us specify where null values appear in a query result.

Nulls in Postgres are sorted as larger than any other value, so this is a descending sort containing null values (💀 == null):

example=# select * from users order by full_name desc;
 💀
 💀
 💀
 Tracy Jordan
 Liz Lemon
 Jack Donaghy
(6 rows)

Reverse the null position with nulls last:

example=# select * from users order by full_name desc nulls last;
 Tracy Jordan
 Liz Lemon
 Jack Donaghy
 💀
 💀
 💀
(6 rows)

Here’s the opposite query, ascending:

example=# select * from users order by full_name asc;
 Jack Donaghy
 Liz Lemon
 Tracy Jordan
 💀
 💀
 💀
(6 rows)

Now, reverse the null position with nulls first:

example=# select * from users order by full_name asc nulls first;
 💀
 💀
 💀
 Jack Donaghy
 Liz Lemon
 Tracy Jordan
(6 rows)

Lots of ways to explore our data.
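A minimal Python sketch of the same rules, with None standing in for null (order_by is a hypothetical helper mirroring Postgres defaults):

```python
def order_by(values, desc=False, nulls_first=None):
    # Postgres sorts null as larger than any value, so by default
    # desc puts nulls first and asc puts them last
    if nulls_first is None:
        nulls_first = desc
    non_null = sorted((v for v in values if v is not None), reverse=desc)
    nulls = [v for v in values if v is None]
    return nulls + non_null if nulls_first else non_null + nulls

names = ["Liz Lemon", None, "Jack Donaghy", "Tracy Jordan"]
print(order_by(names, desc=True))                     # nulls lead by default
print(order_by(names, desc=True, nulls_first=False))  # nulls last
```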

Create Database from a Template

Today I got to see Postgres’ createdb invoked with a specified template, as such:

$ createdb -T prod_dump myapp_development

The -T flag corresponds to the TEMPLATE modifier in a SQL CREATE DATABASE statement. In the example above, this creates myapp_development as a deep clone (structure and data) of prod_dump.

For more info, see:

$ man createdb


$ psql
user=# \h create database;

h/t Brian Dunn

Determining The Age Of Things

In PostgreSQL, we can determine the age of something (or someone) by passing a timestamp to the age function.

For instance, if we want to know how long it has been since y2k, we can run the following query:

> select age(timestamp '2000-01-01');
 16 years 4 mons 12 days

Additionally, if we want to know the amount of time between two dates, we can pass two timestamps to the age function.

For example, we can find out how old Prince lived to be by passing in the date of death and then date of birth:

> select age(timestamp 'April 21, 2016', timestamp 'June 7, 1958');
 57 years 10 mons 14 days

h/t Josh Davey

Dump And Restore A PostgreSQL Database

PostgreSQL comes with two command-line utilities for dumping and then restoring a database — pg_dump and pg_restore, respectively.

Using the pg_dump with the -Fc flag will create a dump of the given database in a custom format. The output of this command can be redirected into a file (the .dump extension is a standard convention):

$ pg_dump -Fc my_database > my_database.dump

Using the custom format option provides a couple of benefits. The output is significantly compressed in comparison to a generic SQL dump. The dump and restoration are more flexible. Lastly, the dump can be performed in parallel if your machine has multiple cores to work with. Likewise, the restoration can be done in parallel with multiple jobs.

To restore the dump, create a fresh database and then use pg_restore:

$ createdb my_new_database
$ pg_restore -d my_new_database my_database.dump

Note: the dumped tables will depend on some user role. You will need to ensure that this role exists on the database cluster where the restore is happening. You can use the createuser command if necessary.

See the pg_dump docs and pg_restore docs for more details.

List Database Objects With Disk Usage

I’ll often use \d or \dt to check out the tables in my database. This shows the schema, object name, object type (e.g. table), and owner for each.

By adding the + to that meta-command, I can also see the disk usage for each database object.

Here is an example of looking at all the tables in a database with the additional Size (disk usage) information:

> \dt+
                              List of relations
 Schema |        Name        | Type  |   Owner    |    Size    | Description
 public | amount_types       | table | jbranchaud | 16 kB      |
 public | ingredient_amounts | table | jbranchaud | 8192 bytes |
 public | ingredient_types   | table | jbranchaud | 16 kB      |
 public | ingredients        | table | jbranchaud | 48 kB      |
 public | recipes            | table | jbranchaud | 16 kB      |
 public | schema_migrations  | table | jbranchaud | 16 kB      |
 public | users              | table | jbranchaud | 16 kB      |

Add a Comment in Postgres

Today I learned about a neat Postgres feature, commenting. Here is a statement adding a comment to a table:

hr-til_development=#  comment on table developers is 'these are the authors';

Let’s check it out:

hr-til_development=# \dt+ developers;
List of relations
-[ RECORD 1 ]----------------------
Schema      | public
Name        | developers
Type        | table
Owner       | jwworth
Size        | 16 kB
Description | these are the authors

According to the documentation, you can comment on many things. Use this wherever you feel a comment is necessary or appropriate.

h/t Chris Erin and Jack Christensen

Get The Size Of An Index

Want to get an idea of how much disk space that additional index is taking up? In PostgreSQL, you can query for it with the same methods discussed in Get The Size Of A Table and Pretty Print Data Sizes.

For instance, if I have a table with a users_pkey index and a users_unique_lower_email_idx index, I can check the sizes like so:

> select pg_size_pretty(pg_relation_size('users_pkey'));
 240 kB

> select pg_size_pretty(pg_relation_size('users_unique_lower_email_idx'));
 704 kB


Change The Current Directory For psql

When you start a psql session, your current directory is what psql will use as its current directory. This is important for meta-commands that use relative paths based on the current directory — for instance, the \i meta-command for importing files.

You can change the current directory within a psql session using the \cd meta-command.

If my current directory is home and there is a sql directory in home, these commands will do the following:

\! pwd
-- /home
\cd sql
\! pwd
-- /home/sql

The \cd meta-command even supports tab completion relative to the current directory. You can also change to your home directory using just \cd.

Use Not Valid To Immediately Enforce A Constraint

When adding a constraint to a table, you can optionally include not valid. This tells Postgres that it doesn’t need to enforce the constraint on existing records in the table. At least not immediately. This constraint will be enforced for any updates and subsequent insertions. Thus, you can immediately enforce the constraint while giving yourself time to clean up or massage any existing records that conflict with the constraint.

Here is an example of how you would add a constraint this way:

alter table boxes
add constraint check_valid_length
check (length > 0) not valid;

Eventually, you will want to ensure that all data in the table conforms to the constraint. Once you get to that point, you can mark the constraint as valid with a validate constraint command:

alter table boxes
validate constraint check_valid_length;

As long as all records are valid with respect to this constraint, it will be marked as valid.

h/t Chris Erin

Lower Is Faster Than ilike

There are a couple ways to do a case-insensitive comparison of data in PostgreSQL. One way is to use the ilike operator for comparison. Another way is to use the lower() function on both sides of the = operator for comparison. Using lower() is a bit faster than using ilike.

When comparing

select * from users where email ilike '';

and

select * from users where lower(email) = lower('');

we find (via explain analyze) that using lower() was taking around 12ms whereas the ilike example was taking around 17ms.

We gain orders of magnitude in performance by adding a functional index that uses the lower() function, like so:

create unique index users_unique_lower_email_idx on users (lower(email));

After adding this index, the example using lower() drops to around 0.08ms.

For the full example and explain analyze outputs, see this document.

Add ON DELETE CASCADE To Foreign Key Constraint

The alter table command lets you do quite a bit. But when it comes to altering existing constraints, there is not much you can do. If you want to add an on delete cascade to an existing foreign key constraint, you are going to need two statements.

The first statement will drop the constraint and the second statement will recreate it with the addition of the on delete clause. Furthermore, you’ll want to do this in a transaction to ensure the integrity of your data during the transition between constraints.

Here is an example:


alter table orders
drop constraint orders_customer_id_fkey;

alter table orders
add constraint orders_customer_id_fkey
foreign key (customer_id)
references customers (id)
on delete cascade;



Print The Query Buffer In psql

I’ll often be composing a PostgreSQL query in Vim and decide I want to give it a try in psql. I copy the relevant snippet of SQL to my system buffer and then paste into psql. I’m usually hit with a mess of text like this though:

jbranchaud=# create table nullable_fields (
jbranchaud(#   id serial primary key,
  first varchar,
  last varchar
  id serial primary key,
jbranchaud(#   first varchar,
  last varchar
  first varchar,
jbranchaud(#   last varchar
  last varchar
jbranchaud(# )

Yikes. That’s not readable. Fortunately, psql provides a command for printing the current contents of the query buffer. By typing \p I’ll see a more readable version of what I just pasted in.

jbranchaud-# \p
create table nullable_fields (
  id serial primary key,
  first varchar,
  last varchar
)

After taking another glance at the snippet of SQL, I decide to complete the query to create my new table.

jbranchaud-# ;

Using Expressions In Indexes With PostgreSQL

Though we usually see column names by themselves when defining an index, it is also possible to create an index with an expression.

Let’s say I have a users table with an email column. Then I may end up creating an index like this

create index email_idx on users (email);

If I always perform queries on the email column with the lower() function, like this

select * from users where lower(email) = lower('');

then I will want to also create an index with that full expression — lower(email)

I can do this with a statement like the following

create index lower_email_idx on users (lower(email));

Without an index that uses the full lower(email) expression, select statements like the one above will be forced to do full sequential scans instead of indexed scans.

List Various Kinds Of Objects

Our PostgreSQL database can end up with all kinds of objects: tables, sequences, views, etc. We can use a variety of psql meta-commands to list the different types of (user-created) objects in our database.

  • \dt will list all the tables
  • \dE will list all the foreign tables
  • \di will list all the indexes
  • \ds will list all the sequences
  • \dv will list all the views
  • \dm will list all the materialized views

These can also be combined. For instance, to see all the tables and sequences, we can run \dts.