Today I Learned

A Hashrocket project

145 posts about #sql

Use Not Valid To Immediately Enforce A Constraint

When adding a constraint to a table, you can optionally include not valid. This tells Postgres that it doesn’t need to enforce the constraint on existing records in the table. At least not immediately. This constraint will be enforced for any updates and subsequent insertions. Thus, you can immediately enforce the constraint while giving yourself time to clean up or massage any existing records that conflict with the constraint.

Here is an example of how you would add a constraint this way:

alter table boxes
add constraint check_valid_length
check (length > 0) not valid;

Eventually, you will want to ensure that all data in the table conforms to the constraint. Once you get to that point, you can mark the constraint as valid with a validate constraint command:

alter table boxes
validate constraint check_valid_length;

As long as all records are valid with respect to this constraint, it will be marked as valid.

h/t Chris Erin

Lower Is Faster Than ilike

There are a couple ways to do a case-insensitive comparison of data in PostgreSQL. One way is to use the ilike operator for comparison. Another way is to use the lower() function on both sides of the = operator for comparison. Using lower() is a bit faster than using ilike.

When comparing

select * from users where email ilike 'some-email@example.com';

to

select * from users where lower(email) = lower('some-email@example.com');

we find (via explain analyze) that using lower() was taking around 12ms where as the ilike example was taking around 17ms.

We earn orders of magnitude in performance when adding a functional index that uses the lower() function like so:

create unique index users_unique_lower_email_idx on users (lower(email));

After adding this index, the example using lower() drops to around 0.08ms.

For the full example and explain analyze outputs, see this document.

Add ON DELETE CASCADE To Foreign Key Constraint

The alter table command lets you do quite a bit. But when it comes to altering existing constraints, there is not much you can do. If you want to add an on delete cascade to an existing foreign key constraint, you are going to need two statements.

The first statement will drop the constraint and the second statement will recreate it with the addition of the on delete clause. Furthermore, you’ll want to do this in a transaction to ensure the integrity of your data during the transition between indexes.

Here is an example:

begin;

alter table orders
drop constraint orders_customer_id_fkey;

alter table orders
add constraint orders_customer_id_fkey
foreign key (customer_id)
references customers (id)
on delete cascade;

commit;

source

Print The Query Buffer In psql

I’ll often be composing a PostgreSQL query in Vim and decide I want to give it a try in psql. I copy the relevant snippet of SQL to my system buffer and then paste into psql. I’m usually hit with a mess of text like this though:

jbranchaud=# create table nullable_fields (
jbranchaud(#   id serial primary key,
  first varchar,
  last varchar
)
  id serial primary key,
jbranchaud(#   first varchar,
  last varchar
)
  first varchar,
jbranchaud(#   last varchar
)
  last varchar
jbranchaud(# )
)
jbranchaud-#

Yikes. That’s not readable. Fortunately, psql provides a command for printing the current contents of the query buffer. By typing \p I’ll see a more readable version of what I just pasted in.

jbranchaud-# \p
create table nullable_fields (
  id serial primary key,
  first varchar,
  last varchar
)
jbranchaud-#

After taking another glance at the snippet of SQL, I decide to complete the query to create my new table.

jbranchaud-# ;
CREATE TABLE

Using Expressions In Indexes With PostgreSQL

Though we usually see column names by themselves when defining an index, it is also possible to create an index with an expression.

Let’s say I have a users table with an email column. Then I may end up creating an index like this

create index email_idx on users (email);

If I always perform queries on the email column with the lower() function, like this

select * from users where lower(email) = lower('some@email.com');

then I will want to also create an index with that full expression — lower(email)

I can do this with a statement like the following

create index lower_email_idx on users (lower(email));

Without an index that uses the full lower(email) expression, select statements like the one above will be forced to do full sequential scans instead of indexed scans.

List Various Kinds Of Objects

Our PostgreSQL database can end up with all kinds of objects: tables, sequences, views, etc. We can use a variety of psql meta-commands to list the different types of (user-created) objects in our database.

  • \dt will list all the tables
  • \dE will list all the foreign tables
  • \di will list all the indexes
  • \ds will list all the sequences
  • \dv will list all the views
  • \dm will list all the materialized views

These can also be combined. For instance, to see all the tables and sequences, we can run \dts.

Creating Custom Types in PostgreSQL

PostgreSQL has support for creating custom types. When you need something more expressive than the built-in types and you don’t want your data spread across multiple columns, you can instead create a custom type.

create type dimensions as (
  width integer,
  height integer,
  depth integer
);

This new type can then be used in the definition of a new table

create table moving_boxes (
  id serial primary key,
  dims dimensions not null
);

and when inserting data

insert into moving_boxes (dims) values (row(3,4,5)::dimensions);

See the create type docs for more details.

Show All Versions Of An Operator

We may be familiar with PostgreSQL’s containment operator (@>). Maybe we’ve used it with an array before, so we understand the general idea. But now we are curious about what are the other types with which this containment operator can be used.

We can quickly find out the answer with the \do command in psql:

> \do @>
                               List of operators
   Schema   | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-------------
 pg_catalog | @>   | aclitem[]     | aclitem        | boolean     | contains
 pg_catalog | @>   | anyarray      | anyarray       | boolean     | contains
 pg_catalog | @>   | anyrange      | anyelement     | boolean     | contains
 pg_catalog | @>   | anyrange      | anyrange       | boolean     | contains
 pg_catalog | @>   | jsonb         | jsonb          | boolean     | contains
...

The Left arg type and Right arg type columns tell us what we need to know.

This \do command can be used with any operator for a similar set of information.

h/t Bruce Momjian

Clear The Screen In psql

The psql interactive terminal does not have a built-in way of clearing the screen. What I usually do if I really need the screen cleared is quit, run clear from the shell, and then open a new psql session. This is unnecessary though. Instead, we can use the \! command to execute a shell command, in this case, the clear command.

> \! clear

This clears the screen in one step and keeps our current session running.

See the psql docs for more details.

Storing Emails With citext

Email addresses should be treated as case-insensitive because they are. If a user is trying to sign in with their email address, we shouldn’t care if they type user@example.com or User@example.com. Both of those email addresses should be treated as equal and ultimately lead us to the same User record.

With the citext extension in PostgreSQL, we can create a column that acts as a case-insensitive text type. Any comparisons on a column of that type will internally have the lower function executed on the arguments.

The following example shows this in action:

create extension if not exists citext;

create table citext_emails (
  id serial primary key,
  email citext not null unique
);

insert into citext_emails (email) values ('LizLemon@nbc.com');

select * from citext_emails where email = 'lizlemon@nbc.com';
--  id |      email
-- ----+------------------
--   1 | LizLemon@nbc.com

See citext_emails.sql for a full example.

Renaming A Table

Using the alter table command in PostgreSQL, you can rename an existing table. This command will also update any references to the table such as via foreign key constraints. Just run a command like the following:

alter table ingredient_types rename to item_types;

Note that this may result in breaking a number of conventions. Foreign keys, sequences, and constraints with names eponymous to the original table will no longer follow convention despite the references being updated. These can be renamed as well if desired.

See renaming_table.sql for a full example.

See the alter table docs for more details.

Psql Help

Psql includes a built-in help command that is amazing.

With no arguments, it returns all the help:

app_development=# \h
Available help:
  ABORT    ALTER TABLESPACE    CREATE FOREIGN TABLE
...

With a Postgres command as an argument, it returns a specific help:

app_development=# \h abort;
Command:     ABORT
Description: abort the current transaction
Syntax:
ABORT [ WORK | TRANSACTION ]

And with an asterisk, it prints the full description for all the help commands.

Plus we get this nicety:

Note: To simplify typing, commands that consists of several words do not have to be quoted. Thus it is fine to type \help alter table.

Write better SQL in your REPL today!

http://www.postgresql.org/docs/current/static/app-psql.html

h/t Jack Christensen

Salt And Hash A Password With Postgres' pgcrypto

The pgcrypto extension that ships with PostgreSQL can be used to do a number of interesting things. This includes functions for doing salted password hashing. Using the crypt and gen_salt functions, we can securely store a user password and later compare it to plain-text passwords for authentication purposes.

create extension pgcrypto;

select crypt('pa$$w0rd', gen_salt('bf'));
                            crypt
--------------------------------------------------------------
 $2a$06$Z7wmrkYMOyLboLcULUYzNe6nHUcWywSZTt6nSrT5Xdv/VLdJ4g99K

> select (
    '$2a$06$Z7wmrkYMOyLboLcULUYzNe6nHUcWywSZTt6nSrT5Xdv/VLdJ4g99K' =
    crypt(
      'pa$$w0rd',
      '$2a$06$Z7wmrkYMOyLboLcULUYzNe6nHUcWywSZTt6nSrT5Xdv/VLdJ4g99K'
    )
  ) as matched;
 matched
---------
 t

> select (
    '$2a$06$Z7wmrkYMOyLboLcULUYzNe6nHUcWywSZTt6nSrT5Xdv/VLdJ4g99K' =
    crypt(
      'password',
      '$2a$06$Z7wmrkYMOyLboLcULUYzNe6nHUcWywSZTt6nSrT5Xdv/VLdJ4g99K'
    )
  ) as matched;
 matched
---------
 f

The salt value is generated using the blowfish encryption algorithm (hence, the 'bf'). There is support for other algorithms such as md5.

See the pgcrypt documentation for more details.

Types and type casting in #PostgreSQL

To see the type of column or any entity in PostgreSQL use pg_typeof. Here’s an example:

select pg_typeof(array['thing']);

-- OUTPUT:

-- pg_typeof
-- ---------
-- text[]

To cast to another type use the :: operator:

select pg_typeof(array['thing']::varchar[]);

-- OUTPUT:

--      pg_typeof
-- -------------------
-- character varying[]

h/t Josh Branchaud

Terminating A Connection

Consider the scenario where you are trying to drop a database, but there are existing connections.

$ dropdb sample_db
dropdb: database removal failed: ERROR:  database "sample_db" is being accessed by other users
DETAIL:  There is 1 other session using the database.

If you don’t know where these connections are, you can terminate them within a psql session. You just have to figure out the pid of those connections. In List Connections To A Database, I explained how to get at the pid values of connections. Using the pid value and pg_terminate_backend(), you can terminate a connection.

> select pg_terminate_backend(12345);
 pg_terminate_backend
----------------------
 t

To terminate all connections to a particular database, use a query like the following:

select pg_terminate_backend(pg_stat_activity.pid)
from pg_stat_activity
where pg_stat_activity.datname = 'sample_db'
  and pid <> pg_backend_pid();
 pg_terminate_backend
----------------------
 t

This excludes the current session, so you’ll need to exit psql as well before dropping the database.

source

List Connections To A PostgreSQL Database

The pg_stat_activity table can be used to determine what connections there currently are to the PostgreSQL server and to a particular database. To see the process ids and usernames of all connection to your PostgreSQL server, run the following query:

> select pid, usename from pg_stat_activity;
  pid  |  usename
-------+------------
 57174 | jbranchaud
 83420 | jbranchaud

Include datname in the requested columns to figure out the database of each connection.

> select pid, usename, datname from pg_stat_activity;
  pid  |  usename   |  datname
-------+------------+-----------
 57174 | jbranchaud | hr_hotels
 83420 | jbranchaud | pgbyex

The results can be restricted to a particular database as necessary.

> select pid, usename from pg_stat_activity where datname = 'hr_hotels';
  pid  |  usename
-------+------------
 57174 | jbranchaud

Query Size of Postgres Array

Today I learned how to query the size of a Postgres array. There are (at least) two methods that work.

The first is array_length. This requires you to know the array dimension you seek; a useful filter, although most of the time this will probably be 1:

hr-til_development=# select title from posts
hr-til_development=# where array_length(slack_notified_at_likes_threshold, 1) = 1;
                   title
-------------------------------------------
 Because Javascript
 Percent Notation
 DIY Grids for Designing UI in Illustrator
(3 rows)

When the dimension is 1, a more terse solution is cardinality:

hr-til_development=# select title from posts
hr-til_development=# where cardinality(slack_notified_at_likes_threshold) = 1;
                   title
-------------------------------------------
 Because Javascript
 Percent Notation
 DIY Grids for Designing UI in Illustrator
(3 rows)

Documentation

Default to Empty Array in Postgres

Today I added an array of integers to a Postgres table, and like all such migrations, it will default to null for each new record.

This was a problem, because I wanted to use Rails 4’s built-in Postgres array support to make decisions based on that data. Ruby array methods like include? will raise NoMethodError: undefined method 'include?' for nil:NilClass if that array is ever nil, which it is by default.

This led me to learn how to set the default value to an empty array using Postgres array literal syntax. I was then able to include a not null constraint as an added benefit:

# db/migrate/20160211043316_add_slack_notified.rb

def up
  execute <<-SQL
    alter table posts
      add column slack_notified integer[] not null default '{}';
  SQL
end

Sets With The Values Command

You can concisely create sets of values in PostgreSQL using the values command.

> values (1), (2), (3);
 column1
---------
       1
       2
       3

You can even create multiple columns of values.

> values (1, 'a', true), (2, 'b', false);
 column1 | column2 | column3
---------+---------+---------
       1 | a       | t
       2 | b       | f

This is most often used with an insert command, but can be used on its own, as a subquery, within a CTE, etc.

source

Listing All Rows In A Table

In PostgreSQL, perhaps the more common way to list all rows in a table is with the following select command:

select * from bedding_types;

There is an alternative approach that also selects all rows from a table. It’s essentially a shorthand — the table command.

> table bedding_types;
   name
----------
 No Bed
 1 Full
 1 Double
 2 Double
 1 Twin
 2 Twins
 1 Queen
 2 Queen
 1 King
 2 Kings
 3 Kings
 Murphy
 Sofa Bed

h/t Jack Christensen

Be aware! Postgres rounds.

Yesterday, my pair and I created a test that calculated a value and compared that to the value of a calculation in the code we were testing. This worked out great except for one hitch, we were asserting about the derived value after it had been inserted into the database. What we didn’t count on is that Postgres rounds. Check this out:

create table money (amount numeric(4, 2));
insert into money (amount) values (10.342) returning amount;
 amount
--------
  10.34

insert into money (amount) values (10.347) returning amount;
 amount
--------
  10.35

Postgres rounds!

Postgres permissions to insert, but not return.

Postgres permissions can be harsh. If you don’t have the select permission on a table you might not be able to do some very rails like things that you think you ought to be able to do, take my friend Jimmy.

set role 'admin';
create role 'jimmy';
create table fruits (id serial primary key, name varchar);
grant insert on fruits to jimmy;
grant usage on fruits_id_seq to jimmy;

Given these permissions it’s possible for Jimmy to make inserts into the fruit table like so:

set role 'jimmy';
insert into fruits (name) values ('Apple');
INSERT 0 1

But Rails wants a little more, it wants to know the id of the thing that was just created which is a problem because Jimmy doesn’t have select permissions.

set role 'jimmy';
insert into fruits (name) values ('Orange') returning id;
ERROR:  permission denied for relation fruits

Argh this complicates matters, but I’ll relent and give Jimmy the appropriate permissions so that he can add records through ActiveRecord.

set role 'admin';
grant select on fruits to jimmy;
set role 'jimmy';
insert into fruits (name) values ('orange') returning id;
 id
----
  2

Add value to Postgres enum type, dropping tho...

An enum type in Postgres is a type that can restrict a value to only certain values. It can be defined like this:

create type fruit as Enum ('orange', 'apple');

We can now cast a string to a fruit.

chriserin=# select 'orange'::fruit;
 fruit
--------
 orange

Well… some strings…

chriserin=# select 'rasberry'::fruit;
ERROR:  invalid input value for enum fruit: "rasberry"

It’s all good! We can add rasberry to the enum type.

chriserin=# alter type fruit add value 'rasberry';
chriserin=# select 'rasberry'::fruit;
  fruit
----------
 rasberry

Postgres allows you to add values to an enum type, but you can’t drop values which is a bit of an inconvenience when creating up/down db migrations like you do in Rails or with a tool like tern.

Set A Seed For The Random Number Generator

In PostgreSQL, the internal seed for the random number generator is a run-time configuration parameter. This seed parameter can be set to a particular seed in order to get some determinism from functions that utilize the random number generator. The seed needs to be something between 0 and 1.

We can see this in action by setting the seed and then invoking random() a couple times. Doing this twice, we will see the reproducibility we can achieve with a seed.

> set seed to 0.1234;
SET

> select random();
      random
-------------------
 0.397731185890734

> select random();
      random
------------------
 0.39575699577108
(1 row)

> set seed to 0.1234;
SET

> select random();
      random
-------------------
 0.397731185890734

> select random();
      random
------------------
 0.39575699577108

The seed can also be configured with the setseed() function.

See the PostgreSQL docs for more details.

Shuffle an array in postgres

Somtimes an array is just too sequential

select ARRAY[1,2,3,4,5];
chriserin=# select ARRAY[1,2,3,4,5];
    array
-------------
 {1,2,3,4,5}
(1 row)

I want to shuffle this array into an unpredictable mess of numbers, and one way to do this in PostgreSQL is to turn these array elements into rows, order them randomly and then squash the rows back into an array like so:

chriserin=# select array_agg(foo.x) from (select unnest(ARRAY[1,2,3,4,5]) x order by random()) foo;
  array_agg
-------------
 {2,5,4,3,1}
(1 row)

The key here is the order by random() clause, which is a special postgres case. order by 0.284325 which is a number that random() could return, is not valid sql. There are dragons in order by, read here for more info http://postgresql.nabble.com/select-random-order-by-random-td1891064.html

Pretty Print Data Sizes

Use the pg_size_pretty() function to pretty print the sizes of data in PostgreSQL. Given a bigint, it will determine the most human-readable format with which to print the value:

> select pg_size_pretty(1234::bigint);
 pg_size_pretty
----------------
 1234 bytes

> select pg_size_pretty(123456::bigint);
 pg_size_pretty
----------------
 121 kB

> select pg_size_pretty(1234567899::bigint);
 pg_size_pretty
----------------
 1177 MB

> select pg_size_pretty(12345678999::bigint);
 pg_size_pretty
----------------
 11 GB

This function is particularly useful when used with the pg_database_size() and pg_relation_size() functions.

> select pg_size_pretty(pg_database_size('hr_hotels'));
 pg_size_pretty
----------------
 12 MB

Get The Size Of A Table

In Get The Size Of A Database, I showed a PostgreSQL administrative function, pg_database_size(), that gets the size of a given database. With the pg_relation_size() function, we can get the size of a given table. For instance, if we’d like to see the size of the reservations table, we can executing the following query:

> select pg_relation_size('reservations');
 pg_relation_size
------------------
          1531904

This gives us the size of the reservations table in bytes. As you might expect, the referenced table needs to be part of the connected database and on the search path.

See the Postgres docs for more details.

array indexes in postgres

Lets say I have an array of fruit types (Apple, Pear, Banana, Tangerine) and I want to turn each element in the array into a row I can do this.

select unnest(fruits) 
from (select Array['Apple', 'Pear', 'Tangerine', 'Banana'] fruits) as list;

which returns:

  unnest
-----------
 Apple
 Pear
 Tangerine
 Banana
(4 rows)

But it would be nice to have an array index as column in case the order of the elements in this array had some meaning. Postgres has a function generate_subscripts to help you with array index elements.

select generate_subscripts(fruits, 1), unnest(fruits) from (select Array['Apple', 'Pear', 'Tangerine', 'Banana'] fruits) as list;

which returns:

 generate_subscripts |  unnest
---------------------+-----------
                   1 | Apple
                   2 | Pear
                   3 | Tangerine
                   4 | Banana

Throw a minus 1 in there if you want to be base zero :).

Day Of Week By Name For A Date

In Day Of Week For A Date, I explained how to determine what day of the week a date is as an integer with PostgreSQL. This used the date_part() function. By using the to_char() function with a date or timestamp, we can determine the day of the week by name (e.g. Monday). For instance, to determine what day today is, try a statement like the following:

> select to_char(now(), 'Day');
  to_char
-----------
 Sunday

The Day part of the second argument is just one of many template patterns that can be used for formatting dates and times.

See Data Type Formatting Functions in the Postgres docs for more details.

pg Prefix Is Reserved For System Schemas

Have you ever tried to create a schema with pg_ as the first part of the name of the schema? If so, you probably didn’t get very far. Postgres won’t let you do that. It reserves the pg_ prefix for system schemas. If you try to create a schema in this way, you’ll get an unacceptable schema name error.

> create schema pg_cannot_do_this;
ERROR:  unacceptable schema name "pg_cannot_do_this"
DETAIL:  The prefix "pg_" is reserved for system schemas.

Compute Hashes With pgcrypto

The pgcrypto extension that comes with PostgreSQL adds access to some general hashing functions. Included are md5, sha1, sha224, sha256, sha384 and sha512. Any of these hashing functions can be applied to an arbitrary string using the digest function. Here are example of the md5 and sha1 algorithms:

> create extension pgcrypto;
CREATE EXTENSION

> select digest('Hello, World!', 'md5');
               digest
------------------------------------
 \x65a8e27d8879283831b664bd8b7f0ad4

> select digest('Hello, World!', 'sha1');
                   digest
--------------------------------------------
 \x0a0a9f2a6772942557ab5355d76af442f8f65e01

See the pgcrypto docs for more details.

Custom order #Postgres #SQL union queries

Consider the following data:

insert into users (email, first, last) values ('newuser@example.com', 'New', 'User');
insert into users (email, first, last) values ('notthereallemmy@example.com', 'Lemmy', 'Kilmister');
insert into users (email, first, last) values ('lemmy@example.com', 'Lemmy', 'Kilmister');

Assuming you want to match a user to the following data:

{ email: 'lemmy@example.com', first: 'New', last: 'User' }
````

And you consider email a better indication of a match, and first and last name are a fallback. The following query will not yield the desired results despite the `where` conditions being ordered per the requirements:

select * from users where email = ‘lemmy@example.com’ or (first = ‘New’ and last = ‘User’) limit 1;


This will result in:

1 | ‘newuser@example.com’ | ‘New’ | ‘User’


To find by email first we will need to union two queries, rank the match then order by rank:

(select 1 as rank, id, email, first, last from users where email = ‘lemmy@example.com’) union (select 2 as rank, id, email, first, last from users where first = ‘New’ and last = ‘User’) order by rank limit 1;


This will yield the correct result:

3 | ‘lemmy@example.com’ | ‘Lemmy’ | ‘Kilmister’

Creating Conditional Constraints in PostgreSQL

There are times when it doesn’t make sense for a constraint to apply to all records in a table. For instance, if we have a table of pokemon, we may only want to apply a unique index constraint to the names of non-wild pokemon. This can be achieved in PostgreSQL with the following conditional constraint:

create unique index pokemons_names on pokemons (names)
where wild = false;

If we try to insert a non-wild pokemon with a duplicate name, we will get an error. Likewise, if we try to update a pokemon with a duplicate name from wild to non-wild, we will get an error.

source

Sleeping In PostgreSQL

Generally you want your SQL statements to run against your database as quickly as possible. For those times when you are doing some sort of debugging or just want your queries to look very computationally expensive, PostgreSQL offers the pg_sleep function.

To sleep for 5 seconds, try the following:

> select now(); select pg_sleep(5); select now();
              now
-------------------------------
 2016-01-08 16:30:21.251081-06
(1 row)

Time: 0.274 ms
 pg_sleep
----------

(1 row)

Time: 5001.459 ms
              now
-------------------------------
 2016-01-08 16:30:26.252953-06
(1 row)

Time: 0.260 ms

As you’ll notice, the pg_sleep statement took about 5 seconds.

source

Is It Null Or Not Null?

In PostgreSQL, the standard way to check if something is NULL is like so:

select * as wild_pokemons from pokemons where trainer_id is null;

To check if something is not null, you just add not:

select * as captured_pokemons from pokemons where trainer_id is not null;

PostgreSQL also comes with ISNULL and NOTNULL which are non-standard ways of doing the same as above:

select * as wild_pokemons from pokemons where trainer_id isnull;
select * as captured_pokemons from pokemons where trainer_id notnull;

Restarting Sequences When Truncating Tables

PostgreSQL’s truncate feature is a handy way to clear out all the data from a table. If you use truncate on a table that has a serial primary key, you may notice that subsequent insertions keep counting up from where you left off. This is because the sequence the table is using hasn’t been restarted. Sure, you can restart it manually or you can tell truncate to do it for you. By appending restart identity to the end of a truncate statement, Postgres will make sure to restart any associated sequences at 1.

truncate pokemons, trainers, pokemons_trainers restart identity;

Postgres, varchar, and silent white space trimming

Given this table:

# create table initials (monogram varchar(2) primary key);

Insert a three character monogram.

# insert into initials(monogram) values ('BPD');
ERROR:  value too long for type character varying(2)

Ok, cool. Thanks Postgres. But what if my string ends in a space?

# insert into initials(monogram) values ('BP ');
INSERT 0 1

What? This caused me great pain when trying to do an insert. Here in my example I’ve used a CTE for brevity, but imagine you can’t see the input table, and it contains thousands of records. My insert went something like this:

with input as (
        select 'BD '::text as data
)
insert into initials(monogram)
select data from input where not exists (select 1 from initials where monogram = input.data)
;
ERROR:  duplicate key value violates unique constraint "initials_pkey"
DETAIL:  Key (monogram)=(BD) already exists.

A spot check based on the error message turns up no results:

select count(*) from input where data = 'BD';
count
-------
     0
(1 row)

But of course 'BD ' = 'BD', if you keep in mind that Postgres will silently truncate trailing whitespace when casting to a varchar.

Postgres Arrays are One-Based

Today I was reminded that Postgres arrays are one-based. This realization came while using Ruby’s times method (zero-based index) to generate SQL targeting a Postgres array (one-based index).

Take this table:

 name  |      pay_by_quarter
-------+---------------------------
 Bill  | {10000,10000,10000,10000}
 Carol | {20000,25000,25000,25000}
(2 rows)

To select the first-quarter pay for all employees, use index 1.

my_database=# select pay_by_quarter[1] from sal_emp;
 pay_by_quarter
----------------
          10000
          20000
(2 rows)

Using your first Ruby index value 0 to produce pay_by_quarter[0] returns a sad, empty column.

http://www.postgresql.org/docs/current/static/arrays.html

PostgreSQL's Max Identifier Length Is 63 Bytes

In PostgreSQL, identifiers — table names, column names, constraint names, etc. — are limited to a maximum length of 63 bytes. Identifiers longer than 63 characters can be used, but they will be truncated to the allowed length of 63.

> alter table articles
    add constraint this_constraint_is_going_to_be_longer_than_sixty_three_characters_id_idx
    check (char_length(title) > 0);
NOTICE:  identifier "this_constraint_is_going_to_be_longer_than_sixty_three_characters_id_idx" will be truncated to "this_constraint_is_going_to_be_longer_than_sixty_three_characte"
ALTER TABLE

Postgres warns us of identifiers longer than 63 characters, informing us of what they will be truncated to. It then proceeds to create the identifier.

If postgres is trying to generate an identifier for us - say, for a foreign key constraint - and that identifier is longer than 63 characters, postgres will truncate the identifier somewhere in the middle so as to maintain the convention of terminating with, for example, _fkey.

The 63 byte limit is not arbitrary. It comes from NAMEDATALEN - 1. By default NAMEDATALEN is 64. If need be, this value can be modified in the Postgres source. Yay, open-source database implementations.

See the postgres docs for more details.

Adding Composite Uniqueness Constraints (Postgres)

There are two ways in Postgres to create a composite uniqueness constraint; that is, a constraint that ensures that the combination of two or more values on a table only appear once. For the following two code snippets, assume that we have a table relating Pokemon and Trainers and that our domain restricts each Trainer to only having at most one of each Pokemon.

The first approach is to create a constraint directly on the table:

alter table pokemons_trainers
  add constraint pokemons_trainers_pokemon_id_trainer_id_key
  unique (pokemon_id, trainer_id);

The second approach is to create a unique index:

create unique index pokemons_trainers_pokemon_id_trainer_id_idx
  on pokemons_trainers (pokemon_id, trainer_id);

Export Query Results To A CSV

Digging through the results of queries in Postgres’s psql is great if you are a programmer, but eventually someone without the skills or access may need to check out that data. Exporting the results of a query to CSV is a friendly way to share said results because most people will have a program on their computer that can read a CSV file.

For example, exporting all your pokemon to /tmp/pokemon_dump.csv can be accomplished with:

copy (select * from pokemons) to '/tmp/pokemon_dump.csv' csv;

Because we are grabbing the entire table, we can just specify the table name instead of using a subquery:

copy pokemons to '/tmp/pokemon_dump.csv' csv;

Include the column names as headers to the CSV file with the header keyword:

copy (select * from pokemons) to '/tmp/pokemon_dump.csv' csv header;

source

Know your HISTFILE

Today I learned that psql has a cool feature called the HISTFILE. It’s a record of your commands in the REPL, named ~/.psql_history by default.

Building off Josh’s post yesterday about ~/.psqlrc settings, add this there to get a unique log for each database:

\set HISTFILE ~/.psql_history-:DBNAME

Here’s the tail of my TIL log:

$ tail ~./psql_history-hr-til_development
select * from developers limit 1;
select * from developers limit 2;
select * from developers limit 3;

This is just a text file; it can be edited. You could add a complex command, or meaningful reminder to yourself, and it will be available in psql when you traverse your history (↑ or CTRL-P).