Today I Learned

A Hashrocket project

465 posts by chriserin @mcnormalmode

`random()` in subquery is only executed once

I discovered this morning that random() when used in a subquery doesn’t really do what you think it does.

Random generally looks like this:

> select random() from generate_series(1, 3)
      random
-------------------
 0.856217631604522
 0.427044434007257
 0.237484132871032
(3 rows)

But when you use random() in a subquery the function is only evaluated one time.

> select (select random()), random() from generate_series(1, 3);
      random       |      random
-------------------+-------------------
 0.611774671822786 | 0.212534857913852
 0.611774671822786 | 0.834582580719143
 0.611774671822786 | 0.415058249142021
(3 rows)

So a statement like this:

insert into things (widget_id) 
select 
  (select id from widgets order by random() limit 1)
from generate_series(1, 1000);

Results in 1000 rows in things, all with the same widget_id.

Testing Shell Conditions

When you’re shell scripting you really want to get your head wrapped around conditions. GNU provides a command to test conditions.

test 1 -gt 0
# exits with exit code 0
echo $?
# prints 0
test 0 -gt 1
# exits with exit code 1
echo $?
# prints 1

Checking the $? special parameter is a bit awkward; you can chain the command with echo though.

test 1 -gt 0 && echo true
# outputs true

Just be aware that it doesn’t output false when false.
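
If you do want output in the false case, one option is to add an || branch after the &&. This is a minimal sketch; the function name is_greater is made up for illustration.

```shell
# Hypothetical helper: prints "true" or "false" depending on the test result.
is_greater() {
  # test exits 0 when $1 is greater than $2, so the && branch runs;
  # on a non-zero exit the || branch runs instead.
  test "$1" -gt "$2" && echo true || echo false
}

is_greater 1 0
# prints true
is_greater 0 1
# prints false
```

Keep in mind that in a && b || c the last command also runs when b itself fails, so this shortcut is only safe when the middle command is something as harmless as echo.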

But if you’re chaining with && you might as well use the [[ compound command.

[[ 1 -gt 0 ]] && echo true
# outputs true

Now you’re using shell syntax directly.

Linux ZSH ls colors

ls does not colorize the output in linux by default.

ls --color does colorize the output. It’s smart to set an alias.

alias ls='ls --color=auto'

Ok, now you’ve got colors every time, but how do you change those colors?

The color settings have defaults, but can be overridden by the value of the environment variable LS_COLORS.

The language for setting these colors is really obtuse, but you can generate the settings with the command dircolors. dircolors outputs an environment variable you can include in your zshrc file. This variable will give you the same colors as when LS_COLORS is not set.

You can figure out what values to set colors to with this resource.

Pass args to a custom vim command

Custom commands are easy in vim:

:command HelloWorld echo "hello world"
:HelloWorld
" outputs hello world

But what if I want to pass an arg to the command?

First you have to specify that you want args with the -nargs flag. Then you need to declare where the args go with <q-args>.

:command! -nargs=1 Say :echo "hello" <q-args>
:Say world
" outputs hello world

Creating a Bind Mount with `docker volume`

Creating a bind mount (a volume that has an explicitly declared directory underpinning it) is easy when using docker run:

docker run -v /var/app/data:/data:rw my-container

Now, anytime you write to the container’s data directory, you will be writing to /var/app/data as well.

You can do the same thing with the --mount flag.

docker run --mount type=bind,source=/var/app/data,target=/data my-container

Sometimes though you might want to create a bind mount that is independent of a container. This is less than clear but Cody Craven figured it out.

docker volume create \
--driver local \
-o o=bind \
-o type=none \
-o device=/var/app/data \
example-volume

The key value pairs passed with -o are not well documented. The man page for docker-volume-create says:

The built-in local driver on Linux accepts options similar to the linux mount command

The man page for mount will have options similar to the above, but structured differently.

Set Git Tracking Branch on `push`

You hate this error, right?

$ git push
There is no tracking information for the current branch.

I especially hate git’s recommendation at this stage:

$ git branch --set-upstream-to=origin/<branch> my-branch

You can check for tracking information in your config file with:

$ git config -l | grep my-branch
# returns exit code 1 (nothing)

Yep, no tracking info. The first time you push you should use the -u flag.

# assuming you are on my-branch
$ git push -u origin HEAD

Now do you have tracking info?

# returns the tracking information stored in config!
$ git config -l | grep my-branch
branch.my-branch.remote=origin
branch.my-branch.merge=refs/heads/my-branch
branch.my-branch.rebase=true

Did you forget to set up tracking on the first push? Don’t worry, this actually works anytime you push.

$ git push
There is no tracking information for the current branch.

$ git push -u origin HEAD
Branch 'my-branch' set up to track remote branch 'my-branch' from 'origin' by rebasing.

This is so much more ergonomic than git’s recommendation.

Get Back To Those Merge Conflicts

You’ve probably experienced this:

Decision A
<<<<<<< HEAD
Decision H
Decision I
=======
Decision F
Decision G
>>>>>>> branch a
Decision E

And you wind up making some iffy decisions:

Decision A
Decision I
Decision G
Decision E

The tests don’t pass, you’re not confident in the choices you’ve made, but this is the third commit in a rebase and you don’t want to start over.

It’s easy to get back to a place where all your merge conflicts exist with:

git checkout --merge file_name
# or
git checkout -m file_name

Now you can re-evaluate your choices and make better decisions:

Decision A
<<<<<<< HEAD
Decision H
Decision I
=======
Decision F
Decision G
>>>>>>> branch a
Decision E

H/T Brian Dunn

Sharing Volumes Between Docker Containers

In docker, it’s easy to share data between containers with the --volumes-from flag.

First let’s create a Dockerfile that declares a volume.

from alpine:latest

volume ["/foo"]

Then let’s:

  1. Build it into an image foo-image
  2. Create & Run it as a container with the name foo-container
  3. Put some text into a file in the volume

docker build . -t foo-image
docker run -it --name foo-container foo-image sh -c 'echo abc > /foo/test.txt'

When you run docker volume ls you can see a volume is listed. By running a container from an image with a volume we’ve created a volume.

When you run docker container ls -a you can see that we’ve also created a container. It’s stopped currently, but the volume is still available.

Now let’s run another container passing the name of our previously created container to the --volumes-from flag.

docker run -it --volumes-from foo-container alpine cat /foo/test.txt

# outputs: abc

We’ve accessed the volume of the container and output the results.

Blocking ip6 addresses with /etc/hosts

Like many developers, I need to eliminate distractions to be able to focus. To do that, I block non-development sites using /etc/hosts entries, like this:

127.0.0.1 twitter.com

Today I learned that this doesn’t block sites that use ip6. I have cnn.com in my /etc/hosts file but it is not blocked in the browser.

To prove this is an ip6 issue I can use ping and ping6

> ping cnn.com
PING cnn.com (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.024 ms

> ping6 cnn.com
PING6(56=40+8+8 bytes) 2601:240:c503:87e3:fdee:8b0b:dadf:278e --> 2a04:4e42:200::323
16 bytes from 2a04:4e42:200::323, icmp_seq=0 hlim=57 time=9.288 ms

For ip4 requests cnn.com resolves to localhost and the site never loads, which is what I want. For ip6 requests cnn.com resolves to an address that is definitely not my machine.

Let’s add another entry to /etc/hosts:

::1 cnn.com

::1 is the simplification of the ip6 loopback address 0:0:0:0:0:0:0:1.

Now, does pinging cnn.com with ip6 hit my machine?

> ping6 cnn.com
PING6(56=40+8+8 bytes) ::1 --> ::1
16 bytes from ::1, icmp_seq=0 hlim=64 time=0.044 ms

Distractions eliminated.
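
Since every blocked site now needs two entries, a small helper can print both lines. This is only a sketch: block_entries is a hypothetical function name, and it just prints the lines rather than editing /etc/hosts.

```shell
# Hypothetical helper: print the ip4 and ip6 loopback entries for one hostname.
block_entries() {
  printf '127.0.0.1 %s\n' "$1"
  printf '::1 %s\n' "$1"
}

block_entries cnn.com
# prints:
# 127.0.0.1 cnn.com
# ::1 cnn.com
```

To actually apply the entries you could append the output to the hosts file, for example with block_entries cnn.com | sudo tee -a /etc/hosts.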

View the `motd` after login in Ubuntu

When you ssh into an Ubuntu machine, you may see a welcome message that starts with something like this:

Welcome to Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-65-generic x86_64)

This is the motd (message of the day).

What if you clear your terminal after login but want to see that message again?

There are two ways to do this.

$ cat /run/motd.dynamic

This will show you the same message that was created for you when you logged in.

If there is dynamic information in that message and you want to see the latest version run:

$ sudo run-parts /etc/update-motd.d/

This will run all the scripts that make the motd message.

Highlight json with `bat`

Sometimes I run a utility that outputs a whole bunch of json, like:

docker inspect hello-world 
# outputs a couple pages of json

I want to send it through bat because bat is a great output viewer, and I also want it to syntax highlight. If bat is viewing a file with the extension of json then it will get syntax highlighting, but in this case there is no file and no extension.

You can turn on json syntax highlighting with the --language flag.

docker inspect hello-world | bat --language json
# or just use -l
docker inspect hello-world | bat -l json

Combine this with --theme and you’re looking good!

docker inspect hello-world | bat -l json --theme TwoDark

The interaction of CMD and ENTRYPOINT

The CMD and ENTRYPOINT instructions in a Dockerfile interact with each other in an interesting way.

Consider this simple dockerfile:

from alpine:latest

cmd echo A

When I run docker run -it $(docker build -q .) the output I get is A, like you’d expect.

With an additional entrypoint instruction:

from alpine:latest

entrypoint echo B
cmd echo A

I get just B, no A.

Both of these instructions use the shell form. What if I use the exec form?

from alpine:latest

entrypoint ["echo", "B"]
cmd ["echo", "A"]

Then! Surprisingly, I get B echo A.

When using the exec form, cmd provides default arguments to entrypoint.

You can override those default arguments by providing an argument to docker run:

docker run -it $(docker build -q .) C
B C

`ets` table gets deleted when owning process dies

You can create a new ets table with:

:ets.new(:chris_table, [:named_table])

And you can confirm it was created with:

:ets.info(:chris_table)
[
id: #Reference<0.197283434.4219076611.147360>,
...
]

Now check this:

spawn(fn -> :ets.new(:spawn_table, [:named_table]) end)

Let’s see if it was created:

:ets.info(:spawn_table)
# returns :undefined

What gives? The erlang ets docs say this:

Each table is created by a process. When the process terminates, the table is automatically destroyed.

So, spawn created the process and then terminated, so :spawn_table got deleted when the process died.

Install all versions in .tool-versions with asdf

If you get the code for a new project and it is a project where versions are managed by asdf, then you will have a .tool-versions file and it will look something like this:

elixir 1.7.4-otp-21
erlang 21.3.8

If I don’t have those versions installed, then generally I install those individually.

If your working directory is the directory that contains the .tool-versions file then you can install all versions specified in that file with:

asdf install
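
For comparison, installing each tool individually amounts to running asdf install <name> <version> once per line of the file. The sketch below only prints those commands instead of running them; install_commands is a hypothetical helper, and the temporary file simply mirrors the example .tool-versions above.

```shell
# Hypothetical helper: print one "asdf install" command per line
# of the .tool-versions file given as the first argument.
install_commands() {
  while read -r name version; do
    # Skip blank lines and comment lines.
    case "$name" in ''|'#'*) continue ;; esac
    echo "asdf install $name $version"
  done < "$1"
}

# Recreate the example file in a temporary location.
tool_versions=$(mktemp)
printf 'elixir 1.7.4-otp-21\nerlang 21.3.8\n' > "$tool_versions"

install_commands "$tool_versions"
# prints:
# asdf install elixir 1.7.4-otp-21
# asdf install erlang 21.3.8

rm -f "$tool_versions"
```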

Switch branches in git with... `git switch`

It’s experimental. It’s intuitive. It’s in the newest version of git, version 2.23.0. It is:

git switch my-branch

And, as with every git command, there are many ways to use it with many flags:

You might want to create a branch and switch to it:

git switch -c new-branch

You might want to create and switch to a new local branch that tracks a remote branch:

git switch -t origin/remote-branch

You can throw away local changes when switching by using -f.

git switch -f other-branch

I feel that if I were learning git from scratch today, this would be much easier to learn. There’s just so much going on with checkout.

Delete remote branches with confirmation

Branches on the git server can sometimes get out of control. Here’s a sane way to clean up those remote branches that offers a nice confirmation before deletion, so that you don’t delete something you don’t want to delete.

git branch -a | grep remotes | awk '{gsub(/remotes\/origin\//, ""); print;}' | xargs -I % -p git push origin :%

The -p flag of xargs provides the confirmation.
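
To preview what the pipeline will offer to delete, you can run just its text-processing stages. The sketch below feeds them canned git branch -a output (an assumption, so that it runs without a repository). It differs from the one-liner above in two ways: it skips the origin/HEAD pointer line and prints only the first field. remote_branch_names is a hypothetical name.

```shell
# Hypothetical helper: turn "git branch -a" output (read from stdin)
# into bare remote branch names, one per line.
remote_branch_names() {
  grep remotes |
    grep -v 'HEAD ->' |
    awk '{gsub(/remotes\/origin\//, ""); print $1}'
}

# Canned sample of what "git branch -a" might print.
printf '* master\n  feature-x\n  remotes/origin/HEAD -> origin/master\n  remotes/origin/master\n  remotes/origin/feature-x\n' |
  remote_branch_names
# prints:
# master
# feature-x
```

In a real repository you would pipe git branch -a into the helper instead of the printf sample.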

Precise timings with `monotonic_time`

Monotonic time is time from a clock that only moves forward. The system clock can be set and reset. Even when synchronized over NTP on a LAN, the system clock can be out of sync by a couple of milliseconds. When measuring in microseconds, that’s a lot of time, and time drift can occur at the microsecond level even when attached to NTP, requiring system clock resets.

To get monotonic time in Elixir, use System.monotonic_time:

iex> System.monotonic_time
-576460718338896000
iex> System.monotonic_time
-576460324867892860

It’s ok that this number is negative; it’s always increasing.

The number has a time unit of :native. To get a duration in milliseconds you could convert from :native to :millisecond.

iex> event_time = System.monotonic_time
-576459417748861340
iex> System.convert_time_unit(System.monotonic_time - event_time, :native, :millisecond)
38803

Or you could get a millisecond duration by passing the time unit you want as the argument to monotonic_time.

iex> event_time = System.monotonic_time(:millisecond)
-576459079381
iex> System.monotonic_time(:millisecond) - event_time
14519

Check out the elixir docs on time for more info.

FormData doesn't iterate over disabled inputs

If I have a form that looks like this:

<form>
  <input disabled name="first_name" />
  <input name="last_name" />
</form>

And then I use FormData to iterate over the inputs.

const form = new FormData(formElement);

const fieldNames = [];

form.forEach((value, name) => {
  fieldNames.push(name);
});

// fieldNames contains ['last_name']

Then fieldNames only contains one name!

Be careful if you’re using FormData to extract names from a form!

Erlang records are just tuples

Erlang has a Record type which helps you write clearer code by having data structures that are self descriptive.

One thing that is interesting about a Record is that a record is just a tuple underneath the hood.

Erlang/OTP 22 [erts-10.4.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe]

Eshell V10.4.4  (abort with ^G)
1> rd(fruit, {name, color}).
fruit
2> #fruit{name="Apple", color="Red"}.
#fruit{name = "Apple",color = "Red"}
3> #fruit{name="Apple", color="Red"} == {fruit, "Apple", "Red"}.
true

As you can see, that internal representation is exposed when comparing a record to a tuple:

#fruit{name="Apple", color="Red"} == {fruit, "Apple", "Red"}.

You can even use pattern matching with the two data structures:

7> #fruit{name="Apple", color=NewColor} = {fruit, "Apple", "Green"}.
#fruit{name = "Apple",color = "Green"}
8> NewColor.
"Green"

Declaring Erlang records in a shell

Erlang records are typically declared in a header file. If you want to experiment with records at the command line, you’ll have to use a shell command.

rd is an erl shell command that you can remember as standing for record definition

Let’s try that in the erlang shell tool, erl.

Erlang/OTP 21 [erts-10.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]

Eshell V10.1  (abort with ^G)
> rd(fruit, {name, color}).
fruit

And then you can create a record the same way you would in Erlang code.

> Apple = #fruit{name="Apple", color="green"}.
#fruit{name = "Apple",color = "green"}

And to inspect that variable in erl, just type the variable name and put a . after it to conclude the expression.

> Apple.
#fruit{name = "Apple",color = "green"}

You can find other shell commands that deal with records and other things here.

telemetry handler detaches automatically on error

Telemetry handlers should be designed to run efficiently over and over again without making a noticeable performance impact on your production system. But what happens if an error occurs? Your monitoring should have no bearing on the success of your business logic.

So what happens if your monitoring has an error? With telemetry, it’s ok, the offending handler is just detached, never to run again. Business logic unaffected.

Here’s a test that demonstrates the handler being removed.

handlerFn = fn _, _, _, _ ->
  raise "something"
end

:telemetry.attach(
  :bad_handler,
  [:something_happened],
  handlerFn,
  :no_config
)

# the handler is in ets
assert [[[:something_happened]]] =
         :ets.match(:telemetry_handler_table, {:handler, :bad_handler, :"$1", :_, :_})

:telemetry.execute([:something_happened], %{}, %{})

# the handler is gone!
assert [] = :ets.match(:telemetry_handler_table, {:handler, :bad_handler, :"$1", :_, :_})

You can see the try/catch block in the telemetry source code here.

List all telemetry event handlers

Telemetry is a new library for application metrics and logging in beam applications. It was added to Phoenix in version 1.4.7 released in June 2019.

Telemetry consists of registered handler functions that are executed when specific events occur.

To see a list of what handlers are registered for which events you can call:

:telemetry.list_handlers([])

Which returns a list of maps.

To see a list of handlers that have a specific event prefix, you can pass in a prefix as the only argument.

:telemetry.list_handlers([:phoenix, :endpoint])

Which returns:

[
  %{
    config: :ok,
    event_name: [:phoenix, :endpoint, :start],
    function: #Function<2.82557494/4 in Phoenix.Logger.install/0>,
    id: {Phoenix.Logger, [:phoenix, :endpoint, :start]}
  },
  %{
    config: :ok,
    event_name: [:phoenix, :endpoint, :stop],
    function: #Function<3.82557494/4 in Phoenix.Logger.install/0>,
    id: {Phoenix.Logger, [:phoenix, :endpoint, :stop]}
  }
]

`telemetry_event` Overrides Repo Query Event

Ecto gives you a single telemetry event out of the box, [:my_app, :repo, :query], where the [:my_app, :repo] is the telemetry prefix option for ecto.

This event is called whenever any request to the database is made:

  handler = fn _, measurements, _, _ ->
    send(self(), :test_message)
  end

  :telemetry.attach(
    "query",
    [:test_telemetry, :repo, :query],
    handler,
    %{}
  )

  Repo.all(TestTelemetry.Colour)

  assert_receive(:test_message)

This event is overridden when using the :telemetry_event option, a shared option for all Repo query functions.

  handler = fn _, measurements, _, _ ->
    send(self(), :test_message)
  end

  custom_handler = fn _, measurements, _, _ ->
    send(self(), :custom_message)
  end

  :telemetry.attach(
    "query",
    [:test_telemetry, :repo, :query],
    handler,
    %{}
  )

  :telemetry.attach(
    "custom",
    [:custom],
    custom_handler,
    %{}
  )

  Repo.all(TestTelemetry.Colour, telemetry_event: [:custom])

  assert_receive(:custom_message)
  refute_receive(:test_message)

Which means for any given query you can only broadcast one event. If you have a system that keeps track of expensive queries but you also need to debug a particular query in production, giving that query a custom event takes it out of the system that tracks expensive queries.

Telemetry Attach and Execute

The telemetry api is simpler than the name tends to imply. There are only two primary functions, attach/4 and execute/3. Check out the telemetry docs to see the full api.

Attach is simple. Essentially, you want a function to be called when a certain event occurs:

handler = fn [:a_certain_event], _measurements, _metadata, _config ->
  IO.puts("An event occurred!!")
end

:telemetry.attach(
  :unique_handler_id,
  [:a_certain_event],
  handler,
  :no_config
)

The first argument is handler_id and can be anything but must be unique. The last argument is config and can also be anything. It will be passed along, untouched, to the handler function as the last argument.

Execute is simple. When the program calls execute, the handler that matches the event is called and the measurements and metadata are passed to the handler.

measurements = %{}
metadata = %{}
:telemetry.execute([:a_certain_event], measurements, metadata)

It’s important to note that the event name must be a list of atoms.

A simple test for attach and execute would look like this:

test "attach and execute" do
  handler = fn _, _, _, _ ->
    send(self(), :test_message)
  end

  :telemetry.attach(
    :handler_id,
    [:something_happened],
    handler,
    :no_config
  )

  :telemetry.execute([:something_happened], %{}, %{})

  assert_receive :test_message
end

Where in with multiple values in postgres

Postgres has a record type that you can use with a comma separated list of values inside of parentheses like this:

> SELECT pg_typeof((1, 2));

 pg_typeof
-----------
 record
(1 row)

What is also interesting is that you can compare records:

> select (1, 2) = (1, 2);

 ?column?
----------
 t
(1 row)

And additionally, a select statement results in a record:

> select (1, 2) = (select 1, 2);

 ?column?
----------
 t
(1 row)

What this allows you to do is to create a where statement where the expression can check to see that 2 or more values are contained in the results of a subquery:

> select true where (1, 2) in (
  select x, y
  from
    generate_series(1, 2) x,
    generate_series(1, 2) y
);

 bool
------
 t
(1 row)

This is useful when you declare composite keys for your tables.

Accumulating Attributes In Elixir

Typically, if you declare an attribute twice like this:

@unit_of_measure :fathom
@unit_of_measure :stone

The second declaration will override the first:

IO.inspect(@unit_of_measure)
# :stone

But by registering the attribute by calling register_attribute you get the opportunity to set the attribute to accumulate. When accumulating, each declaration will push the declared value onto the head of a list.

defmodule TriColarian do
  @moduledoc false

  Module.register_attribute(__MODULE__, :colors, accumulate: true)

  @colors :green
  @colors :red
  @colors :yellow

  def colors do
    @colors
  end
end

TriColarian.colors()
# [:yellow, :red, :green]

At compile time, perhaps when executing a macro, you have the opportunity to dynamically build a list.

I learned this when Andrew Summers gave a talk on DSLs at Chicago Elixir this past Wednesday. You can see his slides here

`sleep` is just `receive after`

While using sleep in your production application might not be for the best, it’s useful in situations where you’re simulating process behaviour in a test or when you’re trying to diagnose race conditions.

The Elixir docs say this:

Use this function with extreme care. For almost all situations where you would use sleep/1 in Elixir, there is likely a more correct, faster and precise way of achieving the same with message passing.

There’s nothing special about sleep though. The implementation for both :timer.sleep and Process.sleep is equivalent. In Elixir syntax, it’s:

receive after: (timeout -> :ok)

So, it’s waiting for a message, but doesn’t provide any patterns that would successfully be matched. It relies on the after of receive to initiate the timeout and then it returns :ok.

If you’re building out some production code that requires waiting for a certain amount of time, it might be useful to just use the receive after code so that if later you decide that the waiting can be interrupted you can quickly provide a pattern that would match the interrupt like:

timeout = 1000

receive do
  :interrupt ->
    :ok_move_on_now
after
  timeout -> :ok_we_waited
end

Assert one process gets message from another

Erlang has a very useful-for-testing function :erlang.trace/3, that can serve as a window into all sorts of behaviour.

In this case I want to test that one process sent a message to another. While it’s always best to test outputs rather than implementation, when everything is asynchronous and parallelized you might need some extra techniques to verify your code works right.

pid_a =
  spawn(fn ->
    receive do
      :red -> IO.puts("got red")
      :blue -> IO.puts("got blue")
    end
  end)

:erlang.trace(pid_a, true, [:receive])

spawn(fn ->
  send(pid_a, :blue)
end)

assert_receive({:trace, captured_pid, :receive, captured_message})

assert captured_pid == pid_a
assert :blue == captured_message

In the above example we set up a trace on receive for the first process:

:erlang.trace(pid_a, true, [:receive])

Now, the process that called trace will receive a message whenever the traced process receives a message. That message will look like this:

{
  :trace,
  pid_that_received_the_message,
  :receive, # the action being traced
  message_that_was_received
}

This in combination with assert_receive allows you to test that the test process receives the trace message.

Assert Linked Process Raised Error

Linked processes bubble up errors, but not in a way that you can catch with rescue:

test "catch child process error?" do
  spawn_link(fn -> 
    raise "3RA1N1AC"
  end)
rescue 
  e in RuntimeError ->
    IO.puts e
end

This test fails because the error wasn’t caught. The error bubbles up outside of normal execution so you can’t rely on procedural methods of catching the error.

But, because the error causes an exit on the parent process (the test process) you can trap the exit with Process.flag(:trap_exit, true). This flag changes exit behavior. Instead of exiting, the parent process will now receive an :EXIT message.

test "catch child process error?" do
  Process.flag(:trap_exit, true)

  child_pid = spawn_link(fn -> 
    raise "3RA1N1AC"
  end)

  assert_receive {
    :EXIT,
    ^child_pid,
    {%RuntimeError{message: "3RA1N1AC"}, _stack}
  }
end

The error struct is returned in the message tuple so you can pattern match on it and assert on it.

This method is still subject to race conditions. The child process must raise the error before the assert_receive times out.

There is a different example in the Elixir docs for catch_exit.

Assert Test Process Did or Will Receive A Message

The ExUnit.Assertions module contains assert_receive, about which the docs state:

Asserts that a message matching pattern was or is going to be received within the timeout period, specified in milliseconds.

It could also say “received by the test process”. Let’s see if we can send a message from a different process and assert that the test process receives it:

test_process = self()

spawn(fn ->
  :timer.sleep(99)
  send(test_process, :the_message)
end)

assert_receive(:the_message, 100)
# It Passes!!!

In the above code, :the_message is sent 1 millisecond before the timeout, and the assertion passes.

Now let’s reverse the assertion to refute_receive, and change the sleep to the same time as the timeout.

test_process = self()

spawn(fn ->
  :timer.sleep(100)
  send(test_process, :the_message)
end)

refute_receive(:the_message, 100)
# It Passes!!!

Yep, it passes.

Are All Values True in Postgres

If you have values like this:

chriserin=# select * from (values (true), (false), (true)) x(x);
 x
---
 t
 f
 t

You might want to see if all of them are true. You can do that with bool_and:

chriserin=# select bool_and(x.x) from (values (true), (false), (true)) x(x);
bool_and 
---
 f

And when they are all true:

chriserin=# select bool_and(x.x) from (values (true), (true), (true)) x(x);
bool_and 
---
 t

Hiding and Revealing Struct Info with `inspect`

There are two ways to hide information when printing structs in Elixir.

Hiding by implementing the Inspect protocol.

defmodule Thing do
  defstruct color: "blue", tentacles: 7
end

defimpl Inspect, for: Thing do
  def inspect(thing, _opts) do
    "A #{thing.color} thing!"
  end
end

So now in iex I can’t tell how many tentacles the thing has:

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster, label: "MONSTER")
MONSTER: A green thing!
A green thing!

Note that iex uses inspect to output data which can get confusing.

NEW IN ELIXIR 1.8: You can also hide data with @derive

defmodule Thing do
  @derive {Inspect, only: [:color]}
  defstruct color: "blue", tentacles: 7
end

And now you won’t see tentacles on inspection

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster)
#Thing<color: "green", ...>

In both cases, you can reveal the hidden information with the structs: false option:

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster)
#Thing<color: "green", ...>
> IO.inspect(monster, structs: false)
%{__struct__: Thing, color: "green", tentacles: 17}

Custom Validation in Ecto

Sometimes the standard validation options aren’t enough.

You can create a custom validator in Ecto using the validate_change function.

In this particular case, I want to validate that thing has an odd number of limbs. I can pass the changeset to validate_odd_number.

def changeset(thing, attrs) do
  thing
  |> cast(attrs, [
    :number_of_limbs
  ])
  |> validate_odd_number(
    :number_of_limbs
  )
end

And then define validate_odd_number like this:

def validate_odd_number(changeset, field) when is_atom(field) do
  validate_change(changeset, field, fn f, value ->
    if rem(value, 2) == 0 do
      [{f, "This field must be an odd number"}]
    else 
      []
    end
  end)
end

We pass a function to validate_change that takes the field atom and the value for that field. In that function we test for oddness and return a keyword list that contains the field name and an error message.

Remember that when f is :number_of_limbs:

    [{f, "hi"}] == [number_of_limbs: "hi"]
    # true, these data structures are equal

Ecto's `distinct` adds an order by clause

When using distinct on you may encounter this error:

select distinct on (color) id, color from fruits order by id;
-- ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

This is because distinct on needs to encounter the rows in a specific order so that it can make the determination about which row to take. The expressions must match the initial ORDER BY expressions!

So the above query works when it looks like this:

select distinct on (color) id, color from fruits order by color, id;

OK, now it works.

Ecto’s distinct function helps us avoid this common error by prepending an order by clause ahead of the order by clause you add explicitly.

This elixir statement:

Fruit |> distinct([f], f.color) |> order_by([f], f.id) |> Repo.all

Produces this sql (cleaned up for legibility):

select distinct on (color) id, color from fruits order by color, id;

Ecto added color to the order by!

Without any order by at all, distinct does not prepend the order by.

Read the docs!

Transactions can timeout in Elixir

In Ecto, transactions can time out. Consider this type of code:

  Repo.transaction fn -> 
    # Many thousands of expensive queries
    # And inserts
  end

This type of code might fail after 15 seconds, which is the default timeout.

In Ecto you can specify what the timeout should be for each operation. All functions that make a request to the database have the same shared options, of which :timeout is one.

Repo.all(massive_query, timeout: 20_000)

The above query now times out after 20 seconds.

These shared options apply to a transaction as well. If you don’t care that a transaction is taking a long time you can set the timeout to :infinity.

  Repo.transaction(fn -> 
    # Many thousands of expensive queries
    # And inserts
  end, timeout: :infinity)

Now this operation will be allowed to finish, despite the time it takes.

Timing A Function In Elixir

Erlang provides the :timer module for all things timing. The oddly named function tc will let you know how long a function takes:

{uSecs, :ok} = :timer.tc(IO, :puts, ["Hello World"])

Note that uSecs is microseconds not milliseconds so divide by 1_000_000 to get seconds.

Microseconds are helpful though because sometimes functions are just that quick.

:timer.tc(IO, :puts, ["Hello World"])
# {22, :ok}

You can also call :timer.tc with a function and args:

adding = fn (x, y) ->  x + y end

:timer.tc(adding, [1,3])
# {5, 4}

Or just a function:

:timer.tc(fn ->
  # something really expensive
  :ok
end)
# {1_302_342, :ok}
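
If you time things often, the unit conversion can live in one place. Here is a minimal sketch, assuming a helper module of my own naming (TimingSketch and measure_ms are hypothetical, not part of :timer), that wraps :timer.tc/1 and reports milliseconds:

```elixir
defmodule TimingSketch do
  # Hypothetical helper: runs a zero-arity function and returns
  # {elapsed_milliseconds, result} instead of microseconds.
  def measure_ms(fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    {micros / 1_000, result}
  end
end

{ms, result} = TimingSketch.measure_ms(fn -> Enum.sum(1..100_000) end)
IO.puts("took #{ms} ms, result: #{result}")
```

The elapsed time comes back as a float, so sub-millisecond calls still show up as something other than zero.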

Ack ignores node_modules by default

When searching through your JavaScript project it doesn’t make sense to search through your node_modules. But if you are on a spelunking journey into the depths of your dependencies, you may want to search through all your node_modules!

ack ignores node_modules by default, and ack being ack you can ack through ack to check it out:

> cat `which ack` | ack node_modules
--ignore-directory=is:node_modules

This is different behaviour from ag and rg which also ignore node_modules but not explicitly. They both ignore node_modules by ignoring all entries in the .gitignore file.

rg claims to implement full support for the .gitignore file while also claiming other search tools do not. The open issues list for ag bears that out.

With each of these tools, explicitly stating the directory to search through overrides the ignore.

> ack autoprefix node_modules
> rg autoprefix node_modules
> ag autoprefix node_modules

`user-select:none` needs prefixes for each browser

The user-select CSS property governs whether text is selectable for a given element. user-select: none means that it is not selectable.

What’s interesting about this property is that while each browser supports it, they each require their own prefix, except Chrome, which does not need a prefix.

In create react app, what starts out as:

user-select: none;

Gets expanded to:

-webkit-user-select: none;
-moz-user-select: none;
-ms-user-select: none;
user-select: none;

But the default browserslist configuration for development in your package.json file is:

"development": [
    "last 1 chrome version",
    "last 1 firefox version",
    "last 1 safari version"
]

And so in development, it gets expanded to:

-webkit-user-select: none;
-moz-user-select: none;
user-select: none;

Sorry, Microsoft.

How does create react app know which prefixes to render? The caniuse-lite npm package has up-to-date support data for each property and user-select:none is defined here.

That data is compressed. Here’s how to uncompress it using the node cli to see what it represents:

const compressedData = require('caniuse-lite/data/features/user-select-none');
const caniuse = require('caniuse-lite');
const unpackFunction = caniuse.feature;
unpackFunction(compressedData);

This is accomplished in create react app by the npm package autoprefixer.

Custom React Hook Must Use `use`

You can build your own hooks by composing existing hooks.

Here, I create a custom hook useBoolean by wrapping useState:

const useBoolean = () => useState(true);

Which I can then use in my component:

function Value() {
  const [value, setValue] = useBoolean();

  return <div onClick={() => setValue(!value)}>Click me {String(value)}</div>;
}

The React documentation very politely asks that you start the name of your hook with use. This isn’t strictly necessary, and it will still work if you call it:

const doBoolean = () => useState(true);

But that violates the Rules of Hooks.

You can include an eslint plugin that will prevent you from breaking the rules. This plugin is installed by default in create-react-app version 3.

Aggregate Arrays In Postgres

array_agg is a great aggregate function in Postgres but it gets weird when aggregating other arrays.

First let’s look at what array_agg does on rows with integer columns:

select array_agg(x) from (values (1), (2), (3), (4)) x (x);
-- {1,2,3,4}

It puts each value into an array. What if our values are arrays?

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4])) x (x);
-- {{1,2},{3,4}}

But this doesn’t work when the arrays have different numbers of elements:

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- ERROR:  cannot accumulate arrays of different dimensionality

If you are trying to accumulate elements to process in your code, you can use jsonb_agg.

select jsonb_agg(x.x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- [[1, 2], [3, 4, 5]]

The advantage of using Postgres arrays however is being able to unnest those arrays downstream:

select unnest(array_agg(x))
from (values (Array[1, 2]), (Array[3, 4])) x (x);
--      1
--      2
--      3
--      4

Count true values with postgres

I can try to count everything that is not null in postgres:

select count(x.x)
from (
  values (null), ('hi'), (null), ('there')
) x (x);
-- 2

But if I try to count everything that is true in postgres, I don’t get what I want:

select count(x.x)
from (
  values (false), (true), (false), (true)
) x (x);
-- 4

I can, however, take advantage of the postgres ability to cast boolean to int:

select true::int;
-- 1
select false::int;
-- 0

Using ::int I get:

select count(x.x::int)
from (
  values (false), (true), (false), (true)
) x (x);
-- 4

Postgres is still counting everything that is not null, but what if I use sum instead?

select sum(x.x::int)
from (
  values (false), (true), (false), (true)
) x (x);
-- 2

Because everything is either a 0 or a 1, sum behaves like count.

Now you can do something like this:

select
  sum((status = 'Awesome')::int) as awesomes,  
  sum((status = 'Terrible')::int) as terribles
from statuses;

Print the current stacktrace in Elixir

Stacktrace, backtrace, callstack: in Elixir it’s stacktrace, and it’s available via Process.info/2 using the :current_stacktrace item:

Process.info(self(), :current_stacktrace)

And to print it:

IO.inspect(Process.info(self(), :current_stacktrace), label: "STACKTRACE")

I’m also learning that Process.info/2 takes a pid and an item as arguments. When you call Process.info/1 with just the pid you only get a subset of the info available, not everything.

The items available via Process.info/1 are listed in the erlang documentation here.

The additional items available via Process.info/2 are listed in the erlang documentation here.

You may note that backtrace is also an item that is available via Process.info, but it contains more information than you might need to figure out where you are in the code.
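
IO.inspect prints the raw list of tuples, which can be hard to scan. As a small sketch, the frames can be pulled out of the tagged tuple and passed to Exception.format_stacktrace/1 for a more readable listing (the variable names here are my own):

```elixir
# Process.info/2 returns a tagged tuple; pattern match to get the frames.
{:current_stacktrace, trace} = Process.info(self(), :current_stacktrace)

# Exception.format_stacktrace/1 turns the frames into readable lines.
formatted = Exception.format_stacktrace(trace)
IO.puts(formatted)
```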

Give a commit a new parent

Given I have these logical commits on two branches:

normalbranch: A - B - C - D

funkybranch: Z - Y - X

And I rebase everything after B (C’s current parent) onto X:

git checkout normalbranch
git rebase --onto X B

Then the logical commits of normalbranch will now be:

normalbranch: Z - Y - X - C - D

We’ve given commit C the new parent of X. C’s hash will change and D’s hash will change.

I use this when I build on top of a branch that I’ve already submitted for a PR. That branch will get merged into the new root branch, and then I’ll need the new parent on the root branch for the commits that I’m currently working on.

`mix phx.digest` creates the manifest

mix phx.digest creates production ready assets: hashed, zipped and compressed.

In production, when you call:

Endpoint.static_path("/asset_name.css")

This will look for a file in the priv/static directory and return the path for that file. But what you want is the hashed version of that file, because modern browsers are greedy cachers and you will use the hash to bust that cache.

The static_path function can return a path that represents the cached version, but it needs to read the manifest, which maps the file to its hashed, zipped and compressed versions.

mix phx.digest creates the manifest along with the hashed, zipped and compressed versions of all the files in the priv/static directory.

Update Map Syntax

There is a special syntax for updating a map in Elixir.

thing = %{a: 1, b: 2, c: 3}
updated_thing = %{thing | b: 4}
# %{a: 1, b: 4, c: 3}

But be careful, it raises an error if you try to update a key that doesn’t exist!!

thing = %{a: 1, b: 2, c: 3}
%{thing | x: 4}
# ** (KeyError) key :x not found in: %{a: 1, b: 2, c: 3}
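
When the key might be missing, Map.put/3 is the usual alternative, since it adds the key instead of raising. The sketch below contrasts the two. The safe_update function is a hypothetical helper of mine that rescues the KeyError so the difference is visible as a return value:

```elixir
thing = %{a: 1, b: 2, c: 3}

# Map.put/3 adds the key when it is missing.
with_x = Map.put(thing, :x, 4)
# %{a: 1, b: 2, c: 3, x: 4}

# Hypothetical helper: try the update syntax and report a missing key
# as a tuple instead of letting the KeyError propagate.
safe_update = fn map ->
  try do
    {:ok, %{map | x: 4}}
  rescue
    e in KeyError -> {:error, e.key}
  end
end

missing = safe_update.(thing)
# {:error, :x}
present = safe_update.(with_x)
# {:ok, %{a: 1, b: 2, c: 3, x: 4}}
```

The strictness of the update syntax is useful when a missing key would mean a typo or a bug, because the failure shows up immediately.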

Skip Pending Tests in ExUnit

In ExUnit you have the ability to tag a test with any atom:

@tag :awesome
test "my awesome test" do
end

@tag :terrible
test "my terrible test" do
end

And then you can exclude those tags at the command line so that your terrible test does not run:

mix test --exclude terrible

So you can make your own tag :pending and exclude those. If you don’t want to have to set the flag at the command line every time, you can call ExUnit.configure. This is typically placed in the test/test_helper.exs file before ExUnit.start().

ExUnit.configure(exclude: :pending)
ExUnit.start()

Now your pending tests will not run.

That is how you can use custom tags to skip tests, but you can also just use the built-in skip tag to skip tests.

@tag :skip
test "my terrible test" do
end
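
To see the exclusion in action outside a mix project, here is a rough sketch written as a standalone script. The module name PendingSketchTest is made up, and I pass autorun: false and call ExUnit.run/0 by hand only so that the script can look at the result. In a real project the test_helper.exs setup above is all you need.

```elixir
# Exclude the custom tag up front; autorun is off so we can run manually.
ExUnit.start(exclude: [:pending], autorun: false)

defmodule PendingSketchTest do
  use ExUnit.Case

  # Excluded by the :pending tag, so the flunk is never reached.
  @tag :pending
  test "not ready yet" do
    flunk("this should not run")
  end

  test "ready" do
    assert 1 + 1 == 2
  end
end

# Returns a map of counts, including the number of failures.
result = ExUnit.run()
```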

Timex `between?` is exclusive but can be inclusive

Timex.between? is a great function for determining if a time is in a specific time period, but it’s exclusive, so if you are testing for a time at the boundary the result will be false.

iex> end_time = Timex.now()
iex> start_time = Timex.shift(end_time, days: -7)
iex> time = start_time
iex> Timex.between?(time, start_time, end_time)
false

Never fear, you can pass an inclusive option:

iex> Timex.between?(time, start_time, end_time, inclusive: true)
true

But you don’t have to be inclusive on both sides! Here, I pass :end so that I am only inclusive at the end of my time period.

iex> Timex.between?(time, start_time, end_time, inclusive: :end)
false

And of course I can be only inclusive at the beginning of the period if I prefer:

iex> Timex.between?(time, start_time, end_time, inclusive: :start)
true
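
To make the four cases concrete, here is a sketch of the same boundary logic written with plain DateTime.compare/2. This is my own illustration of the semantics described above, not Timex’s implementation, and BetweenSketch is a hypothetical module name:

```elixir
defmodule BetweenSketch do
  # Illustration only: mimics the inclusive option with DateTime.compare/2.
  def between?(time, start_time, end_time, opts \\ []) do
    # Pick which comparison results count as "inside" at each boundary.
    {start_ok, end_ok} =
      case Keyword.get(opts, :inclusive, false) do
        true -> {[:gt, :eq], [:lt, :eq]}
        :start -> {[:gt, :eq], [:lt]}
        :end -> {[:gt], [:lt, :eq]}
        false -> {[:gt], [:lt]}
      end

    DateTime.compare(time, start_time) in start_ok and
      DateTime.compare(time, end_time) in end_ok
  end
end

end_time = DateTime.utc_now()
start_time = DateTime.add(end_time, -7 * 24 * 60 * 60, :second)
time = start_time

exclusive = BetweenSketch.between?(time, start_time, end_time)
# false
both = BetweenSketch.between?(time, start_time, end_time, inclusive: true)
# true
only_end = BetweenSketch.between?(time, start_time, end_time, inclusive: :end)
# false
only_start = BetweenSketch.between?(time, start_time, end_time, inclusive: :start)
# true
```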

Match across lines with Elixir Regex `dotall` (s)

Elixir Regex is PCRE compliant and Ruby Regex isn’t in at least one specific way. The m (multiline) flag behaves differently in Elixir and Ruby.

Without any modifiers, Regex does not match across new lines:

iex> Regex.scan( ~r/.*/, "a\nb")
[["a"], [""], ["b"], [""]]
iex> Regex.scan( ~r/.*/, "ab")
[["ab"], [""]]

With the m (multiline) modifier, Regex will match the beginning of each line when using the ^ char.

iex> Regex.scan( ~r/.*/m, "a\nb")
[["a"], [""], ["b"], [""]]
iex> Regex.scan( ~r/^.*/, "a\nb")
[["a"]]
iex> Regex.scan( ~r/^.*/m, "a\nb")
[["a"], ["b"]]

The s (dotall) modifier is the right way in Elixir to match across newlines.

iex> Regex.scan( ~r/.*/s, "a\nb")
[["a\nb"], [""]]
iex> Regex.scan( ~r/^.*/s, "a\nb")
[["a\nb"]]
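
The two modifiers can also be combined. A short sketch with Regex.run on a three-line string of my own choosing shows what each one contributes:

```elixir
text = "a\nb\nc"

# s alone: . also matches "\n", so the match can run across lines.
across = Regex.run(~r/a.b/s, text)
# ["a\nb"]

# Without s, . stops at the newline and nothing matches.
none = Regex.run(~r/a.b/, text)
# nil

# m and s together: ^ anchors at the start of the second line (m),
# then .* runs through the newline to the end of the string (s).
from_b = Regex.run(~r/^b.*/ms, text)
# ["b\nc"]
```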

In Ruby, you can match across lines with the m (multiline) modifier and the s is ignored.

irb> "a\nb".scan(/.*/)
=> ["a", "", "b", ""]
irb> "a\nb".scan(/.*/m)
=> ["a\nb", ""]
irb> "a\nb".scan(/.*/s)
=> ["a", "", "b", ""]

Read about the history of dotall here

Get _just one_ value from Ecto query

With the data structure you use in select’s expression argument you can specify what type of data structure the query will return for a row. I’ve used [] and %{} and %Something{} but you can also specify that each row is just one value without using a data structure at all.

Combine that with Repo.one to just return one row and you can just get the specific value you are looking for without any destructuring.

age = 
  User
  |> where(id: 42)
  |> select([u], u.age)
  |> Repo.one()