Today I Learned

A Hashrocket project

508 posts by chriserin @mcnormalmode

Get Back To Those Merge Conflicts

You’ve probably experienced this:

Decision A
<<<<<<< HEAD
Decision H
Decision I
=======
Decision F
Decision G
>>>>>>> branch a
Decision E

And you wind up making some iffy decisions:

Decision A
Decision I
Decision G
Decision E

The tests don’t pass, you’re not confident in the choices you’ve made, but this is the third commit in a rebase and you don’t want to start over.

It’s easy to get back to a place where all your merge conflicts exist with:

git checkout --merge file_name
# or
git checkout -m file_name

Now you can re-evaluate your choices and make better decisions

Decision A
<<<<<<< HEAD
Decision H
Decision I
=======
Decision F
Decision G
>>>>>>> branch a
Decision E

H/T Brian Dunn

Sharing Volumes Between Docker Containers

In docker, it’s easy to share data between containers with the --volumes-from flag.

First let’s create a Dockerfile that declares a volume.

from apline:latest

volume ["/foo"]

Then let’s:

  1. Build it into an image foo-image
  2. Create & Run it as a container with the name foo-container
  3. Put some text into a file in the volume
docker build . -t foo-image
docker run -it --name foo-container foo-image sh -c 'echo abc > /foo/test.txt'

When you run docker volume ls you can see a volume is listed. By running a container from an image with a volume we’ve created a volume.

When you run docker container ls -a you can see that we’ve also created a container. It’s stopped currently, but the volume is still available.

Now let’s run another container passing the name of our previously created container to the --volumes-from flag.

docker run -it --volumes-from foo-container alpine cat /foo/test.txt

# outputs: abc

We’ve accessed the volume of the container and output the results.

Blocking ip6 addresses with /etc/hosts

Like many developers, I need to eliminate distractions to be able to focus. To do that, I block non-development sites using /etc/hosts entries, like this:

127.0.0.1 twitter.com

Today I learned that this doesn’t block sites that use ip6. I have cnn.com in my /etc/hosts file but it is not blocked in the browser.

To prove this is an ip6 issue I can use ping and ping6

> ping cnn.com
PING cnn.com (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.024 ms

> ping6 cnn.com
PING6(56=40+8+8 bytes) 2601:240:c503:87e3:fdee:8b0b:dadf:278e --> 2a04:4e42:200::323
16 bytes from 2a04:4e42:200::323, icmp_seq=0 hlim=57 time=9.288 ms

So for ip4 requests cnn.com is pinging localhost and not getting a response, which is what I want. For ip6 addresses cnn.com is hitting an address that is definitely not my machine.

Let’s add another entry to /etc/hosts:

::1 cnn.com

::1 is the simplification of the ip6 loopback address 0:0:0:0:0:0:0:1.

Now, does pinging cnn.com with ip6 hit my machine?

> ping6 cnn.com
PING6(56=40+8+8 bytes) ::1 --> ::1
16 bytes from ::1, icmp_seq=0 hlim=64 time=0.044 ms

Distractions eliminated.

View the `motd` after login in Ubuntu

When you ssh into an Ubuntu machine, you may see a welcome message that starts with something like this:

Welcome to Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-65-generic x86_64)

This is the motd (message of the day).

What if you clear your terminal after login but want to see that message again?

There are two ways to do this.

$ cat /run/motd.dynamic

This will show you the same message that was created for you when you logged in.

If there is dynamic information in that message and you want to see the latest version run:

$ sudo run-parts /etc/update-motd.d/

This will run all the scripts that make the motd message.

Highlight json with the `bat`

Sometimes I run a utility that outputs a whole bunch of json, like:

docker inspect hello-world 
# outputs a couple pages of json

I want to send it through bat because bat is a great output viewer, and I also want it to syntax highlight. If bat is viewing a file with the extenson of json then it will get syntax highlighting, but in this case there is no file and no extension.

You can turn on json syntax highlighting with the --language flag.

docker inspect hello-world | bat --language json
# or just use -l
docker inspect hello-world | bat -l json

Combine this with --theme and you’re looking good!

docker inspect hello-world | bat -l json --theme TwoDark

The interaction of CMD and ENTRYPOINT

The CMD and ENTRYPOINT instructions in a Dockerfile interact with each other in an interesting way.

Consider this simple dockerfile:

from alpine:latest

cmd echo A

When I run docker run -it $(docker build -q .) The out put I get is A, like you’d expect.

With an additional entrypoint instruction:

from alpine:latest

entrypoint echo B
cmd echo A

I get just B no A.

Each of these commands are using the shell form of the instruction. What if I use the exec form?

from alpine:latest

entrypoint ["echo", "B"]
cmd ["echo", "A"]

Then! Surprisingly, I get B echo A.

When using the exec form cmd provides default arguments to entrypoint

You can override those default arguments by providing an argument to docker run:

docker run -it $(docker build -q .) C
B C

`ets` table gets deleted when owning process dies

You can create a new ets table with:

:ets.new(:chris_table, [:named_table])

And you can confirm it was created with:

:ets.info(:chris_table)
[
id: #Reference<0.197283434.4219076611.147360>,
...
]

Now check this:

spawn(fn -> :ets.new(:spawn_table, [:named_table]) end)

Let’s see if it was created:

:ets.info(:spawn_table)
# returns :undefined

What gives? The erlang ets docs say this:

Each table is created by a process. When the process terminates, the table is automatically destroyed.

So, spawn created the process and then terminated, so :spawn_table got deleted when the process died.

Install all versions in .tool-versions with asdf

If you get the code for a new project and it is a project where versions are managed by asdf, then you will have a .tool-versions file and it will look something like this:

elixir 1.7.4-otp-21
erlang 21.3.8

If I don’t have those versions installed, then generally I install those individually.

If your working directory is the same version as the .tool-versions file then you can install all versions specified in that file with:

asdf install

Switch branches in git with... `git switch`

It’s experimental. It’s intuitive. It’s in the newest version of git, version 2.23.0. It is:

git switch my-branch

And, with every git command there are many ways to use the command with many flags:

You might want to create a branch and switch to it:

git switch -c new-branch

You might want to switch to a new version of a local branch that tracks a remote branch:

git switch -t origin/remote-branch

You can throw away unstaged changes with switching by using -f.

git switch -f other-branch

I feel that if I were learning git from scratch today, this would be much easier to learn, there’s just so much going on with checkout.

Delete remote branches with confirmation

Branches on the git server can sometimes get out of control. Here’s a sane way to clean up those remote branches that offers a nice confirmation before deletion, so that you don’t delete something you don’t want to delete.

git branch -a | grep remotes | awk '{gsub(/remotes\/origin\//, ""); print;}' | xargs -I % -p git push origin :%

The -p flag of xargs provides the confirmation.

Precise timings with `monotonic_time`

Monotonic time is time from a clock that only moves forward. The system clock on your CPU can be set and reset. Even when tied to the LAN ntp protocol the system clock can be out-of-sync by a couple of milliseconds. When measuring in microseconds, that’s a lot of time, and time drift can occur at the microsecond level even when attached to NTP, requiring system clock resets.

To get monotonic time in Elixir use, System.monotonic_time:

iex> System.monotonic_time
-576460718338896000
iex> System.monotonic_time
-576460324867892860

It’s ok that this number is negative, it’s always moving positive.

The number has a time unit of :native. To get a duration in millseconds you could convert from :native to millisecond.

iex> event_time = System.monotonic_time
-576459417748861340
iex> System.convert_time_unit(System.monotonic_time - event_time, :native, :millisecond)
38803

Or you could get a millisecond duration by using the one argument of monotonic_time to specify the time unit you want.

iex> event_time = System.monotonic_time(:millisecond)
-576459079381
iex> System.monotonic_time(:millisecond) - event_time
14519

Check out the elixir docs on time for more info.

FormData doesn't iterate over disabled inputs

If I have a form that looks like this:

<form>
  <input disabled name="first_name" />
  <input name="last_name" />
</form>

And then I use FormData to iterate over the inputs.

const form = new FormData(formElement);

const fieldNames = [];

form.forEach((value, name) => {
  fieldNames.push(name);
});

// fieldNames contains ['last_name']

Then fieldNames only contains one name!

Be careful if you’re using FormData to extract names from a form!

Erlang records are just tuples

Erlang has a Record type which helps you write clearer code by having data structures that are self descriptive.

One thing that is interesting about a Record is that a record is just a tuple underneath the hood.

Erlang/OTP 22 [erts-10.4.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe]

Eshell V10.4.4  (abort with ^G)
1> rd(fruit, {name, color}).
fruit
2> #fruit{name="Apple", color="Red"}.
#fruit{name = "Apple",color = "Red"}
3> #fruit{name="Apple", color="Red"} == {fruit, "Apple", "Red"}.
true

As you can see, that internal representation is exposed when comparing a record to a tuple:

#fruit{name="Apple", color="Red"} == {fruit, "Apple", "Red"}.

You can even use pattern matching with the two data structures:

7> #fruit{name="Apple", color=NewColor} = {fruit, "Apple", "Green"}.
#fruit{name = "Aplle",color = "Green"}
8> NewColor.
"Green"

Declaring Erlang records in a shell

Erlang records are typically declared in a header file. If you want to experiment with records at the command line, you’ll have to use a shell command.

rd is an erl shell command that you can remember as standing for record definition

Let’s try that in the erlang shell tool, erl.

Erlang/OTP 21 [erts-10.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]

Eshell V10.1  (abort with ^G)
> rd(fruit, {name, color}).
fruit

And then you can declare a record the same way you would in Erlang.

> Apple = fruit#{name="Apple", color="Grren"}.
#fruit{name = "Apple",color = "green"}

And to inspect that variable in erl, just declare the variable and put a . after it to conclude the expression.

> Apple.
#fruit{name = "Apple",color = "green"}

You can find other shell commands that deal with records and other things here.

telemetry handler detaches automatically on error

Telemetry handlers should be designed to run efficiently over and over again without making a noticeable performance impact on your production system. But what happens if an occurs? Your monitoring should have no bearing on the success of your business logic.

So what happens if your monitoring has an error? With telemetry, it’s ok, the offending handler is just detached, never to run again. Business logic unaffected.

Here’s a test that demonstrates the handler being removed.

handlerFn = fn _, _, _, _ ->
  raise "something"
end

:telemetry.attach(
  :bad_handler,
  [:something_happened],
  handlerFn,
  :no_config
)

# the handler is in ets
assert [[[:something_happened]]] =
         :ets.match(:telemetry_handler_table, {:handler, :bad_handler, :"$1", :_, :_})

:telemetry.execute([:something_happened], %{}, %{})

# the handler is gone!
assert [] = :ets.match(:telemetry_handler_table, {:handler, :bad_handler, :"$1", :_, :_})

You can see the try/catch block in the telemetry source code here.

List all telemetry event handlers

Telemetry is a new library for application metrics and logging in beam applications. It was added to Phoenix in version 1.4.7 released in June 2019.

Telemetry consists of registered handler functions that are executed when specific events occur.

To see a list of what handlers are registered for which events you can call:

:telemetry.list_handlers([])

When returns a list of maps.

To see a list of handlers that have a specific event prefix, you can pass in a prefix as the only argument.

:telemetry.list_handlers([:phoenix, :endpoint])

Which returns:

[
  %{
    config: :ok,
    event_name: [:phoenix, :endpoint, :start],
    function: #Function<2.82557494/4 in Phoenix.Logger.install/0>,
    id: {Phoenix.Logger, [:phoenix, :endpoint, :start]}
  },
  %{
    config: :ok,
    event_name: [:phoenix, :endpoint, :stop],
    function: #Function<3.82557494/4 in Phoenix.Logger.install/0>,
    id: {Phoenix.Logger, [:phoenix, :endpoint, :stop]}
  }
]

`telemetry_event` Overrides Repo Query Event

Ecto gives you a single telemetry event out of the box, [:my_app, :repo, :query], where the [:my_app, :repo] is the telemetry prefix option for ecto.

This event is called whenever any request to the database is made:

  handler = fn _, measurements, _, _ ->
    send(self(), :test_message)
  end

  :telemetry.attach(
    "query",
    [:test_telemetry, :repo, :query],
    handler,
    %{}
  )

  Repo.all(TestTelemetry.Colour)

  assert_receive(:test_message)

This event is overriden when using the the :telemetry_event option, a shared option for all Repo query functions.

  handler = fn _, measurements, _, _ ->
    send(self(), :test_message)
  end

  custom_handler = fn _, measurements, _, _ ->
    send(self(), :custom_message)
  end

  :telemetry.attach(
    "query",
    [:test_telemetry, :repo, :query],
    handler,
    %{}
  )

  :telemetry.attach(
    "custom",
    [:custom],
    custom_handler,
    %{}
  )

  Repo.all(TestTelemetry.Colour, telemetry_event: [:custom])

  assert_receive(:custom_message)
  refute_receive(:test_message)

Which means for any given query you can only broadcast one event. If you have a system that keeps track of expensive queries but you also need to debug a particular query in production, you will take that query out of the system to track expensive queries.

Telemetry Attach and Execute

The telemetry api is simpler than the name tends to imply. There are only two primary functions, attach/4 and execute/4. Check out the telemetry docs to see the full api.

Attach is simple. Essentially, you want a function to be called when a certain event occurs:

handler = fn [:a_certain_event], _measuremnts, _metadata, _config ->
  IO.puts("An event occurred!!")
end

:telemetry.attach(
  :unique_handler_id,
  [:a_certain_event],
  handler,
  :no_config
)

The first argument is handler_id and can be anything be must be unique. The last argument is config and can also be anything. It will be passed along, untouched, to the handler function as the last argument.

Execute is simple. When the program calls execute, the handler that matches the event is called and the measurements and metadata are passed to the handler.

measurements = %{}
metadata = %{}
:telemetry.execute([:a_certain_event], measurements, metadata)

It’s important to note that the event name must be a list of atoms.

A simple test for attach and execute would look like this:

test "attach and execute" do
  handler = fn _, _, _, _ ->
    send(self(), :test_message)
  end

  :telemetry.attach(
    :handler_id,
    [:something_happened],
    handler,
    :no_config
  )

  :telemetry.execute([:something_happened], %{}, %{})

  assert_receive :test_message
end

Where in with multiple values in postgres

Postgres has a record type that you can use with a comma seperated list of values inside of parenthesis like this:

> SELECT pg_typeof((1, 2));

 pg_typeof
-----------
 record
(1 row)

What is also interesting is that you can compare records:

> select (1, 2) = (1, 2);

 ?column?
----------
 t
(1 row)

And additionally, a select statement results in a record:

> select (1, 2) = (select 1, 2);

 ?column?
----------
 t
(1 row)

What this allows you to do is to create a where statement where the expression can check to see that 2 or more values are contained in the results of a subquery:

> select true where (1, 2) in (
  select x, y
  from
    generate_series(1, 2) x,
    generate_series(1, 2) y
);

 bool
------
 t
(1 row)

This is useful when you declare composite keys for your tables.

Accumulating Attributes In Elixir

Typically, if you declare an attribute twice like this:

@unit_of_measure :fathom
@unit_of_measure :stone

The second declaration will override the first:

IO.inspect(@unit_of_measure)
# :stone

But by registering the attribute by calling register_attribute you get the opportunity to set the attribute to accumulate. When accumulating, each declaration will push the declared value onto the head of a list.

defmodule TriColarian do
  @moduledoc false

  Module.register_attribute(__MODULE__, :colors, accumulate: true)

  @colors :green
  @colors :red
  @colors :yellow

  def colors do
    @colors
  end
end

TriColarian.colors()
# [:yellow, :red, :green]

At compile time, perhaps when executing a macro, you have the opportunity to dynamically build a list.

I learned this when Andrew Summers gave a talk on DSLs at Chicago Elixir this past Wednesday. You can see his slides here

`sleep` is just `receive after`

While using sleep in your production application might not be for the best, it’s useful in situations where you’re simulating process behaviour in a test or when you’re trying to diagnose race conditions.

The Elixir docs say this:

Use this function with extreme care. For almost all situations where you would use sleep/1 in Elixir, there is likely a more correct, faster and precise way of achieving the same with message passing.

There’s nothing special about sleep though. The implementation for both :timer.sleep and Process.sleep is equivalent. In Elixir syntax, it’s:

receive after: (timeout -> :ok)

So, it’s waiting for a message, but doesn’t provide any patterns that would successfully be matched. It relies on the after of receive to initiate the timeout and then it returns :ok.

If you’re building out some production code that requires waiting for a certain amount of time, it might be useful to just use the receive after code so that if later you decide that the waiting can be interrupted you can quickly provide a pattern that would match the interrupt like:

timeout = 1000

receive do
  :interrupt -> 
    :ok_move_on_now
  
  after
    timeout -> :ok_we_waited
end

Assert one process gets message from another

Erlang has a very useful-for-testing function :erlang.trace/3, that can serve as a window into all sorts of behaviour.

In this case I want to test that one process sent a message to another. While it’s always best to test outputs rather than implementation, when everything is asynchronous and paralleized you might need some extra techniques to verify your code works right.

pid_a =
  spawn(fn ->
    receive do
      :red -> IO.puts("got red")
      :blue -> IO.puts("got blue")
    end
  end)

:erlang.trace(pid_a, true, [:receive])

spawn(fn ->
  send(pid_a, :blue)
end)

assert_receive({:trace, captured_pid, :receive, captured_message})

assert captured_pid == pid_a
assert :blue == captured_message

In the above example we setup a trace on receive for the first process:

:erlang.trace(pid_a, true, [:receive])

Now, the process that called trace will receive a message whenever traced process receives a message. That message will look like this:

{
  :trace,
  pid_that_received_the_message,
  :receive, # the action being traced
  message_that_was_received
}

This in combination with assert_receive allows you to test that the test process receives the trace message.

Assert Linked Process Raised Error

Linked processes bubble up errors, but not in a way that you can catch with rescue:

test "catch child process error?" do
  spawn_link(fn -> 
    raise "3RA1N1AC"
  end)
rescue 
  e in RuntimeError ->
    IO.puts e
end

This test fails because the error wasn’t caught. The error bubbles up outside of normal execution so you can’t rely on procedural methods of catching the error.

But, because the error causes on exit on the parent process (the test process) you can trap the exit with Process.flag(:trap_exit, true). This flag changes exit behavior. Instead of exiting, the parent process will now receive an :EXIT message.

test "catch child process error?" do
  Process.flag(:trap_exit, true)

  child_pid = spawn_link(fn -> 
    raise "3RA1N1AC"
  end)

  assert_receive {
    :EXIT,
    ^child_pid,
    {%RuntimeError{message: "3RA1N1AC"}, _stack}
  }
end

The error struct is returned in the message tuple so you can pattern match on it and assert about.

This method is still subject to race conditions. The child process must throw the error before the assert_receive times out.

There is a different example in the Elixir docs for catch_exit.

Assert Test Process Did or Will Receive A Message

The ExUnit.Assertions module contains a function assert_receive which the docs state:

Asserts that a message matching pattern was or is going to be received within the timeout period, specified in milliseconds.

It should possibly in addition say “received by the test process”. Let’s see if we can send a message from a different process and assert that the test process receives it:

test_process = self()

spawn(fn ->
  :timer.sleep(99)
  send(test_process, :the_message)
end)

assert_receive(:the_message, 100)
# It Passes!!!

In the above code, :the_message is sent 1 millisecond before the timeout, and the assertion passes.

Now let’s reverse the assertion to refute_receive, and change the sleep to the same time as the timeout.

test_process = self()

spawn(fn ->
  :timer.sleep(100)
  send(test_process, :the_message)
end)

refute_receive(:the_message, 100)
# It Passes!!!

Yep, it passes.

Are All Values True in Postgres

If you have values like this:

chriserin=# select * from (values (true), (false), (true)) x(x);
 x
---
 t
 f
 t

You might want to see if all of them are true. You can do that with bool_and:

chriserin=# select bool_and(x.x) from (values (true), (false), (true)) x(x);
bool_and 
---
 f

And when they are all true:

chriserin=# select bool_and(x.x) from (values (true), (true), (true)) x(x);
bool_and 
---
 t

Hiding and Revealing Struct Info with `inspect`

There are two ways to hide information when printing structs in Elixir.

Hiding by implementing the inspect protol.

defmodule Thing do
  defstruct color: "blue", tentacles: 7
end

defimpl Inspect, for: Thing do
  def inspect(thing, opts) do
    "A #{thing.color} thing!"
  end
end

So now in iex I can’t tell how many tentacles the thing has:

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster, label: "MONSTER")
MONSTER: A green thing!
A green thing!

Note that iex uses inspect to output data which can get confusing.

NEW IN ELIXIR 1.8: You can also hide data with @derive

defmodule Thing do
  @derive {Inspect, only: [:color]}
  defstruct color: "blue", tentacles: 7
end

And now you won’t see tentacles on inspection

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster)
#Thing<color: "green", ...>

In both cases, you can reveal the hidden information with the structs: false option:

> monster = %Thing{color: "green", tentacles: 17}
> IO.inspect(monster)
#Thing<color: "green", ...>
> IO.inspect(monster, structs: false)
%{__struct__: Thing, color: "green", tentacles: 17}

Custom Validation in Ecto

Sometimes the standard validation options aren’t enough.

You can create a custom validator in Ecto using the validate_change function.

In this particular case, I want to validate that thing has an odd number of limbs. I can pass the changeset to validate_odd_number.

def changeset(thing, attrs) do
  thing
  |> cast(attrs, [
    :number_of_limbs
  ])
  |> validate_odd_number(
    :number_of_limbs
  )
end

And then define validate_odd_number like this:

def validate_odd_number(changeset, field) when is_atom(field) do
  validate_change(changeset, field, fn (current_field, value) ->
    if rem(value, 2) == 0 do
      [{f, "This field must be an odd number"}]
    else 
      []
    end
  end)
end

We pass a function to validate_change that takes the field atom and the value for that field. In that function we test for oddness and return a keyword list that contains the field name and an error message.

Remember that when f is :number_of_limbs:

    [{f, "hi"}] == [number_of_limbs: "hi"]
    # true, these data structures are equal

Ecto's `distinct` adds an order by clause

When using distinct on you may encounter this error:

select distinct on (color) id, color from fruits order by id;
-- ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

This is because distinct on needs to enounter the rows in a specific order so that it can make the determination about which row to take. The expressions must match the initial ORDER BY expressions!

So the above query works when it looks like this:

select distinct on (color) id, color from fruits order by color, id;

OK, now it works.

Ecto’s distinct function helps us avoid this common error by prepending an order by clause ahead of the order by clause you add explicitly.

This elixir statement:

Fruit |> distinct([f], f.color) |> order_by([f], f.id) |> Repo.all

Produces this sql (cleaned up for legibility):

select distinct on (color) id, color from fruits order by color, id;

Ecto added color to the order by!

Without any order by at all, distinct does not prepend the order by.

Read the docs!

Transactions can timeout in Elixir

In Ecto, transactions can timeout. So this type of code:

  Repo.transaction fn -> 
    # Many thousands of expensive queries
    # And inserts
  end

This type of code might fail after 15 seconds, which is the default timeout.

In Ecto you can specify what the timeout should be for each operation. All functions that make a request to the database have the same same shared options of which :timeout is one.

Repo.all(massive_query, timeout: 20_000)

The above query now times out after 20 seconds.

These shared options apply to a transaction as well. If you don’t care that a transaction is taking a long time you can set the timeout to :infinity.

  Repo.transaction(fn -> 
    # Many thousands of expensive queries
    # And inserts
  end, timeout: :infinity)

Now this operation will be allowed to finish, despite the time it takes.

Timing A Function In Elixir

Erlang provides the :timer module for all things timing. The oddly named function tc will let you know how long a function takes:

{uSecs, :ok} = :timer.tc(IO, :puts, ["Hello World"])

Note that uSecs is microseconds not milliseconds so divide by 1_000_000 to get seconds.

microseconds are helpful though because sometimes functions are just that quick.

:timer.tc(IO, :puts, ["Hello World"])
# {22, :ok}

You can also call :timer.tc with a function and args:

adding = fn (x, y) ->  x + y end

:timer.tc(adding, [1,3])
# {5, 4}

Or just a function:

:timer.tc(fn -> 
    # something really expensive
    :ok
    end)
# {1_302_342, :ok}

Ack ignores node_modules by default

When searching through your JavaScript project it doesn’t make sense to search through your node_modules. But if your are on a spelunking journey into the depths of your dependencies, you may want to search through all your node_modules!

ack ignores node_modules by default, and ack being ack you can ack through ack to check it out:

> cat `which ack` | ack node_modules
--ignore-directory=is:node_modules

This is different behaviour from ag and rg which also ignore node_modules but not explicitly. They both ignore node_modules by ignoring all entries in the .gitignore file.

rg claims to implement full support for the .gitignore file while also claiming other search tools do not. The open issues list for ag bears that out.

With each of these tools, explicitly stating the directory to search through overrides the ignore.

> ack autoprefix node_modules
> rg autoprefix node_modules
> ag autoprefix node_modules

`user-select:none` needs prefixes for each browser

The user-select css property governs if text is selectable for a given element. user-select: none means that it is not selectable.

What’s interesting about this property is that while each browser supports it, they each require their own prefix, except Chrome, which does not need a prefix.

In create react app, what starts out as:

user-select: none;

Gets expanded to:

-webkit-user-select: none;
-moz-user-select: none;
-ms-user-select: none;
user-select: none;

But the default browserList configuration for development in your package.json file is:

"development": [
    "last 1 chrome version",
    "last 1 firefox version",
    "last 1 safari version"
]

And so in development, it gets expanded to:

-webkit-user-select: none;
-moz-user-select: none;
user-select: none;

Sorry Micrsoft.

How does create react app know which prefixes to render? The caniuse-lite npm package has up-to-date support data for each property and user-select:none is defined here.

That data is compressed. Here’s how to uncompress it using the node cli to see what it represents:

const compressedData = require('caniuse-lite/data/features/user-select-none');
const caniuse = require('caniuse-lite');
const unpackFunction = caniuse.feature;
unpackFunction(compressedData);

This is accomplished in create react app by the npm package autoprefixer;

Custom React Hook Must Use `use`

You can build your own hooks by composing existing hooks.

Here, I create a custom hook useBoolean by wrapping useState:

const useBoolean = () => useState(true);

Which I can then use in my component:

function Value() {
  const [value, setValue] = useBoolean();

  return <div onClick={() => setValue(!value)}>Click me {String(value)}</div>;
}

The react documentation very politely asks that you start the name of your hook with use. This is isn’t strictly necessary, and it will still work if you call it:

const doBoolean = () => useState(true);

But that violates the Rules of Hooks.

You can include an eslint plugin that will prevent you from breaking the rules. This plugin is installed by default in create-react-app version 3.

Aggregate Arrays In Postgres

array_agg is a great aggregate function in Postgres but it gets weird when aggregating other arrays.

First let’s look at what array_agg does on rows with integer columns:

select array_agg(x) from (values (1), (2), (3), (4)) x (x);
-- {1,2,3,4}

It puts each value into an array. What if are values are arrays?

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4])) x (x);
-- {{1,2},{3,4}}

Put this doesn’t work when the arrays have different numbers of elements:

select array_agg(x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- ERROR:  cannot accumulate arrays of different dimensionality

If you are trying to accumulate elements to process in your code, you can use jsonb_agg.

select jsonb_agg(x.x)
from (values (Array[1, 2]), (Array[3, 4, 5])) x (x);
-- [[1, 2], [3, 4, 5]]

The advantage of using Postgres arrays however is being able to unnest those arrays downstream:

select unnest(array_agg(x))
from (values (Array[1, 2]), (Array[3, 4])) x (x);
--      1
--      2
--      3
--      4

Count true values with postgres

I can try to count everything that is not null in postgres:

select count(x.x)
from (
  values (null), ('hi'), (null), ('there')
) x (x);
# 2

But if I try to count everything that is true in postgres, I don’t get what I want:

select count(x.x)
from (
  values (false), (true), (false), (true)
) x (x);
# 4

I can, however, take advantage of the postgres ability to cast boolean to int:

select true::int;
# 1
select false::int;
# 0

Using ::int I get:

select count(x.x::int)
from (
  values (false), (true), (false), (true)
) x (x);
# 4

Postgres is still counting everything that is not null, but what if use sum instead?

select sum(x.x::int)
from (
  values (false), (true), (false), (true)
) x (x);
# 2

Because everything is either a 0 or a 1, sum behaves like count.

Now you can do something like this:

select
  sum((status = 'Awesome')::int) as awesomes,  
  sum((status = 'Terrible')::int) as terribles
from statuses;

Print the current stacktrace in Elixir

Stacktrace, backtrace, callstack, in Elixir its stacktrace and it’s available via Process.info/2 using the :current_stacktrace item:

Process.info(self(), :current_stacktrace)

And to print it:

IO.inspect(Process.info(self(), :current_stacktrace), label: "STACKTRACE")

I’m also learning that Process.info/2 takes a pid and an item as arguments. When you call Process.info/1 with just the pid you only get a subset of the info available, not everything.

The items available via Process.info/1 are listed in the erlang documentation here.

The additional items available via Process.info/2 are listed in the erlang documentation here.

You may note that backtrace is also an item that is available via Process.info but it contains more information than you are might need to figure out where you are in the code.

Give a commit a new parent

Given I have these logical commits on two branches:

normalbranch: A - B - C - D

funkybranch: Z - Y - X
git co normalbranch
git rebase --onto X C

Then the logical commits of normalbranch will now be:

normalbranch: Z - Y - X - C - D

We’ve given commit C the new parent of X. C’s hash will change and D’s hash will change.

I use this when I build on top of a branch that I’ve already submitted for a PR. That branch will get merged into the new root branch, and then I’ll need the new parent on the root branch for the commits that I’m currently working on.

`mix phx.digest` creates the manifest

mix phx.digest creates production ready assets: hashed, zipped and compressed.

In production, when you call:

Endpoint.static_path('asset_name.css')

This will look for a file in the priv/static directory and return the path for that file. But what you want is the hashed version of that file, because modern browsers are greedy cachers and will you use the use to bust that cache.

The static_path function can return a path that represents the cached version, but it needs to read the manifest, which maps the file to it’s hashed, zipped and compressed versions.

mix phx.digest creates the manifest along with the hashed, zipped and compressed versions of all the files in the priv/static directory.

Update Map Syntax

There is a special syntax for updating a map in elixir.

thing = %{a: 1, b: 2, c: 3}
updated_thing = %{thing | b: 4}
# %{a: 1, b: 4, c: 3}

But be careful, it throws an error if you try to update a key that doesn’t exist!!

thing = %{a: 1, b: 2, c: 3}
# ** (KeyError) key :x not found in: %{a: 1, b: 2, c: 3}

Skip Pending Tests in ExUnit

In ExUnit you have the ability to tag a test with any atom:

@tag :awesome
test "my awesome test" do
end

@tag :terrible
test "my terrible test" do
end

And then you can exclude those tags at the command line so that your terrible test does not run:

mix test --exclude terrible

So you can make your own tag :pending and exclude those. If you don’t want to have to set the flag at the command line everytime, you can call ExUnit.configure. This is typically placed in the test/test_helper.exs file before ExUnit.start().

ExUnit.configure(exclude: :pending)
ExUnit.start()

Now your pending tests will not run.

That is how you can use custom tags to skip tests, but you can also just use the built in skip tag to skip tests.

@tag :skip
test "my terrible test" do
end

Timex `between?` is exclusive but can be inclusive

Timex.between? is a great function for determining if a time is in a specific time period, but it’s exclusive, so if you are testing for a time at the boundary the result will be negative.

iex> end_time = Timex.now()
iex> start_time = Timex.shift(end_time, days: -7)
iex> time = start_time
iex> Timex.between?(time, start_time, end_time)
false

Never fear, you can pass an inclusive option:

iex> Timex.between?(time, start_time, end_time, inclusive: true)
true

But you don’t have to be inclusive on both sides! Here, I pass :end so that I am only inclusive at the end of my time period.

iex> Timex.between?(time, start_time, end_time, inclusive: :end)
false

And of course I can be only inclusive at the beginning of the period if I prefer:

iex> Timex.between?(time, start_time, end_time, inclusive: :start)
true

Match across lines with Elixir Regex `dotall` (s)

Elixir Regex is PCRE compliant and Ruby Regex isn’t in at least one specific way. The m (multiline) flag behaves differently in Elixir and Ruby.

Without any modifiers, Regex does not match across new lines:

iex> Regex.scan( ~r/.*/, "a\nb")
[["a"], [""], ["b"], [""]]
iex> Regex.scan( ~r/.*/, "ab")
[["ab"], [""]]

With the m (multiline) modifier, Regex will match the beginning of each line when using the ^ char.

iex> Regex.scan( ~r/.*/m, "a\nb")
[["a"], [""], ["b"], [""]]
iex> Regex.scan( ~r/^.*/, "a\nb")
[["a"]]
iex> Regex.scan( ~r/^.*/m, "a\nb")
[["a"], ["b"]]

The s (dotall) modifier is the right way in Elixir to match across newlines.

iex> Regex.scan( ~r/.*/s, "a\nb")
[["a\nb"], [""]]
iex> Regex.scan( ~r/^.*/s, "a\nb")
[["a\nb"]]

In ruby, you can match across lines with the m (modifier) and the s is ignored.

irb> "a\nb".scan(/.*/)
=> ["a", "", "b", ""]
irb> "a\nb".scan(/.*/m)
=> ["a\nb", ""]
irb> "a\nb".scan(/.*/s)
=> ["a", "", "b", ""]

Read about the history of dotallhere

Get _just one_ value from Ecto query

With the data structure you use in select’s expression argument you can specify what type of data structure the query will return for a row. I’ve used [] and %{} and %Something{} but you can also specify that each row is just one value without using a data structure at all.

Combine that with Repo.one to just return one row and you can just get the specific value you are looking for without any destructuring.

age = 
  User
  |> where(id: 42)
  |> select([u], u.age)
  |> Repo.one()

Don't truncate when redirecting

A common problem in shell languages is truncating a file that you’re trying to transform by redirecting into it.

Let’s say I have a file full of “a”‘s

PROMPT> cat test.txt
aaaaaa

And I want to switch them all to “b”‘s

PROMPT> cat test.txt | tr 'a' 'b'
bbbbbb

But when I try to redirect into the same file I truncate file:

PROMPT> cat test.txt | tr 'a' 'b' > test.txt
PROMPT> cat test.txt
# nothing

This is because the shell truncates “test.txt” for redirection before cat reads the file.

An alternate approach is to use tee.

PROMPT> cat test.txt | tr 'a' 'b' | tee test.txt
bbbbbb
PROMPT> cat test.txt
bbbbbb

tee will write the output to both the file and to standard out, but will do so at the end of the pipe, after the cat command reads from the file.

Get pids for each beam application in Elixir

In Elixir, it’s hard to determine what pids belong to what applications just by using Process.info

iex> pid(0,45,0) |> Process.info
[
  current_function: {:application_master, :main_loop, 2},
  initial_call: {:proc_lib, :init_p, 5},
  status: :waiting,
  message_queue_len: 0,
  links: [#PID<0.46.0>, #PID<0.43.0>],
  dictionary: [
    "$ancestors": [#PID<0.44.0>],
    "$initial_call": {:application_master, :init, 4}
  ],
  trap_exit: true,
  error_handler: :error_handler,
  priority: :normal,
  group_leader: #PID<0.45.0>,
  total_heap_size: 376,
  heap_size: 376,
  stack_size: 7,
  reductions: 49,
  garbage_collection: [
    max_heap_size: %{error_logger: true, kill: true, size: 0},
    min_bin_vheap_size: 46422,
    min_heap_size: 233,
    fullsweep_after: 65535,
    minor_gcs: 0
  ],
  suspending: []
]

With the above output, I can’t tell what application this is!

I can use the :running key of Erlang’s :application.info to get a map of applications to pids.

iex> :application.info[:running]
[
  logger: #PID<0.93.0>,
  iex: #PID<0.85.0>,
  elixir: #PID<0.79.0>,
  compiler: :undefined,
  stdlib: :undefined,
  kernel: #PID<0.45.0>
]

Oh ok, pid(0,45,0) is the Kernel application.

Test that an email was sent with correct params

assert_email_delivered_with is a function that checks if an email was delivered using Bamboo.

Sending an email is a side-effect in functional programming parlance. When testing a function where the email is sent several calls deep, you generally don’t get the email as a return value of that function.

When using Bamboo’s TestAdaptor in test mode, Bamboo captures the delivered email so that you can assert about it.

Lottery.notify_winner()
assert_email_delivered_with(subject: "You Won!!")

You can also assert on: to, cc, text_body, html_body, from, and bcc.

Happy Testing!

Formatting the email address fields with Bamboo

When you create an email with Bamboo and deliver it:

 email = new_email(
      to: "chris@example.com",
      from: "vitamin.bot@example.com",
      subject: "Reminder",
      text_body: "Take your vitamins"
    )
 |> Mailer.deliver_now()

The to field will not be what you expect it to be:

email.to
# {nil, "chris@example.com"}

This is the normalized format for a to field. The first element in the tuple is the name of the recipient.

Bamboo allows you to format this field yourself with protocols.

defimpl Bamboo.Formatter, for: Person do
  def format_email_address(person, _opts) do
    {person.nickname, person.email}
  end
end

And now if I send an email with Person in the to field:

person = %Person{
    nickname: "shorty", 
  email: "chris@example.com"
}

email = new_email(
      to: person,
      from: "vitamin.bot@example.com",
      subject: "Reminder",
      text_body: "Take your vitamins"
    )
|> Mailer.deliver_now()

Then the to field gets formatted with our protocol.

email.to
# {"shorty", "chris@example.com"}

Read more about the Bamboo Formatter here.

Fuzzy translation key merging with Gettext

Gettext is a great tool for i18n in elixir. It provides a mix task for extracting translation keys from your code. The translation keys (or message ids) are natural language and look like this:

gettext("Hi, Welcome to Tilex!")

After running mix gettext.extract && mix gettext.merge, an already translated Italian locale file would look like:

msgid "Hi, Welcome to Tilex!"
msgstr "Italian version of Welcome!"

There’s a chance that the natural language key (which also serves as the default string) will change.

If it changes just a little bit then the Italian locale file will look like:

#, fuzzy
msgid "Hi, Welcome to Tilex!!"
msgstr "Italian version of Welcome!"

It gets marked as #, fuzzy, and the new msgid replaced the old msgid.

Gettext determines how big of a change will constitute a fuzzy match with String.jaro_distance.

iex> String.jaro_distance("something", "nothing")
0.8412698412698413
iex> String.jaro_distance("peanuts", "bandersnatch")
0.576984126984127

The higher the number the closer the match. fuzzy_threshold is the configuration that determines whether a msgid is fuzzy or not and the default for fuzzy_threshold is 0.8, set here.

Two arguments in a command with xargs and bash -c

You can use substitution -I{} to put the argument into the middle of the command.

> echo "a\nb\nc\nd" | xargs -I{} echo {}!
a!
b!
c!
d!

I can use -L2 to provide exactly 2 arguments to the command:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}!
a b!
c d!

But I want to use two arguments, the first in one place, the next in another place:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}x{}!
a bxa b!
c dxa b!

I wanted axb! but got a bxa b!. In order to achieve this you have to pass arguments to a bash command.

> echo "a\nb\nc\nd" | xargs -L2 bash -c 'echo $0x$1!'
axb!
cxd!

Just like calling

bash -c 'echo $0x$1!' a b

Where $0 represents the first argument and $1 represents the second argument.

Weighted Shuffle in Elixir

Shuffling a list is easy, but what if you want to perform a weighted shuffle? That is, what if some elements in a list had a greater chance of being selected, or higher in the list, than others?

Given these colors and these weights:

weighted_colors = [{1, "orange"}, {3, "green"}, {5, "red"}]

I can create a list that is mostly represents each item the number of times of it’s weight by utilizing List.duplicate:

weighted_colors
|> Enum.map(fn ({weight, color}) -> 
  List.duplicate(color, weight)
end)
|> Enum.flatten
# ["orange", "green", "green", "green", "red", "red", "red", "red", "red"]

Then shuffle it:

|> Enum.shuffle
# ["red", "red", "orange", "red", "green", "green", "red", "green", "red"]

And then use Enum.uniq attribute of preserving the order in which it encounters each unique element from left to right.

|> Enum.uniq
# ["red", "orange", "green"]

Run this 100 times over the same weighted list and the most outcome that you would experience most often is:

# ["red", "green", "orange"]