Today I Learned

A Hashrocket project

465 posts by chriserin @mcnormalmode

Don't truncate when redirecting

A common problem in shell languages is truncating a file that you’re trying to transform by redirecting into it.

Let’s say I have a file full of “a”‘s

PROMPT> cat test.txt
aaaaaa

And I want to switch them all to “b”‘s

PROMPT> cat test.txt | tr 'a' 'b'
bbbbbb

But when I try to redirect into the same file I truncate file:

PROMPT> cat test.txt | tr 'a' 'b' > test.txt
PROMPT> cat test.txt
# nothing

This is because the shell truncates “test.txt” for redirection before cat reads the file.

An alternate approach is to use tee.

PROMPT> cat test.txt | tr 'a' 'b' | tee test.txt
bbbbbb
PROMPT> cat test.txt
bbbbbb

tee will write the output to both the file and to standard out, but will do so at the end of the pipe, after the cat command reads from the file.

Get pids for each beam application in Elixir

In Elixir, it’s hard to determine what pids belong to what applications just by using Process.info

iex> pid(0,45,0) |> Process.info
[
  current_function: {:application_master, :main_loop, 2},
  initial_call: {:proc_lib, :init_p, 5},
  status: :waiting,
  message_queue_len: 0,
  links: [#PID<0.46.0>, #PID<0.43.0>],
  dictionary: [
    "$ancestors": [#PID<0.44.0>],
    "$initial_call": {:application_master, :init, 4}
  ],
  trap_exit: true,
  error_handler: :error_handler,
  priority: :normal,
  group_leader: #PID<0.45.0>,
  total_heap_size: 376,
  heap_size: 376,
  stack_size: 7,
  reductions: 49,
  garbage_collection: [
    max_heap_size: %{error_logger: true, kill: true, size: 0},
    min_bin_vheap_size: 46422,
    min_heap_size: 233,
    fullsweep_after: 65535,
    minor_gcs: 0
  ],
  suspending: []
]

With the above output, I can’t tell what application this is!

I can use the :running key of Erlang’s :application.info to get a map of applications to pids.

iex> :application.info[:running]
[
  logger: #PID<0.93.0>,
  iex: #PID<0.85.0>,
  elixir: #PID<0.79.0>,
  compiler: :undefined,
  stdlib: :undefined,
  kernel: #PID<0.45.0>
]

Oh ok, pid(0,45,0) is the Kernel application.

Test that an email was sent with correct params

assert_email_delivered_with is a function that checks if an email was delivered using Bamboo.

Sending an email is a side-effect in functional programming parlance. When testing a function where the email is sent several calls deep, you generally don’t get the email as a return value of that function.

When using Bamboo’s TestAdaptor in test mode, Bamboo captures the delivered email so that you can assert about it.

Lottery.notify_winner()
assert_email_delivered_with(subject: "You Won!!")

You can also assert on: to, cc, text_body, html_body, from, and bcc.

Happy Testing!

Formatting the email address fields with Bamboo

When you create an email with Bamboo and deliver it:

 email = new_email(
      to: "chris@example.com",
      from: "vitamin.bot@example.com",
      subject: "Reminder",
      text_body: "Take your vitamins"
    )
 |> Mailer.deliver_now()

The to field will not be what you expect it to be:

email.to
# {nil, "chris@example.com"}

This is the normalized format for a to field. The first element in the tuple is the name of the recipient.

Bamboo allows you to format this field yourself with protocols.

defimpl Bamboo.Formatter, for: Person do
  def format_email_address(person, _opts) do
    {person.nickname, person.email}
  end
end

And now if I send an email with Person in the to field:

person = %Person{
    nickname: "shorty", 
  email: "chris@example.com"
}

email = new_email(
      to: person,
      from: "vitamin.bot@example.com",
      subject: "Reminder",
      text_body: "Take your vitamins"
    )
|> Mailer.deliver_now()

Then the to field gets formatted with our protocol.

email.to
# {"shorty", "chris@example.com"}

Read more about the Bamboo Formatter here.

Fuzzy translation key merging with Gettext

Gettext is a great tool for i18n in elixir. It provides a mix task for extracting translation keys from your code. The translation keys (or message ids) are natural language and look like this:

gettext("Hi, Welcome to Tilex!")

After running mix gettext.extract && mix gettext.merge, an already translated Italian locale file would look like:

msgid "Hi, Welcome to Tilex!"
msgstr "Italian version of Welcome!"

There’s a chance that the natural language key (which also serves as the default string) will change.

If it changes just a little bit then the Italian locale file will look like:

#, fuzzy
msgid "Hi, Welcome to Tilex!!"
msgstr "Italian version of Welcome!"

It gets marked as #, fuzzy, and the new msgid replaced the old msgid.

Gettext determines how big of a change will constitute a fuzzy match with String.jaro_distance.

iex> String.jaro_distance("something", "nothing")
0.8412698412698413
iex> String.jaro_distance("peanuts", "bandersnatch")
0.576984126984127

The higher the number the closer the match. fuzzy_threshold is the configuration that determines whether a msgid is fuzzy or not and the default for fuzzy_threshold is 0.8, set here.

Two arguments in a command with xargs and bash -c

You can use substitution -I{} to put the argument into the middle of the command.

> echo "a\nb\nc\nd" | xargs -I{} echo {}!
a!
b!
c!
d!

I can use -L2 to provide exactly 2 arguments to the command:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}!
a b!
c d!

But I want to use two arguments, the first in one place, the next in another place:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}x{}!
a bxa b!
c dxa b!

I wanted axb! but got a bxa b!. In order to achieve this you have to pass arguments to a bash command.

> echo "a\nb\nc\nd" | xargs -L2 bash -c 'echo $0x$1!'
axb!
cxd!

Just like calling

bash -c 'echo $0x$1!' a b

Where $0 represents the first argument and $1 represents the second argument.

Weighted Shuffle in Elixir

Shuffling a list is easy, but what if you want to perform a weighted shuffle? That is, what if some elements in a list had a greater chance of being selected, or higher in the list, than others?

Given these colors and these weights:

weighted_colors = [{1, "orange"}, {3, "green"}, {5, "red"}]

I can create a list that is mostly represents each item the number of times of it’s weight by utilizing List.duplicate:

weighted_colors
|> Enum.map(fn ({weight, color}) -> 
  List.duplicate(color, weight)
end)
|> Enum.flatten
# ["orange", "green", "green", "green", "red", "red", "red", "red", "red"]

Then shuffle it:

|> Enum.shuffle
# ["red", "red", "orange", "red", "green", "green", "red", "green", "red"]

And then use Enum.uniq attribute of preserving the order in which it encounters each unique element from left to right.

|> Enum.uniq
# ["red", "orange", "green"]

Run this 100 times over the same weighted list and the most outcome that you would experience most often is:

# ["red", "green", "orange"]

Create a temp table from values

In postgres, if you are looking for a way to create a quick data set to experiment with you can create a temporary table using the values expression.

create temp table test as values ('a', 1), ('b', 2), ('c', 3);

In postgres, a temp table is a table that will go away at the end of the session in which it was created.

The types of the columns are inferred as you can see when examining the table in psql:

> \d test
               Table "pg_temp_3.test"
 Column  |  Type   | Collation | Nullable | Default
---------+---------+-----------+----------+---------
 column1 | text    |           |          |
 column2 | integer |           |          |

What happens if we try to mix types?

> create temp table test as values ('a', 1), (1, 'c');
ERROR:  invalid input syntax for integer: "a"

You can also be specific about types by casting the values of the first row.

> create temp table test as values ('a'::varchar(1), 1::decimal), ('b', 1);
> \d test
                    Table "pg_temp_3.test"
 Column  |       Type        | Collation | Nullable | Default
---------+-------------------+-----------+----------+---------
 column1 | character varying |           |          |
 column2 | numeric           |           |          |

Choose only one row for a given value w Distinct

If you have a table with many rows but you only want one row for a given value, you can use distinct on (<column>)

select distinct on (letter) * 
from (
values 
('y', 512), 
('y',128), 
('z', 512), 
('x',128), 
('z', 256)) 
as x (letter, number);

Which produces

letter | number
--------+--------
 x      |    128
 y      |    512
 z      |    512
(3 rows)

Distinct implicitly orders by the specified column ascending. Putting order by letter at the end of the select statement produces the exact same output (and execution plan).

Distinct chooses the first row for the given column after sorting, so changing the sort order for the second column will change the results.

Here are the results after adding a order by letter, number asc to the above select statement.

 letter | number
--------+--------
 x      |    128
 y      |    128
 z      |    256
(3 rows)

Values clause in a select statement

You’ve all seen the VALUES clause before. It is typically used to insert data.

insert into colors (name, brightness) 
values ('red', 10), ('black', 0), ('blue', 5);

You can also use the VALUES clause in a select statement:

select * from (values ('red', 10), ('black', 0), ('blue', 5)
);

This is great when experimenting with different parts of sql at the command line.

Additionally, values is a first class expression on its own:

> values (1,2), (2,3);

 column1 | column2
---------+---------
       1 |       2
       2 |       3
(2 rows)

Generating a postgres query with no rows

For testing purposes, it is nice to be able to simulate a query that does not result in any rows.

select * from generate_series(0, -1) x;
 x
---
(0 rows)

This is because (from the docs):

When step is positive, zero rows are returned if start is greater than stop.

Step (the third argument to generate series) defaults to 1, so anything where the from has to go backwards to get to the to will result in 0 rows.

select * from generate_series(4, 3) x;
 x
---
(0 rows)

And now you can see how sum() behaves without any rows:

select sum(x) from generate_series(1, 0) x;
sum
-----
   ø
(1 row)

Ohhh, it returns null. I hope you’re planning for that.

Parsing CSV at the command line with `csvcut`

Parsing csv at the command line is easy with the csvcut tool from csvkit.

csvkit is installable with pip

> pip install csvcut 

You can print only the columns you are interested in.

> echo "a,b,c,d" | csvcut -c 1,4
a,d
> echo "a,b,c,d" | csvcut -c 1,5
Column 5 is invalid. The last column is 'd' at index 4.

It also handles quoted csv columns:

echo 'a,"1,2,3,4",c,d' | csvcut -c 2,3

It handles new lines:

> echo 'a,b\nc,d' | csvcut -c 2
b
d

There are a virtual plethora of options check out csvcut --help.

csvkit has a number of csv processing tools, check them out here.

Get pip installed executables into the asdf path

Given I am using asdf when I install a python executable via pip, then I expect to be able to run that executable at the command line.

> asdf global python 3.7.2
> pip install some-executable-package
> some-executable
zsh: command not found: some-executable

This is because a shim has not yet been created for this executable. If you look into your shims dir for the executable:

> ls $(dirname $(which pip)) | grep some-executable

It doesn’t exist.

To place a shim into the shims dir of asdf you must reshim.

> asdf reshim python

And now you have a shim that points to the executable:

> ls $(dirname $(which pip)) | grep some-executable
some-executable
> some-executable
Hello World!

Find duplicate routes in Elixir Phoenix

If you have duplicated routes in your route file like this:

scope "/api", MyAppWeb.Api, as: :api do
  pipe_through [:this]
  
  resources "/users", UserController, except: [:new, :edit]
  scope 
end

scope "/api", MyAppWeb.Api, as: :api do
  pipe_through [:that]

  resources "/users", UserController, except: [:new, :edit]
end

Then you’ll get a warning like this:

warning: this clause cannot match because a previous clause at line 2 always matches
  lib/idea_web/router.ex:2

The warning doesn’t really let you know which routes are duplicated, but it’s really ease to find the duplicated routes by utilizing the uniq command.

mix phx.routes | sort | uniq -d

The -d flag for uniq is duplicates only. -d only considers similar lines in consecutive order as duplicates, so you need to run it through sort first.

The output looks like this:

api_user_path  DELETE  /api/users/:id                   MyAppWeb.Api.UserController :delete
api_user_path  GET     /api/users                       MyAppWeb.Api.UserController :index
api_user_path  GET     /api/users/:id                   MyAppWeb.Api.UserController :show
api_user_path  PATCH   /api/users/:id                   MyAppWeb.Api.UserController :update
api_user_path  POST    /api/users                       MyAppWeb.Api.UserController :create

And those are your duplicate routes!

Named bindings in Ecto (vs positional bindings)

Positional bindings in Ecto are meant to confuse when buildings large queries across several different functions.

query = Thing
|> join(:inner, [t], x in X, on: t.x_id = x.id)
|> join(:inner, [_, x], y in Y, on: x.y_id = y.id)

Using that query in another function might look like this:

query
|> where([_, x, _], x.type == "Articulated")

But what if the positions change? How am I supposed to know which positions are which when I’m adding to this query out of context?

Named bindings help tremendously here (note the :as option):

query = Thing
|> join(:inner, [t], x in X, as: :x, on: t.x_id = x.id)
|> join(:inner, [_, x], y in Y, as: :y, on: x.y_id = y.id)

Now I can refer to these things without knowing the position. If the position changes it’s all good the additive where statement does not have to change:

query
|> where([y: y, x: x], x.type == "Articulated", y.feel == "Good")

How to actually _load_ the resource with Guardian

Guardian, like all auth libraries in all languages, is tough to wrap my head around.

I know there is a plug in the pipeline called plug Guardian.Plug.LoadResource. I know there is a function called Guardian.Plug.current_resource(conn) that takes the conn and returns that returns the resource placed in the conn by the Guardian.Plug.LoadResource plug.

What I don’t know is how the LoadResource plug knows what resource to get.

In Guardian, you configure the pipeline with:

use Guardian.Plug.Pipeline, otp_app: :my_app,
                              module: GuardianImpl,
                              error_handler: ErrorHandler

The GuardianImpl is a module that uses the Guardian behaviour.

The Guardian behaviour has a callback resource_from_claims that might be implemented like this:

def resource_from_claims(claims) do
  {:ok, Repo.get(User, claims["sub"])}
end

So when you need to modify how you load the resource, you should look to see how the resource_from_claims callback is implemented.

Read more here.

Confirming operations with `xargs -p`

xargs is a great tool to take a lot of input and execute a lot of different commands based on that input. Sometimes though, if you are performing destructive or mutative actions with xargs you want to proceed more cautiosly.

> echo "banana apple orange" | tr ' ' '\n' | xargs -n1 echo "I like"

This outputs:

I like banana
I like apple
I like orange

But maybe I don’t like some of those things, please ask! Including the p flag with xargs forces a prompt.

> echo "banana apple orange" | tr ' ' '\n' | xargs -p -n1 echo "I like"
echo I like banana ?...n
echo I like apple ?...y
I like apple
echo I like orange ?...n

Yep, I only like apples.

Updating the ExUnit test context with setup

When using ExUnit the second argument to the test macro is context.

test "1 + 1 = value", context do
  assert 1 + 1 == 2
end

This context can provide setup values so that you can share setup across tests.

Use the setup macro to update the context. The keyword list you return from the setup function will be merged into the map of the context.

setup do
  [result: 2]  # this gets merged into the context
end

setup do
  # create some database records
  :ok # this does not get merged into the context
end

Also in the setup macro you can have access to the context, allowing you to potentially change the context based on the needs of the test.

setup context do
  %{result: r} = context
    [result: r + 1]
end

test "1 + 1 = value", %{result: value} do
  assert 1 + 1 == value
end

Read more about ExUnit setup here.

Get first image of animated gif

Image Magick’s convert tool has a no-option, very simple way to access the first frame of an animated gif.

convert 'animated.gif[0]' animated.first.gif

The square brackets after the file name above can contain any index for any frame of the image. 0 is the index of the first image.

To discover how many frames an animated gif has you can use:

identify animated.gif

Which will return a line for every frame in the animated gif. Those lines will look like this:

animated.gif[32] GIF 736x521 756x594+4+70 8-bit sRGB 256c 421707B 0.000u 0:00.000

Go to next ALE error

Has ALE overtaken your vim setup like it has mine? It constantly runs linters, compilers and formatters, just waiting for you to slip up so that it can put an X in the gutter.

Those X’s are really quite handy. They generally point me to the next place in the code that I need to make a change.

To get there quickly you can goto the next ALE error with:

:ALENext

This will stop at the last error in the file though. To have it wrap around use:

:ALENextWrap

I really enjoy vim-unimpaired’s handy bracket mappings, but I don’t use ]a that move between args (because I don’t use args very often).

To setup my own handy bracket mappings for ALE:

:nmap ]a :ALENextWrap<CR>
:nmap [a :ALEPreviousWrap<CR>
:nmap ]A :ALELast
:nmap [A :ALEFirst

Prevent npm high level security errors in CI

In npm 6.6, a feature was added to provide security audit information for the packages that are used in your application.

This is run with:

node audit

This exits with a non-zero exit code if any ‘low’, ‘medium’, ‘high’, or ‘critical’ errors were detected.

You can use that non-zero return code in your CI to fail a check, which should notify you of the security vulnerability which you can then resolve.

If you care about ‘high’ or ‘critical’ errors but don’t care about ‘low’ or ‘medium’ you can set the audit-level npm config value to ‘high’ in you npm configuration for your CI server.

Set the relative path of assets in a CRA app

When I build my CRA app I get a path for my assets (css, images) that begins with /static. If I deploy my app to https://something.com/myapp, then the app will try to access those asset paths at https://something.com/static/asset.css. That’s not where the asset lives. The asset lives at https://something.com/myapp/static/asset.css.

Create React App allows you to change the prefix for a the built assets with the homepage attribute in your package.json file.

You could set it to myapp:

"homepage": "/myapp"

And then the asset will have the path of /myapp/static/asset.css, but what if you want to change paths?

"homepage": "."

Setting homepage to . will make the asset always relative to index.html, allowing you to not be concerned with the path the application is deployed to.

This actually repurposes a property of the package json file that npm uses to set the homepage of an npm package, so you may find this property used in a different way in other package.json files.

See the npm docs here.

See the Create React App docs here

Override Create React App conf w/react-app-rewired

A common problem when using Create React App is changing the configured behaviour of webpack. Generally, if you want to change the webpack configuration provided by Create React App you need to eject, but eject at the very minimum adds a lot of files to your project that you may not want.

An alternative is to use react-app-rewired in combination with customize-cra.

react-app-rewired provides a file, config-overrides.js placed in your project root directory where you can override existing behaviour.

customize-cra provides a set of handy utility functions to help you override specific configurations.

For example, when using the ant design library, you can import both the component and the css for that component with one import line if you use babel-loader.

Here is an example of a config-overrides.js file that would provide that behaviour.

const { override, fixBabelImports } = require('customize-cra');

module.exports = override(
  fixBabelImports('import', {
    libraryName: 'antd',
    libraryDirectory: 'es',
    style: 'css',
  })
);

Run side effect when a prop changes w/Hooks

There is a React Hook for side effects useEffect. You can pass useEffect a function and that function will run after each render.

useEffect(() => console.log('rendered!'));

In many cases it’s inefficient and unnecessary to call the effect function after every render. useEffect has a second argument of an array of values. If passing in this second argument, the effect function will only run when the values change.

useEffect(() => console.log('value changed!'), [props.isOpen]);

Now, you will see “value changed!” both on the first render and everytime isOpen changes.

Reminder: React Hooks are for functional components not class components. Check out the hooks api here

Get a ref to a dom element with react hooks

React Hooks are now available in React 16.8. There are 10 different hooks, you can read about them here. When I needed a ref to a dom element yesterday I reached for useRef.

const containerRef = useRef(null);

This isn’t a ref to anything unless you pass the ref to a tag.

return (<div ref={containerRef}></div>);

Now the ref will be assigned a dom element that you can use. In this example I’m using the useEffect hook to execute a side effect after the render takes place. Use the current attribute to access the current dom node.

useEffect(() => {
    containerRef.current.style = 'background-color: green;'
})

Destructuring Record in Fn Argument

Elm has destructuring/pattern matching that feels typical for an ML. One pattern matching feature I like is record destructuring in a function argument.

myRecord = {a = 1}

myFunc {a} = a

myFunc myRecord
# 1

Here, we destructure the key a out of the record. What’s cool about this is the record does not need to match exactly, but can be any record with the property of a.

myRecord = {a = 1, b = 2}

myFunc {a} = a

myFunc myRecord
# 1

This enables us to be able to grow the record over time without changing the signature of the function. In Elm, not having to accomodate that change across the entire program is great.

Formatting Elm Code

Elm comes with it’s own formatter.

elm-format src/

It’s got options like --upgrade which will help you get from 0.18 code to 0.19 code, and its got --validate which you can use in continuous integration to ensure all PRs are properly formatted.

In vim, formatting is enabled by default when you use elm-vim.

Where is List.zip in Elm?

unzip is a function available as part of the list package.

List.unzip [(1, 2), (3, 4)]
-- ([1,3],[2,4])

It’s defined as:

Decompose a list of tuples into a tuple of lists.

But there is no corresponding zip function to compose a tuple of lists into a list of tuples. If you just want a list to be zipped with it’s index, then you can use List.indexedMap.

List.indexedMap (\x y -> (x, y)) ["a", "b", "c"]
-- [(0,"a"),(1,"b"),(2,"c")]

And you could substitute (\x y -> (x, y)) with Tuple.pair which does the same thing.

List.indexedMap Tuple.pair ["a", "b", "c"]
-- [(0,"a"),(1,"b"),(2,"c")]

And if you don’t care about indexes but instead have two lists, you can zip those two lists together with List.map2.

List.map2 Tuple.pair [1, 3, 5] ["a", "b", "c"]
-- [(1,"a"),(3,"b"),(5,"c")]

Happy Zipping!

Random is not pure in Elm

Elm requires that functions be pure, that is, the same arguments should produce the same outputs every time. Random necessarily injects some uncertainty into what the outputs that way and Elm has decided to handle random differently than in other languages.

First, install the random package:

elm install elm/random

The Random package allows you to create numbers in a couple of different ways, but the most idiomatic is to create a generator:

generator = (Random.int 1 10)

And then create a message that will let the Elm runtime know to produce a random number with the parameters defined by the generator.

type Msg = ConsumeRandomValue

msg = Random.generate ConsumeRandomValue generator

This message can then be placed into the (model, msg) tuple that is returned from the update function. The update function is then called to respond to the message, using the message type to wrap the random value that has been produced.

import Random

type Msg = ProduceRandomValue | ConsumeRandomValue Int

update msg model =
    case msg of
        ProduceRandomValue -> 
            (model, Random.generate ConsumeRandomValue (Random.int 1 10))
        ConsumeRandomValue randomValue ->
            ({model | rValue = randomValue}, Cmd.none)

Decorator Factory vs Decorator

I’ve discovered decorator factories and they are cool!

Decorating a function in python is easy. Watch as I yell when a function gets called.

def yell(func):
    def yeller(*args):
        print("YARGH")
        func(*args)
    return yeller

@yell
def hi(name):
    print(f'hi {name}')

hi("Bob")
# YARGH
# hi Bob

What if I always wanted to say hi to Bob, and wanted to configure that via a decorator. Could I do that?

@yell("Bob")
def hi(name):
    print(f'hi {name}')

# TypeError: 'str' object is not callable

Instead of just passing an argument to a decorator, I need to create a function that will return a decorator, a decorator factory.

def yell(name):
    def decorate(func):
        def yeller():
            print("YARGH")
            func(name) 
        return yeller
    return decorate

@yell("Bob")
def hi(name):
    print(f'hi {name}')

hi()
# YARGH
# hi Bob

So this time, I created a function that returned a decorator which in turn returns a function that wraps the decorated function and calls the decorated function with the argument passed in to the decorator factory. Very russion doll. Fun.

Examining the closure

Python is the first language I’ve encountered that avails you the meta information about the defining closure of a function.

After you get the reference of a function, there is a lot of meta information available via the __code__ attribute. __code__ has many attributes and one of them is co_freevars which is all the variables defined outside of the function, but available through the closure to the function. It returns a tuple. The order of the values in this tuple is important.

The values of those co_freevars are in another dunder (__) method that is called __closure__. It also returns a tuple. The order is the same order as the co_freevars tuple. The tuple holds cell objects with one attribute, cell_contents. cell_contents holds the current co_freevar value.

def get_fn():
    a = 1
    b = 2
    def get():
        print(a)
        print(b)
        return (a, b)
    return get

gfn = get_fn()
gfn.__code__.co_freevars
# ('a', 'b')
gfn.__closure__[0]
# <cell at 0x10849c4f8: int object at 0x1081907c0>
gfn.__closure__[0].cell_contents
# 1
gfn.__closure__[1].cell_contents
# 2

Python global closure scoping oddity

Beware closuring in global variables.

world = 'Endor'

def world_name():
    def world_knower():
      print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Hoth

In the above example, the reference the local world was closured in to world_knower, and the value was changed after the world_knower declaration.

What if we use the global keyword to let python know we want to use the global version of this variable?

world = 'Endor'

def world_name():
    global world
    def world_knower():
      print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Hoth

Yikes, the inner function still uses the outer functions local reference. I guess if we truly want to ignore the local reference, we need to declare that world is global in the inner function.

world = 'Endor'

def world_name():
    def world_knower():
      global world
      print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Endor

Lambdas can only be one line

Unfortunately, lambdas in Python are limited. They can be only one expression.

This works:

print_twice = lambda word: print(word * 2)
# prints: "wordword"

But you can’t assign the intermediate value to a variable.

print_twice = lambda word: result = word * 2; print result;
# Syntax error

White space is significant is this language, so there are no (){} or keywords that would help you out here.

lambdas are the only way to define anonymous functions in python, but you can still define a named function in any context.

def build_fn(a):
    def adder(b):
        return b + a
    return adder

add3 = build_fn(3)
add3(4)
# 7

Annotate Args with Anything

Python annotations don’t do anything. Annotations are just meta information about the arguments of your method. You can apply this metadata in any way that you see fit.

Let’s check out a function with no annotations:

def talk(a, b, c):
    pass

talk.__annotations__
# {}

Makes sense, an empty dictionary. Now lets say that a should be an str.

def talk(a: str, b, c):
    pass

talk.__annotations__
# {'a': <class 'str'>}

What’s unique about python is that its primitive types are objects/classes. This allows for annotations to be accomodating of any object/class.

def talk(a: str, b: 'hi', c):
    pass

talk.__annotations__
{'a': <class 'str'>, 'b': 'hi'}

Any type or instance? What about declaring a variable outside of the method and using that as an annotation.

birds = {'corvids': True}
def talk(a: str, b: 'hi', c: birds):
    pass

talk.__annotations__
# {'a': <class 'str'>, 'b': 'hi', 'c': {'corvids': True}}

Yeah you can do that!! Crzy!!

Check out how argument annotations can be used as a type system with mypy

Remotely control your desktop over SSH on macOS

Using SSH port forwarding and VNC you can connect to your remote desktop using the Screen Sharing application.

First connect to your machine over SSH and port forward 5900.

$ ssh user@machine.somehwere -L 5900:localhost:5900

Then in another terminal, on your local machine, open Screen Sharing by passing a open a VNC URL.

$ open 'vnc://localhost'

Screen Sharing should open and ask you for credentials for the remote machine. And then you do cool things on your remote desktop!

Read more about VNC here in this wikipedia article.

Hat Tip to Dorian Karter!

Capture and View screenshot on macOS remotely

First, wake up the desktop with caffeinate.

caffeinate -u -t 2 # assert that the user is active

Then switch to the application you want to have focus on the desktop with open

open -a Google\ Chrome

Then call MacOS’s screencapture command.

sudo screencapture /Users/dev/Desktop/FullScreen.png

Finally, if you are daring, using iTerm2, have downloaded imgcat from the iTerm imgcat site, have chmod +x that file, and have copied that file to /usr/local/bin, then view the captured image in your terminal with imgcat.

imgcat /Users/dev/Desktop/FullScreen.png

Call an object like a function with `__call__`

Using __call__ you can make an object behave like a function.

class HiyaPerson:
     def __init__(self, person_name):
         self.person_name = person_name
     def __call__(self):
         print("Hiya there, " + self.person_name)

hiya = HiyaPerson("Bob")
hiya()
# Hiya there, Bob

A side effect is that now a “function” can have state. Interesting!

class Counter:
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
   
count = Counter()
count()

import inspect 
dict(inspect.getmembers(count))['count']
# 1
count()
dict(inspect.getmembers(count))['count']
# 2

Ad block is hiding your selector

Why is .ad-inner set to display: none !important? It’s because I’m using uBlock Origin, a fantastic ad blocker. It makes the internet tolerable. But it completely breaks the site I’m working on because a previous dev on the project chose ad-inner as a class name.

It turns out, there are a lot of class names that you shouldn’t use. Class names, ids, even certain dimensions applied as inline styles. uBlock Origin sources its rules from about 8 different lists, and the list that contained a rule for .ad-inner is EasyList. EasyList is hosted on github and the particular file with selector rules is easylist_general_hide.txt

Here are all the class names in the list that begin with .ad-i

##.ad-icon
##.ad-identifier
##.ad-iframe
##.ad-imagehold
##.ad-img
##.ad-img300X250
##.ad-in-300x250
##.ad-in-content-300
##.ad-in-post
##.ad-in-results
##.ad-incontent-ad-plus-billboard-top
##.ad-incontent-ad-plus-bottom
##.ad-incontent-ad-plus-middle
##.ad-incontent-ad-plus-middle2
##.ad-incontent-ad-plus-middle3
##.ad-incontent-ad-plus-top
##.ad-incontent-wrap
##.ad-index
##.ad-index-main
##.ad-indicator-horiz
##.ad-inline
##.ad-inline-article
##.ad-inner
##.ad-inner-container
##.ad-innr
##.ad-inpage-video-top
##.ad-insert
##.ad-inserter
##.ad-inserter-widget
##.ad-integrated-display
##.ad-internal
##.ad-interruptor
##.ad-interstitial
##.ad-intromercial
##.ad-island
##.ad-item
##.ad-item-related

A good rule of thumb is to not start a class name with .ad. This survey says that 30% of users are using ad blockers.

range() v slice()

A range is not a slice and a slice is not a range. But they look the same.

slice(1, 10)
# slice(1, 10, None)
range(1, 10)
# range(1, 10)

They both take step as a third argument.

slice(1, 10, 3)
# slice(1, 10, 3)
range(1, 10, 3)
# range(1, 10, 3)

But one is iterable and the other is not.

list(slice(1, 10))
# TypeError: 'slice' object is not iterable
list(range(1, 10))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

One can be used as a list indices, one cannot.

[1, 2, 3, 4, 5][range(1, 2)]
# TypeError: list indices must be integers or slices, not range
>>> [1, 2, 3, 4, 5][slice(1, 2)]
# [2]

They both conform to the start, stop, step interface.

s = slice(1, 10)
s.start, s.stop, s.step
# (1, 10, None)
r = range(1, 10)
r.start, r.stop, r.step
# (1, 10, 1)

You can slice a range but you can’t range a slice.

range(1, 10)[slice(2, 8)]
# range(3, 9)
slice(1, 10)[range(2, 8)]
# TypeError: 'slice' object is not subscriptable

Counting is as easy as...

Python has a specialized dict data structure for a common task, counting. Counter can process and store the counts of items you pass it.

Initialized like this with chars, and then keys:

from collections import Counter

char_count = Counter('xyxxyxyy')
# Counter({'x': 4, 'y': 4})

keys_count = Counter({'apple': 2, 'banana': 5, 'pear': 3})
# Counter({'banana': 5, 'pear': 3, 'apple': 2})

Then updating the Counter is easy with update:

chars_count.update('abxyabxy')
# Counter({'x': 6, 'y': 6, 'a': 2, 'b': 2})

keys_count.update({'apple': 1, 'banana': 2, 'fruit': 3})
# Counter({'banana': 7, 'apple': 3, 'pear': 3, 'fruit': 3})

Then figure out which items are the most common:

chars_count.most_common(2)
# [('x', 6), ('y', 6)]
keys_count.most_common(2)
# [('banana', 7), ('apple', 3)]

Read more about some really useful things you can do with counter objects in the Python docs

Pattern matching with `Kernel.match`

Pattern matching is powerful, but when iterating over a list with an Enum function you must allow all variants of the list to be processed. So this fails:

sample = [{1, "A"}, {2, "B"}]
 > Enum.filter(sample, fn ({_, "B"}) -> true end)
** (FunctionClauseError) no function clause matching in :erl_eval."-inside-an-interpreted-fun-"/1

    The following arguments were given to :erl_eval."-inside-an-interpreted-fun-"/1:

        # 1
        {1, "A"}

    (stdlib) :erl_eval."-inside-an-interpreted-fun-"/1
    (stdlib) erl_eval.erl:826: :erl_eval.eval_fun/6
    (elixir) lib/enum.ex:2898: Enum.filter_list/2

Instead of using pattern matching here we can just use an anonymous function that takes all args and makes a comparison.

sample = [{1, "A"}, {2, "B"}]
Enum.filter(sample, fn ({_, letter}) -> letter == "B" end)
# [{2, "B"}]

But the cool way to do it is with Kernel.match?. Which according to the docs is:

A convenience macro that checks if the right side (an expression) matches the left side (a pattern).

What that looks like:

sample = [{1, "A"}, {2, "B"}]
Enum.filter(sample, &match?({_, "B"}, &1))
# [{2, "B"}]

H/T Taylor Mock

Keep your lists sorted on every insert

Python is a language that provides you the tools of efficiency. The bisect module has a series of functions that work with a bisecttion algorithm. An inserting function in this grouping is the insort function which will insert an element into a sequence in the right spot according to the sort order.

from bisect import insort
sorted_list = [0, 2, 4, 6, 8, 10]
insort(sorted_list, 5)
# [0, 2, 4, 5, 6, 8, 10]

The above example uses a sorted list, what if the list is not sorted?

list = [0, 4, 2, 6]
insort(list, 5)
# [0, 4, 2, 5, 6]

It will still insert in the first place that it can find that makes sense. But what if there are two places that make sense? The algo seems to give up and just tack it on at the end.

list = [0, 4, 2, 6, 2]
>>> insort(list, 5)
# [0, 4, 2, 6, 2, 5]

insort works left to right but you can also work right to left with insert_right. In some cases this will be more efficient.

`::before` pseudo element full height

I’m trying to put a before image to the left of some text. But I want the text to NOT wrap below the image. The image is only 40px tall but the text can be any height.

<div class="text">
  Many many words
</div>
.text {
  width: 200px;

  &:before {
    background-image: url('carrot.png');
  }
}

When I do this the words wrap under the carrot. To fix this I can use the absolute top 0 bottom 0 technique to give the before pseudo element full height.

.text {
  width: 200px;
  position: relative;
  padding-left: 32px;

  &:before {
    background-image: url('carrot.png');
    position: absolute;
    top: 0;
    bottom: 0;
    left: 0;
  }
}

Now none of the text wraps under the carrot.

Steppin' and Slicin' through lists

Python allows you to slice a list with a range, which is common. But what is less common is adding a step variable to the same syntax construct.

A range takes on the form of start:end in python, but it allows you to default to the start and end of the data with just :.

[0, 1, 2, 3, 4, 5, 6][:]
# [0, 1, 2, 3, 4, 5, 6]

Add some numbers in there and it looks like this:

[0, 1, 2, 3, 4, 5, 6][2:5]
# [2, 3, 4]

But bolted on syntax allows you to step as well.

[0, 1, 2, 3, 4, 5, 6][2:5:2]
# [2, 4]

The form is actually start:end:step. Leaving out the start and end looks weird but allows you to get all even indexed elements:

[0, 1, 2, 3, 4, 5, 6][::2]
# [0, 2, 4, 6]

Named arguments by default

This feature is fantastic and I haven’t seen it in any other language.

You don’t have to declare named arguments, all arguments are named with the argument names by default.

>>> def add(a, b, c):
...     return a + b + c
...
>>> add(1, 2, 3)
6
>>> add(c=1, b=2, a=3)
6

What happens if you mix in named args with positional args

>>> add(1, b=2, c=3)
6

That works, but if you change the order?

>>> add(a=1, b=2, 3)
  File "<stdin>", line 1
SyntaxError: non-keyword arg after keyword arg

That error is definitive which is great. What about using a named arg for an already declared positional argument? Another definitive error:

>>> add(1, a=2, c=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: add() got multiple values for keyword argument 'a'

There are so many times in Ruby and in JavaScript(es6 w/destructuring) where I debate whether the arguments should be named or not. I don’t really have a good rhyme or reason to it other than just feel and percieved readability. To not have to think about it seems wonderful.

Creating named tuples in Python

Named tuples are an interesting, flexible and fun data structure in Python.

Sorta like an openstruct in Ruby, it’s a quick way to create a finite data structure that has named properties. The properties of a namedtuple are also accessible via index.

from collections import namedtuple
Thing = namedTuple('Thing', ['a', 'b'])
x = Thing(1, 2)
x.a
# 1
x[0]
# 1

You can create an instance of a named tuple with named arguments as well.

x = Thing(b=3, a=4)
# Thing(a=4, b=3)

The properties of a namedtuple can also be declared with a string rather than a list. That looks like this:

Row = namedtuple('Row', 'a b c d')

or comma separated:

Row = namedtuple('Row', 'a, b, c, d')

Imports in ES5

The ES6 import keyword is so pervasive in how we write JavaScript currently that when when I had to “import” the path package in ES5 I didn’t know how to do it!

It’s really easy though, and if you don’t like magic keywords maybe it’s a little more intuitive too.

const path = require('path');

And if something is the default export in it’s module, then you can use the default property.

const Something = require('something').default;

I ran into this in a file that was outside the build path, gatsby-config.js.

String Interpolation in python??

String interpolation is something that I’ve come to expect from a modern language. Does python have it?

Well, sorta. From this article python has a couple different ways of getting data into a string.

‘Old Style’

'Old Style is a cheap beer from %' % 'Chicago'

‘New Style’

'New Style seems a bit less {}'.format('obtuse')

‘f-Strings’ (from python 3.6)

adejective = 'better'
f'This is a bit {adjective}'

Template Strings

from string import Template
t = Template('This is maybe more like Ruby\'s $tech?')
t.substitute(tech='ERB')

Where is that python function defined?

If you want to look at the definition of a function in Python but you don’t know where it’s defined, you can access a special attribute of the function __globals__.

In the following case I have a function imported from nltk called bigram and I want to see what file it’s defined in:

>>> bigram.__globals__['__file__']
'.../python3.7/site-packages/nltk/util.py'

Ah, just as I suspected, it is defined in nltk/util.py

__file__ is an attribute of the globals dictionary.

The special attribute __globals__ is only available for user-defined functions. If you try to access it for a python defined function you’ll get an error:

>>> len.__globals__['__file__']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute '__globals__'

Python has tuples!

I love a good tuple and in JavaScript or Ruby I sometimes use arrays as tuples.

[thing, 1, "Description"]

In those languages however, this tuple isn’t finite. Wikipedia defines tuple thusly.

a tuple is a finite ordered list (sequence) of elements.

Python tuples look like this:

mytuple = (thing, 1, "Description")

And is it finite?

>>> mytuple.append("c")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

No append attribute, ok. But can you add tuples together?

>>> a = ('a', 'b')
>>> a + ('c', 'd')
('a', 'b', 'c', 'd')
>>> a
('a', 'b')

You can add tuples together but it’s not mutative.

A syntax quirk is the one element tuple.

>>> type(('c'))
<class 'str'>
>>> type(('c',))
<class 'tuple'>

Include a comma after the only tuple element to ensure that the tuple is not tokenized as a string.