Today I Learned

A Hashrocket project

17 posts about #python

Python Help in REPL 🐍

Heading to the internet to get help with a Python function? Slow down! There’s help right in your terminal.

With the Python executable installed, it’s as easy as:

$ python
>>> help()
help> string.lower
Help on function lower in string:

string.lower = lower(s)
    lower(s) -> string

    Return a copy of the string s converted to lowercase.

help>
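The transcript above looks like Python 2, where the string module still had a lower function. As a rough sketch for Python 3 (assuming the method you want lives on str), you can skip the interactive prompt and hand the object straight to help(), or read the same text from __doc__:

```python
# Python 3 sketch: string.lower is gone, the method lives on str.
help(str.lower)          # prints the help page for str.lower

# The docstring behind that page is available as a plain string.
doc = str.lower.__doc__
print(doc)
```

Passing an object to help() is handy when you already know what you are looking for and don't want to enter and leave the help> prompt.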

Decorator Factory vs Decorator

I’ve discovered decorator factories and they are cool!

Decorating a function in Python is easy. Watch as I yell when a function gets called.

def yell(func):
    def yeller(*args):
        print("YARGH")
        func(*args)
    return yeller

@yell
def hi(name):
    print(f'hi {name}')

hi("Bob")
# YARGH
# hi Bob

What if I always wanted to say hi to Bob and wanted to configure that via a decorator? Could I do that?

@yell("Bob")
def hi(name):
    print(f'hi {name}')

# TypeError: 'str' object is not callable

Instead of just passing an argument to a decorator, I need to create a function that will return a decorator, a decorator factory.

def yell(name):
    def decorate(func):
        def yeller():
            print("YARGH")
            func(name) 
        return yeller
    return decorate

@yell("Bob")
def hi(name):
    print(f'hi {name}')

hi()
# YARGH
# hi Bob

So this time, I created a function that returns a decorator. That decorator in turn returns a function that wraps the decorated function and calls it with the argument passed in to the decorator factory. Very Russian doll. Fun.
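Here is a slightly more general sketch of the same three layers. The shout parameter is my own invention for illustration, and functools.wraps plus the return value are additions the examples above don't have:

```python
import functools

def yell(shout="YARGH"):
    """Decorator factory; `shout` is a hypothetical parameter for this sketch."""
    def decorate(func):
        @functools.wraps(func)              # keep func.__name__ and its docstring
        def yeller(*args, **kwargs):
            print(shout)
            return func(*args, **kwargs)    # hand the result back to the caller
        return yeller
    return decorate

@yell("HEY")
def hi(name):
    return f'hi {name}'

print(hi("Bob"))
# HEY
# hi Bob
print(hi.__name__)
# hi
```

Without functools.wraps, hi.__name__ would report yeller, which gets confusing in tracebacks.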

Examining the closure

Python is the first language I’ve encountered that gives you access to meta information about the defining closure of a function.

Once you have a reference to a function, a lot of meta information is available via the __code__ attribute. __code__ has many attributes, and one of them is co_freevars, which lists the names of all the variables defined outside of the function but available to it through the closure. It is a tuple, and the order of the names in it is important.

The values of those co_freevars live in another dunder (__) attribute called __closure__. It is also a tuple, in the same order as the co_freevars tuple. The tuple holds cell objects with one attribute, cell_contents, which holds the current value of the corresponding free variable.

def get_fn():
    a = 1
    b = 2
    def get():
        print(a)
        print(b)
        return (a, b)
    return get

gfn = get_fn()
gfn.__code__.co_freevars
# ('a', 'b')
gfn.__closure__[0]
# <cell at 0x10849c4f8: int object at 0x1081907c0>
gfn.__closure__[0].cell_contents
# 1
gfn.__closure__[1].cell_contents
# 2
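Since the two tuples line up, they can be zipped into a dictionary. A minimal sketch, where closure_vars is a hypothetical helper name of mine:

```python
def get_fn():
    a = 1
    b = 2
    def get():
        return (a, b)
    return get

def closure_vars(fn):
    """Hypothetical helper: map each free variable name to its current value."""
    cells = fn.__closure__ or ()    # __closure__ is None when nothing is captured
    names = fn.__code__.co_freevars
    return {name: cell.cell_contents for name, cell in zip(names, cells)}

print(closure_vars(get_fn()))
# {'a': 1, 'b': 2}
```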

Python global closure scoping oddity

Beware closing over a name that is also a global variable.

world = 'Endor'

def world_name():
    def world_knower():
        print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Hoth

In the above example, the local world was closed over by world_knower, and its value was changed after world_knower was defined. The closure holds a reference to the variable, not a snapshot of its value, so the call prints Hoth.

What if we use the global keyword to let python know we want to use the global version of this variable?

world = 'Endor'

def world_name():
    global world
    def world_knower():
        print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Hoth

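Yikes, still Hoth, but for a different reason. With global world declared in the outer function there is no local world at all, so world = 'Hoth' rebinds the global variable and the inner function reads that changed global. If we truly want to keep a local in the outer function and have the inner function ignore it, we need to declare that world is global in the inner function.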

world = 'Endor'

def world_name():
    def world_knower():
        global world
        print(world)
    world = 'Hoth'
    return world_knower

knower = world_name()
knower()
# Endor
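One way to see which lookup each version really uses is to check co_freevars on the inner function. This is a small sketch with function names I made up (closes_over_local, rebinds_global):

```python
def closes_over_local():
    def knower():
        return world
    world = 'Hoth'      # local to closes_over_local, captured by knower
    return knower

def rebinds_global():
    global world
    def knower():
        return world
    world = 'Hoth'      # rebinds the module-level name, nothing is captured
    return knower

print(closes_over_local().__code__.co_freevars)
# ('world',)
print(rebinds_global().__code__.co_freevars)
# ()
```

An empty co_freevars means the inner function has no closure over world and looks the name up as a global every time it runs.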

Lambdas can only be one line

Unfortunately, lambdas in Python are limited. They can be only one expression.

This works:

print_twice = lambda word: print(word * 2)
print_twice("word")
# prints: wordword

But you can’t assign the intermediate value to a variable.

print_twice = lambda word: result = word * 2; print(result);
# Syntax error

White space is significant in this language, so there are no () {} or keywords that would help you out here.

Lambdas are the only way to define anonymous functions in Python, but you can still define a named function in any context.

def build_fn(a):
    def adder(b):
        return b + a
    return adder

add3 = build_fn(3)
add3(4)
# 7

Annotate Args with Anything

Python annotations don’t do anything on their own. They are just meta information about the arguments of your function. You can apply this metadata in any way that you see fit.

Let’s check out a function with no annotations:

def talk(a, b, c):
    pass

talk.__annotations__
# {}

Makes sense, an empty dictionary. Now let’s say that a should be a str.

def talk(a: str, b, c):
    pass

talk.__annotations__
# {'a': <class 'str'>}

Python’s built-in types are themselves objects/classes, and an annotation can be any expression. This allows annotations to accommodate any object/class.

def talk(a: str, b: 'hi', c):
    pass

talk.__annotations__
# {'a': <class 'str'>, 'b': 'hi'}

Any type or instance? What about declaring a variable outside of the function and using that as an annotation?

birds = {'corvids': True}
def talk(a: str, b: 'hi', c: birds):
    pass

talk.__annotations__
# {'a': <class 'str'>, 'b': 'hi', 'c': {'corvids': True}}

Yeah you can do that!! Crazy!!

Check out how argument annotations can be used as a type system with mypy.
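Since the interpreter ignores annotations, anything that enforces them has to be written by you or by a tool. As a rough sketch of "apply this metadata in any way that you see fit", here is a hypothetical check_types decorator (my own name, not a standard library feature) that only enforces annotations that happen to be classes:

```python
import functools
import inspect

def check_types(func):
    """Hypothetical decorator: enforce annotations that happen to be classes."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = func.__annotations__.get(name)
            # Only classes are checked; an annotation like 'hi' is ignored.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f'{name} should be {expected.__name__}')
        return func(*args, **kwargs)
    return wrapper

@check_types
def talk(a: str, b: 'hi', c):
    return a

talk('caw', 1, 2)       # fine
# talk(5, 1, 2) would raise TypeError: a should be str
```

This runs on every call, so it is more of a demonstration than something to reach for instead of a static checker.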

Call an object like a function with `__call__`

Using __call__ you can make an object behave like a function.

class HiyaPerson:
    def __init__(self, person_name):
        self.person_name = person_name
    def __call__(self):
        print("Hiya there, " + self.person_name)

hiya = HiyaPerson("Bob")
hiya()
# Hiya there, Bob

A side effect is that now a “function” can have state. Interesting!

class Counter:
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
   
count = Counter()
count()
count.count
# 1
count()
count.count
# 2
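If you are handed an object and are not sure whether it can be called, the built-in callable() will tell you. A small sketch reusing the Counter class from above, with a return value added to __call__ purely for illustration:

```python
class Counter:
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
        return self.count       # returning the count is my addition

count = Counter()
print(callable(count))      # True: the instance defines __call__
print(callable(Counter))    # True: calling a class creates an instance
print(callable(42))         # False: int instances have no __call__
print(count(), count())     # 1 2
```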

range() v slice()

A range is not a slice and a slice is not a range. But they look the same.

slice(1, 10)
# slice(1, 10, None)
range(1, 10)
# range(1, 10)

They both take step as a third argument.

slice(1, 10, 3)
# slice(1, 10, 3)
range(1, 10, 3)
# range(1, 10, 3)

But one is iterable and the other is not.

list(slice(1, 10))
# TypeError: 'slice' object is not iterable
list(range(1, 10))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

One can be used as a list index, one cannot.

[1, 2, 3, 4, 5][range(1, 2)]
# TypeError: list indices must be integers or slices, not range
[1, 2, 3, 4, 5][slice(1, 2)]
# [2]

They both conform to the start, stop, step interface.

s = slice(1, 10)
s.start, s.stop, s.step
# (1, 10, None)
r = range(1, 10)
r.start, r.stop, r.step
# (1, 10, 1)

You can slice a range but you can’t range a slice.

range(1, 10)[slice(2, 8)]
# range(3, 9)
slice(1, 10)[range(2, 8)]
# TypeError: 'slice' object is not subscriptable
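You can't range a slice directly, but a slice can tell you which range it would cover for a sequence of a given length via its indices method. A minimal sketch, with letters as made-up sample data:

```python
letters = ['a', 'b', 'c', 'd', 'e']
s = slice(1, 10, 3)

# indices(length) clamps the slice to a sequence of that length and
# returns a (start, stop, step) tuple with no None values.
start, stop, step = s.indices(len(letters))
r = range(start, stop, step)

print(r)                          # range(1, 5, 3)
print([letters[i] for i in r])    # ['b', 'e']
print(letters[s])                 # ['b', 'e']
```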

Counting is as easy as...

Python has a specialized dict data structure for a common task, counting. Counter can process and store the counts of items you pass it.

It can be initialized with an iterable, like the characters of a string, or with a mapping of keys to counts:

from collections import Counter

chars_count = Counter('xyxxyxyy')
# Counter({'x': 4, 'y': 4})

keys_count = Counter({'apple': 2, 'banana': 5, 'pear': 3})
# Counter({'banana': 5, 'pear': 3, 'apple': 2})

Then updating the Counter is easy with update:

chars_count.update('abxyabxy')
# Counter({'x': 6, 'y': 6, 'a': 2, 'b': 2})

keys_count.update({'apple': 1, 'banana': 2, 'fruit': 3})
# Counter({'banana': 7, 'apple': 3, 'pear': 3, 'fruit': 3})

Then figure out which items are the most common:

chars_count.most_common(2)
# [('x', 6), ('y', 6)]
keys_count.most_common(2)
# [('banana', 7), ('apple', 3)]

Read more about some really useful things you can do with Counter objects in the Python docs.
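One of those useful things is arithmetic between counters. A quick sketch with made-up fruit counts:

```python
from collections import Counter

basket = Counter({'apple': 2, 'banana': 5})
eaten = Counter({'apple': 2, 'banana': 1})

print(basket + eaten)
# Counter({'banana': 6, 'apple': 4})
print(basket - eaten)       # counts that drop to zero are removed
# Counter({'banana': 4})
print(basket['kiwi'])       # missing keys count as zero instead of raising
# 0
```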

Keep your lists sorted on every insert

Python is a language that provides you the tools of efficiency. The bisect module has a series of functions that work with a bisection algorithm. One of them is insort, which inserts an element into a sorted sequence in the right spot according to the sort order.

from bisect import insort
sorted_list = [0, 2, 4, 6, 8, 10]
insort(sorted_list, 5)
# [0, 2, 4, 5, 6, 8, 10]

The above example uses a sorted list. What if the list is not sorted?

unsorted = [0, 4, 2, 6]
insort(unsorted, 5)
# [0, 4, 2, 5, 6]

insort assumes the list is already sorted and runs a binary search without checking. On an unsorted list the element goes wherever that search happens to land, which here looks reasonable. With a different unsorted list, the search lands at the end:

unsorted = [0, 4, 2, 6, 2]
insort(unsorted, 5)
# [0, 4, 2, 6, 2, 5]

insort is an alias for insort_right, which places the new element after any existing elements that compare equal. insort_left places it before them. The difference only shows up when the list already contains equal elements.
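To make the left/right distinction in the bisect module concrete, here is a small sketch using insort_left and insort_right. It inserts the float 2.0 into a list of ints, since 2.0 == 2 and the float makes the insertion point visible:

```python
from bisect import bisect_left, bisect_right, insort_left, insort_right

scores = [1, 2, 2, 3]
print(bisect_left(scores, 2))     # 1, the index before the existing 2s
print(bisect_right(scores, 2))    # 3, the index after the existing 2s

left = [1, 2, 2, 3]
insort_left(left, 2.0)
print(left)                       # [1, 2.0, 2, 2, 3]

right = [1, 2, 2, 3]
insort_right(right, 2.0)
print(right)                      # [1, 2, 2, 2.0, 3]
```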

Steppin' and Slicin' through lists

Python allows you to slice a list with a start:end pair, which is common. What is less common is adding a step to the same syntax construct.

A slice takes the form start:end in Python, and both ends default to the start and end of the data, so you can write just :.

[0, 1, 2, 3, 4, 5, 6][:]
# [0, 1, 2, 3, 4, 5, 6]

Add some numbers in there and it looks like this:

[0, 1, 2, 3, 4, 5, 6][2:5]
# [2, 3, 4]

But bolted-on syntax allows you to step as well.

[0, 1, 2, 3, 4, 5, 6][2:5:2]
# [2, 4]

The form is actually start:end:step. Leaving out the start and end looks weird but allows you to get all even indexed elements:

[0, 1, 2, 3, 4, 5, 6][::2]
# [0, 2, 4, 6]
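A few more variations on start:end:step, as a quick sketch. The step can be negative, and the bracket syntax is building a slice object behind the scenes:

```python
nums = [0, 1, 2, 3, 4, 5, 6]

print(nums[1::2])       # [1, 3, 5], the odd-indexed elements
print(nums[::-1])       # [6, 5, 4, 3, 2, 1, 0], a reversed copy
print(nums[5:1:-2])     # [5, 3], walking backwards from index 5, stopping before 1

# The same selection written with an explicit slice object.
print(nums[slice(None, None, 2)] == nums[::2])    # True
```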

Named arguments by default

This feature is fantastic and I haven’t seen it in any other language.

You don’t have to declare named arguments. All arguments are named with the argument names by default.

>>> def add(a, b, c):
...     return a + b + c
...
>>> add(1, 2, 3)
6
>>> add(c=1, b=2, a=3)
6

What happens if you mix in named args with positional args?

>>> add(1, b=2, c=3)
6

That works, but if you change the order?

>>> add(a=1, b=2, 3)
  File "<stdin>", line 1
SyntaxError: positional argument follows keyword argument

That error is definitive which is great. What about using a named arg for an already declared positional argument? Another definitive error:

>>> add(1, a=2, c=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: add() got multiple values for argument 'a'

There are so many times in Ruby and in JavaScript (ES6 w/destructuring) where I debate whether the arguments should be named or not. I don’t really have a good rhyme or reason to it other than just feel and perceived readability. To not have to think about it seems wonderful.
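If you do want to force callers to use a name, Python 3 lets you mark arguments as keyword-only with a bare * in the signature. A small sketch, where making c keyword-only with a default is my own variation on add:

```python
def add(a, b, *, c=0):
    # Everything after the bare * can only be passed by name.
    return a + b + c

print(add(1, 2, c=3))       # 6
print(add(b=2, a=1))        # 3

try:
    add(1, 2, 3)
except TypeError as err:
    print(err)              # exact wording varies by Python version
```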

Creating named tuples in Python

Named tuples are an interesting, flexible and fun data structure in Python.

Sorta like an openstruct in Ruby, it’s a quick way to create a finite data structure that has named properties. The properties of a namedtuple are also accessible via index.

from collections import namedtuple
Thing = namedtuple('Thing', ['a', 'b'])
x = Thing(1, 2)
x.a
# 1
x[0]
# 1

You can create an instance of a named tuple with named arguments as well.

x = Thing(b=3, a=4)
# Thing(a=4, b=3)

The properties of a namedtuple can also be declared with a string rather than a list. That looks like this:

Row = namedtuple('Row', 'a b c d')

or comma separated:

Row = namedtuple('Row', 'a, b, c, d')
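Named tuples also come with a few underscore-prefixed helpers. A short sketch using the same Thing shape as above:

```python
from collections import namedtuple

Thing = namedtuple('Thing', 'a b')
x = Thing(1, 2)

print(Thing._fields)        # ('a', 'b')
print(x._replace(b=5))      # Thing(a=1, b=5)
print(x)                    # Thing(a=1, b=2), the original is unchanged
print(dict(x._asdict()))    # {'a': 1, 'b': 2}

a, b = x                    # unpacks like a regular tuple
```

Because a namedtuple is still a tuple, _replace returns a new instance instead of changing x.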

String Interpolation in python??

String interpolation is something that I’ve come to expect from a modern language. Does Python have it?

Well, sorta. According to this article, Python has a couple different ways of getting data into a string.

‘Old Style’

'Old Style is a cheap beer from %s' % 'Chicago'

‘New Style’

'New Style seems a bit less {}'.format('obtuse')

‘f-Strings’ (from python 3.6)

adjective = 'better'
f'This is a bit {adjective}'

Template Strings

from string import Template
t = Template('This is maybe more like Ruby\'s $tech?')
t.substitute(tech='ERB')
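Here are all four styles side by side with their results, as a quick sketch. The sample strings are made up:

```python
from string import Template

city = 'Chicago'

old = 'Old Style is a cheap beer from %s' % city
new = '{0} and {name}'.format('positional', name='named')
fstr = f'{city.upper()} has {len(city)} letters'     # any expression is allowed
tmpl = Template('$tech or ${tech}s').substitute(tech='ERB')

print(old)      # Old Style is a cheap beer from Chicago
print(new)      # positional and named
print(fstr)     # CHICAGO has 7 letters
print(tmpl)     # ERB or ERBs
```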

Where is that python function defined?

If you want to look at the definition of a function in Python but you don’t know where it’s defined, you can access a special attribute of the function, __globals__.

In the following case I have a function imported from nltk called bigram and I want to see what file it’s defined in:

>>> bigram.__globals__['__file__']
'.../python3.7/site-packages/nltk/util.py'

Ah, just as I suspected, it is defined in nltk/util.py

__file__ is a key in the globals dictionary.

The special attribute __globals__ is only available for user-defined functions, that is, functions written in Python. If you try to access it on a built-in function you’ll get an error:

>>> len.__globals__['__file__']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute '__globals__'
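The inspect module offers another route to the same answer. This sketch uses json.dumps from the standard library as a stand-in, since I can't assume nltk is installed:

```python
import inspect
import json

# For a function written in Python this matches __globals__['__file__'].
src = inspect.getsourcefile(json.dumps)
print(src)                      # a path ending in json/__init__.py
print(json.dumps.__module__)    # json

# Built-ins have no Python source file, so inspect raises TypeError.
try:
    inspect.getsourcefile(len)
    builtin_has_source = True
except TypeError as err:
    builtin_has_source = False
    print('no source file:', err)
```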

Python has tuples!

I love a good tuple and in JavaScript or Ruby I sometimes use arrays as tuples.

[thing, 1, "Description"]

In those languages however, this tuple isn’t really finite, since an array can always grow. Wikipedia defines tuple thusly:

a tuple is a finite ordered list (sequence) of elements.

Python tuples look like this:

mytuple = (thing, 1, "Description")

And is it finite?

>>> mytuple.append("c")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

No append attribute, ok. But can you add tuples together?

>>> a = ('a', 'b')
>>> a + ('c', 'd')
('a', 'b', 'c', 'd')
>>> a
('a', 'b')

You can add tuples together, but that creates a new tuple rather than mutating the original.

A syntax quirk is the one element tuple.

>>> type(('c'))
<class 'str'>
>>> type(('c',))
<class 'tuple'>

Include a comma after the only tuple element. Without it, the parentheses are just grouping and the expression evaluates to a string.
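A few more tuple behaviors worth seeing in one place, as a small sketch with made-up values:

```python
point = (3, 4)

try:
    point[0] = 10
except TypeError as err:
    print(err)          # 'tuple' object does not support item assignment

# Tuples of hashable items are hashable, so they work as dict keys.
grid = {point: 'treasure'}
print(grid[(3, 4)])     # treasure

x, y = point            # unpacking
single = 'c',           # the comma makes the tuple, the parentheses are optional
print(type(single))     # <class 'tuple'>
```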

Get the ancestors of a python class

Python has inheritance, but if you encounter a Python object in a program, how can you tell what its superclasses are?

mro is an abbreviation for method resolution order. Call the mro method on a class.

>>> something = object()
>>> type(something).mro()
[<class 'object'>]

This is just an object of type object, its only class is object. Let’s get a bit more complicated.

>>> class Fruit:
...     pass
...
>>> class Apple(Fruit):
...     pass
...
>>> Apple.mro()
[<class '__main__.Apple'>, <class '__main__.Fruit'>, <class 'object'>]

OK, Apple inherits from Fruit which inherits from object. Makes sense!
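The order matters more once multiple inheritance is involved. A small sketch, where Red is a hypothetical second base class I added:

```python
class Fruit:
    pass

class Red:
    pass

class Apple(Fruit, Red):
    pass

# Bases are searched left to right, with object always last.
names = [cls.__name__ for cls in Apple.mro()]
print(names)
# ['Apple', 'Fruit', 'Red', 'object']

# __mro__ holds the same classes as a tuple.
print(Apple.__mro__ == tuple(Apple.mro()))    # True
print(issubclass(Apple, Red))                 # True
```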