For a while there was a slightly uncomfortable situation in the python ecosystem. On the one hand, any newcomer to the python world is soon exposed to the “there should be one-- and preferably only one --obvious way to do it” philosophy. On the other hand, a brutal violation of this philosophy greets anyone who visits the python download page. Should I get 2.7 or 3 something? Tough choice…

For me, as for lots of python users, it wasn’t really a problem. I have always used python 2.7, kinda for one silly reason: print was a statement and not a function (which, as you probably know, changed in the 3 series). None of the new features that went into the 3 series was enough to make me change my habit. As you probably suspect already, that held only until now.

You may wonder what killer feature made me so enthusiastic about python 3. There are actually two of them. The first one is the ability to pinpoint memory leaks with the tracemalloc module. It has been present in the 3 series for some time (and, with some painful setup, is usable in 2.7 too), but alone it wasn’t enough to make me consider python 3. The second one was added in the latest 3.6 release: formatted string literals. I’ll cover both of them in this post.

The tracemalloc module

If you want to optimize your code for execution speed, the story has been dead simple for a long time. You import cProfile, run your code under it and view the results (e.g. using snakeviz). If you need finer granularity at the source code level (i.e. measurements for a given line in your source file instead of function-level stats), you go for the ‘line_profiler’ package. Easy and efficient.
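For the record, here is a minimal sketch of the cProfile part of that workflow. The names `slow_sum` and `profile_call` are made up for this illustration; only `cProfile`, `pstats` and `io` come from the standard library. Treat it as one possible way to wire things up, not the only one.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive workload (hypothetical, for illustration only)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    # Write the report into a string instead of stdout
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show at most 5 entries
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_call(slow_sum, 100000)
    print(report)
```

If you prefer a graphical view, you can instead dump the stats to a file with `profiler.dump_stats(...)` and open that file in a viewer.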

To learn where and how much memory was allocated, you could use the memory_profiler package. The printout was informative, but the measurement came at a significant cost in execution speed (for the snippet shown below I measured a slowdown of the order of 20 times). The situation got better with the introduction of the tracemalloc module, where the slowdown is significantly lower (2x measured on the same code). As usual, we’ll see the usage with an example:

#! /usr/bin/env python
import tracemalloc
tracemalloc.start()

def main():
    test = {}
    for x in range(10000):
        test[x] = str(x)*100

    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics('lineno')[:5]:
        print(stat)

if __name__ == "__main__":
    main()

Running this should yield the following output:

test_tm.py:8: size=4564 KiB, count=10001, average=467 B
test_tm.py:7: size=266 KiB, count=9743, average=28 B
test_tm.py:5: size=712 B, count=2, average=356 B

which immediately tells us which line causes most of the memory allocations.
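When hunting an actual leak, a single snapshot is often less useful than the difference between two of them. The sketch below assumes a hypothetical workload `build_cache` and a helper `top_growth` (both invented for this example) and uses `Snapshot.compare_to` from the standard tracemalloc module to list the lines whose allocations grew the most.

```python
import tracemalloc


def build_cache():
    """Hypothetical workload: allocates a few MiB of strings."""
    return {x: str(x) * 100 for x in range(10000)}


def top_growth(workload, limit=3):
    """Return (diffs, result): the lines that grew most while workload() ran."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = workload()  # keep a reference so the memory stays allocated
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to returns StatisticDiff objects, biggest change first
    return after.compare_to(before, "lineno")[:limit], result


if __name__ == "__main__":
    diffs, cache = top_growth(build_cache)
    for diff in diffs:
        print(diff)
```

Each printed entry carries the file name and line number along with the size and count differences, so a line that keeps growing between snapshots is a good leak suspect.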

Tracemalloc is also available for python 2.7, but installation requires recompiling python from source with some patches applied. It is not very complicated, but it is time-consuming enough that I did it only once, when I was up against a wall. So having tracemalloc as a standard module in python 3 gives you a great deal of functionality without any struggle.

Formatted string literals

The second feature I would like to advertise today is string interpolation, also known as formatted string literals, added in the latest python release (3.6). The idea is quite simple but very convenient and powerful: you can now embed python expressions inside string literals. Among other things, this lets you put local variables into a string without calling the ‘format’ method or using the ‘%’ notation. The following example illustrates this:

def main():
    val1 = 1
    val2 = 3
    val3 = Exception("Whoops!")
    todo = [f"Reference local variable: {val1}",
            f"Division of two variables: {val1/val2}",
            f"Division with format specifier: {val1/val2:.2f}",
            f"Function calls are OK: {list(map(lambda x: x*x, [1,2,3,4]))}",
            f"Exception caught: {val3} # standard, using str",
            f"Exception caught: {val3!r} # using repr"]

    for t in todo:
        print(t)

if __name__ == "__main__":
    main()

This should produce output as follows:

Reference local variable: 1
Division of two variables: 0.3333333333333333
Division with format specifier: 0.33
Function calls are OK: [1, 4, 9, 16]
Exception caught: Whoops! # standard, using str
Exception caught: Exception('Whoops!',) # using repr

The notation is pretty neat and compact. All you have to do is start your string literal with ‘f’ and embed any number of valid python expressions inside curly braces. You can also use standard format specifiers after a colon, as you would with the ‘format’ method (see line 7 in the source code above). It is worth noting that you can also control the way a given value is converted to a string. By default, the ‘str’ function is used. You can force python to use ‘repr’ (or ‘ascii’) by adding !r (or !a for ‘ascii’) after the expression, for example ‘{val3!r}’ (this is what we did in line 10 of the example above). One last thing worth noting is that you cannot use the ‘!’ and ‘:’ characters at the top level of your embedded expressions (given their special purpose). The only exception is the `!=` operator. Edit: you can actually use the ‘!’ and ‘:’ characters inside your expression as long as you nest them inside parentheses (as in line 8 of the example above).
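To make the rules about conversions and special characters a bit more tangible, here is a small sketch. The function `demo` and its variables are made up for this illustration; the syntax itself is the standard python 3.6 f-string syntax.

```python
def demo():
    """Return a few f-string results showing conversions and special characters."""
    value = "caf\u00e9"  # a string with one non-ASCII character
    limit = 3
    return [
        f"{value!s}",                   # explicit str() conversion
        f"{value!r}",                   # repr() conversion adds quotes
        f"{value!a}",                   # ascii() escapes non-ASCII characters
        f"{limit != 3}",                # '!=' is allowed without parentheses
        f"{(lambda x: x * 2)(limit)}",  # ':' is fine when nested in parentheses
        f"{limit:>4}",                  # top-level ':' starts the format specifier
    ]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Without the parentheses around the lambda, everything after its colon would be read as a format specifier, which is why the nesting matters.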

Overall, string interpolation is a feature that I had been missing in python ever since I was first introduced to the idea while experimenting with the scala programming language.

Wrap up

“Critical mass” is a term that may refer to several different phenomena. In physics it means the smallest amount of material that allows a sustained nuclear chain reaction. The number of new features added to python 3 (and deliberately not added to the 2 series) reminds me of this physics meaning. With this release, the amount of goodies that I would miss by sticking to 2.7 is simply too large. I guess it may be the same for others, maybe to the point that we will soon start seeing a growing number of python 3 only packages uploaded to pypi. So it’s really the moment to say ‘thank you’ to 2.7 and move to the 3 series.

p.s. Python 3.6 is more than 6 months old now. I only recently learned about the string interpolation feature by watching a great presentation from pycon2017.



  1. > One last thing worth noting is that you cannot use the ‘!’ and ‘:” characters inside your embedded expressions (given their special purpose). The only exception is the `!=` operator.

    You did use the “:” character inside the embedded expression in your lambda example 🙂
