Python's Star Operator

The star operator (*) can be used for more than just multiplication in Python. Using it appropriately can make your code cleaner and more idiomatic.

Where It's Used

Numeric Multiplication

For the sake of completeness, I’ll get multiplication out of the way. The simplest example is multiplying two numbers:

>>> 5 * 5
25

Repeating Elements

We can use the star operator to repeat characters in a string:

>>> 'a' * 3
'aaa'
>>> 'abc' * 2
'abcabc'

Or to repeat elements in lists or tuples:

>>> [1] * 4
[1, 1, 1, 1]
>>> [1, 2] * 2
[1, 2, 1, 2]
>>> (1,) * 3
(1, 1, 1)
>>> [(1,2)] * 3
[(1, 2), (1, 2), (1, 2)]
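
A few edge cases of repetition are easy to check for yourself. The following is a minimal sketch; the variable names are made up for illustration:

```python
# The count can sit on either side of the star.
left = 'ab' * 2
right = 2 * 'ab'

# A count of zero (or a negative count) yields an empty sequence.
empty_string = 'ab' * 0
empty_list = [1, 2] * -1

# A common use: pre-filling a list with an immutable value.
zeros = [0] * 5

print(left, right)         # abab abab
print(repr(empty_string))  # ''
print(empty_list)          # []
print(zeros)               # [0, 0, 0, 0, 0]
```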

However, we should be very careful with (or even avoid) repeating mutable elements (like lists). To illustrate:

    >>> x = [[3, 4]] * 2
    >>> print(x)
    [[3, 4], [3, 4]]

So far so good.

    >>> x[1].pop()
    4
    >>> print(x)
    [[3], [3]]

What happened? The star operator copies references, not objects, so every repeated element refers to the same underlying list. This isn't a problem when the element is immutable. But as we saw above, mutating one "copy" of a mutable element changes all of them. A better way to repeat mutable elements is a list comprehension, which evaluates the expression afresh for each element:

    >>> x = [[3, 4] for _ in range(2)]
    >>> x[1].pop()
    4
    >>> print(x)
    [[3, 4], [3]]
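
One way to confirm the sharing is to compare object identities with `is`. The sketch below builds a small 2x3 grid both ways; the names `shared` and `independent` are illustrative only:

```python
# Every row is the same inner list, repeated by reference.
shared = [[0] * 3] * 2

# Each row is a fresh inner list, created once per loop iteration.
independent = [[0] * 3 for _ in range(2)]

shared[0][0] = 1
independent[0][0] = 1

print(shared)                            # [[1, 0, 0], [1, 0, 0]]
print(independent)                       # [[1, 0, 0], [0, 0, 0]]
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```

Note that the inner `[0] * 3` is safe in both cases, because the repeated element (an integer) is immutable.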

Unpacking

If you understand containers and iterables in Python, unpacking will make intuitive sense. Otherwise, it might seem a little mysterious. So let's understand those first:

  • Container: An object that holds other objects, such as numbers, strings and other containers. Lists, tuples and dictionaries are examples of containers in Python.
  • Iterable: The official Python glossary defines an iterable as "an object capable of returning its members one at a time." Any object whose elements you can iterate over using a for loop falls into this category. Thus lists, tuples, dictionaries, strings and range objects are all examples of iterables.
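
If you want to check whether something is iterable, one option is the `Iterable` abstract base class from `collections.abc`. This is only a sketch: the check looks for an `__iter__` method, so it can miss the rare object that is iterable solely through `__getitem__`.

```python
from collections.abc import Iterable

examples = [[1, 2], (1, 2), {'a': 1}, 'abc', range(3)]

# All of the types named above pass the check.
all_iterable = all(isinstance(obj, Iterable) for obj in examples)

# A plain number does not.
number_is_iterable = isinstance(5, Iterable)

# Iterating over a dictionary yields its keys.
dict_members = list({'a': 1, 'b': 2})

print(all_iterable)        # True
print(number_is_iterable)  # False
print(dict_members)        # ['a', 'b']
```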

With those definitions out of the way, we can define unpacking simply as extracting elements from an iterable into its enclosing container. Based on this definition, try to guess the output of the following snippet:

>>> x = [*[3, 5], 7]
>>> print(x)

Here, the inner iterable is a list with 3 and 5, which is inside an outer list (container). Extracting the elements of the inner list into the outer list gives us:

>>> print(x)
[3, 5, 7]

There is nothing special about a list as an iterable. Some other examples:

>>> [1, 2, *range(4, 9), 10]
[1, 2, 4, 5, 6, 7, 8, 10]
>>> (1, *(2, *(3, *(4, 5, 6))))
(1, 2, 3, 4, 5, 6)
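
The same holds for strings, dictionaries, generators and sets. A short sketch with made-up variable names:

```python
# A string unpacks into its characters.
letters = [*'abc']

# A dictionary unpacks into its keys.
keys = [*{'x': 1, 'y': 2}]

# A generator expression is consumed as it is unpacked into a tuple.
squares = (*(n * n for n in range(4)),)

# Unpacking into a set drops duplicates, as any set would.
merged = {*[1, 2], *(2, 3)}

print(letters)  # ['a', 'b', 'c']
print(keys)     # ['x', 'y']
print(squares)  # (0, 1, 4, 9)
print(merged)   # {1, 2, 3}
```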

Note that an enclosing container must exist. For example, the following doesn't work:

>>> *[1, 2]
  File "<stdin>", line 1
SyntaxError: can't use starred expression here
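
The enclosing container does not need brackets, though. A trailing comma is enough to make a tuple, so the starred expression has somewhere to go. The sketch below uses a small hypothetical helper, `compiles`, to compare the two forms without crashing the script:

```python
def compiles(source):
    """Return True if `source` is accepted by the Python compiler."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return False
    return True

bare_star = compiles('*[1, 2]')        # no enclosing container
with_comma = compiles('t = *[1, 2],')  # trailing comma creates a tuple

t = *[1, 2],

print(bare_star)   # False
print(with_comma)  # True
print(t)           # (1, 2)
```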

Starred Expression and Extended Iterable Unpacking

Extended iterable unpacking sounds really complicated, but the syntax is quite straightforward. Suppose you wanted to write a function that took in an iterable as input, and returned all but the first element, as a list. Without using extended iterable unpacking (we'll get to that in a minute), you might write something like this:

def all_but_first(seq):
    it = iter(seq)
    next(it)
    return [*it]

Let's test this:

>>> all_but_first(range(1, 5))
[2, 3, 4]

Perfect. Now let's use extended iterable unpacking.

def all_but_first(seq):
    first, *rest = seq
    return rest

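Very clean! And if you test this, you'll see that it returns the same result as the previous one for any non-empty iterable. (For an empty iterable both versions raise an exception, just a different one: StopIteration in the first, ValueError in the second.)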
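
The starred name does not have to come last. It can appear anywhere among the targets, as long as there is only one of it, and it always receives a list. A brief sketch with illustrative names:

```python
# Star in the middle: it soaks up whatever the other targets leave over.
first, *middle, last = range(5)

# Star at the front.
*init, tail = 'abc'

# The starred name may end up empty.
head, *rest = [42]

# The unstarred targets still need a value each.
try:
    only, *others = []
    raised = False
except ValueError:
    raised = True

print(first, middle, last)  # 0 [1, 2, 3] 4
print(init, tail)           # ['a', 'b'] c
print(head, rest)           # 42 []
print(raised)               # True
```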

Behind the Scenes

How does the same operator (*) perform so many different functions? To understand this, we need to dig deeper into Python. Everything in Python is an object. If you’re not familiar with the object-oriented programming paradigm, you can think of objects as self-contained entities that have properties (called attributes) and can perform actions (called methods), much like real-world objects. Objects are created using blueprints called classes. A class also has attributes and methods. But, just as the map is not the territory, the class is not the object: a class merely describes the attributes and methods of its objects, while the object actually has the attributes and executes the methods.

Multiplication and Repeating Elements

In Python, classes have special pre-defined “double underscore” methods. The most familiar one is probably the __init__ method, which is used to initialize objects. They are also called dunder or magic methods. They are called magic methods because they are invoked behind the scenes and almost never called directly. For example, consider the following class:

class Person:
    def __init__(self, name):
        self.name = name

    def __call__(self):
        print(f'I am {self.name}.')

>>> oreo = Person('Oreo')
>>> kitkat = Person('Kit Kat')

Instantiating a Person object calls the __new__ method (for creating the object) and the __init__ method (for initializing the object) behind the scenes. And __call__ is a magic method which allows me to do the following:

>>> oreo()
I am Oreo.
>>> kitkat()
I am Kit Kat.

This is the same as:

>>> oreo.__call__()
I am Oreo.
>>> kitkat.__call__()
I am Kit Kat.

Cool! And as you might have guessed, the star operator also has an underlying magic method: __mul__. For two integers, the following are equivalent:

>>> 25 * 4
100
>>> (25).__mul__(4)
100

Thus, different objects display different behavior when the star operator is used on them because the underlying magic method __mul__ has different definitions in the corresponding class. For strings and lists:

>>> 'bana'.__mul__(3)
'banabanabana'
>>> [2].__mul__(4)
[2, 2, 2, 2]
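
You can give your own classes the same behavior by defining __mul__. The sketch below uses a hypothetical `Playlist` class, invented purely for illustration. It also defines __rmul__, the reflected method Python falls back on when the left operand's __mul__ returns NotImplemented, so that `2 * playlist` works as well:

```python
class Playlist:
    """Hypothetical example class that supports repetition with *."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __mul__(self, times):
        if not isinstance(times, int):
            # Tell Python this operand type is not handled here.
            return NotImplemented
        return Playlist(self.songs * times)

    # Called for `int * Playlist`, after int.__mul__ gives up.
    __rmul__ = __mul__


doubled = Playlist(['intro', 'outro']) * 2
reflected = 2 * Playlist(['intro'])

# The operator does slightly more than a direct method call:
direct = (3).__mul__('ab')  # int does not know how to multiply by a str
via_operator = 3 * 'ab'     # Python falls back to str.__rmul__

print(doubled.songs)             # ['intro', 'outro', 'intro', 'outro']
print(reflected.songs)           # ['intro', 'intro']
print(direct is NotImplemented)  # True
print(via_operator)              # ababab
```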

Unpacking and Extended Iterable Unpacking

While __mul__ explains the magic behind multiplication and repeating elements, it does not explain unpacking or extended iterable unpacking. That makes sense: multiplication and repetition use * as a binary operator between two operands, while the others use it as a prefix on a single expression. That prefix form is part of Python's syntax rather than an operator backed by a magic method, so the underlying mechanics are different.

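To explain these, I will use Python's dis module. It stands for "disassembler" and is used to inspect Python bytecode. The Python Glossary defines Python bytecode as "the internal representation of a Python program in the CPython interpreter". Bytecode is an implementation detail, so the exact instructions and offsets vary between CPython versions and your output may differ from what is shown here. You'll see what I mean.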

>>> import dis
>>> dis.dis('[1, *(2, 3)]')
  1           0 LOAD_CONST               0 (1)
              2 BUILD_LIST               1
              4 LOAD_CONST               1 ((2, 3))
              6 LIST_EXTEND              1
              8 RETURN_VALUE

This shows that the list [1] is built first and then extended with (2, 3). It is roughly what the following does:

>>> l = [1]
>>> l.extend((2, 3))
>>> l
[1, 2, 3]

This explains why we can do unpacking only inside containers—outside of containers, there wouldn't be anything to extend.
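
Because the list is extended by iterating over the unpacked object, any class that implements __iter__ can be unpacked this way. Here is a minimal sketch with a hypothetical `Countdown` class, invented for illustration:

```python
class Countdown:
    """Hypothetical iterable that counts down from `start` to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


# Unpacking with the star...
unpacked = [0, *Countdown(3)]

# ...gives the same result as building the list and extending it by hand.
extended = [0]
extended.extend(Countdown(3))

print(unpacked)  # [0, 3, 2, 1]
print(extended)  # [0, 3, 2, 1]
```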

As for extended iterable unpacking, there's a special bytecode instruction called UNPACK_EX to do just that. To illustrate:

>>> dis.dis('a, *b = [1, 2, 3]')
  1           0 BUILD_LIST               0
              2 LOAD_CONST               0 ((1, 2, 3))
              4 LIST_EXTEND              1
              6 UNPACK_EX                1
              8 STORE_NAME               0 (a)
             10 STORE_NAME               1 (b)
             12 LOAD_CONST               1 (None)
             14 RETURN_VALUE
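
Reading the listing top to bottom: the list [1, 2, 3] is built, UNPACK_EX splits it into the first element and a list of the remaining ones, and the two STORE_NAME instructions bind those to a and b. As a rough mental model only (this is not how CPython implements the instruction), the behavior resembles the hypothetical helper `split_first` below:

```python
def split_first(iterable):
    """Rough pure-Python model of `a, *b = iterable` (illustrative only)."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError('not enough values to unpack') from None
    # The starred target always receives a list, whatever the source type.
    return first, list(iterator)


a, b = split_first([1, 2, 3])

# The real syntax behaves the same way, even when the source is a string.
c, *d = 'xyz'

print(a, b)  # 1 [2, 3]
print(c, d)  # x ['y', 'z']
```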

Conclusion

That’s it, folks! I really went down a rabbit hole. I hope you enjoyed reading that!