
Python Advanced Functions

Functions make your code more compact, more readable, more portable, and easier to debug. In this tutorial, you will learn and explore advanced Python features for functions, and put these concepts into practice by analyzing a data set. While we'll revisit the basics a bit, you should already have experience with functions to get the most out of this content.


Basics of functions

Let's start by defining a simple function that takes a numerical argument and returns its square.
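
A minimal sketch of such a function follows. The names x and number are illustrative choices, and the comments mark which is the parameter and which is the argument.

```python
def square(x):  # x is the parameter: a variable inside the function
    return x ** 2

number = 5
result = square(number)  # the value of number (5) is the argument
print(result)
```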


Quick terminology reminder: an argument is a value that we provide to the function. Python will then assign that argument to a parameter, which is a variable inside the function. Our code above indicates the distinction, which will be useful for us in this tutorial.

Python gives us a lot of flexibility in how we can work with functions: we can assign them to variables, store them in lists, provide them as arguments to other functions, and store them as class attributes, among other operations. This is an important concept for getting the most out of functional programming in Python.

To illustrate this idea, we assign our square() function to a variable named func. This new variable will simply point to the original function. We can then use func() to call square().
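
A short sketch of this assignment, with square() defined here so that the snippet runs on its own (the argument value is illustrative):

```python
def square(x):
    return x ** 2

func = square          # no parentheses: we assign the function object itself
print(func is square)  # both names refer to the same function
print(func(4))
```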


As a more challenging illustration, let's look at an example that shows how it can be useful to pass a function as an argument to another function. We code the following steps below.

  • Declare a function (let's call it apply()) that has two parameters, data and func.
  • The first parameter (data) will be a list of numbers. The second (func) will be a generic function.
  • Evaluate func() for each element of data.
  • Return the results as a list (we use a list comprehension to implement this step below).
  • Call apply() using a list of numbers and square as arguments.
  • Assign the result to a variable named squares.
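
The steps above can be sketched as follows. The input list is an illustrative choice.

```python
def square(x):
    return x ** 2

def apply(data, func):
    # Evaluate func for each element of data and collect the results in a list
    return [func(x) for x in data]

numbers = [1, 2, 3, 4, 5]
squares = apply(numbers, square)
print(squares)
```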



The apply() function code works with any generic function that takes a single numerical input.
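
For instance, a sketch of apply() can take math.sqrt from the standard library, or the built-in abs, in place of a hand-written function. The input values are illustrative.

```python
from math import sqrt

def apply(data, func):
    return [func(x) for x in data]

roots = apply([1, 4, 9, 16, 25], sqrt)
magnitudes = apply([-2, 3, -5], abs)
print(roots)
print(magnitudes)
```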


Our example is equivalent to using the Python map built-in function as follows.

from math import sqrt

numbers = [1, 4, 9, 16, 25]
list(map(sqrt, numbers))
[1.0, 2.0, 3.0, 4.0, 5.0]

Positional versus keyword arguments


Python functions accept different types of arguments, and it is useful to understand how we can leverage this. The key distinction is between keyword and positional arguments.

A keyword argument directly names the parameter that we want to assign it to (using the parameter's name inside the function). A positional argument is an argument that is not a keyword argument; Python assigns it to a parameter according to its position in the call.

In the code below, we pass 10 as a positional argument and 2 as a keyword argument to a function called divide().
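
A minimal sketch of divide(), with parameter names taken from the calls shown in this section:

```python
def divide(numerator, denominator):
    return numerator / denominator

# 10 is a positional argument; 2 is a keyword argument
result = divide(10, denominator=2)
print(result)
```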


We can use any of these syntax options to obtain the same result.

divide(10, 2)
divide(10, denominator=2)
divide(numerator=10, denominator=2)
divide(denominator=2, numerator=10)

The key syntax rule to remember is that, in a function call, positional arguments must precede keyword arguments.

# Incorrect syntax
divide(numerator=10, 2)
SyntaxError: positional argument follows keyword argument

Keyword arguments improve clarity and prevent errors by making our intentions more explicit. Compare the two options below:

divide(10, 2)
divide(numerator=10, denominator=2)

Keyword arguments also allow us to skip default arguments when calling the function, as we will now see.


Default arguments

To specify a default value for a parameter, we modify the function head by assigning the desired default argument to that parameter.
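
As an illustration, here is one possible hello() function with a default value for its name parameter. The default string 'World' and the return format are assumptions made for this sketch.

```python
def hello(name='World'):
    return 'Hello, ' + name + '!'

print(hello())            # no argument: Python uses the default
print(hello('Python'))    # a positional argument overrides the default
print(hello(name='Ada'))  # a keyword argument overrides the default
```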


If the function call includes neither a positional nor a keyword argument for that parameter, Python will use the default value instead.

When writing the function head, parameters that have default arguments must follow parameters that do not have default arguments.

# Incorrect syntax
def multiply(a=1, b):
    return a * b
SyntaxError: non-default argument follows default argument

Below, we modify the hello() function to accept a customized greeting in addition to a name. We set default arguments for both. In this case, we must use a keyword argument if we want to provide an argument for the second parameter (name) only.
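
A possible version of this function is sketched below. The default values 'Hello' and 'World' are assumptions chosen for illustration.

```python
def hello(greeting='Hello', name='World'):
    return greeting + ', ' + name + '!'

print(hello())                # both defaults
print(hello('Good morning'))  # positional: sets greeting only
print(hello(name='Ada'))      # keyword: skips greeting and sets name only
```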


Required versus optional arguments


We can write functions that accept optional positional or keyword arguments by using the *args and **kwargs syntax. The following example illustrates this process.
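
A sketch of such a function follows. The name show_args and the argument values are hypothetical. The function also returns args and kwargs so that we can inspect them after the call.

```python
def show_args(x, y, *args, **kwargs):
    print('Required:', x, y)
    print('Optional positional (tuple):', args)
    print('Optional keyword (dict):', kwargs)
    return args, kwargs

extra_positional, extra_keyword = show_args(1, 2, 3, 4, z=5, w=6)
```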


The code creates a function that requires two positional arguments, x and y, and accepts any number of optional positional and keyword arguments. We can see that Python:

  • Assigns the optional positional arguments to a tuple named args.
  • Assigns the optional keyword arguments to a dictionary named kwargs.

We can then access the optional arguments by working with the args and kwargs variables as we normally would with any other tuples and dictionaries.

In the next example, we modify the code from above to retrieve and print a variable named z, if it appears as an optional keyword argument.
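
One way to do this is to test whether 'z' is a key of kwargs. The function name show_z is hypothetical, and returning the value is an addition made here so that the behavior is easy to check.

```python
def show_z(x, y, *args, **kwargs):
    # kwargs is an ordinary dictionary, so we can test for a key
    if 'z' in kwargs:
        print('z =', kwargs['z'])
        return kwargs['z']
    return None

show_z(1, 2, z=3)  # prints the value of z
show_z(1, 2)       # prints nothing, because z was not supplied
```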


Let's look at another example that further develops the concept of optional arguments. Below, we create a function called add(), which sums as many optional positional arguments as we supply to the function.
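
A minimal sketch of such an add() function, using an explicit loop over the args tuple:

```python
def add(*args):
    total = 0
    for value in args:  # args is a tuple of all positional arguments
        total += value
    return total

print(add())         # no arguments: the sum is 0
print(add(1, 2, 3))
```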


As a related concept, we can use tuples and dictionaries as an alternative way to provide multiple positional and keyword arguments to a function. Let's look at an example.

def add(a, b):
    return a+b

x = (1,2)
c = add(*x) # equivalent to add(1,2)

z = {'b': 3, 'a': 2}
d = add(**z) # equivalent to add(b=3, a=2)

In the above snippet, the * before the tuple breaks it down into individual elements which then become positional arguments for the function. In the same way, the ** before the dictionary provides each individual item as a keyword argument, using the corresponding dictionary keys as the keywords. We refer to this as tuple or dictionary unpacking.

Sometimes, we may worry that users can make mistakes by confusing the order of positional arguments. You can prevent this problem by making keyword arguments compulsory for certain parameters. The basic syntax is as follows.

# The function will only accept a keyword argument for z, because it follows *
def required_keyword(x, y, *, z): 
    print(x, y, z)

# Correct
required_keyword(1, 2, z=3)

# Incorrect because we try to set z with a positional argument
required_keyword(1,2,3)

More generally, keyword arguments will be required for parameters that appear after *args in the function definition.
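
For example, in the sketch below (with hypothetical names), scale follows *values in the definition and can therefore only be set by keyword.

```python
def scaled_sum(*values, scale=1):
    return scale * sum(values)

print(scaled_sum(1, 2, 3))        # 3 is collected into values, scale stays 1
print(scaled_sum(1, 2, scale=3))  # scale must be passed by keyword
```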


Anonymous functions


A lambda expression in Python is a shortcut for defining a function without the def syntax. It creates an anonymous function, as the lambda syntax does not assign a name to the function.

We typically use the lambda syntax as a convenient way to pass a simple single-expression function as an argument to a callable.

For example, suppose that we want to use our apply() function from earlier to square the values of a list of numbers. However, we do not wish to create a separate square function just for this purpose. The code below uses the lambda syntax to achieve this goal.
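
A sketch of this idea, with apply() defined here so that the snippet runs on its own (the input list is illustrative):

```python
def apply(data, func):
    return [func(x) for x in data]

# The lambda expression replaces a separately defined square() function
squares = apply([1, 2, 3, 4], lambda x: x ** 2)
print(squares)
```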

A common use of lambda expressions is to sort data structures such as dictionaries according to alternative keys. For example, suppose that we want to retrieve the items of a dictionary sorted by their values.

example_dict = {'a': 4,  'c' : 2, 'd': 1, 'b' : 3}

# The .items() method returns a view of the dictionary's (key, value) pairs as tuples
sorted_by_values = sorted(example_dict.items(), key = lambda x: x[1])
print(sorted_by_values)
[('d', 1), ('c', 2), ('b', 3), ('a', 4)]

A lambda function behaves like any other function, as the following example illustrates.
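
A brief sketch: the lambda below is bound to the name cube purely for demonstration (style guides such as PEP 8 recommend def for named functions). We can call and inspect it like a function created with def, except that its __name__ is the generic '<lambda>'.

```python
cube = lambda x: x ** 3  # bound to a name for illustration only

print(cube(3))
print(callable(cube))
print(cube.__name__)
```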

You should use lambda functions sparingly, and make sure that they do not make your code harder to read. Do:

numbers = [1, 2, 3, 4]
squares = [x**2 for x in numbers]

Instead of:

numbers = [1, 2, 3, 4]
list(map(lambda x: x**2, numbers))

Decorators


Python decorators give us the flexibility to add generic capability to functions and other callables, without changing the functions themselves. A decorator can be a powerful productivity tool, since it is reusable across multiple functions and projects.

To make this more concrete, suppose that we want to modify some of our existing functions so that they will automatically print an output, instead of running silently.

With this goal in mind, let's first make it clear that a decorator is a particular type of function that returns a modified version of another function.

The code below implements the following steps:

  • Define a function called print_output() (this will be our decorator), which takes a function as an argument and assigns it to a parameter called func.
  • Within the decorator, define another function called wrapper.
  • The wrapper calls func(), prints its output value, and returns the same output as the original function.
  • The print_output() decorator returns wrapper (the function itself, without calling it), which is therefore a modified version of func.

def print_output(func):
    
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        result = func(*args, **kwargs)
        print('Output:', result, '\n')
        return result
    
    return wrapper

We used some advanced concepts here:

  • We defined a function inside a function.
  • We returned a function as the output of a function.

Let's apply our decorator to an example. One way to achieve this, based on familiar syntax, is as follows.
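
A sketch of this approach, repeating the print_output() decorator so that the snippet runs on its own (the argument value is illustrative):

```python
def print_output(func):
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        result = func(*args, **kwargs)
        print('Output:', result, '\n')
        return result
    return wrapper

def square(x):
    return x ** 2

# Pass square to the decorator and keep the returned wrapper under a new name
square_with_output = print_output(square)
result = square_with_output(3)
```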


This version of the code assigns the modified version of the square() function to a new variable, square_with_output. However, Python provides an abbreviated syntax: we can apply a decorator directly to a function definition by writing @ followed by the decorator name on the line above def. The code below illustrates the full process, using our print_output() decorator as an example.
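
A sketch of the @ syntax, again with print_output() repeated so that the snippet is self-contained:

```python
def print_output(func):
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        result = func(*args, **kwargs)
        print('Output:', result, '\n')
        return result
    return wrapper

@print_output  # same effect as writing square = print_output(square) after the def
def square(x):
    return x ** 2

result = square(4)
```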


We can go further and apply multiple decorators to a single function. In order to do this, we stack the decorators above the function definition. Python will then apply the decorators from bottom to top, as the next example illustrates.

def print_output(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('Output:', result)
        return result
    
    return wrapper

def print_input(func):    
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('Arguments:', args, ', ', kwargs)
        return result
    
    return wrapper

def print_name(func):    
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        result = func(*args, **kwargs)
        return result
 
    return wrapper
    
    

@print_output
@print_input
@print_name
def add(a, b):
    return a+b

c = add(a=5, b=2)
Calling add
Arguments: () ,  {'a': 5, 'b': 2}
Output: 7

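Python applies @print_name first because it is the decorator closest to the function, then @print_input, and finally @print_output. When we call add(), the print_name wrapper prints the function name before running add(), whereas the other two wrappers print only after the function they wrap has returned. The messages therefore appear from the innermost decorator outward: the name first, then the input (the next decorator going up in the stack, @print_input), and finally the output (the top decorator).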

As a final detail, the best practice for working with decorators is to add the functools.wraps decorator from the Python standard library to the wrapper. This will ensure that the decorated function inherits the metadata from the original function.
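
A minimal sketch of this practice follows. The decorated function and its docstring are illustrative.

```python
import functools

def print_output(func):
    @functools.wraps(func)  # copies __name__, __doc__, etc. from func to wrapper
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('Output:', result)
        return result
    return wrapper

@print_output
def square(x):
    """Return the square of x."""
    return x ** 2

print(square.__name__)
print(square.__doc__)
```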


Remove the @functools.wraps line from above and run the code again to check that without it, the decorated function loses the __name__ attribute of the original function.

Decorators are one of the most challenging concepts in Python, but you will master them with practice.


Annotations

Python allows us to add annotation information regarding the arguments and the output of a function. The annotations are arbitrary and in principle available for information only: Python does not use them in any way when we call the function.

The code below illustrates the syntax.
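
A sketch of the annotation syntax, using a hypothetical function add_three(). The annotation on y is a plain string, to show that annotations can be arbitrary expressions, and z has a default argument of one.

```python
def add_three(x: int, y: 'any label', z: float = 1) -> float:
    return x + y + z

# Python stores the annotations but does not enforce them when we call the function
print(add_three.__annotations__)
print(add_three(1, 2))
```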


We can see that Python stores the annotations as a dictionary in the __annotations__ attribute of the function.

The annotation syntax is compatible with default arguments. In our example, the default argument for z is one. The annotation (initiated with :) must precede the default argument.


Project: Summarizing a data set


In this project, you will code a function that reports a customized numerical summary of a data series. To illustrate this process, we will use data from the Kaggle House Prices knowledge competition. Our data file (which you can download from here) contains prices (in thousands of dollars) for houses sold in the city of Ames, Iowa. The house prices are the target variable in the competition.

Here is what the data look like.


Within a data science project, a tool such as the function that you will develop is useful during Exploratory Data Analysis (EDA), which is the process of describing and understanding a data set.

[Figure: EDA. Image credit: Farcaster at English Wikipedia, via Wikimedia Commons]

Step 1: Creating the function

Create a function called summary() according to the following specifications:

  1. The function should have one required argument, the data series (as a list of numerical values).

  2. The function should compute the number of observations (use the len built-in function) and the average (use the mean function imported below) of the series. Return the results as a dictionary.

  3. The function should have an optional parameter called extended with False as a default argument. If the user sets extended to True, the function should additionally compute the standard deviation (use the NumPy std function imported below), and the minimum and maximum value of the series (use the Python built-ins min and max for the last two).

  4. The function should accept additional functions as optional keyword arguments. These functions should compute additional summary statistics as desired (see suggestions below).

# Average
from numpy import mean

# Standard deviation
from numpy import std

# Suggestion for an optional keyword argument
from numpy import median

Step 2: Using the function

  1. Load the data and assign it to a list called prices.

  2. Compute the summary statistics using the required and optional keyword arguments. Set extended to True.

  3. As a challenge, include lambda expressions among the optional keyword arguments (see hint below).

# For trying a lambda expression:
from numpy import percentile

# First quartile (25% of observations have value equal to or lower than the result)
percentile(prices, 25)

# Third quartile (75% of observations have value equal to or lower than the result)
percentile(prices, 75)

Step 3: Print an output

  1. Use the print_dict function defined below to print the summary statistics based on the output of summary.

  2. As a challenge, build on this code to create a decorator that prints the output of summary.

def print_dict(data):
    max_key_len = max([len(key) for key in data.keys()]) 
    for key, value in data.items():
        print(key, ' '*(max_key_len+1 - len(key)), round(value, 2))

Your turn: