
Python Functions

When writing a program, you could keep rewriting the same code, but that would be tedious and would result in an unnecessarily large program. Functions solve this problem: they are self-contained routines that perform a specific task and that you can incorporate into your program. After defining a function, you can use it whenever you need it, saving time and resources.

A function takes in values, transforms those values, and returns outputs, like so:

[Figure: a function machine that takes in inputs and returns outputs. Image: http://www.ComputerScienceWiki.org]

In this lesson, we will:

  • Cover the syntax and structure of a function.
  • Demonstrate how to use a function.
  • Define arguments and discuss the different kinds of arguments.
  • Develop a function that takes in a set of data, extracts the salaries from the data, and returns the average salary for White House Staff.

The anatomy of a function

In Python, you create a function by defining it. To define a function, you need the following:

  • def followed by a function name.
    • PEP 8, the standard style guide for Python, suggests that function names be lowercase, with words separated by underscores as necessary to improve readability.
  • A pair of parentheses that contains parameters (more on parameters in a bit).
  • A colon (:) at the end.

Below is an example of the basic structure of a function. Don't worry if it doesn't make a lot of sense yet; we'll walk through each piece.

def function_name(parameters): 
    a = 2 + 2
    b = 2 * a
    c = 3 * a
    return None

A function is one type of code block in Python. A code block is a group of statements that perform a given task. A block begins with a header, followed by a group of indented statements (more on statements below). The indentation tells Python, and everyone reading the code, that the statements belong to the given header.

Now let's look at each piece of the function.

def function_name(parameters):

As mentioned, this line contains the function name and the parentheses, which hold the function's parameters. Parameters are placeholders for the actual values the function will work with.

    a = 2 + 2
    b = 2 * a
    c = 3 * a

These are what we call statements. Statements are steps the function performs to transform the parameters into a result returned by the function. You can include if/else statements, calculations, or return statements. Speaking of return statements...

return None

A return statement ends the execution of the function and "returns" the value of the expression that follows the return keyword. In this case, we're telling the function to return None. None is a special value (the only value of the type NoneType) that signals the absence of a value.

A function doesn't strictly need a return statement, as we'll see below, but you need one whenever you want to hand a result back to the caller. In addition to returning None, you can return one value or multiple values. When a function returns multiple values, Python packs them into a tuple. For the remainder of the tutorial, we'll be seeing functions that return a single value.
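
To make the tuple behavior concrete, here is a minimal sketch. The function name min_and_max is hypothetical and only serves as an illustration of returning more than one value.

```python
def min_and_max(values):
    # Two comma-separated values after return are packed into one tuple.
    return min(values), max(values)

result = min_and_max([3, 1, 4])
print(result)        # (1, 4)
print(type(result))  # <class 'tuple'>

# The tuple can be unpacked into separate variables.
lowest, highest = min_and_max([3, 1, 4])
print(lowest, highest)  # 1 4
```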

You may wonder what happens if you don't return None or any other value. The following script is an example of a function whose return statement does not say what to return.

def addition(a,b):
    c = a + b
    return

print(addition(5,8))

In the above script, you'd probably expect the script to print 13. You would be right ... except that we didn't tell the function to return anything, so it prints None instead. A return statement with no expression is the same as return None. The function will also return None if you don't include a return statement at all.
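
As a quick comparison, the sketch below puts three variants side by side. The names addition_bare_return and addition_no_return are hypothetical, chosen only to tell the variants apart.

```python
def addition_bare_return(a, b):
    c = a + b
    return  # bare return: the caller receives None

def addition_no_return(a, b):
    c = a + b  # no return statement at all: the caller also receives None

def addition(a, b):
    c = a + b
    return c  # the caller receives the sum

print(addition_bare_return(5, 8))  # None
print(addition_no_return(5, 8))    # None
print(addition(5, 8))              # 13
```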

Writing your own Function

Let's practice writing our own function, func_name, that has no parameters and returns None. Later on in this tutorial, we'll create a more complex function.

In the code editor below:

  • Start off building your function with def.
  • Follow def with your function name. In this case, it's func_name.
  • Include a pair of empty parentheses, since this function has no parameters.
  • Include your colon (:) after the closing parenthesis.
  • Press enter to begin writing the function.
  • Indent to include statements to tell the function what to do. By default, the editor will indent for you. Since we're just having this function return None, you won't need to write any other statements.
  • At the end of the function, include return None.
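
If you follow the steps above, the result should look roughly like this sketch. The two print calls are an addition for checking the result outside the interactive editor.

```python
def func_name():
    return None

# Calling the function gives back None.
print(func_name())      # None

# The name func_name now refers to a function object.
print(type(func_name))  # <class 'function'>
```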

If you look under variables on the right-hand side, you should see your function name. If you click the arrow, you should see function (<class function>). If you do, excellent job! You just created your first function all by yourself.

However, the function looks a tad boring. Now that we have a handle on the structure of a function, let's use a concrete example to build a more complex function.

Loading in and previewing the data

[Figure: the south facade of the White House. Source: http://www.thedp.com]

We'll be using salary data for the 2010 White House staff, and will build a function that calculates the average salary. You can find the data by clicking here. Once there, use your mouse to right-click on the Raw button and click Save Link As. Make sure the format is comma-separated values.


Let's load in and print out a sample of our data set using the following code.

Before we load in the data, we should briefly discuss modules. Modules are simply source files that define functions, classes, and other useful tools. Since we're working with a CSV file, we'll use the csv module, which exists to read such files and split each row into its individual columns.

In the below code we:

  • Import the CSV module.
  • Open the Congress_White_House.csv file.
    • With the file open, create a csv.reader() object.
    • Separate the list on the , delimiter.
  • Assign the list containing the csv.reader() object to a variable named wh.
  • Print the first six items of the list.

# Import the csv module
import csv

# Open the file for reading; f is the file object
with open("Congress_White_House.csv", 'r') as f:
    # csv.reader splits each line of f on the ',' delimiter;
    # list() collects the resulting rows into a list of lists
    wh = list(csv.reader(f, delimiter=","))

# Print the first six items of the list
print(wh[:6])

Running the previous code, we get:

[['Employee Name', 'Employee Status', 'Salary', 'Pay Basis', 'Position Title'], ['Abrams, Adam W.', 'Employee', '66300.00', 'Per Annum', 'WESTERN REGIONAL COMMUNICATIONS DIRECTOR'], ['Adams, Ian H.', 'Employee', '45000.00', 'Per Annum', 'EXECUTIVE ASSISTANT TO THE DIRECTOR OF SCHEDULING AND ADVANCE'], ['Agnew, David P.', 'Employee', '93840.00', 'Per Annum', 'DEPUTY DIRECTOR OF INTERGOVERNMENTAL AFFAIRS'], ['Albino, James ', 'Employee', '91800.00', 'Per Annum', 'SENIOR PROGRAM MANAGER']]
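
To experiment with csv.reader without downloading the file, you can feed it an in-memory sample. This is only a sketch: io.StringIO stands in for the real file, the two data rows are copied from the output above, and the quoting of the name field is an assumption about how the file stores names that contain commas.

```python
import csv
import io

# In-memory stand-in for the CSV file (header plus two rows from the output above).
sample = io.StringIO(
    "Employee Name,Employee Status,Salary,Pay Basis,Position Title\n"
    '"Abrams, Adam W.",Employee,66300.00,Per Annum,WESTERN REGIONAL COMMUNICATIONS DIRECTOR\n'
    '"Albino, James ",Employee,91800.00,Per Annum,SENIOR PROGRAM MANAGER\n'
)

rows = list(csv.reader(sample, delimiter=","))

# The first row is the header; everything after it is data.
header, data = rows[0], rows[1:]

print(header[2])   # Salary
print(data[0][0])  # Abrams, Adam W.
print(data[0][2])  # 66300.00 (still a string, not a number)
```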

Let's format this data into a table so it's easier to read. The first list within the list contains the column names:

'Employee Name', 'Employee Status', 'Salary', 'Pay Basis', 'Position Title'

Based on the above, our table looks like this:

Employee Name | Employee Status | Salary | Pay Basis | Position Title
Abrams, Adam W. | Employee | $66300.00 | Per Annum | WESTERN REGIONAL COMMUNICATIONS DIRECTOR
Adams, Ian H. | Employee | $45000.00 | Per Annum | EXECUTIVE ASSISTANT TO THE DIRECTOR OF SCHEDUL...
Agnew, David P. | Employee | $93840.00 | Per Annum | DEPUTY DIRECTOR OF INTERGOVERNMENTAL AFFAIRS
Albino, James | Employee | $91800.00 | Per Annum | SENIOR PROGRAM MANAGER
Aldy, Jr., Joseph E. | Employee | $130500.00 | Per Annum | SPECIAL ASSISTANT TO THE PRESIDENT FOR ENERGY ...

Some things to keep in mind before we start building our function that computes the average salary:

  • The salaries are in the third column of the data set.
  • The first row of the data set is the header.
  • You can use len() to find the number of items in a list.
  • You can use sum() to compute the sum of a list.
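
Here is a small sketch of len() and sum() at work on three salaries taken from the preview above. One detail the list does not mention: csv.reader returns every value as a string, so the sketch assumes each salary is converted with float() before summing.

```python
# Three salary strings, as csv.reader would return them.
raw_salaries = ["66300.00", "45000.00", "91800.00"]

# sum() cannot add strings to numbers, so convert each value first.
salaries = []
for raw in raw_salaries:
    salaries.append(float(raw))

print(len(salaries))                  # 3
print(sum(salaries))                  # 203100.0
print(sum(salaries) / len(salaries))  # 67700.0
```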

Using the above pieces of information, we can build our function.

In the code editor below:

  • Use def to start building your function.
  • Following PEP8 standards, name our function avg_sal.
  • Insert a pair of parentheses with df being your parameter, followed by a colon (:) after your closing parenthesis.
  • In the body of the function, create an empty list using list(), and assign it to the variable salary.
  • Use a for loop to iterate through the salary column to populate the empty list. Skip the first row, since it is the header.
    • Use the .append() method to add each salary to the list. The csv module reads every value as a string, so convert each salary with float() first.
  • Outside of the loop, have the function return the average salary. That is, return sum(salary) / len(salary).

Now we can use the function to compute the average salary of any data set where the salary is in the third column. Below, we'll discuss how to call, or use, functions.

Using functions

When using, or calling, a function, we write the function's name followed by a pair of parentheses containing all of its arguments.

With that in mind, to display the mean salary for White House staff, we call avg_sal inside the print function, which prints whatever is inside its parentheses. The code is:

print(avg_sal(wh))

At this point, you may be thinking, "What's wh doing within the parentheses? Shouldn't the variable be df?". Remember, df is our parameter, i.e., a placeholder variable. As for wh, it is what we call an argument.

What are arguments?

As we've said, parameters are placeholders for the actual values used in the function. We use parameters since functions are generalized blocks of code that can be reused and adapted for different situations as needed. Arguments are the actual values assigned to the parameters when the function is called.
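
The distinction is easiest to see in code. In this sketch, the function double and the variable value are hypothetical names used only for illustration.

```python
def double(number):   # number is the parameter (the placeholder)
    return number * 2

print(double(4))      # 4 is the argument; prints 8

value = 10
print(double(value))  # value is the argument; prints 20
```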

In this section, we'll cover three types of arguments, and look at some examples:

  • Required Arguments
  • Keyword Arguments
  • Default Arguments

Required Arguments

Required arguments are necessary for the function to execute. The number of arguments supplied when calling the function should match the number of parameters in the function's definition. Put concretely, if you define your function with two parameters, your Python interpreter will expect two arguments when executing your function.

For example, if you pass only one argument when calling our function, PowerC, below:

def PowerC(a, b):
    return None

## Call our function
print(PowerC(3))

We would receive the following error:

File "/Users/randallhall/blog.py", line 24, in <module>
    print(PowerC(3))
TypeError: PowerC() missing 1 required positional argument: 'b'

In Python 2, you would see the following instead:

File "/Users/randallhall/blog.py", line 24, in <module>
    print(PowerC(3))
TypeError: PowerC() takes exactly 2 arguments (1 given)

We receive an error because our function expects two arguments since we defined the function with two parameters, and we only supplied one argument.
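
If you want to see this behavior without stopping your script, you can catch the error. The sketch below uses a hypothetical function, power, that actually computes something; the exact wording of the error message depends on your Python version, so it is printed rather than assumed.

```python
def power(base, exponent):
    return base ** exponent

# Two parameters, two arguments: the call succeeds.
print(power(2, 3))  # 8

# Two parameters, one argument: Python raises a TypeError.
try:
    power(2)
except TypeError as error:
    print("TypeError:", error)
```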

Keyword Arguments

When you use keyword arguments in a function call, you identify each argument by the name of its parameter in the function definition. By default, your Python interpreter matches arguments to parameters in positional order. Keyword arguments give you the flexibility of placing the arguments out of order, because each keyword is mapped to the matching parameter so the function can identify the corresponding value.

def PowerC(a, b):
    return None

## Now call the function
print(PowerC(b = 3, a = 6))
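
Because PowerC returns None, the effect of the keywords is not visible in its output. The sketch below uses a hypothetical function, power, whose result depends on which value goes to which parameter.

```python
def power(base, exponent):
    return base ** exponent

# Positional order: base=2, exponent=3.
print(power(2, 3))                # 8

# Keywords map each value to its parameter, so the order no longer matters.
print(power(exponent=3, base=2))  # 8

# Swapping the positional arguments changes the result.
print(power(3, 2))                # 9
```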

Default Arguments

When defining the function, you can also give a parameter a default value, which lets you omit the corresponding argument when calling the function. This type of argument is known as a default argument. If the argument is not supplied in the call, the function still executes, since the parameter has a predefined value.

def PowerC(a, b = 3):
    return None
    
## Now call the function
# Only provides one argument because b is declared to be three by default
print(PowerC(5))
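
To see a default value change a result, here is a sketch with a hypothetical function, power, whose exponent defaults to 2.

```python
def power(base, exponent=2):
    return base ** exponent

# exponent is omitted, so the default of 2 is used.
print(power(5))              # 25

# Supplying the argument overrides the default.
print(power(5, 3))           # 125
print(power(5, exponent=3))  # 125
```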

Let's tie this all together with a quick project. We can use our avg_sal function to compute the average salary of the 2010 White House staff using just one line of code.

Project: average White House salary

Since functions are generalized code blocks, we still have to make sure everything we supply as an argument is defined in our program. In particular, we can't compute the average salary without a variable that holds our data. With that in mind, let's load the data, build our function, and see the average salary in the White House.

Here are the steps you're going to want to take to load in the data set:

  • Import the csv library.
  • Open the Congress_White_House.csv file.
  • With the file open, create a csv.reader() object.
    • Separate the list on the , delimiter.
    • Assign the list containing the csv.reader() object to a variable named wh.

Now here are the steps you are going to want to take to create our function:

  • Use def to start building your function.
  • Following PEP8 standards, name our function avg_sal.
  • Insert a pair of parentheses with df being your parameter followed by a colon.
  • In the body of the function, create an empty list using list(), and set it to the variable salary.
  • Use a for loop to iterate through the salary column to populate the empty list.
    • Use the .append() method to add each salary to the list, converting it from a string to a number with float() first.
  • Outside of the loop, have the function return the average salary. That is, return sum(salary) / len(salary).

Below is what your code should look like, but you should try this on your own before consulting the solution.

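import csv

# Read every row of the CSV file into a list of lists
with open("Congress_White_House.csv", 'r') as f:
    wh = list(csv.reader(f, delimiter=","))

def avg_sal(df):
    salary = list()
    # Skip the header row (index 0)
    for item in df[1:]:
        # Salaries are read as strings, so convert them before summing
        salary.append(float(item[2]))
    return sum(salary) / len(salary)

## Function call
print(avg_sal(wh))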

Running the above code should print an average salary of ~$82721.34.
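
Since avg_sal only assumes a header row and a salary in the third column, it can be reused on other data. The sketch below repeats the function, with each salary converted by float() so that sum() receives numbers, and calls it on a small hypothetical data set (team) that is not part of the White House file.

```python
def avg_sal(df):
    salary = list()
    # Skip the header row, then convert each salary string to a float.
    for item in df[1:]:
        salary.append(float(item[2]))
    return sum(salary) / len(salary)

# Hypothetical data set, made up purely for illustration.
team = [
    ["Name", "Status", "Salary"],
    ["Ada", "Employee", "50000.00"],
    ["Grace", "Employee", "70000.00"],
]

print(avg_sal(team))  # 60000.0
```

Note that a data set containing only a header row would make len(salary) zero, so this version would raise a ZeroDivisionError.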

Next steps

As you can see, functions are an integral part of programming and can drastically reduce program sizes. What we've covered here is just the tip of the iceberg. Our Python Programming Beginner course at Dataquest covers functions in much more depth.

With Dataquest, you get hands-on experience through projects while learning more about functions and other programming concepts. In addition, you can see how functions fit into the field of data science. Our interactive courses will equip you with the skills you need to become proficient in data science, analysis, and engineering.