7 Intro to Functions and Conditionals

import pandas as pd
pd.options.display.max_rows = 7

7.1 Intro

So far in this course you have mostly used functions written by others. In this lesson, you will learn how to write your own functions in Python.

7.2 Learning Objectives

By the end of this lesson, you will be able to:

Create and use your own functions in Python.
Design function arguments and set default values.
Use conditional logic like if, elif, and else within functions.

7.3 Packages

Run the following code to load the packages needed for this lesson:

# Import packages
import pandas as pd
import numpy as np
import vega_datasets as vd

7.4 Basics of a Function

Let’s start by creating a very simple function. Consider the following function that converts pounds (a unit of weight) to kilograms (another unit of weight):

def pounds_to_kg(pounds):
    return pounds * 0.4536

If you execute this code, you will create a function named pounds_to_kg, which can be used directly in a script or in the console:

print(pounds_to_kg(150))

68.04

Let’s break down the structure of this first function step by step.

First, a function is created using the def keyword, followed by the function name, a pair of parentheses, and a colon.

def function_name():
    # Function body

Inside the parentheses, we indicate the arguments of the function. Our function only takes one argument, which we have decided to name pounds. This is the value that we want to convert from pounds to kilograms.

def pounds_to_kg(pounds):
    # Function body

Of course, we could have named this argument anything we wanted, e.g. p or weight.

The next element, after the colon, is the body of the function. This is where we write the code that we want to execute when the function is called.

def pounds_to_kg(pounds):
    return pounds * 0.4536

We use the return statement to specify what value the function should output.
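To see why the return statement matters, here is a minimal sketch comparing the function with a variant that leaves it out. The name pounds_to_kg_no_return is hypothetical and used only for illustration. In Python, a function that ends without a return statement gives back None:

```python
def pounds_to_kg_no_return(pounds):
    # Hypothetical variant: computes the value but never returns it
    kg = pounds * 0.4536


def pounds_to_kg(pounds):
    # Same calculation, but the result is handed back to the caller
    return pounds * 0.4536


missing = pounds_to_kg_no_return(150)
converted = pounds_to_kg(150)

print(missing)    # None, because nothing was returned
print(converted)  # the weight in kilograms
```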

You could also assign the result to a variable and then return that variable:

def pounds_to_kg(pounds):
    kg = pounds * 0.4536
    return kg

This is a bit more verbose, but it can make the function easier to read.

We can now use our function like this with a named argument:

pounds_to_kg(pounds=150)

68.04

Or without a named argument:

pounds_to_kg(150)

68.04

To use this in a DataFrame, you can create a new column:

pounds_df = pd.DataFrame({'pounds': [150, 200, 250]})
pounds_df['kg'] = pounds_to_kg(pounds_df['pounds'])
pounds_df

	pounds	kg
0	150	68.04
1	200	90.72
2	250	113.40

And that’s it! You have just created and used your first function in Python.

Practice

7.4.1 Age in Months Function

Create a simple function called years_to_months that transforms age in years to age in months.

Use it on the riots_df DataFrame imported below to create a new column called age_months:

riots_df = vd.data.la_riots()
riots_df

	first_name	last_name	age	gender	race	death_date	address	neighborhood	type	longitude	latitude
0	Cesar A.	Aguilar	18.0	Male	Latino	1992-04-30	2009 W. 6th St.	Westlake	Officer-involved shooting	-118.273976	34.059281
1	George	Alvarez	42.0	Male	Latino	1992-05-01	Main & College streets	Chinatown	Not riot-related	-118.234098	34.062690
2	Wilson	Alvarez	40.0	Male	Latino	1992-05-23	3100 Rosecrans Ave.	Hawthorne	Homicide	-118.326816	33.901662
...	...	...	...	...	...	...	...	...	...	...	...
60	Elbert O.	Wilkins	33.0	Male	Black	1992-04-30	Western Avenue & 92nd Street	Gramercy Park	Homicide	-118.310004	33.952767
61	John H.	Willers	37.0	Male	White	1992-04-29	10621 Sepulveda Blvd.	Mission Hills	Homicide	-118.467770	34.263184
62	Willie Bernard	Williams	29.0	Male	Black	1992-04-29	Gage & Western avenues	Chesterfield Square	Death	-118.308952	33.982363

63 rows × 11 columns

7.5 Functions with Multiple Arguments

Most functions take multiple arguments rather than just one. Let’s look at an example of a function that takes three arguments:

def calc_calories(carb_grams, protein_grams, fat_grams):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result

calc_calories(carb_grams=50, protein_grams=25, fat_grams=10)

The calc_calories function computes the total calories based on the grams of carbohydrates, protein, and fat. Carbohydrates and proteins are estimated to be 4 calories per gram, while fat is estimated to be 9 calories per gram.
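As a quick check of the arithmetic, the call above can be worked out term by term. The intermediate variable names below (carb_cal, protein_cal, fat_cal) are introduced only for this sketch:

```python
def calc_calories(carb_grams, protein_grams, fat_grams):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result


# Work through each term by hand
carb_cal = 50 * 4      # 200 calories from carbohydrates
protein_cal = 25 * 4   # 100 calories from protein
fat_cal = 10 * 9       # 90 calories from fat

total = calc_calories(carb_grams=50, protein_grams=25, fat_grams=10)
print(total)  # 390
```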

If you attempt to use the function without supplying all the arguments, it will yield an error.

calc_calories(carb_grams=50, protein_grams=25)

TypeError: calc_calories() missing 1 required positional argument: 'fat_grams'

You can define default values for your function’s arguments. If the function is called without a value for an argument, that argument takes its default value. Let’s make all arguments optional by giving them all default values:

def calc_calories(carb_grams=0, protein_grams=0, fat_grams=0):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result

Now, we can call the function with only some arguments without getting an error:

calc_calories(carb_grams=50, protein_grams=25)
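The sketch below shows how the defaults fill in for whichever arguments are left out. The result variable names are illustrative only:

```python
def calc_calories(carb_grams=0, protein_grams=0, fat_grams=0):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result


# fat_grams falls back to its default of 0
two_args = calc_calories(carb_grams=50, protein_grams=25)

# Named arguments let us skip straight to fat_grams
fat_only = calc_calories(fat_grams=10)

# With no arguments at all, every default is used
no_args = calc_calories()

print(two_args)  # 300
print(fat_only)  # 90
print(no_args)   # 0
```

Because the arguments are named in the call, their order does not matter and any of them can be omitted.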

Let’s use this on a sample dataset:

food_df = pd.DataFrame({
    'food': ['Apple', 'Avocado'],
    'carb_grams': [25, 10],
    'protein_grams': [0, 1],
    'fat_grams': [0, 14]
})
food_df['calories'] = calc_calories(food_df['carb_grams'], food_df['protein_grams'], food_df['fat_grams'])
food_df

	food	carb_grams	protein_grams	fat_grams	calories
0	Apple	25	0	0	100
1	Avocado	10	1	14	170

Practice

7.5.1 BMI Function

Create a function named calc_bmi that calculates the Body Mass Index (BMI) for one or more individuals, then apply the function by running the code chunk further below. The formula for BMI is weight (kg) divided by height (m) squared.

# Your code here

bmi_df = pd.DataFrame({
    'Weight': [70, 80, 100],  # in kg
    'Height': [1.7, 1.8, 1.2]  # in meters
})
bmi_df['BMI'] = calc_bmi(bmi_df['Weight'], bmi_df['Height'])
bmi_df

7.6 Intro to Conditionals: `if`, `elif`, and `else`

Conditional statements allow you to execute code only when certain conditions are met. The basic syntax in Python is:

if condition:
    # Code to execute if condition is True
elif another_condition:
    # Code to execute if the previous condition was False and this condition is True
else:
    # Code to execute if all previous conditions were False

Let’s look at an example of using conditionals within a function. Suppose we want to write a function that classifies a number as positive, negative, or zero.

def class_num(num):
    if num > 0:
        return "Positive"
    elif num < 0:
        return "Negative"
    else:
        return "Zero"

print(class_num(10))    # Output: Positive
print(class_num(-5))    # Output: Negative
print(class_num(0))     # Output: Zero

Positive
Negative
Zero

If you try to use this function on a DataFrame column the way we did above (for example, with the BMI function), you will get an error:

num_df = pd.DataFrame({'num': [10, -5, 0]})
num_df

	num
0	10
1	-5
2	0

num_df['category'] = class_num(num_df['num'])

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The reason is that if statements are not vectorized: they expect a single True or False value, not a Series of them. To get around this, we can use the np.vectorize function to create a vectorized version of the function:

class_num_vec = np.vectorize(class_num)
num_df['category'] = class_num_vec(num_df['num'])
num_df

	num	category
0	10	Positive
1	-5	Negative
2	0	Zero
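To see where the ValueError comes from, it can help to look at what the comparison inside the function produces when it receives a whole column. This is a minimal sketch; the names is_positive and raised are introduced only for illustration:

```python
import pandas as pd

num_df = pd.DataFrame({'num': [10, -5, 0]})

# Comparing a Series gives one True/False per row, not a single value
is_positive = num_df['num'] > 0
print(is_positive.tolist())  # [True, False, False]

# An if statement needs a single True or False, so pandas raises an error
try:
    if is_positive:
        pass
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```

Note that np.vectorize is a convenience rather than a speed-up: according to the NumPy documentation, it is essentially a loop that calls the function once per element.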

To get more practice with conditionals, let’s write a function that categorizes grades into simple categories:

If the grade is between 85 and 100, the category is ‘Excellent’.
If the grade is at least 60 but below 85, the category is ‘Pass’.
If the grade is at least 0 but below 60, the category is ‘Fail’.
If the grade is negative or above 100, return ‘Invalid grade’.

def categorize_grade(grade):
    if grade >= 85 and grade <= 100:
        return 'Excellent'
    elif grade >= 60 and grade < 85:
        return 'Pass'
    elif grade >= 0 and grade < 60:
        return 'Fail'
    else:
        return 'Invalid grade'

categorize_grade(95)  # Output: Excellent

'Excellent'

We can apply this function to a column in a DataFrame, but first we need to vectorize it. We store the vectorized version under a new name so that the original function remains available:

categorize_grade_vec = np.vectorize(categorize_grade)

grades_df = pd.DataFrame({'grade': [95, 82, 76, 65, 58, -5]})
grades_df['grade_cat'] = categorize_grade_vec(grades_df['grade'])
grades_df

	grade	grade_cat
0	95	Excellent
1	82	Pass
2	76	Pass
3	65	Pass
4	58	Fail
5	-5	Invalid grade
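As an optional variation, Python allows comparisons to be chained, so a range check like grade >= 60 and grade < 85 can be written as 60 <= grade < 85. The sketch below rewrites the grade function this way. The name categorize_grade_chained is hypothetical, and the extra test value 105 is added here only to show the upper bound:

```python
def categorize_grade_chained(grade):
    # Hypothetical variant of the grade function using chained comparisons
    if 85 <= grade <= 100:
        return 'Excellent'
    elif 60 <= grade < 85:
        return 'Pass'
    elif 0 <= grade < 60:
        return 'Fail'
    else:
        return 'Invalid grade'


grades = [95, 82, 76, 65, 58, -5, 105]
categories = [categorize_grade_chained(g) for g in grades]
print(categories)
```

Both negative grades and grades above 100 fall through to the else branch and are labelled as invalid.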

Practice

7.6.1 Age Categorization Function

Now, try writing a function that categorizes age into different life stages. Use the following criteria:

If the age is under 18, the category is ‘Minor’.
If the age is greater than or equal to 18 and less than 65, the category is ‘Adult’.
If the age is greater than or equal to 65, the category is ‘Senior’.
If the age is negative or invalid, return ‘Invalid age’.

Use it on the riots_df DataFrame printed below to create a new column called Age_Category.

# Your code here

riots_df = vd.data.la_riots()
riots_df

	first_name	last_name	age	gender	race	death_date	address	neighborhood	type	longitude	latitude
0	Cesar A.	Aguilar	18.0	Male	Latino	1992-04-30	2009 W. 6th St.	Westlake	Officer-involved shooting	-118.273976	34.059281
1	George	Alvarez	42.0	Male	Latino	1992-05-01	Main & College streets	Chinatown	Not riot-related	-118.234098	34.062690
2	Wilson	Alvarez	40.0	Male	Latino	1992-05-23	3100 Rosecrans Ave.	Hawthorne	Homicide	-118.326816	33.901662
...	...	...	...	...	...	...	...	...	...	...	...
60	Elbert O.	Wilkins	33.0	Male	Black	1992-04-30	Western Avenue & 92nd Street	Gramercy Park	Homicide	-118.310004	33.952767
61	John H.	Willers	37.0	Male	White	1992-04-29	10621 Sepulveda Blvd.	Mission Hills	Homicide	-118.467770	34.263184
62	Willie Bernard	Williams	29.0	Male	Black	1992-04-29	Gage & Western avenues	Chesterfield Square	Death	-118.308952	33.982363

63 rows × 11 columns

Side Note

7.6.2 Apply vs Vectorize

Another way to use functions with if statements on a DataFrame is to use the apply method. Here is how you can use the grade categorization function with apply:

grades_df['grade_cat'] = grades_df['grade'].apply(categorize_grade)
grades_df

	grade	grade_cat
0	95	Excellent
1	82	Pass
2	76	Pass
3	65	Pass
4	58	Fail
5	-5	Invalid grade

np.vectorize is easier to use when a function takes multiple arguments, but you will encounter the apply method later on.
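To illustrate the difference with more than one argument, here is a minimal sketch using a small food table like the one from earlier in the lesson. The function main_macro and the column names macro_vectorize and macro_apply are hypothetical and introduced only for this comparison:

```python
import numpy as np
import pandas as pd


def main_macro(carb_grams, fat_grams):
    # Hypothetical helper: reports which of the two nutrients has more grams
    if fat_grams > carb_grams:
        return 'Higher in fat'
    else:
        return 'Higher in carbs'


food_df = pd.DataFrame({
    'food': ['Apple', 'Avocado'],
    'carb_grams': [25, 10],
    'fat_grams': [0, 14]
})

# With np.vectorize, each column is passed as its own argument
main_macro_vec = np.vectorize(main_macro)
food_df['macro_vectorize'] = main_macro_vec(food_df['carb_grams'], food_df['fat_grams'])

# With apply, we go row by row (axis=1) and pick the columns out of each row
food_df['macro_apply'] = food_df.apply(
    lambda row: main_macro(row['carb_grams'], row['fat_grams']), axis=1
)

print(food_df)
```

Both columns contain the same labels. The apply version needs a lambda to unpack each row, which is why the vectorized call tends to read more simply when there are several arguments.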

7.7 Conclusion

In this lesson, we’ve introduced the basics of writing functions in Python and how to use conditional statements within those functions. Functions are essential building blocks in programming that allow you to encapsulate code for reuse and better organization. Conditional statements enable your functions to make decisions based on input values or other conditions.
