Python Loops

Loops, simply put, are sets of instructions that repeat, and are one of the most basic structures of computer programming. Typically, loops check if a condition is met and perform an action, such as printing a statement.

In this tutorial, we'll use loops to analyze a data set of the Top Spotify Tracks of 2017. If you've ever listened to Spotify or another music streaming app, you're already familiar with a couple different kinds of loops:

  • Loops that execute a statement while a specific condition is true. You choose the "play" option. The music plays until you select "stop". Once "stop" is selected, the music loop is terminated.

  • Loops that execute a statement for a specific list of elements. You listen to a playlist with a finite set of songs. After each song has been played once, the music loop is terminated.

  • Loops that execute a statement for a certain number of times. You decide not to pay for a subscription. The app allows you to listen to five songs and then an ad plays.

Let's use Python loops to gather more information about Spotify's most popular songs of 2017. Here's a preview of the first couple columns we'll be working with:

id - Spotify URI of the song.
name - Name of the song.
artists - Artist(s) of the song.
danceability - A measure of how suitable the track is for dancing. A value of 1.0 is the most danceable.

You can find the full data set here. Below are the first five rows:

id,name,artists,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms,time_signature
7qiZfU4dY1lWllzX7mPBI,Shape of You,Ed Sheeran,0.825,0.652,1,-3.183,0,0.0802,0.581,0,0.0931,0.931,95.977,233713,4
5CtI0qwDJkDQGwXD1H1cL,Despacito - Remix,Luis Fonsi,0.694,0.815,2,-4.328,1,0.12,0.229,0,0.0924,0.813,88.931,228827,4
4aWmUDTfIPGksMNLV2rQP,Despacito (Featuring Daddy Yankee),Luis Fonsi,0.66,0.786,2,-4.757,1,0.17,0.209,0,0.112,0.846,177.833,228200,4
6RUKPb4LETWmmr3iAEQkt,Something Just Like This,The Chainsmokers,0.617,0.635,11,-6.769,0,0.0317,0.0498,1.44E-05,0.164,0.446,103.019,247160,4

The data is in csv (comma-separated values) format: each value is separated by a comma and each row begins on a new line. You'll notice that the first row in the data set contains the column headers (id, name, artists, danceability, etc.). Each row following contains the data for one specific song. For example, the second row contains the data for Ed Sheeran's "Shape of You."

We can read in the data set with the csv package, which allows us to split the records in the file on the comma separators, and store it in the songs variable:

import csv
#import the csv library
file = open(r"C:\Users\Julie\Documents\Top Spotify Tracks of 2017.csv", "r")
#Open the Top Spotify Tracks of 2017 csv file. Replace the path above with the path from your local drive.
#The r prefix makes this a raw string, so the backslashes in the path are not read as escape sequences.
songs = csv.reader(file, delimiter = ",")
#Create a new csv.reader object. Pass in the argument 'delimiter = ","' to split the records on the commas separating them.
songs = list(songs)
#Call the list type to get all the rows in the file.
file.close()
#Close the file now that all the rows are stored in songs.
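
As a variation on the code above, here is a minimal sketch that uses a with block so the file is closed automatically. To keep it runnable anywhere, it reads from an in-memory string instead of a path on disk; sample_csv is a hypothetical stand-in that holds only the header and the first two songs, trimmed to four columns. With a real file, the same pattern should work by passing open(path, newline="") in place of io.StringIO(sample_csv).

```python
import csv
import io

# Hypothetical stand-in for the csv file: the header and the first two songs,
# trimmed to the first four columns.
sample_csv = (
    "id,name,artists,danceability\n"
    "7qiZfU4dY1lWllzX7mPBI,Shape of You,Ed Sheeran,0.825\n"
    "5CtI0qwDJkDQGwXD1H1cL,Despacito - Remix,Luis Fonsi,0.694\n"
)

# The with block closes the file-like object when the block ends.
with io.StringIO(sample_csv) as file:
    songs = list(csv.reader(file, delimiter=","))

print(len(songs))  # number of rows, including the header
print(songs[0])    # the header row
```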

Before we dive into this data, let’s look at a function commonly used in loops.

The range function

The range function is used to generate a sequence of numbers. We can control the sequence created with the parameters below:

range(start, stop[,step])

  • start: The sequence begins with this number.
  • stop: The sequence stops just before this number; the stop value itself is not included.
  • step: The numbers in the sequence are incremented by this number.

The stop parameter is the only required input. If not specified, the starting number will default to 0 and each subsequent number will increase by 1.

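As an example:

list(range(5))

Generates:

[0, 1, 2, 3, 4]

In Python 3, range returns a range object rather than a list, so the examples here wrap the call in list to display the numbers it produces. A for loop can use the range object directly.

However, if we set the starting number to 1 and the step to 2:

list(range(1,5,2))

The range function will generate:

[1, 3]

We can also use the range function to generate a sequence of decreasing numbers by inputting a negative step parameter:

list(range(5,1,-2))

Below is the output:

[5, 3]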
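
In Python 3, the object that range returns behaves like a read-only sequence: it can be measured, indexed, and searched without first being turned into a list. The short sketch below illustrates this; the variable name odds is chosen here only for illustration.

```python
# range() returns a lazy sequence; list() materializes its values.
odds = range(1, 5, 2)

print(list(odds))          # [1, 3]
print(len(odds))           # 2
print(odds[0])             # 1
print(3 in odds)           # True
print(list(range(5, 5)))   # [] because start is not less than stop
```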

Let's practice using the range function!

Instructions:

  • Generate a sequence of consecutive numbers from 1 to 10 (including 10). Store the results in one_ten. Print one_ten.
  • Generate a sequence of only even numbers from 1 to 10 (not including 10). Store the results in evens. Print evens.

For loops

The range function is commonly used in for loops: loops that execute a statement for a specific number of times.

All for loops must follow the format below:

for item in list:
    execute this statement

Don’t forget to indent the second line. If you forget, you’ll get an error when you run your code!

Below is an example of a for loop:

for x in range(1,4):
    #iterates through each number in range(1,4)
    print(x)
    #prints each number

Here is the output:

1
2
3

The first time the loop is executed, x is set to 1, the first number returned by range(1,4), and the indented statement is executed:

  print (1)

The second time the loop is executed, x is set to 2, the second number returned by range(1,4), and the indented statement is executed:

  print (2)

The third time the loop is executed, x is set to 3, the last number returned by range(1,4), and the indented statement is executed one final time:

  print (3)

Remember, the sequence returned by the range function will include all numbers up to, but not including, the stop parameter.
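
The loop body can do more than print. As a small sketch of the same three iterations, the code below keeps a running total; the variable name total is an assumption made for this example.

```python
total = 0
for x in range(1, 4):
    # x takes the values 1, 2, 3; the stop value 4 is never reached
    total = total + x
    print(x, total)

print(total)  # 6
```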

We can also use for loops to iterate over a specific list of elements. Let's look at an example involving our data set. As a reminder, our columns are:

id - Spotify URI of the song.
name - Name of the song.
artists - Artist(s) of the song.
danceability - A measure of how suitable the track is for dancing. A value of 1.0 is the most danceable.

And we saved our data in the songs variable. First, let's print the first five rows of songs to get a feel for our dataset:

print(songs[:5])

Below is the output:

[['id', 'name', 'artists', 'danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo', 'duration_ms', 'time_signature'], ['7qiZfU4dY1lWllzX7mPBI', 'Shape of You', 'Ed Sheeran', '0.825', '0.652', '1', '-3.183', '0', '0.0802', '0.581', '0', '0.0931', '0.931', '95.977', '233713', '4'], ['5CtI0qwDJkDQGwXD1H1cL', 'Despacito - Remix', 'Luis Fonsi', '0.694', '0.815', '2', '-4.328', '1', '0.12', '0.229', '0', '0.0924', '0.813', '88.931', '228827', '4'], ['4aWmUDTfIPGksMNLV2rQP', 'Despacito (Featuring Daddy Yankee)', 'Luis Fonsi', '0.66', '0.786', '2', '-4.757', '1', '0.17', '0.209', '0', '0.112', '0.846', '177.833', '228200', '4'], ['6RUKPb4LETWmmr3iAEQkt', 'Something Just Like This', 'The Chainsmokers', '0.617', '0.635', '11', '-6.769', '0', '0.0317', '0.0498', '1.44E-05', '0.164', '0.446', '103.019', '247160', '4']]

Notice that songs is a list of lists. Each row from the csv file, containing the data for one song, is contained in a list. Then, each list containing the data for one song is combined into the songs list. Remember, brackets are used to signify the beginning and end of a list: [this,is,a,list]. Because songs is a list of lists, it begins and ends with two brackets: [[list 1]...[list 10]].
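
Because songs is a list of lists, two indexes in a row pick out a single value: the first selects a row and the second selects a column within that row. Here is a minimal sketch, assuming a shortened, hypothetical copy of songs that keeps only the first three columns:

```python
# Shortened stand-in for songs: header plus two songs, first three columns only.
songs = [
    ["id", "name", "artists"],
    ["7qiZfU4dY1lWllzX7mPBI", "Shape of You", "Ed Sheeran"],
    ["5CtI0qwDJkDQGwXD1H1cL", "Despacito - Remix", "Luis Fonsi"],
]

first_song = songs[1]   # the whole row for the first song
print(first_song)
print(songs[1][1])      # Shape of You
print(songs[2][2])      # Luis Fonsi
```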

We can use the following for loop to print the song name:

for song in songs:
    #iterates through each of the lists in songs
    print(song[1])
    #prints the second element in each list

Here are the first five values printed:

name
Shape of You
Despacito - Remix
Despacito (Featuring Daddy Yankee)
Something Just Like This

The first time the loop is executed, it works with the first list:

['id', 'name', 'artists', 'danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo', 'duration_ms', 'time_signature']

In the list above, the second element, song[1], is 'name'. As a result, 'name' is the first value printed.

The for loop then moves onto the second list:

['7qiZfU4dY1lWllzX7mPBI', 'Shape of You', 'Ed Sheeran', '0.825', '0.652', '1', '-3.183', '0', '0.0802', '0.581', '0', '0.0931', '0.931', '95.977', '233713', '4']

In the list above, the second element, song[1], is 'Shape of You'. As a result, 'Shape of You' is the second value printed.

A for loop repeats until it runs out of items to iterate over. In this case, the loop ends when it finishes iterating through all of the lists in songs.
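
If the header row should not be part of the output, one option is to loop over a slice that starts at index 1. The sketch below assumes a shortened stand-in for songs and collects the names in a list called names, a name introduced here only for illustration.

```python
# Shortened stand-in for songs: header plus two songs.
songs = [
    ["id", "name", "artists"],
    ["7qiZfU4dY1lWllzX7mPBI", "Shape of You", "Ed Sheeran"],
    ["5CtI0qwDJkDQGwXD1H1cL", "Despacito - Remix", "Luis Fonsi"],
]

names = []
for song in songs[1:]:      # songs[1:] is every row after the header
    names.append(song[1])
    print(song[1])
```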

What if we wanted to print out the song name, artist, and danceability score, instead of just the song name? We can do that with the range function:

for song in songs:
    for x in range(1,4):
        print (song[x])

We know from practicing above that for song in songs signifies that the indented statements will execute for each list in songs. We also know that for x in range(1,4) will return the following values for x:

1
2
3

If we substitute the x values into the print statement, we'll get these three print statements:

    print(song[1])
    #prints the second element of the song list - name
    print(song[2])
    #prints the third element of the song list - artist
    print(song[3])
    #prints the fourth element of the song list - danceability

As a result, the output of this for loop consists of elements 1, 2, and 3 of each list, or the name, artists, and danceability of each song. Below are the first few lines of the output:

name
artists
danceability
Shape of You
Ed Sheeran
0.825
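
The nested loop is one way to reach elements 1 through 3. As an alternative sketch, a slice can pull out the same three elements in one step and unpack them into named variables. The sample rows below are a hypothetical, four-column stand-in for songs, and the list called lines exists only so the result can be inspected.

```python
# Four-column stand-in for songs: header plus one song.
songs = [
    ["id", "name", "artists", "danceability"],
    ["7qiZfU4dY1lWllzX7mPBI", "Shape of You", "Ed Sheeran", "0.825"],
]

lines = []
for song in songs:
    # song[1:4] holds elements 1, 2, and 3 (index 4 is not included)
    name, artist, danceability = song[1:4]
    lines.append((name, artist, danceability))
    print(name, artist, danceability)
```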

Let's practice writing for loops next.

Instructions:
The data for the first five songs in our data set is stored in first_songs. In the editor below:

  • Use a for loop to iterate through the songs in first_songs.
  • Print the artist for each song.

While loops

Another type of loop, the while loop, executes a piece of code while a condition is true.

All while loops must follow this format:

while condition is true:
    execute this statement

Be careful that you don’t write a while statement that always evaluates to True. This will result in an infinite loop, a loop that never ends!

Here's a basic example:

counter = 0
while counter < 5:
    print("Hello")
    counter = counter + 1

The code above would print the following:

Hello
Hello
Hello
Hello
Hello

Let's break down the while statement above.

In the first iteration of the loop, counter = 0. Because 0 < 5 is True, the indented statements are executed:

    print("Hello")
    counter = counter + 1

Because we ended the first loop by adding 1 to counter, in the second iteration, counter = 1. Because 1 < 5 is True, the indented statements are again executed:

    print("Hello")
    counter = counter + 1

The loop continues until the while statement evaluates to False. In our example, this will happen when counter = 5, because 5 < 5 is not True.
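
A while loop is especially handy when the number of repetitions is not known in advance. The sketch below, with variable names chosen only for illustration, halves a number until it is no longer greater than 1. The loop ends because the body changes value, which is the variable the condition checks; that is what keeps it from becoming an infinite loop.

```python
value = 100
steps = 0
while value > 1:
    value = value // 2   # integer division: 100, 50, 25, 12, 6, 3, 1
    steps += 1

print(steps)  # 6
print(value)  # 1
```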

Now let's look at an example using our Spotify data set. As a reminder, songs is our dataset reformatted into a list of lists and each list contains the data for one song. We can use the following while loop to print the first five lists:

counter = 0
while counter < 5:
    print(songs[counter])
    counter += 1
    #counter += 1 is another way to write counter = counter + 1

Here is the output:

['id', 'name', 'artists', 'danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo', 'duration_ms', 'time_signature']
['7qiZfU4dY1lWllzX7mPBI', 'Shape of You', 'Ed Sheeran', '0.825', '0.652', '1', '-3.183', '0', '0.0802', '0.581', '0', '0.0931', '0.931', '95.977', '233713', '4']
['5CtI0qwDJkDQGwXD1H1cL', 'Despacito - Remix', 'Luis Fonsi', '0.694', '0.815', '2', '-4.328', '1', '0.12', '0.229', '0', '0.0924', '0.813', '88.931', '228827', '4']
['4aWmUDTfIPGksMNLV2rQP', 'Despacito (Featuring Daddy Yankee)', 'Luis Fonsi', '0.66', '0.786', '2', '-4.757', '1', '0.17', '0.209', '0', '0.112', '0.846', '177.833', '228200', '4']
['6RUKPb4LETWmmr3iAEQkt', 'Something Just Like This', 'The Chainsmokers', '0.617', '0.635', '11', '-6.769', '0', '0.0317', '0.0498', '1.44E-05', '0.164', '0.446', '103.019', '247160', '4']

In the first iteration of the loop, counter = 0. Because 0 < 5 is True, the indented statement is executed:

    print(songs[0])
    #prints the first list in songs
    counter += 1
    #adds 1 to counter

You'll notice in the output above that the first list printed is just the column headers. This is because the list of column headers is the first list in songs.

Because we ended the first loop by adding 1 to counter, in the second iteration, counter = 1. Because 1 < 5 is True, the indented statement is again executed:

    print(songs[1])
    #prints the second list in songs
    counter += 1
    #adds 1 to counter

The second list in songs contains the data for Ed Sheeran's "Shape of You", so this list is printed next.

The loop continues until the while statement evaluates to False. In our example, this will happen when counter = 5, because 5 < 5 is not True.
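
The loop above assumes songs has at least five lists; with fewer, songs[counter] would raise an IndexError. One possible guard, sketched here with a hypothetical three-row stand-in for songs, is to add a second condition that also stops at the end of the list:

```python
# Stand-in for songs with only three rows, fewer than the five requested.
songs = [
    ["id", "name"],
    ["7qiZfU4dY1lWllzX7mPBI", "Shape of You"],
    ["5CtI0qwDJkDQGwXD1H1cL", "Despacito - Remix"],
]

counter = 0
printed = []
# Stop after five rows or at the end of songs, whichever comes first.
while counter < 5 and counter < len(songs):
    print(songs[counter])
    printed.append(songs[counter])
    counter += 1
```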

Let's practice writing while loops next.

Instructions:
We've already imported the csv file and stored the data in songs. In the editor below, use a while loop to print out the data for the first 10 songs in songs.

Break, continue, and pass statements

The following statements can be used to change the execution of loops:

  • Break: Used to terminate a loop immediately
  • Continue: Used to skip the rest of the current iteration and move on to the next one
  • Pass: Does nothing; used as a placeholder where a statement is required

Let's look at a couple of examples, starting with the break statement:

for x in range(10):
    if x == 6:
        break
    print(x)

Below is the output:

0
1
2
3
4
5

The numbers 6-9 are never printed, because when x == 6, the break statement is executed.

Now, let's replace the break statement with the continue statement:

for x in range(10):
    if x == 6:
        continue
    print(x)

The following numbers print:

0
1
2
3
4
5
7
8
9

In this case, all numbers between 0 and 9 print EXCEPT for 6, because the continue statement is used to skip 6.

Finally, let's use the pass statement.

for x in range(10):
    if x == 6:
        pass
    print(x)

The following numbers print:

0
1
2
3
4
5
6
7
8
9

You'll notice that in this case all of the numbers 0-9 print. The pass statement does nothing, so when x == 6 the loop simply moves on to the print statement as usual.
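
To compare the three statements side by side, the sketch below runs the same loop three times and collects the numbers in lists instead of printing them one by one. The list names are chosen here only for illustration.

```python
with_break = []
for x in range(10):
    if x == 6:
        break               # leaves the loop entirely
    with_break.append(x)

with_continue = []
for x in range(10):
    if x == 6:
        continue            # skips the append for this iteration only
    with_continue.append(x)

with_pass = []
for x in range(10):
    if x == 6:
        pass                # does nothing; the append still runs
    with_pass.append(x)

print(with_break)     # [0, 1, 2, 3, 4, 5]
print(with_continue)  # [0, 1, 2, 3, 4, 5, 7, 8, 9]
print(with_pass)      # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```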

Now that we know how to use the break, continue, and pass statements, we can use them to analyze our data set. Once again, assume songs is a list of lists of our Spotify data, with each list containing the data for one song. The first list contains the column names. Here are the first three columns:

id - Spotify URI of the song.
name - Name of the song.
artists - Artist(s) of the song.

Below are the artists in the first five lists in the dataset:

artists
Ed Sheeran
Luis Fonsi
Luis Fonsi
The Chainsmokers

Let's say we want to print all of the artists before "Luis Fonsi". We could use the following for loop and break statement:

for song in songs:
    if song[2] == "Luis Fonsi":
        break
    print (song[2])

Only the following values are printed, because once the value equals "Luis Fonsi", the loop is terminated:

artists
Ed Sheeran

If we wanted to print all values except for "Luis Fonsi", we could use the continue statement instead:

for song in songs:
    if song[2] == "Luis Fonsi":
        continue
    print (song[2])

Below are the first five values printed. Notice that when the value equals "Luis Fonsi", it's skipped:

artists
Ed Sheeran
The Chainsmokers
DJ Khaled
Kendrick Lamar

But wait! We changed our minds and now want to see all the artists. We can change the continue statement to a pass statement, so that nothing happens when the if condition is true:

for song in songs:
    if song[2] == "Luis Fonsi":
        pass
    print (song[2])

Now, we see the output is the same as our original list!

artists
Ed Sheeran
Luis Fonsi
Luis Fonsi
The Chainsmokers

Let's practice using the break, continue, and pass statements next.

Instructions:

We've already imported the csv file and stored the data as a list of lists in songs. In the editor below:

  • Use a for loop to iterate through songs.
    • Append each artist's name to artists_list EXCEPT:
      • If the artist's name is artists.
      • If the artist's name is already in artists_list.
  • Print artists_list.

Project: explore the speechiness column

In this project, you'll use loops to analyze the "speechiness" of the top Spotify tracks of 2017. "Speechiness" is defined as a measure of the presence of spoken words in the song; the more speech-like the song, the closer the value in this column is to 1.0. According to the column descriptions, the values can be broken down as such:

1.0 - .66: Likely made entirely of spoken words
.66 - .33: Contains both music and speech, such as rap music
.33 - 0.0: Likely represents just music
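
The three ranges above can be sketched as a small helper function. Note the assumptions: speechiness_category is a hypothetical name, values exactly equal to .33 or .66 are treated as medium (the column descriptions do not say which side they fall on), and the sample values are illustrative strings, since values read from a csv file arrive as text.

```python
def speechiness_category(value):
    """Return 'high', 'medium', or 'low' for a speechiness value (a float)."""
    if value > 0.66:
        return "high"
    elif value >= 0.33:
        return "medium"   # assumption: .33 and .66 themselves count as medium
    else:
        return "low"

# Illustrative values only; csv data arrives as strings, so convert with float().
sample_values = ["0.0802", "0.12", "0.45", "0.9"]
categories = []
for text in sample_values:
    categories.append(speechiness_category(float(text)))

print(categories)  # ['low', 'low', 'medium', 'high']
```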

Let's use the categories above to evaluate the "speechiness" of 2017's popular music.

We've already imported the data and saved it in the songs list. Speechiness is the 9th element in each song's list (song[8]). In order to perform any numerical comparisons, this value has to be converted to a float type. We've already done that conversion for you and saved it in the speechiness_float variable.

  • Set three variables equal to 0: low_speechiness, medium_speechiness, high_speechiness
  • Use a for loop to iterate through each value in the speechiness column.
    • If song[8] equals 'speechiness', skip over it.
    • If speechiness_float is greater than .66, add 1 to high_speechiness.
    • If speechiness_float is between .66 and .33, add 1 to medium_speechiness.
    • If speechiness_float is less than .33, add 1 to low_speechiness.
  • Print low_speechiness, medium_speechiness, high_speechiness.

You should see that all but six of the songs have a speechiness value less than .33. Was rap music less popular than other music genres in 2017? Or, maybe Spotify listeners just prefer other kinds of music to rap music.

That's it for the guided steps! But you can continue to explore the dataset on your own. Here are some additional steps:

  1. Explore the duration_ms column. What is the average length of the songs? What is the longest duration? What is the shortest duration?
  2. Explore the "instrumentalness" column. How many songs have a value below .5? How many songs have a value above .5?
  3. Which artists have more than one song in the dataset?
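
As a starting point for the first step, here is one possible sketch of how a loop can track a total, a maximum, and a minimum at the same time. It runs on a hypothetical two-column stand-in for songs built from the four durations shown earlier; in the full data set, duration_ms is the 15th column, so the index would be 14 rather than 1.

```python
# Two-column stand-in for songs: name and duration_ms for the first four songs.
songs = [
    ["name", "duration_ms"],
    ["Shape of You", "233713"],
    ["Despacito - Remix", "228827"],
    ["Despacito (Featuring Daddy Yankee)", "228200"],
    ["Something Just Like This", "247160"],
]

total = 0
count = 0
longest = None
shortest = None
for song in songs:
    if song[1] == "duration_ms":
        continue                      # skip the header row
    duration = int(song[1])           # csv values are strings
    total += duration
    count += 1
    if longest is None or duration > longest:
        longest = duration
    if shortest is None or duration < shortest:
        shortest = duration

average = total / count
print(average, longest, shortest)
```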