Programming for Data Science¶

List¶

Dr. Bhargavi R

SCOPE, VIT Chennai

List¶

  • As a data science student, you often need to handle large amounts of data.
  • We'll see (later) how Python helps us to handle huge amounts of data in an easier way by providing the list data type.
  • Before diving into huge amounts of data, let's start with small examples.

What is a list?¶

  • A list is an ordered sequence of values.
  • Usually, a group of logically related items is put together in a list.
  • Nested lists are helpful in organizing data in a hierarchical structure.
  • A list is mutable.
  • Items in a list need not be of the same data type.
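
The properties listed above can be checked with a small sketch. The list name `record` and its values are made up purely for illustration.

```python
# Hypothetical example values, chosen only for illustration
record = ["19MIA0001", "Meera", 88, 91.5, ["Python", "Statistics"]]

# Ordered: items keep the position they were given
first_item = record[0]

# Heterogeneous: items may have different data types
item_types = [type(item).__name__ for item in record]

# Nested: the last item is itself a list
first_course = record[4][0]

# Mutable: an item can be replaced in place
record[2] = 90

print(first_item)
print(item_types)
print(first_course)
print(record)
```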

Create a list - syntax¶

list_name = [item_1, item_2, item_3,...., item_n]

Examples¶

In [1]:
# let's now create a list which can hold the details of a student
student_details = ["19MIA0000", "Asish", 100, 99, 99, 99.33]
print(student_details)
['19MIA0000', 'Asish', 100, 99, 99, 99.33]
In [2]:
# Create another list that contains the 5 positive even numbers less than 12
even_numbers = [element for element in range(2,12,2)]
print(even_numbers)
[2, 4, 6, 8, 10]
In [3]:
# Create an empty list
empty_list = []
empty_list.append(1)
empty_list.append(10)
empty_list.append(2*20)
print(empty_list)
[1, 10, 40]
In [4]:
# Use the list() constructor to create an empty list
l = list()
l.append(1)
l.append(2)
print(l)
[1, 2]
In [5]:
dir(l)
Out[5]:
['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']
In [6]:
l[::]
Out[6]:
[1, 2]
In [7]:
# Now we will create a list of lists
# First, store the marks scored by 5 students in 4 subjects

student1_scores    =   [45, 48, 50, 46]   
student2_scores    =   [44, 42, 39, 44]   
student3_scores    =   [25, 30, 28, 32]   
student4_scores    =   [40, 45, 44, 45]
student5_scores    =   [30, 35, 32, 34]
In [8]:
# Now let's create a list that contains the marks lists of all 5 students

student_scores = [student1_scores, student2_scores, student3_scores, 
                  student4_scores, student5_scores]
print(student_scores) 
[[45, 48, 50, 46], [44, 42, 39, 44], [25, 30, 28, 32], [40, 45, 44, 45], [30, 35, 32, 34]]
In [9]:
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(matrix)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Indexing¶

Accessing the Individual Values¶

  • A list element can be accessed with its index.
  • Python indexing starts at $0$.
  • The $n^{th}$ element of a list can be accessed as follows:
    list_name[n-1]
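
As a minimal sketch of the indexing rules above, the following uses a hypothetical list `letters` (not part of the lecture data) to show positive indices, negative indices, and what happens when an index is out of range.

```python
# Hypothetical list used only for this sketch
letters = ["a", "b", "c", "d"]

n = 3
nth_element = letters[n - 1]               # 3rd element, stored at index 2
final_letter = letters[-1]                 # negative indices count from the end
same_as_final = letters[len(letters) - 1]  # equivalent positive index

# An index outside the valid range raises IndexError
try:
    letters[4]
    out_of_range = False
except IndexError:
    out_of_range = True

print(nth_element, final_letter, same_as_final, out_of_range)
```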
    
In [10]:
# let's now access the 2nd element in student_details 
# and assign it to a variable named 'name'
name = student_details[1]
print("Name of the student is ....", name)
Name of the student is .... Asish
In [11]:
# Last element can be accessed as follows
last_element = even_numbers[-1]
print(last_element)
10

Slicing¶

In [12]:
# Access the last 4 elements of student details 
# and assign the values to another list named 'marks'
marks = student_details[2:5]
print(marks)                    ## Is this the one we want?? Something went wrong...
[100, 99, 99]
In [13]:
marks = student_details[2:6]
print(marks)                    ## Yes, that is correct: the stop index (6) is not included
[100, 99, 99, 99.33]
In [14]:
print(marks[ : 3])              # Access the first 3 elements of marks
print(marks[2 : ])              # Access the elements from the 3rd position to the last position
[100, 99, 99]
[99, 99.33]
In [15]:
# Access the last 3 elements in even_numbers
last_three_elements = even_numbers[-3:]
print(last_three_elements)
[6, 8, 10]
In [16]:
# Try this: a negative step walks backwards, so the last two elements come out in reverse order
last_two_elements = even_numbers[-1 : -3: -1]
print(last_two_elements)
[10, 8]
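
The slicing cells above all follow the pattern `list_name[start:stop:step]`, where the item at `stop` is not included. A short sketch, using a separate list `numbers` with the same values as `even_numbers` so that nothing above is modified:

```python
numbers = [2, 4, 6, 8, 10]      # same values as even_numbers above

first_two = numbers[:2]         # indices 0 and 1; the stop index 2 is excluded
last_three = numbers[-3:]       # from the 3rd-from-last item to the end
every_other = numbers[::2]      # step of 2
reversed_copy = numbers[::-1]   # a negative step walks backwards
without_last = numbers[-4:-1]   # stop index -1 is excluded, so 10 is left out

print(first_two)
print(last_three)
print(every_other)
print(reversed_copy)
print(without_last)
```

A slice always produces a new list, so `numbers` itself is unchanged by any of these expressions.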

Examples (cont..)¶

In [17]:
# Accessing the elements of a list of lists
# Access the 2nd row of the list student_scores
# (this binds a new name to the same inner list; it does not create a copy)
student2_scores_copy = student_scores[1]
print(student2_scores_copy)
[44, 42, 39, 44]
In [18]:
# Access the first 3 rows of student_scores and store the result in slice1
slice1 = student_scores[:3]
print(slice1)
[[45, 48, 50, 46], [44, 42, 39, 44], [25, 30, 28, 32]]
In [19]:
#Access the third and fifth rows of student_scores and store the result in slice2
slice2 = [student_scores[2], student_scores[4]]
print(slice2)
[[25, 30, 28, 32], [30, 35, 32, 34]]
In [20]:
# Print the 2nd element of the 3rd row of matrix
print(matrix[2][1])
8
In [21]:
#Access the 3rd column from matrix
for i in range( 0 , 3):
    print(matrix[i][2])
3
6
9
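
The loop above prints the column one value at a time. As an alternative sketch, a list comprehension can collect the column into a new list, and `zip(*rows)` can produce every column at once. The name `grid` is used here only to keep the example self-contained; it holds the same values as `matrix`.

```python
grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]   # same values as matrix above

# Collect the 3rd column (index 2) of every row into a new list
third_column = [row[2] for row in grid]

# zip(*grid) pairs up the items at the same position in each row,
# which yields the columns; each one is converted from a tuple to a list
columns = [list(column) for column in zip(*grid)]

print(third_column)
print(columns)
```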

What else can I do with a list?¶

In [22]:
help(list)
Help on class list in module builtins:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __mul__(self, value, /)
 |      Return self*value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __reversed__(self, /)
 |      Return a reverse iterator over the list.
 |  
 |  __rmul__(self, value, /)
 |      Return value*self.
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __sizeof__(self, /)
 |      Return the size of the list in memory, in bytes.
 |  
 |  append(self, object, /)
 |      Append object to the end of the list.
 |  
 |  clear(self, /)
 |      Remove all items from list.
 |  
 |  copy(self, /)
 |      Return a shallow copy of the list.
 |  
 |  count(self, value, /)
 |      Return number of occurrences of value.
 |  
 |  extend(self, iterable, /)
 |      Extend list by appending elements from the iterable.
 |  
 |  index(self, value, start=0, stop=9223372036854775807, /)
 |      Return first index of value.
 |      
 |      Raises ValueError if the value is not present.
 |  
 |  insert(self, index, object, /)
 |      Insert object before index.
 |  
 |  pop(self, index=-1, /)
 |      Remove and return item at index (default last).
 |      
 |      Raises IndexError if list is empty or index is out of range.
 |  
 |  remove(self, value, /)
 |      Remove first occurrence of value.
 |      
 |      Raises ValueError if the value is not present.
 |  
 |  reverse(self, /)
 |      Reverse *IN PLACE*.
 |  
 |  sort(self, /, *, key=None, reverse=False)
 |      Sort the list in ascending order and return None.
 |      
 |      The sort is in-place (i.e. the list itself is modified) and stable (i.e. the
 |      order of two equal elements is maintained).
 |      
 |      If a key function is given, apply it once to each list item and sort them,
 |      ascending or descending, according to their function values.
 |      
 |      The reverse flag can be set to sort in descending order.
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  __class_getitem__(...) from builtins.type
 |      See PEP 585
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None

In [23]:
type(student_scores)
Out[23]:
list

Methods on list¶

We can access and use the methods defined on list as follows¶

list_name.method_name(argument(s))
list.method_name(list_name, argument(s))    # equivalent, but rarely used
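
A brief sketch of the calling conventions, assuming a throwaway list named `values`. It also shows that built-in functions such as `len()` take the list as an argument, and that in-place methods such as `sort()` return `None` rather than a new list.

```python
values = [3, 1, 2]          # hypothetical list used only for this sketch

values.append(5)            # usual form: list_name.method_name(argument)
list.append(values, 4)      # same method called through the class

length = len(values)        # built-in function, not a method

sort_result = values.sort() # sorts in place and returns None

print(values)
print(length)
print(sort_result)
```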
In [24]:
even_numbers
Out[24]:
[2, 4, 6, 8, 10]
In [25]:
# Append 12 to even_numbers
even_numbers.append(12)
print(even_numbers)
[2, 4, 6, 8, 10, 12]
In [26]:
# Make a copy of even_numbers
even_numbers_copy = even_numbers.copy()
print("even_numbers = ", even_numbers)
print("even_numbers_copy = ", even_numbers_copy)
even_numbers =  [2, 4, 6, 8, 10, 12]
even_numbers_copy =  [2, 4, 6, 8, 10, 12]
In [27]:
# Suppose you find that the name in student_details was entered incorrectly.
# Let's correct it now.
# Since a list is mutable, the item can be updated in place.

student_details[1] = 'Harish'
print(student_details)
['19MIA0000', 'Harish', 100, 99, 99, 99.33]
In [28]:
a = [[1,2],
     [3,4]]
b = [[1,2],
     [3,4]]
print(a)
print( a + b)
[[1, 2], [3, 4]]
[[1, 2], [3, 4], [1, 2], [3, 4]]
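
Note that `+` on lists concatenates them; it does not add matrices element by element. If element-wise addition is what you want, one possible sketch with plain lists (assuming both nested lists have the same shape) is:

```python
a = [[1, 2],
     [3, 4]]
b = [[1, 2],
     [3, 4]]

# + joins the two outer lists end to end
concatenated = a + b

# Element-wise sum: pair up rows, then pair up the items within each row
elementwise_sum = [[x + y for x, y in zip(row_a, row_b)]
                   for row_a, row_b in zip(a, b)]

print(concatenated)
print(elementwise_sum)
```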

Let's now look at something interesting that you need to pay attention to when you code.

In [29]:
# Let's create a list and make two copies of it using different approaches
# Create a list
my_first_list = [1] * 5
print(my_first_list)
[1, 1, 1, 1, 1]
In [30]:
#Approach 1: Make a copy of the list using assignment operator
my_second_list = my_first_list

#Approach 2: Make a copy of the list using copy method
my_third_list = my_first_list.copy()
In [31]:
# Check the lists once
print(f"my_first_list = {my_first_list}")
print(f"my_second_list = {my_second_list}")
print(f"my_third_list = {my_third_list}")
my_first_list = [1, 1, 1, 1, 1]
my_second_list = [1, 1, 1, 1, 1]
my_third_list = [1, 1, 1, 1, 1]
In [32]:
# let's update an element
my_second_list[0] = 2

print("my_first_list =",  my_first_list)        ## OOPS what went wrong
print("my_second_list =",  my_second_list)
my_first_list = [2, 1, 1, 1, 1]
my_second_list = [2, 1, 1, 1, 1]
In [33]:
my_third_list[-1] = 5

print("my_first_list =",  my_first_list)
print("my_second_list =",  my_second_list)
print("my_third_list =",  my_third_list)
my_first_list = [2, 1, 1, 1, 1]
my_second_list = [2, 1, 1, 1, 1]
my_third_list = [1, 1, 1, 1, 5]

Why?¶

In [34]:
# Let's now check the identity of the objects these names refer to
# (in CPython, id() returns the object's memory address)
print("Address of {} is {}".format('my_first_list', id(my_first_list)))
print("Address of {} is {}".format('my_second_list', id(my_second_list)))
print("Address of {} is {}".format('my_third_list', id(my_third_list)))
Address of my_first_list is 4359933120
Address of my_second_list is 4359933120
Address of my_third_list is 4360030656
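
The matching ids show that assignment does not copy a list: it only gives the same list object a second name. The `is` operator tests this directly, without comparing id values by eye. A minimal sketch with hypothetical names `original`, `alias`, and `shallow`:

```python
original = [1, 1, 1]
alias = original             # a second name for the same list object
shallow = original.copy()    # a new list object holding the same items

alias_is_same = alias is original      # identity: same object?
copy_is_same = shallow is original
copy_is_equal = shallow == original    # equality: same contents?

alias[0] = 2       # visible through `original`, because it is the same object
shallow[-1] = 5    # affects only the copy

print(alias_is_same, copy_is_same, copy_is_equal)
print(original)
print(shallow)
```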
In [35]:
x = [0] * 5
print(x)
x[0] = 1
print(x)
[0, 0, 0, 0, 0]
[1, 0, 0, 0, 0]
In [36]:
x = [[0] * 5] * 5
print(x)
x[2][2] = 1
print(x)
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
[[0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
In [37]:
x = [[0] * 5 for i in range(0,5)]
print(x)
x[2][2] = 2
print(x)
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 2, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
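
The difference between the last two cells comes from the same aliasing effect: `[[0] * 5] * 5` repeats a reference to one inner list five times, while the comprehension builds a new inner list on every iteration. The sketch below checks this with `is` on a smaller 3 x 3 example. It also illustrates, as a related point, that `list.copy()` is shallow, so nested lists need `copy.deepcopy` from the standard library to be fully independent. All variable names are illustrative.

```python
import copy

shared = [[0] * 3] * 3                      # three references to ONE inner list
independent = [[0] * 3 for _ in range(3)]   # three separate inner lists

shared_rows_same = shared[0] is shared[1]
independent_rows_same = independent[0] is independent[1]

shared[1][1] = 1        # shows up in every row
independent[1][1] = 1   # shows up only in row 1

# A shallow copy duplicates the outer list but still shares the inner lists
nested = [[1, 2], [3, 4]]
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested[0][0] = 99

print(shared_rows_same, independent_rows_same)
print(shared)
print(independent)
print(shallow)
print(deep)
```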