Numpy Part I
Get Started With numpy
numpy is one of the most important libraries when it comes to data science and machine learning.
what is numpy?
numpy is a Python library used for working with arrays. It provides an array object that is typically much faster than traditional Python lists for numerical work.
why use numpy arrays over lists?
Now you might wonder: if Python already has lists, why do we need arrays? Unlike lists, numpy arrays are stored in one contiguous block of memory, which provides a fast and efficient way of working with data.
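To make the difference concrete, here is a minimal sketch (the prices list is made-up sample data) that doubles every value once with a plain list and once with a numpy array. It only illustrates the style of code each one needs; it does not measure speed.

```python
import numpy as np

prices = [10.0, 20.0, 30.0, 40.0]  # hypothetical sample data

# with a plain list, element-wise arithmetic needs an explicit loop
doubled_list = [p * 2 for p in prices]

# with a numpy array, the same operation is applied to every element at once
doubled_arr = np.array(prices) * 2

print(doubled_list)  # [20.0, 40.0, 60.0, 80.0]
print(doubled_arr)   # [20. 40. 60. 80.]
```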
how to install numpy?
first install pip
If you don't have pip, the package manager for Python, download it from here. After that, navigate to the folder where you downloaded the file and open a command prompt there. On Debian or Ubuntu, pip can also be installed from the system packages with the following command
sudo apt install python3-pip
now install numpy
type the following command in your command prompt
pip install numpy
get started with numpy
import numpy
arr = numpy.array([1, 2, 3, 4, 5, 6])  # here arr is a numpy array object
print(arr)
1-D arrays are the most common and basic arrays
import numpy as np
arr = np.array([1, 2, 3, 4, 5])  # here arr is a 1-D array containing elements 1,2,3,4,5
print(arr)
An array containing 1-D arrays as its elements is known as a 2-D array. 2-D arrays have rows and columns and are usually used to represent matrices.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])  # here arr is a 2-D array
print(arr)
numpy provides an ndim attribute that gives the number of dimensions of an array
import numpy as np
a = np.array(50)
b = np.array([1, 2, 3, 4])
c = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(a.ndim)  # prints 0 because it has 0 dimensions
print(b.ndim)  # prints 1 because it has 1 dimension
print(c.ndim)  # prints 2 because it has 2 dimensions
Accessing Array elements
You can access an array element by referencing its index number.
Indexes start at 0, meaning that the first element has index 0, the second element has index 1, and so on.
accessing 1-D array elements
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[2])  # prints 3, as the element at index 2 is 3
accessing 2-D array elements
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
in the above example to get the element with the value of 5 we would use the following indexing :
print(arr[1,1]) #prints 5
the first index is the row number and the second index is the column number
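As a small sketch of the row and column indexing described above, the example below reuses the same 2-D array. The negative indexes are an addition not covered in the text; they count from the end of each axis.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr[0, 2])    # row 0, column 2 -> 3
print(arr[1])       # a single index selects a whole row -> [4 5 6]
print(arr[-1, -1])  # last row, last column -> 6
```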
We can also use slicing to access a range of elements
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
print(arr[0:4])  # prints [1 2 3 4]
same can be used with 2-D arrays
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[1, 0:2])  # prints [4 5]
numpy arrays have an attribute named shape that returns a tuple with one entry per dimension; each entry is the number of elements along that dimension
for a 2-D array:
first index = number of rows
second index = number of elements in each row
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)  # prints (2, 4)
here (2, 4) means the array has 2 rows, with 4 elements in each row
reshaping means to change the shape of the array
numpy provides a reshape method for this
Reshape 1-D array to 2-D array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
new_arr = arr.reshape(2, 5)
print(new_arr)
# prints
# [[ 1  2  3  4  5]
#  [ 6  7  8  9 10]]
here , it converts a 1-D array with 10 elements into a 2-D array with 2 rows and 5 elements in each row
Important Note - while reshaping, the original array and the reshaped array must have the same number of elements.
We can reshape a 1-D array with 8 elements into a 2-D array with 2 rows of 4 elements each, but we cannot reshape it into a 2-D array with 3 rows of 3 elements each, as that would require 3x3 = 9 elements.
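The 8-element case can be sketched as follows. The variable names are only illustrative, and the last two lines go slightly beyond the text: passing -1 for one dimension asks numpy to work out that size from the element count.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

ok = arr.reshape(2, 4)  # 2 x 4 = 8 elements, so this works
print(ok.shape)         # (2, 4)

reshape_failed = False
try:
    arr.reshape(3, 3)   # 3 x 3 = 9 elements, which does not match 8
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)

inferred = arr.reshape(4, -1)  # -1 lets numpy compute the missing size
print(inferred.shape)          # (4, 2)
```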
Joining means putting contents of two or more arrays into a single array
For this, numpy provides a concatenate function, which takes the arrays to join as a single tuple or list
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
final_arr = np.concatenate((arr1, arr2))
print(final_arr)  # prints [1 2 3 4 5 6]
using Numpy stack to join
Stacking is similar to concatenation; the only difference is that stacking is done along a new axis.
We can stack two 1-D arrays along the second axis (axis=1), which pairs up their elements so that each input array becomes a column of the result
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
final_arr = np.stack((arr1, arr2), axis=1)
print(final_arr)
# prints
# [[1 4]
#  [2 5]
#  [3 6]]
if no value is passed to axis , it is considered as 0 i.e axis = 0
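To compare the two ways of joining side by side, here is a short sketch that looks only at the resulting shapes. The names joined, rows and cols are just for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

joined = np.concatenate((arr1, arr2))  # joins along the existing axis, stays 1-D
rows = np.stack((arr1, arr2))          # axis=0 by default: each input becomes a row
cols = np.stack((arr1, arr2), axis=1)  # axis=1: each input becomes a column

print(joined.shape)  # (6,)
print(rows.shape)    # (2, 3)
print(cols.shape)    # (3, 2)
```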
Splitting is the reverse operation of Joining
numpy provides array_split() for splitting arrays
array_split(array,number of splits)
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)  # prints [array([1, 2]), array([3, 4]), array([5, 6])]
Same can be used with 2-D arrays
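As a rough sketch of the 2-D case, the example below splits a small made-up grid into two halves. It also shows what array_split() does when the elements cannot be divided evenly: the earlier pieces get the extra elements.

```python
import numpy as np

# 7 elements cannot be divided evenly into 3 parts
arr = np.array([1, 2, 3, 4, 5, 6, 7])
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])  # [[1, 2, 3], [4, 5], [6, 7]]

# a 2-D array is split along its rows (axis 0) by default
grid = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
halves = np.array_split(grid, 2)
print(halves[0].tolist())  # [[1, 2], [3, 4]]
print(halves[1].tolist())  # [[5, 6], [7, 8]]
```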
using where method
import numpy as np
arr = np.array([1, 2, 3, 2, 4, 5, 6, 2, 9])
x = np.where(arr == 2)
print(x)  # prints (array([1, 3, 7]),)
in the above example it searches for elements having value 2 and returns the indexes of the same
searchsorted() performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sorted order.
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)  # prints 1
Note - searchsorted() assumes that the array is already sorted.
search from the right side
By default the left most index is returned, but we can give side='right' to return the right most index instead.
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7, side='right')
print(x)  # prints 2
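The difference between the two sides is easier to see when the value occurs more than once. In this sketch (the array with repeated 7s is made up for illustration), the left and right results mark the start and end of the run of matching values.

```python
import numpy as np

arr = np.array([6, 7, 7, 7, 9])

left = np.searchsorted(arr, 7)                 # index of the first 7
right = np.searchsorted(arr, 7, side='right')  # index just after the last 7

print(left, right)      # 1 4
print(arr[left:right])  # [7 7 7]
```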
Sorting means putting elements in an ordered sequence.
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))  # prints [0 1 2 3]
If sort() is used on a 2-D array, each row is sorted separately.
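Here is a minimal sketch of sorting a 2-D array. By default np.sort() works along the last axis, so every row is sorted on its own; the axis=0 call is an extra not covered in the text and sorts every column instead. The original array is left unchanged in both cases.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr)          # default: each row sorted on its own
by_col = np.sort(arr, axis=0)  # axis=0: each column sorted on its own

print(by_row.tolist())  # [[2, 3, 4], [0, 1, 5]]
print(by_col.tolist())  # [[3, 0, 1], [5, 2, 4]]
print(arr.tolist())     # [[3, 2, 4], [5, 0, 1]]
```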
Getting some elements out of an existing array and creating a new array out of them is called filtering.
In numpy, you filter an array using a boolean index list.
import numpy as np
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x]
print(newarr)  # prints [41 43]
If the value at an index is True that element is contained in the filtered array, if the value at that index is False that element is excluded from the filtered array.
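In practice the boolean list is rarely typed out by hand. A common approach, sketched here on the same values, is to build it from a condition on the array itself.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

mask = arr > 42       # the comparison produces one boolean per element
print(mask.tolist())  # [False, False, True, True]
print(arr[mask])      # [43 44]

evens = arr[arr % 2 == 0]  # the condition can also be written inline
print(evens)               # [42 44]
```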
array copy vs array view
The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.
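import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42  # change the original after making the copy
print(arr)  # prints [42  2  3  4  5]
print(x)  # prints [1 2 3 4 5]
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42  # change the original after making the view
print(arr)  # prints [42  2  3  4  5]
print(x)  # prints [42  2  3  4  5]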
The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.
The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
Check if Array Owns Its Data
Copies own their data and views do not, but how can we check this?
Every numpy array has the attribute base that returns None if the array owns the data.
Otherwise, the base attribute refers to the original object.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()
print(x.base)  # prints None
print(y.base)  # prints [1 2 3 4 5]
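The base attribute is also handy for slices. As a sketch that goes slightly beyond the examples above, basic slicing returns a view, so writing to the slice changes the original array unless you call copy() first.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]          # basic slicing returns a view of arr
print(part.base is arr)  # True

part[0] = 99             # writing through the view changes the original
print(arr.tolist())      # [1, 99, 3, 4, 5]

safe = arr[1:4].copy()    # an explicit copy owns its own data
print(safe.base is None)  # True
```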
I hope you found this tutorial interesting; this was just part 1.
Many more interesting concepts are yet to come in part 2.
If you are stuck on a problem or have any questions, feel free to ping me on Twitter.