
⚡ NumPy, which stands for Numerical Python, is an open-source library that lets users store large amounts of data using less memory and perform a wide range of operations (mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, etc.) easily using homogeneous one-dimensional and multidimensional arrays.

The basic data structure of NumPy is the ndarray, which is similar to a list.

💡 An array in NumPy is a data structure organized like a grid of rows and columns, containing values of the same data type that can be indexed and manipulated efficiently as the problem requires.

Difference between NumPy arrays and the standard Python list

The three most important differences between NumPy arrays and standard Python sequences are:

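|               | NumPy Array                              | Python Sequences (list, tuple, range) |
|---------------|------------------------------------------|---------------------------------------|
| Creation Size | Fixed size                               | Python list can grow dynamically      |
| Datatype      | Elements are of the same datatype        | Elements can be of multiple datatypes |
| Speed         | Fast, as it is partially written in C    | Slower compared to NumPy              |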
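
The size and datatype rows of this comparison can be seen in a minimal sketch. The variable names are illustrative only and are not used elsewhere in this tutorial.

```python
import numpy as np

# A Python list can hold mixed types and grow in place.
py_list = [1, 2.5, "three"]
py_list.append(4)
print(len(py_list))  # 4

# A NumPy array stores a single dtype; mixing ints and floats upcasts to float64.
arr = np.array([1, 2, 3.5])
print(arr.dtype)  # float64

# An array does not grow in place: np.append returns a new, larger array.
bigger = np.append(arr, 4.0)
print(arr.size, bigger.size)  # 3 4
```

Because np.append copies the data each time, calling it repeatedly in a loop is usually slower than building a list first and converting it once.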

Why use NumPy: Computation time

A Python list can perform all the operations that NumPy arrays perform; NumPy arrays are simply faster ⚡ and more convenient for large, complex computations.

Let's add two sequences of about 9 million elements each and compare the computation time.

import time
import numpy as np

# python standard list
list_A = [i for i in range(1,9000000)]
list_B = [j**2 for j in range(1,9000000)]

t0 = time.time()
sum_list = list(map(lambda x, y: x+y, list_A, list_B))
t1 = time.time()
list_time = t1 - t0
print ("Time taken by Python standard list is ",list_time)

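# numpy array (array_B is squared so both runs add the same values as the lists)
array_A = np.arange(1, 9000000, dtype=np.int64)
array_B = np.arange(1, 9000000, dtype=np.int64) ** 2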

t0 = time.time()
sum_numpy =  array_A + array_B
t1 = time.time()
numpy_time = t1 - t0
print ("Time taken by NumPy array is ",numpy_time)

print("The ratio of time taken is {}".format(list_time//numpy_time))
Time taken by Python standard list is  0.6801159381866455
Time taken by NumPy array is  0.04106783866882324
The ratio of time taken is 16.0

Notice that NumPy is a lot faster than the list. The table below shows the difference in computation speed between the standard Python list and NumPy for different operations.

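| Size of each array | Type of operation  | Time taken by list | Time taken by NumPy | Ratio (List Time / NumPy Time) |
|--------------------|--------------------|--------------------|---------------------|--------------------------------|
| 9 million          | Addition (+)       | 0.56s              | 0.017s              | 32.0                           |
| 9 million          | Subtraction (-)    | 0.61s              | 0.016s              | 36.0                           |
| 9 million          | Multiplication (*) | 0.69s              | 0.016s              | 42.0                           |
| 9 million          | Division (/)       | 0.51s              | 0.022s              | 23.0                           |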

From the above table, we can conclude that NumPy is much faster than the standard Python list for these element-wise operations. In the real world, when the data is larger and the operations are more complex, this ratio can be even bigger.
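
Timings like these vary between machines and runs. As a rough sketch, the standard-library timeit module can repeat each measurement to smooth out noise. The array size and repeat count below are arbitrary choices for illustration, kept much smaller than 9 million so the example runs quickly.

```python
import timeit
import numpy as np

n = 100_000  # illustrative size, much smaller than the 9 million used above
list_a = list(range(n))
list_b = list(range(n))
array_a = np.arange(n, dtype=np.int64)
array_b = np.arange(n, dtype=np.int64)

# timeit calls each function `number` times and returns the total elapsed seconds.
list_time = timeit.timeit(lambda: [x * y for x, y in zip(list_a, list_b)], number=20)
numpy_time = timeit.timeit(lambda: array_a * array_b, number=20)

print("Python list:", list_time)
print("NumPy array:", numpy_time)
```

The exact numbers will differ on your machine, so no expected output is shown here.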

Installing NumPy

To start working with NumPy, you need to install it. You can't go wrong if you follow the instructions on the official NumPy website.

[Optional]: Follow this guide to install Python if you don't already have it installed. It's not required, but it's ideal to install Python packages inside a virtual environment to avoid version-related conflicts in the future.

Basics of NumPy

As a prerequisite, you will need to know beginner-level Python. See this Python tutorial to refresh your concepts.

numpy-1-min

In the image above, array is an object of the ndarray class of the NumPy library.

Whenever you work with a dataset, the first step is to get an idea of the array that holds it. Four important attributes of a NumPy array that give information about the dataset are:

  • .ndim: returns the number (int) of dimensions (axes) of the array.
  • .shape: returns a tuple with the length of the array along each dimension; for a 2-D array with n rows and m columns this is (n, m).
  • .size: returns the total number (int) of elements in the array.
  • .dtype: returns a numpy.dtype object that describes the type of the elements in the array.

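Below is a code snippet of the attributes described above, along with .itemsize (the size of one element in bytes) and .data (the memory buffer holding the elements).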

array = np.array([[1,2,3],[4,5,6]]) # Creating NumPy array from list

print("Dimension: ",array.ndim, type(array.ndim))
print("Shape: ",array.shape, type(array.shape))
print("Size: ",array.size, type(array.size))
print("Datatype: ",array.dtype, type(array.dtype))
print("Itemsize: ",array.itemsize, type(array.itemsize))
print("Data: ",array.data, type(array.data))
Dimension:  2 <class 'int'>
Shape:  (2, 3) <class 'tuple'>
Size:  6 <class 'int'>
Datatype:  int64 <class 'numpy.dtype[int64]'>
Itemsize:  8 <class 'int'>
Data:  <memory at 0x7f2d807312b0> <class 'memoryview'>
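
The size and itemsize attributes together give the memory used by the array's data. The sketch below uses the nbytes attribute to show this, and shows how a smaller dtype reduces the footprint. The array contents are arbitrary illustrative values.

```python
import numpy as np

arr = np.arange(1000, dtype=np.int64)

# Bytes in the data buffer = number of elements * bytes per element.
print(arr.nbytes)               # 8000
print(arr.size * arr.itemsize)  # 8000

# A smaller dtype shrinks the buffer when every value fits in it (0..999 fits in int16).
small = arr.astype(np.int16)
print(small.nbytes)             # 2000
```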

Array Creation


A NumPy array is created by passing an array-like data structure, such as a Python list or tuple, to np.array().

Let's create a 0-D array from a scalar, and 1-D, 2-D, and 3-D arrays from (nested) lists.

  • 0-D array: np.array(11)
  • 1-D array: np.array([1, 2, 3, 4, 5])
  • 2-D array: np.array([[1, 2, 3], [4, 5, 6]])
  • 3-D array: np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
array_0D = np.array(11)
array_1D = np.array([1, 2, 3, 4, 5])
array_2D = np.array([[1, 2, 3], [4, 5, 6]])
array_3D = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(array_0D)
print(array_1D)
print(array_2D)
print(array_3D)
11
[1 2 3 4 5]
[[1 2 3]
 [4 5 6]]
[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]

Here are 7 ways to create a NumPy array.

array_list = np.array([1,2,3], dtype=int) # From List
array_tuple = np.array((1.1,2.2,3.3)) # From Tuple
array_zeroes = np.zeros((2,3)) # Array of zeroes: 2 rows and 3 columns
array_ones = np.ones((2,3)) # Array of ones: 2 rows and 3 columns
array_empty = np.empty((2,4)) # Uninitialized array (arbitrary values): 2 rows and 4 columns
array_arange = np.arange(2,10,2) # Similar to Python range()
array_linspace = np.linspace(2,4,9) # Array of 9 evenly spaced numbers from 2 to 4 (inclusive)

Just like the dtype=int parameter, np.array accepts other parameters such as copy, order, subok, ndmin, and like. You can explore the other np.array parameters.
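
As a small sketch of two np.array parameters, dtype and ndmin, the example below forces a float element type and a minimum number of dimensions. The variable names are illustrative only.

```python
import numpy as np

# dtype sets the element type; here the integers are stored as floats.
as_float = np.array([1, 2, 3], dtype=float)
print(as_float)           # [1. 2. 3.]

# ndmin prepends axes until the array has at least that many dimensions.
at_least_2d = np.array([1, 2, 3], ndmin=2)
print(at_least_2d.shape)  # (1, 3)
```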

Let's practice some methods to create arrays

💡 Tip: Use help() to see the syntax when required

help(np.zeros)
array([[ 0.],
           [ 0.]])

    >>> s = (2,2)
    >>> np.zeros(s)
    array([[ 0.,  0.],
           [ 0.,  0.]])

    >>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
    array([(0, 0), (0, 0)],
          dtype=[('x', '<i4'), ('y', '<i4')])

Create a 1D array of ones.

arr = np.ones(9)
print(arr)
print(arr.dtype)
[1. 1. 1. 1. 1. 1. 1. 1. 1.]
float64

Notice that, by default, np.ones creates an array with the data type float64. Let's provide dtype explicitly.

arr = np.ones(9, dtype=int)
print(arr)
print(arr.dtype)
[1 1 1 1 1 1 1 1 1]
int64

Create a 4x3 array of zeroes.

arr = np.zeros((4,3), dtype=int)
print(arr)
[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]

Create an array of the integers strictly between 3 and 7.

arr = np.arange(4,7)
print(arr)
[4 5 6]

Create an array of integers from 5 to 20 with a step of 2

arr = np.arange(5,21,2)
print(arr)
[ 5  7  9 11 13 15 17 19]

Create an array of 10 random integers from 0 to 4.

arr = np.random.randint(5,size=10)
print(arr)
[3 2 2 0 4 0 1 3 2 0]

Create an array of 10 random integers strictly between 6 and 9 (that is, 7 or 8, because the upper bound of randint is exclusive).

arr = np.random.randint(7,9,size=10)
print(arr)
[8 8 7 7 8 8 8 7 7 7]
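
Random output changes on every run. As a hedged alternative to np.random.randint, NumPy's Generator interface (np.random.default_rng) accepts a seed, which makes a result reproducible. The seed value 42 below is an arbitrary choice for illustration.

```python
import numpy as np

# A seeded generator returns the same sequence every time it is recreated.
rng = np.random.default_rng(seed=42)
sample = rng.integers(7, 9, size=10)  # upper bound is exclusive, so values are 7 or 8

again = np.random.default_rng(seed=42).integers(7, 9, size=10)
print((sample == again).all())  # True
```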

Create a 2x3 2D array of random numbers between 0 and 1.

arr = np.random.random([2,3])
print(arr)
[[0.9664729  0.33623868 0.52633769]
 [0.80454667 0.68146984 0.08063325]]

Create an array of 10 evenly spaced numbers from 1.5 to 2 (both included).

arr = np.linspace(1.5,2,10)
print(arr)
[1.5        1.55555556 1.61111111 1.66666667 1.72222222 1.77777778
 1.83333333 1.88888889 1.94444444 2.        ]
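
By default np.linspace includes the stop value. The sketch below shows its endpoint and retstep parameters, which control whether the stop value is included and report the spacing used. The inputs are illustrative.

```python
import numpy as np

with_end = np.linspace(0, 1, 5)                     # 0, 0.25, 0.5, 0.75, 1
without_end = np.linspace(0, 1, 5, endpoint=False)  # 0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(1.5, 2, 10, retstep=True)
print(step)  # 0.5 / 9, roughly 0.0556
```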

That's all for the basic ways of creating arrays. You can also explore these other 4 ways to create arrays:

  • .full(): Create a constant array of any number ‘n’
  • .tile(): Create a new array by repeating an existing array for a particular number of times
  • .eye(): Create an identity matrix of any dimension
  • .random.randint(): Create a random array of integers within a particular range
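
A minimal sketch of the first three functions in this list follows. The shapes and fill values are arbitrary examples.

```python
import numpy as np

constant = np.full((2, 3), 7)    # 2x3 array where every element is 7
repeated = np.tile([1, 2], 3)    # [1 2] repeated 3 times: [1 2 1 2 1 2]
identity = np.eye(3, dtype=int)  # 3x3 identity matrix

print(constant)
print(repeated)
print(identity)
```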

Basic Operations

NumPy can perform a variety of operations; the most basic include addition, subtraction, and multiplication. Below are a few basic operations that can be done in NumPy without using loops.

Create a NumPy array to store the marks of 5 students.

marks = [1, 2, 3, 4, 5]
marks_np = np.array(marks)
print(marks_np)
[1 2 3 4 5]

Add marks of 5 subjects of two different students.

marks_A = [10,20,10,20,14]
marks_B = [23,12,43,12,43]

marks_np_A = np.array(marks_A)
marks_np_B = np.array(marks_B)

total = marks_np_A + marks_np_B # Add using + operator
print(total)
[33 32 53 32 57]

Convert weight of 5 students from kg to gram

weight = [45, 55, 53, 63, 60] # In KG
weight_np = np.array(weight)

weight_in_gram = weight_np * 1000 # 1kg = 1000gm
print(weight_in_gram)
[45000 55000 53000 63000 60000]

Calculate the BMI of 5 students. To calculate BMI we need:

  • Two arrays, one of heights and one of weights
  • The formula weight_in_kg / (height_in_m ** 2)
heights_in_inch = [71,72,73,74,75]
weights_in_lbs = [195, 180, 250, 230, 200]

First, let's convert height from inches to meters and weight from lbs to kg

height_in_m = np.array(heights_in_inch) * 0.0254
weight_in_kg = np.array(weights_in_lbs) * 0.453592

Now that we have converted the arrays into the right units, let's calculate BMI

BMI = weight_in_kg / (height_in_m ** 2)
print("BMI",BMI)
BMI [27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]

Here is a list of 5 common basic functions in NumPy ndarray:

  • .sum: returns sum of elements over a given axis
  • .min: returns the minimum along a given axis.
  • .max: returns the maximum along a given axis.
  • .cumsum: returns the cumulative sum of elements along a given axis.
  • .mean: returns the average of elements along a given axis.
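
The axis argument of these functions decides whether the result is computed down the columns (axis=0) or across each row (axis=1). The sketch below uses a small illustrative 2x3 array.

```python
import numpy as np

marks = np.array([[10, 20, 30],
                  [40, 50, 60]])

total = marks.sum()                 # 210, over all elements
column_sums = marks.sum(axis=0)     # [50, 70, 90]
row_sums = marks.sum(axis=1)        # [60, 150]
row_mins = marks.min(axis=1)        # [10, 40]
column_maxs = marks.max(axis=0)     # [40, 50, 60]
row_running = marks.cumsum(axis=1)  # [[10, 30, 60], [40, 90, 150]]
column_means = marks.mean(axis=0)   # [25.0, 35.0, 45.0]

print(total, column_sums, row_sums)
```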

NumPy also provides universal functions (ufuncs) such as sin, cos, and exp, which operate on arrays element by element.
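
As a brief sketch, a ufunc is applied to a whole array at once and returns an array of the same shape. The input values are illustrative.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])

sines = np.sin(angles)                # approximately [0, 1, 0]
powers = np.exp(np.array([0.0, 1.0])) # approximately [1, 2.71828]

print(isinstance(np.sin, np.ufunc))   # True
print(sines.shape == angles.shape)    # True
```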

Indexing, Slicing, and Iterating

bmi_first_element = BMI[0] # first element
bmi_last_element = BMI[-1] # last element
bmi_first_five_elements = BMI[0:5] # first five elements (indices 0 to 4)
bmi_last_five_elements = BMI[-5:] # last five elements
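
The same indexing and slicing syntax extends to more than one dimension, and arrays can be iterated with a for loop. The sketch below uses a small illustrative 2-D array rather than the BMI array.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid[0, 1])    # 2, the element at row 0, column 1
print(grid[:, 0])    # [1 4], the first column
print(grid[1, -2:])  # [5 6], the last two elements of row 1

# Iterating over a 2-D array yields one row at a time.
row_totals = [int(row.sum()) for row in grid]
print(row_totals)    # [6, 15]

# .flat iterates over every element in row-major order.
flat_values = [int(x) for x in grid.flat]
print(flat_values)   # [1, 2, 3, 4, 5, 6]
```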

Filter BMI array where BMI > 23

# Conditional Filter

BMI_filtered = BMI[BMI > 23]
print(BMI_filtered)
[27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]
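
A filter like this works because the comparison produces a boolean array, which is then used as an index. The sketch below shows the mask itself, a combined condition, and np.where. It uses rounded, illustrative BMI values and hypothetical labels.

```python
import numpy as np

bmi = np.array([27.2, 24.4, 33.0, 29.5, 25.0])  # rounded illustrative values

# The comparison yields one boolean per element.
mask = bmi > 25
print(mask)  # [ True False  True  True False]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
in_range = bmi[(bmi >= 25) & (bmi < 30)]
print(in_range)  # 27.2, 29.5 and 25.0

# np.where picks a value per element depending on the condition.
labels = np.where(bmi >= 30, "high", "ok")
print(labels)
```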

Now you know the basics to work with a NumPy array and you should be able to create arrays and perform operations on them.
