NumPy: Difference between reshape(), flatten(), ravel()

NumPy (Numerical Python) is a powerful library for scientific computing. It works with ndarray, NumPy's array object, which can have one or more dimensions.

Some calculations require reducing the number of dimensions of a multidimensional NumPy array. NumPy offers several methods to reshape or flatten such an array, and it helps to know how they differ and when to use each one. In this post I'll discuss three of those methods: reshape(), flatten() and ravel().

reshape()

Using reshape() we can change the shape of a NumPy array. Reshaping must not change the size of the array, that is, its total number of elements. In the example below test is a 4×3 array, so test.size is 12. Reshaping it to (6,2) works because the reshaped array also has 12 elements. A shape like (6,3) or (4,2) does not hold exactly 12 elements, so reshape() raises a ValueError.

In reshape() you can use order='F' for Fortran-like (column-major) index ordering. If you do not specify order, order='C' (row-major) is the default.

reshape() returns a view of the original ndarray when possible; otherwise it returns a copy. So if you modify the reshaped array, the values in the original array may change as well.

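import numpy as np
# Create a 2-D array with 4 rows and 3 columns
test = np.random.random((4, 3))
test.shape  # (4, 3)
test.size   # 12
# Reshape to 6 rows and 2 columns; the size stays 12
test_reshaped = test.reshape((6, 2))
test_reshaped
# Same target shape, but with Fortran-like (column-major) index ordering
test.reshape((6, 2), order='F')
# Convert into a 1-D array
test_reshaped.reshape(-1)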
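The size rule and the effect of order are easier to see on predictable values. The following is a minimal sketch, assuming an illustrative array named demo built with np.arange (it is not part of the original example), so the printed rows are the same on every run.

```python
import numpy as np

# Illustrative array (an assumption for this sketch): values 0..11 in a 4x3 layout
demo = np.arange(12).reshape(4, 3)

# The same 12 elements, read and written in two different index orders
c_order = demo.reshape((6, 2))             # default order='C' (row-major)
f_order = demo.reshape((6, 2), order='F')  # Fortran-like (column-major)
print(c_order[0])  # [0 1]
print(f_order[0])  # [0 7]

# -1 lets NumPy infer one dimension from the size
inferred = demo.reshape(-1, 4)
print(inferred.shape)  # (3, 4)

# A target shape with a different size raises ValueError
raised = False
try:
    demo.reshape((6, 3))  # 18 elements requested, only 12 available
except ValueError as err:
    raised = True
    print("reshape failed:", err)
```

With order='F' the first column of the result is filled first, which is why the first row differs from the default ordering.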

flatten()

Using flatten() you can get a 1-D array from a multidimensional array. So you may say flatten() and reshape(-1) are almost the same. Why almost? The comparison section further down explains the difference. flatten() always returns a copy of the original ndarray, so changes made to the flattened array do not affect the original array.
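The copy behaviour can be confirmed with a short sketch. The array name original and its values are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative 2x3 array (not from the post)
original = np.arange(6).reshape(2, 3)
flat_copy = original.flatten()

# Modify the flattened copy only
flat_copy[0] = 99

print(original[0, 0])                         # 0, the original is untouched
print(np.shares_memory(original, flat_copy))  # False
```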

ravel()

ravel() also returns a flattened 1-D array. Unlike flatten(), it returns a view of the original array whenever possible. Here too you can use order='F' or order='C'.
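A small sketch of "whenever possible": on a C-contiguous array, ravel() with the default order can return a view, while order='F' has to rearrange the elements and therefore copies. The array name grid is an assumption for this example.

```python
import numpy as np

# Illustrative C-contiguous 2x3 array (not from the post)
grid = np.arange(6).reshape(2, 3)

flat_f = grid.ravel(order='F')  # column-major read-out: [0 3 1 4 2 5]
flat_view = grid.ravel()        # default order='C': a view is possible here

print(np.shares_memory(grid, flat_view))  # True
print(np.shares_memory(grid, flat_f))     # False, a copy was needed

# Writing through the view changes the original array
flat_view[0] = -1
print(grid[0, 0])  # -1
print(flat_f[0])   # 0, the copy keeps the old value
```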

Comparing reshape(), flatten(), ravel()

Now let us create three 1-D NumPy arrays from test.

test_reshaped1 = test.reshape(-1)
test_flatten = test.flatten()
test_ravel = test.ravel()
print(id(test), id(test_reshaped1), id(test_flatten), id(test_ravel))

All four objects have their own unique ids, because even a view is a separate Python object.

ndarray has an attribute called base. For an array that owns its memory, base is None; for a view, base refers to the array whose memory it uses. Here base is None for test (as it is the original array) and for test_flatten (as it is a copy of the original array). For both test_reshaped1 and test_ravel, base is test, because both are just views of test.

test_reshaped1.base is test
test_flatten.base is test
test_ravel.base is test

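You can also use np.may_share_memory() to check whether two arrays might share the same memory. It is a fast, conservative check, so True means only that the arrays may overlap; np.shares_memory() performs an exact check.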

print(np.may_share_memory(test, test_ravel))
print(np.may_share_memory(test, test_flatten))
print(np.may_share_memory(test, test_reshaped1))
print(np.may_share_memory(test_ravel, test_reshaped1))

If two arrays share memory, then updating either of them changes the other one too. Updating the 6th element of test_flatten does not affect any of the other arrays, because test_flatten is a copy.
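The update described here could look like the following sketch. It rebuilds the four arrays so that it runs on its own; the value 1000.0 and the helper name before are assumptions for illustration, and index 5 addresses the 6th element.

```python
import numpy as np

test = np.random.random((4, 3))
test_reshaped1 = test.reshape(-1)
test_flatten = test.flatten()
test_ravel = test.ravel()

# Keep a snapshot of the original values for comparison
before = test.copy()

# Update the 6th element (index 5) of the flattened copy
test_flatten[5] = 1000.0

# The original array and both views are unchanged
print(np.array_equal(test, before))      # True
print(test_ravel[5], test_reshaped1[5])  # still the original random value
```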

If you change any element in test_ravel, test_reshaped1 or test, the change is reflected in the other two as well, because all three share the same memory.
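As a counterpart, here is a minimal sketch of an update through a view. It rebuilds the arrays so it is self-contained; the value -1.0 is an arbitrary choice that cannot occur in the random data, which lies in [0, 1).

```python
import numpy as np

test = np.random.random((4, 3))
test_reshaped1 = test.reshape(-1)
test_flatten = test.flatten()
test_ravel = test.ravel()

# Update the 6th element (flat index 5, i.e. row 1, column 2) through the ravel view
test_ravel[5] = -1.0

print(test[1, 2])         # -1.0, the original array changed
print(test_reshaped1[5])  # -1.0, the other view changed too
print(test_flatten[5])    # unchanged, because flatten() made a copy
```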

Conclusion:

flatten() always returns a copy. reshape() tries to return a view, but it is not always possible to reshape an array without copying it; in that case it returns a copy. The same holds for ravel(). So the best practice is to check with the base attribute or np.may_share_memory() whether the returned object is a view or a copy.
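One case where a view is not possible is flattening a transposed array, whose elements are no longer laid out in row-major order. The sketch below illustrates this under a few assumptions: is_view_of is a hypothetical helper written for this example, and it uses np.shares_memory(), the exact counterpart of np.may_share_memory().

```python
import numpy as np

def is_view_of(candidate, source):
    """Hypothetical helper: True if candidate shares memory with source."""
    return np.shares_memory(candidate, source)

# Illustrative array that owns its data (not from the post)
base_arr = np.arange(12).reshape(4, 3).copy()
transposed = base_arr.T  # a view with a non-row-major memory layout

print(is_view_of(base_arr.reshape(-1), base_arr))    # True: view
print(is_view_of(base_arr.ravel(), base_arr))        # True: view
print(is_view_of(base_arr.flatten(), base_arr))      # False: always a copy
print(is_view_of(transposed.reshape(-1), base_arr))  # False: copy was required
print(is_view_of(transposed.ravel(), base_arr))      # False: copy was required
```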

Use flatten() when you want a 1-D array whose changes will not affect the original array. Use reshape() when you need the same data in a different shape and accept that changes made through a view also show up in the original. ravel() can be used like reshape(-1), but it can only produce a 1-D array.