Chapter-19

I/O operations with Numpy

  • We can save (write) ndarray objects to a binary file for later use.
  • Later, whenever these objects are required, we can read them back from that
    binary file.
  • save() ==> to save/write an ndarray object to a file
  • load() ==> to read an ndarray object from a file

Syntax

  • save(file, arr, allow_pickle=True, fix_imports=True) ==> Save an array to a
    binary file in NumPy .npy format.
  • load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
    encoding='ASCII') ==> Load arrays or pickled objects from .npy, .npz or pickled
    files.
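
The allow_pickle defaults in the two signatures differ: save() allows pickling, load() does not. The following is a minimal sketch of what that means for an object-dtype array; the temporary directory, the file name objects.npy and the sample objects are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np

# Work in a throwaway directory so no existing file is overwritten.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "objects.npy")  # hypothetical file name

# Build a 1-D object array holding arbitrary Python objects.
obj_arr = np.empty(2, dtype=object)
obj_arr[0] = {"a": 1}
obj_arr[1] = [1, 2, 3]

# save() pickles object arrays because its default is allow_pickle=True.
np.save(path, obj_arr)

# load() refuses pickled data by default (allow_pickle=False).
try:
    np.load(path)
    load_failed = False
except ValueError:
    load_failed = True

# Opting in explicitly restores the objects. Only do this for trusted files,
# because unpickling can execute arbitrary code.
restored = np.load(path, allow_pickle=True)
print(load_failed)
print(restored)
```

Plain numeric arrays, like the ones used in the rest of this chapter, never need pickling, so the load() default works for them.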

Example

Python
In [445]: 
import numpy as np 
help(np.save) 

Output

PowerShell
Help on function save in module numpy: 
 
save(file, arr, allow_pickle=True, fix_imports=True) 
    Save an array to a binary file in NumPy ``.npy`` format.   

Example

Python
In [446]: 
import numpy as np 
help(np.load) 

Output

PowerShell
Help on function load in module numpy: 
 
load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII') 
    Load arrays or pickled objects from ``.npy``, ``.npz`` or pickled files. 
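
The mmap_mode parameter in the signature above can be illustrated with a short sketch. It assumes a temporary directory and a hypothetical file name big.npy; with mmap_mode='r' the array is memory-mapped read-only, so only the parts that are accessed are read from disk.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "big.npy")  # hypothetical file name

# Save a reasonably large 1-D array.
np.save(path, np.arange(1_000_000))

# Memory-map the file instead of reading it fully into memory.
mm = np.load(path, mmap_mode="r")
is_memmap = isinstance(mm, np.memmap)

# Slicing reads only the requested part of the file.
first_five = mm[:5].tolist()
print(is_memmap)
print(first_five)

# Release the mapping before the file is removed or overwritten.
del mm
```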

save() and load() => single ndarray

Example

Python
In [447]: 
# Saving an ndarray object to a file and reading it back (save_read_obj.py) 
import numpy as np 
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3) 
 
#save/serialize ndarray object to a file 
np.save('out.npy',a) 
 
#load/deserialize ndarray object from a file 
out_array = np.load('out.npy') 
print(out_array)

Output

PowerShell
[[10 20 30] 
 [40 50 60]] 

Note

  • The data is stored in binary form.
  • The file extension should be .npy; if the file name does not already end in
    .npy, save() appends the extension itself.
  • save() writes only one object to a file. To write multiple objects to a single
    file, use savez().
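
The extension rule can be checked with a small sketch. The temporary directory and the file names out and data.npy are assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
a = np.array([[10, 20, 30], [40, 50, 60]])

# No extension given: save() appends ".npy" to the name.
np.save(os.path.join(tmp_dir, "out"), a)

# Extension already present: the name is used unchanged.
np.save(os.path.join(tmp_dir, "data.npy"), a)

created = sorted(os.listdir(tmp_dir))
print(created)
```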

savez() and load() => multiple ndarrays

Example

Python
In [448]: 
import numpy as np 
help(np.savez) 

Output

PowerShell
Help on function savez in module numpy: 
 
savez(file, *args, **kwds) 
    Save several arrays into a single file in uncompressed ``.npz`` format. 
     
    If arguments are passed in with no keywords, the corresponding variable 
    names, in the ``.npz`` file, are 'arr_0', 'arr_1', etc. If keyword 
    arguments are given, the corresponding variable names, in the ``.npz`` 
    file will match the keyword names.

Example

Python
In [449]: 
# Saving multiple ndarray objects to the binary file: 
import numpy as np 
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3) 
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2) 
 
#save/serialize ndarrays object to a file 
np.savez('out.npz',a,b) 
 
#reading ndarray objects from a file 
npzfileobj = np.load('out.npz') #returns NpzFile object 
print(f"Type of the npzfileobj : {type(npzfileobj)}") 
print(npzfileobj.files) 
print(npzfileobj['arr_0']) 
print(npzfileobj['arr_1'])

Output

PowerShell
Type of the npzfileobj : <class 'numpy.lib.npyio.NpzFile'> 
['arr_0', 'arr_1'] 
[[10 20 30] 
 [40 50 60]] 
[[ 70  80] 
 [ 90 100]] 
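
As the help text for savez() states, arrays passed as keyword arguments are stored under the keyword names instead of arr_0, arr_1. A minimal sketch, assuming a temporary directory and the hypothetical keyword names first and second:

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "named.npz")  # hypothetical file name

a = np.array([[10, 20, 30], [40, 50, 60]])
b = np.array([[70, 80], [90, 100]])

# Keyword arguments become the names stored inside the .npz file.
np.savez(path, first=a, second=b)

npzfileobj = np.load(path)
names = sorted(npzfileobj.files)
first = npzfileobj["first"]
second = npzfileobj["second"]
npzfileobj.close()  # release the underlying file handle

print(names)
print(first)
print(second)
```

Named entries are usually easier to work with than positional ones, because the reader does not have to remember the order in which the arrays were saved.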

Example

Python
In [450]: 
# reading the file objects using a for loop 
import numpy as np 
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3) 
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2) 
 
#save/serialize ndarrays object to a file 
np.savez('out.npz',a,b) 
 
#reading ndarray objects from a file 
npzfileobj = np.load('out.npz') #returns NpzFile object ==> <class 'numpy.lib.npyio.NpzFile'> 
print(type(npzfileobj)) 
print(npzfileobj.files) 
for i in npzfileobj: 
    print(f"Name of the file : {i}") 
    print("Contents in the file :")
    print(npzfileobj[i]) 
    print("*"*80)

Output

PowerShell
<class 'numpy.lib.npyio.NpzFile'> 
['arr_0', 'arr_1'] 
Name of the file : arr_0 
Contents in the file : 
[[10 20 30] 
 [40 50 60]] 
***************************************************************************** 
Name of the file : arr_1 
Contents in the file : 
[[ 70  80] 
 [ 90 100]] 
***************************************************************************** 
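
An NpzFile object keeps the underlying file open until it is closed. One possible pattern, sketched here under the assumption of a temporary directory, is to open it in a with statement and copy the arrays into a dictionary (the name arrays is an illustrative choice):

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "out.npz")

a = np.array([[10, 20, 30], [40, 50, 60]])
b = np.array([[70, 80], [90, 100]])
np.savez(path, a, b)

# The with statement closes the file automatically when the block ends.
with np.load(path) as npzfileobj:
    # Indexing reads each array into memory, so it stays usable afterwards.
    arrays = {name: npzfileobj[name] for name in npzfileobj.files}

for name, value in arrays.items():
    print(name)
    print(value)
```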

Note

  • np.save() ==> Save an array to a binary file in .npy format.
  • np.savez() ==> Save several arrays into a single file in .npz format, in
    uncompressed form.
  • np.savez_compressed() ==> Save several arrays into a single file in .npz format,
    in compressed form.
  • np.load() ==> Load/read arrays from .npy or .npz files.

savez_compressed() and load() => ndarray compressed form

Example

Python
In [451]: 
import numpy as np 
help(np.savez_compressed)

Output

PowerShell
Help on function savez_compressed in module numpy: 
 
savez_compressed(file, *args, **kwds) 
    Save several arrays into a single file in compressed ``.npz`` format.

Example

Python
In [452]: 
# compressed form 
 
import numpy as np 
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2) 
 
#save/serialize ndarrays object to a file 
np.savez_compressed('out_compressed.npz',a,b) 
 
#reading ndarray objects from a file 
npzfileobj = np.load('out_compressed.npz') #returns NpzFile object 
#print(type(npzfileobj)) 
print(npzfileobj.files) 
print(npzfileobj['arr_0']) 
print(npzfileobj['arr_1']) 

Output

PowerShell
['arr_0', 'arr_1'] 
[[10 20 30] 
 [40 50 60]] 
[[ 70  80] 
 [ 90 100]] 

Example

Python
In [453]: 
# Analysis: compare the sizes of the two files 
%ls out.npz  out_compressed.npz 

Output

PowerShell
 Volume in drive D is BigData 
 Volume Serial Number is 8E56-9F3B 
 
 Directory of D:\Youtube_Videos\DurgaSoft\DataScience\JupyterNotebooks_Numpy\
 Chapter_wise_Notes 
 
 
 Directory of D:\Youtube_Videos\DurgaSoft\DataScience\JupyterNotebooks_Numpy\
 Chapter_wise_Notes 
 
15-08-2021  01:55               546 out.npz 
15-08-2021  01:56               419 out_compressed.npz 
               2 File(s)            965 bytes 
               0 Dir(s)  85,327,192,064 bytes free 

If we can save objects in compressed form, why do we need the uncompressed form?

  • compressed form ==> the file takes less disk space, but saving and loading are
    slower because the data must be compressed and decompressed.
  • uncompressed form ==> the file takes more disk space, but saving and loading
    are faster.
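
The size difference can be measured directly. This is a rough sketch, assuming a temporary directory and a highly compressible array of zeros; real data usually compresses far less well, so the numbers are illustrative only.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.npz")
packed_path = os.path.join(tmp_dir, "packed.npz")

# One million float64 zeros: about 8 MB of raw data, very compressible.
data = np.zeros((1000, 1000))

np.savez(plain_path, data)
np.savez_compressed(packed_path, data)

plain_size = os.path.getsize(plain_path)
packed_size = os.path.getsize(packed_path)
print(f"uncompressed: {plain_size} bytes")
print(f"compressed  : {packed_size} bytes")

# Both files give back the same array.
with np.load(packed_path) as npz:
    same = np.array_equal(npz["arr_0"], data)
print(same)
```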

Note

  • save() produces a file with the .npy extension.
  • savez() and savez_compressed() produce a file with the .npz extension.

savetxt() and loadtxt() => ndarray object to text file

  • To save an ndarray object to a text file, use the savetxt() function.
  • To read an ndarray object from a text file, use the loadtxt() function.

Example

Python
In [454]: 
import numpy as np 
help(np.savetxt)

Output

PowerShell
Help on function savetxt in module numpy: 
 
savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None) 
    Save an array to a text file.

Example

Python
In [455]: 
import numpy as np 
help(np.loadtxt)

Output

PowerShell
Help on function loadtxt in module numpy: 
 
loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, like=None) 
    Load data from a text file. 
     
    Each row in the text file must have the same number of values.

Example

Python
In [456]: 
import numpy as np 
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3) 
 
#save/serialize ndarrays object to a file 
np.savetxt('out.txt',a,fmt='%.1f') 
#reading ndarray objects from a file and default dtype is float 
out_array1 = np.loadtxt('out.txt') 
print("Output array in default format : float") 
print(out_array1) 
 
#reading ndarray objects from a file with dtype explicitly set to int 
print("Output array in int format") 
out_array2 = np.loadtxt('out.txt',dtype=int) 
print(out_array2)

Output

PowerShell
Output array in default format : float 
[[10. 20. 30.] 
 [40. 50. 60.]] 
Output array in int format 
[[10 20 30] 
 [40 50 60]] 
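
To see what the fmt parameter actually changes, it can help to look at the text that savetxt() writes. The sketch below assumes a temporary directory and two hypothetical file names, and compares the default fmt='%.18e' with fmt='%.1f'.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
default_path = os.path.join(tmp_dir, "default_fmt.txt")
short_path = os.path.join(tmp_dir, "short_fmt.txt")

a = np.array([[10, 20, 30], [40, 50, 60]])

np.savetxt(default_path, a)            # default fmt='%.18e' (scientific notation)
np.savetxt(short_path, a, fmt="%.1f")  # one digit after the decimal point

# Read the raw text back to inspect how each value was written.
with open(default_path) as fh:
    default_lines = fh.read().splitlines()
with open(short_path) as fh:
    short_lines = fh.read().splitlines()

print(default_lines[0])
print(short_lines[0])
```

Both files load back to the same values with loadtxt(); fmt only controls how the numbers are written as text.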

Example

Python
In [457]: 
## save ndarray object(str) into a text file 
import numpy as np 
a1 = np.array([['Sunny',1000],['Bunny',2000],['Chinny',3000],['Pinny',4000]]) 
 
#save this ndarray to a text file 
np.savetxt('out.txt',a1) # by default fmt='%.18e', which raises a TypeError for string data 

Output

PowerShell
--------------------------------------------------------------------------- 
TypeError                                 Traceback (most recent call last) 
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding) 
   1432                 try: 
-> 1433                     v = format % tuple(row) + newline 
   1434                 except TypeError as e: 
 
TypeError: must be real number, not numpy.str_ 
 
The above exception was the direct cause of the following exception: 
 
TypeError                                 Traceback (most recent call last) 
<ipython-input-457-bf45be6c6703> in <module> 
      4  
      5 #save this ndarray to a text file 
----> 6 np.savetxt('out.txt',a1) # by default fmt='%.18e', which raises a TypeError for string data 
 
<__array_function__ internals> in savetxt(*args, **kwargs) 
 
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding) 
   1433                     v = format % tuple(row) + newline 
   1434                 except TypeError as e: 
-> 1435                     raise TypeError("Mismatch between array dtype ('%s') and " 
   1436                                     "format specifier ('%s')" 
   1437                                     % (str(X.dtype), format)) from e 
 
TypeError: Mismatch between array dtype ('<U11') and format specifier ('%.18e %.18e') 

savetxt() conclusions

  • savetxt() can store only 1-D and 2-D ndarrays.
  • Passing a 3-D (or higher-dimensional) array to savetxt() raises a ValueError.

Example

Python
In [458]: 
import numpy as np 
a = np.arange(24).reshape(2,3,4) 
np.savetxt('output.txt',a)

Output

PowerShell
--------------------------------------------------------------------------- 
ValueError                                Traceback (most recent call last) 
<ipython-input-458-24e4ed2480fe> in <module> 
      1 import numpy as np 
      2 a = np.arange(24).reshape(2,3,4) 
----> 3 np.savetxt('output.txt',a) 
 
<__array_function__ internals> in savetxt(*args, **kwargs) 
 
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding) 
   1378         # Handle 1-dimensional arrays 
   1379         if X.ndim == 0 or X.ndim > 2: 
-> 1380             raise ValueError( 
   1381                 "Expected 1D or 2D array, got %dD array instead" % X.ndim) 
   1382         elif X.ndim == 1: 
 
ValueError: Expected 1D or 2D array, got 3D array instead 
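
One possible workaround for the 3-D limitation, sketched here as an assumption rather than a built-in feature of savetxt(), is to reshape the array to 2-D before saving and to restore the original shape after loading. The shape is not stored in the text file, so it has to be known when reading.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "output.txt")

a = np.arange(24).reshape(2, 3, 4)

# Flatten every "page" of the 3-D array into one row: shape (2, 12).
np.savetxt(path, a.reshape(a.shape[0], -1), fmt="%d")

# Load the 2-D data and restore the original 3-D shape.
restored = np.loadtxt(path, dtype=int).reshape(a.shape)

print(restored.shape)
print(np.array_equal(restored, a))
```

If the shape must travel with the data, the binary save() and load() functions are the simpler choice, because the .npy format records shape and dtype.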

Example

Python
In [459]: 
# to store a str ndarray in a file, pass a string format such as '%s' through the fmt parameter 
import numpy as np 
a1 = np.array([['Sunny',1000],['Bunny',2000],['Chinny',3000],['Pinny',4000]]) 
 
#save this ndarray to a text file 
np.savetxt('out.txt',a1,fmt='%s %s') 
 
#reading ndarray from the text file 
a2 = np.loadtxt('out.txt',dtype='str') 
print(f"Type of a2(fetching the data from text file) : {type(a2)}") 
print(f"The a2 value after fetching the data from text file : \n {a2}")

Output

PowerShell
Type of a2(fetching the data from text file) : <class 'numpy.ndarray'> 
The a2 value after fetching the data from text file :  
 [['Sunny' '1000'] 
 ['Bunny' '2000'] 
 ['Chinny' '3000'] 
 ['Pinny' '4000']] 

Creating ndarray object by reading a file

Example

Python
In [460]: 
# out.txt: 
# ------- 
# Sunny 1000 
# Bunny 2000 
# Chinny 3000 
# Pinny 4000 
# Zinny 5000 
# Vinny 6000 
# Minny 7000 
# Tinny 8000 
# creating ndarray from text file data: 
import numpy as np 
#reading ndarray from the text file 
a2 = np.loadtxt('out.txt',dtype='str') 
print(f"Type of a2 : {type(a2)}") 
print(a2) 

Output

PowerShell
Type of a2 : <class 'numpy.ndarray'> 
[['Sunny' '1000'] 
 ['Bunny' '2000'] 
 ['Chinny' '3000'] 
 ['Pinny' '4000'] 
 ['Zinny' '5000'] 
 ['Vinny' '6000'] 
 ['Minny' '7000'] 
 ['Tinny' '8000']] 
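
Reading everything with dtype='str' leaves the numbers as strings. A possible alternative, sketched below, reads each column separately with the usecols parameter of loadtxt() so the second column becomes an integer array. The temporary file and the variable names names and amounts are assumptions for illustration.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "out.txt")

# Create a small text file with one name and one number per row.
with open(path, "w") as fh:
    fh.write("Sunny 1000\nBunny 2000\nChinny 3000\nPinny 4000\n")

# Column 0 as strings, column 1 as integers.
names = np.loadtxt(path, dtype=str, usecols=0)
amounts = np.loadtxt(path, dtype=int, usecols=1)

print(names)
print(amounts)
print(amounts.sum())
```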

Writing ndarray objects to the csv file

  • CSV stands for comma-separated values.

Example

Python
In [461]: 
import numpy as np 
 
a1 = np.array([[10,20,30],[40,50,60]]) 
 
#save/serialize to a csv file 
np.savetxt('out.csv',a1,delimiter=',') 
 
#reading ndarray object from a csv file 
a2 = np.loadtxt('out.csv',delimiter=',') 
print(a2) 

Output

PowerShell
[[10. 20. 30.] 
 [40. 50. 60.]]
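
CSV files often carry a header row and integer values. The sketch below shows one way to write and read such a file with savetxt() and loadtxt(); the temporary directory and the column names c1, c2, c3 are hypothetical.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "out.csv")

a1 = np.array([[10, 20, 30], [40, 50, 60]])

# fmt='%d' writes plain integers; comments='' stops savetxt() from
# prefixing the header line with '# '.
np.savetxt(path, a1, delimiter=",", fmt="%d", header="c1,c2,c3", comments="")

with open(path) as fh:
    first_line = fh.readline().strip()

# skiprows=1 skips the header row when reading the data back.
a2 = np.loadtxt(path, delimiter=",", dtype=int, skiprows=1)

print(first_line)
print(a2)
```

If the default comments='# ' is kept instead, the header is written as a comment line, and loadtxt() ignores it without needing skiprows.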

Summary:

  • Save one ndarray object to a binary file (save() and load())
  • Save multiple ndarray objects to a binary file in uncompressed form (savez()
    and load())
  • Save multiple ndarray objects to a binary file in compressed form
    (savez_compressed() and load())
  • Save an ndarray object to a text file (savetxt() and loadtxt())
  • Save an ndarray object to a CSV file (savetxt() and loadtxt() with delimiter=',')
