I/O operations with NumPy
- We can save (write) ndarray objects to a binary file for later use.
- Later, whenever these objects are required, we can read them back from that binary file.
- save() ==> to save/write an ndarray object to a file
- load() ==> to read an ndarray object from a file
Syntax
- save(file, arr, allow_pickle=True, fix_imports=True) ==> Save an array to a binary file in NumPy .npy format.
- load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII') ==> Load arrays or pickled objects from .npy, .npz or pickled files.
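The signatures above show different defaults for allow_pickle: True for save() but False for load(). The following is a minimal sketch of what that means in practice for an object-dtype array. The file name objects.npy and the sample values are only illustrative assumptions, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile

import numpy as np

# An object-dtype array can only be stored by pickling its elements.
mixed = np.empty(2, dtype=object)
mixed[0] = {"name": "Sunny"}
mixed[1] = [1, 2, 3]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "objects.npy")  # illustrative file name
    np.save(path, mixed)  # allow_pickle=True is the default for save()

    try:
        np.load(path)  # allow_pickle=False is the default for load()
        load_refused = False
    except ValueError:
        load_refused = True

    # Opt in explicitly; do this only for files from a trusted source.
    restored = np.load(path, allow_pickle=True)

print(f"Default load refused the object array : {load_refused}")
print(restored[0]["name"])
```

Plain numeric arrays, like the ones used in the rest of these notes, do not need pickling, so the default load() call works for them.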
Example
Python
In [445]:
import numpy as np
help(np.save)
Output
PowerShell
Help on function save in module numpy:
save(file, arr, allow_pickle=True, fix_imports=True)
Save an array to a binary file in NumPy ``.npy`` format.
Example
Python
In [446]:
import numpy as np
help(np.load)
Output
PowerShell
Help on function load in module numpy:
load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII')
Load arrays or pickled objects from ``.npy``, ``.npz`` or pickled files.
save() and load() => single ndarray
Example
Python
In [447]:
# Saving an ndarray object to a file and reading it back (save_read_obj.py)
import numpy as np
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
#save/serialize ndarray object to a file
np.save('out.npy',a)
#load/deserialize ndarray object from a file
out_array = np.load('out.npy')
print(out_array)
Output
PowerShell
[[10 20 30]
[40 50 60]]
Note
- The data will be stored in binary form.
- The file extension should be .npy; otherwise the save() function itself will add that extension.
- By using the save() function we can write only one object to the file. If we want to write multiple objects to a file, then we should go for the savez() function.
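The extension rule in the note can be checked with a small sketch. The base name out without an extension and the use of a temporary directory are assumptions made only for this illustration.

```python
import os
import tempfile

import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]])

with tempfile.TemporaryDirectory() as tmp_dir:
    base = os.path.join(tmp_dir, "out")  # no extension given
    np.save(base, a)                     # save() appends '.npy'
    created = sorted(os.listdir(tmp_dir))
    restored = np.load(base + ".npy")    # load() needs the full file name

print(created)
print(restored)
```

Note that load() does not add the extension for us, so the full name out.npy has to be passed when reading.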
savez() and load() => multiple ndarrays
Example
Python
In [448]:
import numpy as np
help(np.savez)
Output
PowerShell
Help on function savez in module numpy:
savez(file, *args, **kwds)
Save several arrays into a single file in uncompressed ``.npz`` format.
If arguments are passed in with no keywords, the corresponding variable
names, in the ``.npz`` file, are 'arr_0', 'arr_1', etc. If keyword
arguments are given, the corresponding variable names, in the ``.npz``
file will match the keyword names.
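As the help text says, keyword arguments replace the default names 'arr_0', 'arr_1', etc. A minimal sketch follows; the keyword names marks and prices and the file name named.npz are made up for this example.

```python
import os
import tempfile

import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]])
b = np.array([[70, 80], [90, 100]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "named.npz")  # illustrative file name
    np.savez(path, marks=a, prices=b)          # keyword names become the keys

    # NpzFile works as a context manager, which closes the file for us.
    with np.load(path) as npzfileobj:
        names = sorted(npzfileobj.files)
        marks = npzfileobj["marks"]
        prices = npzfileobj["prices"]

print(names)
print(marks)
print(prices)
```

Named keys are usually easier to read back than positional names, because the reading code does not depend on the order in which the arrays were passed.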
Example
Python
In [449]:
# Saving multiple ndarray objects to the binary file:
import numpy as np
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2)
#save/serialize ndarrays object to a file
np.savez('out.npz',a,b)
#reading ndarray objects from a file
npzfileobj = np.load('out.npz') #returns NpzFile object
print(f"Type of the npzfileobj : {type(npzfileobj)}")
print(npzfileobj.files)
print(npzfileobj['arr_0'])
print(npzfileobj['arr_1'])
Output
PowerShell
Type of the npzfileobj : <class 'numpy.lib.npyio.NpzFile'>
['arr_0', 'arr_1']
[[10 20 30]
[40 50 60]]
[[ 70 80]
[ 90 100]]
Example
Python
In [450]:
# reading the file objects using a for loop
import numpy as np
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2)
#save/serialize ndarrays object to a file
np.savez('out.npz',a,b)
#reading ndarray objects from a file
npzfileobj = np.load('out.npz') #returns NpzFile object ==> <class 'numpy.lib.npyio.NpzFile'>
print(type(npzfileobj))
print(npzfileobj.files)
for i in npzfileobj:
    print(f"Name of the file : {i}")
    print("Contents in the file :")
    print(npzfileobj[i])
    print("*"*80)
Output
PowerShell
<class 'numpy.lib.npyio.NpzFile'>
['arr_0', 'arr_1']
Name of the file : arr_0
Contents in the file :
[[10 20 30]
[40 50 60]]
*****************************************************************************
Name of the file : arr_1
Contents in the file :
[[ 70 80]
[ 90 100]]
*****************************************************************************
Note
- np.save() ==> Save an array to a binary file in .npy format.
- np.savez() ==> Save several arrays into a single file in .npz format, in uncompressed form.
- np.savez_compressed() ==> Save several arrays into a single file in .npz format, in compressed form.
- np.load() ==> Load/read arrays from .npy or .npz files.
savez_compressed() and load() => ndarray compressed form
Example
Python
In [451]:
import numpy as np
help(np.savez_compressed)
Output
PowerShell
Help on function savez_compressed in module numpy:
savez_compressed(file, *args, **kwds)
Save several arrays into a single file in compressed ``.npz`` format.
Example
Python
In [452]:
# compressed form
import numpy as np
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
b = np.array([[70,80],[90,100]]) #2-D array with shape:(2,2)
#save/serialize ndarrays object to a file
np.savez_compressed('out_compressed.npz',a,b)
#reading ndarray objects from a file
npzfileobj = np.load('out_compressed.npz') #returns NpzFile object
#print(type(npzfileobj))
print(npzfileobj.files)
print(npzfileobj['arr_0'])
print(npzfileobj['arr_1'])
Output
PowerShell
['arr_0', 'arr_1']
[[10 20 30]
[40 50 60]]
[[ 70 80]
[ 90 100]]
Example
Python
In [453]:
# Analysis: compare the sizes of the two files
%ls out.npz out_compressed.npz
Output
PowerShell
Volume in drive D is BigData
Volume Serial Number is 8E56-9F3B
Directory of D:\Youtube_Videos\DurgaSoft\DataScience\JupyterNotebooks_Numpy\Chapter_wise_Notes
Directory of D:\Youtube_Videos\DurgaSoft\DataScience\JupyterNotebooks_Numpy\Chapter_wise_Notes
15-08-2021 01:55 546 out.npz
15-08-2021 01:56 419 out_compressed.npz
2 File(s) 965 bytes
0 Dir(s) 85,327,192,064 bytes free
If we can save objects in compressed form, why do we need the uncompressed form?
- compressed form ==> the file takes less disk space, but saving and loading are slower because the data has to be compressed and decompressed.
- uncompressed form ==> the file takes more disk space, but saving and loading are faster.
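The size side of this trade-off can be seen with a small sketch. It assumes a highly compressible array (all zeros) and the illustrative file names plain.npz and packed.npz. For real data the saving depends on how repetitive the values are.

```python
import os
import tempfile

import numpy as np

# An array of zeros compresses very well; real data may compress far less.
data = np.zeros((500, 500))

with tempfile.TemporaryDirectory() as tmp_dir:
    plain = os.path.join(tmp_dir, "plain.npz")    # illustrative file names
    packed = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain, data)               # uncompressed .npz
    np.savez_compressed(packed, data)   # compressed .npz

    plain_size = os.path.getsize(plain)
    packed_size = os.path.getsize(packed)

print(f"Uncompressed size : {plain_size} bytes")
print(f"Compressed size   : {packed_size} bytes")
```

The speed difference is not measured here. It can be checked in a notebook with %timeit on the two save calls.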
Note
- If we use the save() function, the file extension is .npy
- If we use the savez() or savez_compressed() functions, the file extension is .npz
savetxt() and loadtxt() => ndarray object to text file
- To save ndarray object to a text file we will use savetxt() function
- To read ndarray object from a text file we will use loadtxt() function
Example
Python
In [454]:
import numpy as np
help(np.savetxt)
Output
PowerShell
Help on function savetxt in module numpy:
savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)
Save an array to a text file.
Example
Python
In [455]:
import numpy as np
help(np.loadtxt)
Output
PowerShell
Help on function loadtxt in module numpy:
loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, like=None)
Load data from a text file.
Each row in the text file must have the same number of values.
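A few of the parameters listed in the two signatures can be tried together. The sketch below uses header from savetxt() and usecols from loadtxt(). The header text x y z and the file name with_header.txt are assumptions for illustration only.

```python
import os
import tempfile

import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_header.txt")  # illustrative file name

    # header is written as the first line, prefixed with comments='# '
    np.savetxt(path, a, fmt="%d", header="x y z")

    with open(path) as fh:
        first_line = fh.readline().rstrip("\n")

    # loadtxt() skips lines starting with '#', so the header is ignored.
    # usecols picks only the first and the third column.
    selected = np.loadtxt(path, dtype=int, usecols=(0, 2))

print(first_line)
print(selected)
```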
Example
Python
In [456]:
import numpy as np
a = np.array([[10,20,30],[40,50,60]]) #2-D array with shape:(2,3)
#save/serialize ndarrays object to a file
np.savetxt('out.txt',a,fmt='%.1f')
#reading ndarray objects from a file and default dtype is float
out_array1 = np.loadtxt('out.txt')
print("Output array in default format : float")
print(out_array1)
#reading ndarray objects from a file with dtype explicitly set to int
print("Output array in int format")
out_array2 = np.loadtxt('out.txt',dtype=int)
print(out_array2)
Output
PowerShell
Output array in default format : float
[[10. 20. 30.]
[40. 50. 60.]]
Output array in int format
[[10 20 30]
[40 50 60]]
Example
Python
In [457]:
## save ndarray object(str) into a text file
import numpy as np
a1 = np.array([['Sunny',1000],['Bunny',2000],['Chinny',3000],['Pinny',4000]])
#save this ndarray to a text file
np.savetxt('out.txt',a1) # the default fmt='%.18e' leads to an error for str data
Output
PowerShell
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
1432 try:
-> 1433 v = format % tuple(row) + newline
1434 except TypeError as e:
TypeError: must be real number, not numpy.str_
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-457-bf45be6c6703> in <module>
4
5 #save this ndarray to a text file
----> 6 np.savetxt('out.txt',a1) # the default fmt='%.18e' leads to an error for str data
<__array_function__ internals> in savetxt(*args, **kwargs)
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
1433 v = format % tuple(row) + newline
1434 except TypeError as e:
-> 1435 raise TypeError("Mismatch between array dtype ('%s') and "
1436 "format specifier ('%s')"
1437 % (str(X.dtype), format)) from e
TypeError: Mismatch between array dtype ('<U11') and format specifier ('%.18e %.18e')
savetxt() conclusions
- By using savetxt() we can store only 1-D and 2-D ndarrays.
- If we try to store a 3-D array into a file, savetxt() raises a ValueError.
Example
Python
In [458]:
import numpy as np
a = np.arange(24).reshape(2,3,4)
np.savetxt('output.txt',a)
Output
PowerShell
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-458-24e4ed2480fe> in <module>
1 import numpy as np
2 a = np.arange(24).reshape(2,3,4)
----> 3 np.savetxt('output.txt',a)
<__array_function__ internals> in savetxt(*args, **kwargs)
F:\Users\Gopi\anaconda3\lib\site-packages\numpy\lib\npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
1378 # Handle 1-dimensional arrays
1379 if X.ndim == 0 or X.ndim > 2:
-> 1380 raise ValueError(
1381 "Expected 1D or 2D array, got %dD array instead" % X.ndim)
1382 elif X.ndim == 1:
ValueError: Expected 1D or 2D array, got 3D array instead
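Since savetxt() accepts only 1-D and 2-D arrays, one possible workaround (a sketch, not the only option) is to reshape the 3-D array to 2-D before saving and to reshape it back after loading. The text file does not store the original shape, so this sketch assumes the reader already knows it. The file name flat.txt is illustrative.

```python
import os
import tempfile

import numpy as np

a = np.arange(24).reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "flat.txt")  # illustrative file name

    # Collapse the first two axes: (2, 3, 4) -> (6, 4), which is 2-D.
    np.savetxt(path, a.reshape(-1, a.shape[-1]), fmt="%d")

    flat = np.loadtxt(path, dtype=int)

# The shape is not stored in the text file; we must supply it again.
restored = flat.reshape(2, 3, 4)

print(flat.shape)
print(np.array_equal(restored, a))
```

If the shape has to travel with the data, the binary save() function is the simpler choice, because the .npy format supports arrays of any dimension.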
Example
Python
In [459]:
# to store str ndarray into a file we must specify the fmt parameter as str
import numpy as np
a1 = np.array([['Sunny',1000],['Bunny',2000],['Chinny',3000],['Pinny',4000]])
#save this ndarray to a text file
np.savetxt('out.txt',a1,fmt='%s %s')
#reading ndarray from the text file
a2 = np.loadtxt('out.txt',dtype='str')
print(f"Type of a2(fetching the data from text file) : {type(a2)}")
print(f"The a2 value after fetching the data from text file : \n {a2}")
Output
PowerShell
Type of a2(fetching the data from text file) : <class 'numpy.ndarray'>
The a2 value after fetching the data from text file :
[['Sunny' '1000']
['Bunny' '2000']
['Chinny' '3000']
['Pinny' '4000']]
Creating ndarray object by reading a file
Example
Python
In [460]:
# out.txt:
# -------
# Sunny 1000
# Bunny 2000
# Chinny 3000
# Pinny 4000
# Zinny 5000
# Vinny 6000
# Minny 7000
# Tinny 8000
# creating ndarray from text file data:
import numpy as np
#reading ndarray from the text file
a2 = np.loadtxt('out.txt',dtype='str')
print(f"Type of a2 : {type(a2)}")
print(a2)
Output
PowerShell
Type of a2 : <class 'numpy.ndarray'>
[['Sunny' '1000']
['Bunny' '2000']
['Chinny' '3000']
['Pinny' '4000']
['Zinny' '5000']
['Vinny' '6000']
['Minny' '7000']
['Tinny' '8000']]
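With dtype='str' every column comes back as a string, including the numbers. If the name should stay a string and the amount should become an integer, one option is a structured dtype. The sketch below assumes the field names name and salary (they are not part of the file) and writes a small sample file employees.txt first so that it is self-contained.

```python
import os
import tempfile

import numpy as np

rows = "Sunny 1000\nBunny 2000\nChinny 3000\nPinny 4000\n"

# Hypothetical field names: a string column and an integer column.
record_dtype = [("name", "U10"), ("salary", "i8")]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "employees.txt")  # illustrative file name
    with open(path, "w") as fh:
        fh.write(rows)

    # With a structured dtype, loadtxt() returns a 1-D array of records.
    data = np.loadtxt(path, dtype=record_dtype)

names = data["name"]
total = int(data["salary"].sum())

print(data.shape)
print(names)
print(total)
```

Each field can then be used as a normal ndarray, so arithmetic on the numeric column works without converting strings by hand.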
Writing ndarray objects to the csv file
- CSV stands for Comma-Separated Values.
Example
Python
In [461]:
import numpy as np
a1 = np.array([[10,20,30],[40,50,60]])
#save/serialize to a csv file
np.savetxt('out.csv',a1,delimiter=',')
#reading ndarray object from a csv file
a2 = np.loadtxt('out.csv',delimiter=',')
print(a2)
Output
PowerShell
[[10. 20. 30.]
[40. 50. 60.]]
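In the example above the default fmt='%.18e' is used, so the value 10 is written to the CSV file as 1.000000000000000000e+01 and is read back as a float. The following sketch shows one way to keep integers readable in the file by passing fmt='%d' and reading with dtype=int. The file name ints.csv is an assumption for illustration.

```python
import os
import tempfile

import numpy as np

a1 = np.array([[10, 20, 30], [40, 50, 60]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ints.csv")  # illustrative file name

    # fmt='%d' writes plain integers instead of scientific notation.
    np.savetxt(path, a1, fmt="%d", delimiter=",")

    with open(path) as fh:
        lines = fh.read().splitlines()

    # dtype=int gives back an integer array instead of the default float.
    a2 = np.loadtxt(path, delimiter=",", dtype=int)

print(lines)
print(a2)
```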
Summary:
- Save one ndarray object to a binary file (save() and load())
- Save multiple ndarray objects to a binary file in uncompressed form (savez() and load())
- Save multiple ndarray objects to a binary file in compressed form (savez_compressed() and load())
- Save an ndarray object to a text file (savetxt() and loadtxt())
- Save an ndarray object to a csv file (savetxt() and loadtxt() with delimiter=',')