COO
class sparse.COO(coords, data=None, shape=None, has_duplicates=True, sorted=False, cache=False)
A sparse multidimensional array.
The array is stored in COO (coordinate) format. It depends on NumPy and scipy.sparse for computation, but supports arrays of arbitrary dimension.
Parameters:
- coords (numpy.ndarray (COO.ndim, COO.nnz)) – An array holding the index locations of every value. Should have shape (number of dimensions, number of non-zeros).
- data (numpy.ndarray (COO.nnz,)) – An array of values.
- shape (tuple[int] (COO.ndim,)) – The shape of the array.
- has_duplicates (bool, optional) – Whether the supplied coords contains duplicates. Note that setting this to False when coords does have duplicates may result in undefined behaviour. See COO.sum_duplicates.
- sorted (bool, optional) – Whether the values in coords are sorted. Note that setting this to True when coords isn’t sorted may result in undefined behaviour. See COO.sort_indices.
- cache (bool, optional) – Whether to enable caching for various operations. See COO.enable_caching.
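To make the coords/data layout concrete, the following is a minimal NumPy-only sketch of how a (ndim, nnz) coordinate array and a (nnz,) value array describe a dense array. It does not use the sparse library itself, and the input values are made up for illustration. Repeated coordinates are accumulated, which mirrors the summation that COO.sum_duplicates is documented to perform.

```python
import numpy as np

# Hypothetical inputs laid out as the Parameters list describes:
# coords has shape (ndim, nnz), data has shape (nnz,).
coords = np.array([[0, 0, 1, 1],
                   [2, 2, 0, 3]])
data = np.array([10, 5, 7, 1])
shape = (2, 4)

# Build the dense equivalent. np.add.at accumulates repeated
# coordinates, so the duplicated entry (0, 2) becomes 10 + 5.
dense = np.zeros(shape, dtype=data.dtype)
np.add.at(dense, tuple(coords), data)

print(dense)
```

Plain fancy-index assignment (dense[tuple(coords)] = data) would keep only one of the duplicated values, which is why the sketch uses np.add.at instead.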
coords
    numpy.ndarray (ndim, nnz) – An array holding the coordinates of every nonzero element.
data
    numpy.ndarray (nnz,) – An array holding the values corresponding to COO.coords.
shape
    tuple[int] (ndim,) – The dimensions of this array.
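As a rough illustration of how these three attributes relate to the derived quantities ndim, nnz, size and density, here is a NumPy-only sketch. It assumes coords holds no duplicates and no explicitly stored zeros; the library's own properties may handle those cases differently.

```python
import numpy as np

# Coordinates of the five nonzero entries of a 4x4 array.
coords = np.array([[0, 1, 2, 2, 3],
                   [0, 1, 2, 3, 3]])
shape = (4, 4)

# coords is (ndim, nnz), so both values can be read off its shape.
ndim, nnz = coords.shape

# size counts every element, zeros included.
size = int(np.prod(shape))

# density is the ratio of nonzero elements to all elements.
density = nnz / size

print(ndim, nnz, size, density)
```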
See also
DOK – A mostly write-only sparse array.
Examples
You can create COO objects from NumPy arrays.

>>> x = np.eye(4, dtype=np.uint8)
>>> x[2, 3] = 5
>>> s = COO.from_numpy(x)
>>> s
<COO: shape=(4, 4), dtype=uint8, nnz=5, sorted=True, duplicates=False>
>>> s.data
array([1, 1, 1, 5, 1], dtype=uint8)
>>> s.coords
array([[0, 1, 2, 2, 3],
       [0, 1, 2, 3, 3]], dtype=uint8)

COO objects support basic arithmetic and binary operations.

>>> x2 = np.eye(4, dtype=np.uint8)
>>> x2[3, 2] = 5
>>> s2 = COO.from_numpy(x2)
>>> (s + s2).todense()
array([[2, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 2, 5],
       [0, 0, 5, 2]], dtype=uint8)
>>> (s * s2).todense()
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=uint8)

Binary operations support broadcasting.

>>> x3 = np.zeros((4, 1), dtype=np.uint8)
>>> x3[2, 0] = 1
>>> s3 = COO.from_numpy(x3)
>>> (s * s3).todense()
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 5],
       [0, 0, 0, 0]], dtype=uint8)

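For comparison, the same broadcast can be reproduced on dense NumPy arrays. This sketch uses only NumPy and assumes that sparse broadcasting follows NumPy's rules, as the example above suggests: the (4, 1) operand is stretched along its second axis to match the (4, 4) operand.

```python
import numpy as np

# Dense counterparts of the arrays used in the broadcasting example.
x = np.eye(4, dtype=np.uint8)
x[2, 3] = 5
x3 = np.zeros((4, 1), dtype=np.uint8)
x3[2, 0] = 1

# (4, 4) * (4, 1): x3 is broadcast across the columns, so only
# row 2 of x survives the element-wise product.
result = x * x3

print(result.shape)
print(result)
```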
COO objects also support dot products and reductions.

>>> s.dot(s.T).sum(axis=0).todense()
array([ 1,  1, 31,  6], dtype=uint64)

You can use NumPy ufunc operations on COO arrays as well.

>>> np.sum(s, axis=1).todense()
array([1, 1, 6, 1], dtype=uint64)
>>> np.round(np.sqrt(s, dtype=np.float64), decimals=1).todense()
array([[ 1. ,  0. ,  0. ,  0. ],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  2.2],
       [ 0. ,  0. ,  0. ,  1. ]])

Operations that would result in a dense array raise a ValueError, such as the following.

>>> np.exp(s)
Traceback (most recent call last):
    ...
ValueError: Performing this operation would produce a dense result: <ufunc 'exp'>

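One way to see why np.exp produces a dense result while np.sqrt does not: an element-wise function keeps the implicit zeros of a sparse array only if it maps 0 to 0. The sketch below checks that property with plain NumPy. The helper preserves_zero is hypothetical and is not part of the sparse API; the library's actual check may be implemented differently.

```python
import numpy as np

def preserves_zero(func):
    """Return True if func maps 0 to 0 (hypothetical helper)."""
    return bool(func(np.zeros(1))[0] == 0)

# sqrt(0) == 0, so unstored zeros stay zero and the result stays sparse.
print(preserves_zero(np.sqrt))

# exp(0) == 1, so every unstored zero would become a stored one.
print(preserves_zero(np.exp))
```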
You can also create COO arrays from coordinates and data.

>>> coords = [[0, 0, 0, 1, 1],
...           [0, 1, 2, 0, 3],
...           [0, 3, 2, 0, 1]]
>>> data = [1, 2, 3, 4, 5]
>>> s4 = COO(coords, data, shape=(3, 4, 5))
>>> s4
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, sorted=False, duplicates=True>

Following scipy.sparse conventions, you can also pass these as a tuple with rows and columns.

>>> rows = [0, 1, 2, 3, 4]
>>> cols = [0, 0, 0, 1, 1]
>>> data = [10, 20, 30, 40, 50]
>>> z = COO((data, (rows, cols)))
>>> z.todense()
array([[10,  0],
       [20,  0],
       [30,  0],
       [ 0, 40],
       [ 0, 50]])

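The (data, (rows, cols)) form carries the same information as a coords array. The NumPy-only sketch below stacks the row and column indices into a (2, nnz) array and infers the shape as the largest index plus one along each axis. That inference rule is an assumption, chosen because it matches the (5, 2) result shown above.

```python
import numpy as np

rows = [0, 1, 2, 3, 4]
cols = [0, 0, 0, 1, 1]
data = [10, 20, 30, 40, 50]

# Stack the per-axis index lists into a (ndim, nnz) coordinate array.
coords = np.stack([rows, cols])

# Assumed shape inference: largest index along each axis, plus one.
shape = tuple(int(axis.max()) + 1 for axis in coords)

# Scatter the values into a dense array (no duplicates in this input).
dense = np.zeros(shape, dtype=np.int64)
dense[tuple(coords)] = data

print(shape)
print(dense)
```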
You can also pass a dictionary or iterable of index/value pairs. Repeated indices imply summation:

>>> d = {(0, 0, 0): 1, (1, 2, 3): 2, (1, 1, 0): 3}
>>> COO(d)
<COO: shape=(2, 3, 4), dtype=int64, nnz=3, sorted=False, duplicates=False>
>>> L = [((0, 0), 1),
...      ((1, 1), 2),
...      ((0, 0), 3)]
>>> COO(L).todense()
array([[4, 0],
       [0, 2]])

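The rule that repeated indices imply summation can be sketched in plain Python, without the sparse library. The code accumulates values per index in a dictionary and then infers the shape as the largest index plus one along each axis. The shape rule is an assumption that is consistent with the outputs shown above, not a statement of how the constructor is implemented.

```python
from collections import defaultdict

pairs = [((0, 0), 1),
         ((1, 1), 2),
         ((0, 0), 3)]

# Accumulate values that share an index: (0, 0) receives 1 + 3.
totals = defaultdict(int)
for index, value in pairs:
    totals[index] += value

# Assumed shape inference: largest index along each axis, plus one.
ndim = len(pairs[0][0])
shape = tuple(max(idx[axis] for idx in totals) + 1 for axis in range(ndim))

# Dense view of the 2-D result; missing indices are zero.
dense = [[totals.get((i, j), 0) for j in range(shape[1])]
         for i in range(shape[0])]

print(dense)
```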
You can convert DOK arrays to COO arrays.

>>> from sparse import DOK
>>> s5 = DOK((5, 5), dtype=np.int64)
>>> s5[1:3, 1:3] = [[4, 5], [6, 7]]
>>> s5
<DOK: shape=(5, 5), dtype=int64, nnz=4>
>>> s6 = COO(s5)
>>> s6
<COO: shape=(5, 5), dtype=int64, nnz=4, sorted=False, duplicates=False>
>>> s6.todense()
array([[0, 0, 0, 0, 0],
       [0, 4, 5, 0, 0],
       [0, 6, 7, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Attributes
COO.T – Returns a new array which has the order of the axes reversed.
COO.dtype – The datatype of this array.
COO.nbytes – The number of bytes taken up by this object.
COO.ndim – The number of dimensions of this array.
COO.nnz – The number of nonzero elements in this array.
COO.size – The number of all elements (including zeros) in this array.
COO.density – The ratio of nonzero to all elements in this array.

Methods
COO.from_numpy(x) – Convert the given numpy.ndarray to a COO object.
COO.from_scipy_sparse(x) – Construct a COO array from a scipy.sparse.spmatrix.
COO.astype(dtype[, out]) – Copy of the array, cast to a specified type.
COO.round([decimals, out]) – Evenly round to the given number of decimals.
COO.reduce(method[, axis, keepdims]) – Performs a reduction operation on this array.
COO.sum([axis, keepdims, dtype, out]) – Performs a sum operation along the given axes.
COO.max([axis, keepdims, out]) – Maximize along the given axes.
COO.min([axis, keepdims, out]) – Minimize along the given axes.
COO.prod([axis, keepdims, dtype, out]) – Performs a product operation along the given axes.
COO.nanreduce(method[, identity, axis, keepdims]) – Performs a NaN-skipping reduction on this array.
COO.todense() – Convert this COO array to a dense numpy.ndarray.
COO.maybe_densify([max_size, min_density]) – Converts this COO array to a numpy.ndarray if not too costly.
COO.to_scipy_sparse() – Converts this COO object into a scipy.sparse.coo_matrix.
COO.tocsc() – Converts this array to a scipy.sparse.csc_matrix.
COO.tocsr() – Converts this array to a scipy.sparse.csr_matrix.
COO.dot(other) – Performs the equivalent of x.dot(y) for COO.
COO.reshape(shape) – Returns a new COO array that is a reshaped version of this array.
COO.transpose([axes]) – Returns a new array which has the order of the axes switched.

Utility functions
COO.broadcast_to(shape) – Performs the equivalent of numpy.broadcast_to for COO.
COO.enable_caching() – Enable caching of reshape, transpose, and tocsr/tocsc operations.
COO.linear_loc([signed]) – The nonzero coordinates of a flattened version of this array.
COO.sort_indices() – Sorts the COO.coords attribute.
COO.sum_duplicates() – Sums data corresponding to duplicates in COO.coords.
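The last three utility functions can be pictured with a NumPy-only sketch. It flattens each coordinate to a row-major linear position, sorts by that position, and then sums values that share a position. Treating the flattened order as row-major is an assumption, and the code illustrates the idea rather than the library's implementation. The input values are made up.

```python
import numpy as np

# Unsorted coordinates with one duplicate: (1, 3) appears twice.
coords = np.array([[1, 0, 1, 0],
                   [3, 2, 3, 0]])
data = np.array([5, 2, 4, 1])
shape = (2, 4)

# Analogue of linear_loc: position of each entry in the flattened array.
linear = np.ravel_multi_index(tuple(coords), shape)

# Analogue of sort_indices: reorder coords and data by linear position.
order = np.argsort(linear, kind="stable")
sorted_coords = coords[:, order]
sorted_data = data[order]

# Analogue of sum_duplicates: add up values that share a position.
unique_linear, inverse = np.unique(linear, return_inverse=True)
summed = np.zeros(unique_linear.shape, dtype=data.dtype)
np.add.at(summed, inverse, data)
unique_coords = np.stack(np.unravel_index(unique_linear, shape))

print(linear)
print(sorted_coords, sorted_data)
print(unique_coords, summed)
```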