COO

class sparse.COO(coords, data=None, shape=None, has_duplicates=True, sorted=False, cache=False)

A sparse multidimensional array.

The array is stored in coordinate (COO) format. It relies on NumPy and scipy.sparse for computation but supports arrays of arbitrary dimension.

Parameters:

coords (numpy.ndarray (COO.ndim, COO.nnz)) – An array holding the index locations of every value. Should have shape (number of dimensions, number of non-zeros).
data (numpy.ndarray (COO.nnz,)) – An array of values.
shape (tuple[int] (COO.ndim,)) – The shape of the array.
has_duplicates (bool, optional) – A value indicating whether the supplied value for coords has duplicates. Note that setting this to False when coords does have duplicates may result in undefined behaviour. See COO.sum_duplicates.
sorted (bool, optional) – A value indicating whether the values in coords are sorted. Note that setting this to True when coords isn’t sorted may result in undefined behaviour. See COO.sort_indices.
cache (bool, optional) – Whether to enable caching for various operations. See COO.enable_caching.

coords: numpy.ndarray (ndim, nnz) – An array holding the coordinates of every nonzero element.

data: numpy.ndarray (nnz,) – An array holding the values corresponding to COO.coords.

shape: tuple[int] (ndim,) – The dimensions of this array.
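
To make the relationship between coords, data, and shape concrete, here is a minimal pure-Python sketch that rebuilds a dense nested list from the three attributes. It is only an illustration of the storage layout, not the library's implementation; the helper name `coo_to_dense` is hypothetical, and the choice to sum values at duplicate coordinates is an assumption based on the summation behaviour described for COO.sum_duplicates.

```python
def coo_to_dense(coords, data, shape):
    """Build a nested-list dense array from COO-style coords, data and shape.

    Hypothetical helper for illustration only; it is not part of sparse.
    """
    def zeros(dims):
        # Recursively build a nested list of zeros with the given dimensions.
        if len(dims) == 1:
            return [0] * dims[0]
        return [zeros(dims[1:]) for _ in range(dims[0])]

    dense = zeros(shape)
    for k in range(len(data)):
        # Column k of coords is the full index of the k-th stored value.
        index = [axis[k] for axis in coords]
        target = dense
        for i in index[:-1]:
            target = target[i]
        # Accumulate, so values at duplicate coordinates are summed (assumption).
        target[index[-1]] += data[k]
    return dense


# Same coords and data as the 4x4 example shown in the Examples section.
coords = [[0, 1, 2, 2, 3],
          [0, 1, 2, 3, 3]]
data = [1, 1, 1, 5, 1]
dense = coo_to_dense(coords, data, shape=(4, 4))
```

Each column of coords pairs with one entry of data, which is why coords has shape (ndim, nnz) and data has shape (nnz,).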

See also

DOK: A mostly write-only sparse array.

Examples

You can create COO objects from NumPy arrays.

>>> x = np.eye(4, dtype=np.uint8)
>>> x[2, 3] = 5
>>> s = COO.from_numpy(x)
>>> s
<COO: shape=(4, 4), dtype=uint8, nnz=5, sorted=True, duplicates=False>
>>> s.data  
array([1, 1, 1, 5, 1], dtype=uint8)
>>> s.coords  
array([[0, 1, 2, 2, 3],
       [0, 1, 2, 3, 3]], dtype=uint8)

COO objects support basic arithmetic and binary operations.

>>> x2 = np.eye(4, dtype=np.uint8)
>>> x2[3, 2] = 5
>>> s2 = COO.from_numpy(x2)
>>> (s + s2).todense()  
array([[2, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 2, 5],
       [0, 0, 5, 2]], dtype=uint8)
>>> (s * s2).todense()  
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=uint8)

Binary operations support broadcasting.

>>> x3 = np.zeros((4, 1), dtype=np.uint8)
>>> x3[2, 0] = 1
>>> s3 = COO.from_numpy(x3)
>>> (s * s3).todense()  
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 5],
       [0, 0, 0, 0]], dtype=uint8)
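
The broadcasting result above can be reasoned about with a small dictionary-based sketch. This is a conceptual illustration in plain Python, not how the library computes it; the function `multiply_broadcast` is hypothetical and, as a simplifying assumption, only the second operand is allowed to have length-1 axes.

```python
def multiply_broadcast(a, b, b_shape):
    """Element-wise product of two 2-D sparse dicts mapping (i, j) -> value.

    Hypothetical helper: only `b` may have length-1 axes (assumption).
    """
    result = {}
    for (i, j), value in a.items():
        # A length-1 axis in b is stretched: every index maps to position 0.
        bi = 0 if b_shape[0] == 1 else i
        bj = 0 if b_shape[1] == 1 else j
        other = b.get((bi, bj), 0)
        if other != 0:
            # Only positions that are nonzero in both operands survive.
            result[(i, j)] = value * other
    return result


s = {(0, 0): 1, (1, 1): 1, (2, 2): 1, (2, 3): 5, (3, 3): 1}  # the 4x4 array
s3 = {(2, 0): 1}                                             # the (4, 1) column
product = multiply_broadcast(s, s3, b_shape=(4, 1))
# product == {(2, 2): 1, (2, 3): 5}, matching the dense output above
```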

COO objects also support dot products and reductions.

>>> s.dot(s.T).sum(axis=0).todense()   
array([ 1,  1, 31,  6], dtype=uint64)

You can use NumPy ufunc operations on COO arrays as well.

>>> np.sum(s, axis=1).todense()  
array([1, 1, 6, 1], dtype=uint64)
>>> np.round(np.sqrt(s, dtype=np.float64), decimals=1).todense()   
array([[ 1. ,  0. ,  0. ,  0. ],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  2.2],
       [ 0. ,  0. ,  0. ,  1. ]])

Operations that would produce a dense result, such as the following, raise a ValueError.

>>> np.exp(s)
Traceback (most recent call last):
    ...
ValueError: Performing this operation would produce a dense result: <ufunc 'exp'>
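
The reason is that exp(0) is 1, so every implicit zero would become a stored value, whereas sqrt(0) is 0 and leaves them implicit. The sketch below illustrates that idea in plain Python; the helpers `preserves_zero` and `apply_elementwise` are hypothetical, and checking `func(0)` is an assumption about the reasoning, not a description of how the library detects dense results.

```python
import math


def preserves_zero(func):
    """Return True if func maps 0 to 0, so implicit zeros can stay implicit."""
    return func(0.0) == 0.0


def apply_elementwise(func, data):
    """Apply func to the stored values only (hypothetical helper).

    Refuses functions that would turn implicit zeros into nonzero values.
    """
    if not preserves_zero(func):
        raise ValueError("Performing this operation would produce a dense result")
    return [func(v) for v in data]


sqrt_data = apply_elementwise(math.sqrt, [1, 1, 1, 5, 1])  # fine: sqrt(0) == 0

try:
    apply_elementwise(math.exp, [1, 1, 1, 5, 1])           # exp(0) == 1
except ValueError as err:
    message = str(err)
```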

You can also create COO arrays from coordinates and data.

>>> coords = [[0, 0, 0, 1, 1],
...           [0, 1, 2, 0, 3],
...           [0, 3, 2, 0, 1]]
>>> data = [1, 2, 3, 4, 5]
>>> s4 = COO(coords, data, shape=(3, 4, 5))
>>> s4
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, sorted=False, duplicates=True>

Following scipy.sparse conventions, you can also pass a (data, (rows, cols)) tuple.

>>> rows = [0, 1, 2, 3, 4]
>>> cols = [0, 0, 0, 1, 1]
>>> data = [10, 20, 30, 40, 50]
>>> z = COO((data, (rows, cols)))
>>> z.todense()  
array([[10,  0],
       [20,  0],
       [30,  0],
       [ 0, 40],
       [ 0, 50]])

You can also pass a dictionary or iterable of index/value pairs. Repeated indices imply summation:

>>> d = {(0, 0, 0): 1, (1, 2, 3): 2, (1, 1, 0): 3}
>>> COO(d)
<COO: shape=(2, 3, 4), dtype=int64, nnz=3, sorted=False, duplicates=False>
>>> L = [((0, 0), 1),
...      ((1, 1), 2),
...      ((0, 0), 3)]
>>> COO(L).todense()  
array([[4, 0],
       [0, 2]])
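
As a rough illustration of what "repeated indices imply summation" means, the following pure-Python sketch converts index/value pairs into coords, data, and an inferred shape. The helper `pairs_to_coo` is hypothetical and not part of the library. Inferring the shape as the largest index plus one along each axis is consistent with the shapes shown in the examples above, and the sketch assumes at least one pair is given.

```python
from collections import defaultdict


def pairs_to_coo(pairs):
    """Turn (index, value) pairs into (coords, data, shape), summing repeats.

    Hypothetical helper for illustration; assumes `pairs` is non-empty.
    """
    totals = defaultdict(int)
    for index, value in pairs:
        # Values that share an index are accumulated.
        totals[tuple(index)] += value

    indices = sorted(totals)
    ndim = len(indices[0])
    # One row of coordinates per axis, one column per stored value.
    coords = [[idx[axis] for idx in indices] for axis in range(ndim)]
    data = [totals[idx] for idx in indices]
    # Smallest shape that contains every index.
    shape = tuple(max(axis) + 1 for axis in coords)
    return coords, data, shape


L = [((0, 0), 1),
     ((1, 1), 2),
     ((0, 0), 3)]
coords, data, shape = pairs_to_coo(L)
# coords == [[0, 1], [0, 1]], data == [4, 2], shape == (2, 2)
```

A dictionary can be handled the same way by passing `d.items()`.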

You can convert DOK arrays to COO arrays.

>>> from sparse import DOK
>>> s5 = DOK((5, 5), dtype=np.int64)
>>> s5[1:3, 1:3] = [[4, 5], [6, 7]]
>>> s5
<DOK: shape=(5, 5), dtype=int64, nnz=4>
>>> s6 = COO(s5)
>>> s6
<COO: shape=(5, 5), dtype=int64, nnz=4, sorted=False, duplicates=False>
>>> s6.todense()  
array([[0, 0, 0, 0, 0],
       [0, 4, 5, 0, 0],
       [0, 6, 7, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Note

COO objects also support operators and indexing.

Attributes

`COO.T`	Returns a new array which has the order of the axes reversed.
`COO.dtype`	The datatype of this array.
`COO.nbytes`	The number of bytes taken up by this object.
`COO.ndim`	The number of dimensions of this array.
`COO.nnz`	The number of nonzero elements in this array.
`COO.size`	The number of all elements (including zeros) in this array.
`COO.density`	The ratio of nonzero to all elements in this array.
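
The size-related attributes follow directly from shape and the number of stored values. A minimal sketch in plain Python, using the 4x4 example from above; it assumes the stored data contains no duplicate coordinates and no explicit zeros, so the count of stored values equals nnz.

```python
from math import prod  # available in Python 3.8+

shape = (4, 4)
data = [1, 1, 1, 5, 1]   # stored (nonzero) values

ndim = len(shape)        # number of dimensions
size = prod(shape)       # all elements, including zeros
nnz = len(data)          # stored nonzero elements (see assumption above)
density = nnz / size     # ratio of nonzero to all elements
# ndim == 2, size == 16, nnz == 5, density == 0.3125
```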

Constructing COO objects

`COO.from_numpy`(x)	Convert the given `numpy.ndarray` to a `COO` object.
`COO.from_scipy_sparse`(x)	Construct a `COO` array from a `scipy.sparse.spmatrix`.

Element-wise operations

`COO.astype`(dtype[, out])	Copy of the array, cast to a specified type.
`COO.round`([decimals, out])	Evenly round to the given number of decimals.

Reductions

`COO.reduce`(method[, axis, keepdims])	Performs a reduction operation on this array.
`COO.sum`([axis, keepdims, dtype, out])	Performs a sum operation along the given axes.
`COO.max`([axis, keepdims, out])	Maximize along the given axes.
`COO.min`([axis, keepdims, out])	Minimize along the given axes.
`COO.prod`([axis, keepdims, dtype, out])	Performs a product operation along the given axes.
`COO.nanreduce`(method[, identity, axis, keepdims])	Performs a `NaN`-skipping reduction on this array.

Converting to other formats

`COO.todense`()	Convert this `COO` array to a dense `numpy.ndarray`.
`COO.maybe_densify`([max_size, min_density])	Converts this `COO` array to a `numpy.ndarray` if not too costly.
`COO.to_scipy_sparse`()	Converts this `COO` object into a `scipy.sparse.coo_matrix`.
`COO.tocsc`()	Converts this array to a `scipy.sparse.csc_matrix`.
`COO.tocsr`()	Converts this array to a `scipy.sparse.csr_matrix`.

Other operations

`COO.dot`(other)	Performs the equivalent of `x.dot(y)` for `COO`.
`COO.reshape`(shape)	Returns a new `COO` array that is a reshaped version of this array.
`COO.transpose`([axes])	Returns a new array which has the order of the axes switched.

Utility functions

`COO.broadcast_to`(shape)	Performs the equivalent of `numpy.broadcast_to` for `COO`.
`COO.enable_caching`()	Enable caching of reshape, transpose, and tocsr/tocsc operations.
`COO.linear_loc`([signed])	The nonzero coordinates of a flattened version of this array.
`COO.sort_indices`()	Sorts the `COO.coords` attribute.
`COO.sum_duplicates`()	Sums data corresponding to duplicates in `COO.coords`.
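
To give a feel for what linear_loc and sort_indices describe, here is a pure-Python sketch of both ideas. The function names mirror the methods but these are hypothetical stand-alone helpers, not the library's code. Row-major (C-order) flattening is an assumption, and the sketch expects at least one dimension.

```python
def linear_loc(coords, shape):
    """Flat position of each stored element, assuming row-major order."""
    flat = [0] * len(coords[0])
    for axis_coords, dim in zip(coords, shape):
        # Horner-style accumulation: shift by this axis length, add the index.
        flat = [f * dim + c for f, c in zip(flat, axis_coords)]
    return flat


def sort_indices(coords, data, shape):
    """Return coords and data reordered by flat position (sketch only)."""
    flat = linear_loc(coords, shape)
    order = sorted(range(len(data)), key=flat.__getitem__)
    sorted_coords = [[axis[k] for k in order] for axis in coords]
    sorted_data = [data[k] for k in order]
    return sorted_coords, sorted_data


# Coordinates from the (3, 4, 5) example above.
coords = [[0, 0, 0, 1, 1],
          [0, 1, 2, 0, 3],
          [0, 3, 2, 0, 1]]
flat = linear_loc(coords, (3, 4, 5))
# flat == [0, 8, 12, 20, 36]

# A small unsorted 2-D case.
unsorted_coords = [[1, 0, 0],
                   [0, 2, 1]]
sorted_coords, sorted_data = sort_indices(unsorted_coords, [30, 20, 10], (2, 3))
# sorted_coords == [[0, 0, 1], [1, 2, 0]], sorted_data == [10, 20, 30]
```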