COO

class sparse.COO(coords, data=None, shape=None, has_duplicates=True, sorted=False, prune=False, cache=False, fill_value=None, idx_dtype=None)[source]

A sparse multidimensional array.

The array is stored in COO (coordinate) format. It depends on NumPy and scipy.sparse for computation, but supports arrays of arbitrary dimension.

Parameters:
  • coords (numpy.ndarray (COO.ndim, COO.nnz)) – An array holding the index locations of every value. Should have shape (number of dimensions, number of non-zeros).

  • data (numpy.ndarray (COO.nnz,)) – An array of values. A scalar can also be supplied if the data is the same across all coordinates. If not given, defers to as_coo.

  • shape (tuple[int] (COO.ndim,)) – The shape of the array.

  • has_duplicates (bool, optional) – A value indicating whether the supplied value for coords has duplicates. Note that setting this to False when coords does have duplicates may result in undefined behaviour. See COO.sum_duplicates.

  • sorted (bool, optional) – A value indicating whether the values in coords are sorted. Note that setting this to True when coords isn’t sorted may result in undefined behaviour. See COO.sort_indices.

  • prune (bool, optional) – A flag indicating whether or not we should prune any fill-values present in data.

  • cache (bool, optional) – Whether to enable caching for various operations. See COO.enable_caching.

  • fill_value (scalar, optional) – The fill value for this array.
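The coords/data layout and the has_duplicates and prune flags can be illustrated in plain NumPy. This is a minimal sketch of the semantics described above, not the library's implementation; the array values are made up for illustration, and a fill value of 0 is assumed.

```python
import numpy as np

# coords has shape (ndim, nnz): one column per stored value.
coords = np.array([[0, 0, 1, 1],
                   [2, 2, 0, 1]])
data = np.array([10, 5, 7, 0])
shape = (2, 3)

# Duplicates: the index (0, 2) appears twice, so its values are summed.
# np.add.at accumulates repeated indices instead of overwriting them.
dense = np.zeros(shape, dtype=data.dtype)
np.add.at(dense, tuple(coords), data)

# Pruning: drop stored entries equal to the fill value (assumed 0 here).
keep = data != 0
pruned_coords = coords[:, keep]
pruned_data = data[keep]
```

After this runs, dense holds 15 at position (0, 2), and the explicitly stored zero at (1, 1) is gone from pruned_data.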

coords

An array holding the coordinates of every nonzero element.

Type:

numpy.ndarray (ndim, nnz)

data

An array holding the values corresponding to COO.coords.

Type:

numpy.ndarray (nnz,)

shape

The dimensions of this array.

Type:

tuple[int] (ndim,)

See also

DOK

A mostly write-only sparse array.

as_coo

Convert any given format to COO.

Examples

You can create COO objects from NumPy arrays.

>>> x = np.eye(4, dtype=np.uint8)
>>> x[2, 3] = 5
>>> s = COO.from_numpy(x)
>>> s
<COO: shape=(4, 4), dtype=uint8, nnz=5, fill_value=0>
>>> s.data  
array([1, 1, 1, 5, 1], dtype=uint8)
>>> s.coords  
array([[0, 1, 2, 2, 3],
       [0, 1, 2, 3, 3]])

COO objects support basic arithmetic and binary operations.

>>> x2 = np.eye(4, dtype=np.uint8)
>>> x2[3, 2] = 5
>>> s2 = COO.from_numpy(x2)
>>> (s + s2).todense()  
array([[2, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 2, 5],
       [0, 0, 5, 2]], dtype=uint8)
>>> (s * s2).todense()  
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=uint8)

Binary operations support broadcasting.

>>> x3 = np.zeros((4, 1), dtype=np.uint8)
>>> x3[2, 0] = 1
>>> s3 = COO.from_numpy(x3)
>>> (s * s3).todense()  
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 5],
       [0, 0, 0, 0]], dtype=uint8)

COO objects also support dot products and reductions.

>>> s.dot(s.T).sum(axis=0).todense()  
array([ 1,  1, 31,  6], dtype=uint64)

You can use NumPy ufunc operations on COO arrays as well.

>>> np.sum(s, axis=1).todense()  
array([1, 1, 6, 1], dtype=uint64)
>>> np.round(np.sqrt(s, dtype=np.float64), decimals=1).todense()  
array([[ 1. ,  0. ,  0. ,  0. ],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  2.2],
       [ 0. ,  0. ,  0. ,  1. ]])

Operations that would produce a dense result usually produce a different fill value instead, as in the following example.

>>> np.exp(s)
<COO: shape=(4, 4), dtype=float16, nnz=5, fill_value=1.0>
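One way to see why the fill value changes: an element-wise function applies to the implicit background value as well as to the stored values. The NumPy sketch below illustrates that idea using the stored values of s from the example above. It is an illustration of the concept, not a description of how the library computes it.

```python
import numpy as np

# Stored values of s and its fill value, as in the example above.
data = np.array([1, 1, 1, 5, 1], dtype=np.uint8)
fill_value = np.uint8(0)

# The function is applied to the stored values...
new_data = np.exp(data)
# ...and to the background value: exp(0) == 1, so the result
# is still sparse, but around 1.0 instead of 0.
new_fill = np.exp(fill_value)
```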

You can also create COO arrays from coordinates and data.

>>> coords = [[0, 0, 0, 1, 1], [0, 1, 2, 0, 3], [0, 3, 2, 0, 1]]
>>> data = [1, 2, 3, 4, 5]
>>> s4 = COO(coords, data, shape=(3, 4, 5))
>>> s4
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, fill_value=0>

If the data is the same across all coordinates, you can also specify a scalar.

>>> coords = [[0, 0, 0, 1, 1], [0, 1, 2, 0, 3], [0, 3, 2, 0, 1]]
>>> data = 1
>>> s5 = COO(coords, data, shape=(3, 4, 5))
>>> s5
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, fill_value=0>

Following scipy.sparse conventions, you can also pass these as a tuple with rows and columns.

>>> rows = [0, 1, 2, 3, 4]
>>> cols = [0, 0, 0, 1, 1]
>>> data = [10, 20, 30, 40, 50]
>>> z = COO((data, (rows, cols)))
>>> z.todense()  
array([[10,  0],
       [20,  0],
       [30,  0],
       [ 0, 40],
       [ 0, 50]])

You can also pass a dictionary or iterable of index/value pairs. Repeated indices imply summation:

>>> d = {(0, 0, 0): 1, (1, 2, 3): 2, (1, 1, 0): 3}
>>> COO(d)
<COO: shape=(2, 3, 4), dtype=int64, nnz=3, fill_value=0>
>>> L = [((0, 0), 1), ((1, 1), 2), ((0, 0), 3)]
>>> COO(L).todense()  
array([[4, 0],
       [0, 2]])
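To make the dictionary and iterable inputs concrete, the pure-Python sketch below turns the same d and L into a coords/data pair by hand. It only illustrates the mapping implied by the outputs above (shape inferred from the largest index on each axis, repeated indices summed); the variable names are illustrative and this is not the library's code.

```python
# Dictionary of index -> value, as in the example above.
d = {(0, 0, 0): 1, (1, 2, 3): 2, (1, 1, 0): 3}

indices = sorted(d)                               # row-major order
coords = [list(axis) for axis in zip(*indices)]   # layout (ndim, nnz)
data = [d[idx] for idx in indices]
shape = tuple(max(axis) + 1 for axis in coords)   # inferred shape

# Iterable of (index, value) pairs: repeated indices are summed.
L = [((0, 0), 1), ((1, 1), 2), ((0, 0), 3)]
summed = {}
for idx, value in L:
    summed[idx] = summed.get(idx, 0) + value
```

Here shape comes out as (2, 3, 4), and summed maps (0, 0) to 4 and (1, 1) to 2, in line with the outputs shown above.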

You can convert DOK arrays to COO arrays.

>>> from sparse import DOK
>>> s6 = DOK((5, 5), dtype=np.int64)
>>> s6[1:3, 1:3] = [[4, 5], [6, 7]]
>>> s6
<DOK: shape=(5, 5), dtype=int64, nnz=4, fill_value=0>
>>> s7 = s6.asformat("coo")
>>> s7
<COO: shape=(5, 5), dtype=int64, nnz=4, fill_value=0>
>>> s7.todense()  
array([[0, 0, 0, 0, 0],
       [0, 4, 5, 0, 0],
       [0, 6, 7, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Note

COO objects also support operators and indexing.

Attributes

COO.T

Returns a new array which has the order of the axes reversed.

COO.dtype

The datatype of this array.

COO.nbytes

The number of bytes taken up by this object.

COO.ndim

The number of dimensions of this array.

COO.nnz

The number of nonzero elements in this array.

COO.size

The number of all elements (including zeros) in this array.

COO.density

The ratio of nonzero to all elements in this array.

COO.imag

The imaginary part of the array.

COO.real

The real part of the array.
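The relationship between nnz, size and density can be checked on the dense array from the Examples section. The sketch below computes the dense analogues in NumPy. It assumes a fill value of 0 and that no explicit zeros are stored, so the count of nonzero entries equals nnz.

```python
import numpy as np

x = np.eye(4, dtype=np.uint8)
x[2, 3] = 5

ndim = x.ndim                      # number of dimensions
size = x.size                      # all elements, zeros included
nnz = int(np.count_nonzero(x))     # nonzero elements only
density = nnz / size               # ratio of nonzero to all elements
```

For this array, nnz is 5 and size is 16, giving a density of 0.3125.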

Constructing COO objects

COO.from_iter(x[, shape, fill_value, dtype])

Converts an iterable in certain formats to a COO array.

COO.from_numpy(x[, fill_value, idx_dtype])

Convert the given numpy.ndarray to a COO object.

COO.from_scipy_sparse(x)

Construct a COO array from a scipy.sparse.spmatrix.

Element-wise operations

COO.astype(dtype[, casting, copy])

Copy of the array, cast to a specified type.

COO.conj()

Return the complex conjugate, element-wise.

COO.clip([min, max, out])

Clip (limit) the values in the array.

COO.round([decimals, out])

Evenly round to the given number of decimals.

Reductions

COO.reduce(method[, axis, keepdims])

Performs a reduction operation on this array.

COO.sum([axis, keepdims, dtype, out])

Performs a sum operation along the given axes.

COO.prod([axis, keepdims, dtype, out])

Performs a product operation along the given axes.

COO.min([axis, keepdims, out])

Compute the minimum along the given axes.

COO.max([axis, keepdims, out])

Compute the maximum along the given axes.

COO.any([axis, keepdims, out])

Test whether any values along the given axes are True.

COO.all([axis, keepdims, out])

Test whether all values along the given axes are True.

COO.mean([axis, keepdims, dtype, out])

Compute the mean along the given axes.

COO.std([axis, dtype, out, ddof, keepdims])

Compute the standard deviation along the given axes.

COO.var([axis, dtype, out, ddof, keepdims])

Compute the variance along the given axes.
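As a rough picture of what a reduction means in coordinate format, the NumPy sketch below sums the array s from the Examples section along axis 1 using only its coords and data. It assumes a fill value of 0, so implicit entries contribute nothing to the sum. This is a conceptual illustration, not the library's algorithm.

```python
import numpy as np

# coords and data of s from the Examples section.
coords = np.array([[0, 1, 2, 2, 3],
                   [0, 1, 2, 3, 3]])
data = np.array([1, 1, 1, 5, 1], dtype=np.int64)

# Summing over axis 1 means grouping stored values by their
# axis-0 coordinate and accumulating each group.
row_sums = np.zeros(4, dtype=np.int64)
np.add.at(row_sums, coords[0], data)
```

The values in row_sums agree with the np.sum(s, axis=1) output shown in the Examples section.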

Converting to other formats

COO.asformat(format, **kwargs)

Convert this sparse array to a given format.

COO.todense()

Convert this COO array to a dense numpy.ndarray.

COO.maybe_densify([max_size, min_density])

Converts this COO array to a numpy.ndarray if not too costly.

COO.to_scipy_sparse()

Converts this COO object into a scipy.sparse.coo_matrix.

COO.tocsc()

Converts this array to a scipy.sparse.csc_matrix.

COO.tocsr()

Converts this array to a scipy.sparse.csr_matrix.
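For two-dimensional arrays, the link between COO and CSR storage can be sketched in NumPy. The example below builds the CSR row pointer array from the coords of s in the Examples section, assuming the coordinates are already sorted in row-major order. The names indptr and indices follow scipy.sparse conventions; the construction is the textbook one and is not taken from the library's source.

```python
import numpy as np

# coords and data of s from the Examples section (sorted row-major).
coords = np.array([[0, 1, 2, 2, 3],
                   [0, 1, 2, 3, 3]])
data = np.array([1, 1, 1, 5, 1], dtype=np.uint8)
n_rows = 4

# Count stored entries per row, then take a running total.
counts = np.bincount(coords[0], minlength=n_rows)
indptr = np.concatenate(([0], np.cumsum(counts)))
indices = coords[1]  # column index of each stored value

# Row 2 occupies the slice indptr[2]:indptr[3].
row2_cols = indices[indptr[2]:indptr[3]]
row2_vals = data[indptr[2]:indptr[3]]
```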

Other operations

COO.copy([deep])

Return a copy of the array.

COO.dot(other)

Performs the equivalent of x.dot(y) for COO.

COO.flatten([order])

Returns a new COO array that is a flattened version of this array.

COO.reshape(shape[, order])

Returns a new COO array that is a reshaped version of this array.

COO.resize(*args[, refcheck, coords_dtype])

This method changes the shape and size of an array in-place.

COO.transpose([axes])

Returns a new array which has the order of the axes switched.

COO.swapaxes(axis1, axis2)

Returns array that has axes axis1 and axis2 swapped.

COO.nonzero()

Get the indices where this array is nonzero.
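In coordinate format, reversing the axes amounts to reversing the rows of coords and the entries of shape. The NumPy sketch below checks this on a small made-up array. The densify helper is hypothetical and exists only for this illustration.

```python
import numpy as np

coords = np.array([[0, 0, 1],
                   [0, 2, 1]])
data = np.array([1, 2, 3])
shape = (2, 3)

def densify(coords, data, shape):
    """Hypothetical helper: scatter data into a dense array (no duplicates)."""
    out = np.zeros(shape, dtype=data.dtype)
    out[tuple(coords)] = data
    return out

# Reversing the axis order only permutes the rows of coords.
coords_t = coords[::-1]
shape_t = shape[::-1]

# The stored order is unchanged, so the transposed coordinates
# are no longer in row-major order here.
linear_t = np.ravel_multi_index(tuple(coords_t), shape_t)
```

Because linear_t is not increasing, a transposed array generally needs its indices re-sorted before it can be treated as sorted.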

Utility functions

COO.broadcast_to(shape)

Performs the equivalent of numpy.broadcast_to for COO.

COO.enable_caching()

Enable caching of reshape, transpose, and tocsr/csc operations.

COO.linear_loc()

The nonzero coordinates of a flattened version of this array.
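The idea behind a linear location can be reproduced with numpy.ravel_multi_index, which maps each coordinate column to its position in the flattened (row-major) array. The sketch below uses the coords and shape of s4 from the Examples section. It illustrates the concept and is not necessarily how linear_loc is implemented.

```python
import numpy as np

# coords and shape of s4 from the Examples section.
coords = np.array([[0, 0, 0, 1, 1],
                   [0, 1, 2, 0, 3],
                   [0, 3, 2, 0, 1]])
shape = (3, 4, 5)

# Position of each stored value in the flattened array (C order).
linear = np.ravel_multi_index(tuple(coords), shape)

# Strictly increasing positions mean the coordinates are sorted
# and contain no duplicates.
is_sorted_unique = bool(np.all(np.diff(linear) > 0))
```

For these coordinates, linear is [0, 8, 12, 20, 36] and is_sorted_unique is True.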