Construct Sparse Arrays
From coordinates and data
You can construct COO arrays from coordinates and value data. The coords parameter contains the indices where the data is nonzero, and the data parameter contains the data corresponding to those indices. For example, the following code will generate a \(5 \times 5\) diagonal matrix:
>>> import sparse
>>> coords = [[0, 1, 2, 3, 4],
...           [0, 1, 2, 3, 4]]
>>> data = [10, 20, 30, 40, 50]
>>> s = sparse.COO(coords, data, shape=(5, 5))
>>> s.todense()
array([[10,  0,  0,  0,  0],
       [ 0, 20,  0,  0,  0],
       [ 0,  0, 30,  0,  0],
       [ 0,  0,  0, 40,  0],
       [ 0,  0,  0,  0, 50]])
In general, coords should be a (ndim, nnz) shaped array. Each row of coords holds the indices along one dimension of the desired sparse array, and each column holds the full index of one nonzero element. data contains the nonzero elements of the array corresponding to the indices in coords. Its shape should be (nnz,).
If data is the same across all the coordinates, it can be passed in as a scalar. For example, the following produces the \(4 \times 4\) identity matrix:
>>> import sparse
>>> coords = [[0, 1, 2, 3],
...           [0, 1, 2, 3]]
>>> data = 1
>>> s = sparse.COO(coords, data, shape=(4, 4))
You can, and should, pass in numpy.ndarray objects for coords and data.
If you omit the shape argument, the shape of the resulting array is determined from the maximum index in each dimension. If the array extends beyond the maximum index in coords, you should supply a shape explicitly. For example, without the shape keyword argument the following would result in a \(4 \times 5\) matrix, when the intended result is a \(5 \times 5\) matrix:
from sparse import COO

coords = [[0, 3, 2, 1], [4, 1, 2, 0]]
data = [1, 4, 2, 1]
s = COO(coords, data, shape=(5, 5))
COO arrays support arbitrary fill values. The fill value is the “default” value of the array, that is, the value that is not stored explicitly, and it can be something other than zero. For example, the following builds a (bad) representation of a \(2 \times 2\) identity matrix. Note that not all operations are supported for arrays with nonzero fill values.
coords = [[0, 1], [1, 0]]
data = [0, 0]
s = COO(coords, data, fill_value=1)
From SciPy sparse matrices
To construct COO arrays from spmatrix objects, you can use the COO.from_scipy_sparse method. As an example, if x is a scipy.sparse.spmatrix, you can do the following to get an equivalent COO array:
s = COO.from_scipy_sparse(x)
From NumPy arrays
To construct COO arrays from numpy.ndarray objects, you can use the COO.from_numpy method. As an example, if x is a numpy.ndarray, you can do the following to get an equivalent COO array:
s = COO.from_numpy(x)
Generating random COO objects
The sparse.random function can be used to create random COO arrays. For example, the following will generate a \(10 \times 10\) matrix with \(10\) nonzero entries, each in the interval \([0, 1)\).
s = sparse.random((10, 10), density=0.1)
Building COO Arrays from DOK Arrays
If it is not easy to construct coords and data directly, you can build COO arrays from DOK arrays instead. DOK arrays provide a simple builder interface for COO arrays, but at this time, they can do little else.
You can get started by defining the shape (and optionally, datatype) of the DOK array. If you do not specify a dtype, it is inferred from the value dictionary or is set to dtype('float64') if that is not present.
import numpy as np
from sparse import DOK
s = DOK((6, 5, 2))
s2 = DOK((2, 3, 4), dtype=np.uint8)
After this, you can build the array by assigning arrays or scalars to elements or slices of the original array. Broadcasting rules are followed.
s[1:3, 3:1:-1] = [[6, 5]]
At the end, you can convert the DOK array to a COO array, and perform arithmetic or other operations on it.
s3 = COO(s)
In addition, it is possible to access single elements of the DOK array using normal NumPy indexing.
s[1, 2, 1] # 5
s[5, 1, 1] # 0
Converting COO objects to other Formats
COO arrays can be converted to NumPy arrays, or to some spmatrix subclasses via the following methods:
COO.todense: Converts to a numpy.ndarray unconditionally.
COO.maybe_densify: Converts to a numpy.ndarray based on certain constraints.
COO.to_scipy_sparse: Converts to a scipy.sparse.coo_matrix if the array is two dimensional.
COO.tocsr: Converts to a scipy.sparse.csr_matrix if the array is two dimensional.
COO.tocsc: Converts to a scipy.sparse.csc_matrix if the array is two dimensional.