Using with Dask

Import

In [1]:

import sparse

import dask.array as da

import numpy as np

Create Arrays

Here, we create two random sparse arrays and move them to Dask.

In [2]:


rng = np.random.default_rng(42)
M, N = 10_000, 10_000
DENSITY = 0.0001
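# Pass the seeded generator so that both arrays are reproducible.
a = sparse.random((M, N), density=DENSITY, random_state=rng)
b = sparse.random((M, N), density=DENSITY, random_state=rng)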

a_dask = da.from_array(a, chunks=1000)
b_dask = da.from_array(b, chunks=1000)

As the "Data type" row of the output shows, each chunk of the Dask array is still a sparse array.

In [3]:

a_dask  # noqa: B018

Out[3]:

	Array	Chunk
Shape	(10000, 10000)	(1000, 1000)
Dask graph	100 chunks in 2 graph layers
Data type	float64 sparse.numba_backend._coo.core.COO

Compute and check results

The sum computed through Dask matches the sum computed directly with sparse.

In [4]:

assert sparse.all(a + b == (a_dask + b_dask).compute())