In many scientific applications, arrays come up that are mostly empty or filled with zeros. These arrays are aptly named sparse arrays. However, it is a matter of choice as to how these are stored. One may store the full array, i.e., with all the zeros included. This incurs a significant cost in terms of memory and performance when working with these arrays.
An alternative is to store them in a standalone data structure that keeps track of only the nonzero entries. Often, this improves performance and reduces memory consumption, but most operations on sparse arrays have to be rewritten. sparse tries to provide one such data structure. It isn’t the only library that does this. Notably, scipy.sparse achieves this, among others.
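To make the trade-off concrete, here is a minimal sketch of the idea of storing only the nonzero entries. It uses a plain dictionary keyed by coordinates. The names `nonzeros`, `get`, and `scale` are hypothetical and chosen for illustration only; this is not how sparse or scipy.sparse lay out their data internally.

```python
# Illustrative only: a dictionary-of-coordinates layout using the standard library.
shape = (1000, 1000)

# Dense storage needs one slot per element, zeros included.
dense_slots = shape[0] * shape[1]

# Sparse storage maps each nonzero coordinate to its value.
nonzeros = {(0, 3): 1.5, (42, 7): -2.0, (999, 999): 4.0}


def get(coords):
    """Return the stored value, or the implicit zero for a missing entry."""
    return nonzeros.get(coords, 0.0)


def scale(entries, factor):
    """Scale every stored value; zeros stay implicit because 0 * factor == 0."""
    return {coords: value * factor for coords, value in entries.items()}


scaled = scale(nonzeros, 2.0)

print(dense_slots)    # 1000000
print(len(nonzeros))  # 3
print(get((42, 7)))   # -2.0
print(get((1, 1)))    # 0.0
```

Note that even a simple operation such as scaling had to be written by hand for this layout, which is the cost the paragraph above refers to.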
So why use sparse? Well, the other libraries mentioned are mostly limited to two-dimensional arrays. In addition, sparse strives to achieve inter-compatibility with numpy.ndarray, and to provide mostly the same API. It defers to scipy.sparse when it is convenient to do so, and writes custom implementations of operations where this isn’t possible. It also supports general N-dimensional arrays.
Where to from here?
If you’re new to this library, you can visit the user manual page. If you’re already familiar with this library, or you want to dive straight in, you can jump to the API reference. You can also see the contents in the sidebar.