> I picked Gaussian elimination. "That will be quick," I thought. Not! I got started on matrices in 1985 and I'm not done with Gaussian elimination yet.
> GraphBLAS is a community effort, including industry, academics, and government labs, that is working to design a library that can implement graph algorithms based on sparse linear algebra over semirings. There are lots of great graph libraries out there that don't exploit the linear algebraic abstraction, but most of what they do can be viewed as matrix operations on adjacency matrices. GraphBLAS makes the connection to linear algebra explicit.
> GraphBLAS gives the graph algorithm developer the power and abstraction of linear algebra, with all its advantages (associative and distributive laws, AB=(B'A')', and so on). One level of breadth-first search, for example, is a single sparse matrix-times-sparse vector multiply, with no loss of asymptotic efficiency. Google's entire PageRank computation can be done with a handful of iterations around a single call to GraphBLAS, using what I call the "PageRank semiring."
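Tim's BFS claim is easy to make concrete: one level of breadth-first search is y = A'x over the boolean (OR-AND) semiring, where x is the current frontier. Here is a minimal plain-Python sketch of that step (illustrative data structures, not the actual GraphBLAS API):

```python
# One BFS level as y = A'x over the boolean (OR-AND) semiring.
# Plain-Python sketch for illustration; A is an adjacency dict
# {vertex: set(neighbors)}, not a GraphBLAS object.

def bfs_level(A, frontier, visited):
    """Return the next frontier: all unvisited neighbors of `frontier`."""
    nxt = set()
    for u in frontier:              # nonzeros of the input vector x
        for v in A.get(u, ()):      # nonzeros in row u of A
            if v not in visited:    # mask out already-visited vertices
                nxt.add(v)          # OR-monoid: duplicates collapse
    return nxt

A = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
frontier, visited, level = {0}, {0}, 0
while frontier:
    print("level", level, "->", sorted(frontier))
    visited |= frontier
    frontier = bfs_level(A, frontier, visited)
    level += 1
```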
>> Google's entire PageRank computation can be done with a handful of iterations around a single call to GraphBLAS, using what I call the "PageRank semiring."
I was referring to the PageRank_semiring in https://github.com/DrTimothyAldenDavis/GraphBLAS/blob/5e569f... . I wasn't counting the test for the termination condition -- in the statement in my interview, I assumed the number of iterations could just be given. I would guess that's the case for Google's PageRank computation: since the graph doesn't change much from month to month when they compute it, they probably know how many iterations they need.
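For intuition, here is that loop structure in plain Python: a fixed iteration count around a single matrix-vector step. It uses the standard damped PageRank update and is only a sketch, not the PageRank_semiring code linked above:

```python
# PageRank as a fixed number of iterations around one matvec.
# Illustrative plain Python; dangling-node handling is omitted
# (a real implementation redistributes rank from sinks).

def pagerank(A, n_iter=20, d=0.85):
    """A: {vertex: list(out-neighbors)}; returns {vertex: rank}."""
    nodes = list(A)
    n = len(nodes)
    r = {v: 1.0 / n for v in nodes}
    for _ in range(n_iter):                    # iteration count given up
        nxt = {v: (1 - d) / n for v in nodes}  # front, no termination test
        for u in nodes:
            out = A[u]
            if out:
                share = d * r[u] / len(out)
                for v in out:                  # the single "matvec" pass
                    nxt[v] += share
        r = nxt
    return r

A = {0: [1, 2], 1: [2], 2: [0]}
print(pagerank(A))
```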
pygraphblas author here. Here's a good notebook introduction to the very basics of GraphBLAS with Python; the concepts carry over naturally to the C API as well:
I'm a graph theorist, and I spend a lot of time on various optimization problems (matchings, packings, coverings, Steiner trees, etc) doing VLSI. AFAICT, it seems like GraphBLAS is geared for information processing about data stored in a graph representation, and doesn't approach the problems I need. Am I wrong?
Right now, the graphs I'm interested in are reasonably small (tens of thousands of edges), and my old-school approaches are good enough, but if my employer's products continue their growth trajectory, we'll need to leverage GPUs and TPUs if at all possible. Should I be learning about GraphBLAS?
GraphBLAS is meant to solve graph problems (i.e., generally the sort you are interested in, but with more emphasis on parallel ones). Perhaps "graph analytics" is closer to what it does right now than graph theory. At the most basic level, it exploits the matrix-graph duality and casts graph problems as matrix problems. For example, BFS and SSSP are sparse matrix-vector multiplications over a suitably defined semiring, as sketched below. The benefit is that you can use high-performance matrix libraries. The downside is that not everything can be done this way, at least not easily. You may find the book by Kepner and Gilbert useful. Also look at cuGraph and Gunrock for GPU-based graph frameworks.
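For instance, one SSSP relaxation step is a matvec over the (min, +) semiring: "multiply" is addition of edge weights, "add" is taking the minimum. A plain-Python sketch (illustrative dicts, not the GraphBLAS API):

```python
# One SSSP relaxation step as y = A'x over the (min, +) semiring:
# for every edge (u, v, w), the candidate distance x[u] + w is
# combined into y[v] with min. A maps u -> {v: weight}.
import math

def min_plus_step(A, x):
    y = dict(x)                        # keep current best distances
    for u, dist_u in x.items():
        if dist_u == math.inf:
            continue
        for v, w in A.get(u, {}).items():
            if dist_u + w < y.get(v, math.inf):  # min is the "add"
                y[v] = dist_u + w                # plus is the "multiply"
    return y

A = {0: {1: 4, 2: 1}, 2: {1: 2}, 1: {3: 5}}
x = {0: 0.0, 1: math.inf, 2: math.inf, 3: math.inf}
for _ in range(3):                     # n-1 steps suffice (Bellman-Ford)
    x = min_plus_step(A, x)
print(x)  # {0: 0.0, 1: 3.0, 2: 1.0, 3: 8.0}
```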
It's so early in this work that you might be the first person asking those specific questions. I'm no expert but I take it by VLSI you mean doing circuit routing and simulation and optimizing that topology. My gut tells me those would be good fits for the GraphBLAS.
If the base types don't suit your needs, you can always make your own. Tim Davis has talked about how he's made new types that are small n-by-n matrices as elements of a larger matrix. You can have complex numbers, quaternions, or arbitrarily complex bags of stuff as matrix elements, as long as you define the operators you need to work on those types.
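A plain-Python sketch of that idea: a generic semiring matvec whose "scalars" are 2x2 matrices. The names and structure here are my own illustration, not the GraphBLAS typing API:

```python
# A generic semiring matvec where the "scalars" are 2x2 matrices,
# sketching the user-defined-type idea in plain Python.

def mat2_add(a, b):   # element-wise 2x2 addition (the monoid)
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat2_mul(a, b):   # 2x2 matrix product (the "multiply")
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def semiring_matvec(A, x, add, mul, zero):
    """y = A.x over (add, mul); A is {row: {col: element}}."""
    y = {}
    for i, row in A.items():
        acc = zero
        for j, a_ij in row.items():
            acc = add(acc, mul(a_ij, x[j]))
        y[i] = acc
    return y

ZERO = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
A = {0: {0: I2, 1: [[1, 2], [3, 4]]}, 1: {0: [[0, 1], [1, 0]]}}
x = {0: I2, 1: [[2, 0], [0, 2]]}
print(semiring_matvec(A, x, mat2_add, mat2_mul, ZERO))
```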
Here's an example, in Python, of using user-defined types for a shortest-path algorithm that also stores backlinks along the shortest path, so the shortest-path tree is materialized:
Shortest paths are definitely a big time sink... I'll give that a shot. Thanks for the link :)
As for what I'm doing, I work at D-Wave and a lot of my work is on what can be thought of as "compilers" for our quantum computers. There's tons of similarity between my problem area and compilers for FPGAs, for example, so I call it VLSI because that's the right ballpark.
A lot of bioinformatics (my field) uses graph algorithms. DNA sequencing machines don't give you entire genomes, they give you pieces. Re-assembling the genome from those pieces is a prime graph algorithm target.
Tim Davis, the developer of SuiteSparse [1], a library for computing with sparse matrices, also developed a reference implementation of GraphBLAS. His blog has a few short examples of how it looks when called from MATLAB:
GraphBLAS is nice in theory, but actual high-performance code tends to be quite messy and not always easily encapsulated by GraphBLAS. For example, if you want to store predecessors in SSSP, the semiring becomes quite involved (see the sketch below) and is not easily transferable to GPUs, etc. And there are other optimizations (direction-optimizing BFS, delta-stepping SSSP, etc.) that are not really part of the algorithmic specification, which also break the encapsulation.
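To make "quite involved" concrete, here is the kind of tuple semiring that carries predecessors along with distances, sketched in plain Python (my own illustrative encoding, not any reference implementation):

```python
# SSSP with predecessors: elements are (distance, parent) pairs.
# "Add" keeps the pair with the smaller distance; "multiply" extends a
# path along an edge and records the tail vertex as the parent. A real
# GraphBLAS version needs a user-defined type plus user-defined
# operators for exactly this reason.
import math

INF = (math.inf, None)

def pair_min(a, b):                  # the monoid: keep the shorter path
    return a if a[0] <= b[0] else b

def extend(u, pair_u, w):            # the "multiply": relax edge (u, v, w)
    return (pair_u[0] + w, u)        # new distance, parent = u

def sssp_with_parents(A, src, n):
    x = {v: INF for v in range(n)}
    x[src] = (0.0, None)
    for _ in range(n - 1):           # Bellman-Ford-style sweeps
        y = dict(x)
        for u, pair_u in x.items():
            if pair_u[0] == math.inf:
                continue
            for v, w in A.get(u, {}).items():
                y[v] = pair_min(y[v], extend(u, pair_u, w))
        x = y
    return x                         # v -> (distance, parent): the SP tree

A = {0: {1: 4, 2: 1}, 2: {1: 2}, 1: {3: 5}}
print(sssp_with_parents(A, 0, 4))
# {0: (0.0, None), 1: (3.0, 2), 2: (1.0, 0), 3: (8.0, 1)}
```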
Give us time; we're still in the beginning stages of writing algorithms that use it. If you dig through LAGraph, you'll find lots of messy prototypes, as we experiment on various methods. Those are drafts, not polished results. Our most recent methods are polished, extremely simple, yet very fast:
We're working on the GPU version, and all the built-in semirings work very nicely on the GPU. For user-defined semirings, we will need the user's operators defined as strings containing C code; then they can all be done on the GPU too.
People have shown that direction-optimizing BFS actually fits very neatly inside GraphBLAS [1]. Perhaps in a few years people will have figured out how to make delta stepping and storing predecessors fit too, even if it's not clear at the moment.
For delta stepping, it seems all you would need is a priority queue that works in batches rather than on individual elements. Then, to make it performant and match delta-stepping code written from scratch, you might need something that can fuse multiple GraphBLAS operations together, so that you don't incur too many extra memory ops from the priority-queue operations.
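A rough sketch of what such a batched priority queue could look like: a bucket structure keyed by floor(dist/delta), as in delta stepping. This is my own guess at the shape of the thing, not an existing GraphBLAS facility:

```python
# A bucket-based priority queue that pops whole batches, as delta
# stepping needs: bucket i holds vertices with tentative distance in
# [i*delta, (i+1)*delta). Sketch only; a real delta-stepping loop also
# re-inserts vertices whose distances improve.
from collections import defaultdict

class BatchBuckets:
    def __init__(self, delta):
        self.delta = delta
        self.buckets = defaultdict(set)

    def push(self, vertex, dist):
        self.buckets[int(dist // self.delta)].add(vertex)

    def pop_batch(self):
        """Remove and return the entire lowest nonempty bucket."""
        if not self.buckets:
            return None
        return self.buckets.pop(min(self.buckets))

q = BatchBuckets(delta=2.0)
for v, d in [(1, 0.5), (2, 1.9), (3, 4.2)]:
    q.push(v, d)
print(q.pop_batch())  # {1, 2}: one batch, processed as one frontier vector
print(q.pop_batch())  # {3}
```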
A shorter version of this tutorial was recorded at FOSDEM earlier this year: https://fosdem.org/2020/schedule/event/graphblas/
More papers/tutorials/libraries are listed at https://github.com/GraphBLAS/GraphBLAS-Pointers.