Graph¶
This example demonstrates that ovl can express graph data structures and proces them. In this example we implemented a method for triangle counting in an undirected graph. Assume the graph is given by
where \(V\) is a set of vertex ids and \(E\) is a set with vertex tuples indicating the from-to vertex edge.
Example¶
To illustrate the definitions we specify a simple graph with five vertices and seven edges.
This graph has three triangles (0, 1, 3), (0, 2, 3), and (2, 3, 4).
Source Code¶
-
triangles
(startEdge, fromVertex, toVertex, target_language='cpp')[source]¶ Count the triangles on the GPU.
The array toVertex is a flattened list of lists structure, where startEdge encodes the start indices of the lists.
Parameters: - startEdge (list.) – Indices into toVertex where edges start.
- fromVertex (list.) – The from-vertex of each edge.
- toVertex (list.) – The to-vertex of each edge.
Returns: Triangle count of graph.
Examples: >>> from opveclib.examples.test_graph import load_graph_from_text_file, write_example_graph_to_text_file, triangles >>> tmpName = "/tmp/v7e20.txt" >>> write_example_graph_to_text_file(tmpName) >>> startEdge, fromVertex, toVertex = load_graph_from_text_file(tmpName) >>> triangles(startEdge, fromVertex, toVertex) 3
Triangle Counting¶
There are edge-based and vertex-based algorithms for triangle counting. For the GPU edge based algorithms are better suited because they contain more parallelism. The idea of the edge-based algorithm for triangle counting is:
Set a triangle counter to zero. Take each edge (u, v) in the graph
- Compute the intersection of the neighbors of u with the neighbors of v.
- Add the cardinality of the intersection set to the triangle counter.
Now, for an undirected graph this triangle counter contains three times the count of the triangles.
Performance Optimization¶
- To avoid the tripple count of triangles, we enforce an edge order, e.g. (u, v) only exists in the set E if u < v.
- For the described algorithm it is efficient to have a flattened list of lists representation.
- To effciently compute the intersection between two sub-sets we store all neighboring lists sorted.
Testing¶
As test case we download the graph https://snap.stanford.edu/data/ca-HepTh.html which has 28339 triangles. This graph unzipped and stored in your temporary directory. We load the edge list representation and transform it into an adjacency list representation using a single flattened list. This flatten list representation is the input to our triangle counting methods. We count triangles using three implementations of the same algorithm.
- Implementation in OVL and compilation for the GPU.
- Implementation in OVL and compilation for the CPU.
- Implementation in numpy and execution on the CPU.