MTGL provides several adapters called wrapper adapters that wrap around a graph and provide a different view of the original graph. Wrapper adapters present almost the same interface MTGL graph API. The exceptions are in initializing the adapters. Wrapper adapter inherit their direction from the original graph. Therefore, if the original adapter is a bidirectional graph and supports in_degree(), then the subgraph will support in_degree() as well. The types, such as size_type and vertex_descriptor, for wrapper adapters are different from the graphs the original graphs. Be sure to use the original graph types for the original graph and the wrapper graph types for the wrapper graph.

We currently provide four types of wrapper adapters: transpose_adapter, duplicate_adapter, subgraph_adapter, and filter_adapter.

There are two main approaches to a wrapper adapter. The first is to create a whole copy of the original graph with the necessary modifications. The second is to permute the graph on the fly. The second approach is often the one taken as it provides a new view of the graph with minimal extra storage and the minimal cost of permuting the graph on the fly. Unfortunately, the cost of permuting the graph on the fly can be much higher on the XMT. The code necessary for the permutation can make loops that normally scale have scaling problems and can make inhibit loop collapses necessary for good performance. As a result, three of the four wrapper adapters provided with MTGL take the first approach and keep a modified copy of the original graph.

The transpose_adapter, duplicate_adapter, and subgraph_adapter keep a modified copy of the original graph, and they keep a link back to the original graph. Because these wrapper adapters create a new graph object to hold the modified graph, their initialization time is the as if the user had created a new graph. They trade an upfront higher cost of creating a graph copy for better performance when running algorithms on the graph. One thing to note is that these wrapper adapters only know about the state of the graph they modify when the wrapper adapter object was created. If the underlying graph changes after the initialization of the wrapper adapter, the adapter will not know about those changes.

These three wrapper adapters have a couple of common interface features. The type of the wrapper adapter has a template parameter of the graph they wrap around. The constructor of the wrapper adapter takes the graph it wraps around as a parameter. For example, here we declare a transpose_adapter that wraps around a compressed_sparse_row_graph.

compressed_sparse_row_graph<undirectedS> g; // Initialize g. transpose_adapter<compressed_sparse_row_graph<undirectedS> > tg(g);

These three wrapper adapters can be wrapped around other wrapper adapters. For instance, if you want to make a subgraph of the transpose adapter we declared above, you would do so as:

subgraph_adapter<transpose_adapter<compressed_sparse_row_graph<undirectedS> > > stg(tg);

# Transpose Adapter

The transpose_adapter (mtgl/transpose_adapter.hpp) creates a transpose of an existing graph. It is initialized at construct time. The transpose adapter preserves the vertex and edge ids of the original graph.

# Duplicate Adapter

The duplicate_adapter (mtgl/duplicate_adapter.hpp) takes an existing graph and creates a new one that has two copies of every edge in the original graph. For undirected graphs each edge is just duplicated. For directed graphs each edge is duplicated, but the duplicate's direction is the reverse of the original edge's direction. The duplicate adapter is initialized at construct time.

The duplicate adapter contains all the vertices and edge from the original graph, and it contains additional edges. For the vertices and the edges from the original graph, the duplicate adapter preserves the vertex and edge ids of the original graph. The new edges are assigned an id of *num_edges(g) + i*, where g is the original graph and i is the id of the edge from g that the new edge is the duplicate of. For example, consider a graph with 10 edges. The duplicate edge of the original edge with id 4 will have an id of 14. This allows the user to easily translate between the edges in the original graph and the edges in the duplicate graph.

# Subgraph Adapter

The subgraph_adapter (mtgl/subgraph_adapter.hpp) creates a subgraph of an existing graph. Because the subgraph adapter gives new ids to vertices and edges in the subgraph, it keeps links from the original graph vertices and edges to the subgraph vertices and edges and vice versa so that the user can translate between them.

When a subgraph_adapter is declared, it includes no vertices or edges. A user must add vertices and edges to the graph with the init_vertices() and init_edges() functions. These functions create a vertex-induced subgraph and an edge-induced subgraph, respectively. There are two forms of each function. The first form of init_vertices() takes a list of vertices and a size. The second form takes a boolean mask indicating which vertices from the original graph are in the subgraph. Both forms create a vertex-induced subgraph. The init_edges() function also has two forms that are similar.

For the following initialization functions, which are global functions, the subgraph sg wraps around the graph g.

*void*

init_vertices(vertex_descriptor* V, size_type n, const subgraph_adapter& sg)- Creates a vertex-induced subgraph of g using the n vertices in V. The vertices in V must belong to g.

*void*

init_vertices(bool* vmask, const subgraph_adapter& sg)- Creates a vertex-induced subgraph of g using the vertices in vmask marked as true. The array vmask represents the vertices in g and must be of size of the number of vertices in g.

*void*

init_edges(edge_descriptor* E, size_type m, const subgraph_adapter& sg)- Creates an edge-induced subgraph of g using the m edges in E. The edges in E must belong to g.

*void*

init_edges(bool* emask, const subgraph_adapter& sg)- Creates an edge-induced subgraph of g using the edges in emask marked as true. The array emask represents the edges in g and must be of size of the number of edges in g.

The functions that translate between the subgraph vertices and edges and the original graph vertices and edges are member functions. Again, we assume that subgraph sg wraps around the graph g.

*vertex_descriptor*

local_to_global(vertex_descriptor v, const subgraph_adapter& sg)- Returns the vertex in g that corresponds to v, which is in sg.

*vertex_descriptor*

global_to_local(vertex_descriptor v, const subgraph_adapter& sg)- Returns the vertex in sg that corresponds to v, which is in g.

*edge_descriptor*

local_to_global(edge_descriptor e, const subgraph_adapter& sg)- Returns the edge in g that corresponds to e, which is in sg.

*edge_descriptor*

global_to_local(edge_descriptor e, const subgraph_adapter& sg)- Returns the edge in sg that corresponds to e, which is in g.

# Filter Adapter

The filter_adapter provides a very similar functionality as the subgraph_adapter. It provides a subgraph of the original graph, but the subgraph is based on vertex and edge filters. It is also the only adapter that doesn't make a copy of the original graph; it performs the filtering on the fly. When a filter_adapter tests a vertex, it only tests the vertex filter against the vertex. When a filter_adapter tests an edge, however, it tests the edge filter against the edge and the vertex filter against both of the endpoints of the edge.

Let's do an example showing how to use the filter adapter and the cost involved. The example will be to count the indegree of all the vertices in the graph and then report their sum. The code for this example is in tutorial/filter_example.cpp. The graph used for the example is

The first step to defining a filter_adapter is to define the vertex and edge filters. A vertex filter is a function object that defines the () operator to take a vertex_descriptor and return a boolean indicating if the vertex is part of the filtered graph. The vertex filter used in our example filters out all vertices whose ids are less than 1 or greater than 4. Thus, the filtered graph is the vertices 1 - 4 and the edges connecting them. This is the box from the diagram.

template <typename Graph> class vertex_filter { public: vertex_filter(const Graph& _g) : g(_g) {} bool operator()(typename graph_traits<Graph>::vertex_descriptor v) const { return get(get(_vertex_id_map, g), v) > 0 && get(get(_vertex_id_map, g), v) < 5; } private: const Graph& g; };

Likewise, an edge filter is a function object that defines the () operator to take an edge_descriptor and return a boolean indicating if the edge is part of the filtered graph. An edge filter is defined similarly to a vertex filter. For our example, we don't want to filter edges, so we use the predefined function object in MTGL called keep_all whose () operator always returns true. It can be used as a filter type when no filtering is desired.

The code for our example defines the graph as a directed compressed_sparse_row_graph and a filtered graph using the vertex filter above. The filter_adapter takes three template parameters: the original graph type, the edge filter type, and the vertex filter type. The filter_adapter also takes three parameters to its constructor: the original graph object, an instantiated object of the edge filter, and an instantiated object of the vertex filter.

typedef compressed_sparse_row_graph<directedS> Graph; typedef graph_traits<Graph>::size_type size_type; typedef filter_adapter<Graph, keep_all, vertex_filter<Graph> > FilterGraph; const size_type numVerts = 6; const size_type numEdges = 8; size_type sources[numEdges] = { 0, 0, 1, 1, 2, 3, 3, 4 }; size_type targets[numEdges] = { 1, 2, 2, 3, 4, 4, 5, 5 }; // Initialize the graph. Graph g; init(numVerts, numEdges, sources, targets, g); FilterGraph fg_v(g, keep_all(), vertex_filter<Graph>(g));

Now, we'll discuss how to actually perform the filtering. In a serial context, you would have the iterators for the filter_adapter never point to a vertex or edge that has been filtered out. The iterator would, in this case, keep some state information: a pointer to the vertex or edge it was pointing to. If incrementing the iterator moved it to an invalid vertex or edge, it would just keep incrementing until a valid vertex or edge was reached.

Remember, however, that iterators in MTGL can't save state because of the multithreaded context. Thus, we can't do the nice automatic filtering that can be done in a serial context. Instead, the iterators iterate over all the vertices and edges, even the ones that don't pass the filter. We introduce the helper functions is_valid(), null_vertex(), and null_edge(). The is_valid() function tests whether or not a certain vertex or edge pointed to by an iterator would be valid. The null_vertex() and null_edge() functions return a special value for a vertex_descriptor or edge_descriptor that indicates an invalid vertex or edge. When an iterator would dereference to a vertex or edge that didn't pass the filter, the null_vertex() or null_edge() is returned. These function definitions are required by the MTGL graph API. Graphs that don't do filtering should define is_valid() to always return true.

The example has three functions: count_indegree(), count_indegree_filter(), and count_indegree_filter_graph(). The function count_indegree() counts the indegree without using any filtering. The functions count_indegree_filter() and count_indegree_filter_graph() are identical and count the indegree while filtering the graph. All three functions take two parameters: the graph and the array indeg. The array indeg is indexed by vertex id, and it will hold the indegree of each vertex after the function is run. The functions count_indegree() and count_indegree_filter() are run on the original graph, while the function count_indegree_filter_graph() is run on the filtered graph.

Let's look first at the loop in count_indegree(). We loop over all the vertices and all their adjacencies, incrementing the counter for each adjacency.

#pragma mta assert noalias *indeg for (size_type i = 0; i < order; ++i) { vertex_descriptor u = verts[i]; adjacency_iterator adjs = adjacent_vertices(u, g); size_type end = out_degree(u, g); for (size_type j = 0; j < end; ++j) { vertex_descriptor v = adjs[j]; size_type vid = get(vid_map, v); mt_incr(indeg[vid], 1); } }

The canal output for the loop shows that the nested loops were parallelized and merged.

| #pragma mta assert noalias *indeg | for (size_type i = 0; i < order; ++i) | { | vertex_descriptor u = verts[i]; ++ function T1 mtgl::detail::csr_vertex_iterator<T1, T2>::operator [] inlined 46 P:m | adjacency_iterator adjs = adjacent_vertices(u, g); ++ function mtgl::compressed_sparse_row_graph<T1>::adjacency_iterator mtgl::adjacent_vertices<T1> inlined 31 P | size_type end = out_degree(u, g); ++ function mtgl::compressed_sparse_row_graph<T1>::size_type mtgl::out_degree<T1> inlined | | for (size_type j = 0; j < end; ++j) | { 46 PP:m | vertex_descriptor v = adjs[j]; ++ function T1 mtgl::detail::csr_adjacency_iterator<T1, T2>::operator [] inlined 46 PP:m | size_type vid = get(vid_map, v); ++ function T2 mtgl::get<T1, T2, T3> inlined | 46 PP:m | mt_incr(indeg[vid], 1); ++ function T1 mtgl::mt_incr<T1, T2> inlined | } | }

Here are the loop annotations for the appropriate loops.

Loop 20 in void count_indegree<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] in region 3 Stage 1 of recurrence Compiler generated Loop summary: 1 loads, 0 stores, 0 floating point operations 2 instructions, needs 45 streams for full utilization pipelined Loop 26 in void count_indegree<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] in region 3 Stage 1 recurrence communication Loop not pipelined: not structurally ok Compiler generated Loop summary: 11 instructions, 0 floating point operations 1 loads, 1 stores, 0 reloads, 0 spills, 3 branches, 0 calls Loop 31 in void count_indegree<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] at line 49 in region 3 Stage 2 of recurrence Loop summary: 1 loads, 2 stores, 0 floating point operations 3 instructions, needs 60 streams for full utilization pipelined Loop 46 in void count_indegree<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] at line 48 in loop 41 Parallel section of loop from level 2 Loop summary: 1 loads, 1 stores, 0 floating point operations 2 instructions, needs 45 streams for full utilization pipelined

Now let's look at the loop in count_indegree_filter(). The only difference between this and the previous version is the two if statements calling the is_valid() function. The is_valid() function takes three parameters: an iterator, the current entry of the iterator, and the graph. It returns a boolean indicating if the vertex or edge specified by the iterator and entry pair passes the filter. The test for a valid vertex is performed before grabbing the vertex because this lets the test compile away to nothing for unfiltered graphs as their is_valid() always returns true.

for (size_type i = 0; i < order; ++i) { if (is_valid(verts, i, g)) { vertex_descriptor u = verts[i]; adjacency_iterator adjs = adjacent_vertices(u, g); size_type end = out_degree(u, g); for (size_type j = 0; j < end; ++j) { if (is_valid(adjs, j, g)) { vertex_descriptor v = adjs[j]; size_type vid = get(vid_map, v); mt_incr(indeg[vid], 1); } } } }

Remember that this function is called on the original graph. Looking at the canal annotations, we indeed see that the cost is the same as using the code without testing for filtering.

Loop 21 in void count_indegree_filter<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] in region 4 Stage 1 of recurrence Compiler generated Loop summary: 1 loads, 0 stores, 0 floating point operations 2 instructions, needs 45 streams for full utilization pipelined Loop 27 in void count_indegree_filter<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] in region 4 Stage 1 recurrence communication Loop not pipelined: not structurally ok Compiler generated Loop summary: 11 instructions, 0 floating point operations 1 loads, 1 stores, 0 reloads, 0 spills, 3 branches, 0 calls Loop 32 in void count_indegree_filter<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] at line 81 in region 4 Stage 2 of recurrence Loop summary: 1 loads, 2 stores, 0 floating point operations 3 instructions, needs 60 streams for full utilization pipelined Loop 47 in void count_indegree_filter<T1>(T1 &, mtgl::graph_traits<T1>::size_type *) [with T1=mtgl::compressed_sparse_row_graph<mtgl::directedS>] at line 80 in loop 42 Parallel section of loop from level 2 Loop summary: 1 loads, 1 stores, 0 floating point operations 2 instructions, needs 45 streams for full utilization pipelined

The loop in count_indegree_filter_graph() is the same as the loop in count_indegree_filter(), but it is called on an unfiltered graph. We used two loops to show the difference in the canal annotations between passing an unfiltered graph and a filtered graph to the code. Here is the canal output. Note that the inner loop doesn't parallelize, and we don't get a loop collapse. Even using the ultimate hammer of adding assert parallel statements for both the inner and outer loops will not force the inner loop to parallelize and achieve a loop collapse. This is because of the if statement in the outer loop. It prevents the compiler from knowing how many times the inner loop will actually run.

| #pragma mta assert noalias *indeg | for (size_type i = 0; i < order; ++i) | { 16 P | if (is_valid(verts, i, g)) ++ function bool mtgl::is_valid<T1, T2, T3> inlined | { 16 P | vertex_descriptor u = verts[i]; ++ function mtgl::graph_traits<T3>::vertex_descriptor mtgl::detail::filter_vertex_iterator<T1, T2, T3>::operator [] inlined 16 P | adjacency_iterator adjs = adjacent_vertices(u, g); ++ function mtgl::filter_adapter<T1, T2, T3>::adjacency_iterator mtgl::adjacent_vertices<T1, T2, T3> inlined 16 P | size_type end = out_degree(u, g); ++ function mtgl::filter_adapter<T1, T2, T3>::size_type mtgl::out_degree<T1, T2, T3> inlined | | for (size_type j = 0; j < end; ++j) | { 22 P- | if (is_valid(adjs, j, g)) ++ function bool mtgl::is_valid<T1, T2, T3> inlined | { 22 P- | vertex_descriptor v = adjs[j]; ++ function mtgl::graph_traits<T3>::vertex_descriptor mtgl::detail::filter_vertex_iterator<T1, T2, T3>::operator [] inlined 22 P- | size_type vid = get(vid_map, v); ++ function T2 mtgl::get<T1, T2, T3> inlined | 22 P- | mt_incr(indeg[vid], 1); ++ function T1 mtgl::mt_incr<T1, T2> inlined | } | } | } | }

To compile and run this example on the XMT, issue the following commands from the tutorial directory

c++ -I.. -o filter_example filter_example.cpp mtarun -m <num processors> filter_example

The output should be as below which we can verify is correct by looking at the graph. The sum of the indegree of all the vertices in a directed graph is the number of edges in the graph. Taking away vertices 0 and 5 and the edges that touch them leaves four edges in the filtered graph.

indegree sum full graph: 8 indegree sum full graph filter code: 8 indegree sum filter graph filter code: 4

So, we see that the cost of filtering is the loss of a loop collapse. For single loops, using the filter_adapter shouldn't cause much performance degradation; however, the loss of the loop collapse for nested loops could make them have horrible performance depending on the graph structure.