Preliminary Write-up

We implemented a JavaScript framework for calculating the layout for large-scale graphs in the web platform. We use the GLSL on WebGL to do general purpose computation for the graph layout algorithm.

This is a significant feature for data scientists, artists, and journalists, who would love to visualize and analysis large-scale graph efficiently in an interactive manner. They want to access it across all platforms by just opening a web page, instead of installing a software like Gephi.

Major Technical Challenges

Preliminary Result

We have achieved significant speedup, compared to an implementation without WebGL on Fruchterman Reingold graph layout algorithm. To our knowledge, we are the first one who implements graph layout algorithms with WebGL.

We test the program multiple times on a graph with 100, 500, 1000 edges and get the average runtime. The error is within 3%. We could achieve about 32x speedup for the layout of a graph with 500 edges in 10000 iterations on New MacBook. Furthermore, we get 75x speedup for a graph with 1000 edges in 10000 iterations.

Plan for Friday

FAQ

What is the baseline that speedup is being compared against?

The baseline is an implementation of Fruchterman Reingold layout algorithm provided by Sigma.js (a popular JavaScript library dedicated to graph visualization). It does not use GPU to calculate the layout. Our version uses the same algorithm and configurations to make a fair comparison.

Force-directed layout algorithms need to be interactive to be useful. What are the milis-er-frame times for your graphs?

For a graph with 2092 nodes and 5000 edges, it takes 1830 ms to finish 1000 iterations, on a Macbook Air. That’s to say, calculating the layout of each frame takes only 1.83 ms. (In the current setting, we only render after finishing all iterations. We will invoke rendering more frequently to have a more interactive manner.)

We’ll want to know a very brief description of the workload characteristics of the algorithm.

In one iteration, each node will compute a repulsive force by accessing the positions of all other nodes, and an attractive force by accessing the positions of connected nodes, and then add gravity and speed to get a new position. Approximately, it has about 10 float points operations every 16 bytes memory access. I would consider the workload as memory intensive.

Hopefully you can demo on some real-world graphs?

The graphs we used in the benchmark are sampled from a collaboration network of authors on arXiv: