Welcome to Parallel Graph Analytics (PGX)


What is PGX?

PGX is a fast, parallel, in-memory graph analytics framework. With PGX, users can load graphs into main memory, run graph algorithms on them efficiently, explore the results, and export them back to the file system.


What can I do with PGX?

  • Loading graphs into memory: As an in-memory framework, PGX must load a graph into main memory before it can run any analysis on that graph. PGX supports a few popular graph file formats for convenient data loading.

  • Running built-in graph algorithms: PGX provides built-in implementations of many popular graph algorithms. Users can apply these algorithms to their graph data sets by invoking the appropriate methods.

  • Running custom graph algorithms: PGX can also execute custom (i.e., user-provided) graph algorithms. Users can write their own graph algorithms in the Green-Marl DSL and pass them to PGX. A parallelizing compiler transforms the Green-Marl program into a form that PGX can execute.

  • Mutating graphs: Complicated graph analyses often consist of multiple steps, some of which require graph-mutating operations. For example, one may want to create an undirected version of the graph, renumber the nodes in the graph, or remove repeated edges between nodes. PGX provides fast, parallel built-in implementations of such operations.

  • Browsing and exporting results: Once the analysis is finished, users can browse the results and export them to the file system.
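To make the graph-mutating operations mentioned above more concrete, here is a minimal, self-contained Java sketch. It is not PGX code and does not use the PGX API: the class name GraphMutationSketch, its methods, and the representation of a graph as an array of {src, dst} pairs are all assumptions made for illustration. The sketch is sequential and only shows what "remove repeated edges" and "create an undirected version" mean; the built-in PGX implementations are parallel.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names belong to the PGX API.
class GraphMutationSketch {

    // Packs a (src, dst) pair into one long so that edges can be stored in a set.
    static long key(int src, int dst) {
        return ((long) src << 32) | (dst & 0xffffffffL);
    }

    // Unpacks every key back into a {src, dst} array, preserving insertion order.
    static int[][] toEdges(Set<Long> keys) {
        int[][] edges = new int[keys.size()][];
        int i = 0;
        for (long k : keys) {
            edges[i++] = new int[] {(int) (k >> 32), (int) k};
        }
        return edges;
    }

    // Removes repeated edges, keeping the first occurrence of each (src, dst) pair.
    static int[][] removeRepeatedEdges(int[][] edges) {
        Set<Long> keys = new LinkedHashSet<>();
        for (int[] e : edges) {
            keys.add(key(e[0], e[1]));
        }
        return toEdges(keys);
    }

    // Builds an undirected version by adding the reverse of every edge;
    // the set drops any edge that is already present.
    static int[][] undirect(int[][] edges) {
        Set<Long> keys = new LinkedHashSet<>();
        for (int[] e : edges) {
            keys.add(key(e[0], e[1]));
            keys.add(key(e[1], e[0]));
        }
        return toEdges(keys);
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 1}, {1, 2}};
        System.out.println(Arrays.deepToString(removeRepeatedEdges(edges)));
        System.out.println(Arrays.deepToString(undirect(edges)));
    }
}
```

Renumbering nodes would follow the same pattern: build a map from old node IDs to dense new IDs, then rewrite each edge through that map.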


What are the key benefits of PGX?

  • Fast, parallel, in-memory execution: PGX keeps graphs in light-weight in-memory data structures that allow fast execution of graph algorithms. Moreover, PGX exploits the multiple CPUs of modern computer systems by running parallelized graph algorithms. Both the built-in algorithms and custom graph algorithms are parallelized; custom algorithms are parallelized automatically by a DSL compiler.

  • Rich built-in algorithms: PGX provides built-in implementations of many popular graph algorithms, including computing various centrality measures, finding shortest paths, finding and evaluating clusters and components, and predicting future edges. (Note: The OTN public release contains only a small subset of these algorithms. See the Readme, and contact us if you want this limitation removed.)

  • Easy implementation and efficient execution of custom algorithms: PGX adopts the Green-Marl DSL so that custom algorithms are both easy to implement and efficient to execute. Users can program their own graph algorithms intuitively with the high-level, graph-specific data types and operators in Green-Marl. PGX executes a Green-Marl program efficiently by parallelizing it and mapping it onto the PGX-internal API.

  • Interactive shell: PGX provides a shell application with which users can exercise the PGX features interactively. That is, a user can simply start the shell and type commands at the shell command line, instead of writing a whole Java application for the analysis.
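As an illustration of how a light-weight in-memory graph structure and multi-core execution can fit together, here is a small Java sketch. It is not the PGX implementation and says nothing about PGX internals: the class CsrSketch, the compressed-sparse-row (CSR) layout, and the use of Java parallel streams are all assumptions chosen for this example. It computes out-degrees and in-degrees, two very simple centrality measures, in parallel over the nodes.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.stream.IntStream;

// Hypothetical illustration only; none of these names belong to the PGX API.
class CsrSketch {
    // Out-neighbors of node v are neighbors[begin[v]] .. neighbors[begin[v + 1] - 1].
    final int[] begin;
    final int[] neighbors;

    // Builds the CSR arrays from {src, dst} pairs; node IDs must lie in 0..numNodes-1.
    CsrSketch(int numNodes, int[][] edges) {
        begin = new int[numNodes + 1];
        for (int[] e : edges) {
            begin[e[0] + 1]++;                 // count out-edges per source node
        }
        for (int v = 0; v < numNodes; v++) {
            begin[v + 1] += begin[v];          // prefix sum turns counts into offsets
        }
        neighbors = new int[edges.length];
        int[] next = begin.clone();            // next free slot for each source node
        for (int[] e : edges) {
            neighbors[next[e[0]]++] = e[1];
        }
    }

    int numNodes() {
        return begin.length - 1;
    }

    // Out-degree of every node; each node is independent, so no synchronization is needed.
    int[] outDegrees() {
        return IntStream.range(0, numNodes())
                .parallel()
                .map(v -> begin[v + 1] - begin[v])
                .toArray();
    }

    // In-degree of every node; several threads may hit the same target node,
    // so the counters are updated atomically.
    int[] inDegrees() {
        AtomicIntegerArray counts = new AtomicIntegerArray(numNodes());
        IntStream.range(0, numNodes()).parallel().forEach(v -> {
            for (int i = begin[v]; i < begin[v + 1]; i++) {
                counts.incrementAndGet(neighbors[i]);
            }
        });
        int[] result = new int[numNodes()];
        for (int v = 0; v < result.length; v++) {
            result[v] = counts.get(v);
        }
        return result;
    }

    public static void main(String[] args) {
        CsrSketch g = new CsrSketch(4, new int[][] {{0, 1}, {0, 2}, {1, 2}, {3, 0}});
        System.out.println(Arrays.toString(g.outDegrees()));
        System.out.println(Arrays.toString(g.inDegrees()));
    }
}
```

The two flat integer arrays avoid per-node and per-edge objects, which is one common way to keep an in-memory graph compact and cache-friendly.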


How can I use PGX? What does the PGX API look like?

PGX can be used in two ways.

  1. Embedded in a Java application: Since PGX is implemented as a set of Java classes, users can embed PGX into their Java application as a library. The application, however, must start up PGX appropriately before it invokes any PGX methods.

  2. Interactively from the shell: Users can also use PGX as if it were a separate application, through the PGX shell. Once the PGX shell is started, the user can load graphs, invoke algorithms, and browse or export results with simple shell commands.

See our tutorials for both the embedded and the shell use cases.

For embedded use, PGX provides two API levels.

  1. Core API: The Core API is a single layer of API methods that expose all of the features of PGX. The Core API allows detailed, fine-grained control of the PGX engine.

  2. Analyst API: The Analyst API is a convenience layer that provides easy access to the built-in algorithms. It hides the details of the Core API.


What is the license of PGX?

This version of PGX is released under the OTN license. Please see the Readme document for more details about the license.


How can I install PGX on my system?

Please see the Readme document, which includes the installation guide.