Graph analysis lets you reveal latent information that is encoded not as fields in your data, but as direct and indirect relationships (metadata) between elements of your data. This information is not obvious to the naked eye, but can have tremendous value once uncovered.
PGX is a toolkit for graph analysis. It supports both running algorithms such as PageRank against graphs and performing SQL-like pattern matching against graphs, using the results of algorithmic analysis. Algorithms are parallelized for high performance. The PGX toolkit includes both a single-node in-memory engine and a distributed engine for extremely large graphs. Graphs can be loaded from a variety of sources, including flat files, SQL and NoSQL databases, Apache Spark, and Hadoop; incremental updates are supported.
The tools included as part of the PGX distribution include:
The typical usage pattern in PGX involves:
Loading graphs from a variety of sources, such as relational databases, NoSQL databases, Apache Spark / Hadoop, and flat files.
Applying graph pattern matching: PGX includes an SQL-like query language for matching subgraphs based on their connections, properties, or both. Further analytics can be run against the matched subgraphs.
Running parallel, high-performance graph algorithms: PGX provides built-in implementations of many popular graph algorithms. Users can apply these algorithms to their graph data sets by invoking the appropriate methods.
Browsing and exporting results: Once the analysis is finished, users can browse the results and export them to the file system.
In addition, there are features for filtering graphs, extracting subgraphs and much more, and graphs can be saved for later use.
In the latest PGX version, we have added features such as Apache Spark support and the ability to export compiled Green-Marl programs as Java JAR files. Check out our what's new page for the latest features.
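The load, match, and export steps of the usage pattern described above can be illustrated with a minimal, self-contained sketch in plain Java. It does not use the PGX API or the PGX query language: the class `UsagePatternSketch`, its methods, the whitespace-separated edge format, and the `amount` edge property are all hypothetical names chosen for illustration. The sketch matches the two-hop pattern a -> b -> c, keeping only edges whose property passes a threshold, which is one simple case of matching subgraphs on both connections and properties.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not part of the PGX API.
public class UsagePatternSketch {

    /** A directed edge carrying one numeric property (assumed for this example). */
    static final class Edge {
        final String src;
        final String dst;
        final double amount;

        Edge(String src, String dst, double amount) {
            this.src = src;
            this.dst = dst;
            this.amount = amount;
        }
    }

    /** Step 1, loading: parse lines of the assumed form "src dst amount". */
    static List<Edge> loadEdges(List<String> lines) {
        List<Edge> edges = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            edges.add(new Edge(parts[0], parts[1], Double.parseDouble(parts[2])));
        }
        return edges;
    }

    /**
     * Step 2, pattern matching: find all paths a -> b -> c where both edges
     * have amount >= minAmount and a differs from c.
     */
    static List<String[]> matchTwoHop(List<Edge> edges, double minAmount) {
        // Index the qualifying edges by source vertex for fast lookup.
        Map<String, List<Edge>> outgoing = new HashMap<>();
        for (Edge e : edges) {
            if (e.amount >= minAmount) {
                outgoing.computeIfAbsent(e.src, k -> new ArrayList<>()).add(e);
            }
        }
        List<String[]> matches = new ArrayList<>();
        for (Edge first : edges) {
            if (first.amount < minAmount) {
                continue;
            }
            List<Edge> next = outgoing.getOrDefault(first.dst, Collections.<Edge>emptyList());
            for (Edge second : next) {
                if (!second.dst.equals(first.src)) {
                    matches.add(new String[] {first.src, first.dst, second.dst});
                }
            }
        }
        return matches;
    }

    /** Step 3, exporting: render the matches as CSV text with a header row. */
    static String toCsv(List<String[]> matches) {
        StringBuilder sb = new StringBuilder("a,b,c\n");
        for (String[] m : matches) {
            sb.append(String.join(",", m)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("alice bob 500");
        lines.add("bob carol 700");
        lines.add("carol dave 20");
        lines.add("bob alice 900");
        System.out.print(toCsv(matchTwoHop(loadEdges(lines), 100.0)));
    }
}
```

In a real deployment these steps would go through the PGX loaders, query language, and export facilities rather than hand-written loops; the sketch only shows the shape of the workflow.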
Fast, parallel, in-memory execution: PGX is a fast, parallel, in-memory graph analytic framework. PGX adopts lightweight in-memory data structures that allow fast execution of graph algorithms. Moreover, PGX exploits the multiple CPUs of modern computer systems by running parallelized graph algorithms. Not only are the built-in algorithms parallelized, but custom graph algorithms are also parallelized automatically with the help of a DSL compiler.
Rich built-in algorithms: PGX provides built-in implementations of many popular graph algorithms, including algorithms for computing various centrality measures, finding shortest paths, finding and evaluating clusters and components, and predicting future edges. (Note: The OTN public release contains only a small subset of these algorithms. See the documentation and contact us if you want to remove this limitation.)
Easy implementation and efficient execution of custom algorithms: PGX adopts the Green-Marl DSL to make custom algorithms both easy to implement and efficient to execute. Users can program their own graph algorithms intuitively by using the high-level, graph-specific data types and operators in Green-Marl. PGX executes a Green-Marl program efficiently by parallelizing it and mapping it onto the PGX-internal API.
Interactive Shell: PGX provides a shell application with which the user can exercise the PGX features interactively. That is, the user can simply start the shell and type commands at the shell command line, instead of creating a whole Java application for the analysis.
Deploy as a web service: PGX ships with a web application which can be deployed in a container like WebLogic, Jetty or Tomcat. This allows you to use the interactive shell and other APIs against a remote instance. You can deploy PGX on a server-class machine and have multiple clients share access to the resources of that machine.
Hadoop support: You can use PGX to analyze graphs on a Hadoop cluster. You can run PGX as a YARN application and connect to it from the interactive shell or other APIs. PGX also supports loading graphs from and storing graphs to HDFS.
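To make the idea of parallel, in-memory execution over lightweight data structures more concrete, here is a minimal sketch of PageRank in plain Java. It is not PGX's implementation and does not use the PGX API: the class `PageRankSketch`, the `CsrGraph` type, and all parameter names are hypothetical. The sketch assumes a compressed sparse row (CSR) array layout for incoming edges and uses Java parallel streams so that each vertex's new rank is computed independently on the available CPU cores.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative only: hypothetical names, not PGX's implementation or API.
public class PageRankSketch {

    /** Compact array-based (CSR) storage of each vertex's incoming edges. */
    static final class CsrGraph {
        final int n;            // number of vertices, labeled 0..n-1
        final int[] inOffsets;  // in-edges of v are inSources[inOffsets[v] .. inOffsets[v+1])
        final int[] inSources;  // source vertex of each incoming edge
        final int[] outDegree;  // number of outgoing edges per vertex

        CsrGraph(int n, int[][] edges) {
            this.n = n;
            outDegree = new int[n];
            int[] inDegree = new int[n];
            for (int[] e : edges) {
                outDegree[e[0]]++;
                inDegree[e[1]]++;
            }
            inOffsets = new int[n + 1];
            for (int v = 0; v < n; v++) {
                inOffsets[v + 1] = inOffsets[v] + inDegree[v];
            }
            inSources = new int[edges.length];
            int[] cursor = Arrays.copyOf(inOffsets, n);
            for (int[] e : edges) {
                inSources[cursor[e[1]]++] = e[0];
            }
        }
    }

    /** Iterates until the L1 change drops below tolerance or maxIterations is reached. */
    static double[] pageRank(CsrGraph g, double damping, double tolerance, int maxIterations) {
        double[] rank = new double[g.n];
        Arrays.fill(rank, 1.0 / g.n);
        for (int iter = 0; iter < maxIterations; iter++) {
            final double[] current = rank;
            // Rank held by vertices without out-edges is spread uniformly.
            double dangling = IntStream.range(0, g.n).parallel()
                    .filter(v -> g.outDegree[v] == 0)
                    .mapToDouble(v -> current[v])
                    .sum();
            double base = (1.0 - damping) / g.n + damping * dangling / g.n;
            double[] next = new double[g.n];
            // Each vertex writes only its own slot, so no locking is needed.
            IntStream.range(0, g.n).parallel().forEach(v -> {
                double sum = 0.0;
                for (int i = g.inOffsets[v]; i < g.inOffsets[v + 1]; i++) {
                    int u = g.inSources[i];
                    sum += current[u] / g.outDegree[u];
                }
                next[v] = base + damping * sum;
            });
            double diff = IntStream.range(0, g.n).parallel()
                    .mapToDouble(v -> Math.abs(next[v] - current[v]))
                    .sum();
            rank = next;
            if (diff < tolerance) {
                break;
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {3, 2}};
        double[] rank = pageRank(new CsrGraph(4, edges), 0.85, 1e-9, 100);
        for (int v = 0; v < rank.length; v++) {
            System.out.printf("vertex %d: %.4f%n", v, rank[v]);
        }
    }
}
```

The pull-based formulation (each vertex reads from its in-neighbors) is chosen here because it avoids concurrent writes to shared state; whether PGX uses the same strategy is not stated in this overview.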
PGX can be used in several ways:
In a Java (or Scala, Groovy, or other JVM language) application: Either the entire PGX runtime or the PGX client (which talks to a remote PGX server) can be used as a library embedded in a Java application.
Interactively from the shell: The user can also use PGX as if it were a separate application by using the PGX shell. Once the PGX shell is started, the user can load graphs, invoke algorithms, and browse or export results with simple shell commands.
Remote usage: For both use cases above, you can use PGX either locally or remotely. In the remote case, you need to start PGX on a web server and provide the client with a hostname and port to connect to. If you use PGX locally, it simply spins up a local PGX instance on which you can work without any HTTP overhead.
This version of PGX is released under the OTN license. Please see the documentation for more details about the OTN release and its limitations.
Please see the installation documentation, which explains how to install PGX.
We have set up an OTN Community to gather feedback and to provide a space for users of PGX to communicate. We encourage you to discuss issues and use cases related to PGX there. We especially welcome suggestions for improvement and interesting features you would like us to add in future releases. This community is also a great place to discuss novel graph algorithms with fellow PGX users and developers.
Sungpack Hong, Hassan Chafi, Eric Sedlar, and Kunle Olukotun. 2012. "Green-Marl: a DSL for easy and efficient graph analysis." SIGPLAN Not. 47, 4 (March 2012), 349-362.
Sungpack Hong, Semih Salihoglu, Jennifer Widom, and Kunle Olukotun. "Simplifying Scalable Graph Processing with a Domain-Specific Language." Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. ACM, 2014.
Sungpack Hong, Nicole C. Rodia, and Kunle Olukotun. "On fast parallel detection of strongly connected components (SCC) in small-world graphs." Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2013.
Adam Welc, Raghavan Raman, Zhe Wu, Sungpack Hong, Hassan Chafi, and Jay Banerjee. "Graph analysis: do we have to reinvent the wheel?" In First International Workshop on Graph Data Management Experiences and Systems, ACM, 2013.
Sungpack Hong, Jan Van Der Lugt, Adam Welc, Raghavan Raman, and Hassan Chafi. "Early experiences in using a domain-specific language for large-scale graph analysis." In First International Workshop on Graph Data Management Experiences and Systems, ACM, 2013.
Sungpack Hong, Tayo Oguntebi, and Kunle Olukotun. "Efficient parallel graph exploration on multi-core CPU and GPU." Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on. IEEE, 2011.
Sungpack Hong, Sang Kyun Kim, Tayo Oguntebi, and Kunle Olukotun. "Accelerating CUDA graph algorithms at maximum warp." ACM SIGPLAN Notices 46, no. 8 (2011): 267-276.