/* ---- Google Analytics Code Below */

Saturday, December 04, 2021

The Future is Big Graphs

Like the thought.    But how do we define big here? Complete?

The Future Is Big Graphs: A Community View on Graph Processing Systems

By Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A. Boncz, Khuzaima Daudjee, Emanuele Della Valle, Stefania Dumbrava, Olaf Hartig, Bernhard Haslhofer, Tim Hegeman, Jan Hidders, Katja Hose, Adriana Iamnitchi, Vasiliki Kalavri, Hugo Kapp, Wim Martens, M. Tamer Özsu, Eric Peukert, Stefan Plantikow, Mohamed Ragab, Matei R. Ripeanu, Semih Salihoglu, Christian Schulz, Petra Selmer, Juan F. Sequeda, Joshua Shinavier

Communications of the ACM, September 2021, Vol. 64 No. 9, Pages 62-71    10.1145/3434642

Graphs are, by nature, 'unifying abstractions' that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue to succeed?

We are witnessing an unprecedented growth of interconnected data, which underscores the vital role of graph processing in our society. Instead of a single, exemplary ("killer") application, we see big graph processing systems underpinning many emerging but already complex and diverse data management ecosystems, in many areas of societal interest.a

To name only a few recent, remarkable examples, the importance of this field for practitioners is evidenced by the large number (more than 60,000) of people registeredb to download the Neo4j book Graph Algorithmsc in just over one-and-a-half years, and by the enormous interest in the use of graph processing in the artificial intelligence (AI) and machine learning (ML) fields.d Furthermore, the timely Graphs 4 COVID-19 initiativee is evidence of the importance of big graph analytics in alleviating the pandemic.

Academics, start-ups, and even big tech companies such as Google, Facebook, and Microsoft have introduced various systems for managing and processing the growing presence of big graphs. Google's PageRank (late 1990s) showcased the power of Web-scale graph processing and motivated the development of the MapReduce programming model, which was originally used to simplify the construction of the data structures used to handle searches, but has since been used extensively outside of Google to implement algorithms for large-scale graph processing.

Motivated by scalability, the 2010 Google Pregel "think-like-a-vertex" model enabled distributed PageRank computation, while Facebook, Apache Giraph, and ecosystem extensions support more elaborate computational models (such as task-based and not always distributed) and data models (such as diverse, possibly streamed, possibly wide-area data sources) useful for social network data. At the same time, an increasing number of use cases revealed RDBMS performance problems in managing highly connected data, motivating various startups and innovative products, such as Neo4j, Sparksee, and the current Amazon Neptune. Microsoft Trinity and later Azure SQL DB provided an early distributed database-oriented approach to big graph management.   ... ' 

No comments: