
Thursday, February 21, 2019

Knowledge Graphs, Governance, Learning and Much More

I have now been involved in a number of efforts in this area; it is worth understanding, since Google has made an impressive run at it. The article tells a historical journey I have traveled as well, and we may finally be getting to real enterprise value. Regulations like GDPR are also forcing us to take notice of the need to really understand our data. The article is long, but it has good points to make.

The Semantic Zoo - Smart Data Hubs, Knowledge Graphs and Data Catalogs   By Kurt Cagle, Contributor, in Forbes

COGNITIVE WORLD Contributor Group
Sometimes, you can enter into a technology too early. The groundwork for semantics was laid down in the late 1990s and early 2000s, with Tim Berners-Lee’s stellar Semantic Web article, debuting in Scientific American in 2001, seen by many as the movement’s birth. Yet many early participants in the field of semantics discovered a harsh reality: computer systems were too slow to handle the intense indexing requirements the technology needed, the original specifications and APIs failed to handle important edge cases, and, perhaps most importantly, the real-world use cases where semantics made sense were simply not at a large enough scope; they could easily be met by existing approaches and technology.

Semantics faded around 2008, echoing the pattern of the Artificial Intelligence Winter of the 1970s. JSON was all the rage, then mobile apps; big data came on the scene even as Javascript underwent a radical transformation, and all of a sudden everyone wanted to be a data scientist (until they discovered that data science was mostly math). Meanwhile, from the dim recesses of the trough of despair, semantics was readying itself for its own metamorphosis. Several semantic standards, including the SPARQL query language along with a new update language, began seeing implementations by 2015. Servers became faster and cheaper, and the rise of graphics processing units (GPUs), fueled by the gaming and entertainment industry, provided tools for a new class of graph databases.

Meanwhile, the Big Data initiatives that had marked the early part of the 2010s were facing some real problems. The original promise of Hadoop as a map/reduce framework had ended up creating large numbers of data lakes that aggregated content but sat under-utilized. Data scientists struggled to deal with dirty data that was really no cleaner for having been put in data lakes. JSON databases had grown in popularity, but they were proving hard to query in a consistent fashion, and all too many Hadoop projects ended up becoming large, slow, but cheap data graveyards for regulatory data (the kind of data that must be retained for five years). ...
