
Sunday, August 16, 2020

Optimization Assisting the User of Large Databases

Machine learning tools can adapt how we carry out complex operations, such as analyzing increasingly large databases for complex uses. Here is work at MIT in this space. I can see this being taken further, to analyze and leverage the context in which the data will be used.

MIT Is Developing a Tool for Machine Learning-Powered Data Retrieval, by Oliver Peckham in Datanami

With the global deluge of data, the opportunities are endless, but so are the challenges. Within five years, the world’s data is estimated to reach 175 zettabytes: enough to fill more than 20 one-terabyte hard drives for every single person alive. In the context of such a data-driven world, managing and sorting through that data is a task that gets harder by the day, with database and query managers struggling to keep up. Now, researchers from MIT are developing a tool to intelligently assist users of large databases.

“It’s like building a database system for every application from scratch, which is not economically feasible with traditional system designs,” explained MIT Professor Tim Kraska in an interview with MIT’s Adam Conner-Simons. Kraska and his colleagues – from the institute’s Computer Science and Artificial Intelligence Laboratory (CSAIL) – are debuting a design for what they call “instance-optimized systems”: database systems that are able to optimize and reorganize themselves in response to the data types and workloads at hand. 

MIT’s instance-optimized system will be the child of two parents: the “Tsunami” and “Bao” tools. Using machine learning, Tsunami (a successor to “Flood”) interprets user queries to reorganize the layouts of databases. Bao, meanwhile, uses machine learning to intelligently pick the appropriate plan for completing a given query. On their own, Tsunami improved query speed up to tenfold, while Bao-created query plans ran up to 50% faster. Combined, the two form the instance-optimized system.

“Query optimizers have been around for years, but they often make mistakes, and usually they don’t learn from them. That’s where we feel that our system can make key breakthroughs, as it can quickly learn for the given data and workload what query plans to use and which ones to avoid,” Kraska said. “Our hope is that a system like this will enable much faster query times, and that people will be able to answer questions they hadn’t been able to answer before.”   ...
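To make the idea of "learning which query plans to use and which to avoid" concrete, here is a toy sketch of my own. It is not Bao's algorithm, which the article does not describe in detail. It assumes a simple epsilon-greedy strategy: mostly pick the plan with the lowest observed average latency, occasionally try another one. The plan names, the latencies, and the `PlanSelector` class are all hypothetical, and the "query execution" is simulated.

```python
import random
from collections import defaultdict


class PlanSelector:
    """Toy epsilon-greedy chooser among candidate query plans (hypothetical)."""

    def __init__(self, plans, epsilon=0.1, seed=0):
        self.plans = list(plans)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.total_latency = defaultdict(float)
        self.runs = defaultdict(int)

    def mean_latency(self, plan):
        # Untried plans report 0.0 so that each one gets tried at least once.
        if self.runs[plan] == 0:
            return 0.0
        return self.total_latency[plan] / self.runs[plan]

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best-known plan.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.plans)
        return min(self.plans, key=self.mean_latency)

    def record(self, plan, latency_ms):
        # Feed the observed latency back in; this is the "learning from mistakes" step.
        self.total_latency[plan] += latency_ms
        self.runs[plan] += 1

    def best_plan(self):
        # Best plan among those that have actually been run.
        tried = [p for p in self.plans if self.runs[p] > 0]
        return min(tried, key=self.mean_latency)


# Simulated "true" latencies in milliseconds (made-up numbers for illustration).
TRUE_LATENCY_MS = {"hash_join": 40.0, "nested_loop": 400.0, "merge_join": 90.0}

noise = random.Random(1)
selector = PlanSelector(TRUE_LATENCY_MS.keys(), epsilon=0.1, seed=0)

for _ in range(200):
    plan = selector.choose()
    # Pretend to run the query: true latency with up to 10% jitter either way.
    observed = TRUE_LATENCY_MS[plan] * noise.uniform(0.9, 1.1)
    selector.record(plan, observed)

print(selector.best_plan(), dict(selector.runs))
```

After a couple of hundred simulated queries the selector settles on the fastest plan and runs the slow ones only on the occasional exploratory pick. A real system would also have to account for the query itself and the data layout, which this sketch ignores.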
