Controlling access to private, personal and corporate data is a key capability for enterprises adopting LLMs. Skyflow's solution, as the name suggests, provides enterprises with a layer of data privacy and security throughout the entire lifecycle of their LLMs, beginning with data collection and continuing through model training and deployment.
It comes as enterprises across sectors continue to race to embed LLMs, like the GPT series of models, into their workflows to simplify processes and boost productivity.
Why a privacy vault for GPT models?
LLMs are all the rage today, helping with tasks like text generation and summarization. However, most of the models out there have been trained on publicly available data. This makes them suitable for broad public use, but not so much for the enterprise side of things.
To make LLMs work in specific enterprise settings, companies need to train them on their internal knowledge. A few have already done so or are in the process, but the task is not easy: the internal, business-critical data used to train the model has to be protected at every stage of the process.
This is exactly where Skyflow’s GPT privacy vault comes in.
Delivered via API, the solution establishes a secure environment, allowing users to define their sensitive data dictionary and have that information protected at all stages of the model lifecycle: data collection, preparation, model training, interaction and deployment. Once fully integrated, the vault uses the dictionary and automatically redacts or tokenizes the chosen information as it flows through GPT — without lessening the value of the output in any way.
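Skyflow has not published the exact shape of this API, so the sketch below is only a conceptual illustration of the flow the company describes: a user-defined dictionary of sensitive field types, and a tokenization step that swaps matches for opaque tokens before anything reaches the model. The `SENSITIVE_DICTIONARY`, the token format and the vault mapping are all hypothetical.

```python
import re
import uuid

# Hypothetical sensitive-data dictionary: each entry maps a field type
# to a pattern that detects it in free text. Skyflow's actual API shape
# is not public; every name here is illustrative only.
SENSITIVE_DICTIONARY = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def tokenize(text: str, vault: dict) -> str:
    """Replace each sensitive match with an opaque token, recording the
    token -> plaintext mapping in the vault for later de-tokenization."""
    for field, pattern in SENSITIVE_DICTIONARY.items():
        for value in set(pattern.findall(text)):
            token = f"<{field}_{uuid.uuid4().hex[:8]}>"
            vault[token] = value
            text = text.replace(value, token)
    return text

vault: dict[str, str] = {}
prompt = "Email jane.doe@acme.com about the claim for SSN 123-45-6789."
safe_prompt = tokenize(prompt, vault)
print(safe_prompt)
# Email <email_xxxxxxxx> about the claim for SSN <ssn_xxxxxxxx>.
```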
“Skyflow’s proprietary polymorphic encryption technique enables the model to seamlessly handle protected data as if it were plaintext,” Anshu Sharma, Skyflow cofounder and CEO, told VentureBeat. “It will protect all sensitive data flowing into GPT models and only reveal sensitive information to authorized parties once it has been processed by the model and returned.”
For example, Sharma explained, plaintext sensitive data elements like email addresses and social security numbers are swapped for Skyflow-managed tokens before inputs are provided to GPT models. This information is protected by multiple layers of encryption and fine-grained access control throughout model training, and ultimately de-tokenized after the GPT model returns its output. As a result, authorized end users get a seamless output experience, with plaintext sensitive data bypassing the GPT model entirely.
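Put together, the round trip Sharma describes (tokenize inputs, run the model on tokens only, then de-tokenize outputs for authorized parties) might look like the following sketch. It reuses the hypothetical `tokenize` helper above, `call_gpt` is a stand-in for whatever completion endpoint is in use, and the real product's encryption and access-control layers are elided.

```python
def call_gpt(prompt: str) -> str:
    # Stand-in for a real GPT completion call; echoes the input for demo.
    return f"Draft reply regarding: {prompt}"

def detokenize(text: str, vault: dict) -> str:
    """Swap tokens back to plaintext. In the real product this step would
    sit behind fine-grained access control, so only authorized parties
    ever see the sensitive values."""
    for token, plaintext in vault.items():
        text = text.replace(token, plaintext)
    return text

def ask_gpt_privately(prompt: str, authorized: bool) -> str:
    vault: dict[str, str] = {}
    safe_prompt = tokenize(prompt, vault)  # plaintext never leaves here
    raw_output = call_gpt(safe_prompt)     # model sees only opaque tokens
    # Sensitive values are restored only after the model has returned,
    # and only for callers with the right to see them.
    return detokenize(raw_output, vault) if authorized else raw_output
```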
“This works because GPT LLMs already break down inputs to analyze patterns and relationships between them and then make predictions about what comes next in the sequence. So, tokenizing or redacting sensitive data with Skyflow before inputs are provided to the LLM doesn’t impact the quality of GPT LLM output — the patterns and relationships remain the same as before plaintext sensitive data is tokenized by Skyflow,” Sharma added.
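The intuition is that a token occupies the same structural slot as the value it replaces, so the sequence the model reasons over is unchanged. A before-and-after pair, using the hypothetical token format from the sketches above, makes that concrete:

```python
original  = "Send the invoice to jane.doe@acme.com by Friday."
tokenized = "Send the invoice to <email_3f2a9c1d> by Friday."
# Both strings have the same shape: "Send the invoice to <X> by Friday."
# The model can still treat <X> as a recipient and predict the rest of
# the sequence; only the literal value has been withheld from it.
```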