Some good points made.
Is AI threatened by too little data? By Mary Shacklett, in Artificial Intelligence at TechRepublic
Knowing when to limit your data dramatically affects the quality of your AI. How do you know when your AI has enough data?
Whether it’s due to a lack of funding, a lack of know-how, or censorship, some governments and entities are shrinking the amount of data that they incorporate into their AI. Does this compromise the integrity of AI results?
The case for shrinking the data
Intentional data shrinking is occurring as a matter of policy and expediency.
Roya Ensafi, assistant professor of computer science and engineering at the University of Michigan, discovered that censorship was increasing in 103 countries. ...
Most censorship actions “were driven by organizations or internet service providers filtering content,” Ensafi reported. “While the United States saw a smaller uptick in blocking activity, the groundwork for such blocking has been put in place.”
In other industry sectors, analytics providers and companies work hard to shrink the amount of data they admit into their processing pipelines and data repositories, keeping only the data they deem relevant to the problem they are trying to solve.
In 2018, the U.S. Census Bureau moved to reduce the amount of data it was collecting on citizens, even at the cost of accuracy, in order to protect citizen privacy.
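The mechanism behind that trade-off is worth a sketch. The Bureau's approach, its 2020 disclosure-avoidance system, is based on differential privacy, which deliberately adds calibrated random noise to published counts so that no individual can be re-identified. Here is a minimal sketch of the idea in Python; the counts and epsilon values are invented for illustration and are not the Bureau's actual parameters:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count: int, epsilon: float) -> float:
    """Publish a count with Laplace noise of scale 1/epsilon.

    Smaller epsilon means stronger privacy and a noisier,
    less accurate published figure.
    """
    return true_count + laplace_noise(1.0 / epsilon)

# Illustrative numbers only: the same block-level count published
# under a weak and a strong privacy setting.
print(private_count(1042, epsilon=1.0))    # usually lands near 1042
print(private_count(1042, epsilon=0.01))   # can miss by hundreds
```

The accuracy loss is not a bug; it is the price paid, by design, for the privacy guarantee.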
All of these use cases have clear-cut business objectives, but what is the net impact of these data exclusions on the quality of the AI that operates on the remaining data?
How AI “misses” when data is missing
Sanjiv Narayan, professor of medicine at Stanford University School of Medicine, explains how missing data can impact healthcare.
“Think of height in the U.S.,” said Narayan. “If you collected them and put them all onto a chart, you’d find overlapping groups or clusters of taller and shorter people, broadly indicating adults and children and those in between. However, who was surveyed to get the heights? Was this done during the weekdays or on weekends, when different groups of people are working? If heights were measured at medical offices, people without health insurance may be left out. If done in the suburbs, you’ll get a different group of people compared to those in the countryside or those in cities. How large was the sample?”
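Narayan's point can be made concrete with a toy simulation, with all population numbers below invented for illustration: if the survey mostly reaches one subgroup, the statistic it yields describes that subgroup, not the population.

```python
import random
import statistics

random.seed(42)

# Invented toy population, heights in cm: 75% adults, 25% children.
adults = [random.gauss(170, 8) for _ in range(7_500)]
children = [random.gauss(125, 15) for _ in range(2_500)]
population = adults + children

# A "measured at the medical office on a weekday" style sample that
# reaches almost no children: 950 adults, 50 children.
biased_sample = random.sample(adults, 950) + random.sample(children, 50)

print(f"true mean height:   {statistics.mean(population):.1f} cm")
print(f"biased sample mean: {statistics.mean(biased_sample):.1f} cm")
# The biased sample overstates average height by roughly 10 cm.
```

No amount of extra computation on the biased sample recovers the missing children; the error is in what was collected, not how it was processed.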
The Amazon hiring algorithm that attracted controversy in 2018 illustrates this well.
Amazon’s AI-propelled recruiting engine was trained on historical data about successful job candidates from a time when most candidates were male. Observing this pattern, the AI taught itself that male candidates were preferable to female ones. Consequently, the company missed out on many qualified female applicants. ...
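Amazon never published the model, but the failure mode is easy to reproduce with any off-the-shelf classifier. Here is a hypothetical sketch with scikit-learn, in which the features, data, and resulting coefficients are entirely invented: when historical "hired" labels correlate with a gender proxy, the fitted model encodes that proxy as a penalty.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# Invented features: years of experience, plus a gender proxy of the
# kind reported in the Amazon case (e.g., "women's chess club" on a resume).
experience = rng.normal(5.0, 2.0, n)
is_female = rng.integers(0, 2, n)

# Historical labels reflect past skew: equally experienced women were
# hired less often, so the label correlates with the proxy.
hired = experience + rng.normal(0.0, 1.0, n) - 1.5 * is_female > 4.5

model = LogisticRegression().fit(
    np.column_stack([experience, is_female]), hired
)

# The weight on the proxy comes out strongly negative: the model has
# learned the historical bias, not anything about candidate quality.
print(dict(zip(["experience", "is_female"], model.coef_[0].round(2))))
```

Simply deleting the explicit proxy column does not fix this, since other correlated features can stand in for it, which is reportedly why Amazon's attempts to edit the model's vocabulary were not considered a sufficient remedy.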