Solving deep learning problems requires considerable and varied data. I was recently involved in one such problem, which aimed to learn facial demographics and then do real-time classification. It quickly became clear that this kind of data was hard to get in sufficient volume, though services like Facebook had lots of it. Technically, scraping it online is easy, but the data would then be used without direct consent. The company involved rejected that approach and looked for other places to get data.
IBM’s photo-scraping scandal shows what a weird bubble AI researchers live in
On Tuesday, NBC published a story with a gripping headline: “Facial recognition’s ‘dirty little secret’: Millions of online photos scraped without consent.” I linked to it in our last Algorithm issue, but it’s worth a revisit today.
The story highlights a recent data set released by IBM with 1 million pictures of faces, intended to help develop fairer face recognition algorithms. (I wrote about the news at the time too.) It turns out, NBC found, that those faces were scraped directly from the online photo-hosting site Flickr, without the permission of the subjects or photographers. .... "