Narrow AI needs data to mine for patterns, which it then regurgitates for the next person who comes along and asks it a question.
AI developers have gathered as many texts as they could, free, legal, or otherwise. They've also siphoned off tons of data from public forums such as Reddit and Twitter for God knows how long, until those sites started to close their doors.
This formed a good base of historical knowledge up to ~2022, give or take.
Now let's fast forward a bit. Public forums and social networks that house information have closed their doors to prevent losing traffic. Take Reddit as an example. It's a known meme that anyone seeking tech answers should just append "reddit" to their search, and they'll probably find better and more accurate answers than elsewhere.
Then you have places like SourceForge or Hack Reactor, and even Twitter. All of these places rely on foot traffic, but a good chunk of that foot traffic comes from people just googling.
Now Bing and Google have AI, so users stay on Google or Bing while they ask their question and the AI spits out curated search results.
But what happens when it's time to gather new info from 2023 and beyond? You can't gather texts because libraries don't want their books just ripped. You can't gather user posts because users aren't visiting social networks to ask questions anymore, and on top of that the APIs of these social networks have been limited and gated.
AI degrades because it's stuck regurgitating old content over and over, and the accuracy of its answers goes down for questions about newer content because its sources have shrunk and been limited to trash.
Users start going back to the old forums and social networks for answers, and AI becomes the enemy of the free internet. Now there's security everywhere to protect your data from data scrapers.
Or there's a boom of micro AIs from these smaller services. Imagine a shitty reddit AI answering your questions.
Which way do you think it'll go?
Personally, I think this is yet another nail in the coffin of the old internet we grew up on. It's another step in the wrong direction, toward controlled, censored, and curated information.