Wikipedia Says AI Is Causing a Dangerous Decline in Human Visitors

Tony Bark@pawb.social · 7 days ago

SpaceCowboy@lemmy.ca · 6 days ago

If this AI stuff weren’t a bubble and the companies dumping billions into it were capable of any long-term planning, they’d call up Wikipedia and say “how much do you need? we’ll write you a cheque”.

They’re trying to figure out nefarious ways of getting data from people, while Wikipedia literally has people working, for a relatively small amount of money, to create high-quality data that’s very valuable to these AI companies.

But nah, they’ll just shove AI into everything and blow the equivalent of Wikipedia’s annual budget in a week on just electricity to shove unwanted AI slop into people’s faces.

Suffa@lemmy.wtf · 6 days ago

Because they already ate through every piece of content on Wikipedia years and years ago. They’re at the stage where they’ve trawled nearly the entire internet and are running out of content to find.

fishy@lemmy.today · 6 days ago

So now the AI trawls other AI slop, so it’s essentially getting inbred. So they literally need you to subscribe to their AI slop so they can get new data directly from you because we’re still nowhere near AGI.

nova_ad_vitum@lemmy.ca · 6 days ago

But nah, they’ll just shove AI into everything and blow the equivalent of Wikipedia’s annual budget in a week on just electricity to shove unwanted AI slop into people’s faces.

You’re off by several orders of magnitude, unfortunately. Tech giants are spending the equivalent of the entire fucking Apollo program on various AI investments every year at this point.

coffee_nutcase207@lemmy.world · 6 days ago

That is too bad. Wikipedia is important.

utopiah@lemmy.world · 7 days ago

(pasting a Mastodon post I wrote a few days ago about StackOverflow, but IMHO it applies to Wikipedia too)

"AI, as in the current LLM hype, is not just pointless but rather harmful epistemologically speaking.

It’s a big word so let me unpack the idea with 1 example :

StackOverflow, or SO for short.

So SO is cratering in popularity. Maybe it’s related to the LLM craze, maybe not, but in practice fewer and fewer people are using SO.

SO is basically a software developer social network that goes like this :

hey I have this problem, I tried this and it didn’t work, what can I do?
well (sometimes condescendingly) it works like this so that worked for me and here is why

then people discuss via comments, answers, votes, etc. until, hopefully, the most appropriate (which does not mean “correct”) answer rises to the top.

The next person with the same, or similar enough, problem gets to try right away what might work.

SO is very efficient in that sense but sometimes the tone itself can be negative, even toxic.

Sometimes the person asking did not bother to search much, sometimes they clearly have no grasp of the problem, so replies can be terse, if not worse.

Yet the content itself is often correct in the sense that it does solve the problem.

So SO in a way is the pinnacle of “technically right” yet being an ass about it.

Meanwhile, what if you could get roughly the same mapping between a problem and its solution but in a nice, even sycophantic, manner?

Of course the switch will happen.

That’s nice, right?.. right?!

It is. For a bit.

It’s actually REALLY nice.

Until the KPI of the “thing” you “discuss” with is maybe keeping you engaged (as its owner gets paid per interaction), regardless of how usable (let’s not even say true or correct) its answer is.

That’s a deep problem because that thing does not learn.

It has no learning capability. It’s not just “a bit slow” or “dumb” but rather it does not learn, at all.

It gets updated with a new dataset, fine-tuned, etc… but there is no action that leads to the invalidation of a hypothesis, the generation of a novel one, and then … the setup of a safe environment to test it within (that’s basically what learning is).

So… you sit there until the LLM gets updated but… with what? Now that fewer and fewer people bother updating your source (namely SO), how is your “thing” going to learn, sorry, to get updated, without new contributions?

Now if we step back not at the individual level but at the collective level we can see how short-termist the whole endeavor is.

Yes, it might help some, even a lot, of people to “vile code” sorry I mean “vibe code”, their way out of a problem, but if :

they, the individual
it, the model
we, society, do not contribute back to the dataset to upgrade from…

well I guess we are going faster right now, for some, but overall we will inexorably slow down.

So yes, epistemologically we are slowing down, if not worse.

Anyway, I’m back on SO, trying to actually understand a problem. Trying to actually learn from my “bad” situation and rather than randomly try the statistically most likely solution, genuinely understand WHY I got there in the first place.

I’ll share my answer back on SO, hoping to help others.

Don’t just “use” a tool, think, genuinely, it’s not just fun, it’s also liberating.

Literally.

Don’t give away your autonomy for a quick fix, you’ll get stuck."

originally on https://mastodon.pirateparty.be/@utopiah/115315866570543792

ThirdConsul@lemmy.ml · edit-2 6 days ago

I honestly think that LLMs will result in no progress ever being made in computer science.

Most past inventions and improvements were made out of necessity, because of how sucky computers are and how unpleasant it is to work with them (we call them “abstraction layers”). And it was mostly done on the company’s dime.

Now companies will prefer to produce slop (even more), because they will hope to automate slop production.

I3lackshirts94@lemmy.world · 6 days ago

As an expert in my engineering field, I would agree. LLMs have been a great tool for my job, helping me get better at technical writing or get over the hump of coding something every now and then. That’s where I see the future for ChatGPT/AI LLMs: providing a tool that can help people broaden their skills.

I see no future for it in providing the expertise and depth of understanding required to make progress in any field, unless it is specifically trained and guided. I do not trust it with anything highly advanced or technical, as I feel I end up teaching it.

amzd@lemmy.world · 6 days ago

Most importantly, the pipeline from finding a question on SO that you also have, to answering that question after doing some more research is now completely derailed because if you ask an AI a question and it doesn’t have a good answer you have no way to contribute your eventual solution to the problem.

NotMyOldRedditName@lemmy.world · edit-2 6 days ago

Maybe SO should run everyone’s answers through an LLM and revoke any points a person gets for a condescending answer, even if accepted.

Give a warning and suggestions to better meet community guidelines.

It can be very toxic there.

Edit: I love the downvotes here. OP - AI is going to destroy the sources of truth and knowledge, in part because people stopped going to those sources because people were toxic at the sources. People: But I’ll downvote suggestions that could maybe reduce toxicity, while having no actual impact on the answers given.

Chaotic Entropy@feddit.uk · edit-2 6 days ago

AI will inevitably kill all the sources of actual information. Then all we’re going to be left with is the fuzzy learned version of information plus a heap of hallucinations.

What a time to be alive.

Gary Ghost@lemmy.world · 7 days ago

AI just cuts and pastes from websites like Wikipedia. The problem is when it gets information that’s old or from a sketchy source. Hopefully people will still know how to check sources; that should probably be taught in schools. Who’s the author, how old is the article, is it a reputable website, is there a bias? I know I’m missing some pieces.

Encrypt-Keeper@lemmy.world · 6 days ago

You replied to OP while somehow missing the entire point of what he said lol

Gary Ghost@lemmy.world · 6 days ago

To be fair, I didn’t read the article

mriormro@lemmy.zip · 6 days ago

That’s not ‘being fair’ that’s just you admitting you’d rather hear your own blathering voice than do any real work.

Gary Ghost@lemmy.world · 5 days ago

To be fair calling that work is stretching it

Chaotic Entropy@feddit.uk · 7 days ago

Much of the time, AI paraphrases, because it is generating plausible sentences not quoting factual material. Rarely do I see direct quotes that don’t involve some form of editorialising or restating of information, but perhaps I’m just not asking those sorts of questions much.

veni_vedi_veni@lemmy.world · edit-2 7 days ago

Man, we hardly did that shit 20 years ago. Ain’t no way the kids doing that now.

At best they’ll probably prompt AI into validating if the text is legit

RedWheelbarrow@lemmy.world · 7 days ago

I guess I’m a bit old school, I still love Wikipedia.

AceOnTrack@lemmy.blahaj.zone · 7 days ago

I use Wikipedia when I want to know stuff. I use ChatGPT when I need quick information about something that’s not necessarily super critical.

It’s also much better at looking up stuff than Google. Which is amazing, because it’s pretty bad. Google has become absolute garbage.

kadu@scribe.disroot.org · 7 days ago

It’s also much better at looking up stuff than Google.

Or maybe it’s just as bad but extremely confident, so you accept the wrong results. ChatGPT is just looking at Reddit and Google search results through an additional layer of language processing; it can’t possibly be better than either. Every day AI bros tell us “no seriously now they fixed search!” and I do the exact same benchmark of 10 easy questions that you can find an answer to within the first five results of a traditional search, and they fail on 6 out of 10.

AceOnTrack@lemmy.blahaj.zone · 7 days ago

To get a decent result on Google, you have to wade through 2 pages of ads, 4 pages of sponsored content, and maybe the first good result is on page 10.

ChatGPT does a good job at filtering most of the bullshit.

I know enough to not just accept any shit from the internet at face value.

kadu@scribe.disroot.org · 7 days ago

To get a decent result on Google, you have to wade through 2 pages of ads, 4 pages of sponsored content, and maybe the first good result is on page 10.

Block ads and use a different search engine?

ChatGPT does a good job at filtering most of the bullshit.

You repeated that twice, but it’s demonstrably false. It does not. It feeds you completely wrong information randomly.

I know enough to not just accept any shit from the internet at face value.

If you’re going to fact check ChatGPT anyway, you’re wasting more time than just doing the research yourself with good tools. But this is a false equivalency, because by doing the research yourself you start to learn good sources and exercise information synthesis, by using ChatGPT and fact checking it you’re helping Sam Altman get richer.

AceOnTrack@lemmy.blahaj.zone · edit-2 7 days ago

Why the fuck are you defending google so hard lmao.

Google will absolutely put bad information front and center too.

And by using Google you make Google richer. In fact you get served far more ads using Google products than ChatGPT.

What’s your fucking point lmao.

kadu@scribe.disroot.org · edit-2 7 days ago

Why the fuck are you defending google so hard lmao.

Ah yes, when I said “use a different search engine” as a solution to Google having issues I’m certainly defending Google! What an endorsement right? “Use a completely different service” is free publicity for Google!

AceOnTrack@lemmy.blahaj.zone · 7 days ago

Other search engines are even worse than Google lmao. Brave consistently provides literally the worst results. DuckDuckGo, same.

Are you actually serious.

tb_@lemmy.world · 7 days ago

I think you missed a part of their comment:

Block ads and use a different search engine?

Both Ecosia and DuckDuckGo have served me pretty well. Kagi also seems somewhat interesting.
Ecosia is working with Qwant on their own index, the first version of which has already gone online I believe. So they’re no longer exclusively relying on Bing/Google for their back-end.

AceOnTrack@lemmy.blahaj.zone · edit-2 6 days ago

I have yet to use an alternate search engine for any length of time (and i’ve tried a few) and think “ah yes, this was the kind of results I expected from my search”, they’re systematically worse than google, which is an incredible achievement, considering how absolute garbage google is nowadays.

Brave, which i’m using now, is atrocious with that. The amount of irrelevant bullshit it throws at you before getting to the stuff you are actually looking for is actually incredible.

DMCMNFIBFFF@lemmy.world · 6 days ago

Yep, that and occasionally Wiktionary, Wikidata, and even RationalWiki.

RedWheelbarrow@lemmy.world · 7 days ago

You’re right bro but I feel comfortable searching the old fashioned way!

Cybersteel@lemmy.world · 7 days ago

Same but with Encyclopedia Britannica

kazerniel@lemmy.world · 7 days ago

“With fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work.”

I understand the donors aspect, but I don’t think anyone who is satisfied with AI slop would bother to improve wiki articles anyway.

drspawndisaster@sh.itjust.works · 7 days ago

The idea that there’s a certain type of person that’s immune to a social tide is not very sound, in my opinion. If more people use genAI, they may teach people who could have been editors in later years to use genAI instead.

kazerniel@lemmy.world · 7 days ago

That’s a good point, scary to think that there are people growing up now for whom LLMs are the default way of accessing knowledge.

Hackworth@piefed.ca · 7 days ago

Eh, people said the exact same thing about Wikipedia in the early 2000s. A group of randos on the internet is going to “crowd source” truth? Absurd! And the answer to that was always, “You can check the source to make sure it says what they say it says.” If you’re still checking Wikipedia sources, then you’re going to check the sources AI provides as well. All that changes about the process is how you get the list of primary sources. I don’t mind AI as a method of finding sources.

The greater issue is that people rarely check primary sources. And even when they do, the general level of education needed to read and understand those sources is a somewhat high bar. And the even greater issue is that AI-generated half-truths are currently mucking up primary sources. Add to that intentional falsehoods from governments and corporations, and it already seems significantly more difficult to get to the real data on anything post-2020.

llama@lemmy.zip · 7 days ago

But Wikipedia actually is crowd sourced data verification. Every AI prompt response is made up on the fly and there’s no way to audit what other people are seeing for accuracy.

Hackworth@piefed.ca · 7 days ago

Hey! An excuse to quote my namesake.

Hackworth got all the news that was appropriate to his situation in life, plus a few optional services: the latest from his favorite cartoonists and columnists around the world; the clippings on various peculiar crackpot subjects forwarded to him by his father […] A gentleman of higher rank and more far-reaching responsibilities would probably get different information written in a different way, and the top stratum of New Chusan actually got the Times on paper, printed out by a big antique press […] Now nanotechnology had made nearly anything possible, and so the cultural role in deciding what should be done with it had become far more important than imagining what could be done with it. One of the insights of the Victorian Revival was that it was not necessarily a good thing for everyone to read a completely different newspaper in the morning; so the higher one rose in society, the more similar one’s Times became to one’s peers’. - The Diamond Age by Neal Stephenson (1995)

That is to say, I agree that everyone getting different answers is an issue, and it’s been a growing problem for decades. AI’s turbo-charged it, for sure. If I want, I can just have it yes-man me all day long.

r0ertel@lemmy.world · 6 days ago

This will be unpopular, but hear me out. Maybe the decline in visitors is only a decline in the folks who are simply looking for a specific word or name that they forgot. Like, that one guy who believed in the survival of the fittest. Um. Let me try to remember. I think he had an epic beard. Ah! Darwin! I just needed a reminder; I didn’t want to read the entire article on him because I did that years ago.

Look at your own behaviors on lemmy. How often do you click/tap through to the complete article? What if it’s just a headline? What if it’s the whole article pasted into the body of the post? Click bait headlines are almost universally hated, but it’s a desperate attempt to drive traffic to the site. Sometimes all you need is the article synopsis. Soccer team A beats team B in overtime. Great, that’s all I need to know…unless I have a fantasy team.

Kissaki@feddit.org · edit-2 5 days ago

If you don’t check their name - Darwin - on Wikipedia, where do you check it? A random AI? When you’re on Facebook, their AI? When you’re on Reddit, their AI? How trustworthy are they? What does that mean for general user behavior in the short and long term?

When you’re satisfied with a soccer match score from a headline, fair enough. Which headline do you refer to, though? Who provides it? Who ensures it is correct?

Wikipedia is an established and good source for many things.

The point is that people get their information elsewhere now. Where it may be incomplete, wrong, or maliciously misrepresenting or lying. Where discovering more related information is even further away: instead of it being a paragraph, a scroll, or an index nav jump away, there is no hyperlink and no further information.

Personally, I regularly explore and verify sources.

I doubt most of those visits to Wikipedia were as shallow as finding just one name or term. Maybe one piece of information. Which may already go deeper than shallow term finding, and cross references and notes may spark interests or relevant concerns.

r0ertel@lemmy.world · 5 days ago

You have a lot of good points and I may have missed the intent of the article, but a knee-jerk reaction of “lower traffic = AI is bad” is not helpful either. My point is that I frequently find myself hitting a page just to check a reference or quote, or to remember something. AI search results can be useful here. It’s no different than how DuckDuckGo has a sidebar if the results are from StackOverflow. It’s nice to get quick answers. I would like to see a fair solution that lets the content creators stay in business.

Petter1@discuss.tchncs.de · 5 days ago

I think that you did not understand OC correctly…

What OC is saying is that, for the person searching for the lost word, recognition is verification enough. As soon as they see the word, their memory is triggered, because they already knew the information.

i_stole_ur_taco@lemmy.ca · 5 days ago

Half my visits to Wikipedia are because I need to copy and paste a Unicode character and that’s always the highest search result with a page I can easily copy and paste the exact character from.

Scrollone@feddit.it · 5 days ago

Em dash? Wikipedia.

Nice-looking quotes? Wikipedia.

Accented uppercase letters? Wikipedia.

(Yeah, I know. The last one can only be understood by Italian speakers; or speakers of other languages with stupid keyboard layouts)

Treczoks@lemmy.world · 7 days ago

Not me. I value Wikipedia content over AI slop.

Mrkawfee@feddit.uk · edit-2 7 days ago

I asked a chatbot for scenarios of AI wiping out humanity, and the most believable one is where it makes humans so dependent on it and so infantilized that we just eventually stop reproducing and die out.

Geodad@lemmy.world · 7 days ago

So we get the WALL-E future…

DMCMNFIBFFF@lemmy.world · 6 days ago

wp:I, Mudd

Mudd explains that he broke out of prison, stole a spaceship, crashed on this planet, and was taken in by the androids. He says they are accommodating, but refuse to let him go unless he provides them with other humans to serve and study. Mudd informs Kirk that he and his crew are to serve this purpose and can expect to spend the rest of their lives there.

godrik@lemmy.world · 7 days ago

Tbh, I’d say that’s not a bad scenario all in all, and much more preferable to scenarios with world war, epidemics, starvation etc.

Tollana1234567@lemmy.today · 7 days ago

Because people are just reading AI-summarized explanations of their searches. Many of them are derived from blogs and can’t be verified against an official source.

BeMoreCareful@lemmy.world · 7 days ago

Or the AI search just rips off Wikipedia.

katy ✨@piefed.blahaj.zone · 7 days ago

all websites should block ai and bot traffic on principle.

maniacalmanicmania@aussie.zone · 7 days ago

The problem is many no longer identify as bots and come from hundreds if not thousands of IPs.

WhiskyTangoFoxtrot@lemmy.world · 7 days ago

Voight-Kampff them.

kent_eh@lemmy.ca · edit-2 6 days ago

all websites should block ai and bot traffic on principle.

Increasing numbers do.

But there is no proof that the LLM trawling bots are willing to respect those blocks.
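For context on why such blocks depend on goodwill: the usual mechanism is a robots.txt file, which is advisory and only works if the crawler chooses to consult it and reports its identity honestly. Below is a minimal sketch of that check using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleAIBot` user agent, the example.org URL, and the `is_allowed` helper are all hypothetical, made up for illustration; this is not any site's real policy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: refuse one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a *well-behaved* crawler performs on itself before
    requesting a page; the server does not enforce it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/wiki/Charles_Darwin"
    # A crawler that identifies itself honestly is told to stay out.
    print(is_allowed("ExampleAIBot", page))  # False
    # The same crawler claiming to be a browser passes the check.
    print(is_allowed("Mozilla/5.0", page))   # True
```

The second call shows the weak point raised earlier in the thread: a bot that no longer identifies as a bot is indistinguishable, at this layer, from an ordinary visitor.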

DMCMNFIBFFF@lemmy.world · 6 days ago

FWIW:

Wikipedia:Bot policy#Bot requirements

https://en.wikipedia.org/wiki/Wikipedia:Bot_policy#Bot_requirements

RationalWiki:Bots

https://rationalwiki.org/wiki/RationalWiki:Bots

llama@lemmy.zip · 7 days ago

Yet I still have to go to the page for the episode lists of my favorite TV shows because every time I ask AI which ones to watch it starts making up episodes that either don’t exist or it gives me the wrong number.

Kissaki@feddit.org · 5 days ago

Sounds like it wants you to ask about it and then wants to write fan fiction for you.

Scrollone@feddit.it · 5 days ago

Let’s all repeat: LLMs don’t know any facts. They’re just a thesaurus on steroids.

anticurrent@sh.itjust.works · edit-2 7 days ago

I am kind of a big hater of AI and the danger it represents to the future of humanity.

But, as a hobby programmer, I was surprised at how well these LLMs can answer very technical questions and provide conceptual insight and suggestions about how to glue different pieces of software together, and what the limitations of each one are. I know that if AI knows about this stuff, it must have been produced by a human. But consider the shitty state of the internet, where copycat websites compete to outrank each other with garbage blocks of text that never answer what you are looking for, while the honest blog post is buried on page 99 of Google search. I can’t see how old-school search will win out.

Add to that, I have found forums and platforms like StackOverflow to be not always very helpful. I have many unanswered questions on StackOverflow that have piled up over many years, things that LLMs can answer in detail in just seconds without ever being annoyed at me or making passive-aggressive comments.

godrik@lemmy.world · 7 days ago

Hobby programmer here as well. I know I’ve spent a lot of time searching for solutions or hints, especially when it’s about edge cases. So using AI as an alternative to a search engine has saved me sooo much time!

Another thing about the approach: I read somewhere that it requires about 10 times as much energy to ask an AI instead of doing a web search and spending a little time looking through the results. So it’s something I try to keep in mind to motivate myself to do as many regular web searches as possible, saving AI queries for when it matters more.

TangledHyphae@lemmy.world · 6 days ago

I would say it’s more like 1000 times more energy. Trillions of matrix math computations for a handful of tokens at max speed and CPU/GPU usage, compared to a 10 millisecond database query (or in wiki’s case, probably mostly just easy direct edge node cache with no processing involved.)

godrik@lemmy.world · edit-2 6 days ago

Alright, yeah, fair enough, even better motivation to prioritize search engines!

kent_eh@lemmy.ca · edit-2 6 days ago

I know that if AI knows about this stuff it must have been produced by a human.

For now. Maybe.

It won’t be long before these LLMs will start ingesting the output from other LLMs, biases, confidently wrong answers, hallucinations and all.

GeneralEmergency@lemmy.world · 5 days ago

Surely it can’t be because of the decline in quality caused by despotic admins defending their own personal fiefdoms.

MystValkyrie@lemmy.blahaj.zone · edit-2 7 days ago

I’m part of the problem. I now use Le Chat instead of search engines because AI destroyed search engines, thanks to all the content mills that make slop. I wish search engines just worked, and it’s a classic example of capitalism creating problems to justify new technology.

And I wonder if it’s just AI. I know some people moved to backing up pre-2025 versions of Wikipedia via Kiwix out of fear that the site gets censored. I know now that I’ve done that, it’s a no-brainer to just do my Wikipedia research without using bandwidth.

tb_@lemmy.world · 7 days ago

Search engines will still give Wikipedia results at the top for relevant searches. Heck, you can search Wikipedia itself directly!

Both Ecosia and DuckDuckGo support some form of “bangs”: if I tack !w onto my search, it’ll immediately go through to Wikipedia.
DuckDuckGo has even introduced an AI image filter, which is not perfect but still pretty good.

MystValkyrie@lemmy.blahaj.zone · 7 days ago

Bangs are helpful, but my problem is that I previously used search engines to find informative articles and product suggestions beyond the scope of Wikipedia, and so much of that is AI slop now. And if it’s not that, Reddit shows up disproportionately in search results and Google is dominated by promoted posts.

Search engines used to be really good at connecting people to reliable resources, even if you didn’t have a specific website in mind, if you were good with keywords/boolean and had a discerning eye for reliable content, but now the slop-to-valuable-content ratio is too disproportionate. So you either need to have pre-memorized a list of good websites, rely on Chatbots, or take significantly longer wading through the muck.

tb_@lemmy.world · 6 days ago

This blacklist is a pretty neat way to block a good amount of those AI slop results.