I don’t know if this is an acceptable format for a submission here, but here it goes anyway:
Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries
We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.
Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.
In our previous research (Content Simplification), we have identified two needs:
- The need for readers to quickly get an overview of a given article or page
- The need for this overview to be written in language the reader can understand
Etc., you should check the full text yourself. There’s a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc
This hasn’t been met with warm reactions: the comments on the respective talk page have questioned the purposefulness of the tool (shouldn’t the introductory paragraphs do the same job already?), and other complaints have been raised as well:
Taking a quote from the page for the usability study:
“Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level.”
Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they ‘use AI for everything’. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don’t think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)
The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough, it pretty much takes for granted that the summaries will be added: there’s no judgment of their actual quality, only questions about how they should be presented. I filled it out and couldn’t even find space to say that, e.g., the summary they show is written almost insultingly, as if it’s meant for very dumb children, and I couldn’t even tell whether it is accurate because they just scroll around in the video.
Very extensive discussion is going on at the Village Pump (en.wiki).
The comments are also overwhelmingly negative, some of them pointing out that the summary doesn’t summarise the article properly (“Perhaps the AI is hallucinating, or perhaps it’s drawing from other sources like any widespread llm. What it definitely doesn’t seem to be doing is taking existing article text and simplifying it.” - user CMD). A few comments acknowledge potential benefits of the summaries, though with a significantly different approach to using them:
I’m glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it “summarises”. Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)
One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)
Finally, some comments are problematising the whole situation with WMF working behind the actual wikis’ backs:
This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed “early and often” of new developments. We shouldn’t be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others’) statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.
Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that’s an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)
Again, I recommend reading the whole discussion yourself.
EDIT: WMF has announced they’re putting this on hold after the negative reaction from the editors’ community. (“we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together”)
The big issue I see here isn’t the proposed solution, it’s the public image of doing something the tech bro billionaires are pushing hard right now.
It looks a bit like choosing the other side of the class war from their contributors.
Wikipedia, in particular, may not be able to afford that negative image right now.
I could welcome this kind of tool later, but their timing sucks.
Thanks, I hate it.
Wikipedia articles already have lead sections that serve as summaries.
Fuck right off with this
A future experiment will study ways of editing and adjusting this content.
There is also already https://simple.wikipedia.org/wiki/Main_Page
A lot of them, for the small articles and stubs, are written very technically and don’t provide an explanation for complex subjects if you aren’t already familiar with them. Then you have to read 4 articles deep just to figure out the jargon for what they’re saying.
I’d agree with that, both are problematic.
A lot of stubs should be deleted until they are expanded, they’re often more confusing than knowing nothing at all. I don’t think an LLM summary will help here though.
Reading a few articles deep is not only a pain in the ass, but is going to dissuade those who won’t do it. There’s also the issue that when you do wade in it might link to something that is poorly cited and confusing. Again, I think an LLM is going to make things worse here.
A lot of stubs should be deleted until they are expanded
How does one expand a deleted article?
Wikipedia is not intended to be presenting a finished product, it’s an eternal work in progress. A stub is the start of an article. If you delete an article whenever it gets started that seems counterproductive.
Maybe it’s a result of Wikipedia trying to be more of an “online encyclopedia” vs a digital information hub or learning resource. I don’t think it’s a problem on its own but I do think there should be a simplified version of every article.
Math articles are the worst. They always jump right into calculus and stuff. I usually have to hope there’s a simple English article for those!
This is one thing I can see an actual use case for (as an external tool, not as part of WP itself): Create a summary, not of the article itself, but of the prerequisite background knowledge. And customized to the reader’s existing knowledge—like, what do I need to know to understand this article assuming I already know X but not Y or Z.
I agree, having experienced this especially on mathematics pages. But on the other hand, from my experience, the whole article is very technical in those cases: I’m not sure making a summary would help, and I’m not sure you can provide a summary that is both correct and easily understandable in those cases.
ok, just so long as the articles themselves aren’t AI generated.
This is not the medicine for curing what ails Wikipedia, but when all anyone is selling is a hammer…
“Most readers in the US can comfortably read at a grade 5 level,[CN]”
so where is the citation? did they just pull a number from their butt? hmm…
srsly, this is some bs.
It’s actually true. 56% of Americans are “partially illiterate”, which explains a lot about the state of affairs in that country.
In 2023, 28% of adults scored at or below Level 1, 29% at Level 2, and 44% at Level 3 or above. Anything below Level 3 is considered “partially illiterate”
I’m genuinely confused how that’s even possible in a developed country such as the US. Do people not read at all? As in an article or gossip magazine - any of those would get you there.
Is it just countryside folk drinking beer and watching Fox News? It can’t be 50% of all people. How.
basically the 2nd sentence is a product of defunding education in red states, and underfunding it everywhere else. another issue is “participation grades” for nearly-failing and failing classes.
frankly, I’m not quite surprised ._.
edit: upon reading the article, I now wonder if it’s possible for your literacy to go down. I used to be such a bookworm in grade school, but now I have to reread stuff over and over in order to comprehend what’s going on.

You might just be chronically tired or worn down from the stresses of life. It’s pretty common.
Another thing is as we get older a lot of people will choose more “challenging” adult books and then just be totally bored lol. I read young adult and kids books sometimes (how can I give a book to a child if I haven’t read it myself?) and it’s always surprising to me how they can be ripped through in no time at all.
But in general I think you’re probably right that literacy can decrease with disuse. It seems like most things about the mind and body trend that way
But in general I think you’re probably right that literacy can decrease with disuse
Maths is a really good example of this.
At one point I really enjoyed doing long division in my head but as time goes on (and you don’t exercise that sponge…), it becomes lazy.
The mind is a muscle. Don’t ignore it. Especially now, if you use your mind you’ll be light-years ahead of ai addicts.
AI threads on lemmy are always such a disappointment.
It’s ironic that people put so little thought into understanding this and complain about “AI slop”. The slop was in your heads all along.
To think that more accessibility for a project that is all about sharing information with people to whom information is least accessible is a bad thing is just an incredible lack of awareness.
It’s literally the opposite of everything people might hate AI for:
- RAG is very good and accurate these days that doesn’t invent stuff. Especially for short content like wiki articles. I work with RAG almost every day and never seen it hallucinate with big models.
- it’s open and not run by a “big scary tech” company
- it’s free for all and would save millions of editor hours and allow more accuracy in the articles themselves.
And to top it all you know this is a lost fight even if you’re right so instead of contributing to steering this societal ship these people cover their ears and “bla bla bla we don’t want it”. It’s so disappointingly irresponsible.
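For readers who haven’t worked with the pattern the comment above is describing: “RAG” (retrieval-augmented generation) means retrieving passages from the source document and constraining generation to them. A toy sketch of that grounding property, with a naive keyword-overlap retriever and plain sentence extraction standing in for a real LLM (both are illustrative stand-ins, not how WMF’s system works):

```python
# Toy RAG-style grounding: the "summary" can only contain text that was
# actually retrieved from the source article, which is the property the
# comment claims prevents invented content.

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(passages,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def grounded_summary(query: str, passages: list[str]) -> str:
    """'Generate' only from retrieved passages; a real system would hand
    these passages to an LLM as context instead of concatenating them."""
    return " ".join(retrieve(query, passages))

article = [
    "Bernoulli numbers are a sequence of rational numbers.",
    "They appear in series expansions of trigonometric functions.",
    "The weather today is sunny.",
]
print(grounded_summary("Bernoulli numbers sequence", article))
```

The catch, as later comments note, is that real LLMs can still paraphrase the retrieved context inaccurately; retrieval narrows the input, it doesn’t verify the output.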
I don’t trust even the best modern commercial models to do this right, but with human oversight it could be valuable.
You’re right about it being a lost fight, in some ways at least. There are lawsuits in flight that could undermine it. How far that will go remains to be seen. Pissing and moaning about it won’t accelerate the progress of those lawsuits, and is mainly an empty recreational activity.
RAG is very good and accurate these days that doesn’t invent stuff.
In the OP I linked a comment showing how the summary presented in the showcase video is not actually very accurate and it definitely does invent some elements that are not present in the article that is being summarised.
And in general the “accessibility” that primarily seems to work by expressing things in imprecise, unscientific or emotionally charged terms could well be more harmful than less immediately accessible but accurate and unambiguous content. You appeal to Wikipedia being “a project that is all about sharing information with people to whom information is least accessible”, but I don’t think this ever was that much of a goal - otherwise the editors would have always worked harder on keeping the articles easily accessible and comprehensible to laymen (in fact I’d say traditional encyclopedias are typically superior to Wikipedia in this regard).
and would save millions of editor hours and allow more accuracy and complexity in the articles themselves.
Sorry but you’re making things up here, not even the developers of the summaries are promising such massive consequences. The summaries weren’t meant to replace any of the usual editing work, they weren’t meant to replace the normal introductory paragraphs or anything else. How would they save these supposed “millions of editor hours” then? In fact, they themselves would have to be managed by the editors as well, so all I see is a bit of additional work.
I don’t think the idea itself is awful, but everyone is so fed up with AI bullshit that any attempt to integrate even an iota of it will be received very poorly, so I’m not sure it’s worth it.
The point is they should be fighting AI, not open the door even an inch to AI on their site. Like so many other endeavors, it only works because the contributors are human. Not corpos, not AI, not marketing. AI kills Wikipedia if they let that slip. Look at StackOverflow, look at Reddit, look at Google search, look at many corporate social media. Dead internet theory is all around us.
Wikipedia is trusted because it’s all human. No other reason
Finally, a good use case for AI
Looks like the vast majority of people disagree D: I do agree that WP should consider ways to make certain articles more approachable to laymen, but this doesn’t seem to be the right approach.
I am pretty rabidly anti-AI in most cases, but the use case for AI that I don’t think is a big negative is the distillation of information for simplification purposes. I am still somewhat against this in the sense that at the end of the day their summarization AI could hallucinate, and since they’ve admitted this is a solution to a problem of scale, then it’s not sensible to assume humans will be able to babysit it.
However… there is some inherent value to the idea that people will end up using AI to summarize Wikipedia using models of dubious quality with an unknown quantity of intentionally pre-trained bias, and therefore there is some inherent value to training your own model to present the information on your site in a way that is the “most free” of slop and bias.
Doesn’t it already have simplified versions of most articles at simple.wikipedia.org ?
This is already addressed in the first quote in my post. And no, I’m sure that not even close to most articles have a simple.wikipedia equivalent, or that it actually is adequately simple (e.g. one topic I was interested in recently that Wikipedia didn’t really help me with: “The Bernoulli numbers are a sequence of signed rational numbers that can be defined with exponential generating functions. These numbers appear in the series expansion of some trigonometric functions.” - that’s one whole “simplified” article, and I have no idea what it’s saying and it has no additional info or examples).
The vast majority of people in this particular bubble disagree.
I’ve found that AI is one of those topics that’s extremely polarizing, communities drive out dissenters and so end up with little awareness of what the general attitude in the rest of the world is.
The problem is that the bubble here are the editors who actually create the site and keep it running, and their “opposition” is the bubble of WMF staff.
The problem is that the bubble here are the editors who actually create the site and keep it running
No it isn’t, it’s the technology@lemmy.world Fediverse community.
Have you read my OP or did you just use an AI-generated summary? I copy-pasted several comments from Wikipedia editors and linked a page with dozens, if not a hundred other comments by them, and they’re overwhelmingly negative.
I’m not talking about them at all. I’m talking about the technology@lemmy.world Fediverse community. It’s an anti-AI bubble. Just look at the vote ratios on the comments here. The guy you responded to initially said “Finally, a good use case for AI” and he got close to four downvotes per upvote. That’s what I’m talking about.
The target of these AI summaries are not Wikipedia editors, it’s Wikipedia readers. I see no reason to expect that target group to be particularly anti-AI. If Wikipedia editors don’t like it there’ll likely be an option to disable it.
I’m not talking about them at all.
But it’s quite obvious that they were what I was talking about, and you were responding to me. Instead of responding to my actual comment, you deceptively shifted the topic in order to trivialise the situation.
The target of these AI summaries are not Wikipedia editors
Except that the editors will very likely have to work to manage those summaries (rate, correct or delete them), so they definitely will be affected by them. And in general it’s completely unacceptable to suggest that the people who have created 99% of the content on Wikipedia should have less of a say on how the website functions than a handful of bureaucrats who ran a survey.
If Wikipedia editors don’t like it there’ll likely be an option to disable it.
Disabling would necessarily mean disabling it wiki-wide, not just for individual editors, in which case the opinions of the editors’ “bubble” will be quite relevant.
How much do you want to bet on the overlap being small?
A bigger question is how much the Wikimedia Foundation wants to bet that their top donors and contributors aren’t in this thread…
Edit: Moving my unrelated ramblings to a separate comment.
You mean the bubble of people who don’t want a factually incorrect, environmentally damaging shortcut to provide a summary that’s largely already being done by someone? You’re right.
What an unbiased view. Got any citations?
The survey results? Did you read the post?
Miguel’s claims are:
- The summaries are factually inaccurate
- Generating the summaries are environmentally damaging.
- Summarization is “largely already being done by someone”
There’s an anecdote in a talk page about one summary being inaccurate. A talk page anecdote is not a usable citation.
Survey results aren’t measuring environmental impact.
And the whole point of AI is to take the load off of someone having to do things manually. Assuming they actually are - even in this thread there are plenty of complaints about articles on Wikipedia that lack basic summaries and jump straight into detailed technical content.
“environmentally damaging”
I see a lot of users on here saying this when talking about any use case for AI without actually doing any sort of comparison. In some cases, AI absolutely uses more energy than an alternative, but you really need to break it down, and it’s not a simple thing to apply to every case.
For instance: using an AI visual detection model hooked up to a camera to detect when rain droplets are hitting the windshield of a car. A completely wasteful example. In comparison you could just use a small laser that pulses every now and then and measures the diffraction to tell when water is on the windshield. The laser uses far less electricity and has been working just fine as they are currently used today.
Compare that to enabling DLSS in a video game where NVIDIA uses multiple AI models to improve performance. As long as you cap the framerates, the additional frame generation, upscaling, etc. will actually conserve electricity as your hardware is no longer working as hard to process and render the graphics (especially if you’re playing on a 4k monitor).
Looking at Wikipedia’s use case, how long would it take for users to go through and create a summary or a “simple.wikipedia” page for every article? How much electricity would that use? Compare that to running everything through an LLM once and quickly generating a summary (which is a use case where LLMs actually excel at). It’s honestly not that simple either because we would also have to consider how often these summaries are being regenerated. Is it every time someone makes a minor edit to a page? Is it every few days/weeks after multiple edits have been made? Etc.
Then you also have to consider, even if a particular use case uses more electricity, does it actually save time? And is the time saved worth the extra cost in electricity? And how was that electricity generated anyway? Was it generated using solar, coal, gas, wind, nuclear, hydro, or geothermal means?
Edit: typo
If they add AI they better not ask me for any money ever again.
Holy shit kbin is still around??
Kbin.earth is on mbin, I think kbin is dead.
I am so sad. I really liked what kbin was trying to do.
Mbin is a fork and continuation of /kbin, but community-focused.
Kbin was destined to fail without opening up to community collaboration. I greatly preferred it over lemmy. So I will stick with Mbin now and Kbin.earth has been a small but nice Mbin instance.
Or moderators. Why would they need those people when the AI can fix everything for free and even improve articles?
Right! I can’t wait to hear about all the new historical events!
I wonder if anyone witnessed the burning of the Library of Alexandria and felt a similar sense of despair for the future of knowledge.
You can download a copy of Wikipedia in full today before they turn it to shit.
Unlike the people in Alexandria, you can spend less than $20 and 20 minutes to download the whole thing and preserve it yourself.
You are a light in the darkness.
Who exactly asked for this? Wikipedia isn’t publicly traded, they aren’t a for-profit company, why are they trying to shove AI into people’s faces?
For those few who wanted it, there are dozens of bots who can summarize the (already kinda small) Wikipedia articles

Time to switch to something else? Nutomic developed Ibis wiki for example: https://ibis.wiki/
You realize this is just a proposal at this stage? Their proposed next step is an experiment:
If we introduce a pre-generated summary feature as an opt-in feature on the mobile site of a production wiki, we will be able to measure a clickthrough rate greater than 4%, ensure no negative effects to session length, pageviews, or internal referrals, and use this data to decide how and if we will further scale the summary feature.
Note, an opt-in clickthrough that they intend to monitor for further information on how to implement features like this and whether they should implement them at all. As befits Wikipedia, they’re planning to base these decisions on evidence.
If “they’re gathering evidence and making proposals” is the threshold for you to jump ship to some other encyclopedia, I guess you do you. It’s not going to be much of an exodus though since nobody who actually uses Wikipedia has seen anything change.
Mb. I still don’t see anything good coming out of implementing anything to do with AI though.
the summary (not necessarily AI-generated) I read elsewhere is what got me to Wikipedia in the first place.
My immediate thought is that the purpose of an encyclopaedia is to have a more-or-less comprehensive overview of some topic of interest. The reader should be able to look through the page index to find the section they care about and read that section.
Its purpose is not to rapidly teach anyone anything in full.
It seems like a poor fit as an application for LLMs
fucking disgusting. no place should have ai but especially not an encyclopedia.