Lots of people on Lemmy really dislike AI’s current implementations and use cases.
I’m trying to understand what people would want to be happening right now.
Destroy gen AI? Implement laws? Hoping all companies use it for altruistic purposes to help all of mankind?
Thanks for the discourse. Please keep it civil, but happy to be your punching bag.
- I’m not against AI itself—it’s the hype and misinformation that frustrate me. LLMs aren’t true AI - or not AGI as the meaning of AI has drifted - but they’ve been branded that way to fuel tech and stock market bubbles. While LLMs can be useful, they’re still early-stage software, causing harm through misinformation and widespread copyright issues. They’re being misapplied to tasks like search, leading to poor results and damaging the reputation of AI. - Real AI lies in advanced neural networks, which are still a long way off. I wish tech companies would stop misleading the public, but the bubble will burst eventually—though not before doing considerable harm. 
- If we’re talking realm of pure fantasy: destroy it. - I want you to understand this is not my sentiment on AI as a whole. I understand why the idea is appealing, how it could be useful, and how in some ways it may seem inevitable. - But a lot of sci-fi doesn’t really address the run-up to AI; in fact a lot of it just kind of assumes there’ll be an awakening one day. What we have right now is an unholy, squawking abomination that has been marketed to nefarious ends and never should have been trusted as far as it has. Think real hard about how corporations are pushing the development and not academia. - Put it out of its misery.  - How do you “destroy it”? I mean, you can download an open source model to your computer right now in like five minutes. It’s not Skynet, you can’t just physically blow it up. - OP asked what people wanted to happen, and even later listed “destroy gen AI” as an option. I get it is not realistically feasible, but it’s certainly within the realm of options provided for the discussion. No need to police their pie in the sky dream. I’m sure they realize it’s not realistic. 
 
 
- I would likely have different thoughts on it if I (and others) were able to consent to our data being used to train it, or consent to even having it, rather than it just showing up in an unwanted update. 
- If we’re going pie in the sky I would want to see any models built on work they didn’t obtain permission for to be shut down. - Failing that, any models built on stolen work should be released to the public for free. - This is the best solution. Also, any use of AI should have to be stated and watermarked. If they used someone’s art, that artist has to be listed as a contributor and you have to get permission. Just like they do for every film, they have to give credit. This includes music, voice and visual art. I don’t care if they learned it from 10,000 people, list them. 
- If we’re going pie in the sky I would want to see any models built on work they didn’t obtain permission for to be shut down. - I’m going to ask the tough question: Why? - Search engines work because they can download and store everyone’s copyrighted works without permission. If you take away that ability, we’d all lose the ability to search the Internet. - Copyright law lets you download whatever TF you want. It isn’t until you distribute said copyrighted material that you violate copyright law. - Before generative AI, Google screwed around internally with all those copyrighted works in dozens of different ways. They never asked permission from any of those copyright holders. - Why is that OK but doing the same with generative AI is not? I mean, really think about it! I’m not being ridiculous here, this is a serious distinction. - If OpenAI did all the same downloading of copyrighted content as Google and screwed around with it internally to train AI then never released a service to the public would that be different? - If I’m an artist that makes paintings and someone pays me to copy someone else’s copyrighted work, that’s on me to make sure I don’t do that. It’s not really the problem of the person that hired me to do it unless they distribute the work. - However, if I use a copier to copy a book then start selling or giving away those copies that’s my problem: I would’ve violated copyright law. However, is it Xerox’s problem? Did they do anything wrong by making a device that can copy books? - If you believe that it’s not Xerox’s problem then you’re on the side of the AI companies. Because those companies that make LLMs available to the public aren’t actually distributing copyrighted works. They are, however, providing a tool that can do that (sort of). Just like a copier. - If you paid someone to study a million books and write a novel in the style of some other author you have not violated any law. 
The same is true if you hire an artist to copy another artist’s style. So why is it illegal if an AI does it? Why is it wrong? - My argument is that there’s absolutely nothing illegal about it. They’re clearly not distributing copyrighted works. Not intentionally, anyway. That’s on the user. If someone constructs a prompt with the intention of copying something as closely as possible… To me, that is no different than walking up to a copier with a book. You’re using a general-purpose tool specifically to do something that’s potentially illegal. - So the real question is this: Do we treat generative AI like a copier or do we treat it like an artist? - If you’re just angry that AI is taking people’s jobs say that! Don’t beat around the bush with nonsense arguments about using works without permission… Because that’s how search engines (and many other things) work. When it comes to using copyrighted works, not everything requires consent. - If you paid someone to study a million books and write a novel in the style of some other author you have not violated any law. The same is true if you hire an artist to copy another artist’s style. So why is it illegal if an AI does it? Why is it wrong? - I think this is intentionally missing the point. - LLMs don’t actually think, or produce original ideas. If the human artist produces a work that too closely resembles a copyrighted work, then they will be subject to those laws. LLMs are not capable of producing new works, by definition they are 100% derivative. But their methods in doing so intentionally obfuscate attribution and allow anyone to flood a space with works that require actual humans to identify the copyright violations. 
- Like the other comments say, LLMs (the thing you’re calling AI) don’t think. They aren’t intelligent. If I steal other people’s work and copy pieces of it and distribute it as if I made it, that’s wrong. That’s all LLMs are doing. They aren’t “being inspired” or anything like that. That requires thought. They are copying data and creating outputs based on weights that tell it how and where to put copied material. - I think the largest issue is people hearing the term “AI” and taking it at face value. There’s no intelligence, only an algorithm. It’s a convoluted algorithm, and it’s hard to tell what’s going on just by looking at it, but it is an algorithm. There are no thoughts, only weights that are trained on data to generate predictable outputs based on given inputs. If I write an algorithm that steals art and reorganizes it into unique pieces, that’s still stealing their art. - For a current example, the stuff going on with Marathon is pretty universally agreed upon to be bad and wrong. However, you’re arguing if it was an LLM that copied the artist’s work into their product it would be fine. That doesn’t seem reasonable, does it? - My argument is that the LLM is just a tool. It’s up to the person that used that tool to check for copyright infringement. Not the maker of the tool. - Big company LLMs were trained on hundreds of millions of books. They’re using an algorithm that’s built on that training. To say that their output is somehow a derivative of hundreds of millions of works is true! However, how do you decide the amount you have to pay each author for that output? Because they don’t have to pay for the input; only the distribution matters. - My argument is that is far too diluted to matter. Far too many books were used to train it. - If you train an AI with Stephen King’s works and nothing else then yeah: Maybe you have a copyright argument to make when you distribute the output of that LLM. 
But even then, probably not because it’s not going to be that identical. It’ll just be similar. You can’t copyright a style. - Having said that, with the right prompt it would be easy to use that Stephen King LLM to violate his copyright. The point I’m making is that until someone actually does use such a prompt no copyright violation has occurred. Even then, until it is distributed publicly it really isn’t anything of consequence. - My argument is that the LLM is just a tool. It’s up to the person that used that tool to check for copyright infringement. Not the maker of the tool. - Build an inkjet printer exclusively out of stolen parts from HP, Brother, and Epson and market it as being so good that experts can’t differentiate what it prints from legal currency (except sometimes it adds cartoonish moustaches). Start selling it in retail stores alongside them. It would barely be announced, much less stocked on the shelves, before C&D letters and/or arrest warrants arrived. 
- I run local models. The other day I was writing some code and needed to implement simplex noise, and LLMs are great for writing all the boilerplate stuff. I asked it to do it, and it did alright although I had to modify it to make it actually work because it hallucinated some stuff. I decided to look it up online, and it was practically an exact copy of this, down to identical comments and everything. - It is not too diluted to matter. You just don’t have the knowledge to recognize what it copies. 
 
 
- Search engines work because they can download and store everyone’s copyrighted works without permission. If you take away that ability, we’d all lose the ability to search the Internet. - No they don’t. They index the content of the page and score its relevance and reliability, and still provide the end user with the actual original information 
- However, if I use a copier to copy a book then start selling or giving away those copies that’s my problem: I would’ve violated copyright law. However, is it Xerox’s problem? Did they do anything wrong by making a device that can copy books? - This is false equivalence - LLMs do not wholesale reproduce an original work in its original form, they make it easy to mass produce a slightly altered form without any way to identify the original attribution. 
 
- Definitely need copyright laws. What if everything has to be watermarked in some way and it’s illegal to use AI generated content for commercial use unless permitted by creators? - The problem with trying to police the output is there isn’t a surefire way to detect the fact it’s generated. That’s why I prefer targeting the companies who created the problematic models. - But let’s say the model is released for free but people use it for commercial purposes. It seems the only solution is to mandate that all content a model is trained on and accesses has provided express permission or is original content. Nobody can release a model to the public which generates content based on “illegal” material. 
 
 
- Genuine curiosity. Not an attack. Did you download music illegally back in the day? Or torrent things? Do you feel the same about those copyrighted materials? 
 
- Part of what makes me so annoyed is that there’s no realistic scenario I can think of that would feel like a good outcome. - Emphasis on realistic, before anyone describes some insane turn of events. - I think somehow incentivizing companies to use solar power to power their data centers would be a step in the right direction - That would be a win. I think they’re currently angling for more nuclear energy. Because of course. 
 
- Some jobs are automated and prices go down. That’s realistic enough. To be fair there’s good and bad likely in that scenario. So tack on some level of UBI. Still realistic? That’d be pretty good. 
 
- Wishful thinking? Models trained on illegal data get confiscated, the companies dissolved, the CEOs and board members made liable for the damages. - Then a reframing of these BS devices from AI to what they actually do: brew up statistical probability amalgamations of their training data, and then use them accordingly. They aren’t worthless or useless, they are just being shoved into functions they cannot perform in the name of cost cutting. 
- AI models produced from copyrighted training data should need a license from the copyright holder to train using their data. This means most of the wild west land grab that is going on will not be legal. In general I’m not a huge fan of the current state of copyright at all, but that would put it on an even business footing with everything else. - I’ve got no idea how to fix the screeds of slop that is polluting search of all kinds now. These sorts of problems ( along the lines of email spam ) seem to be absurdly hard to fix outside of walled gardens. - See, I’m troubled by that one because it sounds good on paper, but in practice that means that Google and Meta, who can certainly build licenses into their EULAs trivially, would become the only government-sanctioned entities who can train AI. Established corpos were actively lobbying for similar measures early on. - And of course good luck getting China to give a crap, which in that scenario would be a better outcome, maybe. - Like you, I think copyright is broken past all functionality at this point. I would very much welcome an entire reconceptualization of it to support not just specific AI regulation but regulation of big data, fair use and user generated content. We need a completely different framework at this point. - See, I’m troubled by that one because it sounds good on paper, but in practice that means that Google and Meta, who can certainly build licenses into their EULAs trivially, would become the only government-sanctioned entities who can train AI. Established corpos were actively lobbying for similar measures early on. - As long as people are paying other people, these things will equalize eventually. Ultimately, it would be much more likely that the cost of AI production would become so severe that it would no longer be viable as a business (which, frankly, is fine. There will eventually be enough public domain content that AI will be at the quality it is today with public materials alone.) 
- You seem to have a lot more trust in the invisible hand of the market and the inability of corporations to change copyright regulations to their liking than I do. - I have seen no evidence that “as long as people are paying other people” the money goes anywhere but towards billionaires. And… well, the absolute dismantling of public domain has been a running gag for ages. - And again, the corpos would not need to pay anybody anyway. Google already has a perfectly legal license to train AI on all of Youtube, Meta on all of Instagram and Facebook. You are telling me it’ll all even out in 100 years when the Internet goes into the public domain. That doesn’t sound like it’ll work the way you’re saying it’ll work. 
- There will eventually be enough public domain content that AI will be at the quality it is today with public materials alone. - So, AI will always be ~95 years behind the times? - Except the AIs produced by Disney et al, of course. And those produced by Chinese companies with the CCP stamp of approval. They’ll be up to date. 
 
 
- They use DRM for music. Use it for AI too, but switch it around so that the person owns their own voice, art and data. 
 
- I’m not anti AI, but I wish the people who are would describe what they are upset about a bit more eloquently and decipherably. The environmental impact I completely agree with. Making every Google search run a half-cooked beta LLM isn’t the best use of the world’s resources. But every time someone gets on their soapbox in the comments it’s like they don’t even know the first thing about the math behind it. Like just figure out what you’re mad about before you start an argument. It comes across as childish to me - It feels like we’re being delivered the sort of stuff we’d consider flim-flam if a human did it, but lapping it up because the machine did it. - “Sure, boss, let me write this code (wrong) or outline this article (in a way that loses key meaning)!” If we hired a human who acted like that, we’d have them on an improvement plan in days and sacked in weeks. - So you dislike that the people selling LLMs are hyping up their product? They know they’re all dumb and hallucinate; their business model is enough people thinking it’s useful that someone pays them to host it. If the hype dies Sam Altman is back in a closet office at Microsoft, so he hypes it up. - I actually don’t use any LLMs, I haven’t found any smart ones. Text to image and image to image models are incredible though, and I understand how they work a lot more. - I expect the hype people to do hype, but I’m frustrated that the consumers are also being hypemen. So much of this stuff, especially at the corporate level, is FOMO rather than actually delivered value. - If it was any other expensive and likely vendor-lock-in-inducing adventure, it would be behind years of careful study and down-to-the-dime estimates of cost and yield. But the same people who historically took 5 years to decide to replace an IBM Wheelwriter with a PC and a laser printer are rushing to throw AI at every problem up to and including the men’s toilet on the third floor being clogged. 
 
 
- But every time someone gets on their soapbox in the comments it’s like they don’t even know the first thing about the math behind it. Like just figure out what you’re mad about before you start an argument. - The math around it is unimportant, frankly. The issue with AI isn’t about GANN networks alone, it’s about the licensing of the materials used to train a GANN and whether or not companies that used materials to train a GANN had proper ownership rights. Again, like the post I made, there’s an easy argument to make that OpenAI and others never licensed the material they used to train the AI, making the whole model poisoned by copyright theft. - There are plenty of uses of GANNs that are not problematic: bespoke solutions for predicting the outcomes of certain equations, or data science uses that involve rough predictions on publicly sourced (or privately owned) statistics. The problem is that these are not the same uses that we call “AI” today, and we’re actually sleeping on much better uses of neural networks by focusing on a pie in the sky AGI nonsense being pushed by companies that are simply pushing highly malicious, copyright infringing products to make a quick buck on the stock market. 
 
- I want people to figure out how to think for themselves and create for themselves without leaning on a glorified Markov chain. That’s what I want. - I totally understand your point of view. AI seems like the nail in the coffin for digital dominance over humans. It will debilitate people by today’s standards. - Can we compare gen AI tools to any other tools that currently eliminate some level of labor for us? e.g. drag-and-drop programming tools - Where do we draw the line? Can people then think and create in different ways using different tools? - Some GPTs are already integrating historical conversations. We’re past Markov chains. 
- Maybe if the actual costs—especially including environmental costs from its energy use—were included in each query, we’d start thinking for ourselves again. It’s not worth it for most things it’s used for at the moment 
- People haven’t ”thought for themselves” since the printing press was invented. You gotta be more specific than that. - Ah, yes, the 14th century. That renowned period of independent critical thought and mainstream creativity. All downhill from there, I tell you. - Independent thought? All relevant thought is highly dependent on other people and their thoughts. - That’s exactly why I bring this up. Having systems that teach people to think in a similar way enables us to build complex stuff and have a modern society. - That’s why it’s really weird to hear this ”people should think for themselves” criticism of AI. It’s a similar justification to antivaxxers saying you ”should do your own research”. - Surely there are better reasons to oppose AI? - I agree on the sentiment, it was just a weird turn of phrase. - Social media has done a lot to temper my techno-optimism about free distribution of information, but I’m still not ready to flag the printing press as the decay of free-thinking. - Things are weirder than they seem on the surface. - A math professor colleague of mine calls extremely restrictive use of language ”rigor”, for example. - The point isn’t that it’s restrictive, the point is that words have precise technical meanings that are the same across authors, speakers, and time. It’s rigorous because of that precision and consistency, not just because it’s restrictive. It’s necessary to be rigorous with use of language in scientific fields where clear communication is difficult but important to get right due to the complexity of the ideas at play. - Yeah sure buddy. - Have you tried to shoehorn real life stuff into mathematical notation? It is restrictive. You have pre-defined strict boxes that don’t have blurry lines. Free form thoughts are a lot more flexible than that. - Consistency is restrictive. I don’t know why you take issue with that. 
 
 
 
- The usage of “independent thought” has never been “independent of all outside influence”. It has simply meant going through the process of reasoning, thinking through a chain of logic, instead of accepting and regurgitating the conclusions of others without any of one’s own reasoning. It’s a similar lay meaning as being an independent adult. We all rely on others in some way, but an independent adult can usually accomplish activities of daily living through their own actions. - Yeah but that’s not what we are expecting people to do. - In our extremely complicated world, most thinking relies on trusting sources. You can’t independently study and derive most things. - Otherwise everybody should do their own research about vaccines. But the reasonable thing is to trust a lot of other, more knowledgeable people. - My comment doesn’t suggest people have to run their own research study or develop their own treatise on every topic. It suggests people have to make a conscious choice, preferably with reasonable judgment, about which sources to trust and to develop a lay understanding of the argument or conclusion they’re repeating. Otherwise you end up with people on the left and right reflexively saying “communism bad” or “capitalism bad” because their social media environment repeats it a lot, but they’d be hard pressed to give even a loosely representative definition of either. - This has very little to do with the criticism given by the first commenter. And you can use AI and do this, they are not in any way exclusive. 
 
 
 
 
 
- Speak for yourself. 
 
- So your argument against AI is that it’s making us dumb? Just like people have claimed about every technology since the invention of writing? The essence of the human experience is change, we invent new tools and then those tools change how we interact with the world, that’s how it’s always been, but there have always been people saying the internet is making us dumb, or the TV, or books, or whatever. - Get back to me after you have a few dozen conversations with people who openly say “Well I asked ChatGPT and it said…” without providing any actual input of their own. - Oh, you mean like people have been saying about books for 500+ years? - Not remotely the same thing. Books almost always have context on what they are, like having an author listed, and hopefully citations if it’s about real things. You can figure out more about it. LLMs create confident sounding outputs that are just predictions of what an output should look like based on the input. It didn’t reason and doesn’t tell you how it generated its response. - The problem is LLMs are sold to people as Artificial Intelligence, so it sounds like it’s smart. In actuality, it doesn’t think at all. It just generates confident sounding results. It’s literally companies selling con(fidence) men as a product, and people fully trust these con men. - Yeah, nobody has ever written a book that’s full of bullshit, bad arguments, and obvious lies before, right? - Obviously anyone who uses any technology needs to be aware of the limitations and pitfalls, but to imagine that this is some entirely new kind of uniquely-harmful thing is to fail to understand the history of technology and society’s responses to it. - Yeah, nobody has ever written a book that’s full of bullshit, bad arguments, and obvious lies before, right? - Lies are still better than ChatGPT. ChatGPT isn’t even capable of lying. It doesn’t know anything. It outputs statistically probable text. - How exactly? 
Bad information is bad information, regardless of the source. 
 
- You can look up the author and figure out if they’re a reliable source of information. Most authors either write bullshit or don’t, at least on a particular subject. LLMs are unreliable. Sometimes they return bullshit and sometimes they don’t. You never know, but it’ll sound just as confident either way. Also, people are led to believe they’re actually thinking about their response, and they aren’t. They aren’t considering if it’s real or not, only if it is a statistically probable output. - You should check your sources when you’re googling or using ChatGPT too (most models I’ve seen now cite sources you can check when they’re reporting factual stuff), that’s not unique to those things. Yeah LLMs might be more likely to give bad info, but people are unreliable too, they’re biased and flawed and often have an agenda, and they are frequently, confidently wrong. Guess who writes books? Mostly people. So until we’re ready to apply that standard to all sources of information it seems unreasonable to arbitrarily hold LLMs to some higher standard just because they’re new. 
 
 
 
 
 
 
- AI people always want to ignore the environmental damage as well… - Like all that electricity and water are just super abundant things humans have plenty of. - Every time some idiot asks AI instead of googling it themselves the planet gets a little more fucked - This is my #1 issue with it. My work is super pushing AI. The other day I was trying to show a colleague how to do something in Teams and as I’m trying to explain to them (and they’re ignoring where I’m telling them to click) they were like “you know, this would be a great use of AI to figure it out!”. - I said no and asked them to give me their fucking mouse. - People are really out there fucking with extremely powerful wasteful AI for something as stupid as that. 
- Are you not aware that Google also runs on giant data centers that eat enormous amounts of power too? - Multiple things can be bad at the same time, they don’t all need to be listed every time any one bad thing is mentioned. - I wasn’t listing other bad things, this is not a whataboutism, this was a specific criticism of telling people not to use one thing because it uses a ton of power/water when the thing they’re telling people to use instead also uses a ton of power/water. - Yeah, you’re right. I think I misread your/their comment initially or something. Sorry about that. - And AI is in search engines now too, so even if asking chatfuckinggpt uses more water than google searching something used to, Google now has its own additional fresh water resource depletor to insert unwanted AI into whatever you look up. - We’re fucked. - Fair enough. - Yeah, the integration of AI with chat will just make it eat even more power, of course. 
 
 
 
- This is like saying a giant truck is the same as a Civic for a 2 hr commute … - Per: https://www.rwdigital.ca/blog/how-much-energy-do-google-search-and-chatgpt-use/ - Google search currently uses 1.05GWh/day. ChatGPT currently uses 621.4MWh/day - The per-query cost for Google is about 10% of what it is for GPT but it gets used quite a lot more. So for one user ‘just use Google’ is fine, but since we are making prescriptions for all of society here we should consider that there are ~300 million cars in the US, and even if they were all Honda Civics they would still burn a shitload of gas and create a shitload of fossil fuel emissions. All I’m saying is that if the goal is to reduce emissions we should look at the big picture, which will let you understand that taking the bus will do you a lot better than trading in your F-150 for a Civic. - Google search currently uses 1.05GWh/day. ChatGPT currently uses 621.4MWh/day - … - And oranges are orange - It doesn’t matter what the totals are when people are talking about one or the other for a single use. - Fewer people commute to work on private jets than buses, are you gonna say jets are fine and buses are the issue? - Because that’s where your logic ends up 
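To make the totals-versus-per-query disagreement above concrete, here is a rough sketch of the arithmetic. The two daily totals are the figures quoted in the thread from the linked article; the 10% per-query ratio is the commenter's own rough number, and the implied query-volume ratio is derived purely from those inputs, not from any measured traffic data. The function names are made up for the example.

```python
# Daily energy totals as quoted in the thread (attributed to the linked article).
GOOGLE_SEARCH_MWH_PER_DAY = 1050.0  # 1.05 GWh/day
CHATGPT_MWH_PER_DAY = 621.4

# Commenter's rough claim: one Google search costs about 10% of one ChatGPT query.
PER_QUERY_RATIO = 0.10  # assumption taken from the comment, not a measurement


def total_ratio(google_mwh: float, chatgpt_mwh: float) -> float:
    """How many times more energy Google search uses per day, in total."""
    return google_mwh / chatgpt_mwh


def implied_volume_ratio(total: float, per_query: float) -> float:
    """Query-volume ratio implied by the total and per-query energy ratios.

    total = (google_queries * google_cost) / (chatgpt_queries * chatgpt_cost),
    so google_queries / chatgpt_queries = total / (google_cost / chatgpt_cost).
    """
    return total / per_query


if __name__ == "__main__":
    t = total_ratio(GOOGLE_SEARCH_MWH_PER_DAY, CHATGPT_MWH_PER_DAY)
    v = implied_volume_ratio(t, PER_QUERY_RATIO)
    print(f"Total daily energy, Google / ChatGPT: {t:.2f}x")
    print(f"Implied query volume, Google / ChatGPT: {v:.1f}x")
```

Under those quoted numbers, Google search uses roughly 1.7 times ChatGPT's daily total while handling roughly 17 times as many queries. Both sides of the argument are reading the same figures: one is looking at the total, the other at the cost of a single use.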
 
 
 
 
- I agree with this sentiment but I don’t see it actually convincing anyone of the dangers of AI. It reminds me a lot of how teachers said that calculators won’t always be available and we need to learn how to do mental math. That didn’t convince anyone then 
 
- They have to pay for every copyrighted material used in the entire model whenever the AI is queried. - They are only allowed to use data that people opt into providing. - I would make a case for creation of datasets by an international institution like UNESCO. The data used would be representative of world culture, and creation of the datasets would have to be sponsored by whoever wants to create models out of them, so that licencing fees can be paid to creators. If you wanted to make your mark on global culture, you would have an incentive to offer training data to UNESCO. - I know, that would be idealistic and fair to everyone. No way this would fly in our age. 
- There’s no way that’s even feasible. Instead, AI models trained on publicly available data should be considered part of the public domain. So, any images that anyone can go and look at without a barrier in the way would be fair game, but the model would be owned by the public. - It’s only not feasible because it would kill AIs. - Large models have to steal everything from everyone to be baseline viable - No, it’s not feasible because the models are already out there. The data has already been ingested and at this point it can’t be undone. - And you can’t exactly steal something that is infinitely reproducible and doesn’t destroy the original. I have a hard time condemning model creators for training their models on images of Mickey Mouse while I have a Plex server with the latest episodes of Andor on it. Once something is put on display in public the creator of it should just accept that they have given up their total control of it. - Ah yes the “it’s better to beg forgiveness than to ask permission” argument. 
 
 
- There’s no way that’s even feasible. - It’s totally feasible, just very expensive. - Either copyright doesn’t exist in its current form or AI companies don’t. 
- no way that’s even possible - Oh no… Anyway 
- Public Domain does not mean being able to see something without a barrier in the way. The vast majority of text and media you can consume for free on the Internet is not in the Public Domain. - Instead, “Public Domain” means that 1) the creator has explicitly released it into the Public Domain, or 2) the work’s copyright has expired, which in turn then means that anyone is from that point on entitled to use that work for any purpose. - All the major AI models scarfed up works without concern for copyrights, licenses, permissions, etc. For great profit. In some cases, like at least Meta, they knowingly used known collections of pirated works to do so. - I am aware and I don’t expect that everything on the internet is public domain… I think the models built off of works displayed to the public should be automatically part of the public domain. - The models are not creating copies of the works they are trained on any more than I am creating a copy of a sculpture I see in a park when I study it. You can’t open the model up and pull out images of everything that it was trained on. The models aren’t ‘stealing’ the works that they use for training data, and you are correct that the works were used without concern for copyright (because the works aren’t being copied through training), licenses (because a provision such as ‘you can’t use this work to influence your ability to create something with any similar elements’ isn’t really an enforceable provision in a license), or permission (because when you put something out for the public to view it’s hard to argue that people need permission to view it). - Using illegal sources is illegal, and I’m sure if it can be proven in court then Meta will gladly accept a few hundred thousand dollar fine… before they appeal it. - Putting massive restrictions on AI model creation is only going to make it so that the most wealthy and powerful corporations will have AI models. 
The best we can do is to fight to keep AI models in the public domain by default. The salt has already been spilled and wishing that it hadn’t isn’t going to change things. 
 
 
- This definitely relates to moral concerns. Are there other examples like this of a company that is allowed to profit off of other people’s content without paying or citing them? - Hollywood, from the very start. It’s why it is across the US from New York, to get outside of the legal reach of Broadway show companies they stole from. - Stack Overflow - Reddit. - Google. 
 
- What about models folks run at home? - I think if you’re not making money off the model and its content, then you’re good. 
- Careful, that might require a nuanced discussion that reveals the inherent evil of capitalism and neoliberalism. Better off just ensuring that wealthy corporations can monopolize the technology and abuse artists by paying them next-to-nothing for their stolen work rather than nothing at all. 
 
 
- Training data needs to be 100% traceable and licensed appropriately. - Energy usage involved in training and running the model needs to be 100% traceable and some minimum % of renewable (if not 100%). - Any model whose training includes data in the public domain should itself become public domain. - And while we’re at it we should look into deliberately taking more time at lower clock speeds to try to reduce or eliminate the water used for cooling these facilities. 
- I want OpenAI to collapse. - Many people with positive sentiments towards AI also want that. 
 
- I want real, legally-binding regulation, that’s completely agnostic about the size of the company. OpenAI, for example, needs to be regulated with the same intensity as a much smaller company. And OpenAI should have no say in how they are regulated. - I want transparent and regular reporting on energy consumption by any AI company, including where they get their energy and how much they pay for it. - Before any model is released to the public, I want clear evidence that the LLM will tell me if it doesn’t know something, and will never hallucinate or make something up. - Every step of any deductive process needs to be citable and traceable. - Clear reporting should include not just the incremental environmental cost of each query, but also a statement of the invested cost in the underlying training. 
- … I want clear evidence that the LLM … will never hallucinate or make something up. - Nothing else you listed matters: That one reduces to “Ban all Generative AI”. Actually worse than that, it’s “Ban all machine learning models”. - Let’s say I open a medical textbook a few different times to find the answer to something concrete, and each time the same reference material gives me a different answer, every one of them wrong but confidently passed off as right. Then yes, that medical textbook should be banned. - Quality control is incredibly important, especially when people will use these systems to make potentially life-changing decisions for them. - especially when people will use these systems to make potentially life-changing decisions for them. - That specifically is the problem. I don’t have a solution, but treating and advertising these things like they think and know stuff is a mistake that of course the companies behind them are encouraging. 
 
- If “they have to use good data and actually fact check what they say to people” kills “all machine learning models” then it’s a death they deserve. - The fact is that you can do the above, it’s just much, much harder (you have to work with data from trusted sources), much slower (you have to actually validate that data), and way less profitable (your AI will be able to reply to way fewer questions) than pretending to be the “answer to everything machine.” - The way generative AI works means no matter how good the data it’s still gonna bullshit and lie, it won’t “know” if it knows something or not. It’s a chaotic process, no ML algorithm has ever produced 100% correct results. - That’s how they work now, trained with bad data and designed to always answer with some kind of positive response. - They absolutely can be trained on actual data, trained to give less confident answers, and have an error checking process run on their output after they formulate an answer. - There’s no such thing as perfect data. Especially if there’s even the slightest bit of subjectivity involved. - Even less existent is complete data. - Perfect? Who said anything about perfect data? I said actually fact checked data. You keep moving the bar on what’s possible as an excuse to not even try. - They could indeed build models that worked on actual data from expert sources, and then have their agents check those sources for more correct info when they create an answer. They don’t want to, for all the same reasons I’ve already stated. - It’s possible, it does not “doom” LLMs, it just massively increases their accuracy and actual utility at the cost of money, effort and killing the VC hype cycle. 
 
 
 
 
 
- Before any model is released to the public, I want clear evidence that the LLM will tell me if it doesn’t know something, and will never hallucinate or make something up. - Their creators can’t even keep them from deliberately lying. - Exactly. 
 
- This is awesome! The citing and tracing is already improving. I feel like no hallucinations is gonna be a while. - How does it all get enforced? FTC? How does this become reality? 
 
- I think two main things need to happen: increased transparency from AI companies, and limits on use of training data. - In regards to transparency, a lot of current AI companies hide information about how their models are designed, produced, weighted and used. This causes, in my opinion, many of the worst effects of current AI. Lack of transparency around training methods means we don’t know how much power AI training uses. Lack of transparency in training data makes it easier for the companies to hide their piracy. Lack of transparency in weighting and use means that many of the big AI companies can abuse their position to push agendas, such as Elon Musk’s manipulation of Grok, and the CCP’s use of DeepSeek. Hell, if issues like these were more visible, it’s entirely possible AI companies wouldn’t have as much investment, and thus power, as they do now. - In terms of limits on training data, I think a lot of the backlash to it is exaggerated. AI basically takes sources and averages them. While there is little creativity, the work is derivative and bland, not a direct copy. That said, if the works used for training were pirated, as many were, there obviously needs to be action taken. Similarly, there needs to be some way for artists to protect or sell their work. From my understanding, they technically have the legal means to do so, but as it stands, enforcement is effectively impossible and nonexistent. 















