This thread makes it really clear that we need laws specifically focused on generative AI. Looking for an answer in current copyright law is like expecting the First Amendment to have a subsection specifically devoted to social media networks.
There are; they’re called TDM exceptions: https://www.reedsmith.com/en/perspectives/ai-in-entertainment-and-media/2023/06/text-and-data-mining-around-the-globe
Here we go again with all the reddit bros pretending statistical regurgitation “is just like a human, trust me, don’t you get inspired by reading stuff as well”
Oh, well, whatever… Soon enough all the data sets will be useless, maybe it will be time these companies hire writers for $5 an hour and lock them in a room to produce some unspoiled input.
Every other week someone tries.
A lawsuit is basically an angry letter with a filing fee. Whether they can actually win is another question entirely.
Going to be fun to see the influx of “case dismissed” articles in a few months though.
and academic journals without their consent.
Good.
Elsevier and their ilk are pure parasites. They take work paid for by public funding and charge scientists to publish. They do basically nothing: they don’t review the work, they don’t do formatting, they don’t even so much as check for spelling mistakes. They exist purely because of a quirk of history and the difficulty of coordinating a move away from assessing academics by the prestige and impact factor of their publications.
They’re parasitic organisations who try to lock up public information.
Academic journals should be free and available for everyone, they shouldn’t be getting fed into AI without permission.
You do realize you’re contradicting yourself, right?
Nope. Journals being accessible to everyone in an archive does not mean AI models have carte blanche to use them for training.
I understand what you’re going for, but that might be tricky legally. What special status does the archive have that allows it to make all that information accessible, that an AI model wouldn’t have?
The law is fucked and needs to catch up to AI stuff. DMCA, fair use etc is not built to handle scraping on the level AI does.
Academic journals should be free and available for everyone, they ~~shouldn’t~~ should be getting fed into AI without permission.
Here, FTFY. I don’t know if you recognize the dissonance between the first and the second part of your sentence.
There is no dissonance. I don’t think AI models should be getting stuff, because they’re not a public archive. They are using it to build a data model. There’s a difference between commercial use, which is the goal of AI companies, and spreading knowledge and research.
That’s not dissonance.
So your opinion is also that search engines should pay websites for the content they index? Explain to me how one is different from the other.
Explain to me how one is different from the other.
Man, he literally said it. Can you read? Wait sorry, you’re an AI techbro. You barely know how to write a prompt.
The goal of AI companies is to make money and give nothing back to the data that fed their model. Search indexes have a mutually beneficial relationship with whatever they index that drives traffic to websites.
I’m not sure I can make it any easier. Maybe ask chatgpt if you still don’t get it.
Feeding it into AIs is one of the things countless researchers would love to do with scientific literature in order to fuel more discoveries for the benefit of everyone,
but the parasitic journal owners try to heavily restrict what you can do with the text even after you’ve paid through the nose to publish and paid through the nose for subscriptions.
You’re speaking for the researchers. What they want is a free, public archive, which already exists (though not legally). AI is not there to make an archive.
Well, if it’s just so people have to pay openAI to get access to knowledge instead of having to pay Elsevier, it’s not really what I personally want to be honest…
You managed to contradict yourself in one sentence.
I don’t have a dog in this fight nor do I know the specifics of the relevant law here, but I would note that Susman Godfrey is probably the best litigation-focused law firm in America and it’s unlikely that they’re just moronically accepting a case without strong support in the law. Look at their track record and their attorney bios; these people absolutely do not screw around.
Distinguished lawyers and professors have done the same in the past, I wouldn’t rule it out.
People, particularly outside tech, have a tendency to imagine the chatbot is like a person they can ask to testify.
Elsevier and their ilk are pure parasites.
But Microsoft is cool and good.
Microsoft and OpenAI may scrape stuff but at least they don’t then try to lock everyone else out from being able to read the original.
A big step up from Elsevier
At this point everyone knows that these LLMs don’t know what they were trained on.
Just to emphasize this: machine learning algorithms don’t know anything. All training does is adjust constants in an equation.
Like Jon Snow, it knows nothing, and if you ask it for something complicated, it will put that on full display.
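To make the “adjusting constants in an equation” point concrete, here’s a toy sketch (hypothetical Python, nowhere near the scale of a real LLM): “training” a one-constant model y = w·x is nothing more than nudging the number w to shrink the error.

```python
# Toy illustration: "learning" y = 2x is just adjusting one constant.
# Hypothetical sketch; real LLMs apply the same idea to billions of constants.
data = [(1, 2), (2, 4), (3, 6)]  # inputs and targets for y = 2x
w = 0.0    # the single "constant in an equation"
lr = 0.05  # learning rate: how big each adjustment is

for _ in range(200):
    for x, y in data:
        pred = w * x
        grad = 2 * (pred - y) * x  # derivative of squared error w.r.t. w
        w -= lr * grad             # calibrate the constant

print(round(w, 3))  # converges to 2.0
```

Nothing in that loop “knows” anything; it only reduces a number measuring how wrong the equation currently is.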
Good for them. I wish them justice.
Justice would be them having to pay the defense’s legal fees for filing a frivolous suit.
If you are buying the hoax that genAI’s data laundering scheme is fair use, I would like you to spare me the frivolous argument!
It is truly depressing to see so many people watch massive mega corporations practice unrestrained access to our property and personal data, then use that to replace our jobs to fill their own pockets, and be dumb enough to take their side.
If you are buying the hoax that genAI’s data laundering scheme is fair use
Because it is. No legal scholar seriously doubts that argument. It comfortably meets all the requirements.
It is truly depressing to see so many people watch massive mega corporations practice unrestrained access to our property and personal data
Lmao, and you think abolishing fair use is somehow a win for people over corporations? Now I know you’re just trolling.
Because it is. No legal scholar seriously doubts that argument. It comfortably meets all the requirements.
Rationalization placed on the big corporations having good lawyers.
Lmao, and you think abolishing fair use is somehow a win for people over corporations? Now I know you’re just trolling.
You seriously think that’s what I’m arguing for? Or are you composing a strawman to comfort yourself? Asking for data laundering scams to be regulated so they don’t replace the working class’s jobs the moment it makes a mega corporation a single buck should not be insane. It doesn’t mean abolishing fair use. Helpful idiots like you are what these companies are depending on, though.
I thought I told you to spare me the frivolous argument … go bootlick somewhere else.
Rationalization placed on the big corporations having good lawyers.
I’m not talking about just OpenAI’s lawyers. This is actually a very clear-cut matter, despite your attempts to throw doubt on it.
You seriously think that’s what I’m arguing for?
Quite literally, yes. Training an AI model is rather clearly fair use, so to make that illegal, you need to either abolish fair use, or severely limit it from its current scope.
Asking for data laundering scams to be regulated so they don’t replace the working class’s jobs the moment it makes a mega corporation a single buck
And I’m sure you would have also suggested that we ban the automated loom for putting weavers out of business. There’s a reason the Luddites lost.
What is it with you AI circlejerkers and constantly calling people Luddites?
What is it with you AI circlejerkers and constantly calling people Luddites?
Calling a spade a spade. You have a better term for someone who wants to hold back technology because it threatens some small population in an existing industry?
When you don’t know the difference between stealing and copying
Why is this downvoted?
Books good
AI bad

Upvoted to the left
Learn to code
Makes you wonder if they’re now compelled to release their models if they used any GPL-licensed material
You’re a moron
Idiots: “Sam Altman was fired because of Super AI for some reason”
Normal people: “OpenAI now has several lawsuits that might make it impossible to monetize without being forced to pay billions, and it’s doubtful that Sam told Microsoft, before he sold his chatbot, that he was stealing authors’ works”
And that explains why he’s now back? And has had MS’s support the entire time?
Well, it was either fire him and lose all your money by losing your employees or keep him and try to salvage what you can.
Microsoft has always wanted to keep him. The OpenAI board fired him for ideological reasons/power struggle, realized they would be killing the entire company, and decided to salvage the company even at the cost of their jobs.
Why is it okay for a human to read and learn from copyrighted materials, but it’s not OK for a machine to do so?
Machines don’t have inspiration. They only do advanced versions of copy paste
It’s funny you say that because now that I think about it, inspiration basically is advanced copy and paste
Except a human gets inspiration from their environment, their life, their emotions. Unique experiences.
A bot only gets “inspiration” from other people’s work. And if that work is copyrighted… The author deserves compensation
This is an oversimplification of both human cognition and how machines work.
Your argument boils down to the fact that humans have a more diverse data set. This is a terrible legal basis.
What are you saying… It’s not about the amount of information, it’s about whether the source of information is copyrighted work or not.
Monet cultivated his own garden and painted the famous water lilies. That is 100% original work. No argument possible.
Your environment, emotions, and experiences are simply different forms of data and sources to pull from. Most stories are in some way inspired by other stories.
Why is it okay to own furniture, but not people?
By the way:
it’s not OK for a machine to do so
There are no machines that read and learn. “machine learning” is a technical term that has nothing to do with actual learning.
There are no machines that read and learn.
That’s exactly what large language models do.
That’s exactly what large language models do.
I can see how you would come to that conclusion, given that you clearly are incapable of either.
What’s the connection between owning slaves and using computer tools? I don’t really follow this jump in logic.
https://en.m.wikipedia.org/wiki/Master/slave_(technology)
Don’t quite agree with the above poster but this is the tool they’re referring to and they’re making the argument that it is a metaphor/just the name of the tool and there isn’t a direct connection.
I think they were talking about people slaves, not computer networks. The person above them asked why humans can learn from copyright materials, but machines aren’t allowed to. The next person asked why we can own furniture but not people. To me this seems like they are saying we don’t own slaves for the same reason computer programs shouldn’t be allowed to learn from copyright materials. I’d say we don’t own slaves because as a society we value and believe in individuality, personal choice, and bodily autonomy, and I don’t see how these relate to dictating what content you train computer models on.
Have you ever considered the possibility that unliving objects are not, in fact, people?
I guess you think “neural networks” work nothing like a brain right?
Of course machines can read and learn, how can you even say otherwise?
I could give an LLM an original essay, and it will happily read it and give me new insights based on its analysis. That’s not a conceptual metaphor; that’s bona fide artificial intelligence.
I think anyone who thinks neural nets work exactly like a brain at this point in time is pretty simplistic in their view. Then again, you said “like a brain”, so you’re already into metaphor territory, so I don’t know what you’re disagreeing with.
Learning as a human and learning as an LLM are just different philosophical categories. We have consciousness, we don’t know if LLMs do. That’s why we use the word “like”. Kind of like, “head-throbbed heart-like”.
We don’t just use probability. We can’t parse 10,000,000-parameter spaces. Most people don’t use linear algebra.
A simulation of something is not equal to that something in general.
What? We as humans literally learn through pattern recognition. How is it different from what a machine is doing? Of course it is not exactly the same process our brains use, but it is by no means a “metaphor”.
I fail to see how training an LLM with the material I choose is any different than me studying that material. Artists are just mad I can make awesome pictures on my graphics card.
Neural networks aren’t literally bundles of biological neurons but that doesn’t mean they’re not learning.
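For what it’s worth, the “loosely inspired by neurons” relationship is easy to see in how small an artificial neuron actually is. A hypothetical minimal sketch: a weighted sum squashed through a sigmoid is the entire unit.

```python
import math

# Hypothetical minimal "artificial neuron": a weighted sum of inputs passed
# through a squashing function -- loosely inspired by, not a copy of, biology.
def neuron(inputs, weights, bias):
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid activation, output in (0, 1)

# Example firing: two inputs, two weights, one bias (all made-up values).
out = neuron([1.0, 0.5], [0.8, -0.4], 0.1)
print(out)  # a value strictly between 0 and 1
```

Stacking millions of these and tuning their weights is what “learning” means for the network; whether that counts as learning in the human sense is the philosophical question being argued above.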
Pretty sure humans paid for the materials. That’s the whole point. Authors have to be compensated for their work.
Homie is in /r/books and has never heard of a library
Because human beings have rights and machines don’t and shouldn’t. Humans read for enjoyment and self fulfillment. These AI machines only read for the purpose of regurgitating a soulless imitation of the original. Not even remotely similar.
This is going to go the way of the Silverman case. One quote from that judge:
“This is nonsensical,” he wrote in the order. “There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs’ books.”
The Silverman case isn’t over. The judge took the position that the outputs themselves are not infringement, as I think most people agree since it is a transformation, but the core of the case is still ongoing: that the dataset used to train these models contained their copyrighted work. Copying is one of the rights granted to copyright holders and, unlike the Google case a few years back, this is for a commercial product and the books were not legally obtained. Very different cases. I would be surprised if Silverman and the others lost this lawsuit.
The Silverman case isn’t over.
It is with respect to that argument. The claim in question was thrown out.
The remaining claim is unrelated.
Copyright is more about distribution and deprivation than copying.
There is absolutely nothing preventing me from sitting down and handwriting the entirety of the LOTR in calligraphic script.
I can even give that copy to other people, as it is a “derivative work,” and I’m not attempting to profit from it.
There’s not even anything preventing me from scanning every page and creating a .pdf file for personal use, as long as I don’t distribute it.
Hell, the DMCA even allows me to rip a movie as long as I’m keeping it for personal use.
I don’t see anything here that cannot be argued against with fair use. The case is predicated upon the idea that if you give it the correct prompts, it’ll spit out large amounts of copyrighted text.
If you were describing that as an interaction with a person, you’d call that coercion and maybe even entrapment.
The intent of the scraping was not explicitly distribution.
Just to clarify most people do not agree. A lot of people are explicitly arguing that.
Get ‘em!
I’m so surprised at the number of people defending AI in this subreddit. It truly makes me feel like we failed as a species. I’m not a writer, nor an artist or musician, but art and culture have walked hand in hand through human history. I struggle to understand why we aren’t more protective of it and instead just hand out thousands of years of human tradition to machines. Just because we could doesn’t mean we should.
Both Meta and OpenAI have been clear about pirating thousands of books for their training sets, so it’s not exactly surprising that lawsuits are following.
Good. It is theft. Data is not free