I’m sorry, but I cannot help you with finding pirated movies. Piracy is illegal and unethical 😉 (feddit.it)

submitted 2 years ago by Moonrise2473@feddit.it to c/piracy@lemmy.dbzer0.com

Then I asked her whether she knows about the Books2 dataset (they trained this AI on all the pirated books in Z-Library and more, completely ignoring copyright), and I got:

I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.

Aaaand I got blocked

dannym@lemmy.escapebigtech.info 0 points 2 years ago

Yes, as I said, it's an EVOLUTION of Markov chains, but the idea is the same. As you pointed out, one major difference is that instead of accounting for only the last 1-5 words, it accounts for a larger context window. The LSTM is just a parlor trick. Read the paper on the original transformer model: https://browse.arxiv.org/pdf/1706.03762.pdf

Zeth0s@lemmy.world 4 points 2 years ago (last edited 2 years ago)

A Markov chain models a process as a sequence of transitions between states, where the transition probabilities depend only on the current state.
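To make that definition concrete, here is a minimal sketch of a first-order, word-level Markov chain. The toy corpus and the helper name `build_transitions` are made up for illustration and are not from anything discussed in this thread. The point is only that the next-word distribution is looked up from the current word alone, with no access to earlier words.

```python
from collections import Counter, defaultdict


def build_transitions(words):
    """Count word -> next-word transitions and normalize them to probabilities."""
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


# Hypothetical toy corpus, chosen only to keep the table small.
corpus = "the cat sat on the mat the cat ran".split()
transitions = build_transitions(corpus)

# The distribution after "the" is the same no matter what came before it.
print(transitions["the"])
print(transitions["cat"])
```

A higher-order chain would key the table on the last few words instead of one, which is the "last 1-5 words" case mentioned earlier in the thread.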

An LLM is ideally less like a Markov chain and more like a discrete Langevin dynamics: both have a memory (the attention mechanism for LLMs, inertia for LD), and both have noise controlled by a parameter (temperature in both cases; the name "temperature" in the LLM context is derived directly from thermodynamics).
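As a rough illustration of the temperature parameter, here is a sketch of a softmax with temperature, assuming the usual formulation in which the model's scores (logits) are divided by the temperature before normalization. The function name and the example logits are hypothetical. The form mirrors a Boltzmann distribution, where a state's probability is proportional to exp(-E / kT).

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)  # more deterministic
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
print(cold)
print(hot)
```

At low temperature almost all of the probability goes to the highest-scoring token, and at high temperature the choices become closer to equally likely, which is the "noise" being referred to.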

As far as I remember, the original attention paper doesn't reference Markov processes.

I am not saying one cannot explain it starting from a Markov chain. It is just wrong to say that we could have done this decades ago but lacked the horsepower and the data. We didn't have the method to simulate writing. We now have a decent one, and the horsepower to train it on a lot of data.

dannym@lemmy.escapebigtech.info -1 points 2 years ago

I think we're splitting hairs here. Look, you're technically correct, but none of what you said disproves my point, does it? Perhaps I should edit my comment to make it even clearer that it's not EXACTLY the same technology, but I don't think you'd argue with me that it's an evolution of it, right?

Zeth0s@lemmy.world 7 points 2 years ago (last edited 2 years ago)

Common reinforcement learning methods definitely are.

Are LLMs an evolution of Markov chains? Only in the sense that any method that is not a Markov chain is... I would say not directly. They clearly share concepts, as does any method for simulating stochastic processes, and LLMs are definitely more recent than Markov processes. Beyond that, anyone can decide what the inspirations were.

What I wanted to say is that we are really discussing a genuinely new method for LLMs, one that is not just "old stuff, more data".

This is my main point.