Programmer Humor

30946 readers
841 users here now

Welcome to Programmer Humor!

This is a place where you can post jokes, memes, humor, etc. related to programming!

For sharing awful code, there's also Programming Horror.

Rules

founded 2 years ago
MODERATORS
4
261
Die in honour (feddit.org)
submitted 1 day ago* (last edited 1 day ago) by NichEherVielleicht@feddit.org to c/programmer_humor@programming.dev
 
 
15
 
 

From a while ago, but I posted in the wrong sub. I'd never sworn in front of it or anything.

18
 
 

cross-posted from: https://sh.itjust.works/post/58114817

Inspired by a recent 916 post

21
 
 

cross-posted from: https://ibbit.at/post/219495

From Fark.com RSS via this RSS feed. Fark comments are available here.

---

By Wednesday morning, Anthropic representatives had used a copyright takedown request to force the removal of more than 8,000 copies and adaptations of the raw Claude Code instructions - known as source code - that developers had shared on programming platform GitHub.
It later narrowed its takedown request to cover just 96 copies and adaptations, saying its initial ask had reached more GitHub accounts than intended.

Source [web-archive]

---

Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model’s weights during training, and whether those memorized data can be extracted in the model’s outputs.

While many believe that LLMs do not memorize much of their training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models... We investigate this question using a two-phase procedure...

We evaluate our procedure on four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3, and we measure extraction success with a score computed from a block-based approximation of longest common substring...

Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs...

...we were able to extract four whole books near-verbatim, including two books under copyright in the U.S.: Harry Potter and the Sorcerer’s Stone and 1984...

Source: https://arxiv.org/pdf/2601.02671
