this post was submitted on 13 May 2026
775 points (100.0% liked)

[–] rockSlayer@lemmy.blahaj.zone 26 points 15 hours ago (5 children)

I'm a data analyst at a medical nonprofit, primarily doing analyses on germline variants for rare forms of cancer. I'm new to this kind of work, but had a decent educational background in biology.

Something I've learned is that genetics is complicated as hell. A single gene can produce multiple different proteins, and proteins can change over time due to somatic variation. Only about 1% of the genome is protein-coding; that portion is called the exome. The exome can be affected by variants in start and stop codons, non-coding regions, and untranslated regions. There are entire fields dedicated to genomics, exomics, transcriptomics, proteomics, phenomics, and probably several others that I don't know about. The amount of data involved in these fields is in the tebibyte range. Have you ever seen a "small" 3 GiB CSV? I have. The filtered and cleaned data frames created in genetics work are over 100 columns wide and have nearly 5 million entries.

There are companies creating artificial life by generating custom chromosomes. There's a whole field of computer science dedicated to biological computing, using DNA as a storage medium. There are companies dedicated to simply classifying genes.

DNA is cool as hell.

[–] MrEff@lemmy.world 10 points 14 hours ago

If you really want to blow your mind, look into the theoretical alternatives to DNA. We are all taught about RNA and how it is a precursor to DNA, but what if it went another way? Look up PNA, PNA-O, or even GNA. If life existed on other worlds, there is a decent chance it follows an xNA structure, but not necessarily DNA.

[–] ptu@sopuli.xyz 3 points 11 hours ago (1 children)

Interesting, could you explain what types of data are in those 100 columns? I’m aware of ATGC and thought it would be just one column, but maybe the rest indicate intensity or activity, or what sequence they are part of.

[–] rockSlayer@lemmy.blahaj.zone 3 points 10 hours ago (2 children)

Well, it varies depending on what the file is meant for. Usually there are columns like chromosome, variant position, reference nucleotide, observed nucleotide, type of variation, codon sequence, gene name, etc.

There are also columns that result from various analyses. In the file I've been working on lately, there are columns such as variant impact, level of confidence, pathogenicity, clinical significance, etc.
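
As a rough illustration of the columns described above, here is a minimal Python sketch of one row of such a variant table. The column names (`chrom`, `pos`, `ref`, `alt`, `variant_type`, `gene`, `impact`, `clin_sig`) and the sample rows are made up for illustration; real annotation files use their own headers and have far more columns.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names and made-up rows, for illustration only.
SAMPLE = """chrom,pos,ref,alt,variant_type,gene,impact,clin_sig
17,1000,G,A,missense,GENE_A,moderate,uncertain
13,2000,C,T,stop_gained,GENE_B,high,pathogenic
"""


@dataclass
class Variant:
    chrom: str         # chromosome
    pos: int           # variant position
    ref: str           # reference nucleotide
    alt: str           # observed nucleotide
    variant_type: str  # type of variation
    gene: str          # gene name
    impact: str        # analysis result: variant impact
    clin_sig: str      # analysis result: clinical significance


def read_variants(text: str) -> list[Variant]:
    """Parse CSV text into Variant records, converting position to int."""
    rows = csv.DictReader(io.StringIO(text))
    return [Variant(**{**row, "pos": int(row["pos"])}) for row in rows]


variants = read_variants(SAMPLE)
# Example filter: genes carrying a variant flagged as high impact.
high_impact = [v.gene for v in variants if v.impact == "high"]
print(high_impact)
```

Each analysis step typically appends more columns of this kind, which is one plausible way a table ends up over 100 columns wide.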

[–] The_v@lemmy.world 2 points 4 hours ago (1 children)

That sounds like a marker file. It's a bit different than a sequence file.

Molecular markers are linked to specific sequences in the DNA. These markers are generally close to or inside the gene of interest. All the extra columns describe a marker's characteristics and results. Any place in the genome where there is a single-nucleotide difference (a polymorphism) can be another marker. There are millions of these, and they add up to massive files.

A sequence file is basically just a long, boring sequence of nucleotides, and those files are not that large. Now, some of the files you use to generate the sequence? Let's just say they had to wait almost 20 years for computers to get fast enough to process those files in a reasonable time. Those make the marker files look like child's play.
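
The idea that any single-nucleotide difference can serve as a marker can be sketched in a few lines of Python. This is a toy example under strong assumptions: both sequences are made up, already aligned, and the same length, which real pipelines cannot take for granted. The function name `snp_positions` is hypothetical.

```python
def snp_positions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, observed base) per mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample), start=1)
        if r != s
    ]


# Toy sequences, made up for illustration.
ref = "ATGCCGTA"
obs = "ATGTCGTC"
print(snp_positions(ref, obs))  # [(4, 'C', 'T'), (8, 'A', 'C')]
```

Each tuple corresponds to the position, reference nucleotide, and observed nucleotide columns of a marker-style file.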

[–] rockSlayer@lemmy.blahaj.zone 1 points 3 hours ago

I'm not familiar with the name of the file I'm currently working with tbh. It's used to create the annotation files for regenie analyses. It has every variant for every gene within the biobank. There's far more than just missense; there are stop/start gain/loss, splice donor/acceptor, frameshifts, and PTVs. It contains PrimateAI scores, SpliceAI scores, CAVA data, ClinVar data, and more.

[–] ptu@sopuli.xyz 3 points 10 hours ago

Sweet, thanks for the reply. I didn’t expect to fully understand what they would contain but I got the idea.

There’s a Japanese artist, Ryoji Ikeda, who you might like. He has visualised DNA and all sorts of data. I find the style of his data.gram exhibition the most aesthetically amusing, and he has published some albums too.

https://www.taronasugallery.com/en/exhibitions/ryoji-ikeda%E3%80%8Cdata-gram%E3%80%8D/

[–] pelespirit@sh.itjust.works 12 points 15 hours ago (1 children)

There are companies creating artificial life by generating custom chromosomes.

My dude, not a fun thing to think about who might have control over that. Is it a musk, zuck, cook or epstein?

[–] rockSlayer@lemmy.blahaj.zone 9 points 15 hours ago (1 children)

No, none of those guys are involved afaik. The company that made the first breakthrough in artificial life is run by the same dude who competed with the Human Genome Project to map 99% of the human genome. They modified an extremely simple bacterium that only had something like a few hundred genes.

[–] pelespirit@sh.itjust.works 1 points 15 hours ago (1 children)

We still don't know what type of person they are. Being smart and focused on the research doesn't give them a pass. They might not even care who else has the info.

[–] halcyoncmdr@piefed.social 1 points 10 hours ago

Yup. Many Nazi scientists only cared about the research. A lot of medical and physics breakthroughs last century directly resulted from those experiments.

[–] foofiepie@lemmy.world 1 points 8 hours ago (1 children)

I have no context/knowledge on the topic. Are you saying that much data can be extracted from DNA? If so, that’s nuts.

[–] rockSlayer@lemmy.blahaj.zone 1 points 6 hours ago

Yes, all that data is extrapolated directly from DNA. It's a huge amount of information. A single copy of the human genome translates directly to about 750 MiB at 2 bits per base, and most human cells carry two copies. Now, add in the fact that genomic studies use biobanks, like the UK Biobank, which contains the genetic info of hundreds of thousands of people. The data we can extrapolate from DNA is absolutely massive.
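
The roughly 750 MiB figure can be sanity-checked with quick arithmetic. This sketch assumes about 3.1 billion base pairs per copy of the human genome and 2 bits per base (enough to distinguish A, C, G, and T); both numbers are assumptions for the estimate and are not stated in the comment.

```python
# Back-of-the-envelope estimate; the base-pair count is an approximation.
BASE_PAIRS_PER_COPY = 3_100_000_000  # roughly one copy of the human genome
BITS_PER_BASE = 2                    # A, C, G, T fit in 2 bits


def genome_size_mib(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> float:
    """Size in MiB of a sequence stored at a fixed number of bits per base."""
    total_bytes = base_pairs * bits_per_base / 8
    return total_bytes / (1024 ** 2)


one_copy = genome_size_mib(BASE_PAIRS_PER_COPY)
two_copies = genome_size_mib(2 * BASE_PAIRS_PER_COPY)
print(f"one copy: {one_copy:.0f} MiB, two copies: {two_copies:.0f} MiB")
```

Under these assumptions one copy comes out a little under 750 MiB, and the two copies found in most cells come out at about twice that. Real file formats store quality scores and metadata on top of the raw bases, so files on disk are usually much larger.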

[–] homesweethomeMrL@lemmy.world 1 points 14 hours ago

That’s too much science. We, as a people, need less sci- wait, no. No, no. Uh - We need bett-er? Science? Hmm.

Look just make it an animated cartoon with fun music for now and we’ll circle back.