this post was submitted on 09 Nov 2025

Ask Science


So, this one's likely pretty niche, but I'm hoping someone here might know the answer.

I got genotype data for myself from 23andMe (don't worry, I made them delete it before the acquisition) and AncestryDNA years ago, and more recently I've been looking into things like SNPs. I write code for a living, so with a little code and the raw data I can check which interesting SNPs I might have.

Something I've noticed recently is that for some SNPs, I've got alleles that aren't listed as a possibility anywhere on the internet that I can find.

Just to take a random example, rs3746544, part of the SNAP25 gene. According to SNPedia, the available alleles are A and C with A being the major allele and C being the minor. So what is my genotype for that SNP?

[tootsweet@computer genome_raw_data]$ grep rs3746544 23andme_raw_data.txt ancestrydna_raw_data.txt
23andme_raw_data.txt:rs3746544    20      10287084        TT
ancestrydna_raw_data.txt:rs3746544   20      10287084        T       T
[tootsweet@computer genome_raw_data]$

TT? There's zero mention of "T" being an allele that you can have for rs3746544.

rs3746544 is very much not the only example. Just a few more among many:

I'm hoping some of you folks know enough about genes to know what might be up with these examples. I'm sure it's just simply something I don't yet understand about genetics. Thanks in advance!

Edit: So I had a bit of a brain fart after writing this in a comment:

(Side note: oddly of the 23 "mismatch" examples I mentioned, my genotype doesn’t have a single allele in common with the documented possible alleles for the SNP. For example, I don’t have any AT’s where the documented alleles are AA, AC, and CC. My genes either match the documented alleles or have no alleles in common with the documented genotypes. Which seems even stranger.)

A's pair with T's and C's pair with G's. I'm guessing that when I get a "mismatch" like what I'm talking about, what 23andMe or AncestryDNA is giving me is the complementary bases, i.e. the read from the opposite strand. So if I see a CT where the documented options are AA, AG, and GG, I should just consider my CT to be equivalent to an AG. (Because the T pairs with an A and the C pairs with a G.) By the same logic, my TT for rs3746544 would be equivalent to AA, which is one of the documented genotypes.
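The complement idea above can be sketched in a few lines of Python. This is only an illustration of the theory, not anything 23andMe or AncestryDNA publish: the names `flip_strand` and `classify` are made up for this example, and the sketch assumes genotypes are plain strings of A/C/G/T (anything else, such as a no-call marker, is treated as "no match").

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(genotype: str) -> str:
    """Return the genotype as it would read on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in genotype.upper())


def classify(genotype: str, documented_alleles: str) -> str:
    """Compare a reported genotype with a SNP's documented alleles.

    Returns "same strand", "opposite strand", or "no match".
    (Hypothetical helper for illustration only.)
    """
    genotype = genotype.upper()
    documented = set(documented_alleles.upper())
    # Anything that is not a plain base call cannot be classified.
    if not genotype or not set(genotype) <= set(COMPLEMENT):
        return "no match"
    if set(genotype) <= documented:
        return "same strand"
    if set(flip_strand(genotype)) <= documented:
        return "opposite strand"
    return "no match"


# rs3746544: documented alleles are A and C, reported genotype is TT.
print(classify("TT", "AC"))
# The CT example against documented alleles A and G.
print(classify("CT", "AG"))
```

One caveat with this approach: for SNPs whose documented alleles are A/T or C/G, flipping the strand gives back the same pair of letters, so a check like this can't tell which strand was reported.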

So I guess that means that sometimes the equipment that 23andMe and AncestryDNA use reads the opposite strand of the DNA from the one that's documented in the literature. (This only seems to happen in about 16.5% of cases or thereabouts, at least by my napkin math. In most cases, what 23andMe and AncestryDNA report in the raw data matches the literature, so they must be measuring/reading/reporting the "same side" of the double helix that the literature talks about.)

At least that theory seems consistent with what I'm seeing. If anybody knows better, I definitely would appreciate any further input!

That said, it does seem kind of odd that any time 23andMe reads the "other side" of the DNA molecule, so does AncestryDNA, and vice versa. That is, there don't seem to be any cases where they disagree on my genotype for a given SNP. At least I haven't seen any examples of that so far. I might have to do some searching now.

Edit 2: I've done a little more googling based on the first edit above and found this page. It seems 23andMe always reports on the so-called "+ strand" of the "Genome Reference Consortium Human Build 37" human reference genome. So maybe the 23 examples I've found so far are cases where at least some of the literature (or at least SNPedia and EUPedia, if not "the literature") is written in terms of what Build 37 considers the "- strand". Or maybe "the literature" (and/or SNPedia/EUPedia) uses a different reference genome? All this is still just a theory, but I definitely know more than I did a few minutes ago.

Edit 3: Some folks are suggesting that 23andMe and AncestryDNA may just not be accurate. As in, 23andMe and AncestryDNA may have a very high error rate when reading my genetic data. If that were the case, I wouldn't expect the inaccuracies to "match" between the two raw data files. So, to test that hypothesis, I wrote a script to check my 23andMe raw data against my AncestryDNA data to see how often they disagree. The script is quite slow, but so far it's checked over 35,000 SNPs that are measured by both services and found 12 that disagree, for a disagreement rate of roughly 0.0343% (12 / 35,000). As I mentioned in another comment, the "mismatch" instances I've found make up about 16.5% of the ones I've checked. So measurement error doesn't seem to account for a very large share of these. I'm still leaning pretty heavily toward it just being the "other strand" theory. Thanks again for everyone's input!
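For anyone who wants to repeat the comparison, here is a rough Python sketch of how such a check could be done with dictionary lookups so that it runs in one pass per file. It is not the script mentioned above. It assumes the layout visible in the grep output (whitespace-separated rsid, chromosome, position, then either one genotype column or two allele columns), and it further assumes that comment lines start with `#`, that a header row starts with `rsid`, and that anything other than A/C/G/T in the genotype is a no-call or indel code that should be skipped. The function names are made up for this example.

```python
def parse_genotypes(lines):
    """Map rsid -> genotype with alleles sorted, so 'TC' and 'C T' both become 'CT'."""
    genotypes = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines, '#' comments, and a header row (assumed to start with "rsid").
        if len(fields) < 4 or fields[0].startswith("#") or fields[0] == "rsid":
            continue
        # Join everything after the position column: "TT" or "T" + "T".
        alleles = "".join(fields[3:]).upper()
        # Assumption: keep only plain base calls; drop no-calls and indel codes.
        if set(alleles) <= set("ACGT"):
            genotypes[fields[0]] = "".join(sorted(alleles))
    return genotypes


def compare(lines_a, lines_b):
    """Return (number of SNPs called in both files, sorted rsids that disagree)."""
    a = parse_genotypes(lines_a)
    b = parse_genotypes(lines_b)
    shared = a.keys() & b.keys()
    mismatched = sorted(rsid for rsid in shared if a[rsid] != b[rsid])
    return len(shared), mismatched


# Usage with the file names from the grep example:
#
# with open("23andme_raw_data.txt") as f_a, open("ancestrydna_raw_data.txt") as f_b:
#     total, mismatched = compare(f_a, f_b)
# print(f"{len(mismatched)} of {total} shared SNPs disagree "
#       f"({100 * len(mismatched) / total:.4f}%)")
```

If one file reports a single allele where the other reports two (which may happen on the X, Y, or mitochondrial chromosomes), this sketch would count that as a disagreement, so those chromosomes may need separate handling.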

j4yc33@piefed.social 3 points 1 week ago

All I know is that RS232 has options for Parity and various bitstream options.