HiddenLayer555

joined 1 year ago
[–] HiddenLayer555@lemmy.ml 3 points 2 hours ago* (last edited 2 hours ago) (1 children)

So, instead of feeding large documents into these models which break them, you can instead provide them with an API to interrogate the document by writing code

Kind of off topic, but this reminded me of something I really don't like about the current paradigm of "intelligence" and "knowledge" being parts of a single monolithic model.

Why aren't we training models on how to search any generic dataset for information, find patterns, draw conclusions, etc., rather than baking the knowledge itself into the model? 8 or so GB of pure abstract reasoning strategies would probably be way more intelligent and efficient than even the much larger models we have now.

Imagine if you could just give it an arbitrarily sized database whose content you control, which you can then fill with the highest quality, ethically obtained, human-expert-moderated data complete with attributions to the original creators, and have it base all its decisions on that. It would even be able to cite what it used, with identifiers into the database that can then be manually verified. You get a concrete record of where it's getting its information from, and you only need to load what it currently needs into memory, whereas right now you have to load all of the AI's "knowledge," relevant or not, into your precious and limited RAM. You would also be able to update the data separately from the model itself and have it produce updated results from the new data.

That would actually be what I consider an artificial "intelligence," and not a fancy statistical prediction mechanism.
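To make the contract concrete, here's a rough Python sketch with a tiny SQLite database standing in for the curated store. The table layout is invented, and the pure-reasoning model mentioned in the comments is hypothetical, not anything that exists today:

```python
import sqlite3

# Hypothetical curated knowledge store: every row is attributed and licensed.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE documents (
    id      INTEGER PRIMARY KEY,
    source  TEXT,   -- attribution: original creator / URL
    license TEXT,   -- terms the entry was obtained under
    content TEXT
)""")
db.execute(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    (1, "Jane Doe, some-blog.example", "CC-BY-4.0",
     "CO2 traps heat in the atmosphere."),
)

def retrieve(query: str, limit: int = 5):
    """Naive full-text match standing in for real retrieval (embeddings, BM25, ...)."""
    return db.execute(
        "SELECT id, source, content FROM documents WHERE content LIKE ? LIMIT ?",
        (f"%{query}%", limit),
    ).fetchall()

def answer(question: str) -> str:
    evidence = retrieve(question)
    # A hypothetical pure-reasoning model would be handed ONLY `evidence`;
    # it has no baked-in knowledge, so every claim maps to a row id you can audit.
    cites = ", ".join(f"[doc {i}: {src}]" for i, src, _ in evidence)
    return f"(answer reasoned from {len(evidence)} retrieved rows) {cites}"

print(answer("traps heat"))
```

The point is the interface: the model can only cite rows that actually exist, and you can update or swap out the database without retraining anything.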

[–] HiddenLayer555@lemmy.ml 2 points 2 hours ago

You heard it, folks: invalidate all US patents and copyright, because written treaties are irrelevant!

[–] HiddenLayer555@lemmy.ml 1 points 3 hours ago* (last edited 3 hours ago)

The question is: What is an effective legal framework that focuses on the precise harms, doesn’t allow AI vendors to easily evade accountability, and doesn’t inflict widespread collateral damage?

This is entirely my opinion and I'm likely wrong about many things, but at minimum:

  1. The model has to be open source: freely downloadable, runnable, and copyleft, satisfying the distribution requirements of its copyleft source material. (I'm willing to give a free pass on making it copyleft in general, as different copyleft licenses can have different and contradictory distribution requirements, but IMO the leap from permissive to copyleft is the more important part.) I suspect this alone would kill the AI bubble, because as soon as they can't exclusively profit off it, they won't see AI as "the future" anymore.

  2. All training data needs to be freely downloadable and independently hosted by the AI's creator. It goes without saying that only material you can legally copy and host on your own server can be used as training data. This solves the IP theft issue: IMO, if your work is licensed such that it can be redistributed in its entirety, it should logically also be okay to use it as training data, and if you can't even legally host it on your own server, using it to train AI is off the table. The independently hosted dataset (complete with metadata about where it came from) also serves as attribution, since you can then search the training data for creators.

  3. Pay server owners for the use of their resources. If you're scraping for AI, you at the very least need to give server owners a way to send you bills. And no content can be scraped from the original source more than once (see point 2).

  4. Either have a mechanism for tracking attribution and accurately generating references alongside the generated code, or, if that's too challenging, I'm personally also okay with a blanket policy that anything AI generated is public domain. The idea that you can use AI generated code derived from open source in your proprietary app, and can then sue anyone who has the audacity to copy your AI generated code, is ridiculous and unacceptable.

[–] HiddenLayer555@lemmy.ml 8 points 4 hours ago* (last edited 2 hours ago) (2 children)

“Wait, not like that”: Free and open access in the age of generative AI

I hate this take. "Open source" is not "public domain" or "free rein to do whatever the hell you want with no acknowledgement of the original creator." Even the most permissive MIT license has terms that every single AI company shamelessly violates. All code derived from open source code needs to at the very least credit the original author, so unless the AI can reliably and accurately cite where the code it generates came from, all AI generated code that gets incorporated into any publicly distributed software violates the license of every single open source project it has ever scraped.

That's saying nothing about projects with copyleft licenses that place conditions on how the code can then be distributed. Can AI reliably avoid using information from those codebases when generating proprietary code? No? And that's not a problem because?

I absolutely hate the hypocrisy that permeates the discourse around AI and copyright. Knocking off Studio Ghibli's art style is apparently the worst atrocity you can commit, but god forbid open source developers, most of whom are working for free, have similar complaints about how their work is used.

Just because you "can't" obey the license terms due to some technical limitation doesn't mean you deserve a free pass from them. It means the technology is either too immature to be used or shouldn't be used at all. Also, why aren't they using LLMs during scraping to read the licenses and exclude anything other than pure public domain? Or better yet, use literally last century's technology to read the robots.txt and actually respect it. It's not even a technical limitation; it's a case of "doing the right thing is too restrictive and won't let us accomplish what we want, so we demand that the right thing be redefined to cover what we're trying to do."
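And I mean it about the last-century technology: Python ships a robots.txt parser in the standard library. A minimal sketch (the bot name and URLs are made up):

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "HypotheticalAIBot"  # made-up crawler identity

robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()  # one HTTP request; every check after this is a local lookup

url = "https://example.com/some/article"
if robots.can_fetch(USER_AGENT, url):
    print("allowed to fetch:", url)
else:
    print("disallowed, skipping:", url)  # the step scrapers keep "forgetting"
```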

Open source has only one or two core demands: credit me for my work, and possibly distribute derivatives in a way I can still take advantage of. And even that's not good enough for these AI chuds; they think we're the unreasonable ones for having these demands and not letting them use our code with no strings attached.

This is where many creators find themselves today, particularly in response to AI training. But the solutions they're reaching for — more restrictive licenses, paywalls, or not publishing at all — risk destroying the very commons they originally set out to build.

Yeah, blame the people getting exploited and not the people doing the exploiting, why don't you.

Particularly with AI, there’s also no indication that tightening the license even works. We already know that major AI companies have been training their models on all rights reserved works in their ongoing efforts to ingest as much data as possible. Such training may prove to have been permissible in US courts under fair use, and it’s probably best that it does.

No. Fuck that. There's nothing fair about scraping an independent creator's website (costing them real money) and then making massive profits from it. The creator literally fucking paid to have their work stolen.

If a kid learns that carbon dioxide traps heat in Earth's atmosphere or how to calculate compound interest thanks to an editor's work on a Wikipedia article, does it really matter if they learned it via ChatGPT or by asking Siri or from opening a browser and visiting Wikipedia.org?

Yes. And the fact that it's stolen isn't even the biggest problem by a long shot. In fact, even Wikipedia is a pretty shitty source; do what your high school teacher said you should do and mine Wikipedia for its citations, rather than citing the articles themselves.

Don't let AI teach you anything you can't instantly verify with an authoritative source. It doesn't know anything and therefore, by definition, can't teach anything.

Instead of worrying about “wait, not like that”, I think we need to reframe the conversation to [...] “wait, not in ways that threaten open access itself”.

Okay, let's do that then. All AI training threatens open access itself: if not by ensuring the creator can never make money to sustain their work, then by LITERALLY COSTING THE CREATORS MONEY WHEN THEIR CONTENT IS SCRAPED! So the conclusion hasn't changed.

The true threat from AI models training on open access material is not that more people may access knowledge thanks to new modalities. It’s that those models may stifle Wikipedia and other free knowledge repositories, benefiting from the labor, money, and care that goes into supporting them while also bleeding them dry. It’s that trillion dollar companies become the sole arbiters of access to knowledge after subsuming the painstaking work of those who made knowledge free to all, killing those projects in the process.

And how does shaming the victims of that knowledge theft for having the audacity to try to do something about it help, exactly?

Anyone at an AI company who stops to think for half a second should be able to recognize they have a vampiric relationship with the commons.

[...]

And yet many AI companies seem to give very little thought to this,

"Anyone at a Southern slave plantation who stops to think for half a second should be able to recognize they have a vampiric relationship with their black slaves." Yeah, they know. That's the point.

[–] HiddenLayer555@lemmy.ml 1 points 4 hours ago

Speak for your own instance.

[–] HiddenLayer555@lemmy.ml 1 points 5 hours ago* (last edited 5 hours ago)

Thousands of sick people killed each year by health insurance companies trying every trick and scheme to pay out as little as possible: I sleep.

Someone kills the guy masterminding it: REAL SHIT?!

[–] HiddenLayer555@lemmy.ml 4 points 5 hours ago* (last edited 5 hours ago) (5 children)

Small models have gotten remarkably good. 1 to 8 billion parameters, tuned for specific tasks — and they run on hardware that organizations already own

Hard disagree, as someone who does host their own AI. Go on Ollama and run some models; you'll immediately realize that the smaller ones are basically useless. IMO 70B models are barely usable for the simplest tasks, and with the current RAM landscape those are no longer accessible to most people unless you already bought the RAM before the Altman deal.

I suspect this is why he made that deal despite not having an immediate need for that much RAM: to artificially limit the public's ability to self-host their own AI, and thereby mitigate the threat open source models pose to his business.

[–] HiddenLayer555@lemmy.ml 1 points 5 hours ago

"Precrime"

"predictive policing"

AKA what they accuse aUtHoRiTaRiAn countries like China of.

[–] HiddenLayer555@lemmy.ml 7 points 5 hours ago (1 children)

Proletariat says relentless exploitation by CEOs is hurting society and has "done a lot of damage."

[–] HiddenLayer555@lemmy.ml 3 points 10 hours ago* (last edited 10 hours ago)

America: Destroys all people who try to not be cancerous

"People are the cancer, actually."

[–] HiddenLayer555@lemmy.ml 4 points 1 day ago

If we're "exploring" the galaxy like the guys in that ship "explored" the Earth, we should stay home and avoid defiling any more celestial bodies than we already have.

[–] HiddenLayer555@lemmy.ml 3 points 2 days ago (1 children)

I think a big part of it is emphasis. English tends to put the emphasis on the last word of the sentence, so not saying it sounds weird.

 

cross-posted from: https://lemmy.ml/post/40818280

If there's anything we should take from Japan, it's treating cars like second-class citizens behind transit instead of the other way around. The cute tiny cars are more of a side effect of that.

 

cross-posted from: https://lemmy.ml/post/40689539

I decided to switch to NixOS on my desktop and so far it's been great. I love being able to build out my config in the Nix file, but there is one thing I've not been able to figure out how to change. After a period of inactivity, the computer suspends (or hibernates?) and basically turns off: all the fans and lights turn off and it disconnects from the network (I don't know if it's saving the state to RAM or to the drive). How do I get it to not do that, and just lock the desktop and turn off the screen after inactivity? I'm using KDE Plasma, and I've tried different kinds of configurations that build successfully but still don't prevent it from going offline.

 

cross-posted from: https://lemmy.ml/post/40568699

After some consideration, I've decided to replace my consumer router at home with an OPNsense box I control, and use the consumer router as just an access point. The model I have doesn't seem to support OpenWrt, but its default firmware supports access point mode complete with mesh functionality; otherwise I would have just installed OpenWrt on it. I still like the consumer router's mesh Wi-Fi capabilities, especially the wireless range extender, but I don't trust it enough to let it be the actual root device separating my home network from the open internet. My reasoning is that by putting it behind the OPNsense router, I can monitor whether it's exfiltrating any "analytics" data and block them. Worst case, I realize it's too noisy with the analytics and buy a proper business grade access point, or an M.2 Wi-Fi 6 card with some beefy antennas.

Now I'm trying to decide if I should use one of my old mini PCs or get a brand new one with an up-to-date processor and microcode. The biggest reason I don't want the consumer router to be the root device anymore is that I don't know how well they patch their firmware against attackers constantly scanning the internet for vulnerable devices. I imagine an open source router OS with tons of eyes on it, used by actual professionals, would inherently be more secure than whatever proprietary, cost-cut consumer firmware my current router has. I've already picked out a suitable mini PC I'm not using (the reason I even started down this rabbit hole is that I have it), but after thinking more about it, I'm worried that whatever security I gain might be undermined by the underlying hardware being old and outdated, especially since the processor is definitely pre-Spectre/Meltdown and I doubt it's still getting microcode or firmware updates.

Again, the reason I ask is that the internet really wants me to think old disused computers are perfect for converting into routers, and I really don't want to buy a new computer if I don't have to. How important is the hardware for a router? Can I expect OPNsense to have sufficient security on pretty much any hardware, or will a sufficiently old computer completely defeat the purpose of even switching away from the consumer router?

Alternatively, I also have another mini PC with a Ryzen 5 from 2020, and I could reassign it from its current job to router duty, though that would definitely be overkill and a waste of its hardware capabilities. Would that be substantially more secure than an older Intel processor?

I also have a Raspberry Pi 4 I can put OpenWrt on, would that somehow be more secure than an older x64 computer?

 

My VPN provider limits how many concurrent connections I can have, and a workaround I've been using is to run the WireGuard client as a daemon (wg-quick@my-wg-config) plus a Squid proxy on my home server, and point my local devices at the HTTP proxy port, which routes their traffic through the WireGuard connection. However, this has broken randomly multiple times in the past few months: it will randomly decide to just not allow the server to connect to ANY internet address while the WireGuard connection is active, and no amount of network or routing table configuration changes fixes it. The Squid proxy works fine as far as I can tell; it's just the WireGuard connection that's failing, to the point that even a ping to an internet address from the server's terminal (which doesn't go through the proxy) fails. The only way I've been able to fix it is to completely reinstall the OS on the server and reconfigure everything from scratch, which is annoying and only works until it randomly decides to break again. This makes me think I'm doing something wrong.

Is there a more "proper" or widely supported way of routing internet traffic from local devices through a single WireGuard connection? Everything I can find online says running WireGuard with an HTTP proxy server is the way to do it, but it clearly isn't very reliable, or my computer is just defective in some weird intermittent way. The server is running Fedora Server 43. I've also checked for SELinux denials, but there are none.

I'm aware of wireproxy, but it uses a SOCKS5 proxy, which is not as widely supported as an HTTP proxy, and a lot of my devices (mainly phones) won't be able to use it. I'd also like the server itself to use the VPN, not just the devices on the proxy.

Does anyone have more experience with this and can give some advice?
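For concreteness, this is roughly how I tell the two legs apart when it breaks (the hostname is a placeholder for my server, and 3128 is just Squid's default port):

```python
import requests

# Placeholder address for the home server running Squid in front of WireGuard.
PROXY = {"http": "http://homeserver.lan:3128", "https": "http://homeserver.lan:3128"}
ECHO = "https://ifconfig.me/ip"  # any plain IP-echo service works

# Through the proxy: should print the VPN exit IP when Squid -> WireGuard is healthy.
try:
    print("via proxy:", requests.get(ECHO, proxies=PROXY, timeout=10).text.strip())
except requests.RequestException as e:
    print("proxy leg broken:", e)

# Direct (run on the server itself, bypassing Squid): this is the leg that dies,
# to the point that even this simple GET times out while the tunnel is wedged.
try:
    print("direct:", requests.get(ECHO, timeout=10).text.strip())
except requests.RequestException as e:
    print("direct leg broken:", e)
```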

 

cross-posted from: https://lemmy.ml/post/40388903

I have a science-fantasy world of intelligent non-anthro animals living in harmony, which I've posted some lore about in the past. Think "communist non-anthro Zootopia with sci-fi technology." This is something that I've been thinking about for a while and it combines my interests in worldbuilding and software: I want to create a fictional social media platform for the animals in my world, stage fictional threads in the typical Reddit/Lemmy format discussing news and politics taking place within the world, and then post screenshots here with context explaining what's happening. I just thought this might be a more fun way of sharing lore about my world than the articles themselves, almost like an ARG. I'll also be able to introduce some of my main narrative characters through their social media presence.

On the technical side of things, I don't know if I want to compile and spin up a local Lemmy instance at home and actually stage accounts and posts on it; logging in and out of different accounts sounds like way more work than necessary. I could instead take the Lemmy UI and feed it my own mock thread data. Or I could write my own code for a completely fictional GUI, since I don't want to just use the default Lemmy UI and break the illusion. The second and third options matter more if I want to make this an actual ARG and host a website for it, since in that case I don't actually want people to sign up and post.

I would love some feedback in general on this idea, and maybe gauge interest on if this is something people would like to see.
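For the mock-data option, "mock thread data" could be as simple as hand-written JSON that a static UI renders. A toy sketch of the shape I mean (all names and fields are invented, and far simpler than Lemmy's real API objects):

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Comment:
    author: str                      # a fictional animal citizen
    body: str
    score: int = 1
    replies: list = field(default_factory=list)

@dataclass
class Thread:
    community: str
    title: str
    author: str
    comments: list = field(default_factory=list)

# One staged in-world thread; screenshots would come from a UI rendering this JSON.
thread = Thread(
    community="burrownews",
    title="Council of Burrows passes universal grain access act",
    author="LongtailScribe",
    comments=[
        Comment(
            "GrumpyBadger",
            "About time. My sett went two winters on half rations.",
            score=42,
            replies=[Comment("LongtailScribe", "Full vote record is in the archive.")],
        ),
    ],
)

print(json.dumps(asdict(thread), indent=2))
```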
