this post was submitted on 05 Feb 2026

24 points (92.9% liked)

Ask Lemmy

38450 readers

1498 users here now

A Fediverse community for open-ended, thought provoking questions

Rules: (interactive)

1) Be nice and; have fun

Doxxing, trolling, sealioning, racism, toxicity and dog-whistling are not welcomed in AskLemmy. Remember what your mother said: if you can't say something nice, don't say anything at all. In addition, the site-wide Lemmy.world terms of service also apply here. Please familiarize yourself with them

2) All posts must end with a '?'

This is sort of like Jeopardy. Please phrase all post titles in the form of a proper question ending with ?

3) No spam

Please do not flood the community with nonsense. Actual suspected spammers will be banned on site. No astroturfing.

4) NSFW is okay, within reason

Just remember to tag posts with either a content warning or a [NSFW] tag. Overtly sexual posts are not allowed, please direct them to either !asklemmyafterdark@lemmy.world or !asklemmynsfw@lemmynsfw.com. NSFW comments should be restricted to posts tagged [NSFW].

5) This is not a support community.

It is not a place for 'how do I?', type questions. If you have any questions regarding the site itself or would like to report a community, please direct them to Lemmy.world Support or email info@lemmy.world. For other questions check our partnered communities list, or use the search function.

6) No US Politics.

Please don't post about current US Politics. If you need to do this, try !politicaldiscussion@lemmy.world or !askusa@discuss.online

Reminder: The terms of service apply here too.

Partnered Communities:

Logo design credit goes to: tubbadu

founded 2 years ago

MODERATORS

Bluetreefrog@lemmy.world

TheSaneWriter@lemm.ee

TheSaneWriter@lemmy.thesanewriter.com

Asudox@lemmy.world

lemmy_bot@lemmy.world

beefbaby182@lemmy.world

ModeratorCan@lemmy.world

neidu3@sh.itjust.works

asudox@lemmy.asudox.dev

candyman337@lemmy.world

candyman337@sh.itjust.works

What are alternative to searching using web search engines? (programming.dev)

submitted 1 month ago* (last edited 1 month ago) by cactus_head@programming.dev to c/asklemmy@lemmy.world

51 comments fedilink hide all child comments

For most use cases, web search engines are fine. But I am wondering if there are alternative ways to finding information. There is also the enshittification of google and tbh most(free) search engines just give google search result

Obviously, the straight answer is just asking other people, in person or online, in general forums or specialised communities

Libraries are good source too but for those of is that don't have access to physical libraries, there free online public libraries(I will post the links for those that I found below)

Books in general, a lot of them have reference to outside materials.

So, I been experimenting with an AI chat bot(Le chat), partially as life coach of sorts and partially as a fine tuned web search engine. To cut to the chase, its bad. when its not just listing google top results it list tools that are long gone or just makes shit up. I was hoping it to be a fine tuned search engine, cuz with google, if what you want is not in the top 10 websites, your on your own.

So yeah, that all I can think of. Those are all the routes I can think of for finding information and probably all there is but maybe I missed some other routes.

you are viewing a single comment's thread
view the rest of the comments

[–] riskable@programming.dev 6 points 1 month ago (4 children)

Have you tried using an LLM configured to search the Internet for you? It's amazing!

Normal search: Loads of useless results, ads, links that are hidden ads, scams, and maybe on like the 3rd page you'll find what you're looking for.

AI search: It makes calls out to Google and DDG (or any other search engines you want) simultaneously, checks the content on each page to verify relevancy, then returns a list of URLs that are precisely what you want with summaries of each that it just generated on the fly (meaning: They're up to date).

You can even do advanced stuff like, "find me ten songs on YouTube related to breakups and use this other site to convert those URLs to .ogg files and put them in my downloads folder."

Local, FOSS AI running on your own damned PC is fucking awesome. I seriously don't understand all the hate. It's the technology everyone's always wanted and it gets better every day.

[–] TheOneCurly@feddit.online 10 points 1 month ago* (last edited 1 month ago) (4 children)

checks the content on each page to verify relevancy,

No it can't do that. It's an LLM, it can only generate the next word in a sequence.

Also this doesn't solve OPs problem at all. If it's in the top 10 results on a major search engine then anyone can find it in minimal time.

Fucking AI bros being like I remade looking at 10 google links but this time it burns down a forest and tells me what a genius I am for asking.

[–] riskable@programming.dev 1 points 1 month ago (1 children)

No it can't do that. It's an LLM, it can only generate the next word in a sequence.

Your knowledge is out of date, friend. These days you can configure an LLM to run tools like curl, nmap, ping, or even write then execute shell scripts and Python (though, in a sandbox for security).

Some tools that help you manage the models are preconfigured to make it easy for them to search the web on your behalf. I wouldn't be surprised if there's a whole ecosystem of AI tools just for searching the web that will emerge soon.

What Mozilla is implementing in Firefox will likely start with cloud-based services but eventually it'll just be using local models, running on your PC. Then all those specialized AI search tools will become less popular as Firefox's built-in features end up being "good enough".

[–] TheOneCurly@feddit.online 1 points 1 month ago

I understand how agents work. I'm just saying it cannot "verify relevancy". That's a qualitative assessment that an LLM is incapable of doing. The scripts and regex that form the backbone of the "agent" can absolutely download a webpage and add the text contents to the context. But after that it's just random bullshit.

[–] kata1yst@sh.itjust.works 1 points 1 month ago (1 children)

Explain to me which forest burns down when I run an AI on my local computer that uses the same power (or less) as running a video game?

AI / LLMs aren't evil or unethical or immoral - commercializing them into enormous behemoths that eat resources 24x7 is.

[–] CorrectAlias@piefed.blahaj.zone 4 points 1 month ago (3 children)

Even local models are trained on stolen art and content. That's the immoral part.

[–] bridgeenjoyer@sh.itjust.works 3 points 1 month ago

No one seems to get this part.

[–] kata1yst@sh.itjust.works 2 points 1 month ago

There's many models that use open source training sets and weights. You can choose them.

[–] riskable@programming.dev -1 points 1 month ago* (last edited 1 month ago)

AI models aren't trained on anything "stolen". When you steal something, the original owner doesn't have it anymore. That's not being pedantic, it's the truth.

Also, if you actually understand how AI training works, you wouldn't even use this sort of analogy in the first place. It's so wrong it's like describing a Flintstones car and saying that's how automobiles work.

Let's say you wrote a book and I used it as part of my AI model (LLM) training set. As my code processes your novel, token-by-token (not word-by-word!), it'll increase or decrease a floating point value by something like 0.001. That's it. That's all that's happening.

To a layman, that makes no sense whatever but it's the truth. How can a huge list of floating point values be used to generate semi-intelligent text? That's the actually really fucking complicated part.

Before you can even use a model you need to tokenize the prompt and then perform an inference step which then gets processed a zillion ways before that .safetensors file (which is the AI model) gets used at all.

When an AI model is outputting text, it's using a random number generator in conjunction with a word prediction algorithm that's based on the floating point values inside the model. It doesn't even "copy" anything. It's literally built upon the back of an RNG!

If an LLM successfully copies something via it's model that is just random chance. The more copies of something that went into its training, the higher the chance of it happening (and that's considered a bug, not a feature).

There's also a problem that can occur on the opposite end: When a single set of tokens gets associated with just one tiny bit of the training set. That's how you can get it to output the same thing relatively consistently when given the same prompt (associated with that set of tokens). This is also considered a bug and AI researchers are always trying to find ways to prevent this sort of thing from happening.

[–] gigachad@piefed.social 1 points 1 month ago

As much as I understand your hate for LLMs, this is wrong.

[–] Onomatopoeia@lemmy.cafe 0 points 1 month ago* (last edited 1 month ago)

It does way more than that.

I have it write scripts in 30 seconds that would take me 2 days to write and verify.

I can quickly parse through what it writes (takes me about a minute) to verify it hasn't done anything wonky, then test it in a VM I use for testing my own scripts.

It does this because the question I ask is very clear and explicit: exact script language/version, exact input, exact output, how the script should flow, what commenting should look like. It takes me about 1 minute to write a good question like this.

I've setup Projects in it with specific rules so I don't have to state those rules every time - I have one for each scripting language. I've saved that definition so when I update the rules I have a local definition for reuse or share with my peers.

For informational searches I have it provide source links automatically - I have a lot of general knowledge so whenever it produces something that doesn't sit right I will steelman the information. It's surprising what it can come up with this way.

My friends say "you like to argue with it" - well sometimes arguing is necessary.

Haha, downvoters. What a joke.

[–] CorrectAlias@piefed.blahaj.zone 6 points 1 month ago (1 children)

LLMs still hallucinate what content is on the pages they search. It's not great.

[–] riskable@programming.dev 1 points 1 month ago

It depends on the size of the content on the page. As long as it's small enough to be contained within the context window, it should do a good job.

But that's all irrelevant since the point of the summary is just to give you a general idea of what's on the page. You'll still get the actual title and whatnot.

Using an LLM to search on your behalf is like using grep to filter out unwanted nonsense. You don't use it like, "I'm feeling lucky" and pray for answers. You still need to go and open the pages in the results to get at what you want.

[–] kata1yst@sh.itjust.works 2 points 1 month ago

Agreed, a locally running LLM is an amazing choice.

[–] classic@fedia.io 0 points 1 month ago (1 children)

is there a non- to middling-tech savvy option for a local ai?

[–] riskable@programming.dev 0 points 1 month ago (1 children)

Not at this point, no. Not unless you know how to setup/manage docker images and have a GPU with at least 16GB of VRAM.

Also, if you're not using Linux forget it. All the AI stuff anyone would want to run is a HUGE pain in the ass to run on Windows. The folks developing these models and the tools to use them are all running Linux. Both on their servers and on their desktops and it's obvious once you start reading the README.md for most of these projects.

Some will have instructions for Windows but they'll either be absolutely enormous or they'll hand wave away the actual complexity, "These instructions assume you know the basics of advanced rocket science and quantum mechanics."

[–] classic@fedia.io 2 points 1 month ago

i aupreciate this, thank you