overview for ColinHayhurst

Mojeek API with Open WebUI in c/selfhosted@lemmy.world

ColinHayhurst@lemmy.world 1 point 13 hours ago

We have submitted a PR to the Open WebUI repo, which would enable Mojeek: https://github.com/open-webui/open-webui/discussions/6588

Google DOJ Trial Exhibit Files, Documents & Responses. in c/technology@lemmy.world

ColinHayhurst@lemmy.world 1 point 1 month ago

Excellent reporting on the trials: https://www.bigtechontrial.com/

Google is no longer asking — feed the AI or you’re not in search results in c/technology@lemmy.world

ColinHayhurst@lemmy.world 2 points 2 months ago

Yes.

Google is no longer asking — feed the AI or you’re not in search results in c/technology@lemmy.world

ColinHayhurst@lemmy.world 15 points 2 months ago

Some discussion on that here: https://lemmy.world/comment/11859761

Google is no longer asking — feed the AI or you’re not in search results in c/technology@lemmy.world

ColinHayhurst@lemmy.world 2 points 2 months ago

Where is your evidence for that? It used to be Bing and Yandex, but now it's just Bing. They use other non-search-engine APIs and do a small amount of crawling, AFAIK. Details of who uses what are here: https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/

Google is no longer asking — feed the AI or you’re not in search results in c/technology@lemmy.world

ColinHayhurst@lemmy.world 29 points 2 months ago* (last edited 2 months ago)

You should put these entries into your robots.txt file.

To block the Google search crawler from your whole site, use:

User-agent: Googlebot
Disallow: /

To block Google's AI use of your site, use the Google-Extended token:

User-agent: Google-Extended
Disallow: /
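
If you want to check the rules before deploying them, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the AI control token is Google-Extended, as Google documents it; the helper name is_allowed, the example.com URL, and the MojeekBot control case are just illustrative. Python's matching is simpler than what Google's own parser does, so treat this as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# The two rule groups from the comment above. Each User-agent line is
# followed directly by its Disallow line; a blank line separates groups.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # MojeekBot is included only as a crawler that no rule group names.
    for agent in ("Googlebot", "Google-Extended", "MojeekBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/page")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Crawlers that are not named in any group, and have no `User-agent: *` group to fall back on, stay allowed.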

Any “small-web” search engines? in c/technology@lemmy.world

ColinHayhurst@lemmy.world 6 points 2 months ago

Yes, it was. Matt Wells closed it down just over one year ago.

Any “small-web” search engines? in c/technology@lemmy.world

ColinHayhurst@lemmy.world 5 points 2 months ago

Any “small-web” search engines? in c/technology@lemmy.world

ColinHayhurst@lemmy.world 4 points 2 months ago

https://system1.com/ is an adtech company that syndicates Bing and/or Google.

Any “small-web” search engines? in c/technology@lemmy.world

ColinHayhurst@lemmy.world 8 points 2 months ago* (last edited 2 months ago)

We'd love to build a distributed search engine, but I think it would be too slow. When you send us a query, we search 8 billion+ pages and bring back the top 10, 20, ... up to 1,000 results. For a good service we need to do that within 200 ms, so the index has to be centralised. It took years, several iterations, and our carefully designed algorithms and architecture to make something that fast. No doubt Google, Bing, Yandex and Baidu went through similar hoops. Maybe I'm wrong, and/or someone can make it work with our API.
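
To picture why latency matters here, below is a toy sketch of a scatter-gather query: each index shard returns its own best k hits, and a merge step picks the global top k. This is an illustration only, not a description of how Mojeek's engine works; the names Hit, shard_top_k and merge_top_k, and the sample scores, are made up. The point is that the merge cannot finish until the slowest shard has answered, which is easy to keep fast inside one data centre and much harder across volunteer machines on the open internet.

```python
import heapq
from typing import Iterable, List, Tuple

# A hit is (relevance score, url); higher scores rank first.
Hit = Tuple[float, str]


def shard_top_k(hits: Iterable[Hit], k: int) -> List[Hit]:
    """Best k hits from a single shard, highest score first."""
    return heapq.nlargest(k, hits)


def merge_top_k(per_shard: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Global best k across shards; needs every shard's answer before it can return."""
    return heapq.nlargest(k, (hit for shard in per_shard for hit in shard))


if __name__ == "__main__":
    # Hypothetical shards with made-up scores.
    shards = [
        [(0.91, "a.example"), (0.40, "b.example")],
        [(0.87, "c.example"), (0.95, "d.example")],
        [(0.10, "e.example")],
    ]
    top = merge_top_k((shard_top_k(s, 2) for s in shards), 3)
    for score, url in top:
        print(f"{score:.2f} {url}")
```

With real data each shard would hold a slice of the index, and the 200 ms budget would have to cover the slowest shard's search plus the network round trip and the merge.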