@marsxyz Thank you for initiating the YaCy integration in Open Web-UI here: https://github.com/open-webui/open-webui/discussions/9888#discussion-7960142
That's a great initiative, and yes, combining a local LLM with a local search engine is a good idea!
Open Source Developer and Search Engine Creator,
Maintainer of YaCy.net and SUSI.ai, Research on AI in Information Retrieval
Follow me on Patreon for news about YaCy: https://www.patreon.com/orbiterlab
Yesterday I finally did something I had wanted to do for quite some time: have a local AI that can query the web locally without relying on Google or other search engines.
In its last update, Open WebUI added support for YaCy as a search provider. YaCy is an open-source, distributed search engine that does not rely on a central index but on distributed peers indexing pages themselves. I had already tried YaCy in the past, but the problem is that its result-ranking algorithm is poor, so on its own it is not really usable as a search engine. Of course, a small open-source program that can run on practically anything (the server it ran on for this experiment is a 12th-gen Celeron with 8 GB of RAM) cannot compete with companies like Google or Microsoft on the intelligence of its ranking algorithm. It was practically unusable.
Or it was! Coupled with an LLM, the LLM can sort out the trash results from YaCy and keep what is useful!
That means we can now have self-hosted AI models that learn from the web ... without relying on Google or any central entity at all!
Some caveats: 1. Of course this is inferior to using Google or even DuckDuckGo; I just wanted to share it here because I think you'll find it cool. 2. You need a solid CPU to handle many concurrent searches; my Celeron gets hammered to 100% usage on every query. (Open WebUI and a bunch of other services are running on this server, which surely doesn't help.)
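As a rough sketch of the pipeline described above: a minimal Python client that pulls raw results from a local YaCy peer, which could then be handed to an LLM for filtering. This assumes a default YaCy instance on localhost:8090 and YaCy's OpenSearch-style `yacysearch.json` endpoint; the exact field names in the JSON response are assumptions worth checking against your own instance.

```python
import json
import urllib.parse
import urllib.request

def extract_results(payload):
    """Pull (title, link, description) tuples out of a YaCy
    yacysearch.json response (OpenSearch-style JSON: the hits live
    under channels[0].items). Returns [] if the shape is unexpected."""
    items = payload.get("channels", [{}])[0].get("items", [])
    return [
        (it.get("title", ""), it.get("link", ""), it.get("description", ""))
        for it in items
    ]

def search_yacy(query, host="http://localhost:8090", count=10):
    """Query a local YaCy peer. The raw ranking is weak, so the point
    is to fetch more results than needed and let an LLM sort them."""
    url = (
        f"{host}/yacysearch.json?query={urllib.parse.quote(query)}"
        f"&maximumRecords={count}&resource=global"
    )
    with urllib.request.urlopen(url, timeout=30) as resp:
        return extract_results(json.load(resp))
```

The LLM step is then just prompt engineering: paste the tuples into the prompt and ask the model to keep only the entries relevant to the user's question, which is essentially what Open WebUI's web-search RAG loop does with the provider's results.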
How does ChatGPT work, _exactly_? The comprehensible and truly complete explanation in one talk (in German): https://www.youtube.com/watch?v=T7K2SmqlzOI
slightly off-topic - I made a thing:
https://www.thingiverse.com/thing:7007980
Pick and Click Bit Rack Reversed
Video of the talk
"Using Open Data in AI"
from the Chemnitzer Linuxtage 2025 @clt_news https://media.ccc.de/v/clt25-269-open-datafreie-daten-in-ki-chatbots-nutzen
Slides from the talk
"Superhuman AI: How and When Is AI Useful?"
at the @clt_news Chemnitzer Linuxtage 2025 https://yacy.net/material/20250322_CLT2025_Superhuman_AI_Wie_und_wann_ist_KI_nuetzlich.pdf
Today, 14:30 (+07) at #FOSSASIA Bangkok:
"The Complete Anatomy of ChatGPT: A Precise Breakdown of LLMs and Transformers"
Tune in live at
https://eventyay.com/e/4c0e0c27/session/9472
Slides can be downloaded from
https://yacy.net/material/20250313_FOSSASIA_2025_The_Complete_Anatomy_of_ChatGPT.pdf
@hieronymus @ArneBab it's Solr; it always takes all the RAM it is given. That's what it does to be efficient.
DietPi (lightweight Debian OS for SBCs) comes with #yacy https://dietpi.com/docs/software/distributed_projects/
With #YaCy at #CLT2025 / Chemnitzer Linuxtage: blog post by Frank:
https://do3eet.pages.dev/post/clt2025yacy/
#DeepSeek-V3 Leading in Superhuman Coding: new leading position in the PE-Bench-Python-100, showing a 15.58-fold #superhuman performance.
See: https://github.com/Orbiter/project-euler-llm-benchmark
YaCy meetup at #38c3 - I will be at the FOSSASIA table in the Critical Decentralization Cluster, watch out for the YaCy flag! If you want to meet me, just come by. See also: https://community.searchlab.eu/t/yacy-meetup-38c3/1705
@klokanek You can meet me now or any time in the next four days at 38c3: see my posting in https://community.searchlab.eu/t/yacy-meetup-38c3/1705
The PE-Bench-Python-100 test results (checking LLMs ability to code the Project Euler problems) are so far in this table. Read about it in https://github.com/Orbiter/project-euler-llm-benchmark/
@rnbwdsh ah yes, I just replied to this. Your ideas are welcome if you want to contribute.
However, my goal is to use the test results in a follow-up project where I want to build an auto-coder that reads tickets and produces pull requests. So we can collect best practices here.
@rnbwdsh the current prompt deliberately does not use any CoT, because I also want to measure how much different prompt strategies influence the result; that comes later. Templates are here: https://github.com/Orbiter/project-euler-llm-benchmark/blob/main/templates/template_python.md
@rnbwdsh my computer is already cooking hard, and I want to build a large table. Currently the largest models are first in the queue.
I also want to measure the influence of different quantization levels, because at this point I doubt that a large model with strong quantization beats a smaller one with less quantization. I don't know of any benchmark that shows that.
@rnbwdsh yes, Project Euler has been around for about 10 years and started with ca. 160 problems. I also believe that most LLMs have seen solutions, though probably mostly in Python. I don't think this carries over too strongly to other programming languages, so benchmarks in Java etc. may be more significant.
My goal is to evaluate the ability of LLMs to perform in an auto-coder engine, and there I would not mind if those models perform well because of some 'cheating'.
I am creating an LLM benchmark series, "PE-Bench-Python-100", to measure how good LLMs are at coding the Project Euler challenges compared to humans. The series will also include languages other than Python, such as Java.
The first result so far is interesting, because even the small model llama3.2 (3B) shows a two-fold superhuman performance (score: 2.14).
Source: https://github.com/Orbiter/project-euler-llm-benchmark/