Many people reading this will call bullshit on the performance improvement metrics, and honestly, fair. I too thought the agents would stumble in hilarious ways trying, but they did not. To demonstrate that I am not bullshitting, I also decided to release a simpler Rust-with-Python-bindings project today: nndex, an in-memory vector “store” that is designed to retrieve the exact nearest neighbors as fast as possible (and has fast approximate NN too), and is now available as open source on GitHub. It leverages the dot product, which is one of the simplest matrix ops and is therefore heavily optimized by existing libraries such as Python’s numpy…and yet after a few optimization passes, it tied numpy even though numpy leverages BLAS libraries for maximum mathematical performance. Naturally, I instructed Opus to also add support for BLAS with more optimization passes, and it is now 1-5x numpy’s speed in the single-query case and much faster with batch prediction. It’s so fast that even though I also added GPU support for testing, the GPU is mostly ineffective below 100k rows because the GPU dispatch overhead is greater than the actual retrieval time.
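For a sense of what "exact nearest neighbors via the dot product" means, here is a minimal pure-Python sketch. It is not nndex's API or implementation: the names `normalize` and `exact_top_k` are made up for illustration, and it assumes cosine similarity, i.e. the dot product of unit-length vectors. A real implementation would do the same arithmetic as one matrix-vector product in numpy or BLAS.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def exact_top_k(rows, query, k):
    """Brute-force exact search: score every row, keep the k best.

    Returns a list of (row_index, similarity) pairs, best match first.
    """
    q = normalize(query)
    scored = []
    for index, row in enumerate(rows):
        unit_row = normalize(row)
        # The dot product of two unit vectors is their cosine similarity.
        similarity = sum(a * b for a, b in zip(unit_row, q))
        scored.append((index, similarity))
    # No approximation: every row is compared against the query.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    for index, similarity in exact_top_k(rows, [1.0, 0.1], k=2):
        print(index, round(similarity, 4))
```

Every row is scored, so the cost grows linearly with the number of rows. That is why the speed of the underlying dot product matters so much for this kind of search.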
pony 0.8493 0.8383 -0.0109 0.8286 0.8173 -0.0113
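Since such a big change has already been made, it will not carry over much of the earlier logic: the heuristics (experience-based rules), meaning the many rules and workarounds used to solve today's problems. This is also the most important principle behind letting data and models keep scaling (continuing to improve model capability by increasing data volume, model parameter count, and compute): add as little else as possible.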
crawler and can crawl my blog (Wandering Thoughts).