bge-m3
Optimised BAAI/bge-m3 serving with dense + sparse + ColBERT embeddings, async dynamic batching, and pipelined GPU inference
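
As a rough illustration of what "async dynamic batching" can mean in a serving layer like this one, the sketch below queues concurrent requests and flushes them to the encoder either when a batch is full or when a short wait window expires. It is not taken from this repository: `DynamicBatcher`, `fake_encode`, `max_batch_size`, and `max_wait_s` are hypothetical names, and `fake_encode` is a stand-in that returns text lengths where the real model call would go.

```python
import asyncio
from typing import Callable, List, Sequence


def fake_encode(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for a batched model call (not the real bge-m3 encoder)."""
    return [[float(len(t))] for t in texts]


class DynamicBatcher:
    """Groups concurrent requests into batches before calling `encode_batch`."""

    def __init__(self, encode_batch: Callable, max_batch_size: int = 8,
                 max_wait_s: float = 0.01) -> None:
        self.encode_batch = encode_batch
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue: asyncio.Queue = asyncio.Queue()
        self.batch_sizes: List[int] = []  # recorded only for inspection

    async def submit(self, text: str) -> List[float]:
        """Enqueue one text and wait for its result."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((text, fut))
        return await fut

    async def run(self) -> None:
        """Worker loop: wait for one item, then fill the batch until full or timed out."""
        loop = asyncio.get_running_loop()
        while True:
            items = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(items) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            texts = [text for text, _ in items]
            self.batch_sizes.append(len(texts))
            try:
                # Run the blocking encoder in a thread so the event loop stays responsive.
                results = await loop.run_in_executor(None, self.encode_batch, texts)
            except Exception as exc:
                for _, fut in items:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), result in zip(items, results):
                if not fut.done():
                    fut.set_result(result)


async def demo():
    """Send five concurrent requests through a batcher capped at four per batch."""
    batcher = DynamicBatcher(fake_encode, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.run())
    texts = ["a", "bb", "ccc", "dddd", "eeeee"]
    results = await asyncio.gather(*(batcher.submit(t) for t in texts))
    worker.cancel()
    return results, batcher.batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The two knobs trade latency for throughput: a larger `max_wait_s` gives more requests time to join a batch, while `max_batch_size` bounds how much work a single encoder call receives.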
Self-hosted Chinese personal memory graph. Six sources, two LLMs, one graph.