Show HN: I Wrote a Full Text Search Engine from Scratch in Go

https://news.ycombinator.com/rss Hits: 12
Summary

Blaze

A high-performance full-text search engine in Go with inverted indexing, boolean queries, phrase search, proximity queries, and BM25 ranking, powered by a flexible query engine, roaring bitmaps, and skip lists.

Overview

Blaze is a Go engine that provides fast full-text search capabilities through an inverted index implementation. It's designed for applications that need to search through text documents efficiently without relying on external search engines.

Key Highlights:

Inverted Index: Maps terms to document positions for instant lookups
Skip Lists: Probabilistic data structure providing O(log n) operations
Query Builder: Type-safe, fluent API for boolean queries with roaring bitmaps
Advanced Search: Phrase search, BM25 ranking, proximity ranking, and boolean queries
BM25 Algorithm: Industry-standard relevance scoring with IDF and length normalization
Text Analysis: Tokenization, stemming, stopword filtering, and case normalization
Thread-Safe: Concurrent indexing with mutex protection
Serialization: Efficient binary format for persistence

Features

Search Capabilities

Term Search: Find documents containing specific terms
Phrase Search: Exact multi-word matching ("quick brown fox")
Boolean Queries: Type-safe AND, OR, NOT operations with query builder
BM25 Ranking: Industry-standard relevance scoring (used by Elasticsearch, Solr)
...

First seen: 2025-10-09 18:20

Last seen: 2025-10-10 05:22