Tag: BM25
-
Hybrid Search
Andrew
I’ve already written a post on using piecewise-linear scaling to bring BM25 and my semantic score (from cosine similarity with our embeddings) into the same numerical space. After performing this scaling, I found that some important results weren’t scoring as well as they should. In particular, any search query with a common term (e.g. “the”…
-
Score Normalization
Andrew
Currently, I have two different scoring functions: BM25 and the semantic scoring function that comes from our sentence embedding. These scores fall in very different ranges, but they need to be combined into a final score. It’s not simply a matter of assigning different weights to these scores. We need to stretch them out to make…
