may have spent the last 8 hours building a fulltext search engine in nostrdb. I made the index as space efficient as possible: the keys are stored in a compressed format and map words and word indices to note ids. So when you type “the quick brown fox” it will be able to return results with those exact words in sequence (or not in sequence, if it can’t find one). Testing it now 👀. Will release soon ™
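
To make the idea concrete, here is a minimal Python sketch of a positional index, assuming keys of the form (word, word index) that map to sets of note ids. The class name `PositionalIndex`, its methods, and the in-memory dictionaries are hypothetical illustrations; nostrdb's actual key layout and compression are not shown in the post and are not reproduced here.

```python
from collections import defaultdict


class PositionalIndex:
    """Toy positional index: (word, word_index) -> note ids (assumed layout)."""

    def __init__(self):
        self.postings = defaultdict(set)  # (word, position) -> {note_id}
        self.by_word = defaultdict(set)   # word -> {note_id}, for the fallback
        self.max_pos = 0                  # highest word position seen so far

    def add(self, note_id, text):
        for pos, word in enumerate(text.lower().split()):
            self.postings[(word, pos)].add(note_id)
            self.by_word[word].add(note_id)
            self.max_pos = max(self.max_pos, pos)

    def search(self, query):
        """Return (note_ids, exact): exact is True when the words were found in sequence."""
        words = query.lower().split()
        if not words:
            return [], False
        exact = set()
        # Try every possible start position for the phrase.
        for start in range(self.max_pos + 1):
            hits = set.intersection(
                *(self.postings.get((w, start + i), set()) for i, w in enumerate(words))
            )
            exact |= hits
        if exact:
            return sorted(exact), True
        # No sequence found: fall back to notes that contain every word in any order.
        loose = set.intersection(*(self.by_word.get(w, set()) for w in words))
        return sorted(loose), False


idx = PositionalIndex()
idx.add("n1", "The quick brown fox jumps")
idx.add("n2", "brown quick fox the")
idx.add("n3", "a quick dog")
```

Searching for "the quick brown fox" matches only the note with the words in sequence, while a query like "fox quick" has no in-sequence match and falls back to notes containing both words.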

Replies (51)

Idea: limit the index to, say, the 64000 most used words (including plurals and other variations), names (first and last), materials and brands. That keeps typos and rare words out of the index; sanitizing the index makes its size much more manageable.
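
A rough Python sketch of that suggestion, assuming the vocabulary is simply the most frequent words seen across notes. The function names and the whitespace tokenizer are hypothetical, and the handling of names, brands and word variations is left out.

```python
from collections import Counter


def build_vocabulary(notes, max_words=64000):
    """Keep only the max_words most frequent words across all notes."""
    counts = Counter(word for note in notes for word in note.lower().split())
    return {word for word, _ in counts.most_common(max_words)}


def indexable_words(text, vocabulary):
    """Drop words outside the vocabulary (typos, rare words) before indexing."""
    return [word for word in text.lower().split() if word in vocabulary]


notes = ["the fox", "the dog", "the fxo"]
vocab = build_vocabulary(notes, max_words=1)
```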
Full text search in 8 hours. Anyone impressed yet? In the last 3 weeks I changed a banner on the home page of the popular website.
techfeudalist 2 years ago
If you’re looking for a quick and dirty way to add fuzzy search and stemming, try tokenizing the lowercase string into character triplets, including spaces: [" th", "e q", "uic", "k b", "row", "n f", "ox "] and sorting by highest count of matching tokens.
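
A small Python sketch of the triplet idea, under a few assumptions: the string is padded with one space on each side, and scoring uses overlapping (sliding-window) triplets by default, which is a common variant rather than exactly what the reply shows. Passing `step=3` reproduces the non-overlapping chunks from the example. The function names are made up for illustration.

```python
from collections import Counter


def trigrams(text, step=1):
    """Split lowercase, space-padded text into 3-character tokens."""
    padded = f" {text.lower()} "
    return [padded[i:i + 3] for i in range(0, len(padded) - 2, step)]


def score(query, doc):
    """Count the trigram tokens shared by query and doc (multiset overlap)."""
    shared = Counter(trigrams(query)) & Counter(trigrams(doc))
    return sum(shared.values())


def rank(query, docs):
    """Sort docs by highest count of matching tokens, dropping non-matches."""
    scored = [(score(query, doc), doc) for doc in docs]
    scored.sort(key=lambda pair: -pair[0])
    return [doc for s, doc in scored if s > 0]


docs = ["the quick brown fox", "a slow green turtle", "quick brown foxes"]
ranked = rank("quik brown fox", docs)  # note the typo in "quik"
```

Because a misspelled or inflected word still shares most of its triplets with the original, the typo query above still ranks the fox notes first.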
I was going to look into stemming/lemmatization after. Keeping it simple on the first pass
dextryz 2 years ago
Perfect. I literally implemented Elastic into my client yesterday and concluded we have to do better
It's a feature of nostrdb, which has nothing to do with damus, but damus does use nostrdb
look at the difference between searching in tidal (has to be exact) and youtube (does not need to be exact)
Ah sweet. This will greatly help in-app discovery for the apps that choose to use nostrdb then. Discovery in amethyst isn't great. I use nostr.band as a workaround usually. Thanks Will.