do-me 
posted an update Mar 29
Hey, I just added three useful advanced use cases to do-me/SemanticFinder.
SemanticFinder is a collection of embeddings for public documents or books. You can create your own index file from any text or PDF and save it without installing or downloading anything. Try it yourself:

1. Translating from 100+ languages to English (even though it might confuse a strawberry with a grapefruit ;D): https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_70320cde&firstOnly=true&inferencingActive=False
2. Finding English synonyms: https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_0d1e28dc&firstOnly=true&inferencingActive=False
3. The "universal index idea": create an embedding index with 30k English words and reuse it on unseen texts. You can decide to fill the gaps in the index by additional inferencing or just stick to the 30k index for instant semantic similarity.
Initial idea: https://github.com/do-me/SemanticFinder/discussions/48
Try here: https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_0d1e28dc&inferencingActive=False&universalIndexSettingsWordLevel with a text of your choice.
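
To make the universal index idea concrete, here is a minimal Python sketch of it. It is not SemanticFinder's actual code: `toy_embed` is a made-up stand-in (hashed character trigrams) for a real embedding model, and `build_index`, `rank_words` and the `fill_gaps` flag are hypothetical names chosen for illustration. The only point is the flow: embed a fixed word list once, then reuse those vectors on unseen text and either skip unknown words or embed them on demand.

```python
import math
import re
import zlib
from typing import Callable, Dict, Iterable, List, Tuple

Vector = List[float]


def toy_embed(word: str, dim: int = 64) -> Vector:
    """Placeholder for a real embedding model (assumption): hashed character
    trigrams, L2-normalised so that a dot product equals cosine similarity."""
    vec = [0.0] * dim
    padded = f" {word.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        vec[zlib.crc32(trigram.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def build_index(words: Iterable[str],
                embed: Callable[[str], Vector] = toy_embed) -> Dict[str, Vector]:
    """Embed the word list once; this is the reusable 'universal' index."""
    return {w.lower(): embed(w) for w in words}


def rank_words(text: str, query: str, index: Dict[str, Vector],
               embed: Callable[[str], Vector] = toy_embed,
               fill_gaps: bool = False) -> List[Tuple[str, float]]:
    """Score each distinct word of `text` against `query`, best match first."""
    query_vec = index.get(query.lower())
    if query_vec is None:
        query_vec = embed(query)
    # Distinct words in order of first appearance.
    words = list(dict.fromkeys(re.findall(r"[a-z']+", text.lower())))
    scored = []
    for word in words:
        vec = index.get(word)
        if vec is None:
            if not fill_gaps:
                continue  # index only: instant, but unknown words are skipped
            vec = embed(word)  # extra inference for the gap
            index[word] = vec  # remember it for later searches
        scored.append((word, sum(a * b for a, b in zip(query_vec, vec))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


index = build_index(["climate", "weather", "banana"])
text = "The climate shapes the weather and the climatic zones"
print(rank_words(text, "climate", index))                  # index words only
print(rank_words(text, "climate", index, fill_gaps=True))  # gaps embedded too
```

With `fill_gaps=False` the search touches only precomputed vectors, which is where the "instant" part comes from. With `fill_gaps=True` every unknown word costs one extra embedding call but then becomes part of the index.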

This could be enhanced by adding duplets or triplets (two- or three-word phrases) like "climate change" or "greenhouse gas". Eventually I'd like to set up vector DB integrations.
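
One possible way to pick such phrases, sketched in Python under my own assumptions: count recurring two- and three-word sequences in a reference text and keep the frequent ones as extra index entries. The `frequent_phrases` helper and its `min_count` threshold are hypothetical and not part of SemanticFinder.

```python
import re
from collections import Counter
from typing import List


def frequent_phrases(text: str, n: int, min_count: int = 2) -> List[str]:
    """Return n-word phrases that occur at least `min_count` times."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return [gram for gram, count in grams.most_common() if count >= min_count]


sample = (
    "Climate change is real. Climate change drives greenhouse gas policy, "
    "and greenhouse gas emissions matter."
)
print(frequent_phrases(sample, 2))  # candidate duplets
print(frequent_phrases(sample, 3))  # candidate triplets
```

The selected phrases would then be embedded once and stored alongside the single words.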

Super happy to hear your feedback, ideas and maybe even contributions! :)

---
Edit: Apparently markdown URL formatting only works for HF links.