What are you using to detect languages? I'm getting lots of false negatives.
Replies (2)
This thing:
And requiring a confidence score of at least 0.9. It was good in my tests but I've only tested Portuguese.
I'll decrease it to 0.85.
And relax the rate limit a little bit.
GitHub - pemistahl/lingua-go: The most accurate natural language detection library for Go, suitable for short text and mixed-language text
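For anyone following along, here is a rough sketch of what a confidence gate like the one described above might look like. It is not the actual code behind this thread. `Scorer` and `IsLanguage` are hypothetical names, and the scorer is a stand-in for whatever returns a 0..1 confidence for a given language (with lingua-go that would presumably be something like `ComputeLanguageConfidence`, but check the library docs for the exact call). The 0.87 score is made up to show why a message can be rejected at 0.9 and accepted at 0.85.

```go
package main

import "fmt"

// Scorer is a hypothetical stand-in for the real detector call.
// It returns a confidence between 0 and 1 that text is written in lang.
type Scorer func(text, lang string) float64

// IsLanguage reports whether text passes the confidence gate for lang.
// A message scoring below minConfidence is treated as "not that language",
// which is where false negatives come from when the threshold is too strict.
func IsLanguage(score Scorer, text, lang string, minConfidence float64) bool {
	return score(text, lang) >= minConfidence
}

func main() {
	// Fake scorer with an invented value, only to illustrate the threshold.
	fake := func(text, lang string) float64 { return 0.87 }

	fmt.Println(IsLanguage(fake, "bom dia", "pt", 0.90)) // false
	fmt.Println(IsLanguage(fake, "bom dia", "pt", 0.85)) // true
}
```

Lowering the threshold trades false negatives for false positives, so it is probably worth testing on more than one language before settling on a value.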
I'm using this library on adre.su too, but I have to say the confidence score didn't help me at all. I tried their "light mode", implemented rate limits, and in the end gave up on checking short messages. Honestly, none of it works very well. The best results came from manual calibration, specifying the similar languages for each language in use, but language detection as a whole takes a lot of resources, and the manual setup takes even more.
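A minimal sketch of the two workarounds mentioned above, skipping short messages and keeping a hand-maintained list of look-alike languages per language in use. Everything here is an assumption for illustration: the `similar` table, the language pairs in it, the `Candidates` and `WorthChecking` helpers, and the 20-character cutoff are all made up. The candidate list is what you would then hand to the detector (lingua-go's builder has a `FromLanguages(...)` option for restricting the set).

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// similar maps each language in use to the languages it tends to be
// confused with. Hypothetical, hand-maintained values.
var similar = map[string][]string{
	"pt": {"es", "gl"},
	"ru": {"uk", "bg"},
}

// Candidates returns the restricted set to build a detector from:
// the target language followed by its configured look-alikes.
func Candidates(lang string) []string {
	return append([]string{lang}, similar[lang]...)
}

// WorthChecking reports whether text has at least minRunes characters
// after trimming whitespace. Shorter messages are skipped entirely.
func WorthChecking(text string, minRunes int) bool {
	return utf8.RuneCountInString(strings.TrimSpace(text)) >= minRunes
}

func main() {
	fmt.Println(Candidates("pt"))        // [pt es gl]
	fmt.Println(WorthChecking("ok", 20)) // false
}
```

A smaller candidate set also means fewer language models to load, which may help with the resource cost, though the table still has to be maintained by hand.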