License
Hi! Thank you for the great work. I may have missed it, but I could not find the license for this model/repository.
Would it also follow BigCode Open RAIL-M v1?
The model currently doesn't have a license; we just have the Terms of Use as part of the gated access. Does that work for you?
Is there any intention to release StarPII under the same Open RAIL-M license as StarCoder? Unfortunately, the current gated terms are too restrictive for our desired model use cases, in that we would like to leverage the model for more than just cleaning datasets.
Could you specify in what ways you plan to leverage the model? (Open RAIL-M is designed for text/code generation models, and StarPII doesn't fall in that category.)
Secrets detection in general, beyond just dataset cleaning. It could fill the same role as more traditional tooling, like trufflehog (https://github.com/trufflesecurity/trufflehog).
Thanks for the added information!
The restrictive Terms of Use are motivated by the strong dual-use potential of PII detectors: secrets detection in general can easily include malicious uses, such as scanning open repos for security vulnerabilities to exploit. Sharing the model therefore requires us to balance the risks and benefits, hence the restricted uses, as we do not currently have the capacity to work with potential users more closely.
TLDR: we won't be releasing the model under a more permissive license in the near future, but efforts are ongoing on the governance of sensitive models :)