---
model-index:
- name: Sociovestix/lenu_PL
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: lenu
      type: Sociovestix/lenu
      config: PL
      split: test
      revision: f4d57b8d77a49ec5c62d899c9a213d23cd9f9428
    metrics:
    - type: f1
      value: 0.9930020993701889
      name: f1
    - type: f1
      value: 0.6630198501925706
      name: f1 macro
      args:
        average: macro
widget:
- text: "INSTYTUT DIABETOLOGII SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ"
- text: '"METAL-SYSTEM" OGRODZENIA - SCHODY SŁAWOMIR BINKOWSKI'
- text: "GERLACH S.A."
- text: "EMU SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ SPÓŁKA KOMANDYTOWA"
- text: "JEREMIE SEED CAPITAL WOJEWÓDZTWA POMORSKIEGO FUNDUSZ INWESTYCYJNY ZAMKNIĘTY W LIKWIDACJI"
- text: "MIASTO BIELSKO-BIAŁA"
- text: 'MARKETING" KRYSTIAN GDOWKA, ARTUR OSTRĘGA SPÓŁKA JAWNA'
- text: "Bank Spółdzielczy w Poddębicach"
- text: 'Fundacja Dzieciom "POMAGAJ"'
- text: "KANCELARIA RADCÓW PRAWNYCH BRUDKIEWICZ, SUCHECKA SPÓŁKA KOMANDYTOWO-AKCYJNA"
- text: "AKADEMIA MARYNARKI WOJENNEJ IM. BOHATERÓW WESTERPLATTE"
- text: "ZGROMADZENIE SIÓSTR URSZULANEK UNII RZYMSKIEJ DOM ZAKONNY"
- text: "STOWARZYSZENIE AUTORÓW ZAIKS"
- text: "SKAT TRANSPORT PROSTA SPÓŁKA AKCYJNA"
- text: "Nationale-Nederlanden Dobrowolny Fundusz Emerytalny Nasze Jutro 2055"
- text: "STORY HOUSE EGMONT SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ"
- text: "Narodowy Fundusz Ochrony Środowiska i Gospodarki Wodnej"
- text: 'ORGANIZACJA ZAKŁADOWA NSZZ "SOLIDARNOŚĆ" NR 3395 W T-MOBILE POLSKA S.A.'
- text: "CI GAMES SPÓŁKA EUROPEJSKA"
- text: "PPK Pocztylion 2040 Dobrowolny Fundusz Emerytalny"
- text: "TOWARZYSTWO UBEZPIECZEŃ WZAJEMNYCH POLSKI ZAKŁAD UBEZPIECZEŃ WZAJEMNYCH"
- text: "KABANEK JANINA POTORSKA ROBERT POTORSKI"
- text: "SPÓŁDZIELCZA KASA OSZCZĘDNOŚCIOWO-KREDYTOWA ENERGIA"
- text: "SZOSTEK_BAR I PARTNERZY KANCELARIA PRAWNA"
- text: "MIEJSKI ZARZĄD BUDYNKÓW MIESZKALNYCH"
- text: "IZBA ADWOKACKA W KATOWICACH"
- text: '1. Niepubliczny Specjalistyczny Zakład Opieki Zdrowotnej "LUNG" Krzysztof Garbino 2. Drukarnia "GARBINO"'
---

# LENU - Legal Entity Name Understanding for Poland

A Polish BERT (uncased) model fine-tuned on Polish legal entity names (jurisdiction PL) from the Global [Legal Entity Identifier](https://www.gleif.org/en/about-lei/introducing-the-legal-entity-identifier-lei)
(LEI) System to detect [Entity Legal Form (ELF) Codes](https://www.gleif.org/en/about-lei/code-lists/iso-20275-entity-legal-forms-code-list).

---------------

<h1 align="center">
<a href="https://gleif.org">
<img src="http://sdglabs.ai/wp-content/uploads/2022/07/gleif-logo-new.png" width="220px" style="display: inherit">
</a>
</h1><br>
<h3 align="center">in collaboration with</h3>
<h1 align="center">
<a href="https://sociovestix.com">
<img src="https://sociovestix.com/img/svl_logo_centered.svg" width="700px" style="width: 100%">
</a>
</h1><br>

---------------

## Model Description

The model was created as part of a collaboration between the [Global Legal Entity Identifier Foundation](https://gleif.org) (GLEIF) and
[Sociovestix Labs](https://sociovestix.com) to explore how machine learning can help detect the ELF Code based solely on an entity's legal name and legal jurisdiction.
See also the open-source Python library [lenu](https://github.com/Sociovestix/lenu), which supports this task.

The model was trained on the [lenu](https://huggingface.co/datasets/Sociovestix) dataset, with a focus on Polish legal entities and ELF Codes within the jurisdiction "PL".

- **Developed by:** [GLEIF](https://gleif.org) and [Sociovestix Labs](https://huggingface.co/Sociovestix)
- **License:** Creative Commons (CC0) license
- **Fine-tuned from model:** dkleczek/bert-base-polish-uncased-v1
- **Resources for more information:** [Press Release](https://www.gleif.org/en/newsroom/press-releases/machine-learning-new-open-source-tool-developed-by-gleif-and-sociovestix-labs-enables-organizations-everywhere-to-automatically-)

# Uses

An entity's legal form is a crucial component when verifying and screening organizational identity.
The wide variety of entity legal forms that exist within and between jurisdictions, however, has made it difficult for large organizations to capture legal form as structured data.
The jurisdiction-specific models of [lenu](https://github.com/Sociovestix/lenu), trained on entities from
GLEIF’s Legal Entity Identifier (LEI) database of over two million records, allow banks,
investment firms, corporations, governments, and other large organizations to retrospectively analyze
their master data, extract the legal form from the unstructured text of the legal name, and
uniformly apply an ELF code to each entity type, according to the ISO 20275 standard.
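
As a minimal sketch of this workflow, the snippet below runs legal names through the generic `text-classification` pipeline of the `transformers` library. It assumes the model's predicted labels are the ELF codes themselves; the helper names `load_classifier` and `suggest_elf_codes` are hypothetical and not part of the lenu library.

```python
from typing import Callable, Dict, List, Optional

MODEL_ID = "Sociovestix/lenu_PL"  # model name taken from this card's metadata


def load_classifier() -> Callable:
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def suggest_elf_codes(
    names: List[str], classifier: Optional[Callable] = None
) -> List[Dict]:
    """Return one {"name", "elf_code", "score"} record per legal name.

    Assumption: each predicted label is an ELF code string.
    """
    if classifier is None:
        classifier = load_classifier()
    # For a list of strings the pipeline returns one {"label", "score"} dict per input.
    predictions = classifier(names)
    return [
        {"name": name, "elf_code": pred["label"], "score": pred["score"]}
        for name, pred in zip(names, predictions)
    ]
```

Calling `suggest_elf_codes(["GERLACH S.A."])` without a `classifier` argument downloads the model on first use. Passing a pre-built pipeline avoids reloading it for every batch.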

# Licensing Information

This model, which is trained on LEI data, is available under a Creative Commons (CC0) license.
See [gleif.org/en/about/open-data](https://gleif.org/en/about/open-data).

# Recommendations

Users should always consider the score of the suggested ELF Codes. When the score is low, manual review of the affected entities may be necessary.
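
One way to act on this recommendation is to route low-scoring suggestions to a manual review queue. The sketch below assumes predictions in the `{"label", "score"}` shape returned by a `transformers` text-classification pipeline. The function name `split_by_score` and the 0.8 cut-off are hypothetical; a suitable threshold should be chosen on your own validated data.

```python
from typing import Dict, List, Tuple

REVIEW_THRESHOLD = 0.8  # hypothetical cut-off, not a value recommended by the model authors


def split_by_score(
    predictions: List[Dict], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[Dict], List[Dict]]:
    """Split predictions into (accepted, needs_review) by their score."""
    accepted = [p for p in predictions if p["score"] >= threshold]
    needs_review = [p for p in predictions if p["score"] < threshold]
    return accepted, needs_review


# Placeholder labels for illustration only; real labels come from the model.
sample = [
    {"label": "AAAA", "score": 0.97},
    {"label": "BBBB", "score": 0.41},
]
accepted, needs_review = split_by_score(sample)
```

Here the first prediction is accepted automatically and the second is set aside for a person to check.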