Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Posts 22

Post
📢 Quickly applying a named entity recognition (NER) model to a vast amount of text usually runs into two major pitfalls:
🔴 The limited size of the input window
🔴 A drastic slowdown of the whole application's downstream pipeline

https://github.com/nicolay-r/bulk-ner

To address these problems, bulk-ner is a no-strings framework that provides a handy wrapper over any dynamically linked NER ML model, offering:
☑️ Native handling of long input contexts.
☑️ Native batching support (assuming the ML model engine supports batching too).
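
To illustrate the two ideas above, here is a minimal sketch in plain Python. It is not the bulk-ner API: the names `split_into_windows`, `batched`, `annotate_long_text`, and `toy_model` are hypothetical, and the "model" is a regex stand-in for any engine that accepts a batch of strings. The long text is cut into windows of a few whitespace-separated tokens, the windows are sent to the model in batches, and the predicted spans are shifted back to offsets in the original text.

```python
import re
from typing import Callable, Iterator, List, Tuple

# A span is (start, end, label) in character offsets.
Span = Tuple[int, int, str]
# Assumed model interface: a batch of strings in, one list of spans per string out.
Model = Callable[[List[str]], List[List[Span]]]


def split_into_windows(text: str, max_tokens: int) -> Iterator[Tuple[int, str]]:
    """Yield (offset, chunk) pairs, each chunk holding at most `max_tokens` tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = list(re.finditer(r"\S+", text))
    for i in range(0, len(tokens), max_tokens):
        group = tokens[i:i + max_tokens]
        start, end = group[0].start(), group[-1].end()
        yield start, text[start:end]


def batched(items: List, size: int) -> Iterator[List]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def annotate_long_text(text: str, model: Model,
                       max_tokens: int = 3, batch_size: int = 2) -> List[Span]:
    """Run `model` over windows of `text` in batches; return spans in text offsets."""
    spans: List[Span] = []
    chunks = list(split_into_windows(text, max_tokens))
    for batch in batched(chunks, batch_size):
        # One model call per batch instead of one call per window.
        results = model([chunk for _, chunk in batch])
        for (offset, _), chunk_spans in zip(batch, results):
            # Shift chunk-local offsets back into the original text.
            spans.extend((s + offset, e + offset, label) for s, e, label in chunk_spans)
    return sorted(spans)


def toy_model(batch: List[str]) -> List[List[Span]]:
    """Stand-in for a real NER engine: tags every capitalized word as ENT."""
    return [[(m.start(), m.end(), "ENT") for m in re.finditer(r"\b[A-Z][a-z]+\b", chunk)]
            for chunk in batch]


if __name__ == "__main__":
    sample = "Alice met Bob in Paris and then Carol flew to Berlin"
    for start, end, label in annotate_long_text(sample, toy_model):
        print(sample[start:end], label)
```

Note that this naive windowing can cut a multi-word entity at a window boundary; overlapping windows plus span merging would be needed to handle that case.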

For a quick start, I'm sharing a wrapper over DeepPavlov NER models.
You can try these models and bulk-annotate your data here:
📙 https://colab.research.google.com/github/nicolay-r/ner-service/blob/main/NER_annotation_service.ipynb
(Your data has to be in CSV / JSONL format)

Lastly, it is powered by AREkit pipelines and can therefore be part of relation extraction and complex information retrieval systems:
💻 https://github.com/nicolay-r/AREkit
📄 https://openreview.net/forum?id=nRybAsJMUt
Post
📢 If you work on emotion and emotion cause extraction and wish to bring something new to Generative AI, then the repository

https://github.com/declare-lab/conv-emotion

is worth checking out as a survey of conventional approaches.

💎 It hosts implementations of the most conventional methods, which remain relevant even today in the era of Generative AI. For example, variations of LSTM-based encoders still represent a solid framework for this task, as seen in SemEval-2024 Task 3, where several papers exploit this conventional concept:
https://aclanthology.org/volumes/2024.semeval-1/
An example of a top-3 submission:
https://aclanthology.org/2024.semeval-1.164/

🤔 Similar techniques that borrow ideas from transformers, such as xLSTM, might be a promising step towards even more robust solutions:
https://github.com/NX-AI/xlstm
