---
title: News Summarizer and NER
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
---

#### News Summarization and NER

News summarization uses "facebook/bart-base", fine-tuned with TensorFlow on the 
<a href="https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail" target="_blank">CNN news articles</a> dataset.<br><br>
NER uses "microsoft/deberta-base", fine-tuned with TensorFlow for token classification on this 
<a href="https://www.kaggle.com/datasets/saurabhprajapat/named-entity-recognition" target="_blank">dataset</a>.<br>The fine-tuning dataset consists of annotated sentences.<br>
At inference time, the input text is split into sentences with spaCy, and entities are identified in each sentence separately.<br>
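
The per-sentence inference step can be sketched roughly as follows. This is an illustration, not the code used in this Space: the helper names `spacy_sentences` and `entities_by_sentence` are made up for the example, and using spaCy's rule-based `sentencizer` (rather than a full pipeline) is an assumption. The tagger is passed in as a callable so the sketch does not depend on a particular model.

```python
from typing import Callable, Dict, Iterable, List


def spacy_sentences(text: str) -> List[str]:
    """Split text into sentences with spaCy's rule-based sentencizer.

    Assumption: a blank English pipeline is enough for sentence splitting.
    In a real app the pipeline would be built once and reused.
    """
    import spacy  # imported lazily so the rest of the sketch runs without spaCy

    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")
    return [s.text.strip() for s in nlp(text).sents if s.text.strip()]


def entities_by_sentence(
    text: str,
    split: Callable[[str], Iterable[str]],
    tag: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Run the tagger on each sentence and keep results grouped by sentence."""
    return [{"sentence": s, "entities": tag(s)} for s in split(text)]


# Hypothetical wiring with Hugging Face transformers (model path is a placeholder):
#   from transformers import pipeline
#   tag = pipeline("token-classification", model="path/to/fine-tuned-deberta",
#                  aggregation_strategy="simple")
#   results = entities_by_sentence(article_text, spacy_sentences, tag)
```

Tagging one sentence at a time matches the shape of the fine-tuning data, which consists of annotated sentences.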

The notebook to fine-tune "facebook/bart-base" for news summarization can be found <a href="https://github.com/ksv-muralidhar/hugging_face_tf_fine_tuning/blob/main/bart_en_summarization.ipynb">here</a>.<br>
The notebook to fine-tune "microsoft/deberta-base" for NER can be found <a href="https://github.com/ksv-muralidhar/hugging_face_tf_fine_tuning/blob/main/ner_deberta.ipynb">here</a>.