sergioburdisso committed on
Commit
de6d9d9
1 Parent(s): 3eaee80

Update README.md

Files changed (1)
  1. README.md +7 -8
README.md CHANGED
@@ -11,14 +11,13 @@ datasets:
  - Salesforce/dialogstudio
  pipeline_tag: sentence-similarity
  base_model:
- - aws-ai/dse-bert-base
+ - google-bert/bert-base-uncased
  ---


- # Dialog2Flow single target (DSE-base)
+ # Dialog2Flow joint target (BERT-base)

- This a variation of the **D2F$_{single}$** model introduced in the paper ["Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction"](https://publications.idiap.ch/attachments/papers/2024/Burdisso_EMNLP2024_2024.pdf) published in the EMNLP 2024 main conference.
- This version uses DSE-base as the backbone model which yields to an increase in performance as compared to the vanilla version using BERT-base as the backbone (results reported in Appendix C).
+ This is the original **D2F$_{joint}$** model introduced in the paper ["Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction"](https://publications.idiap.ch/attachments/papers/2024/Burdisso_EMNLP2024_2024.pdf) published in the EMNLP 2024 main conference.

  Implementation-wise, this is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or search.

@@ -38,7 +37,7 @@ Then you can use the model like this:
  from sentence_transformers import SentenceTransformer
  sentences = ["your phone please", "okay may i have your telephone number please"]

- model = SentenceTransformer('sergioburdisso/dialog2flow-single-dse-base')
+ model = SentenceTransformer('sergioburdisso/dialog2flow-joint-bert-base')
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
@@ -64,8 +63,8 @@ def mean_pooling(model_output, attention_mask):
  sentences = ['your phone please', 'okay may i have your telephone number please']

  # Load model from HuggingFace Hub
- tokenizer = AutoTokenizer.from_pretrained('sergioburdisso/dialog2flow-single-dse-base')
- model = AutoModel.from_pretrained('sergioburdisso/dialog2flow-single-dse-base')
+ tokenizer = AutoTokenizer.from_pretrained('sergioburdisso/dialog2flow-joint-bert-base')
+ model = AutoModel.from_pretrained('sergioburdisso/dialog2flow-joint-bert-base')

  # Tokenize sentences
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -154,4 +153,4 @@ SentenceTransformer(
  ## License

  Copyright (c) 2024 [Idiap Research Institute](https://www.idiap.ch/).
- MIT License.
+ MIT License.
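
The README text touched by this commit says the model maps sentences to a 768-dimensional dense vector space that can be used for clustering or search. A minimal sketch of the search case follows; it is not part of the commit. It ranks candidate vectors by cosine similarity to a query vector. The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the 3-dimensional vectors are invented stand-ins for what `model.encode(sentences)` would return, so the sketch runs without downloading the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Invented toy vectors (assumption): real embeddings would have 768 dimensions.
toy_query = [1.0, 0.0, 1.0]
toy_candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_by_similarity(toy_query, toy_candidates)
print(ranking)
```

With real embeddings, the same ranking step would take one row of the `embeddings` array as the query and the remaining rows as candidates. Cosine similarity ignores vector length, which is why the second toy candidate ranks first despite being twice as long as the query.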