zelalt committed on
Commit
96e0fa0
1 Parent(s): 16923de

Update README.md

Files changed (1)
  1. README.md +8 -11
README.md CHANGED
@@ -25,16 +25,12 @@ This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co
25
  It achieves the following results on the evaluation set:
26
  - Loss: 2.1587
27
 
28
- ## Model description
29
-
30
- ## Sample Code
31
-
32
  ### Requirements
33
  ```python
34
  !pip install accelerate transformers einops datasets peft bitsandbytes
35
  ```
36
 
37
- ### Test Dataset
38
  If you prefer, you can use test dataset from [zelalt/scientific-papers](https://huggingface.co/datasets/zelalt/scientific-papers)
39
  or [zelalt/arxiv-papers](https://huggingface.co/datasets/zelalt/arxiv-papers) or read your pdf as text with PyPDF2.PdfReader then give this text to LLM with adding "What is the title of this paper?" prompt.
40
 
@@ -44,14 +40,14 @@ from datasets import load_dataset
44
  test_dataset = load_dataset("zelalt/scientific-papers", split='train')
45
  test_dataset = test_dataset.rename_column('full_text', 'text')
46
 
47
- def formatting_prompts_func(example):
48
  text = f"What is the title of this paper? {example['text'][:180]}\n\nAnswer: "
49
  return {'text': text}
50
 
51
- formatted_dataset = test_dataset.map(formatting_prompts_func)
52
  ```
53
 
54
- ### Inference
55
  ```python
56
 
57
  import torch
@@ -79,17 +75,18 @@ text = tokenizer.batch_decode(outputs)[0]
79
  print(text)
80
  ```
81
 
82
- After running it for the first time and loading the model and tokenizer, you can only run generating part to avoid RAM crash.
 
83
 
84
  ### Output
85
  Input:
86
- ```
87
  What is the title of this paper? Bursting Dynamics of the 3D Euler Equations\nin Cylindrical Domains\nFrançois Golse ∗ †\nEcole Polytechnique, CMLS\n91128 Palaiseau Cedex, France\nAlex Mahalov ‡and Basil Nicolaenko §\n\nAnswer:
88
  ```
89
 
90
  ## Output from LLM:
91
 
92
- ```
93
  What is the title of this paper? Bursting Dynamics of the 3D Euler Equations
94
  in Cylindrical Domains
95
  François Golse ∗ †
 
25
  It achieves the following results on the evaluation set:
26
  - Loss: 2.1587
27
 
 
 
 
 
28
  ### Requirements
29
  ```python
30
  !pip install accelerate transformers einops datasets peft bitsandbytes
31
  ```
32
 
33
+ ## Test Dataset
34
  If you prefer, you can use test dataset from [zelalt/scientific-papers](https://huggingface.co/datasets/zelalt/scientific-papers)
35
  or [zelalt/arxiv-papers](https://huggingface.co/datasets/zelalt/arxiv-papers) or read your pdf as text with PyPDF2.PdfReader then give this text to LLM with adding "What is the title of this paper?" prompt.
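
The PDF route described above could look roughly like the sketch below. It is only an illustration: the helper names `build_title_prompt` and `read_first_page` are hypothetical and not part of this repository, and the 180-character cut-off simply mirrors the slice used in the dataset formatting snippet of this README.

```python
def build_title_prompt(paper_text: str, max_chars: int = 180) -> str:
    """Build the title-extraction prompt in the format used in this README."""
    # Only the first `max_chars` characters are kept, since the title
    # normally appears at the very start of the paper.
    return f"What is the title of this paper? {paper_text[:max_chars]}\n\nAnswer: "


def read_first_page(pdf_path: str) -> str:
    """Return the text of the first page of a PDF (needs `pip install PyPDF2`)."""
    # Imported lazily so that build_title_prompt works without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for scanned or image-only pages.
    return reader.pages[0].extract_text() or ""


# Example usage (the file name is a placeholder):
# prompt = build_title_prompt(read_first_page("paper.pdf"))
```

The resulting `prompt` string can then be tokenized and passed to the model in the same way as the dataset rows.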
36
 
 
40
  test_dataset = load_dataset("zelalt/scientific-papers", split='train')
41
  test_dataset = test_dataset.rename_column('full_text', 'text')
42
 
43
+ def formatting(example):
44
  text = f"What is the title of this paper? {example['text'][:180]}\n\nAnswer: "
45
  return {'text': text}
46
 
47
+ formatted_dataset = test_dataset.map(formatting)
48
  ```
49
 
50
+ ### Sample Code
51
  ```python
52
 
53
  import torch
 
75
  print(text)
76
  ```
77
 
78
+ **Notes**
79
+ - After running it for the first time and loading the model and tokenizer, you can only run generating part to avoid RAM crash.
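
One possible way to follow this note outside a notebook is to wrap the loading step in a cached function, so that repeated calls reuse the objects already in memory. This is a minimal sketch under that assumption; `make_cached_loader` and `load_fn` are hypothetical names, and `load_fn` stands in for whatever code loads the model and tokenizer.

```python
from functools import lru_cache
from typing import Any, Callable


def make_cached_loader(load_fn: Callable[[], Any]) -> Callable[[], Any]:
    """Wrap an expensive loader so it runs once; later calls reuse the result."""

    @lru_cache(maxsize=1)
    def cached() -> Any:
        # The first call executes load_fn; every later call returns
        # the same object without loading anything again.
        return load_fn()

    return cached


# Example usage (load_model_and_tokenizer is a placeholder for your loading code):
# get_model = make_cached_loader(load_model_and_tokenizer)
# model, tokenizer = get_model()   # loads once
# model, tokenizer = get_model()   # reuses the loaded objects
```

In a notebook the same effect comes from keeping the loading code in its own cell and re-running only the generation cell.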
80
 
81
  ### Output
82
  Input:
83
+ ```markdown
84
  What is the title of this paper? Bursting Dynamics of the 3D Euler Equations\nin Cylindrical Domains\nFrançois Golse ∗ †\nEcole Polytechnique, CMLS\n91128 Palaiseau Cedex, France\nAlex Mahalov ‡and Basil Nicolaenko §\n\nAnswer:
85
  ```
86
 
87
  ## Output from LLM:
88
 
89
+ ```markdown
90
  What is the title of this paper? Bursting Dynamics of the 3D Euler Equations
91
  in Cylindrical Domains
92
  François Golse ∗ †