ammarnasr committed on
Commit
0c8cbf1
1 Parent(s): d3d57c9

Upload README.md

  1. README.md +60 -0
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ license: mit
+ datasets:
+ - ammarnasr/the-stack-swift-clean
+ library_name: adapter-transformers
+ tags:
+ - code
+ pipeline_tag: text-generation
+ language:
+ - code
+ ---
+
+
+ # CodeGen (CodeGen-Mono 350M LoRA Swift)
+
+ ## Model description
+ CodeGen LoRA Swift belongs to a family of autoregressive language models fine-tuned with LoRA on different programming languages.
+ ## Training data
+ <!-- https://huggingface.co/datasets/ammarnasr/the-stack-swift-clean -->
+ This model was fine-tuned on the cleaned Swift subset of The Stack, available [here](https://huggingface.co/datasets/ammarnasr/the-stack-swift-clean). The data consists of 1 million Swift code files.
+
+ ## Training procedure
+
+ This model was fine-tuned using LoRA on a single T4 GPU. It was trained for 10,000 steps with a batch size of 4, using a causal language modeling loss.
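
To put the training budget above in perspective, the step count and batch size bound how many training sequences the run consumed. The sketch below is illustrative only: it assumes no gradient accumulation and roughly one sequence per source file, neither of which the card states, and the helper name `sequences_seen` is hypothetical.

```python
def sequences_seen(steps: int, batch_size: int, grad_accum: int = 1) -> int:
    """Number of training sequences consumed over a run."""
    return steps * batch_size * grad_accum


# Values taken from the training procedure described above.
STEPS = 10_000
BATCH_SIZE = 4
DATASET_FILES = 1_000_000  # "1 million Swift code files"

seen = sequences_seen(STEPS, BATCH_SIZE)
fraction = seen / DATASET_FILES  # assumption: one sequence per file

print(f"sequences seen: {seen}")
print(f"fraction of dataset: {fraction:.1%}")
```

Under these assumptions the run touches only a small part of the dataset, which is worth keeping in mind when reading the evaluation numbers.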
+
+ ## Evaluation results
+
+ We evaluate our models on the MultiPL-E benchmark. The model achieves a Pass@10 rate of 8.9.
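
The card does not say how Pass@10 was computed. A common choice for benchmarks of this kind is the unbiased pass@k estimator: for a problem with `n` sampled completions of which `c` pass the tests, the probability that at least one of `k` samples passes is estimated as `1 - C(n - c, k) / C(n, k)`. The sketch below assumes that estimator; the function names and the sample counts are illustrative and not taken from this evaluation.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: completions sampled, c: completions that pass, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k completions fail,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts: 20 samples per problem, three problems.
example = [(20, 0), (20, 1), (20, 20)]
print(round(mean_pass_at_k(example, k=10), 3))
```

Benchmark scores are usually reported as this per-problem estimate averaged over all problems, often multiplied by 100.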
+
+
+ ## Intended Use and Limitations
+
+ The model is intended for and best at **program synthesis**, that is, generating executable code given English prompts, where the prompts should be in the form of a comment string. The model can complete partially-generated code in Swift and Python.
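
Since prompts are expected to be comment strings, an English task description can be wrapped in Swift comment syntax before it is passed to the model. The helper below is a minimal sketch: `build_swift_prompt` and its parameters are hypothetical names, and the choice of `//` line comments followed by an optional function signature is an assumption rather than a format prescribed by the model.

```python
def build_swift_prompt(description: str, signature: str = "") -> str:
    """Format an English description as Swift line comments.

    An optional function signature is appended so the model can
    continue from the opening brace.
    """
    lines = [
        f"// {line.strip()}"
        for line in description.strip().splitlines()
        if line.strip()  # skip blank lines in the description
    ]
    prompt = "\n".join(lines) + "\n"
    if signature:
        prompt += signature.rstrip() + "\n"
    return prompt


prompt = build_swift_prompt(
    "Return the sum of all even numbers in the array.",
    "func sumOfEvens(_ numbers: [Int]) -> Int {",
)
print(prompt)
```

The resulting string can be tokenized and passed to `generate` in place of a hand-written prompt.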
+
+ ## How to use
+
+ This model can be easily loaded using the `AutoModelForCausalLM` functionality:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-swift")
+ model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")  # base checkpoint only; the LoRA adapter weights are not applied here
+
+ text = "def hello_world():"  # prompt to complete
+ input_ids = tokenizer(text, return_tensors="pt").input_ids
+
+ generated_ids = model.generate(input_ids, max_length=128)  # at most 128 tokens, prompt included
+ print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
+ ```
+
+ ## BibTeX entry and citation info
+
+ ```bibtex
+ @article{Nijkamp2022ACP,
+ title={A Conversational Paradigm for Program Synthesis},
+ author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
+ journal={arXiv preprint},
+ year={2022}
+ }
+ ```