Model Card for Model ID

Training progress at the last logged step: step 850 of 1478, epoch 141 of 247, 4:08:38 elapsed, 3:04:08 estimated remaining, 0.06 it/s.
Step	Training Loss	Validation Loss
50	1.889000	1.862223
100	1.871100	1.832502
150	1.822600	1.765569
200	1.741100	1.677201
250	1.633600	1.558259
300	1.503000	1.429101
350	1.407300	1.384786
400	1.373600	1.356308
450	1.347900	1.339552
500	1.333900	1.329472
550	1.324200	1.321458
600	1.315300	1.314869
650	1.306500	1.309380
700	1.300400	1.304810
750	1.294500	1.300931
800	1.288500	1.297661
850	1.283600	1.294858
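
The log above can be checked numerically. The following is a minimal sketch, not part of the original run: it copies the logged values into a list, computes the gap between validation and training loss at each step, and converts a loss to perplexity. The helper names (`loss_gaps`, `first_positive_gap`, `perplexity`) are hypothetical, and the perplexity conversion assumes the loss is a mean token-level cross-entropy in nats, which the card does not state.

```python
import math

# (step, training loss, validation loss), copied from the table above.
LOG = [
    (50, 1.889000, 1.862223),
    (100, 1.871100, 1.832502),
    (150, 1.822600, 1.765569),
    (200, 1.741100, 1.677201),
    (250, 1.633600, 1.558259),
    (300, 1.503000, 1.429101),
    (350, 1.407300, 1.384786),
    (400, 1.373600, 1.356308),
    (450, 1.347900, 1.339552),
    (500, 1.333900, 1.329472),
    (550, 1.324200, 1.321458),
    (600, 1.315300, 1.314869),
    (650, 1.306500, 1.309380),
    (700, 1.300400, 1.304810),
    (750, 1.294500, 1.300931),
    (800, 1.288500, 1.297661),
    (850, 1.283600, 1.294858),
]


def loss_gaps(log):
    """Return (step, validation loss - training loss) for every logged step."""
    return [(step, val - train) for step, train, val in log]


def first_positive_gap(log):
    """Return the first step where validation loss exceeds training loss, or None."""
    return next((step for step, gap in loss_gaps(log) if gap > 0), None)


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    print("first step with validation > training:", first_positive_gap(LOG))
    step, gap = loss_gaps(LOG)[-1]
    print(f"gap at step {step}: {gap:.6f}")
    print(f"validation perplexity at step {step}: {perplexity(LOG[-1][2]):.3f}")
```

On the logged values, validation loss first exceeds training loss at step 650, and the gap at step 850 is about 0.011. If the training loss is a running average over the preceding logging window, as is common, a lower validation loss in the early steps would be expected rather than anomalous.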
Run history:

eval/loss	██▇▆▄▃▂▂▂▁▁▁▁▁▁▁▁
eval/runtime	██▇▆▅▅▆▇▄▆▅▄▅▂▃▂▁
eval/samples_per_second	▁▁▂▃▄▄▃▂▅▃▄▅▅▇▆▇█
eval/steps_per_second	▁▁▅▅▅▅▅▁▅▅▅▅▅████
train/epoch	▁▁▁▁▂▂▂▂▃▃▃▃▄▄▄▄▄▄▅▅▅▅▆▆▆▆▇▇▇▇█████
train/global_step	▁▁▁▁▂▂▂▂▃▃▃▃▄▄▄▄▅▅▅▅▅▅▆▆▆▆▇▇▇▇█████
train/grad_norm	▇██▇▆▅▂▂▁▁▁▁▁▁▁▁▁
train/learning_rate	▁▂▃▄▅▅▆▇████▇▇▆▆▅
train/loss	██▇▆▅▄▂▂▂▂▁▁▁▁▁▁▁

Run summary:

eval/loss	1.29486
eval/runtime	19.1369
eval/samples_per_second	11.914
eval/steps_per_second	1.515
total_flos	2.208419516825174e+18
train/epoch	141.66667
train/global_step	850
train/grad_norm	0.05381
train/learning_rate	1e-05
train/loss	1.2836
train_loss	1.47271
train_runtime	14936.8511
train_samples_per_second	12.666
train_steps_per_second	0.099
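
Several quantities that the card leaves blank can be recovered from the run summary by simple arithmetic. The sketch below is illustrative only: the `derive` helper and the dictionary layout are hypothetical, and the interpretation of each result (for example, that runtime multiplied by throughput equals the evaluation set size) is an assumption about how the metrics were computed.

```python
# Values copied from the run summary above.
SUMMARY = {
    "eval/runtime": 19.1369,
    "eval/samples_per_second": 11.914,
    "eval/steps_per_second": 1.515,
    "train/epoch": 141.66667,
    "train/global_step": 850,
    "train_runtime": 14936.8511,
}
PLANNED_STEPS = 1478  # from the progress line (850 of 1478 steps)


def derive(summary, planned_steps):
    """Derive approximate counts and rates from the logged summary values."""
    eval_runtime = summary["eval/runtime"]
    train_runtime = summary["train_runtime"]
    steps = summary["train/global_step"]
    return {
        # runtime * throughput, rounded to the nearest whole count
        "eval_samples": round(eval_runtime * summary["eval/samples_per_second"]),
        "eval_batches": round(eval_runtime * summary["eval/steps_per_second"]),
        "steps_per_epoch": round(steps / summary["train/epoch"]),
        "train_hours": train_runtime / 3600,
        "completed_steps_per_second": steps / train_runtime,
        "planned_steps_per_second": planned_steps / train_runtime,
    }


if __name__ == "__main__":
    for name, value in derive(SUMMARY, PLANNED_STEPS).items():
        print(f"{name}: {value}")
```

Under these assumptions the evaluation set has about 228 samples processed in 29 batches, each epoch is 6 optimizer steps, and the logged training time is about 4.15 hours. The reported `train_steps_per_second` of 0.099 matches 1478 planned steps divided by the runtime rather than the 850 completed steps (about 0.057 steps per second), which suggests the rate was computed against the planned step count.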

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was generated automatically.

  • Developed by: [More Information Needed]
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: [More Information Needed]
  • Language(s) (NLP): [More Information Needed]
  • License: [More Information Needed]
  • Finetuned from model [optional]: [More Information Needed]

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed before further recommendations can be made.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]
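
As a rough illustration of the kind of estimate such a calculator produces, the sketch below multiplies energy use by grid carbon intensity. It is a simplification under stated assumptions, not the calculator itself: the function name `estimate_emissions_kg` and all of its parameters are hypothetical, and the hardware power draw, carbon intensity, and PUE are not given anywhere in this card. Only the training time is taken from the run summary (`train_runtime` of 14936.8511 seconds).

```python
def estimate_emissions_kg(hours, power_watts, kg_co2_per_kwh, pue=1.0):
    """Estimate emissions as energy (kWh) times grid carbon intensity.

    hours            wall-clock hours of compute
    power_watts      average power draw of the hardware (ASSUMED by the caller)
    kg_co2_per_kwh   carbon intensity of the grid (ASSUMED by the caller)
    pue              data-center power usage effectiveness (ASSUMED, default 1.0)
    """
    if min(hours, power_watts, kg_co2_per_kwh, pue) < 0:
        raise ValueError("all inputs must be non-negative")
    energy_kwh = hours * (power_watts / 1000.0) * pue
    return energy_kwh * kg_co2_per_kwh


# Training time taken from the run summary (train_runtime, in seconds).
TRAIN_RUNTIME_SECONDS = 14936.8511
TRAIN_HOURS = TRAIN_RUNTIME_SECONDS / 3600

if __name__ == "__main__":
    # Placeholder inputs for illustration only; they are NOT measurements
    # of this run. Replace them with the actual hardware and region values.
    placeholder_power_watts = 300.0
    placeholder_kg_co2_per_kwh = 0.4
    estimate = estimate_emissions_kg(
        TRAIN_HOURS, placeholder_power_watts, placeholder_kg_co2_per_kwh
    )
    print(f"hours: {TRAIN_HOURS:.2f}, estimated kg CO2eq: {estimate:.3f}")
```

The hours figure covers only the logged portion of training (850 of 1478 planned steps), so a full run would take proportionally longer.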

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Safetensors model size: 4.48B params
Tensor types: F32, U8