---
language:
  - en
license: mit
tags:
  - legal
datasets:
  - ricdomolm/lawma-all-tasks
---

# Lawma 8B

Lawma 8B is a fine-tune of Llama 3 8B Instruct on 260 legal classification tasks derived from the Supreme Court and Songer Court of Appeals databases. Lawma was fine-tuned on over 500k task examples, totaling 2B tokens. As a result, Lawma 8B outperforms GPT-4 on 95% of these legal classification tasks, by over 17 accuracy points on average. See our [arXiv preprint](https://arxiv.org/abs/2407.16615) and GitHub repository for more details.

## Evaluations

We report mean classification accuracy across the 260 legal classification tasks that we consider. We use the standard MMLU multiple-choice prompt, and evaluate models zero-shot. You can find our evaluation code here.
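As a rough illustration of this setup (not the repository's actual evaluation code), a zero-shot MMLU-style multiple-choice prompt can be assembled as below; the question and option labels are invented placeholders, not real Lawma task data:

```python
# Sketch of an MMLU-style zero-shot multiple-choice prompt, assuming the
# standard "question, lettered options, 'Answer:'" layout. The question and
# options here are hypothetical placeholders, not actual Lawma task content.
from string import ascii_uppercase


def build_mmlu_prompt(question: str, options: list[str]) -> str:
    """Format a question and its answer options as an MMLU-style prompt."""
    lines = [question]
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


prompt = build_mmlu_prompt(
    "What is the disposition of the court in this case?",
    ["affirmed", "reversed", "remanded", "vacated"],
)
print(prompt)
```

The model's predicted letter (A, B, C, ...) is then compared against the gold label to compute accuracy.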

| Model | All tasks | Supreme Court tasks | Court of Appeals tasks |
|---|---|---|---|
| Lawma 70B | 81.9 | 84.1 | 81.5 |
| Lawma 8B | 80.3 | 82.4 | 79.9 |
| GPT-4 | 62.9 | 59.8 | 63.4 |
| Llama 3 70B Inst | 58.4 | 47.1 | 60.3 |
| Mixtral 8x7B Inst | 43.2 | 24.4 | 46.4 |
| Llama 3 8B Inst | 42.6 | 32.8 | 44.2 |
| Majority classifier | 41.7 | 31.5 | 43.5 |
| Mistral 7B Inst | 39.9 | 19.5 | 43.4 |
| Saul 7B Inst | 34.4 | 20.2 | 36.8 |
| LegalBert | 24.6 | 13.6 | 26.4 |

## FAQ

**What are the Lawma models useful for?** We recommend using the Lawma models only for the legal classification tasks that they were fine-tuned on. The main take-away of our paper is that specializing models leads to large improvements in performance. Therefore, we strongly recommend that practitioners further fine-tune Lawma on the actual tasks that the models will be used for. Relatively few examples (i.e., dozens or hundreds) may already lead to large gains in performance.
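For example, a handful of labeled cases can be converted into chat-style fine-tuning records; the record layout below follows the common `messages`-with-roles convention and is an illustrative assumption, not the exact format Lawma was trained with:

```python
# Hypothetical sketch: turning labeled cases into chat-style fine-tuning
# records. The {"messages": [{"role": ..., "content": ...}]} layout is the
# widely used chat fine-tuning convention; it is an assumption here, not
# the documented Lawma training format.
def to_finetune_record(case_text: str, question: str, label: str) -> dict:
    """Pair a case excerpt and task question with its gold label."""
    return {
        "messages": [
            {"role": "user", "content": f"{case_text}\n\n{question}"},
            {"role": "assistant", "content": label},
        ]
    }


records = [
    to_finetune_record(
        "The judgment of the district court is affirmed ...",  # placeholder excerpt
        "What is the disposition of the court?",
        "affirmed",
    ),
]
```

A few dozen such records per task is often enough to see the gains described above.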

**What legal classification tasks is Lawma fine-tuned on?** We consider almost all of the variables of the Supreme Court and Songer Court of Appeals databases. Our reasons for studying these legal classification tasks are both technical and substantive. From a technical machine learning perspective, these tasks provide highly non-trivial classification problems where even the best models leave much room for improvement. From a substantive legal perspective, efficient solutions to such classification problems have rich and important applications in legal research.

## Citation

This model was trained for the project

Lawma: The Power of Specialization for Legal Tasks. Ricardo Dominguez-Olmedo and Vedant Nanda and Rediet Abebe and Stefan Bechtold and Christoph Engel and Jens Frankenreiter and Krishna Gummadi and Moritz Hardt and Michael Livermore. 2024.

Please cite as:

```bibtex
@misc{dominguezolmedo2024lawmapowerspecializationlegal,
      title={Lawma: The Power of Specialization for Legal Tasks},
      author={Ricardo Dominguez-Olmedo and Vedant Nanda and Rediet Abebe and Stefan Bechtold and Christoph Engel and Jens Frankenreiter and Krishna Gummadi and Moritz Hardt and Michael Livermore},
      year={2024},
      eprint={2407.16615},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.16615},
}
```