|
--- |
|
license: cc-by-nc-4.0 |
|
datasets: |
|
- passing2961/multifaceted-skill-of-mind |
|
language: |
|
- en |
|
base_model: |
|
- meta-llama/Llama-3.2-3B-Instruct |
|
tags: |
|
- conversational ai |
|
- conversational skill |
|
--- |
|
|
|
# Thanos-3B Model Card |
|
|
|
[🧠 Multifaceted Skill-of-Mind](https://huggingface.co/datasets/passing2961/multifaceted-skill-of-mind) | [🤗 Thanos-1B](https://huggingface.co/passing2961/Thanos-1B) | [🤗 Thanos-8B](https://huggingface.co/passing2961/Thanos-8B) | [💻 Github](https://github.com/passing2961/Thanos) | [📄 Arxiv](https://arxiv.org/abs/2411.04496) | [📑 PDF](https://arxiv.org/pdf/2411.04496)
|
|
|
> 🚨 Disclaimer: All models and datasets are intended for research purposes only.
|
|
|
## Model Description |
|
- **Repository:** [Code](https://github.com/passing2961/Thanos) |
|
- **Paper:** Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model |
|
- **Point of Contact:** [Young-Jun Lee](mailto:[email protected]) |
|
|
|
## Model Details |
|
- **Model**: The Thanos series is a family of fully open-source, skill-of-mind-infused LLMs designed to help general conversational agents respond in a more human-like way.
|
- **Date**: The Thanos series was trained in 2024.
|
- **Training Dataset**: [100K Multifaceted Skill-of-Mind](https://huggingface.co/datasets/passing2961/multifaceted-skill-of-mind) (a minimal loading sketch follows this list)
|
- **Architecture**: Thanos-3B was trained on top of [LLaMA-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct). |
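
For reference, the training annotations can be inspected directly with the Hugging Face `datasets` library. The sketch below only prints the available splits and the first record, since the exact split and column names are best confirmed on the dataset card.

```python
from datasets import load_dataset

# A quick look at the training data; split and column names are not
# hard-coded here because they should be confirmed on the dataset card.
ds = load_dataset("passing2961/multifaceted-skill-of-mind")
print(ds)                # available splits and their sizes
split = next(iter(ds))   # first available split, e.g. "train"
print(ds[split][0])      # column names and one annotated example
```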
|
|
|
## How to Use |
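
The snippet below is a minimal sketch of loading Thanos-3B with Hugging Face Transformers and prompting it to suggest a conversational skill for the next turn. The dialogue and the instruction text are illustrative assumptions rather than the official prompt template; the exact prompts used for training and inference are available in the [GitHub repository](https://github.com/passing2961/Thanos).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "passing2961/Thanos-3B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative dialogue and instruction (not the official training prompt).
dialogue = (
    "Speaker A: I just bombed my job interview and I feel terrible.\n"
    "Speaker B:"
)
messages = [
    {
        "role": "user",
        "content": (
            "Given the dialogue below, identify the conversational skill "
            "(skill-of-mind) the next speaker should use and briefly explain why.\n\n"
            + dialogue
        ),
    }
]

# Apply the Llama-3.2-Instruct chat template; returns a tensor of input ids.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because Thanos-3B is built on Llama-3.2-3B-Instruct, the example relies on the standard Llama 3.2 chat template applied by `apply_chat_template`; adjust the generation parameters (sampling, `max_new_tokens`) as needed for your setup.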
|
|
|
## License and Recommendations |
|
|
|
🚨 Thanos-3B is released under the CC BY-NC 4.0 license and is intended for research purposes only.
|
|
|
## Acknowledgement |
|
|
|
This work was supported by a grant from the KAIST-KT joint research project through the AI Tech Lab, Institute of Convergence Technology, funded by KT [Project No. G01230605, Development of Task-oriented Persona-based Dialogue Generation Combining Multi-modal Interaction and Knowledge Modeling].
|
|
|
## Citation |
|
|
|
If you find the resources in this repository useful, please cite our work: |
|
|
|
```
@misc{lee2024thanosenhancingconversationalagents,
      title={Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model},
      author={Young-Jun Lee and Dokyong Lee and Junyoung Youn and Kyeongjin Oh and Ho-Jin Choi},
      year={2024},
      eprint={2411.04496},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.04496},
}
```