Depressive Comments Detective Fine-tuned Model
This model is designed to detect depressive comments. By utilizing depressive comments gathered from Twitter, we have fine-tuned the Gemma 2 2b instruct model to identify signs of depression in textual content.
When the fine-tuned model receives input data, it classifies the text as either depressive or non-depressive. If the text is determined to be depressive, the model outputs a 1, indicating that signs of depression have been detected. Conversely, if the text is not depressive or the model is unable to classify it for various reasons, it returns a 0.
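The mapping described above can be sketched as a small post-processing step. This is only an illustration: the helper name `parse_label` is hypothetical, and it assumes the model's generated text begins with the digit it wants to return, which may not match how the notebook actually reads the output.

```python
def parse_label(generated_text: str) -> int:
    """Map raw generated text to 1 (depressive) or 0 (everything else).

    Assumption: the fine-tuned model emits "1" or "0" as the first token
    of its response. Anything that cannot be read as "1" falls back to 0,
    mirroring the "unable to classify" case described above.
    """
    tokens = generated_text.strip().split()
    if not tokens:
        # Empty output: treat as "unable to classify".
        return 0
    # Drop trailing punctuation such as "1." before comparing.
    first = tokens[0].strip(".,!")
    return 1 if first == "1" else 0


if __name__ == "__main__":
    for sample in ["1", " 0\n", "", "I am not sure"]:
        print(repr(sample), "->", parse_label(sample))
```

Falling back to 0 keeps the output strictly binary, at the cost of hiding cases where the model failed to answer.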
We have integrated this model into a chatbot that analyzes entries from a user's diary. The chatbot detects depressive comments and offers supportive responses to provide emotional support.
1. How we fine-tuned the model :
First, we looked for a suitable dataset for detecting depressive comments. Many depression-related datasets exist, so we spent a lot of time deciding which one would be best for training Gemma 2. In the end we chose a dataset collected from Twitter, because we think Twitter is a platform where people tend to express themselves candidly.
Dataset : https://huggingface.co/datasets/ziq/depression_tweet
Next, we trained several Gemma 2 variants on the dataset to find the best one to use. Gemma 2 comes in 2B, 9B, and 27B sizes, and several accelerators are available (TPU, T4, and so on), so we tried all of them and tested the quality of the fine-tuned models after training. We found the 2B instruct model to be the best choice. Training was fast, especially on TPU (about 600 seconds for 1 epoch), and it is the lightest model to run with limited RAM. (The 2B model is about 12GB, while the 9B model is about 35GB, which is too big to run on common cloud platforms such as Google Colab or Kaggle Notebooks.) Its ability to classify depressive comments is also good enough. (Interestingly, we saw no difference in detection ability between 2B and 9B.)
2. About the detailed training method :
We are very grateful that Google provides TPUs as an accelerator on Kaggle Notebooks, which let us train the model really fast. Using the distribution method, we activated 4 TPU cores. Since the Gemma 2B model only supports parallel processing across 4 cores, we could not use the remaining cores that the TPU has, but training was fast enough that this was not a problem. LoRA was the other key factor in keeping training lightweight and fast, because it makes the fine-tuning process much more efficient. We have not mentioned it yet, but our chatbot program uses 2 Gemma models, so memory usage is critical. The LoRA weights are much smaller than the original model weights, so we could deploy 2 models at the same time. Thanks, LoRA! You can find the methods and any other details not mentioned here in the notebook, so please check it out.
Kaggle Notebook : https://www.kaggle.com/code/minkyuuukim/gemma2-2b-tpu-finetuning-depression-detection
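To give a feel for why LoRA weights are so much smaller than the original ones, here is a rough parameter count for a single weight matrix. The matrix shape and rank below are illustrative assumptions, not the actual Gemma 2 dimensions or the rank used in the notebook.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of parameters in a dense d_in x d_out weight matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Number of trainable parameters LoRA adds for the same matrix.

    LoRA learns two small matrices, A (d_in x rank) and B (rank x d_out),
    whose product is added to the frozen original weight.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    d_in, d_out, rank = 2048, 2048, 4
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full matrix : {full:,} parameters")
    print(f"LoRA update : {lora:,} parameters")
    print(f"ratio       : {lora / full:.4%}")
```

Because only the small A and B matrices need to be stored per fine-tuned model, two fine-tuned models can share the cost of one set of base weights when the deployment is set up that way.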
3. Result :
Model | Accuracy | Train Set Size | Test Set Size | Epochs | Training Time |
---|---|---|---|---|---|
2B (TPU) | 90.5% | 1000 | 1000 | 1 | about 180 sec |
2B (TPU) | 95% | entire dataset | entire dataset | 1 | about 2700 sec |
2B (TPU) | 95% | entire dataset | entire dataset | 5 | about 13500 sec |
As mentioned above, the detection quality of this fine-tuned model is high. We compared the labels in the test set with Gemma's classifications. The fine-tuned Gemma reaches 95% detection accuracy, which suggests that the model is well trained and good at detecting depressive content.
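The comparison between test-set labels and model predictions boils down to a simple accuracy calculation. The sketch below uses made-up labels purely for illustration. The function name `accuracy` and the sample lists are assumptions, not taken from the notebook.

```python
def accuracy(true_labels: list[int], predicted_labels: list[int]) -> float:
    """Fraction of examples where the prediction matches the test-set label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


if __name__ == "__main__":
    # Hypothetical labels: 1 = depressive, 0 = non-depressive.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 0, 1, 1]
    print(f"accuracy: {accuracy(y_true, y_pred):.1%}")
```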
4. Further Information :
This model is used in a chatbot program that detects depressive comments in a diary and gives supportive advice with counseling insights. As part of the program, we prepared a second model for counseling. If you want more information about counseling, please check out the other model, which is used as the counselor!
Counseling Model : https://huggingface.co/yunzi7/gemma2_2b_it_en_counseling
You can also try the full program, which combines the 2 models and is built with Gradio. Gradio is the framework we used to deploy the program more easily and quickly. With it, we put the diary and the chatbot interface on one page. The detective Gemma detects depressive comments in the diary, and after you press the submit button, the counseling Gemma gives advice based on its insight and information. One thing you should know before trying the chatbot is that the program itself is quite slow :( We purchased a better GPU, but it still takes a long time to get an answer back. So please be patient and wait for the response if you want to try it.
Depression Detective Diary and Chatbot : https://huggingface.co/spaces/fidelkim/depression_detective_diary_chatbot
End of Description
This model has been uploaded using the Keras library and can be used with JAX, TensorFlow, and PyTorch backends.
For more details about the model architecture, check out config.json.