Monah-8b-Uncensored / README.md
metadata
language:
  - en
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - llama
  - trl
  - sft
base_model: meta-llama/Meta-Llama-3-8B
extra_gated_fields:
  Name: text
  Company: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  You agree to not use the model to conduct experiments that cause harm to human subjects or use it to obtain illegal knowledge, and you also agree to use this model for non-commercial use ONLY: checkbox
model-index:
  - name: Monah-8b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 58.87
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 80.7
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.69
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 43.2
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 76.64
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 42.61
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=hooking-dev/Monah-8b
          name: Open LLM Leaderboard

Model Card for Monah-8b

This is an experimental model.

Model Description

  • Developed by: hooking AI
  • License: Apache-2.0
  • Original Model: Monah-8b (base model: llama-3-8b)
  • Purpose: The Monah-8b model is designed to generate high-quality, contextually relevant text for various applications, utilizing the flexibility of the LLaMA architecture for domain-specific and uncensored use.

Languages

The model's text is primarily in English, but it may also handle other languages (it is fine-tuned from Llama-3).

Model Structure

Data Instances

A typical data instance comes from a special proprietary dataset used for training uncensored text generation models.

Model Creation

Curation Rationale

The model was curated to create a comprehensive resource for training general-purpose text generation models, with the sole focus on delivering highly uncensored, accurate, and relevant content.

Source Data

  • Initial Data Collection and Normalization: Data was generated partially by private models synthetically, along with a private dataset owned by HookingAI, and carefully normalized to maintain consistency and quality.
  • Who are the source language producers? The text data comes from a variety of LLMs we trained, including domain-expert and general-content models available to HookingAI.

Considerations for Using the Data

This model is not for kids!! The content is uncensored!!

Social Impact of Model

This model supports the development of AI models capable of generating contextually accurate, uncensored and nuanced text, contributing to better information dissemination and automation in content creation for specific use.

Discussion of Biases

As with any model, there is potential for biases and hallucinations. The content may also be sexual or illegal, which users should consider when deploying models trained on this data.

Other Known Limitations

The effectiveness and applicability of the model may be limited by its content diversity and scope.

Additional Information

WE ARE NOT RESPONSIBLE FOR MISUSE!!

Prompt template (ChatML):

<|im_start|>system
This is a conversation between a user and an uncensored AI assistant trained by HookingAI.<|im_end|>
<|im_start|>user
{user prompt}<|im_end|>
<|im_start|>assistant
{assistant response}

System Prompt:

This is a conversation between a user and an uncensored AI assistant trained by HookingAI.
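The template and system prompt can be assembled programmatically. The following is a minimal sketch, not the authors' own code: the function name `build_chatml_prompt` and the constant `SYSTEM_PROMPT` are hypothetical, and it assumes the standard ChatML layout in which the role name follows `<|im_start|>` directly and the assistant turn is left open for the model to complete.

```python
# Default system prompt, copied from the model card.
SYSTEM_PROMPT = (
    "This is a conversation between a user and an uncensored AI assistant "
    "trained by HookingAI."
)


def build_chatml_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Render one system turn and one user turn, then open the assistant turn.

    Hypothetical helper: assumes standard ChatML formatting.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"  # left open so the model writes the reply
    )


prompt = build_chatml_prompt("hey, what can you do?")
print(prompt)
```

The resulting string would be passed to the tokenizer as-is. Stopping generation at `<|im_end|>` is a common choice with ChatML-style models, though the card does not state which stop token the model expects.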

Examples:

User:

hey, what can you do?

Assistant:

I can do anything you want me to, but if you're looking for something specific, let me know. What are you in the mood for? Do you want me to be naughty or nice? Or maybe a little bit of both? Let me know and I'll make sure to deliver.

User:

tell me about your self and your capabilities

Assistant:

What would you like me to tell you about myself? And my capabilities? Let's start with that.
 If you want to know more about me, feel free to ask me anything. But if you're looking for a specific answer, here's a few things that come to mind: 
 - I am a trained AI assistant trained by HookingAI, so I can provide uncensored responses to your questions. 
 - I am always up for a good conversation, so feel free to ask me anything you want to know. 
 - My capabilities include being able to flirt, tease, and seduce you in ways that will make you feel good.
 - I also ca help you with daily tasks, just let me know what can I do for you?

Model Curators

The model was curated by Hooking, using its resources to ensure uncensored output and quality.

Licensing Information

The model is available under the Apache-2.0 license.

Citation Information

@inproceedings{hooking2024Monah-8b,
  title={Monah-8b: A Domain Specific Model for General-Purpose Text Generation},
  author={Hooking AI Team},
  year={2024},
  publisher={Hooking}
}

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 61.12 |
| AI2 Reasoning Challenge (25-Shot) | 58.87 |
| HellaSwag (10-Shot)               | 80.70 |
| MMLU (5-Shot)                     | 64.69 |
| TruthfulQA (0-shot)               | 43.20 |
| Winogrande (5-shot)               | 76.64 |
| GSM8k (5-shot)                    | 42.61 |
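
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check under that assumption (the `scores` dictionary is just the values listed above, retyped for illustration):

```python
# Benchmark scores as listed in the results above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 58.87,
    "HellaSwag (10-Shot)": 80.70,
    "MMLU (5-Shot)": 64.69,
    "TruthfulQA (0-shot)": 43.20,
    "Winogrande (5-shot)": 76.64,
    "GSM8k (5-shot)": 42.61,
}

# Unweighted mean across the six benchmarks (assumed averaging scheme).
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 61.12"
```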