Cybersecurity LLM Indic

Model Card

Overview

Cybersecurity LLM Indic is a large language model fine-tuned for cybersecurity tasks. It was trained on a curated dataset of cybersecurity data, tips, and guidelines drawn from Indian government sources. The fine-tuning set contains approximately 3,000 rows and focuses on cybersecurity in the Indian context.

Base Model

The base model is Navarasa 2.0 2B Gemma Instruct, which, as its name indicates, is an instruction-tuned model in the 2B-parameter Gemma family. It serves as the foundation for this specialized cybersecurity model.

Training Data

The training dataset comprises a diverse collection of cybersecurity-related information, including:

  • Guidelines and advisories from Indian government agencies
  • Best practices for securing information systems and networks
  • Tips for individuals and organizations to safeguard against cyber threats
  • Case studies and real-world examples of cybersecurity incidents and responses

Training Procedure

The model was fine-tuned using the following procedure:

  • Data Preparation: The raw data was cleaned and preprocessed to ensure high-quality input for training. This involved removing duplicates, correcting formatting issues, and standardizing terminology.
  • Fine-Tuning: The fine-tuning process involved training the model on the prepared dataset for several epochs, optimizing for performance on cybersecurity-related tasks.
  • Evaluation: The model was evaluated on a separate validation set to check the accuracy and relevance of its cybersecurity advice and guidelines.
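
The data preparation and validation split described above could look roughly like the sketch below. It is illustrative only: the card does not publish the actual pipeline, so the term mapping, the function names, and the 10% validation fraction are all assumptions.

```python
import random
import re

# Hypothetical mapping of term variants to one canonical spelling (illustration only).
TERM_MAP = {
    "cyber security": "cybersecurity",
    "cyber-security": "cybersecurity",
}


def normalize(text: str) -> str:
    """Collapse runs of whitespace and standardize terminology variants."""
    text = re.sub(r"\s+", " ", text).strip()
    for variant, canonical in TERM_MAP.items():
        text = re.sub(re.escape(variant), canonical, text, flags=re.IGNORECASE)
    return text


def prepare(rows: list[str]) -> list[str]:
    """Normalize rows, drop empty rows and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        norm = normalize(row)
        key = norm.casefold()
        if norm and key not in seen:
            seen.add(key)
            cleaned.append(norm)  # first-seen order is preserved
    return cleaned


def split(rows: list[str], val_fraction: float = 0.1, seed: int = 0):
    """Shuffle deterministically and hold out a validation slice."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

Calling `prepare` on the raw rows and then `split` on the result yields a training list and a held-out validation list with no overlapping rows.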

Use Cases

Cybersecurity LLM Indic can be utilized in various scenarios, including:

  • Education and Training: Providing comprehensive and accurate cybersecurity training materials.
  • Advisory Services: Offering real-time cybersecurity advice and best practices.
  • Policy Development: Assisting policymakers in drafting effective cybersecurity policies.
  • Incident Response: Guiding organizations in responding to cybersecurity incidents.
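
For the advisory use case, querying the model might look like the sketch below, using the Hugging Face `transformers` library. Several things here are assumptions rather than documented facts: the Alpaca-style prompt layout (the card does not state the prompt template), the helper names `build_prompt` and `generate_advice`, and the `repo_id` argument, which must be replaced with the model's actual repository id.

```python
# Assumption: an Alpaca-style prompt layout; the card does not document the real template.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, context: str = "") -> str:
    """Fill the (assumed) prompt template with a question and optional context."""
    return PROMPT_TEMPLATE.format(
        instruction=instruction.strip(), context=context.strip()
    )


def generate_advice(repo_id: str, instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model from the Hub and generate a response to one instruction."""
    # Imported lazily so build_prompt works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # float16 matches the FP16 weights listed for this model; on CPU-only
    # machines float32 may be the safer choice.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate_advice(repo_id, "How should I report a phishing email?")` downloads the weights on first use, so it needs network access and enough memory for a 2B-class model.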

Limitations

While Cybersecurity LLM Indic is a powerful tool for cybersecurity applications, it has certain limitations:

  • Domain-Specific Knowledge: The model is specialized for cybersecurity within the Indian context and may not perform as well on general or international cybersecurity issues.
  • Data Limitations: The training data consists of approximately 3,000 rows, a small dataset that cannot cover every possible cybersecurity scenario.
  • Continuous Learning: Cybersecurity is a rapidly evolving field, and the model may need periodic updates to stay current with new threats and best practices.

Ethical Considerations

The model was developed with a strong emphasis on ethical considerations, including:

  • Privacy: Ensuring that the training data does not contain sensitive or personally identifiable information.
  • Bias Mitigation: Efforts were made to minimize biases in the training data to ensure fair and unbiased advice.

License

This model is licensed under the Apache-2.0 License.

Contact Information

For more information or to provide feedback, please contact the development team at [contact email].

Model size: 2.51B params (Safetensors, tensor type FP16)