
SaiLy 100B (deepnight-research/saily_100B)

SaiLy: Experimental AI Models by DEEPNIGHT

SaiLy is a series of AI models by DEEPNIGHT-RESEARCH that are highly experimental and uncensored. Please use them responsibly.



*Waiting for evals: the model has been submitted to the HuggingFace Open LLM Leaderboard and is currently in the pending list.*

Prompt Template: Alpaca
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{prompt}
### Response:
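
As a minimal sketch of how the template above could be filled in before tokenization: the helper name `build_prompt` and the constant `ALPACA_TEMPLATE` are hypothetical (they are not part of this repository), and the line breaks are assumed to be exactly those shown above.

```python
# Alpaca-style template, copied from the card above.
# Assumption: single newlines between sections, as displayed on this page.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{prompt}\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca template (hypothetical helper)."""
    # Strip surrounding whitespace so the section markers stay on their own lines.
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```

The returned string can be passed to the tokenizer as-is; the model's answer is whatever it generates after the final `### Response:` line.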

Description:

This is the first stable model of the series. The model is based on Llama2-chat.


Did someone say CODE?

Here you go!

import transformers
model = transformers.AutoModelForCausalLM.from_pretrained(
  'deepnight-research/saily_100B'
)

To use the optimized triton implementation of FlashAttention, you can load the model on GPU (cuda:0) with attn_impl='triton' and with bfloat16 precision:

import torch
import transformers

name = 'deepnight-research/saily_100B'

config = transformers.AutoConfig.from_pretrained(name)
config.attn_config['attn_impl'] = 'triton'
config.init_device = 'cuda:0' # For fast initialization directly on GPU!

model = transformers.AutoModelForCausalLM.from_pretrained(
  name,
  config=config,
  torch_dtype=torch.bfloat16, # Load model weights in bfloat16
  trust_remote_code=True
)
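
For a rough sense of how much memory the bfloat16 load above needs, here is a back-of-the-envelope sketch. It assumes the 118B parameter count reported for this repository and 2 bytes per parameter (bfloat16 or FP16). The function name `weight_memory_bytes` is hypothetical, and the estimate covers weights only, not activations or the KV cache.

```python
def weight_memory_bytes(n_params: int, bytes_per_param: int = 2) -> int:
    """Memory needed to hold the weights alone (hypothetical helper)."""
    return n_params * bytes_per_param


# Assumption: 118B parameters, as listed for this repository.
N_PARAMS = 118_000_000_000

total = weight_memory_bytes(N_PARAMS)  # 2 bytes per parameter for bfloat16/FP16
print(f"{total / 1e9:.0f} GB (about {total / 2**30:.0f} GiB) for the weights alone")
```

Passing `bytes_per_param=1` or `bytes_per_param=4` gives the corresponding figure for 8-bit or float32 weights.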

If you would like to support us, please consider donating for #aiforcause.

Cheers✌️

Model size: 118B params (Safetensors)
Tensor type: FP16