mpt-7b-storysummarizer

This is a fine-tuned version of mosaicml/mpt-7b-storywriter intended for summarization and literary analysis of fiction stories.

The code for this model includes the adaptations from Birchlabs/mosaicml-mpt-7b-chat-qlora, which allow MPT models to be loaded with device_map="auto" and load_in_8bit=True. It also includes the latest MPT key-value cache code, which enables fast inference with transformers (hence use_cache is set to True in config.json).

How to Use

Format the prompt as the story text followed by a blank line and the marker for the desired task:

SOME_FICTION

### SUMMARY:

or

SOME_FICTION

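### ANALYSIS:

The model and tokenizer can be loaded with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("emozilla/mpt-7b-storysummarizer")
# load_in_8bit requires the bitsandbytes package; device_map="auto" requires accelerate.
# trust_remote_code=True is needed because the repo ships custom MPT modeling code.
model = AutoModelForCausalLM.from_pretrained(
    "emozilla/mpt-7b-storysummarizer",
    load_in_8bit=True,
    trust_remote_code=True,
    device_map="auto",
)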
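The prompt format above can be assembled with a small helper. This is a minimal sketch: build_prompt is a hypothetical name, and the exact whitespace around the marker (one blank line before it, a single newline after it) is an assumption based on how the format is displayed here, not something the card confirms.

```python
VALID_TASKS = ("SUMMARY", "ANALYSIS")


def build_prompt(story: str, task: str = "SUMMARY") -> str:
    """Return the story followed by the '### SUMMARY:' or '### ANALYSIS:' marker.

    Assumption: one blank line separates the story from the marker and the
    prompt ends with a single newline after the marker.
    """
    task = task.upper()
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    return f"{story.strip()}\n\n### {task}:\n"


if __name__ == "__main__":
    # Placeholder text stands in for the full story.
    print(build_prompt("SOME_FICTION", "analysis"))
```

The returned string can be passed to the tokenizer loaded above.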

Example outputs

Outputs on the text of Waystation City (6,287 tokens)

Summary

temperature=0.6, repetition_penalty=1.04, top_p=0.95, top_k=50, do_sample=True, max_new_tokens=1024

The Disappeared Twins asks Gerty to record their disappearance and she agrees on the condition that they tell her their story. After hearing their story, Gerty tells them that she will give them passage to the lower city and they accept. They board a late-night tram and Micheala joins them. At the end of the line, they bid goodbye to Gerty and Micheala advises them to wait patiently for their turn to disappear. The twins decide to wait and depart. As they descend the stairs leading into the lower city, they pass a sign warning them of the dangers awaiting them and arrive at the dock. Here, the twins paint a message on the wall reading "We Want To Stay What We Are" and then approach the ferry drivers. The ferry drivers notice them and prepare to attack, but Gerty warns them and they flee. The twins decide to follow the advice of the sign and wait. They watch the boats appear and disappear in the holes in the fabric of reality and eventually one appears that seems to be from the 1970's. The twins run forward and jump into the hole. However, as they leap into the hole, the boat disappears and the hole snaps shut. Claude calls out to Gerty to write down their story and then jumps into the hole. He lands in a field of wheat and waits for Gerty to catch up with him. When she arrives, he instructs her to write down the story of the Disappeared Twins. Claude then dissolves into the hole and Gerty watches as the field turns to sand, then rock, and then water. She sees him shaking his head and then he disappears.

Analysis

temperature=0.75, repetition_penalty=1.04, top_p=0.95, top_k=50, do_sample=True, max_new_tokens=1024

This chapter details the twins' journey to the Lower City, where the reader learns that it is not safe for anyone who wishes to leave to go. By describing the twins' experiences and thoughts, the reader gets a sense of what it feels like to be trapped in a seemingly permanent limbo, unable to move forward or backward in time. In addition, the readers realizes that despite the danger of staying, the twins cannot move on, nor do they seem interested in leaving. The twins seek out Gretel's expertise because they believe that she can help them get home and avoid dissolving entirely. However, Gretel explains that she can only help if they tell her their entire stories, including their pasts--something the twins have thus far avoided doing. Furthermore, they never speak of why they wish to return to the time in which they lived. The fact that they never explain their desire to leave suggests that this limbo is a sort of purgatory, meant to test the twins' resolve to stay or go. Thus, the twins may realize that they do not wish to leave Waystation City, but they have not quite figured out how to fully settle in.
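The sampling settings listed with the two outputs above can be passed to generate as keyword arguments. The sketch below assumes a model and tokenizer already loaded as in How to Use; generate_text, SUMMARY_KWARGS, and ANALYSIS_KWARGS are hypothetical names introduced for illustration, and because sampling is enabled the results will differ from run to run.

```python
# Settings copied from the example outputs above.
SUMMARY_KWARGS = dict(
    temperature=0.6,
    repetition_penalty=1.04,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    max_new_tokens=1024,
)
# The analysis example differs only in temperature.
ANALYSIS_KWARGS = {**SUMMARY_KWARGS, "temperature": 0.75}


def generate_text(model, tokenizer, prompt, gen_kwargs):
    """Generate a continuation of `prompt` and return only the new text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **gen_kwargs,
    )
    # Drop the prompt tokens so only the generated summary or analysis remains.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, generate_text(model, tokenizer, prompt, SUMMARY_KWARGS) would request a summary for a prompt ending in the ### SUMMARY: marker.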

Training

The model was trained on emozilla/booksum-summary-analysis_gptneox-8192, which is adapted from kmfoda/booksum. The training run was performed using llm-foundry on an 8xA100 80 GB node at 8,192 token sequence length using this configuration. The run can be viewed on wandb.
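Since training used 8,192-token sequences, it may be useful to check that a prompt plus the requested generation length stays within that window. The helpers below are a sketch with hypothetical names; treating 8,192 as a budget is an assumption for illustration, not a limit stated by this card.

```python
TRAIN_SEQ_LEN = 8192  # sequence length used for the training run


def count_tokens(tokenizer, text: str) -> int:
    """Count tokens in `text` using an already loaded tokenizer."""
    return len(tokenizer(text)["input_ids"])


def fits_training_length(prompt_tokens: int, max_new_tokens: int,
                         seq_len: int = TRAIN_SEQ_LEN) -> bool:
    """True if the prompt plus generated tokens stays within `seq_len`."""
    return prompt_tokens + max_new_tokens <= seq_len


if __name__ == "__main__":
    # The Waystation City example: 6,287 prompt tokens plus max_new_tokens=1024
    # gives 7,311 tokens, which is below 8,192.
    print(fits_training_length(6287, 1024))
```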
