Commit: aae8bd7
Author: zR
Parent: c24133c

support transformers 4.44
Files changed:
- README.md +2 -0
- README_en.md +2 -0
- config.json +1 -1
- generation_config.json +1 -1
- modeling_chatglm.py +1 -4
README.md
CHANGED
@@ -16,6 +16,8 @@ inference: false
 
 Read this in [English](README_en.md).
 
+**2024/08/12: the code in this repository has been updated and now requires `transformers>=4.44.0`; please update your dependencies promptly.**
+
 **2024/07/24: we released our latest technical write-up on long-context work. See [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85) for our technical report on the long-context techniques used in training the open-source GLM-4-9B model.**
 
 ## Model Introduction
README_en.md
CHANGED
@@ -1,5 +1,7 @@
 # GLM-4-9B-Chat
 
+**2024/08/12: the repository code has been updated and now requires `transformers>=4.44.0`. Please update your dependencies accordingly.**
+
 **On July 24, 2024, we released the latest technical interpretation related to long texts. Check
 out [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85) to view our
 technical report on long context technology in the training of the open-source GLM-4-9B model.**
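Since both READMEs now pin `transformers>=4.44.0`, a guard in user code can catch stale environments before the repo's custom code is loaded. A minimal sketch, assuming only what the READMEs state (`packaging` ships as a transformers dependency):

```python
# Minimal version guard before loading this checkpoint.
# Assumes only the requirement stated in the READMEs: transformers>=4.44.0.
import transformers
from packaging import version

if version.parse(transformers.__version__) < version.parse("4.44.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too old for this "
        "checkpoint; install transformers>=4.44.0 "
        "(e.g. `pip install -U 'transformers>=4.44.0'`)."
    )
```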
config.json
CHANGED
@@ -38,7 +38,7 @@
   "seq_length": 131072,
   "use_cache": true,
   "torch_dtype": "bfloat16",
-  "transformers_version": "4.
+  "transformers_version": "4.44.0",
   "tie_word_embeddings": false,
   "eos_token_id": [151329, 151336, 151338],
   "pad_token_id": 151329
generation_config.json
CHANGED
@@ -9,5 +9,5 @@
   "temperature": 0.8,
   "max_length": 128000,
   "top_p": 0.8,
-  "transformers_version": "4.
+  "transformers_version": "4.44.0"
 }
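The two config hunks above fix the model's dtype, stop tokens, and sampling defaults. A hedged loading sketch under those settings; the repo id `THUDM/glm-4-9b-chat` and the prompt are illustrative assumptions, not part of this commit:

```python
# Sketch: load the checkpoint under the settings visible in the config hunks.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "THUDM/glm-4-9b-chat"  # assumed repo id, not stated in this commit

tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # matches "torch_dtype": "bfloat16" in config.json
    trust_remote_code=True,      # modeling_chatglm.py ships inside the repo
).eval()

messages = [{"role": "user", "content": "Hello"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature=0.8 and top_p=0.8 are picked up from generation_config.json
output = model.generate(input_ids, do_sample=True, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```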
modeling_chatglm.py
CHANGED
@@ -924,12 +924,9 @@ class ChatGLMForConditionalGeneration(ChatGLMPreTrainedModel):
             outputs: ModelOutput,
             model_kwargs: Dict[str, Any],
             is_encoder_decoder: bool = False,
-            standardize_cache_format: bool = False,
     ) -> Dict[str, Any]:
         # update past_key_values
-        cache_name, cache = self._extract_past_from_model_output(
-            outputs, standardize_cache_format=standardize_cache_format
-        )
+        cache_name, cache = self._extract_past_from_model_output(outputs)
         model_kwargs[cache_name] = cache
 
         # update attention mask
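The hunk above tracks a breaking change in transformers 4.44: `GenerationMixin._extract_past_from_model_output` no longer accepts `standardize_cache_format`, so the override drops both the extra parameter and the kwarg. The commit simply pins to the new signature; for a fork that must straddle both APIs, a version-tolerant variant could inspect the helper instead. This is a sketch under that assumption, not something this repo does:

```python
# Version-tolerant sketch of the override, intended as a drop-in replacement
# for ChatGLMForConditionalGeneration's method of the same name (assumption:
# the fork must run on both transformers <4.44 and >=4.44).
import inspect
from typing import Any, Dict

from transformers.utils import ModelOutput


def _update_model_kwargs_for_generation(
        self,
        outputs: ModelOutput,
        model_kwargs: Dict[str, Any],
        is_encoder_decoder: bool = False,
        **kwargs: Any,  # absorbs standardize_cache_format from older callers
) -> Dict[str, Any]:
    helper = self._extract_past_from_model_output
    if "standardize_cache_format" in inspect.signature(helper).parameters:
        # transformers <4.44 still expects the legacy kwarg
        cache_name, cache = helper(outputs, standardize_cache_format=False)
    else:
        # transformers >=4.44 removed the kwarg upstream
        cache_name, cache = helper(outputs)
    model_kwargs[cache_name] = cache
    # (attention-mask and position-id bookkeeping from the original
    # method continues here, unchanged by this commit)
    return model_kwargs
```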