{
 "nbformat": 4,
 "nbformat_minor": 0,
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "47fPyWltjSqE"
   },
   "outputs": [],
   "source": [
    "!pip install transformers sentencepiece"
   ]
  },
  {
   "cell_type": "code",
   "source": [
    "from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer\n",
    "\n",
    "hi_text = \"जीवन एक चॉकलेट बॉक्स की तरह है।\"\n",
    "chinese_text = \"生活就像一盒巧克力。\"\n",
    "\n",
    "model = M2M100ForConditionalGeneration.from_pretrained(\"facebook/m2m100_1.2B\")\n",
    "model.eval()\n",
    "# In PyTorch, `model.eval()` puts the model in evaluation mode. Some layers\n",
    "# behave differently in training and in evaluation:\n",
    "#\n",
    "# 1. Batch Normalization and Dropout\n",
    "#    - In training, Batch Normalization normalizes with the statistics of the\n",
    "#      current batch and Dropout randomly zeroes activations.\n",
    "#    - In evaluation mode, Batch Normalization uses the running statistics\n",
    "#      accumulated during training and Dropout is disabled, so the output for\n",
    "#      a given input is deterministic.\n",
    "#\n",
    "# 2. Gradient computation\n",
    "#    - `model.eval()` does not turn off autograd. To avoid building the\n",
    "#      computation graph and save memory during inference, run the forward\n",
    "#      pass under `torch.no_grad()`. `model.generate()` already does this.\n",
    "#\n",
    "# `from_pretrained()` returns the model in evaluation mode by default, so the\n",
    "# explicit call above is redundant but harmless.\n",
    "tokenizer = M2M100Tokenizer.from_pretrained(\"facebook/m2m100_1.2B\")"
   ],
   "metadata": {
    "id": "ziPisPX_jXNC"
   },
   "execution_count": null,
   "outputs": []
  },
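  {
   "cell_type": "code",
   "source": [
    "# A minimal sketch, not part of the original notebook, illustrating the note on\n",
    "# `model.eval()` in the previous cell. It assumes only that PyTorch is installed\n",
    "# (the translation cells already rely on it via return_tensors=\"pt\"); the tiny\n",
    "# `layer` below is a hypothetical stand-in for the M2M100 model.\n",
    "import torch\n",
    "from torch import nn\n",
    "\n",
    "layer = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))\n",
    "x = torch.ones(1, 4)\n",
    "\n",
    "layer.eval()  # Dropout becomes a no-op; autograd is still active\n",
    "print(layer.training)  # False\n",
    "print(layer(x).requires_grad)  # True\n",
    "\n",
    "with torch.no_grad():  # this, not eval(), turns off gradient tracking\n",
    "    print(layer(x).requires_grad)  # False"
   ],
   "metadata": {},
   "execution_count": null,
   "outputs": []
  },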
  {
   "cell_type": "code",
   "source": [
    "# translate Hindi to French\n",
    "tokenizer.src_lang = \"hi\"\n",
    "encoded_hi = tokenizer(hi_text, return_tensors=\"pt\")\n",
    "generated_tokens = model.generate(\n",
    "    **encoded_hi, forced_bos_token_id=tokenizer.get_lang_id(\"fr\")\n",
    ")\n",
    "tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "00h7PwrOjehw",
    "outputId": "eb4e92ec-5e00-452d-8ead-d06d2e23b78e"
   },
   "execution_count": 3,
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "['La vie est comme une boîte de chocolat.']"
      ]
     },
     "metadata": {},
     "execution_count": 3
    }
   ]
  },
  {
   "cell_type": "code",
   "source": [
    "# translate Chinese to English\n",
    "tokenizer.src_lang = \"zh\"\n",
    "encoded_zh = tokenizer(chinese_text, return_tensors=\"pt\")\n",
    "generated_tokens = model.generate(\n",
    "    **encoded_zh, forced_bos_token_id=tokenizer.get_lang_id(\"en\")\n",
    ")\n",
    "tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ifzvH6Ezj62j",
    "outputId": "c5c6307d-5811-4978-f565-709e22d4a16b"
   },
   "execution_count": 4,
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "['Life is like a box of chocolate.']"
      ]
     },
     "metadata": {},
     "execution_count": 4
    }
   ]
  },
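  {
   "cell_type": "code",
   "source": [
    "# A minimal sketch, not part of the original notebook: the two translation cells\n",
    "# above repeat the same steps, so they could be wrapped in one function.\n",
    "# `translate` is a hypothetical helper name; it assumes `model`, `tokenizer`,\n",
    "# and `chinese_text` were defined by the earlier cells.\n",
    "def translate(text, src_lang, tgt_lang):\n",
    "    tokenizer.src_lang = src_lang  # language code of the input text\n",
    "    encoded = tokenizer(text, return_tensors=\"pt\")\n",
    "    generated = model.generate(\n",
    "        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)\n",
    "    )\n",
    "    # batch_decode returns a list with one string per input sentence\n",
    "    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]\n",
    "\n",
    "# translate Chinese to French\n",
    "translate(chinese_text, \"zh\", \"fr\")"
   ],
   "metadata": {},
   "execution_count": null,
   "outputs": []
  },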
  {
   "cell_type": "code",
   "source": [],
   "metadata": {
    "id": "YwHxXY-RkDPH"
   },
   "execution_count": null,
   "outputs": []
  }
 ]
}