03/06 17:26:09 - OpenCompass - INFO - Task [my_api/siqa]
/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
03/06 17:26:17 - OpenCompass - INFO - Start inferencing [my_api/siqa]
[2024-03-06 17:26:18,118] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...

  0%|          | 0/245 [00:00<?, ?it/s]
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
Request Error:Expecting value: line 1 column 1 (char 0)
[2024-03-06 17:26:33,682] torch.distributed.elastic.agent.server.api: [WARNING] Received Signals.SIGINT death signal, shutting down workers
[2024-03-06 17:26:33,682] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 256371 closing signal SIGINT
[2024-03-06 17:26:33,988] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 256371 closing signal SIGTERM
Traceback (most recent call last):
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 727, in run
    result = self._invoke_run(role)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 868, in _invoke_run
    time.sleep(monitor_interval)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 62, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 256257 got signal: 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 734, in run
    self._shutdown(e.sigval)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 311, in _shutdown
    self._pcontext.close(death_sig)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 318, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 706, in _close
    handler.proc.wait(time_to_wait)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 62, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 256257 got signal: 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    result = agent.run()
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 123, in wrapper
    result = f(*args, **kwargs)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 739, in run
    self._shutdown()
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 311, in _shutdown
    self._pcontext.close(death_sig)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 318, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 699, in _close
    handler.close(death_sig=death_sig)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 584, in close
    os.killpg(self.proc.pid, death_sig)
  File "/export/home/tanwentao1/anaconda3/envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 62, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 256257 got signal: 2