no kernel image is available for execution on the device
Using ZeroGPU for projects with custom CUDA extensions is proving very challenging.
- Since no nvcc compiler is available on ZeroGPU, I have to precompile the extension .whl files on my local machine.
- Even with a precompiled .whl, I get the error "no kernel image is available for execution on the device", despite ensuring that my local build environment exactly matches the dependencies on ZeroGPU.
I've already spent hours troubleshooting this without success. Could you offer any suggestions?
Related spaces: https://huggingface.co/spaces/hzxie/city-dreamer
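For context, this error usually indicates a wheel compiled only for the local GPU's architecture. A minimal local-build sketch, assuming ZeroGPU serves A100s (compute capability 8.0) and using the extension path from the Space above:

import os
import subprocess

# Build the wheel with explicit CUDA arch targets instead of letting
# setup.py auto-detect the local GPU; "+PTX" embeds PTX so newer GPUs
# can JIT-compile the kernels as a fallback.
os.environ["TORCH_CUDA_ARCH_LIST"] = "8.0+PTX"
subprocess.check_call(
    ["pip", "wheel", ".", "--no-build-isolation", "-w", "dist/"],
    cwd="citydreamer/extensions/extrude_tensor",
)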
Error Logs:
To create a public link, set `share=True` in `launch()`.
[INFO] 2024-09-22 05:09:00,080 generated new fontManager
[INFO] 2024-09-22 05:09:00,340 HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-pod8a1341ae_14c0_4a55_a7a2_186a8f02387f.slice%2Fcri-containerd-64dfe113d446587ed59593a3f5a144d066dc5f901e9ee7868a35c6b9bb77552c.scope&taskId=140441127454608&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjIxOS43NC4xMTYuMjA4IiwidXNlciI6Imh6eGllIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI2OTc0NTk5fQ.P-YarKitYJ1myDu5A2W8tIx55s_KYL7YnLys3eMtZWU "HTTP/1.1 200 OK"
[INFO] 2024-09-22 05:09:00,385 HTTP Request: POST http://device-api.zero/allow?allowToken=9867384f64ee42173d467ec34e06815732f3034e8d19d7002f4add340d19a832&pid=276 "HTTP/1.1 200 OK"
[INFO] 2024-09-22 05:09:01,231 CUDA is available: True
[INFO] 2024-09-22 05:09:01,231 PyTorch is built with CUDA: 12.1
[INFO] 2024-09-22 05:09:05,008 Generating latent codes ...
[INFO] 2024-09-22 05:09:05,208 Generating seg volume ...
Error in extrude_tensor_ext_cuda_forward: no kernel image is available for execution on the device
[INFO] 2024-09-22 05:09:05,251 Rendering City Image ...
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
res = future.result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/user/app/app.py", line 94, in get_generated_city
return citydreamer.inference.generate_city(
File "/home/user/app/citydreamer/inference.py", line 80, in generate_city
img = render(
File "/home/user/app/citydreamer/inference.py", line 508, in render
buildings = torch.unique(voxel_id[voxel_id > CONSTANTS["BLD_INS_LABEL_MIN"]])
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
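For what it's worth, "no kernel image is available" generally means the compiled binary contains no kernels for the running GPU's compute capability. A quick check (on ZeroGPU this has to run inside a @spaces.GPU function, where the GPU is visible):

import torch

# Compare the running GPU's compute capability with the arches this
# PyTorch build ships kernels for; a custom extension must cover the
# same capability in TORCH_CUDA_ARCH_LIST, or ship PTX for JIT.
print(torch.cuda.get_device_capability(0))  # e.g. (8, 0) on an A100
print(torch.cuda.get_arch_list())           # e.g. ['sm_80', 'sm_86', ...]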
In a ZeroGPU Space, the GPU is barely visible from the global scope after startup, so for ordinary libraries the usual workaround is to install them like the snippet below, but that's tough when it comes to custom CUDA extensions...
import os, subprocess
# Merge into the current environment; passing a bare env dict would drop PATH.
subprocess.run('pip install flash-attn --no-build-isolation',
               env={**os.environ, 'FLASH_ATTENTION_SKIP_CUDA_BUILD': "TRUE"}, shell=True)
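(If I understand the setup correctly, FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE tells flash-attn's setup.py to skip compiling its CUDA kernels, which is what makes the install possible without nvcc.)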
@John6666
Thanks for your reply, but that doesn't solve my issue.
I moved the CUDA extension build step to the global scope, as below.
import os
import subprocess

# Compile CUDA extensions
# Ref: https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/110#66ef9672127751a231f83f80
ext_dir = os.path.join(os.path.dirname(__file__), "citydreamer", "extensions")
for e in os.listdir(ext_dir):
    if os.path.isdir(os.path.join(ext_dir, e)):
        subprocess.call(
            ["pip", "install", "./%s" % e, "--no-build-isolation"], cwd=ext_dir
        )
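(--no-build-isolation is needed so the extension's setup.py builds against the torch already installed in the Space instead of a fresh, isolated build environment.)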
And I got the following error:
Processing ./extrude_tensor
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [12 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/home/user/app/citydreamer/extensions/extrude_tensor/setup.py", line 20, in <module>
CUDAExtension(
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1074, in CUDAExtension
library_dirs += library_paths(cuda=True)
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1201, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2407, in _join_cuda_home
raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Powered by Google (Bing)!
As expected, this can't fool the installer?
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/5
# In a shell:
export CUDA_HOME=/usr/local/cuda-X.X

# Or from Python:
import os
os.environ["CUDA_HOME"] = "/usr/local/cuda-X.X"

# To see what torch itself resolved:
from torch.utils.cpp_extension import CUDA_HOME
It seems the CUDA toolkit genuinely doesn't exist on the container, at least as far as I can search. If all that matters is whether the path exists, it can be faked, but if the toolkit is really needed, is there any way to get it...?
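A minimal sketch of that "faking" idea, assuming an empty stub directory satisfies the existence check (it gets past the CUDA_HOME detection, but any real build still fails later because nvcc and the CUDA libraries aren't there):

import os
import pathlib

# Point CUDA_HOME at an empty stub; torch.utils.cpp_extension accepts the
# path, but a CUDAExtension build still fails at the compile/link stage.
stub = "/tmp/fake-cuda"
pathlib.Path(stub).mkdir(parents=True, exist_ok=True)
os.environ["CUDA_HOME"] = stub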
https://huggingface.co/spaces/TencentARC/InstantMesh/blob/main/app.py
import os
import shutil

def find_cuda():
    # Check if CUDA_HOME or CUDA_PATH environment variables are set
    cuda_home = os.environ.get('CUDA_HOME') or os.environ.get('CUDA_PATH')
    if cuda_home and os.path.exists(cuda_home):
        return cuda_home

    # Search for the nvcc executable in the system's PATH
    nvcc_path = shutil.which('nvcc')
    if nvcc_path:
        # Remove the 'bin/nvcc' part to get the CUDA installation path
        cuda_path = os.path.dirname(os.path.dirname(nvcc_path))
        return cuda_path

    return None

cuda_path = find_cuda()
if cuda_path:
    print(f"CUDA installation found at: {cuda_path}")
else:
    print("CUDA installation not found")
↓
CUDA installation not found
No need to try it; on ZeroGPU this function will definitely return None.
Okay.
I also tried packages.txt, to no avail: ffmpeg (irrelevant here) installed fine that way, but cuda / cuda-toolkit couldn't be located in any form...
https://huggingface.co/docs/hub/spaces-dependencies
@John6666
I solved this problem by manually installing the CUDA toolkit.
import os
import subprocess

def install_cuda_toolkit():
    CUDA_TOOLKIT_URL = "https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run"
    # CUDA_TOOLKIT_URL = "https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run"
    CUDA_TOOLKIT_FILE = "/tmp/%s" % os.path.basename(CUDA_TOOLKIT_URL)
    subprocess.call(["wget", "-q", CUDA_TOOLKIT_URL, "-O", CUDA_TOOLKIT_FILE])
    subprocess.call(["chmod", "+x", CUDA_TOOLKIT_FILE])
    subprocess.call([CUDA_TOOLKIT_FILE, "--silent", "--toolkit"])

    os.environ["CUDA_HOME"] = "/usr/local/cuda"
    os.environ["PATH"] = "%s/bin:%s" % (os.environ["CUDA_HOME"], os.environ["PATH"])
    os.environ["LD_LIBRARY_PATH"] = "%s/lib:%s" % (
        os.environ["CUDA_HOME"],
        "" if "LD_LIBRARY_PATH" not in os.environ else os.environ["LD_LIBRARY_PATH"],
    )
    # Fix: arch_list[-1] += '+PTX'; IndexError: list index out of range
    os.environ["TORCH_CUDA_ARCH_LIST"] = "8.0;8.6"
Good!
But the inference results are not the same as the ones on my local machine.
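One thing worth ruling out first (an assumption, not a diagnosis): ordinary kernel nondeterminism can also make outputs drift between machines. Fixing seeds and preferring deterministic kernels before comparing would isolate the extension build as the cause:

import torch

# Fix all seeds and ask PyTorch for deterministic kernels; any remaining
# difference then points at the arch/extension build, not randomness.
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
torch.use_deterministic_algorithms(True, warn_only=True)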
Looks like...
To Be Continued...
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/111#66efcbe227c9867906d4314d