We just released Transformers.js v3.1 and you're not going to believe what's now possible in the browser w/ WebGPU! 🤯 Let's take a look:
- Janus from Deepseek for unified multimodal understanding and generation (Text-to-Image and Image-Text-to-Text)
- Qwen2-VL from Qwen for dynamic-resolution image understanding
- JinaCLIP from Jina AI for general-purpose multilingual multimodal embeddings
- LLaVA-OneVision from ByteDance for Image-Text-to-Text generation
- ViTPose for pose estimation
- MGP-STR for optical character recognition (OCR)
- PatchTST & PatchTSMixer for time series forecasting
That's right, everything running 100% locally in your browser (no data sent to a server)! 🔥 Huge for privacy!
Check out the release notes for more information.
https://github.com/huggingface/transformers.js/releases/tag/3.1.0
Demo link (+ source code): webml-community/Janus-1.3B-WebGPU
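If you want to try these models on WebGPU yourself, here is a rough sketch of picking a backend before loading anything. `pickDevice` is a hypothetical helper (not part of Transformers.js), and it assumes you fall back to the `'wasm'` backend when WebGPU isn't available:

```javascript
// Hypothetical helper, not part of Transformers.js: decide which backend to ask for.
// `nav` is a navigator-like object; in a browser you would pass `navigator`.
async function pickDevice(nav) {
  // WebGPU is exposed as `navigator.gpu`; it is undefined where unsupported.
  if (!nav || !nav.gpu) return 'wasm';
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'wasm';
  } catch {
    // Treat any failure while probing the GPU as "not available".
    return 'wasm';
  }
}

// In a browser, assuming `pipeline` is imported from @huggingface/transformers
// and `task` / `modelId` are placeholders for your own choices:
//   const device = await pickDevice(navigator);
//   const pipe = await pipeline(task, modelId, { device });
```

Either way the model runs on your own machine; the only difference is whether inference goes through the GPU or through WebAssembly.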