Vision Transformers for Dense Prediction
Paper: arXiv 2103.13413
DPT 3.0 (MiDaS) models, leveraging ViT and ViT-hybrid backbones
We introduce dense vision transformers, an architecture that leverages vision transformers in place of convolutional networks as a backbone for dense prediction tasks. We assemble tokens from various stages of the vision transformer into image-like representations at various resolutions and progressively combine them into full-resolution predictions using a convolutional decoder.
This model leverages a Vision Transformer (ViT) backbone for monocular depth estimation.
This model leverages a hybrid Vision Transformer (ViT-hybrid) backbone for monocular depth estimation.
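The reassembly step described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the DPT implementation: the grid size, embedding dimension, and nearest-neighbour upsampling are assumptions standing in for the paper's learned convolutional resampling.

```python
import numpy as np

# Hypothetical sizes: a 384x384 input with 16x16 patches yields a
# 24x24 grid of patch tokens, each with embedding dimension 768.
grid, dim = 24, 768
tokens = np.random.rand(grid * grid, dim)  # one stage's patch tokens (readout token already dropped)

# "Reassemble": place the token sequence back on the spatial grid to
# form an image-like feature map of shape (dim, grid, grid).
feature_map = tokens.reshape(grid, grid, dim).transpose(2, 0, 1)

# Resample to a stage-specific resolution. Nearest-neighbour repeat is
# used here as a stand-in for the learned (transposed) convolutions.
scale = 2  # e.g. upsample a deep stage by 2x before fusion
upsampled = feature_map.repeat(scale, axis=1).repeat(scale, axis=2)

print(feature_map.shape)  # (768, 24, 24)
print(upsampled.shape)    # (768, 48, 48)
```

In the full architecture, feature maps reassembled this way from several transformer stages are progressively fused by the convolutional decoder into a full-resolution prediction.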