Merge branch 'main' into pr/26
papers.csv +12 -12
papers.csv
CHANGED
@@ -575,7 +575,7 @@ FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Un
|
|
575 |
DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,"Ma, Tao*; Yang, Xuemeng; Zhou, Hongbin; Li, Xin; Shi, Botian; Liu, Junjie; Yang, Yuchen; Liu, Zhizheng; He, Liang; Li, Hongsheng; Li, Yikang; Qiao, Yu",poster,2306.06023,https://arxiv.org/abs/2306.06023,,https://huggingface.co/papers/2306.06023,,,,12,0
|
576 |
DETRs with Collaborative Hybrid Assignments Training,"Zong, Zhuofan*; Song, Guanglu; Liu, Yu",poster,2211.12860,https://arxiv.org/abs/2211.12860,https://github.com/Sense-X/Co-DETR,https://huggingface.co/papers/2211.12860,,,,3,0
|
577 |
Open Vocabulary Object Detection With an Open Corpus,"Wang, Jiong*; zhang, huiming; Hong, Haiwen; Jin, Xuan; He, Yuan; xue, hui; Zhao, Zhou",poster,,,,,,,,,
|
578 |
-
SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining,"Suri, Saksham*; Rambhatla, Sai Saketh ; Chellappa, Rama; Shrivastava, Abhinav",poster,2201.04620,https://arxiv.org/abs/2201.04620,,https://huggingface.co/papers/2201.04620,,,,4,
|
579 |
Unsupervised Anomaly Detection with Diffusion Probabilistic Model,"Zhang, Xinyi*; Li, Naiqi; Li, Jiawei; Dai, Tao; Jiang, Yong; Xia, Shu-Tao",poster,,,,,,,,,
|
580 |
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation,"Wang, Haiyang*; Tang, Hao; Shi, Shaoshuai; Li, Aoxue; Li, Zhenguo; Schiele, Bernt; Wang, Liwei",poster,,,,,,,,,
|
581 |
Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection,"Yao, Xincheng*; Li, Ruoqi; Qian, Zefeng; Luo, Yan; Zhang, Chongyang",poster,,,,,,,,,
|
@@ -592,7 +592,7 @@ Delving into Motion-Aware Matching for Monocular 3D Object Tracking,"Huang, Kuan
|
|
592 |
FB-BEV: BEV Representation from Forward-Backward View Transformations,"Li, Zhiqi*; Yu, Zhiding; Wang, Wenhai; Anandkumar, Animashree; Lu, Tong; Alvarez, Jose M",poster,,,,,,,,,
|
593 |
Learning from Noisy Data for Semi-Supervised 3D Object Detection,"Chen, Zehui; Li, Zhenyu; Wang, Shuo; Fu, Dengpan; Zhao, Feng*",poster,,,,,,,,,
|
594 |
Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data,"Dong, Na*; Zhang, Yongqiang; Ding, Mingli; Lee, Gim Hee",poster,2305.12833,https://arxiv.org/abs/2305.12833,,https://huggingface.co/papers/2305.12833,,,,4,0
|
595 |
-
Objects do not disappear: Video object detection by single-frame object location anticipation,"Liu, Xin*; Karimi Nejadasl, Fatemeh; van Gemert, Jan C; Booij, Olaf; Pintea, Silvia L",poster,2308.04770,https://arxiv.org/abs/2308.04770,https://github.com/L-KID/Videoobject-detection-by-location-anticipation,https://huggingface.co/papers/2308.04770,,,,5,
|
596 |
Unified Visual Relationship Detection with Vision and Language Models,"Zhao, Long*; Yuan, Liangzhe; Gong, Boqing; Cui, Yin; Schroff, Florian; Yang, Ming-Hsuan; Adam, Hartwig; Liu, Ting",poster,2303.08998,https://arxiv.org/abs/2303.08998,,https://huggingface.co/papers/2303.08998,,,,8,1
|
597 |
Universal Domain Adaptation via Compressive Attention Matching,"zhu, didi; Li, Yinchuan; Yuan, Junkun; Li, Zexi; Kuang, Kun; Wu, Chao*",poster,2304.11862,https://arxiv.org/abs/2304.11862,,https://huggingface.co/papers/2304.11862,,,,6,0
|
598 |
Unsupervised Domain Adaptive Detection with Network Stability Analysis,"Zhou, Wenzhang; Fan, Heng; Luo, Tiejian; Zhang, Libo*",poster,2308.08182,https://arxiv.org/abs/2308.08182,https://github.com/tiankongzhang/NSA,https://huggingface.co/papers/2308.08182,,,,4,0
|
@@ -639,7 +639,7 @@ EverLight: Indoor-Outdoor Editable HDR Lighting Estimation,"Karimi Dastjerdi, Mo
|
|
639 |
Prompt Tuning Inversion for Text-driven Image Editing Using Diffusion Models,"Dong, Wenkai*; Duan, Xiaoyue; Xue, Song; Han, Shumin",poster,2305.04441,https://arxiv.org/abs/2305.04441,,https://huggingface.co/papers/2305.04441,,,,4,0
|
640 |
Efficient Diffusion Training via Min-SNR Weighting Strategy,"Hang, Tiankai; Gu, Shuyang*; Li, Chen; Bao, Jianmin; Chen, Dong; Hu, Han; Geng, Xin; Guo, Baining",poster,2303.09556,https://arxiv.org/abs/2303.09556,https://github.com/TiankaiHang/Min-SNR-Diffusion-Training,https://huggingface.co/papers/2303.09556,,,,8,0
|
641 |
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion,"Xie, Jinheng; Li, Yuexiang; Huang, Yawen; Liu, Haozhe; Zhang, Wentian; Zheng, Yefeng; Shou, Mike Zheng*",poster,2307.10816,https://arxiv.org/abs/2307.10816,https://github.com/showlab/BoxDiff,https://huggingface.co/papers/2307.10816,,,,7,0
|
642 |
-
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance,"Hong, Susung*; Lee, Gyuseong; Jang, Wooseok; Kim, Seungryong",poster,2210.00939,https://arxiv.org/abs/2210.00939,,https://huggingface.co/papers/2210.00939,https://github.com/KU-CVLAB/Self-Attention-Guidance,,,4,
|
643 |
Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation,"WANG, Luozhou*; Yang, Shuai; Liu, Shu; Chen, Yingcong",poster,2307.08448,https://arxiv.org/abs/2307.08448,https://github.com/AndysonYs/Selective-Diffusion-Distillation,https://huggingface.co/papers/2307.08448,,,,4,0
|
644 |
Deep Image Harmonization with Learnable Augmentation,"Niu, Li*; Cao, Junyan; Cong, Wenyan; Zhang, Liqing",poster,2308.00376,https://arxiv.org/abs/2308.00376,https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization,https://huggingface.co/papers/2308.00376,,,,4,0
|
645 |
Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation,"YANG, Xin*; XU, Xiaogang; Chen, Yingcong",poster,2212.09262,https://arxiv.org/abs/2212.09262,,https://huggingface.co/papers/2212.09262,,,,3,0
|
@@ -666,7 +666,7 @@ Householder Projector for Unsupervised Latent Semantics Discovery,"Song, Yue*; Z
|
|
666 |
Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation,"Niu, Li*; Tan, Linfeng; Tao, Xinhao; Cao, Junyan; Guo, Fengjun; Long, Teng; Zhang, Liqing",poster,2308.00356,https://arxiv.org/abs/2308.00356,https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony,https://huggingface.co/papers/2308.00356,,,,7,0
|
667 |
One-Shot Generative Domain Adaptation,"Yang, Ceyuan*; Shen, Yujun; Zhang, Zhiyi; Xu, Yinghao; Zhu, Jiapeng; Wu, Zhirong; Zhou, Bolei",poster,2111.09876,https://arxiv.org/abs/2111.09876,,https://huggingface.co/papers/2111.09876,,,,7,0
|
668 |
Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time,"Chan, Cheng-Hung; Yuan, Cheng-Yang; Sun, Cheng; Chen, Hwann-Tzong*",poster,,,,,,,,,
|
669 |
-
"Versatile Diffusion: Text, Images and Variations All in One Diffusion Model","Xu, Xingqian*; Wang, Zhangyang; Zhang, Gong; Wang, Kai; Shi, Humphrey",poster,2211.08332,https://arxiv.org/abs/2211.08332,https://github.com/SHI-Labs/Versatile-Diffusion,https://huggingface.co/papers/2211.08332,,,,5,
|
670 |
Sound Source Localization is All about Cross-Modal Alignment,"Senocak, Arda*; Ryu, Hyeonggon; Kim, Junsik; Oh, Tae-Hyun; Pfister, Hanspeter; Chung, Joon Son",poster,,,,,,,,,
|
671 |
Class-Incremental Grouping Network for Continual Audio-Visual Learning,"Mo, Shentong; Pian, Weiguo; Tian, Yapeng*",poster,,,,,,,,,
|
672 |
Audio-Visual Class-Incremental Learning,"Pian, Weiguo*; Mo, Shentong; Guo, Yunhui; Tian, Yapeng",poster,2308.11073,https://arxiv.org/abs/2308.11073,https://github.com/weiguoPian/AV-CIL_ICCV2023,https://huggingface.co/papers/2308.11073,,,,4,0
|
@@ -742,7 +742,7 @@ Sparse Point Guided 3D Lane Detection,"Yao, Chengtang*; Yu, Lidong; Jia, Yunde;
|
|
742 |
A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection,"Zhang, Dingyuan*; Liang, Dingkang; Zou, Zhikang; Li, Jingyu; Ye, Xiaoqing; Tan, Xiao; Liu, Zhe; Bai, Xiang",poster,,,,,,,,,
|
743 |
Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction,"Pourkeshavarz, Mozhgan MP*; Chen, Changhe; Rasouli, Amir",poster,,,,,,,,,
|
744 |
FocalFormer3D : Focusing on Hard Instance for 3D Object Detection,"Chen, Yilun*; Yu, Zhiding; Chen, Yukang; Lan, Shiyi; Anandkumar, Animashree; Jia, Jiaya; Alvarez, Jose M",poster,2308.04556,https://arxiv.org/abs/2308.04556,https://github.com/NVlabs/FocalFormer3D,https://huggingface.co/papers/2308.04556,,,,7,1
|
745 |
-
Scene as Occupancy,"Tong, Wenwen; Sima, Chonghao*; Wang, Tai; Chen, Li; wu, silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang",poster,2306.02851,https://arxiv.org/abs/2306.02851,,https://huggingface.co/papers/2306.02851,,,,11,
|
746 |
Neural Scene Rasterization for Large Scene Rendering in Real-time,"Liu, Jeffrey Yunfan*; Chen, Yun; Yang, Ze; Wang, Jingkang; Manivasagam, Sivabalan; Urtasun, Raquel",poster,,,,,,,,,
|
747 |
A Game of Bundle Adjustment - Learning Efficient Convergence,"Belder, Amir*; VIVANTI, REFAEL; Tal, Ayellet",poster,,,,,,,,,
|
748 |
Efficient Transformer-based 3D Object Detection with Dynamic Token Halting,"Ye, Mao*; Meyer, Gregory P; Chai, Yuning; Liu, Qiang",poster,2303.05078,https://arxiv.org/abs/2303.05078,,https://huggingface.co/papers/2303.05078,,,,4,0
|
@@ -1060,7 +1060,7 @@ Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models,"Höllei
|
|
1060 |
LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses,"Stier, Noah; Angles, Baptiste; Yang, Liang*; yan, yajie; Colburn, Alex; Chuang, Ming",oral,2304.00054,https://arxiv.org/abs/2304.00054,,https://huggingface.co/papers/2304.00054,,,,6,0
|
1061 |
NDDepth: Normal-Distance Assisted Monocular Depth Estimation,"Shao, Shuwei*; pei, zhongcai; Chen, Weihai; Wu, Xingming; Li, Zhengguo",oral,,,,,,,,,
|
1062 |
LATR: 3D Lane Detection from Monocular Images with Transformer,"Luo, Yueru; Zheng, Chaoda; Yan, Xu; Tang, Kun; zheng, chao; Cui, Shuguang; Li, Zhen*",oral,2308.04583,https://arxiv.org/abs/2308.04583,https://github.com/JMoonr/LATR,https://huggingface.co/papers/2308.04583,,,,7,0
|
1063 |
-
DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving,"Jia, Xiaosong*; Gao, Yulu; Chen, Li; Yan, Junchi; Liu, Langechuan; Li, Hongyang",oral,2308.00398,https://arxiv.org/abs/2308.00398,,https://huggingface.co/papers/2308.00398,,,,6,
|
1064 |
Dynamic Point Fields,"Prokudin, Sergey*; Ma, Qianli; Raafat, Maxime; Valentin, Julien; Tang, Siyu",oral,2304.02626,https://arxiv.org/abs/2304.02626,,https://huggingface.co/papers/2304.02626,,,,5,0
|
1065 |
Generalizing Neural Human Fitting to Unseen Pose With Articulated E(3) Equivariance,"Feng, Haiwen*; Kulits, Peter; Liu, Shichen; Black, Michael J.; Fernandez Abrevaya, Victoria",oral,,,,,,,,,
|
1066 |
Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views,"Zhang, Siwei*; Ma, Qianli; Zhang, Yan; Aliakbarian, Sadegh; Cosker, Darren P; Tang, Siyu",oral,2304.06024,https://arxiv.org/abs/2304.06024,,https://huggingface.co/papers/2304.06024,,,,6,0
|
@@ -1582,7 +1582,7 @@ Efficient Deep Space Filling Curve,"Chen, Wanli *; Yao, Xufeng; Zhang, Xinyun; Y
|
|
1582 |
Q-Diffusion: Quantizing Diffusion Models,"Li, Xiuyu*; Liu, Yijiang; Lian, Long; Yang, Huanrui; Dong, Zhen; Kang, Daniel; Zhang, Shanghang; Keutzer, Kurt",poster,,,,,,,,,
|
1583 |
Lossy and Lossless (L$^2$) Post-training Model Size Compression,"Shi, Yumeng*; bai, shihao; Wei, Xiuying; Gong, Ruihao; Yang, Jianlei",poster,2308.04269,https://arxiv.org/abs/2308.04269,https://github.com/ModelTC/L2_Compression,https://huggingface.co/papers/2308.04269,,,,5,0
|
1584 |
Robustifying Token Attention for Vision Transformers,"Guo, Yong*; Stutz, David; Schiele, Bernt",poster,2303.11126,https://arxiv.org/abs/2303.11126,,https://huggingface.co/papers/2303.11126,,,,3,0
|
1585 |
-
Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; Neumann, Ulrich; Xu, Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,https://huggingface.co/papers/2307.13226,,,,5,
|
1586 |
Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",poster,,,,,,,,,
|
1587 |
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection,"Xie, Yichen*; Xu, Chenfeng; Rakotosaona, Marie-Julie; Rim, Patrick; Tombari, Federico; Keutzer, Kurt; TOMIZUKA, Masayoshi; Zhan, Wei",poster,2304.14340,https://arxiv.org/abs/2304.14340,https://github.com/yichen928/SparseFusion,https://huggingface.co/papers/2304.14340,,,,8,0
|
1588 |
Strata-NeRF : Neural Radiance fields for Stratified Scenes,"Dhiman, Ankit*; R, Srinath; Rangwani, Harsh; Parihar, Rishubh; Boregowda, Lokesh; Sridhar, Srinath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.10337,https://arxiv.org/abs/2308.10337,,https://huggingface.co/papers/2308.10337,,,,7,0
|
@@ -1761,7 +1761,7 @@ ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document
|
|
1761 |
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer,"Huang, Mingxin; Zhang, Jiaxin; Peng, Dezhi; Lu, Hao; Huang, Can; Liu, Yuliang; Bai, Xiang; Jin, Lianwen *",poster,2308.10147,https://arxiv.org/abs/2308.10147,https://github.com/mxin262/ESTextSpotter,https://huggingface.co/papers/2308.10147,,,,8,0
|
1762 |
Few shot font generation via transferring similarity guided global style and quantization local style,"Pan, Wei; Zhu, Anna*; Zhou, Xinyu; Iwana, Brian K; Li, Shilin",poster,,,,,,,,,
|
1763 |
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration,"Cao, Haoyu*; Bao, Changcun; Liu, Chaohu; Chen, Huang; Yin, Kun; Liu, Hao; Liu, Yinsong; Jiang, Deqiang; Sun, Xing",poster,,,,,,,,,
|
1764 |
-
Document Understanding Dataset and Evaluation (DUDE),"Van Landeghem, Jordy*; Tito, Rubén; Borchmann, Łukasz; Pietruszka, Michał; Joziak, Pawel; Powalski, Rafal; Jurkiewicz, Dawid; Coustaty, Mickael; Anckaert, Bertrand; Valveny, Ernest; Blaschko, Matthew B.; Moens, Sien; Stanislawek, Tomasz",poster,2305.08455,https://arxiv.org/abs/2305.08455,,https://huggingface.co/papers/2305.08455,,,,13,
|
1765 |
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition,"Cheng, Changxu*; Wang, Peng; Da, Cheng; Zheng, Qi; Yao, Cong",poster,2308.12774,https://arxiv.org/abs/2308.12774,,https://huggingface.co/papers/2308.12774,,,,5,0
|
1766 |
MolGrapher: Graph-based Visual Recognition of Chemical Structures,"Morin, Lucas*; Danelljan, Martin; Agea, M. Isabel; Nassar, Ahmed S; weber, valery; Meijer, Gerhard Ingmar; Staar, Peter W J; Yu, Fisher",poster,2308.12234,https://arxiv.org/abs/2308.12234,,https://huggingface.co/papers/2308.12234,,,,8,0
|
1767 |
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap,"Kim, Daehee; Kim, Yoonsik*; Kim, DongHyun; Lim, Yumin; Kim, Geewook; Kil, Taeho",poster,,,,,,,,,
|
@@ -1993,7 +1993,7 @@ Generating Visual Scenes from Touch,"Yang, Fengyu*; Zhang, Jiacheng; Owens, Andr
|
|
1993 |
Multimodal High-order Relation Transformer for Scene Boundary Detection,"Wei, Xi*; Shi, Zhangxiang; Zhang, Tianzhu; Yu, Xiaoyuan; Xiao, Lei",poster,,,,,,,,,
|
1994 |
Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0
|
1995 |
Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,,
|
1996 |
-
Multi-event Video-Text Retrieval,"Zhang, Gengyuan*; Ren, Jisen; Gu, Jindong; Tresp, Volker",poster,2308.11551,https://arxiv.org/abs/2308.11551,https://github.com/gengyuanmax/MeVTR,https://huggingface.co/papers/2308.11551,,,,4,
|
1997 |
Referring Image Segmentation Using Text Supervision,"Liu, Fang*; Liu, Yuhao; Kong, Yuqiu; Xu, Ke; Zhang, Lihe; Yin, Baocai ; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,2308.14575,https://arxiv.org/abs/2308.14575,https://github.com/fawnliu/TRIS,https://huggingface.co/papers/2308.14575,,,,8,0
|
1998 |
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning,"Guo, Xiaobao*; Muthuchamy Selvaraj, Nithish; Yu, Zitong; Kong, Wai-Kin Adams; Shen, Bingquan; Kot, Alex",poster,2303.12745,https://arxiv.org/abs/2303.12745,https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning,https://huggingface.co/papers/2303.12745,,,,6,0
|
1999 |
EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation,"Tan, Shuai; Ji, Bin; pan, ye*",poster,,,,,,,,,
|
@@ -2001,7 +2001,7 @@ CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-tra
|
|
2001 |
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video,"Wu, Xiuzhe; Hu, Pengfei; Wu, Yang*; Lyu, Xiaoyang; Cao, Yan-Pei; Shan, Ying; Yang, Wenming; Sun, Zhongqian; Qi, Xiaojuan",poster,,,,,,,,,
|
2002 |
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training,"Deng, Xinchi*; Shi, Han; Huang, Runhui; Li, Changlin; Xu, Hang; Han, Jianhua; Kwok, James; Zhao, Shen; Zhang, Wei; Liang, Xiaodan",poster,2308.11331,https://arxiv.org/abs/2308.11331,,https://huggingface.co/papers/2308.11331,,,,10,0
|
2003 |
A Retrospect to Multi-prompt Learning across Vision and Language,"Chen, Ziliang; Huang, Xin; Guan, Quanlong*; Lin, Liang; Luo, Weiqi",poster,,,,,,,,,
|
2004 |
-
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules,"Cheng , Zhi-Qi; Dai, Qi*; Hauptmann, Alexander ",poster,2304.02173,https://arxiv.org/abs/2304.02173,https://github.com/zhiqic/ChartReader,https://huggingface.co/papers/2304.02173,,,,6,
|
2005 |
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation,"Li, Hong*; Li, Xingyu; Hu, Pengbo ; Lei, Yinuo; Li, Chunxiao; Zhou, Yi",poster,,,,,,,,,
|
2006 |
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data,"Varma, Maya*; Delbrouck, Jean-Benoit; Hooper, Sarah; Chaudhari, Akshay S; Langlotz, Curtis",poster,2308.11194,https://arxiv.org/abs/2308.11194,,https://huggingface.co/papers/2308.11194,,,,5,0
|
2007 |
Robust Referring Video Object Segmentation with Cyclic Structural Consensus,"Li, Xiang*; Wang, Jinglu; Xu, Xiaohao; Li, Xiao; Raj, Bhiksha; Lu, Yan",poster,,,,,,,,,
|
@@ -2065,7 +2065,7 @@ StyleLipSync: Style-based Personalized Lip-sync Video Generation,"Ki, Taekyung*;
|
|
2065 |
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation,"Wang, Yuhan*; Jiang, Liming; Loy, Chen Change",poster,2308.16909,https://arxiv.org/abs/2308.16909,,https://huggingface.co/papers/2308.16909,,,,3,0
|
2066 |
3D-Aware Generative Model for Improved Side-View Image Synthesis,"Jo, Kyungmin; Jin, Wonjoon*; Choo, Jaegul; Lee, Hyunjoon; Cho, Sunghyun",poster,,,,,,,,,
|
2067 |
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer,"Yang, Serin*; HWANG, HYUNMIN; Ye, Jong Chul",poster,2303.08622,https://arxiv.org/abs/2303.08622,,https://huggingface.co/papers/2303.08622,,,,3,0
|
2068 |
-
FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis,"Seo, Seunghyeon; Chang, Yeonjin; Kwak, Nojun*",poster,2306.17723,https://arxiv.org/abs/2306.17723,,https://huggingface.co/papers/2306.17723,,,,3,
|
2069 |
Inverse problem regularization with hierarchical variational autoencoders,"Prost, Jean*; Houdard, Antoine; Almansa, Andres; Papadakis, Nicolas",poster,2303.11217,https://arxiv.org/abs/2303.11217,,https://huggingface.co/papers/2303.11217,,,,4,0
|
2070 |
3D-aware Blending with Generative NeRFs,"Kim, Hyunsu*; Lee, Gayoung; Choi, Yunjey; Kim, Jin-Hwa; Zhu, Jun-Yan",poster,2302.06608,https://arxiv.org/abs/2302.06608,,https://huggingface.co/papers/2302.06608,,,,5,0
|
2071 |
NeMF: Inverse Volume Rendering with Neural Microflake Field,"Zhang, Youjia; Xu, Teng; Yu, Junqing; Ye, YuTeng; Wang, Junle; Jing , Yanqing; Yu, Jingyi; Yang, Wei*",poster,2304.00782,https://arxiv.org/abs/2304.00782,,https://huggingface.co/papers/2304.00782,,,,8,0
|
@@ -2100,7 +2100,7 @@ Multi-view Spectral Polarization Propagation for Video Glass Segmentation,"Qiao,
|
|
2100 |
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction,"Le Moing, Guillaume*; Ponce, Jean; Schmid, Cordelia",poster,2211.14308,https://arxiv.org/abs/2211.14308,,https://huggingface.co/papers/2211.14308,,,,3,1
|
2101 |
Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation,"Chen, Eric M*; Holalkere, Sidhanth; Yan, Ruyu; Zhang, Kai; Davis, Abe",poster,2304.13681,https://arxiv.org/abs/2304.13681,,https://huggingface.co/papers/2304.13681,,,,5,0
|
2102 |
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models,"Lee, Jaewoong*; Jang, Sangwon; Jo, Jaehyeong; Yoon, Jaehong; Kim, Yunji; Kim, Jin-Hwa; Ha, Jung-Woo; Hwang, Sung Ju",poster,2304.01515,https://arxiv.org/abs/2304.01515,,https://huggingface.co/papers/2304.01515,,,,8,1
|
2103 |
-
Efficient Video Prediction via Sparsely Conditioned Flow Matching,"Davtyan, Aram*; Sameni, Sepehr; Favaro, Paolo",poster,2211.14575,https://arxiv.org/abs/2211.14575,,https://huggingface.co/papers/2211.14575,,,,3,
|
2104 |
Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting.,"Chowdhury, Pinaki Nath*; Bhunia , Ayan Kumar; Sain, Aneeshan; Koley, Subhadeep; Xiang, Tao; Song, Yi-Zhe",poster,,,,,,,,,
|
2105 |
Towards Instance-adaptive Inference for Federated Learning,"Feng, Chun-Mei*; Yu, Kai; Liu, Nian; Xu, Xinxing; Khan, Salman; Zuo, Wangmeng",poster,2308.06051,https://arxiv.org/abs/2308.06051,,https://huggingface.co/papers/2308.06051,,,,6,0
|
2106 |
TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception,"Chen, Yi-Hsin; Weng, Ying-Chieh; Kao, Chia Hao; CHIEN, CHENG; Chiu, Wei-Chen; Peng, Wen-Hsiao*",poster,,,,,,,,,
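Every removed row in the hunks above follows the same pattern: the second-to-last field is populated and the last field is empty. The sketch below shows one way such rows could be located with Python's standard `csv` module. It is an illustration, not the script used for this commit. The helper name `rows_missing_last_field` is hypothetical, and because the column names are not shown in this view, the code refers to fields by position only. The sample rows are copied from the hunks above.

```python
import csv
import io


def rows_missing_last_field(csv_text):
    """Yield (line_number, title) for rows whose last field is empty
    while the field before it is populated.

    Hypothetical helper: columns are addressed by position because the
    header row of papers.csv is not visible in this diff.
    """
    reader = csv.reader(io.StringIO(csv_text))
    for row in reader:
        if len(row) >= 2 and row[-1] == "" and row[-2] != "":
            # line_num matches the row index here because no field spans lines
            yield reader.line_num, row[0]


# Three rows taken from the diff: complete, missing last field, fully empty.
sample = (
    'Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,'
    'https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0\n'
    'Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; '
    'Neumann, Ulrich; Xu, Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,'
    'https://huggingface.co/papers/2307.13226,,,,5,\n'
    'Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",'
    'poster,,,,,,,,,\n'
)

incomplete = list(rows_missing_last_field(sample))
print(incomplete)
```

Rows with no metadata at all (such as the third sample row) are skipped on purpose, since the diff leaves those rows unchanged.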
|
|
|
575 |
DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,"Ma, Tao*; Yang, Xuemeng; Zhou, Hongbin; Li, Xin; Shi, Botian; Liu, Junjie; Yang, Yuchen; Liu, Zhizheng; He, Liang; Li, Hongsheng; Li, Yikang; Qiao, Yu",poster,2306.06023,https://arxiv.org/abs/2306.06023,,https://huggingface.co/papers/2306.06023,,,,12,0
|
576 |
DETRs with Collaborative Hybrid Assignments Training,"Zong, Zhuofan*; Song, Guanglu; Liu, Yu",poster,2211.12860,https://arxiv.org/abs/2211.12860,https://github.com/Sense-X/Co-DETR,https://huggingface.co/papers/2211.12860,,,,3,0
|
577 |
Open Vocabulary Object Detection With an Open Corpus,"Wang, Jiong*; zhang, huiming; Hong, Haiwen; Jin, Xuan; He, Yuan; xue, hui; Zhao, Zhou",poster,,,,,,,,,
|
578 |
+
SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining,"Suri, Saksham*; Rambhatla, Sai Saketh ; Chellappa, Rama; Shrivastava, Abhinav",poster,2201.04620,https://arxiv.org/abs/2201.04620,,https://huggingface.co/papers/2201.04620,,,,4,3
|
579 |
Unsupervised Anomaly Detection with Diffusion Probabilistic Model,"Zhang, Xinyi*; Li, Naiqi; Li, Jiawei; Dai, Tao; Jiang, Yong; Xia, Shu-Tao",poster,,,,,,,,,
|
580 |
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation,"Wang, Haiyang*; Tang, Hao; Shi, Shaoshuai; Li, Aoxue; Li, Zhenguo; Schiele, Bernt; Wang, Liwei",poster,,,,,,,,,
|
581 |
Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection,"Yao, Xincheng*; Li, Ruoqi; Qian, Zefeng; Luo, Yan; Zhang, Chongyang",poster,,,,,,,,,
|
|
|
592 |
FB-BEV: BEV Representation from Forward-Backward View Transformations,"Li, Zhiqi*; Yu, Zhiding; Wang, Wenhai; Anandkumar, Animashree; Lu, Tong; Alvarez, Jose M",poster,,,,,,,,,
|
593 |
Learning from Noisy Data for Semi-Supervised 3D Object Detection,"Chen, Zehui; Li, Zhenyu; Wang, Shuo; Fu, Dengpan; Zhao, Feng*",poster,,,,,,,,,
|
594 |
Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data,"Dong, Na*; Zhang, Yongqiang; Ding, Mingli; Lee, Gim Hee",poster,2305.12833,https://arxiv.org/abs/2305.12833,,https://huggingface.co/papers/2305.12833,,,,4,0
|
595 |
+
Objects do not disappear: Video object detection by single-frame object location anticipation,"Liu, Xin*; Karimi Nejadasl, Fatemeh; van Gemert, Jan C; Booij, Olaf; Pintea, Silvia L",poster,2308.04770,https://arxiv.org/abs/2308.04770,https://github.com/L-KID/Videoobject-detection-by-location-anticipation,https://huggingface.co/papers/2308.04770,,,,5,1
|
596 |
Unified Visual Relationship Detection with Vision and Language Models,"Zhao, Long*; Yuan, Liangzhe; Gong, Boqing; Cui, Yin; Schroff, Florian; Yang, Ming-Hsuan; Adam, Hartwig; Liu, Ting",poster,2303.08998,https://arxiv.org/abs/2303.08998,,https://huggingface.co/papers/2303.08998,,,,8,1
|
597 |
Universal Domain Adaptation via Compressive Attention Matching,"zhu, didi; Li, Yinchuan; Yuan, Junkun; Li, Zexi; Kuang, Kun; Wu, Chao*",poster,2304.11862,https://arxiv.org/abs/2304.11862,,https://huggingface.co/papers/2304.11862,,,,6,0
|
598 |
Unsupervised Domain Adaptive Detection with Network Stability Analysis,"Zhou, Wenzhang; Fan, Heng; Luo, Tiejian; Zhang, Libo*",poster,2308.08182,https://arxiv.org/abs/2308.08182,https://github.com/tiankongzhang/NSA,https://huggingface.co/papers/2308.08182,,,,4,0
|
|
|
639 |
Prompt Tuning Inversion for Text-driven Image Editing Using Diffusion Models,"Dong, Wenkai*; Duan, Xiaoyue; Xue, Song; Han, Shumin",poster,2305.04441,https://arxiv.org/abs/2305.04441,,https://huggingface.co/papers/2305.04441,,,,4,0
|
640 |
Efficient Diffusion Training via Min-SNR Weighting Strategy,"Hang, Tiankai; Gu, Shuyang*; Li, Chen; Bao, Jianmin; Chen, Dong; Hu, Han; Geng, Xin; Guo, Baining",poster,2303.09556,https://arxiv.org/abs/2303.09556,https://github.com/TiankaiHang/Min-SNR-Diffusion-Training,https://huggingface.co/papers/2303.09556,,,,8,0
|
641 |
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion,"Xie, Jinheng; Li, Yuexiang; Huang, Yawen; Liu, Haozhe; Zhang, Wentian; Zheng, Yefeng; Shou, Mike Zheng*",poster,2307.10816,https://arxiv.org/abs/2307.10816,https://github.com/showlab/BoxDiff,https://huggingface.co/papers/2307.10816,,,,7,0
|
642 |
+
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance,"Hong, Susung*; Lee, Gyuseong; Jang, Wooseok; Kim, Seungryong",poster,2210.00939,https://arxiv.org/abs/2210.00939,,https://huggingface.co/papers/2210.00939,https://github.com/KU-CVLAB/Self-Attention-Guidance,,,4,1
|
643 |
Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation,"WANG, Luozhou*; Yang, Shuai; Liu, Shu; Chen, Yingcong",poster,2307.08448,https://arxiv.org/abs/2307.08448,https://github.com/AndysonYs/Selective-Diffusion-Distillation,https://huggingface.co/papers/2307.08448,,,,4,0
|
644 |
Deep Image Harmonization with Learnable Augmentation,"Niu, Li*; Cao, Junyan; Cong, Wenyan; Zhang, Liqing",poster,2308.00376,https://arxiv.org/abs/2308.00376,https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization,https://huggingface.co/papers/2308.00376,,,,4,0
|
645 |
Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation,"YANG, Xin*; XU, Xiaogang; Chen, Yingcong",poster,2212.09262,https://arxiv.org/abs/2212.09262,,https://huggingface.co/papers/2212.09262,,,,3,0
|
|
|
666 |
Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation,"Niu, Li*; Tan, Linfeng; Tao, Xinhao; Cao, Junyan; Guo, Fengjun; Long, Teng; Zhang, Liqing",poster,2308.00356,https://arxiv.org/abs/2308.00356,https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony,https://huggingface.co/papers/2308.00356,,,,7,0
|
667 |
One-Shot Generative Domain Adaptation,"Yang, Ceyuan*; Shen, Yujun; Zhang, Zhiyi; Xu, Yinghao; Zhu, Jiapeng; Wu, Zhirong; Zhou, Bolei",poster,2111.09876,https://arxiv.org/abs/2111.09876,,https://huggingface.co/papers/2111.09876,,,,7,0
|
668 |
Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time,"Chan, Cheng-Hung; Yuan, Cheng-Yang; Sun, Cheng; Chen, Hwann-Tzong*",poster,,,,,,,,,
|
669 |
+
"Versatile Diffusion: Text, Images and Variations All in One Diffusion Model","Xu, Xingqian*; Wang, Zhangyang; Zhang, Gong; Wang, Kai; Shi, Humphrey",poster,2211.08332,https://arxiv.org/abs/2211.08332,https://github.com/SHI-Labs/Versatile-Diffusion,https://huggingface.co/papers/2211.08332,,,,5,1
|
670 |
Sound Source Localization is All about Cross-Modal Alignment,"Senocak, Arda*; Ryu, Hyeonggon; Kim, Junsik; Oh, Tae-Hyun; Pfister, Hanspeter; Chung, Joon Son",poster,,,,,,,,,
|
671 |
Class-Incremental Grouping Network for Continual Audio-Visual Learning,"Mo, Shentong; Pian, Weiguo; Tian, Yapeng*",poster,,,,,,,,,
|
672 |
Audio-Visual Class-Incremental Learning,"Pian, Weiguo*; Mo, Shentong; Guo, Yunhui; Tian, Yapeng",poster,2308.11073,https://arxiv.org/abs/2308.11073,https://github.com/weiguoPian/AV-CIL_ICCV2023,https://huggingface.co/papers/2308.11073,,,,4,0
|
|
|
742 |
A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection,"Zhang, Dingyuan*; Liang, Dingkang; Zou, Zhikang; Li, Jingyu; Ye, Xiaoqing; Tan, Xiao; Liu, Zhe; Bai, Xiang",poster,,,,,,,,,
|
743 |
Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction,"Pourkeshavarz, Mozhgan MP*; Chen, Changhe; Rasouli, Amir",poster,,,,,,,,,
|
744 |
FocalFormer3D : Focusing on Hard Instance for 3D Object Detection,"Chen, Yilun*; Yu, Zhiding; Chen, Yukang; Lan, Shiyi; Anandkumar, Animashree; Jia, Jiaya; Alvarez, Jose M",poster,2308.04556,https://arxiv.org/abs/2308.04556,https://github.com/NVlabs/FocalFormer3D,https://huggingface.co/papers/2308.04556,,,,7,1
|
745 |
+
Scene as Occupancy,"Tong, Wenwen; Sima, Chonghao*; Wang, Tai; Chen, Li; wu, silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang",poster,2306.02851,https://arxiv.org/abs/2306.02851,,https://huggingface.co/papers/2306.02851,,,,11,1
|
746 |
Neural Scene Rasterization for Large Scene Rendering in Real-time,"Liu, Jeffrey Yunfan*; Chen, Yun; Yang, Ze; Wang, Jingkang; Manivasagam, Sivabalan; Urtasun, Raquel",poster,,,,,,,,,
|
747 |
A Game of Bundle Adjustment - Learning Efficient Convergence,"Belder, Amir*; VIVANTI, REFAEL; Tal, Ayellet",poster,,,,,,,,,
|
748 |
Efficient Transformer-based 3D Object Detection with Dynamic Token Halting,"Ye, Mao*; Meyer, Gregory P; Chai, Yuning; Liu, Qiang",poster,2303.05078,https://arxiv.org/abs/2303.05078,,https://huggingface.co/papers/2303.05078,,,,4,0
|
|
|
1060 |
LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses,"Stier, Noah; Angles, Baptiste; Yang, Liang*; yan, yajie; Colburn, Alex; Chuang, Ming",oral,2304.00054,https://arxiv.org/abs/2304.00054,,https://huggingface.co/papers/2304.00054,,,,6,0
|
1061 |
NDDepth: Normal-Distance Assisted Monocular Depth Estimation,"Shao, Shuwei*; pei, zhongcai; Chen, Weihai; Wu, Xingming; Li, Zhengguo",oral,,,,,,,,,
|
1062 |
LATR: 3D Lane Detection from Monocular Images with Transformer,"Luo, Yueru; Zheng, Chaoda; Yan, Xu; Tang, Kun; zheng, chao; Cui, Shuguang; Li, Zhen*",oral,2308.04583,https://arxiv.org/abs/2308.04583,https://github.com/JMoonr/LATR,https://huggingface.co/papers/2308.04583,,,,7,0
|
1063 |
+
DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving,"Jia, Xiaosong*; Gao, Yulu; Chen, Li; Yan, Junchi; Liu, Langechuan; Li, Hongyang",oral,2308.00398,https://arxiv.org/abs/2308.00398,,https://huggingface.co/papers/2308.00398,,,,6,1
|
1064 |
Dynamic Point Fields,"Prokudin, Sergey*; Ma, Qianli; Raafat, Maxime; Valentin, Julien; Tang, Siyu",oral,2304.02626,https://arxiv.org/abs/2304.02626,,https://huggingface.co/papers/2304.02626,,,,5,0
|
1065 |
Generalizing Neural Human Fitting to Unseen Pose With Articulated E(3) Equivariance,"Feng, Haiwen*; Kulits, Peter; Liu, Shichen; Black, Michael J.; Fernandez Abrevaya, Victoria",oral,,,,,,,,,
|
1066 |
Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views,"Zhang, Siwei*; Ma, Qianli; Zhang, Yan; Aliakbarian, Sadegh; Cosker, Darren P; Tang, Siyu",oral,2304.06024,https://arxiv.org/abs/2304.06024,,https://huggingface.co/papers/2304.06024,,,,6,0
|
|
|
1582 |
Q-Diffusion: Quantizing Diffusion Models,"Li, Xiuyu*; Liu, Yijiang; Lian, Long; Yang, Huanrui; Dong, Zhen; Kang, Daniel; Zhang, Shanghang; Keutzer, Kurt",poster,,,,,,,,,
|
1583 |
Lossy and Lossless (L$^2$) Post-training Model Size Compression,"Shi, Yumeng*; bai, shihao; Wei, Xiuying; Gong, Ruihao; Yang, Jianlei",poster,2308.04269,https://arxiv.org/abs/2308.04269,https://github.com/ModelTC/L2_Compression,https://huggingface.co/papers/2308.04269,,,,5,0
|
1584 |
Robustifying Token Attention for Vision Transformers,"Guo, Yong*; Stutz, David; Schiele, Bernt",poster,2303.11126,https://arxiv.org/abs/2303.11126,,https://huggingface.co/papers/2303.11126,,,,3,0
|
1585 |
+
Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; Neumann, Ulrich; Xu, Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,https://huggingface.co/papers/2307.13226,,,,5,3
|
1586 |
Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",poster,,,,,,,,,
|
1587 |
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection,"Xie, Yichen*; Xu, Chenfeng; Rakotosaona, Marie-Julie; Rim, Patrick; Tombari, Federico; Keutzer, Kurt; TOMIZUKA, Masayoshi; Zhan, Wei",poster,2304.14340,https://arxiv.org/abs/2304.14340,https://github.com/yichen928/SparseFusion,https://huggingface.co/papers/2304.14340,,,,8,0
|
1588 |
Strata-NeRF : Neural Radiance fields for Stratified Scenes,"Dhiman, Ankit*; R, Srinath; Rangwani, Harsh; Parihar, Rishubh; Boregowda, Lokesh; Sridhar, Srinath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.10337,https://arxiv.org/abs/2308.10337,,https://huggingface.co/papers/2308.10337,,,,7,0
|
|
|
1761 |
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer,"Huang, Mingxin; Zhang, Jiaxin; Peng, Dezhi; Lu, Hao; Huang, Can; Liu, Yuliang; Bai, Xiang; Jin, Lianwen *",poster,2308.10147,https://arxiv.org/abs/2308.10147,https://github.com/mxin262/ESTextSpotter,https://huggingface.co/papers/2308.10147,,,,8,0
|
1762 |
Few shot font generation via transferring similarity guided global style and quantization local style,"Pan, Wei; Zhu, Anna*; Zhou, Xinyu; Iwana, Brian K; Li, Shilin",poster,,,,,,,,,
|
1763 |
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration,"Cao, Haoyu*; Bao, Changcun; Liu, Chaohu; Chen, Huang; Yin, Kun; Liu, Hao; Liu, Yinsong; Jiang, Deqiang; Sun, Xing",poster,,,,,,,,,
|
1764 |
+
Document Understanding Dataset and Evaluation (DUDE),"Van Landeghem, Jordy*; Tito, Rubén; Borchmann, Łukasz; Pietruszka, Michał; Joziak, Pawel; Powalski, Rafal; Jurkiewicz, Dawid; Coustaty, Mickael; Anckaert, Bertrand; Valveny, Ernest; Blaschko, Matthew B.; Moens, Sien; Stanislawek, Tomasz",poster,2305.08455,https://arxiv.org/abs/2305.08455,,https://huggingface.co/papers/2305.08455,,,,13,2
|
1765 |
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition,"Cheng, Changxu*; Wang, Peng; Da, Cheng; Zheng, Qi; Yao, Cong",poster,2308.12774,https://arxiv.org/abs/2308.12774,,https://huggingface.co/papers/2308.12774,,,,5,0
|
1766 |
MolGrapher: Graph-based Visual Recognition of Chemical Structures,"Morin, Lucas*; Danelljan, Martin; Agea, M. Isabel; Nassar, Ahmed S; weber, valery; Meijer, Gerhard Ingmar; Staar, Peter W J; Yu, Fisher",poster,2308.12234,https://arxiv.org/abs/2308.12234,,https://huggingface.co/papers/2308.12234,,,,8,0
|
1767 |
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap,"Kim, Daehee; Kim, Yoonsik*; Kim, DongHyun; Lim, Yumin; Kim, Geewook; Kil, Taeho",poster,,,,,,,,,
|
|
|
1993 |
Multimodal High-order Relation Transformer for Scene Boundary Detection,"Wei, Xi*; Shi, Zhangxiang; Zhang, Tianzhu; Yu, Xiaoyuan; Xiao, Lei",poster,,,,,,,,,
|
1994 |
Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0
|
1995 |
Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,,
|
1996 |
+
Multi-event Video-Text Retrieval,"Zhang, Gengyuan*; Ren, Jisen; Gu, Jindong; Tresp, Volker",poster,2308.11551,https://arxiv.org/abs/2308.11551,https://github.com/gengyuanmax/MeVTR,https://huggingface.co/papers/2308.11551,,,,4,1
|
1997 |
Referring Image Segmentation Using Text Supervision,"Liu, Fang*; Liu, Yuhao; Kong, Yuqiu; Xu, Ke; Zhang, Lihe; Yin, Baocai ; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,2308.14575,https://arxiv.org/abs/2308.14575,https://github.com/fawnliu/TRIS,https://huggingface.co/papers/2308.14575,,,,8,0
|
1998 |
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning,"Guo, Xiaobao*; Muthuchamy Selvaraj, Nithish; Yu, Zitong; Kong, Wai-Kin Adams; Shen, Bingquan; Kot, Alex",poster,2303.12745,https://arxiv.org/abs/2303.12745,https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning,https://huggingface.co/papers/2303.12745,,,,6,0
|
1999 |
EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation,"Tan, Shuai; Ji, Bin; pan, ye*",poster,,,,,,,,,
|
|
|
2001 |
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video,"Wu, Xiuzhe; Hu, Pengfei; Wu, Yang*; Lyu, Xiaoyang; Cao, Yan-Pei; Shan, Ying; Yang, Wenming; Sun, Zhongqian; Qi, Xiaojuan",poster,,,,,,,,,
|
2002 |
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training,"Deng, Xinchi*; Shi, Han; Huang, Runhui; Li, Changlin; Xu, Hang; Han, Jianhua; Kwok, James; Zhao, Shen; Zhang, Wei; Liang, Xiaodan",poster,2308.11331,https://arxiv.org/abs/2308.11331,,https://huggingface.co/papers/2308.11331,,,,10,0
|
2003 |
A Retrospect to Multi-prompt Learning across Vision and Language,"Chen, Ziliang; Huang, Xin; Guan, Quanlong*; Lin, Liang; Luo, Weiqi",poster,,,,,,,,,
|
2004 |
+
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules,"Cheng , Zhi-Qi; Dai, Qi*; Hauptmann, Alexander ",poster,2304.02173,https://arxiv.org/abs/2304.02173,https://github.com/zhiqic/ChartReader,https://huggingface.co/papers/2304.02173,,,,6,1
|
2005 |
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation,"Li, Hong*; Li, Xingyu; Hu, Pengbo ; Lei, Yinuo; Li, Chunxiao; Zhou, Yi",poster,,,,,,,,,
|
2006 |
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data,"Varma, Maya*; Delbrouck, Jean-Benoit; Hooper, Sarah; Chaudhari, Akshay S; Langlotz, Curtis",poster,2308.11194,https://arxiv.org/abs/2308.11194,,https://huggingface.co/papers/2308.11194,,,,5,0
|
2007 |
Robust Referring Video Object Segmentation with Cyclic Structural Consensus,"Li, Xiang*; Wang, Jinglu; Xu, Xiaohao; Li, Xiao; Raj, Bhiksha; Lu, Yan",poster,,,,,,,,,
|
|
|
2065 |
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation,"Wang, Yuhan*; Jiang, Liming; Loy, Chen Change",poster,2308.16909,https://arxiv.org/abs/2308.16909,,https://huggingface.co/papers/2308.16909,,,,3,0
|
2066 |
3D-Aware Generative Model for Improved Side-View Image Synthesis,"Jo, Kyungmin; Jin, Wonjoon*; Choo, Jaegul; Lee, Hyunjoon; Cho, Sunghyun",poster,,,,,,,,,
|
2067 |
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer,"Yang, Serin*; HWANG, HYUNMIN; Ye, Jong Chul",poster,2303.08622,https://arxiv.org/abs/2303.08622,,https://huggingface.co/papers/2303.08622,,,,3,0
|
2068 |
+
FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis,"Seo, Seunghyeon; Chang, Yeonjin; Kwak, Nojun*",poster,2306.17723,https://arxiv.org/abs/2306.17723,,https://huggingface.co/papers/2306.17723,,,,3,1
|
2069 |
Inverse problem regularization with hierarchical variational autoencoders,"Prost, Jean*; Houdard, Antoine; Almansa, Andres; Papadakis, Nicolas",poster,2303.11217,https://arxiv.org/abs/2303.11217,,https://huggingface.co/papers/2303.11217,,,,4,0
|
2070 |
3D-aware Blending with Generative NeRFs,"Kim, Hyunsu*; Lee, Gayoung; Choi, Yunjey; Kim, Jin-Hwa; Zhu, Jun-Yan",poster,2302.06608,https://arxiv.org/abs/2302.06608,,https://huggingface.co/papers/2302.06608,,,,5,0
|
2071 |
NeMF: Inverse Volume Rendering with Neural Microflake Field,"Zhang, Youjia; Xu, Teng; Yu, Junqing; Ye, YuTeng; Wang, Junle; Jing , Yanqing; Yu, Jingyi; Yang, Wei*",poster,2304.00782,https://arxiv.org/abs/2304.00782,,https://huggingface.co/papers/2304.00782,,,,8,0
|
|
|
2100 |
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction,"Le Moing, Guillaume*; Ponce, Jean; Schmid, Cordelia",poster,2211.14308,https://arxiv.org/abs/2211.14308,,https://huggingface.co/papers/2211.14308,,,,3,1
|
2101 |
Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation,"Chen, Eric M*; Holalkere, Sidhanth; Yan, Ruyu; Zhang, Kai; Davis, Abe",poster,2304.13681,https://arxiv.org/abs/2304.13681,,https://huggingface.co/papers/2304.13681,,,,5,0
|
2102 |
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models,"Lee, Jaewoong*; Jang, Sangwon; Jo, Jaehyeong; Yoon, Jaehong; Kim, Yunji; Kim, Jin-Hwa; Ha, Jung-Woo; Hwang, Sung Ju",poster,2304.01515,https://arxiv.org/abs/2304.01515,,https://huggingface.co/papers/2304.01515,,,,8,1
|
2103 |
+
Efficient Video Prediction via Sparsely Conditioned Flow Matching,"Davtyan, Aram*; Sameni, Sepehr; Favaro, Paolo",poster,2211.14575,https://arxiv.org/abs/2211.14575,,https://huggingface.co/papers/2211.14575,,,,3,1
|
2104 |
Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting.,"Chowdhury, Pinaki Nath*; Bhunia , Ayan Kumar; Sain, Aneeshan; Koley, Subhadeep; Xiang, Tao; Song, Yi-Zhe",poster,,,,,,,,,
|
2105 |
Towards Instance-adaptive Inference for Federated Learning,"Feng, Chun-Mei*; Yu, Kai; Liu, Nian; Xu, Xinxing; Khan, Salman; Zuo, Wangmeng",poster,2308.06051,https://arxiv.org/abs/2308.06051,,https://huggingface.co/papers/2308.06051,,,,6,0
|
2106 |
TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception,"Chen, Yi-Hsin; Weng, Ying-Chieh; Kao, Chia Hao; CHIEN, CHENG; Chiu, Wei-Chen; Peng, Wen-Hsiao*",poster,,,,,,,,,
|