@victor I think the community is eagerly awaiting the next big month-long event, where everyone can come together to build something, like we used to in the past.
Abid Ali Awan
kingabzpro
AI & ML interests
LLMs, MLOps, ASR, & RL
Organizations
kingabzpro's activity
replied to
their
post
2 months ago
posted
an
update
2 months ago
Post
1199
I never imagined that Jenkins could be as powerful and easy to implement as GitHub Actions. Loving it. 🥰
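For readers curious what the comparison looks like in practice, here is a minimal declarative Jenkinsfile sketch; the stage names and shell commands are purely illustrative, not taken from any specific project:

```groovy
// Hypothetical minimal pipeline, roughly analogous to a
// two-job GitHub Actions workflow (install, then test).
pipeline {
    agent any
    stages {
        stage('Install') {
            steps {
                sh 'pip install -r requirements.txt'
            }
        }
        stage('Test') {
            steps {
                sh 'pytest tests/'
            }
        }
    }
}
```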
replied to
their
post
2 months ago
I'm having some issues with the RAG pipeline. It generally takes 0.2-2 seconds to respond, and most of the time the embedding model takes even longer. I can implement prompt caching, but I was considering a more hardware-oriented solution. What do you think about using Ray for distributed serving? Also, what do you think about GraphQL?
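Before reaching for distributed serving, a cheap first step is caching embeddings so repeated queries skip the slow embedding model entirely. A minimal sketch of the idea; `_embed_uncached` here is a hypothetical stand-in for whatever encoder the pipeline actually calls:

```python
from functools import lru_cache

def _embed_uncached(text: str) -> tuple:
    # Hypothetical stand-in for a real (slow) embedding model call;
    # replace with your actual encoder.
    return tuple(float(ord(c)) for c in text[:8])

@lru_cache(maxsize=4096)
def embed(text: str) -> tuple:
    # Repeated queries hit the in-process cache instead of the model.
    return _embed_uncached(text)

vec1 = embed("what is RAG?")
vec2 = embed("what is RAG?")  # served from cache, no model call
```

The same pattern extends to an external cache (e.g. Redis) if the service runs across multiple processes, where `lru_cache` alone would not be shared.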
Post
1838
How can I make my RAG application generate real-time responses? Up until now, I have been using Groq for fast LLM generation and the Gradio Live function. I am looking for a better solution that can help me build a real-time application without any delay.
@abidlabs
kingabzpro/Real-Time-RAG
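One way to make responses feel real-time without changing hardware is to stream tokens as they are generated rather than waiting for the full answer; both Groq's API and Gradio support generator-based streaming. A model-free sketch of the pattern, where `fake_llm_stream` is a hypothetical stand-in for a real streaming client:

```python
from typing import Iterator

def fake_llm_stream(prompt: str) -> Iterator[str]:
    # Hypothetical stand-in for a streaming LLM call;
    # a real client would yield tokens as they arrive over the wire.
    for token in ["Retrieval", " augmented", " generation"]:
        yield token

def stream_answer(prompt: str) -> Iterator[str]:
    # Yield a growing partial answer, the way a Gradio generator
    # function does, so the UI updates while generation is in flight.
    partial = ""
    for token in fake_llm_stream(prompt):
        partial += token
        yield partial

chunks = list(stream_answer("What is RAG?"))
```

The user sees the first token after one model step instead of after the whole generation, which is usually the biggest perceived-latency win.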
reacted to
merve's
post with 🔥🤗
5 months ago
Post
4206
I love Depth Anything V2!
It's Depth Anything, but scaled up with both a larger teacher model and a gigantic dataset!
Here's a small TL;DR of the paper, which has a lot of findings, experiments, and more.
I have also created a collection with the models, the dataset, the demo, and the CoreML-converted model: merve/depth-anything-v2-release-6671902e798cd404513ffbf5
The authors compared Marigold, a diffusion-based model, against Depth Anything and found out what's going on with using synthetic vs. real images for MDE:
• Real data has a lot of label noise and inaccurate depth maps (caused by depth sensors missing transparent objects, etc.), and many details are overlooked.
• Synthetic data has more precise and detailed depth labels that are truly ground truth, but there's a distribution shift between real and synthetic images, and synthetic data has restricted scene coverage.
The authors trained different image encoders only on synthetic images and found that unless the encoder is very large, the model can't generalize well (though large models generalize inherently anyway).
But even large encoders still fail on real images that have a wide distribution in labels (e.g., diverse instances of objects).
The Depth Anything V2 framework is to:
• Train a teacher model based on DINOv2-G on 595K synthetic images
• Label 62M real images using the teacher model
• Train a student model on the real images labeled by the teacher
Result: 10x faster and more accurate than Marigold!
The authors also construct a new benchmark called DA-2K that is less noisy, highly detailed, and more diverse!
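The three-step recipe above (train a teacher on synthetic data, pseudo-label real images with it, then train a student on those labels) can be sketched schematically. The toy "models" below are trivial stand-ins to show the data flow, not the actual DINOv2-based pipeline:

```python
# Toy illustration of the teacher -> pseudo-label -> student recipe.
# Each "model" just learns a single scale factor from (input, depth) pairs.

def fit_scale(pairs):
    # Stand-in for training: average the depth/input ratio.
    return sum(d / x for x, d in pairs) / len(pairs)

def train_teacher(synthetic_pairs):
    # Step 1: train the teacher on precisely labeled synthetic data.
    scale = fit_scale(synthetic_pairs)
    return lambda x: x * scale

def pseudo_label(teacher, real_inputs):
    # Step 2: the teacher annotates unlabeled real images.
    return [(x, teacher(x)) for x in real_inputs]

def train_student(labeled_pairs):
    # Step 3: the student learns only from teacher-generated labels.
    scale = fit_scale(labeled_pairs)
    return lambda x: x * scale

synthetic = [(1.0, 2.0), (2.0, 4.0)]   # (input, ground-truth depth)
teacher = train_teacher(synthetic)
real = [3.0, 5.0]                      # unlabeled "real" inputs
student = train_student(pseudo_label(teacher, real))
```

The point of the detour through pseudo-labels is that the student sees the distribution of real images while still learning from clean, synthetic-quality supervision.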
reacted to
DmitryRyumin's
post with 🔥
8 months ago
Post
New Research Alert - CVPR 2024!
Title: Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling
Description: Animatable Gaussians - a novel method for creating lifelike human avatars from RGB videos, utilizing 2D CNNs and 3D Gaussian splatting to capture pose-dependent garment details and dynamic appearances with high fidelity.
Authors: Zhe Li, Zerong Zheng, Lizhen Wang, and Yebin Liu
Conference: CVPR, Jun 17-21, 2024 | Seattle, WA, USA
Paper: Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling (2311.16096)
GitHub Page: https://animatable-gaussians.github.io
Repository: https://github.com/lizhe00/AnimatableGaussians
Video: https://www.youtube.com/watch?v=kOmZxD0HxZI
More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers collection curated by @DmitryRyumin
Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36
Keywords: #AnimatableGaussians #HumanAvatars #3DGaussianSplatting #CVPR2024 #DeepLearning #Animation #Innovation
reacted to
merve's
post with 🤗
10 months ago
Post
Posting about a very underrated model that tops paperswithcode across different segmentation benchmarks: OneFormer!
OneFormer is a "truly universal" model for semantic, instance, and panoptic segmentation tasks.
What makes it truly universal is that it's a single model, trained only once, that can be used across all tasks.
The enabler here is text conditioning: the model is given a text query stating the task type along with the appropriate input, and using a contrastive loss, it learns the difference between the task types (see the image below).
It's also super easy to use with transformers.
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation
from PIL import Image

# Load any RGB image to segment (the path is illustrative).
image = Image.open("example.jpg")

processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_swin_large")
model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_swin_large")

# Swap the post-processing call and task_inputs for other segmentation types.
semantic_inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
semantic_outputs = model(**semantic_inputs)
predicted_semantic_map = processor.post_process_semantic_segmentation(
    semantic_outputs, target_sizes=[image.size[::-1]]
)[0]
I have drafted a notebook for you to try right away ✨ https://colab.research.google.com/drive/1wfJhoTFqUqcTAYAOUc6TXUubBTmOYaVa?usp=sharing
You can also check out the Space without checking out the code itself: shi-labs/OneFormer