Valeriy L

valeriylo

AI & ML interests

CV, NLP

Organizations

None yet

valeriylo's activity

liked a model 24 days ago

Vikhrmodels/Vikhr-2-VL-2b-Instruct-experimental

Image-Text-to-Text • Updated 25 days ago • 348 • 11

liked a Space about 1 month ago

📈 IC Light V2 • Running on Zero • 638

liked 2 Spaces 2 months ago

🖼 Whisper Speech X DreamTalk • Running on Zero • 147

Combine voice cloning and portrait lipsync animation

🌘w🌖 • Running on Zero • 991

alvdansen/littletinies

Text-to-Image • Updated Jun 16 • 2.54k • 390

upvoted a paper 7 months ago

KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published Apr 30 • 108

Reacted to Sentdex's post with 🔥 7 months ago

Post

8277

Okay, first pass over KAN: Kolmogorov–Arnold Networks, and it looks very interesting!

Interpretability of the KAN model:
Interpretability may be considered mostly a safety issue these days, but it can also serve as a form of interaction between the user and a model, as this paper argues, and I think they make a valid point here. With an MLP, we only interact with the outputs, but KAN is an entirely different paradigm and I find it compelling.

Scalability:
KAN shows better parameter efficiency than MLP. This likely also translates to needing less data. With the frontier LLMs, we're already at the point where all the data available on the internet is used and more is generated synthetically, so we kind of need something better.

Continual learning:
KAN can handle new input information without catastrophic forgetting, which helps keep a model up to date without relying on some database or retraining.

Sequential data:
This is probably what most people are curious about right now. KANs have not yet been shown to work with sequential data, and it's unclear what the best approach might be to make them work well, both in training and with regard to interpretability. That said, there's a rich, long history of handling sequential data in a variety of ways, so I don't think getting the ball rolling here would be too challenging.

Mostly, I just love a new paradigm and I want to see more!

KAN: Kolmogorov-Arnold Networks (2404.19756)

5 replies

Reacted to diwank's post with 🔥 7 months ago

Post

1655

Really excited to read about Kolmogorov-Arnold Networks as a novel alternative to Multi-Layer Perceptrons.

Excerpt:
> Kolmogorov-Arnold Networks (KANs) are promising alternatives of Multi-Layer Perceptrons (MLPs). KANs have strong mathematical foundations just like MLPs: MLPs are based on the universal approximation theorem, while KANs are based on Kolmogorov-Arnold representation theorem. KANs and MLPs are dual: KANs have activation functions on edges, while MLPs have activation functions on nodes. This simple change makes KANs better (sometimes much better!) than MLPs in terms of both model accuracy and interpretability.

https://github.com/KindXiaoming/pykan
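
To make the "activation functions on edges vs. on nodes" contrast from the excerpt concrete, here is a minimal sketch in plain Python. It is not the pykan API, and all names (`mlp_layer`, `edge_function`, `kan_layer`) are hypothetical. As a simplifying assumption, each edge function is a weighted sum of Gaussian bumps rather than the B-splines used in the paper.

```python
import math
import random

def mlp_layer(x, W, b):
    """MLP layer: learnable scalar weights on edges, fixed tanh on each node."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def edge_function(xi, coef, centers, width):
    """Learnable univariate function on one edge.

    Assumption for this sketch: a weighted sum of Gaussian bumps
    stands in for the paper's B-spline parameterization.
    """
    return sum(c * math.exp(-((xi - m) / width) ** 2)
               for c, m in zip(coef, centers))

def kan_layer(x, coefs, centers, width):
    """KAN-style layer: node j only sums its incoming edge functions."""
    return [sum(edge_function(xi, coefs[j][i], centers, width)
                for i, xi in enumerate(x))
            for j in range(len(coefs))]

random.seed(0)
n_in, n_out, n_basis = 3, 2, 5
x = [random.uniform(-1.0, 1.0) for _ in range(n_in)]

# MLP parameters: one weight per edge plus one bias per output node.
W = [[random.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
b = [0.0] * n_out

# KAN parameters: n_basis coefficients per edge, on a shared grid of centers.
centers = [-1.0 + 2.0 * k / (n_basis - 1) for k in range(n_basis)]
coefs = [[[random.gauss(0.0, 1.0) for _ in range(n_basis)]
          for _ in range(n_in)] for _ in range(n_out)]

mlp_out = mlp_layer(x, W, b)
kan_out = kan_layer(x, coefs, centers, width=0.5)

mlp_params = n_out * n_in + n_out
kan_params = n_out * n_in * n_basis
print(len(mlp_out), len(kan_out), mlp_params, kan_params)
```

Note that in this sketch a single KAN-style layer carries more parameters per edge than the MLP layer of the same width; the parameter-efficiency claim in the paper rests on KANs reaching comparable accuracy with much smaller networks, which this toy example does not demonstrate.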