takeraparterer
AI & ML interests
None yet
Recent Activity
replied to
TuringsSolutions's
post
about 10 hours ago
I created something called 'Hyperbolic Embeddings'. I literally just embed the tokens into Hyperbolic Space instead of Euclidean space. At first, this did not get me the gains I was expecting. I was a sad panda. Then I thought about it, a Hyperbolic Embedding needs a Hyperbolic Optimizer. So, instead of Adam, I used Riemannian Adam (RAdam). "Ladies and Gentlemen, We Got 'Em!"
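The core of this post can be sketched without any library: embeddings live in the open unit (Poincaré) ball, whose metric is conformal to the Euclidean one, so a Riemannian optimizer rescales the Euclidean gradient by the inverse metric factor. The poster used Riemannian Adam (available e.g. in the geoopt library); the hand-rolled Riemannian SGD step below (function name `riemannian_sgd_step` is hypothetical) only illustrates the key rescaling, not their actual setup.

```python
import math

def riemannian_sgd_step(x, euclidean_grad, lr=0.1, eps=1e-5):
    """One Riemannian SGD step on the Poincare ball.

    The ball's metric is conformal to the Euclidean metric with factor
    lambda(x) = 2 / (1 - ||x||^2), so the Riemannian gradient is the
    Euclidean gradient scaled by 1 / lambda(x)^2 = (1 - ||x||^2)^2 / 4.
    """
    sq_norm = sum(v * v for v in x)
    scale = ((1.0 - sq_norm) ** 2) / 4.0  # inverse metric factor
    new_x = [v - lr * scale * g for v, g in zip(x, euclidean_grad)]
    # Project back inside the open unit ball (simple clipping retraction).
    norm = math.sqrt(sum(v * v for v in new_x))
    max_norm = 1.0 - eps
    if norm >= max_norm:
        new_x = [v * max_norm / norm for v in new_x]
    return new_x

# Points near the boundary get tiny effective steps; a plain Euclidean
# optimizer like Adam ignores this geometry, which is one motivation for
# swapping in a Riemannian optimizer for hyperbolic embeddings.
x_next = riemannian_sgd_step([0.5, 0.0], [1.0, 0.0], lr=0.1)
```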
takeraparterer's activity
replied to
TuringsSolutions's
post
about 10 hours ago
replied to
TuringsSolutions's
post
about 11 hours ago
*fuck
replied to
TuringsSolutions's
post
about 11 hours ago
bro im sgd optimizer
replied to
TuringsSolutions's
post
about 11 hours ago
I don't think I'm who you think I am
replied to
TuringsSolutions's
post
about 13 hours ago
Ok
replied to
TuringsSolutions's
post
about 13 hours ago
Who's cr
replied to
TuringsSolutions's
post
about 14 hours ago
I think you should add dropout or decrease the size of the model
replied to
TuringsSolutions's
post
about 14 hours ago
I think your model is overfitting; you should add dropout or decrease the size of it
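The dropout suggestion in this reply is straightforward to sketch: during training each unit is zeroed with probability p and the survivors are rescaled by 1/(1 - p) so the expected activation is unchanged, and at eval time the layer is the identity. A minimal pure-Python sketch of inverted dropout (the helper name `dropout` is illustrative, not from the thread):

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout: during training, zero each unit with probability
    p and rescale survivors by 1/(1-p); identity at eval time."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

At p=0.5 every surviving activation is doubled, so the layer's expected output matches its input and no rescaling is needed at inference time.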
replied to
TuringsSolutions's
post
about 14 hours ago
Why so serious?
replied to
TuringsSolutions's
post
about 15 hours ago
You should try dropout or decreasing the model size
replied to
TuringsSolutions's
post
1 day ago
What about test loss? It looks like overfitting to me.
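The point of this reply: training loss alone cannot distinguish learning from memorization; you also need loss on held-out data, and a test loss that turns upward while training loss keeps falling is the classic overfitting signature. A small illustrative check (hypothetical helper and toy numbers, not the poster's actual run):

```python
def looks_overfit(train_losses, test_losses, patience=2):
    """Flag overfitting: training loss is still falling while test loss
    has been past its best value for at least `patience` epochs."""
    best_epoch = test_losses.index(min(test_losses))
    still_training = train_losses[-1] < train_losses[best_epoch]
    test_degrading = (len(test_losses) - 1 - best_epoch) >= patience
    return still_training and test_degrading

# Toy curves: train loss falls monotonically, test loss bottoms out and
# then climbs -- the gap between them keeps growing.
train = [1.00, 0.60, 0.35, 0.20, 0.10, 0.05]
test  = [1.10, 0.80, 0.70, 0.75, 0.85, 0.95]
```

With curves like these, the usual remedies are the ones suggested in the thread (dropout, a smaller model) plus early stopping at the test-loss minimum.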
replied to
TuringsSolutions's
post
1 day ago
What about test loss?
replied to
TuringsSolutions's
post
6 days ago
why so serious?
replied to
their
post
7 days ago
glad you like it!
replied to
TuringsSolutions's
post
7 days ago
Reacted to
TuringsSolutions's
post with ๐
7 days ago
Post
What if I told you that LLM models do not simply predict the next token in a sequence but instead utilize an emergent structural pattern-based system to comprehend language and concepts? I created a graph-based optimizer that not only works, but it also actually beats Adam, like very badly. I prove it thoroughly using SMOL LLM models. The secret? The graph is not what you think it is, humans. Code, full explanation, and more in this video. The Rhizome Optimizer is MIT licensed. I have completed my research. I fully understand now.
https://youtu.be/OMCRRueMhdI
replied to
TuringsSolutions's
post
16 days ago
has threatened violence in subtle ways
😭😭😭😭😭😭
replied to
TuringsSolutions's
post
16 days ago
very intriguing. looking into this ๐๐๐
replied to
TuringsSolutions's
post
16 days ago
Reported.
tell me more
Reacted to
TuringsSolutions's
post with ๐
16 days ago
Post
Are you familiar with the difference between discrete learning and predictive learning? This distinction is exactly why LLM models are not designed to perform and execute function calls, they are not the right shape for it. LLM models are prediction machines. Function calling requires discrete learning machines. Fortunately, you can easily couple an LLM model with a discrete learning algorithm. It is beyond easy to do, you simply need to know the math to do it. Want to dive deeper into this subject? Check out this video.
https://youtu.be/wBRem2p8iPM