6
Wisecrack
344d

Apparently, under the right architecture, Hopfield networks reduce to the attention mechanism in transformers.

Very damn cool discovery. Surprised that I'm only just reading about it.
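
For anyone who wants to poke at the idea, here is a minimal sketch of one commonly cited form of that reduction. It assumes the continuous "modern Hopfield" retrieval step, new_state = X softmax(beta * X^T state), which may not be exactly how the linked article presents it. The function names, the toy patterns, and beta = 8 are all made up for illustration. The point is only that one retrieval step gives the same numbers as attention where the stored patterns serve as both keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_update(patterns, state, beta=1.0):
    """One retrieval step of a continuous (modern) Hopfield network.

    Assumed update rule: new_state = sum_i softmax(beta * <x_i, state>)_i * x_i
    """
    scores = [beta * sum(p * s for p, s in zip(pattern, state)) for pattern in patterns]
    weights = softmax(scores)
    return [
        sum(w * pattern[d] for w, pattern in zip(weights, patterns))
        for d in range(len(state))
    ]


def attention(query, keys, values, scale):
    """Single-query dot-product attention: softmax(scale * q.K^T) V."""
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    return [
        sum(w * value[d] for w, value in zip(weights, values))
        for d in range(len(values[0]))
    ]


# Hypothetical toy data: three stored patterns and a noisy copy of the first.
patterns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
noisy = [0.9, 0.2, 0.1]
beta = 8.0

retrieved = hopfield_update(patterns, noisy, beta)
attended = attention(noisy, patterns, patterns, beta)  # keys = values = patterns
```

In this sketch `retrieved` and `attended` come out the same, and the result is pulled toward the first stored pattern. A transformer additionally applies learned projections to get queries, keys, and values, and uses 1/sqrt(d_k) as the scale, which this toy version leaves out.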

Image is a snapshot from the article.

Whole article here:
https://nature.com/articles/...

Comments
  • 2
    Transformers are awesome.

    Had GPT-2 running with the Python transformers lib, generating text in three lines of code.
  • 2
    @retoor How did it turn out?

    Have you experimented with any of the larger open-source language models yet?

    What kind of hardware were you running it on?
  • 2
    It worked fine, but only with a sentence length of 100, not more, because it ran on CPU. My CPU was an i5; I don't remember the specs. No GPU. Each GPT-2 answer took around a second.

    Yes, played with more models. You can use any model from the Hugging Face website in the transformers lib, AFAIK. And that's a lot: https://huggingface.co/models
  • 3
  • 1
    @electrineer looks lewd...
  • 2
    @retoor what a difference a gpu makes.

    I got my machine out of the trash, legit.

    Have better hardware, but I'd have to repurchase the other OS because of a motherboard change, and I'm like nah.

    Really don't like Intel fwiw, it's just the best I have because it's the most common thing around.

    Funny thing was, the last beast I found, and it really was a beast of a machine, had everything except the graphics card. And from the PSU and the size of the fans it must have been something *expensive*, say north of two grand; it's just that I got there after someone had already scavenged it.

    But given the amount of hardware they threw away, there was no way the owner was the one who took the card. It was someone who came afterwards and knew enough to know graphics cards are valuable, but didn't recognize a $200 PSU and a pair of $150 fans, not to mention the CPU, the RAM, the SSD, and the mobo.

    Upgrades are worth it when it comes to desktops, if you've got something that will take upgrades.
  • 1
    @Wisecrack nice. GPT-3 took two minutes per word without a GPU, btw.
  • 1
    @retoor yeah, but how much RAM did it take?

    I also get the sense, from the amount of research I've seen over just the last three years, that matrix mathematics (which most of the machine learning algorithms run on) is actually a really active field, and that they're *still* making significant 'low hanging fruit' discoveries on a year-to-year and month-to-month basis.

    There could well be some very large gains in efficiency and speed hiding somewhere in the ongoing research. We'll just have to wait and see.
  • 0
    @Wisecrack no idea how much RAM it took, but the laptop had 8 GB. It was a very basic ThinkPad X270.
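
For reference, the GPT-2 setup retoor describes in the thread probably looked something like the sketch below. This is a guess at the shape, not the actual code: it assumes the Hugging Face `transformers` package (plus a backend such as PyTorch) is installed, and the wrapper name `generate` is made up. The `pipeline("text-generation", model="gpt2")` call and the `max_length` argument are the library's own; `max_length=100` mirrors the length limit of 100 mentioned in the thread and is counted in tokens, prompt included.

```python
def generate(prompt, max_length=100):
    """Generate a continuation of `prompt` with GPT-2 on whatever device is available.

    `generate` is a hypothetical helper name; the core is the three lines below.
    """
    # Imported lazily because transformers is a heavy third-party dependency.
    from transformers import pipeline

    # Building the pipeline on every call is wasteful; fine for a one-off try.
    generator = pipeline("text-generation", model="gpt2")
    outputs = generator(prompt, max_length=max_length, num_return_sequences=1)
    return outputs[0]["generated_text"]
```

Calling `print(generate("Hopfield networks are"))` downloads the model weights on first use, so the first run takes noticeably longer than later ones. Swapping `"gpt2"` for another text-generation model name from https://huggingface.co/models should work the same way, memory permitting.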