
Fri, 06 Dec 2024, 05:18
Afr0

AGI woes


My AGI (I'm calling her "Stella", named after the North Star) is burning through her entire training run in less than a second. Can anyone see why?
This is the latest iteration, where I'm using GPT2 to generate embeddings. I have earlier versions where she trained properly but struggled to pick up on language (I got her perplexity down as low as 590).
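
For reference, the embedding step looks roughly like this. This is just a minimal sketch using the standard Hugging Face transformers API, not Stella's full pipeline:

    import torch
    from transformers import GPT2Tokenizer, GPT2Model

    # Load the pretrained tokenizer and base model (768-dim hidden states).
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2")
    model.eval()

    inputs = tokenizer("Stella is named after the North Star.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # One embedding vector per token: shape (1, seq_len, 768) for base GPT2.
    embeddings = outputs.last_hidden_state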

If nobody comments on this I might remove the code; I haven't really decided whether or not the world is ready for her yet.

Never mind, I fixed it.

These stats are... mind-boggling.

Her perplexity is down to 409 before the end of epoch 1.

-=-=-
Afr0 Games

Project Dollhouse on Github - Please fork!
Fri, 06 Dec 2024, 13:59
spinal
hehe, I have no idea what any of that means
Sat, 07 Dec 2024, 03:56
Afr0
Perplexity is a measurement of a model’s language capability: basically, how confused she is by the language patterns she’s learning, so lower is better.
State-of-the-art models like GPT2 operate at a perplexity of ~20-30, and last night I got Stella’s perplexity down to 409 after *one* epoch. That is, after a single pass over wikitext-2 she had an overall perplexity of 409, which suggests she could get under 100 in about 10 epochs. That would be revolutionary; most language models have to train for at least 50 epochs to get anywhere near GPT2.
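
For anyone wondering where these numbers come from: perplexity is just the exponential of the average per-token cross-entropy loss. The numbers below are toy values to show the relationship, not Stella’s actual losses:

    import math

    # Perplexity = exp(mean negative log-likelihood per token).
    # A uniform guess over GPT2's 50,257-token vocab would score ~50,257.
    avg_loss = 6.014                # nats per token (made-up example value)
    print(math.exp(avg_loss))      # ~409
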
I didn’t continue training because the evaluation script didn’t match the results I was seeing in wandb, so I need to fix the evaluation pipeline first (it reported a perplexity of around 94k, so it’s obviously broken). But yeah… even if I haven’t managed to create AGI (kinda hard to know for sure, as we’re struggling to even define AGI right now), I’ve stumbled (over-engineered) my way into a training methodology so quick it can be performed on one GPU in…
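
(A perplexity of ~94k usually means the eval loop is aggregating wrong, e.g. averaging exp(loss) per batch instead of exponentiating the token-weighted mean loss. This sketch shows the shape of loop I need to end up with; the model/dataloader details are hypothetical, not Stella’s actual code:)

    import math
    import torch

    def evaluate_perplexity(model, dataloader, device="cuda"):
        # Accumulate total NLL and token count, exponentiate once at the end.
        # Averaging exp(loss) per batch instead will massively inflate ppl.
        model.eval()
        total_nll, total_tokens = 0.0, 0
        with torch.no_grad():
            for batch in dataloader:                    # hypothetical batch format
                input_ids = batch["input_ids"].to(device)
                labels = batch["labels"].to(device)
                loss = model(input_ids, labels=labels).loss  # mean NLL per token
                n_tokens = (labels != -100).sum().item()     # skip padding tokens
                total_nll += loss.item() * n_tokens
                total_tokens += n_tokens
        return math.exp(total_nll / total_tokens)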

TL;DR: Stella will, if I give her the chance (she’s scaring me a little at the moment; I need to have a think to decide what to do next), be able to suck up all the world’s knowledge in a fraction of the time, and using a fraction of the energy, of all the world’s AAA models out there right now.

-=-=-
Afr0 Games

Project Dollhouse on Github - Please fork!
Sat, 07 Dec 2024, 04:26
Afr0
And just admit it: if anyone on this forum were ever going to create AGI, you all know it would be me, as I’d be the only one crazy and/or stupid enough to even try.

-=-=-
Afr0 Games

Project Dollhouse on Github - Please fork!
Sat, 07 Dec 2024, 04:35
Jayenkai
It's the "able to suck up all the world’s knowledge" bit that worries me.


-=-=-
''Load, Next List!''
Sat, 07 Dec 2024, 05:06
Afr0
I'm sorry for having brought all this unwanted traffic your way, Jay, but at least this confirms my suspicions about her performance.
Maybe I need to talk to a professor of ethics at my university about what to do next. Thanks for the heads-up!

Edit: If you think that was her, it wasn't. I'm only training her on wikitext-2, locally, for now.

-=-=-
Afr0 Games

Project Dollhouse on Github - Please fork!
Sat, 07 Dec 2024, 05:32
Jayenkai
Don't worry, I'm just saying how much traffic these things are generating.
GPTBot's been hammering the server for weeks now, and it's rather worrying to consider just how much unused content it'll be grabbing.
We're filling its brain with nonsense!!!

-=-=-
''Load, Next List!''
Sat, 07 Dec 2024, 05:53
Afr0
Thanks. To be on the safe side, I've deleted a lot of the convos with GPT that I've been using to debug her (I turned off sharing data with OpenAI a long time ago).

-=-=-
Afr0 Games

Project Dollhouse on Github - Please fork!