This is actually much funnier than the article let on. Thanks for the clarification. It gets funnier still: this is the kind of shit Latitude actually trained AI Dungeon on:

"THE FINAL FURSECUTION IS AT HAND!"

Apparently, they scraped the entire database at chooseyourstory.com. I find it hard to believe that they were totally naïve about what the training data actually contained, given that it's trivially easy to perform word searches for profanity and sexual content.
Far more probable is that they knew all along, but decided to torpedo the game on purpose and run off with the money because it isn't economical to give your customers unlimited fucking supercomputer time even if they pay an arm and a leg for it.
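For a sense of how trivial that kind of audit is, a few lines of Python would do it. Purely illustrative: the filename and word list here are made up, and nothing is known about Latitude's actual pipeline.

```python
# Toy audit: count lines in a scraped story dump that hit a profanity/NSFW wordlist.
# "cys_dump.txt" and the word list are placeholders, not Latitude's actual data.
import re

FLAGGED = {"fuck", "shit", "cum", "rape"}  # example terms only
pattern = re.compile(r"\b(" + "|".join(map(re.escape, sorted(FLAGGED))) + r")\b", re.IGNORECASE)

hits = 0
total = 0
with open("cys_dump.txt", encoding="utf-8", errors="ignore") as f:
    for line in f:
        total += 1
        if pattern.search(line):
            hits += 1

print(f"{hits} of {total} lines matched the blocklist")
```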
GPT stands for Generative Pre-trained Transformer. As the name suggests, it's a language model built on the Transformer architecture, pre-trained on a very large corpus of text, which estimates the likelihood of the next word in a sequence given the overall context.
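For anyone who wants to see what "likelihood of the next word" actually means in practice, here's a minimal sketch using the freely available GPT-2 (GPT-3's weights aren't public) through the Hugging Face transformers library:

```python
# Minimal next-token-probability demo with GPT-2.
# Requires: pip install torch transformers
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "You enter the dungeon and see a"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, vocab_size)

# Probability distribution over the vocabulary for the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>12}  {prob.item():.3f}")
```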

Understanding Transformers, the machine learning model behind GPT-3
How this novel neural network architecture changes the way we analyze complex data types, and powers revolutionary models like GPT-3 and BERT.


OpenAI GPT-3: How It Works and Why It Matters - DZone AI
GPT-3 has many strengths, but it also has some weaknesses. Explore why it matters and how to use it to write code, design an app, and compose music.


Better Language Models and Their Implications
We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization.
Policy Implications
Large, general language models could have significant societal impacts, and also have many near-term applications. We can anticipate how systems like GPT-2 could be used to create:
- AI writing assistants
- More capable dialogue agents
- Unsupervised translation between languages
- Better speech recognition systems
We can also imagine the application of these models for malicious purposes, including the following (or other applications we can’t yet anticipate):
- Generate misleading news articles
- Impersonate others online
- Automate the production of abusive or faked content to post on social media
- Automate the production of spam/phishing content
These findings, combined with earlier results on synthetic imagery, audio, and video, imply that technologies are reducing the cost of generating fake content and waging disinformation campaigns. The public at large will need to become more skeptical of text they find online, just as the “deep fakes” phenomenon calls for more skepticism about images.[3]
GPT is no joke. It can spit out text that almost looks human. A GPT-driven bot could lead people on a wild goose chase all over Twitter, 4chan, you name it. The biggest barrier to its usage? It needs supercomputer time to run. Lots and lots of it. Particularly racks full of Nvidia GPGPU cards.

As one person put it:
Haha, that's not accurate at all. 300 GB is just the raw model parameters. You want to keep your working data around too, right? Keep in mind that actually training GPT-3 takes around 2,300 GB of HIGH-BANDWIDTH RAM, and the network parameters are a trade secret, so you'd really have to train it up yourself. Which took OpenAI/Microsoft a top-10 supercomputer with 10,000 GPUs to do. Even setting that aside and supposing OpenAI just hands out its trade secrets, at minimum you need another 300 GB of RAM cache for the results of the calculations, on top of the 300 GB needed to keep the model in memory.
The key here is still HIGH-BANDWIDTH: the Tesla V100 has a memory bandwidth of 1100 GB/s, and now you're talking about doing this over 16x PCIe 3.0 at 16 GB/s? You can, however, do everything on a single 3090 by streaming the weights over PCIe, and that still technically works.
Here is the estimate of GPT-3 inference speed per GPU, ignoring memory constraints:
https://medium.com/modern-nlp/estimating-gpt3-api-cost-50282f869ab8
1860 inferences/hour/GPU (with sequence length 1024, even though GPT-3's context is 2048).
Performance is memory-bandwidth bottlenecked: on a normal GPU you get 1100 GB/s of read bandwidth, while over PCIe you're getting 16 GB/s. That drops you from 1860 inferences/hour to 27 inferences/hour.
1 inference = 1 token.
At least in AI Dungeon, the default response length is 45 tokens, so we're talking about 1 hour 40 minutes to generate a single response.
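The back-of-the-envelope math checks out, for what it's worth. Here's the same estimate spelled out; the 1100 GB/s, 16 GB/s, 1860 inferences/hour, and 45-token figures are just the ones quoted above, so treat this as a sanity check rather than a benchmark:

```python
# Sanity check on the quoted numbers: if inference is memory-bandwidth bound,
# throughput scales roughly linearly with available bandwidth.
PARAMS = 175e9                 # GPT-3 parameter count
BYTES_PER_PARAM = 2            # fp16; ~350 GB, the same ballpark as the quoted ~300 GB

gpu_bandwidth = 1100           # GB/s, Tesla V100 HBM2 (quoted figure)
pcie_bandwidth = 16            # GB/s, 16x PCIe 3.0 (quoted figure)
on_gpu_rate = 1860             # inferences/hour with weights in GPU memory (quoted figure)

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
pcie_rate = on_gpu_rate * (pcie_bandwidth / gpu_bandwidth)   # ~27 inferences/hour
tokens_per_response = 45                                     # AI Dungeon default, per the quote
minutes_per_response = tokens_per_response / pcie_rate * 60  # ~100 minutes

print(f"weights: ~{weights_gb:.0f} GB")
print(f"throughput over PCIe: ~{pcie_rate:.0f} tokens/hour")
print(f"one 45-token response: ~{minutes_per_response:.0f} minutes")
```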
AI Dungeon is not economical in the slightest. If someone wanted to run GPT-3 locally, they would need to be very, very rich. As in, they'd need a half-million-dollar rackmount server in their basement. That's after the model has already been pre-trained. Actually training the fucker? Forget about it.

Microsoft announces new supercomputer, lays out vision for future AI work - The AI Blog
Microsoft has built one of the top five publicly disclosed supercomputers in the world, with new infrastructure available to train very large AI models.

The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five, Microsoft says. Hosted in Azure, the supercomputer also benefits from all the capabilities of a robust modern cloud infrastructure, including rapid deployment, sustainable datacenters and access to Azure services.
Yes, that's right. AI Dungeon's customers used literal fucking supercomputer time to generate text about fucking their goblin waifus. The insanity and hilarity of the entire enterprise really highlights just how far behind the hardware is, and how long it'll take before shit like this is actually running on people's personal devices. If Moore's Law holds out, that would be in about 10 to 20 years for top-end consumer hardware, and 20 to 30 years for mobile devices, barring some huge breakthrough in electronic device architectures or substrates.
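For the curious, that 10-to-20-year guess is just Moore's Law arithmetic: figure out how many doublings it takes for a top consumer card to hold a GPT-3-class model. The memory figures below are my own rough assumptions, not anything from the articles:

```python
# Rough Moore's Law arithmetic behind the "10 to 20 years" guess.
# Memory figures are assumptions for illustration, not measurements.
import math

doubling_period_years = 2.0    # classic Moore's Law cadence (optimistic these days)
consumer_gpu_memory_gb = 24    # e.g. an RTX 3090
needed_memory_gb = 350         # ~GPT-3 weights in fp16; ~700 GB in fp32

doublings = math.log2(needed_memory_gb / consumer_gpu_memory_gb)
print(f"~{doublings:.1f} doublings, ~{doublings * doubling_period_years:.0f} years")
# ~3.9 doublings, ~8 years at the optimistic cadence; a slower cadence or fp32
# weights push it out toward the 10-20 year range.
```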
It also tells us that we have less than a decade left before the internet has effectively no humans on it relative to the number and sophistication of the bots, all competing in a giant arena and parroting their creators' political views across the whole of social media.