By OpenAI Davinci Model, Self Published, 5e, Levels 1-3
Explore a story of “fractured societies and forbidden magic”, and discover the (quite unbalanced) magical power of the Sword of Segedwyn (or was it a staff?)
This thirteen page adventure was written by an AI program, with just a little guidance from a human feeding it some keywords to expand upon. 1) It is terrible. 2) It is interesting to see what the AI app can turn out. 3) It is NOT the worst thing I’ve ever reviewed, which is either a sad commentary on the state of RPG adventure affairs or on me and my ability to pick/review adventures.
I know, I said I was taking a break from 5e reviews for a while. But, this one is different! I know, I know, that’s what I always say, and tell myself. Anyway … I tell people various things when they ask about my degree. Some combination of philosophy and/or computer science, depending on the context. Which, while true, is not actually my degree. It’s actually in Cognitive Science, which was the fancy pants way of saying “Artificial Intelligence” back in the early nineties. Happily, I received a D in my only actual AI class, which makes sense, since I was the only person at the particle accelerator/cyclotron facility I worked at who had received a D in physics. So, anyway, this AI shit is now delivering and I have a passing interest in it. So you get to fucking suffer.
If you’ve followed the news at all you should be tangentially aware that computers are now generating text that human brains can perceive as being a story. There are models that create fiction, write news stories, and handle other basic tasks, all with varying degrees of success … varying degrees that are rapidly improving. The more focused your model is, the more specialized, the greater the degree of success the model has in generating something that the pattern recognition systems that live in our brains will string together into something we tell ourselves we recognize. They are good enough now, it looks like, that specialized programs, like those for news stories, can take basic facts and string them together into something to be published. We’re not talking Joyce here, but it’s enough. And thus we get to this.
What we’ve got, it appears, is an app that can create a text story. The human attached has fed it certain keywords to guide it, just a bit, into creating something like a module instead of a pure fiction story. This keyword guidance is, helpfully, bolded in the text of the document so you can see what the guidance was, and there are screenshots at the end showing the keywords and raw text generated. Basically, the human is using the keywords/leading phrases to guide the app into creating some text about it. Everything is up front and the vast, vast majority of the text is being generated by the AI app. It’s just poked in the ribs a few times to get it to expand some of the ideas/details it has previously generated so it better fits the model of an adventure.
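If you want to see roughly what that poking-in-the-ribs looks like in practice, here’s a minimal sketch assuming the completions-style OpenAI Python client from that era. The leading phrase is the human’s contribution; everything after it comes back from the model. The engine name, sampling settings, and placeholder key are illustrative, not pulled from the adventure’s actual setup.

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    # The human supplies a leading phrase; the model completes it.
    leading_phrase = "This is because the henchmen"

    response = openai.Completion.create(
        engine="davinci",      # the base Davinci model the review names
        prompt=leading_phrase,
        max_tokens=60,         # let it run on for a sentence or two
        temperature=0.8,       # some randomness, so reruns differ
    )

    print(leading_phrase + response.choices[0].text)

Run it a handful of times, keep the completions you like, bold the prompts, and you’ve got more or less the workflow the document describes.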
It’s not doing too shabby.
The basic plot is that the king and his henchmen are evil. The resistance wants you to take them down, and gives you a magic sword to do so. There’s a couple of double-crosses, including the sword, and a staff eventually fills in the role the sword was supposed to take as The Thing in the prophecy that actually kills the king.
This is a 5e adventure, so, you know, plot. Which actually works out ok since that’s what the AI app does in creating fiction. The generator has the ability to reference callbacks, things that have happened or were introduced earlier in the story. There are a couple of interesting things going on.
I’m sure that I’m reading too much into this, but, it looks like there may be some kind of hook or training in fiction? Or, maybe, I want that to be the case. In particular, the app has used the number 3. There are THREE henchmen of the king to be killed. Three is, of course, a magic number and tends to have great cultural significance. I don’t see any prompting by the human to use the number three, so its choice here is quite interesting. Further, the AI has generated the concept that the three henchmen, who must all be defeated in order to defeat the evil king, are actually one person with three bodies. The bolding is not the easiest to make out, but the human prompting seems minimal. It looks like the human prompting could be “This is because the henchmen “ and then the app has filled in “are actually one person with three bodies.” There we go! Given the use of the number “three” I guess I shouldn’t be surprised. And it seems to be able to understand a surprise/double-cross, so, turning the three into one is not a huge stretch. You can see a connection between this concept and the one with the magic sword/staff. You’re given the magic sword to kill the king and his henchmen. You need it to kill them. But, it’s a trap! It actually harms YOU. You need the STAFF in the throne room to actually kill them. There’s another example where a rebel is actually a bad guy and double-crosses you. Once you see the pattern “it knows how to turn a concept back on itself to generate surprise” then you start to see its use/overuse. I suspect that everything it generates has a lot of this in it. 🙂
Some of the prompting works better than other examples. An attempt to generate a magic amulet with “The amulet allows “ generates “the user to see through the eyes of the animals.” Not bad. But in other cases the prompting works less well. There’s an attempt to insert “Matt the Rat King” into the story and the generator essentially refuses to have anything to do with him except when prompted.
One of the conceits of this blog is that so many adventures fail because the designer cannot recognize that adventure writing is technical writing. This is the “formatting” part of the three part Brycian adventure model. The adventure fails utterly there, being just arranged in simple paragraph form with a few chapter heads that seem to be mostly meaningless. Interactivity is mostly limited to combats and some sneaking around. Not great, but, at least as good as most dreck adventures. Evocative writing is not particularly strong. The writing seems aimed at a lower grade level. This is most notable in a section in which the human prompted “The mansion is described as follows:” This gets us the following text from the generator: “This stone mansion, built withing the last decade, features high walls small windows, and iron-bound doors. A small door with a peephole stands opposite the main gate. The mansion is surrounded by a grove of dead trees. The grove is protected by a 10’ high, 10’ deep ditch filled with a yellowing noxious smell.”
It’s done a few interesting things here which stand out from the rest of the generated text. Its generation of a grove of trees SURROUNDED by the ditch is quite interesting. The pairing of the two ideas. They have no relation to anything else and never appear again, and, being adjacent to the castle, are no obstacle. But, the app has paired the two items, which is interesting. It also represents the best example of descriptive text in the adventure. A GROVE of DEAD trees. And a yellow noxious smell. We can quibble about smells being yellow. The philistine says NO, but the poet says YES, so, some awkwardness in the word usage can be a good thing. Mostly, though, the word choices are not too great. The king is a BAD man, and so on. There may be some usefulness in a model that avoids high usage words for adjectives and adverbs.
It’s generated a plot, but its ability to form that plot into coherent sections, akin to adventure beats, is currently lacking. Solving that issue, as well as its general tendency to pad by repetition, would elevate it greatly, maybe to the point of being a real adventure. Work on word choice would push things even further.
You can see how close this thing is to actually being useful, and this is in an almost fully automated manner. With some more guidance this could easily generate ideas for a human designer to “fix.” I look forward to the day in which DriveThru is flooded by these things, finally forcing a solution to the curation problem.
This is Pay What You Want at DriveThru with a suggested price of $1.50. I would check it out, as a curiosity at least. And $1.50 to support the dude’s research/efforts? That’s trivial.
dmsguild.com/product/372065/OpenAI-Series-1-The-Purging-of-Segedwyn?1892600
Using an AI is right in line with the fad for overuse (and inappropriate use) of random tables and improv-heavy RPGs. This nonsense is not limited to D&D. In engineering, the desire to push “training data” through “neural networks” (a highly flatulent term for a cascade of multipliers) in order to auto-build models divorced from logic, insight, or physics is equally BS.
Both boil down to a new paradigm in human laziness: not content with having the machines do the physical labor, we now want them to do the mental labor too.
All this lowest-form-of-science is presumably so we can play games and watch movies full-time (and somehow draw a full salary in our sedentary existence). Why not just go whole-hog and let the computers play D&D for us while we watch on YouTube and Tweet about it?
I may sound very old-man grumpy, but this sort of charlatan-science (by-and-large everything with the trendy “AI” label) needs to be seen for the snake-oil that it is. The existential threat Musk has warned about is not the machines becoming aware, but instead us hanging ourselves with our own rope while they mechanically and mindlessly assist.
To put it in D&D terms: In the eternal struggle between Law and Chaos, what is more chaotic than a device driven by a pseudo-random number generator?
Tale of tales
Thank you for reviewing this, Bryce! Glad we share the interest. Crazy that it’s barely bad.
I think there’s a lot of room for AI text generation to prod ideas and come up with potentially clever insights. It’s essentially a slicker version of the sorts of “cut-up” work from a hundred years ago.
The trick is to take the surprise generated by the technique and then edit it and polish it and polish it some more until it is good enough to use. Raw AI text is crap and probably always will be; there’s just no “there” there.
I wonder if we could link up the adventure-writing machine’s output with the adventure-reviewing machine, and push that shiny red button on the control panel.
Of course, “should?” shall not even enter our minds. We just roll that way.
> I look forward to the day in which DriveThru is flooded by these things, finally forcing a solution to the curation problem.
Very interesting point.
Hopefully they separate into “human” & AI at DriveThru.
Then again they’d need another AI to discover which is which.
How about reviewing the dozens of promising adventures I’ve seen suggested, and not the clearly shit ones? Although I guess this one is for posterity.
If you only highlight shit, that’s what you inadvertently promote. If you more regularly highlight quality, guess what that inspires…
Bryce does not only highlight shit. His diet is high in shit, but that’s because 99% of products out there are shit. If you think he’s actively looking for shit I don’t think you’ve been reading his reviews, frankly.
*Looks for the AI reply button*
Hi, Bryce. Cool post. As it has been since the 1950s, right before the first AI winter, AI is always just around the corner. The fact that this wasn’t the worst thing you’ve reviewed is sad, but not too surprising.
I was just coming to say that I also have a degree in cognitive science (Edinburgh, which was the only game going in the 1980s) and eventually got out of natural language processing and speech recognition because it’s become too divorced from linguistics to be interesting to me.
P.S. squeen, you sound like a specific grumpy old man: Noam Chomsky. See Peter Norvig’s discussion in “On Chomsky and the Two Cultures of Statistical Learning” (didn’t want to include a link and get spam filtered). Would it help to realize that neural nets are just non-linear regressions from one angle and non-parametric function kernels from another? And while the hype may make ML seem really distanced from reality and only good for self-driving cars and speech recognition, it’s being used more and more in the biological and physical sciences. For instance, the partial differential equations governing the gravitational lensing that affects the long-range observations informing our picture of the development of the universe are too complex to easily compute directly, so we build neural network emulators for them using tools like normalizing flows (chained linear and non-linear changes of variables with simple Jacobians). Then we can plug the emulators in as components in traditional statistical models. Here, it’s just a method for numerical analysis.
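To make the “simple Jacobians” remark concrete, here is a toy one-layer version of the change-of-variables trick that flows chain together. None of this is from the actual lensing emulators; the base density, the affine transform, and the numbers are purely illustrative.

    import numpy as np

    def base_logpdf(x):
        # Log density of a standard normal base distribution.
        return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)

    def affine_flow_logpdf(y, a, b):
        # One affine "layer": y = a * x + b, so x = (y - b) / a.
        # Change of variables: log p_Y(y) = log p_X(x) - log|a|,
        # where log|a| is the log of the (here trivial) Jacobian.
        x = (y - b) / a
        return base_logpdf(x) - np.log(np.abs(a))

    # Stacking more layers (including non-linear ones) just keeps
    # subtracting their log-Jacobian terms from the base log density.
    print(affine_flow_logpdf(1.0, a=2.0, b=0.5))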
Even for analysis of systems of PDEs, you need to recognise the potential limitations. If there is no unique existence result, what are you computing numerically? You might be (spuriously) oscillating from one solution to another. And might your scheme be better if it takes into account any conserved quantities (e.g. energy) or properties/structure of solutions? And how close is a solution of the linearised system to a solution of the nonlinear system? Sophisticated mathematics is needed for this.
In little words, if you construct your numerical scheme intelligently, you are likely to get more useful results.
I can do old and grumpy.
Shuffling Wombat is absolutely right. It’s not at all trivial to solve PDEs with neural networks. It’s not even trivial to implement some simple functional forms with standard convolutional nets.
Luckily, I work with some of the world’s top PDE experts and numerical analysts here at Flatiron Institute’s Center for Computational Mathematics. For example, our center director is Leslie Greengard! I’m not personally a PDE specialist—I’m a computer scientist working on probabilistic programming languages (I developed the Stan PPL, which uses traditional adjoint sensitivity analysis over ODE and PDE solvers to support gradients of log densities so we can sample with Hamiltonian Monte Carlo).
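For anyone wondering what “gradients of log densities so we can sample with Hamiltonian Monte Carlo” buys you, here’s a toy sketch of a single HMC step on a made-up target. Nothing here is Stan’s actual implementation; the target, step size, and trajectory length are invented for illustration.

    import numpy as np

    def log_density(q):
        # Toy target: standard normal, up to an additive constant.
        return -0.5 * np.sum(q ** 2)

    def grad_log_density(q):
        # The gradient a system like Stan would get via autodiff/adjoints.
        return -q

    def hmc_step(q, step_size=0.1, n_leapfrog=20, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        p = rng.normal(size=q.shape)          # fresh momentum draw
        q_new, p_new = q.copy(), p.copy()
        # Leapfrog integration of the Hamiltonian dynamics.
        p_new = p_new + 0.5 * step_size * grad_log_density(q_new)
        for i in range(n_leapfrog):
            q_new = q_new + step_size * p_new
            if i != n_leapfrog - 1:
                p_new = p_new + step_size * grad_log_density(q_new)
        p_new = p_new + 0.5 * step_size * grad_log_density(q_new)
        # Metropolis correction using the Hamiltonian
        # (negative log density plus kinetic energy).
        h_current = -log_density(q) + 0.5 * np.sum(p ** 2)
        h_proposed = -log_density(q_new) + 0.5 * np.sum(p_new ** 2)
        if rng.random() < np.exp(h_current - h_proposed):
            return q_new
        return q

    # A few steps from an arbitrary starting point.
    q = np.array([3.0, -2.0])
    for _ in range(5):
        q = hmc_step(q)
    print(q)

The point of the gradient is in the leapfrog loop: it steers the proposal along the geometry of the target instead of blundering around at random.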
Overall, everyone here is skeptical about NN PDE solvers and proceeding with extreme caution, validating as they go. The consensus among my colleagues is that the traditional PDE solvers way outperform neural networks, especially if you need more than a digit of accuracy or if there isn’t inherent low-dimensional structure in the PDE. But that doesn’t mean you can’t tailor reliable NN emulators for particular domains of parameters in particular cases with lots of effort (aka “grad-student ascent”).
P.S. Sorry for dragging the thread even further off topic!
P.P.S. Needless to say, many of my colleagues are RPG fans, especially the postdocs and interns.
Folks love multi-dimensional look-up tables too, in lieu of a proper model. In my experience, these leaps of data-fitting faith always become their own mini-science project. Too hack-ish; I’m not a fan.
The AI hype is just that, so let’s not call them Neural Networks anymore. If I think long and hard enough, I’m sure I can find a pejorative term that’s more appropriate to their low-brow nature. “Artificial Intelligence” too, for good measure.
🙂
I looked up Chomsky, and am definitely in his camp. We use statistics all the time in engineering for filtering, Monte Carlo analysis, etc. — so I’m not against it.
But statistically derived models are junk and do not provide insight or understanding. The real danger with “AI” methods is that they are becoming all the rage and that the next generation of engineers and scientists are reaching for these methods first instead of as a last resort. It’s dead-end, decline-of-civilization lazy thinking.
ML may be “low brow” by your standards, but it has brought us halfway decent self-driving cars, usable machine translation, scary-good image recognition and deep fakes, world-beating Go programs, robots that frankly freak me out, fantastic multilingual spell checking, a Jeopardy champion, a couple percent improvement in speech rec accuracy, and not the worst adventure reviewed on this blog. Those aren’t nothing. Of course, none of it approaches anything like general intelligence, but it’s a huge leap from where we were before scalable, non-parametric, non-linear regressions (how I think of “neural networks”).
At least the careful among us are not taking leaps of faith. Or believing everything we read in the proceedings of NeurIPS or even worse, The NY Times or TED talks. We’re taking carefully controlled steps with lots of validation. And it largely doesn’t work! So in case I haven’t been absolutely clear, I’m with squeen and Shuffling Wombat on not buying the high-level hype. I don’t even work on neural nets myself—I work on more traditional statistical inference problems.
The horrors of direct & indirect illocutionary force. Hofstadter returned to IU in ’88, and thus the CogSci program.
Extra cool. It was really reading Gödel, Escher, Bach in high school that led me on the path I took to cog sci! I visited David McCarty, Larry Moss, and Jon Barwise at IU around 1992 and gave a talk on constraint logics for the syntax/semantics interface.
Thanks again for all the great reviews. I find them really useful in thinking about how to structure my own adventures.
Good gravy, can someone invent a time machine and send me back to the 70s, please?
That’s the nice thing about time machines: you don’t have to rush when you’re building one. Any time to complete it will do just fine.
Indeed, my favorite saying is that if someone invents a time machine it will retroactively have always existed.