I try to keep my finger at least somewhat on the AI pulse these days, and it’s striking that I don’t see that many people sharing their fine-tuned models.
Lots of people will show you their favorite system prompts. OpenAI itself has a “GPT Store” where people can share GPTs with custom prompts, and a web-based UI for making your own prompt-customized GPT.
But prompting is lousy for making a bot with a writing style that’s substantially different from the generic ChatGPT “character”.
I have tried mightily to make bots that write “in character.” They don’t. You have to be painfully explicit about specific verbal patterns you want,[1] and even then the bot will frequently forget. You certainly can’t paste a piece of sample text from the target author into the context window and expect GPT-4 to pick up the style; it may notice the concepts being referred to, but it inexorably reverts to the bland “house style.”
You know what does work? Fine-tuning.
OpenAI’s fine-tuning API is built for training on dialogue. You provide it with a JSONL-formatted file of sample text interactions between the “user” and the “assistant”, and, using either the platform site or the API, you can fine-tune any model up to GPT-3.5-turbo on your sample dialogue file.
This works well if you generate a Q&A file of “user” questions and “assistant” answers, where the answers are direct quotes from the target author. One easy way to do this is to extract the Q&A from interviews with the target author. A more time-consuming way, if you can’t find interviews, is to create the “questions” yourself, to which the quotes you want to include are answers.
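For the mechanics, here’s a minimal sketch using the OpenAI Python SDK, assuming the dialogue file is already assembled (the filename is a placeholder of mine, not from the original project):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the JSONL training file (hypothetical filename)
training_file = client.files.create(
    file=open("author_quotes.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job against gpt-3.5-turbo
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id)  # watch this job on the platform site until it finishes
```

Once the job completes, the resulting fine-tuned model gets its own model ID, which you use in chat-completion calls exactly like a stock model.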
This works.
Example: Bene Gesserit Bot
I trained a fine-tune of GPT-3.5-turbo on 124 quotes from the six Frank Herbert Dune books. For each quote I created a corresponding “user question”, e.g.:
Q: “How does one obtain freedom?” (my question)
A: “Seek freedom and become captive of your desires. Seek discipline and find your liberty.” (direct quote from Chapterhouse: Dune)
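In the training file, each such pair becomes a single line in OpenAI’s chat fine-tuning format. A sketch of the encoding, using this exact example:

```python
import json

# One training example: my question paired with the Herbert quote
example = {
    "messages": [
        {"role": "user", "content": "How does one obtain freedom?"},
        {
            "role": "assistant",
            "content": "Seek freedom and become captive of your desires. "
                       "Seek discipline and find your liberty.",
        },
    ]
}

# Each example is serialized as one line of the JSONL file
print(json.dumps(example))
```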
You can try the Bene Gesserit Bot out yourself here.
Clearly, we are no longer in Generic ChatGPT land. The fine-tuned bot has acquired a distinctly Frank-Herbert-inspired style, and can go to vivid, strange, surprising places.
I find it’s consistently more prone to non sequiturs and logical incoherence than GPT-3.5, as well as occasional examples of unambiguously “erroneous” gobbledegook (fragments of code, Chinese characters, etc.).
On the other hand, the style transfer is clearly happening.
And “style” includes conversational attitude: Bene Gesserit Bot will spontaneously argue with and belittle the user, while it’s nearly impossible to get this behavior from prompting GPT-3.5.
I also notice that the fine-tune is a bit less rigid about content restrictions.
So What?
Why might someone want an in-character bot?
Well, it’s fun and potentially instructive to “chat with” a fictional character, or favorite writer. If Bene Gesserit Bot isn’t your thing, imagine, e.g., getting to talk to Socrates Bot. (That one should be easy to fine-tune, given all the dialogue!)
If we ever get to something more like AI-integrated video games, in-character dialogue will be essential.
From what I hear, The Youth (TM) are already enthusiastic users of Character AI, which seems to be using “vanilla” models with a bit of custom prompting. So imagine if there were an AI chat app with any actual variety in conversational style!
Or, imagine simulating actual people you know. “What would Mom say?” With a fine-tune, you could ask her…even after she’s gone.
Why Isn’t This More Of A Thing?
When I search “finetune” on Twitter or HuggingFace I mostly see papers on fine-tuning LLMs to hit general accuracy benchmarks, or general-purpose utilities like text-to-speech, image captioning, or LLMs for specific languages.
Not so much fun little custom bots like my Bene Gesserit Bot.
Some counterexamples:
Trismegistus is a Mixtral finetune focused on occult and spiritual topics (but it’s trained on GPT-4-generated synthetic data, and the samples given sound like “vanilla” LLM style)
Designer Andy Ayrey shares text samples of his fine-tunes from the Claude Opus-generated Infinite Backrooms dataset, which touch on singularitarian themes; the finetuned results do indeed sound “in character”.
Startup founder Edward Donner trained an LLM on his own text messages.
When I asked around at LessOnline and Manifest, it also didn’t seem like that many people were generating casual fine-tune bots, although I did meet one person who made a glowfic finetune to solve the (niche but quite real) problem of “my favorite online writers never finished the story I was reading!”
But overall, the absence is striking to me. Fine-tunes are not that hard to make. It’s a little tedious to assemble the dataset, but it takes no programming skill. And the results are super fun to interact with.
Are people just that starkly divided between “blocked by trivial inconveniences” and “professional AI developer building Super Serious models for general-purpose applications”?
Or am I just hanging out in the wrong part of the world and there’s a lively finetune community somewhere I’ve never heard of?
Because the alternative is like…if it were the year 2000 and I didn’t know anybody who’d made a personal website.
[1] like “use ‘he’, not ‘they’, to refer to a person of unidentified gender”
I wonder if it's more like 1994 and no one you know has a website yet. I don't think anyone has made the equivalent of Geocities, which enabled me and my high school friends to make websites without having to go through all the trouble of finding a host and a domain name and whatever else. (Though we did have to learn a bit of HTML.)
"We are not now that strength which in old days
Moved earth and heaven, that which we are, we are;
One equal temper of heroic hearts,
Made weak by time and fate, but strong in will
To strive, to seek, to find, and not to yield."
That's Tennyson (word for word), not Herbert: https://www.poetryfoundation.org/poems/45392/ulysses