When AI learns an author’s voice, even experts prefer it


AI’s ability to copy an author’s writing style isn’t just about how advanced the technology is—it’s about how people use it. And the way it’s used could have major effects on creativity, jobs, and even the law.

In a new study by researchers at the University of Michigan, Stony Brook University, and Columbia University, when AI was given simple instructions, writing experts could tell that people wrote better stories. However, many everyday readers preferred the AI’s stories. When the AI was trained by reading real authors’ books, even experts often preferred the AI’s writing for both style and quality.

Paramveer Dhillon

“These findings show that copying a writer’s voice with AI is no longer rare or expensive—it’s easy to scale, and the results are good enough that even trained experts often prefer them to human-written text,” said Paramveer Dhillon, associate professor at U-M’s School of Information and a co-author of the preprint study. The other authors were Tuhin Chakrabarty of Stony Brook University and Jane Ginsburg of Columbia University.

The findings raise pressing questions about authorship, consent and what creative work looks like in an era when machines can convincingly borrow a human voice.

What surprised you most about the findings?

What surprised me most was how much the outcome depended on the way we used the generative AI models. With in-context prompting (you give the model instructions and a few examples inside the prompt, and it imitates without any new training), Master of Fine Arts-trained expert readers usually preferred the human writing, but lay readers, meaning a broad sample of educated readers recruited online, often already rated the AI’s writing quality higher. When we moved to fine-tuning (we further train the model on a specific author’s books so it internalizes that author’s voice), expert preferences shifted sharply, and the AI often won on style match and overall quality.

The other surprise was the economics. On average, it cost about $81 to train the AI to copy an author’s style, compared with roughly $25,000 to pay a professional writer to create the same amount of writing, making the AI option about 99.7% cheaper.
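For readers who want to check the arithmetic behind that percentage, here is a minimal sketch. It uses only the two approximate averages quoted above; the variable names are illustrative and do not come from the study.

```python
# Approximate average costs quoted in the interview (US dollars).
ai_cost_usd = 81          # training the AI to copy one author's style
writer_cost_usd = 25_000  # paying a professional writer for the same amount of writing

# Share of the human cost that is saved by using the AI option.
savings_fraction = 1 - ai_cost_usd / writer_cost_usd

# How many times more the human-written option costs.
cost_ratio = writer_cost_usd / ai_cost_usd

print(f"Savings: {savings_fraction:.1%}")                     # Savings: 99.7%
print(f"Human writing costs about {cost_ratio:.0f}x as much")  # about 309x
```

Put another way, the AI option costs roughly 0.3% of the human option, or about one three-hundredth as much.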

How confident are you that these results would hold for longer-form writing, such as full novels or short story collections?

I’m fairly confident the style-imitation result will generalize because voice shows up quickly, and we observed it across both MFA-trained experts and educated lay readers recruited online. The bigger challenge is book-level coherence, meaning sustaining plot, character development, pacing and consistency across tens of thousands of words. That is exactly what we are studying next in our ongoing work on long-form prose. In practice, the most realistic path is human steering: The model generates a paragraph or scene, the writer edits and redirects it, then moves to the next section, with outlining and revision passes to keep the whole manuscript coherent. So I expect the ability to mimic voice to carry over to long-form writing sooner than the ability to write a strong novel end-to-end without guidance.

What specific stylistic “AI quirks” mattered most to expert readers, and why do you think fine-tuning eliminated them?

Under in-context prompting, MFA-trained experts were especially sensitive to writing that felt generic or performatively literary, such as over-explained emotions, stock metaphors and a smoothness that did not feel earned. Lay readers recruited online tended to weigh clarity and flow more heavily, which helps explain why they often liked the AI’s writing quality even when experts did not. Fine-tuning likely reduces these quirks because the model stops guessing an author’s voice from a short prompt and instead learns recurring rhythms, lexical choices and constraints from the author’s full body of work. Once those constraints are learned, the output relies less on generic “literary” patterns and more on the specific author’s habits. That is also why many of the usual “AI tells” become less reliable.

Were there particular authors or styles where AI struggled even after fine-tuning, and what does that reveal about creative voice?

Yes, there was meaningful variation across authors. Tony Tulathimutte is a clear case where the model still struggled with stylistic fidelity, even when it could produce fluent prose that some lay readers recruited online might rate as high quality. One practical reason is that he has written fewer books, so we had fewer tokens to fine-tune on, which limits how much the model can internalize his voice. A second reason is stylistic: His writing leans into transgressive humor, internet vernacular and sharp tonal shifts that are hard to reproduce without becoming generic or sanitized. More broadly, it shows that “good writing” and “this author’s voice” are not the same thing, and some voices remain more resistant to imitation than others.

How will professional writers and MFA programs respond to the capabilities demonstrated in this study?

MFA programs should treat these tools as part of the modern writing environment because different audiences already respond differently to AI text. Our expert readers were MFA-trained writers, and our lay readers were educated readers recruited online; their preferences diverged even under in-context prompting. In our follow-up interviews, several writers said versions of: “I thought I could tell what AI was, and now I’m not so sure,” and some described feeling unsettled about what this means for craft and professional identity. Programs can respond by emphasizing what remains hardest to imitate: developing a distinctive voice, learning rigorous revision and making high-level narrative choices with intention. They should also teach practical workflows, including when AI is useful for exploration and when it starts flattening voice into something generic.

What ethical or policy questions does your work raise about consent, compensation, and authors’ control over their literary styles?

The central ethical issue is consent since authors may not have meaningful control over whether their books are used to build systems that can imitate them. Compensation follows quickly because imitation can scale, and our results show that educated lay readers recruited online often prefer AI writing quality even under in-context prompting. When fine-tuning shifts MFA-trained experts’ preference as well, the prospect of substitution and market dilution becomes harder to dismiss. That puts pressure on policy to move beyond general principles and toward enforceable mechanisms like disclosure, provenance and licensing pathways that authors can opt into or out of. It also raises the question of what obligations model providers and platforms have when they make style imitation easy and cheap.

What are the legal implications related to copyright laws?

These questions are being tested in several recent lawsuits, including Kadrey v. Meta and Bartz v. Anthropic, which probe how copyright law applies to training on books and to downstream uses. The most direct connection to our findings is the fourth fair use factor, which focuses on the effect of the use on the market for the copyrighted work. If educated lay readers recruited online already prefer AI writing quality under in-context prompting, and fine-tuning can shift MFA-trained experts’ preference on style match and quality, then substitution and market dilution become plausible outcomes rather than abstract worries. That strengthens the argument that author-specific fine-tuning designed to produce style-emulating outputs is harder to justify as fair use, especially when it is positioned as a market substitute. It also points to concrete guardrails, such as refusal protocols for “in the style of” requests and clear disclosure requirements when text is AI-generated or heavily AI-assisted.
