AI translation, whether performed by large language models (LLMs), neural machine translation systems, or a combination of the two, can convert text between languages quickly and with a high level of fluency. Fluency is highest for widely spoken languages, because machine translation has benefited from massive amounts of training data in those languages.
Even in those languages, however, there is a critical element that is often not quite right in AI translation: nuance. Given that language is far more than mere vocabulary and syntax, conveying culture, style, social context, and underlying meaning in translation is a necessity for true fluency and accuracy.
Nuance errors in AI translation are more frequent in literary texts, such as fiction or poetry. In many of these texts, AI translation may be technically accurate but still struggle with subtle shades of meaning, sentiment, uncommon turns of phrase, context, and the intent of the message.
Languages evolve within specific cultural contexts. And although culturally bound ideas can be expressed through grammatical structures, idiomatic expressions, and even humor, mapping those ideas coherently to another language while preserving the intent of the original can be difficult, at times nearly impossible, if the languages are too dissimilar linguistically and culturally.
AI translation is trained on datasets that may or may not adequately represent the diversity of a human language. When it comes to nuance, AI translation has a harder time with unrelated languages, i.e., language pairs separated by a large lexical, semantic, and structural distance, such as German and Korean, or Arabic and Icelandic.
Additionally, the more stylistically rich the source text, as is the case with literature, the less fluent and accurate the output tends to be, and the effect is more pronounced when the languages are structurally distant.
At the other end of the spectrum, when languages have the same roots, such as French and Spanish, AI translation can better utilize parallel structures to produce a more fluent and nuanced translation.
Literary experts like Dr B.J. Woodstein have seen first-hand how certain aspects of language can be mishandled in translation when the broader context, including cultural knowledge, is missing, whether the translation is done by humans or machines.
During SlatorPod episode #207, Dr Woodstein broached the matter of nuance in literary choices, especially in children’s literature. She explained that in her Swedish-to-English translation work, for example, she has seen content that is acceptable in children’s literature in Sweden, like depictions of nudity and weapons, but that requires adaptation for English-speaking audiences because of cultural differences.
Dr Woodstein’s example is one of those cases where AI translation still requires the intervention of an expert human editor to ensure the final target text is suitably adapted, even after domain-specific training on literary text and other techniques have been applied.
LLMs are advancing rapidly and “shortening” the semantic and structural distance between some languages, thanks to training and many proven fine-tuning techniques. However, research devoted specifically to how well LLMs can handle literary translation has revealed shortcomings rather than distance shortening.
Researchers acknowledge that newer LLMs, among them GPT-4o, are better at translation than older ones, but also that the quality of AI translation is still far from reaching human literary translation quality.
LLMs frequently translate literally, which in literature can substantially change what the original author meant to convey.
Standard LLM evaluation metrics can also mislead: judged on scores alone, a literary translation may appear acceptable, only for readers to realize later that the target text falls well short of an ideal, nuanced translation.
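The point about scores can be illustrated with a minimal sketch. It assumes the score is a surface-overlap metric, and it uses a simplified unigram F1 as a stand-in rather than BLEU or any specific metric used in the research. The function name `unigram_f1` and the three example sentences are hypothetical, made up purely for illustration.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate and a reference translation.

    Simplified stand-in for surface-overlap metrics (an assumption of this
    sketch); it only counts shared words and knows nothing about meaning.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection: each shared token counts at most as often
    # as it appears in both sentences.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences: one human reference, one word-for-word rendering
# of an idiom, and one idiomatic rendering that uses different wording.
reference = "it was pouring when she left the house"
literal = "it was raining by pitchers when she left the house"
idiomatic = "rain was bucketing down as she walked out"

print(f"literal:   {unigram_f1(literal, reference):.2f}")
print(f"idiomatic: {unigram_f1(idiomatic, reference):.2f}")
```

In this toy case the word-for-word rendering scores higher than the idiomatic one simply because it shares more tokens with the reference, even though a reader would likely find it the less natural of the two.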
Some researchers are working on creative text evaluation metrics, proposing they center on metaphorical equivalence, emotion, authenticity, and overall quality.
Will AI translation ever be capable of reaching a level of semantic and cultural discernment akin to that of humans? Only time will tell.