The Hidden Limitations of Machine Translation
The era of artificial intelligence (AI) is well and truly upon us, with AI technologies infiltrating countless industries. On paper, AI seems like the ideal solution for a “straightforward” task such as translation. And indeed, AI-powered machine translation (MT) often produces well-written texts which, at first glance, might pass a linguistic Turing test.
But does this really hold up on closer inspection?
We asked some professionals. The Milengo team has been working with machine translation systems for 15 years, having tested over a dozen engines and translated millions of words in that time. Discover what we’ve learned from serving leading enterprises across the globe and explore best practices that counter the inherent limitations of machine translation.
What is machine translation?
Everyone knows free translation apps such as Google Translate and DeepL. The current generation is based on neural networks and is known as neural machine translation (NMT). NMT models are a massive leap forward in translation technology, as they recognize the relationships between words in a sentence with astounding reliability. Plus, because they are built on deep learning, their translation quality keeps improving as providers retrain them on more data.
Another competitor has emerged recently in large language models (LLMs) like ChatGPT. While LLMs are not inherently designed for translation, the results they’re producing already show promise. Apart from free apps, professional translation software such as memoQ and Trados Studio now also includes AI features as standard.
What are the risks of machine translation for businesses?
The application of machine translation in a corporate environment may create considerable risks. We have compiled the six biggest risks for you below, scored them from 1 (lowest) to 5 (highest) for both the severity of the risk and how often it occurs, and included tips on how to avoid them.
1. Damage to your company’s reputation
Unintentionally offensive, inappropriate, or culturally unacceptable content can paint companies in a rather unflattering light. And the limitations of machine translation often mean that even the tiniest inaccuracies in the source text can lead to a severe translation gaffe.
While reports suggest that NMT systems and LLMs have become “smarter”, there’s one very specific type of intelligence that they can’t bring to the table: a human translator’s understanding of social norms and cultural differences. Or in other words, what is and isn’t appropriate to say.
In this sense, it’s often helpful to view human translators as a testing ground for your company’s content. For example, information that you may think is acceptable may in fact be considered taboo or tasteless in the target audience’s culture. The translator can flag this material and suggest an alternative way of presenting the issue or rewriting it in a more “digestible” manner.
Risk of occurrence: The media loves to run stories about high-profile translation fails. But in our experience, mistranslations rarely have major repercussions in the corporate world. Score: 2
Severity: Consequences can range from disgruntled customers and employees to full-blown PR disasters. Even multinational corporations such as IKEA aren’t immune to making the headlines. Score: 5
How to avoid it: An in-house localization manager can establish quality assurance procedures and weigh up the advantages and disadvantages of machine translation on a case-by-case basis. Sensitive translations should always undergo a final check by the relevant department (such as Product Development, Marketing, HR, Legal).
2. Loss of control over your public image
Many brands are curated down to the smallest detail – including style guides and instructions on the usage of product names. This requires human translators to internalize the brand identity first before engaging in the actual translation. But machine translation does things differently. It generates sentences by calculating the highest probability of a certain sequence of words. This approach is ideal for producing large volumes of text, but not for strategically planned corporate communication.
The truth is that machine translation still can’t fully reflect the nuances of human language. Both neural machine translation and large language models are essentially black boxes: it’s nearly impossible to reliably predict their output.
Take an age-old translation problem as an example – typos that lead to changes in meaning:
Imagine you work for a German company that sells e-learning courses on corporate compliance. When using machine translation, the sentence “In diesem Fall ist der Ausschuss des Erstangebots rechtswidrig” comes back translated as “In this case, the committee of the initial offer is unlawful”. Doesn’t make much sense, right?
What you really want is something like “In this case, excluding the initial offer is unlawful.” In German, the word for “exclusion” (Ausschluss) and “committee” (Ausschuss) differ by only a single letter. Only a human translator would be able to notice this typo and correct it in the translation. If the client doesn’t pick up on the error and request an edit, then the end user will surely be left somewhat perplexed.
Risk of occurrence: Almost every company has a clear idea of how it wants to present itself to its customers. When you throw international companies into the mix, this image will also differ from country to country. Score: 4
Severity: Using machine translation means you’ll lose a certain amount of control over your company’s image. Translation errors might also be attributed to a lack of professionalism on your end. Score: 4
How to avoid it: The paid versions of machine and AI translation services can (with considerable time and effort) be trained on a company’s corporate language. However, highly sensitive PR or marketing texts should still be left in the hands of human translators.
3. Incorrect product and company names
Let’s pretend you run a German e-commerce store called “Schneeflocke Kids” (which literally translates to “Snowflake Kids” in English) that sells outdoor gear for children, and you’re looking to expand into the U.S. market.
When localizing your store, you’ll need to bear in mind that the MT engine won’t spare non-translatable product names. In fact, the likelihood is that it will simply translate them freely.
This can lead to unintentionally comical or reputation-damaging mistranslations, particularly if a translated product name has different connotations. In the case of “Schneeflocke Kids”, DeepL would translate this verbatim as “Snowflake Kids”. In SEO terms, this means a Google search would produce think-pieces on helicopter parenting and “political correctness gone mad” instead of ski pants for ten-year-olds.
Risk of occurrence: Automatically translated product names and descriptions are widespread on e-commerce portals such as Amazon. Score: 4
Severity: Mistranslations make it difficult to find products and cause frustration among customers. Brands that pride themselves on quality and a customer-centric approach need to proceed with extra care here. This type of translation error is even more serious for B2B companies that only offer a handful of products instead of thousands of consumer goods. Score: 2
How to avoid it: Customizing machine translation systems can train them not to translate predefined product names. In addition, translation software offers technical features to exclude predefined text segments from the machine translation process.
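One common way to keep predefined names out of the translation process is to swap them for placeholder tokens before the text reaches the MT engine and restore them afterwards. The following is only a minimal sketch of that idea: the placeholder format `__DNT_n__`, the function names, and the `stand_in_translate` function (which replaces a real MT call) are all assumptions made for illustration, not features of any particular translation tool.

```python
def protect_terms(text, terms):
    """Swap each protected term for a placeholder token before translation."""
    mapping = {}
    # Longest terms first, so a full brand name wins over a shorter overlap.
    for index, term in enumerate(sorted(terms, key=len, reverse=True)):
        token = f"__DNT_{index}__"
        if term in text:
            text = text.replace(term, token)
            mapping[token] = term
    return text, mapping


def restore_terms(text, mapping):
    """Put the original terms back after translation."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def stand_in_translate(text):
    """Hypothetical stand-in for a real MT call; it handles one phrase only."""
    return text.replace("Willkommen bei", "Welcome to")


source = "Willkommen bei Schneeflocke Kids!"
masked, mapping = protect_terms(source, ["Schneeflocke Kids"])
translated = restore_terms(stand_in_translate(masked), mapping)

print(masked)      # Willkommen bei __DNT_0__!
print(translated)  # Welcome to Schneeflocke Kids!
```

In practice, a real engine may alter or drop placeholder tokens, so it is worth checking that every token survives translation before restoring the names.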
4. Sloppily translated software
On the surface, it seems like software and machine translation are a match made in heaven. User interface (UI) text like “Click OK to return to the main menu” shouldn’t make an MT engine blink a mechanical eye. After all, it has been trained using millions of sentence pairs. Right?
Wrong. The challenge here is that UI is usually made up of small chunks of text, such as menu titles, button text, and table cells. It’s impossible for the NMT system or LLM to find out where these texts appear in the user interface. Given that context is crucial when translating software UI, this can lead to a whole host of errors.
Design conflicts are another stumbling block. If a machine-translated string is longer than the original text, it may be cut off if certain buttons or menu elements have a character limit.
Risk of occurrence: There has been a rampant increase in the volume of software translations. Take 2023’s Game of the Year “Baldur’s Gate 3”, which contains two million words – that’s more than the entire Game of Thrones book series! Since this spike in volume is causing localization costs to rise sharply, companies are experimenting more and more with AI and machine translation. Score: 3
Severity: Software that is machine-translated without human involvement (known as “raw machine translation”) is bound to result in mistranslations and design conflicts. This affects the user experience and can even make an app unusable. Score: 4
How to avoid it: Humans are still the first port of call for translating software user interfaces, as they can understand and interpret UI strings in context. However, AI and machine translation remain effective tools for translating help texts or software documentation.
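Design conflicts caused by character limits are one of the few UI problems that can be caught automatically. As a rough sketch, the check below compares each translated string against a per-element limit. The string keys, the limits, and the German sample text are hypothetical, and a simple character count is only a proxy for the real rendered width, which depends on the font and layout.

```python
def find_overflows(translations, limits):
    """Return (key, length, limit) for every string that exceeds its limit."""
    overflows = []
    for key, text in translations.items():
        limit = limits.get(key)
        # Strings without a defined limit are skipped rather than flagged.
        if limit is not None and len(text) > limit:
            overflows.append((key, len(text), limit))
    return overflows


# Hypothetical UI strings and character limits, for illustration only.
translations = {
    "btn_ok": "OK",
    "btn_back": "Zurück zum Hauptmenü",
}
limits = {"btn_ok": 4, "btn_back": 12}

overflows = find_overflows(translations, limits)
for key, length, limit in overflows:
    print(f"{key}: {length} characters, limit is {limit}")
```

A check like this only flags strings that are too long. Deciding how to shorten them without losing meaning still requires someone who can see the string in context.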
5. Misleading product manuals and documentation
A dishwasher manual or software help content doesn’t need to be on par with Shakespeare, but it does need to be accurate and understandable.
In rare cases, machine translation can omit information or even hallucinate new content altogether. The untrained eye will miss this at first glance. After all, these tools produce texts that are very easy to read, which in turn lulls the reader into a false sense of security.
This also flags up another problem caused by the limitations of machine translation: erratically translated terminology. AI translators ignore the overall context and are often unable to translate terms consistently. For instance, the German legal term “Garantie” becomes either “warranty” or “guarantee” – two vastly different legal concepts.
Risk of occurrence: Documentation and manuals are popular staples for machine translation. The volume of text is large, and the wording is usually relatively simple and easy to understand. Score: 5
Severity: Mistranslations impair the user experience and may result in the product being used incorrectly, which may entail legal implications. Score: 4
How to avoid it: Always have machine pre-translated content professionally post-edited in line with ISO standards such as ISO 18587.
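Inconsistent terminology is something a post-editor can be pointed towards with a simple automated check. The sketch below assumes a small glossary that maps a source term to the one approved target term and flags every segment where the approved term is missing. The sample segments and the choice of “warranty” as the approved rendering are assumptions for illustration; which term is correct depends on the legal context. The plain substring matching used here will also miss inflected or compound forms.

```python
def check_terminology(segments, glossary):
    """Flag segments whose target text lacks the approved glossary term."""
    issues = []
    for index, (source, target) in enumerate(segments):
        for source_term, approved in glossary.items():
            source_has_term = source_term.lower() in source.lower()
            target_has_term = approved.lower() in target.lower()
            if source_has_term and not target_has_term:
                issues.append((index, source_term, approved))
    return issues


# Hypothetical source/target segment pairs and glossary entry.
segments = [
    ("Die Garantie gilt zwei Jahre.", "The warranty is valid for two years."),
    ("Die Garantie erlischt bei Missbrauch.", "The guarantee is void in case of misuse."),
]
glossary = {"Garantie": "warranty"}

issues = check_terminology(segments, glossary)
for index, source_term, approved in issues:
    print(f"Segment {index}: expected '{approved}' for '{source_term}'")
```

Here only the second segment is flagged, because it renders “Garantie” as “guarantee” rather than the approved term.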
6. Bland and uninspired marketing content
Good marketing content thrives on personality, persuasion, and a narrative arc. Copywriters have a box of tricks at their disposal to tick all these boxes, including wordplay, metaphors, and heavily condensed language. But this is also precisely where NMT systems and LLMs struggle. They are generalists that have been trained using huge quantities of words. They simply lack linguistic sophistication and imagination.
This can directly affect the readability of a text. Neural machine translation interprets texts sentence by sentence, which is why conjunctions and textual references are often skewed or even incorrect. AI translators also don’t pay much attention to the tone and rhythm of a text – meaning a carefully constructed sales argument or cross-channel brand story loses its impact.
A text that is translated this way can’t be sufficiently optimized for a specific target group and the underlying marketing KPIs (such as lead generation and traffic), as is the case with a transcreation or SEO translation, for instance. Addressing a specific target group, taking cultural sensitivities into account, and using emotions are still inherently human skills.
Risk of occurrence: Artificial intelligence and machine translation will play a vital role in tomorrow’s content marketing. However, many marketing teams are currently still reluctant to depend on these tools – especially when it comes to premium content. NMT systems and LLMs lack the creativity and finesse to create this type of content right now. Score: 3
Severity: Orchestrating a marketing campaign requires a lot of time and money. A poor translation can undo this hard work. As a result, leads, traffic, or even forecasted sales numbers might fail to materialize. Score: 3
How to avoid it: Given the limitations of machine translation and AI, humans should still be entrusted with marketing content that needs to perform well. But NMT systems and LLMs can still provide a valid alternative for technical marketing, such as for no-nonsense blog posts in the B2B sector.
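To compare the six risks at a glance, the two scores for each can be combined into a single number. The sketch below multiplies occurrence by severity, which is a common risk-matrix convention but an assumption on our part here; the scores themselves are the ones given above, and the short risk labels are abbreviations of the section titles.

```python
# (risk of occurrence, severity) as scored in the six sections above.
risks = {
    "Damage to reputation": (2, 5),
    "Loss of control over public image": (4, 4),
    "Incorrect product and company names": (4, 2),
    "Sloppily translated software": (3, 4),
    "Misleading manuals and documentation": (5, 4),
    "Bland marketing content": (3, 3),
}


def rank_risks(risks):
    """Sort risks by occurrence x severity, highest combined score first."""
    scored = [(name, occurrence * severity)
              for name, (occurrence, severity) in risks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_risks(risks)
for name, score in ranking:
    print(f"{score:>2}  {name}")
```

Under this assumption, misleading manuals and documentation come out on top with a combined score of 20, while incorrect product and company names rank lowest with 8.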
How can companies overcome the limitations of machine translation?
A lack of quality assurance turns machine translation into an unquantifiable risk. Despite this, many companies throw caution to the wind because human translations are often simply too expensive these days, especially for large volumes of text such as documentation or knowledge bases.
Fortunately, there is a solution here: Curated, AI-driven machine translations based on trained MT engines.
Have a look to see how Milengo provides this service:
| Free MT engine | Curated machine translation |
| --- | --- |
| No guarantee of quality | Professional post-editing according to ISO 18587 |
| Frequent terminology errors | Customized MT trained using existing corporate terminology |
| No cultural adaptation | Native-speaking post-editors and proofreaders who are familiar with your target market |
| Lack of clarity about how the MT engine works | Collaboration with MT experts who select and train the ideal MT engine for you depending on your industry and subject area |
| Issues with importing/exporting translation content from CMS systems; high copy and paste effort | Customized integrations and APIs for importing and exporting content; in-house desktop publishing service |
Conclusion
The limitations of machine translation are still substantial – and so are the risks connected to it. But with the right approach, they are also easy to navigate. Using the technology smartly will mean you no longer have to worry about the risks to quality we mentioned earlier. What’s more, you can finally reduce the pressure on your localization budget.