Defining regular expressions to exclude content from translation
When configuring Next Generation Translator for Confluence, it is possible to define a set of regular expressions in Java syntax to exclude content from translation:
Best practices: It is recommended…
to avoid overlapping regular expressions.
E.g. instead of defining the two expressions Fantasy Company and Fantasy Company LLC, one should rather create a single regular expression that represents all possible occurrences of an entity. In this case, such an expression could be defined as Fantasy Company( LLC)? where ( LLC)? represents the optional part.
to be selective about exclusion patterns.
Excluding single words can produce many false positives. A term may be used in different contexts, and excluding it from translation may be reasonable in one context but not in another. Exclusions should therefore be as specific as possible and relate mainly to proper nouns, such as company names.
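Both recommendations can be tried out with plain java.util.regex before adding a pattern to the configuration. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes that patterns are searched for anywhere in the text, which this page does not state about the app itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for experimenting with exclusion patterns.
// It does not reproduce how the app applies patterns internally.
class ExclusionPatternSketch {

    // Returns every substring of text that the regular expression matches.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // One expression with an optional suffix covers both spellings.
        String renamed = "Fantasy Company LLC was renamed from Fantasy Company.";
        System.out.println(findAll("Fantasy Company( LLC)?", renamed));
        // prints [Fantasy Company LLC, Fantasy Company]

        // A single-word pattern also hits an unrelated use of the word.
        String genre = "Fantasy Company publishes novels in the Fantasy genre.";
        System.out.println(findAll("Fantasy", genre));
        // prints [Fantasy, Fantasy]
    }
}
```

In the second call, the word in "the Fantasy genre" is matched as well, which is the kind of false positive a more specific pattern avoids.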
Default exclusions
Some patterns are already excluded from translation by default, such as representations of Confluence user mentions, email addresses, and websites, among others.
Example:
Adding the regular expression Fantasy Company( (GmbH|AG|LLC))? as an exclusion pattern covers the company name in a range of different combinations. It matches the company name in all of the following sentences:
The name of the company is Fantasy Company.
The name of the company is Fantasy Company GmbH.
The name of the company is Fantasy Company AG.
The name of the company is Fantasy Company LLC.
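The coverage of this expression can be checked with a few lines of Java. This is a sketch under assumptions: the class and method names are hypothetical, and the pattern is applied as a substring search with java.util.regex, which may differ from how the app evaluates it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical check of the example pattern against the four sentences.
class CompanyNameExclusion {

    static final Pattern EXCLUSION =
            Pattern.compile("Fantasy Company( (GmbH|AG|LLC))?");

    // Returns the part of the sentence the pattern matches, or null if none.
    static String excludedPart(String sentence) {
        Matcher matcher = EXCLUSION.matcher(sentence);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String[] sentences = {
            "The name of the company is Fantasy Company.",
            "The name of the company is Fantasy Company GmbH.",
            "The name of the company is Fantasy Company AG.",
            "The name of the company is Fantasy Company LLC."
        };
        for (String sentence : sentences) {
            // Prints the company name including its legal form, if present.
            System.out.println(excludedPart(sentence));
        }
    }
}
```

Because the optional group is greedy, the legal form is included in the match whenever it is present, so the whole name would be treated as one unit.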
Translation exclusions are useful for some proper nouns, so that their handling is not left to the translation backend. Otherwise, the following can happen:
When a reasonable exclusion pattern is configured, the translation quality can be improved: