A Bowl of i18n & l10n Complexity
Languages are sick; change my mind
Linguistics problems have fascinated me ever since I was introduced to them years ago in a UX/UI context, while working on the worldwide release of the Kindle Paperwhite. More recently, large language models and machine learning models have captured my attention.
While designing a fixed-screen user interface for the e-ink e-reader, i18n and l10n considerations included:
- text expansion: the degree to which a translation increases a string's width
- reading direction: right to left, left to right, or top to bottom
- string formats: dates, currencies, addresses, and time zones
- locales: Portuguese for Brazil vs. Portugal, or regional dialects and varieties such as Québécois French vs. Parisian French
Context and use case add complexity to numbers too. For example, "1992" translated into spoken French could be:
- mille neuf cent quatre-vingt-douze as in the year "nineteen ninety-two"
- dix-neuf dollars quatre-vingt-douze centimes as in "nineteen dollars ninety-two cents"
- un neuf neuf deux as in "one nine nine two"
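As a rough sketch of why context matters, the same digit string can be split into different spoken chunks before any words are chosen. The function names and the three context labels below are hypothetical, made up for illustration; only the digit-by-digit reading is actually spelled out in French, and the price case assumes the last two digits are cents.

```python
# Hypothetical helpers for illustration only; not taken from any i18n library.
FRENCH_DIGITS = ["zéro", "un", "deux", "trois", "quatre",
                 "cinq", "six", "sept", "huit", "neuf"]


def segment_for_speech(text: str, context: str) -> list[str]:
    """Split a digit string into the chunks a speaker would read aloud."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("expected a string of ASCII digits")
    if context == "year":
        # Read as one number: mille neuf cent quatre-vingt-douze
        return [text]
    if context == "price":
        # Assumption: the last two digits are cents, the rest are dollars
        return [text[:-2] or "0", text[-2:]]
    if context == "digits":
        # Read one digit at a time: un neuf neuf deux
        return list(text)
    raise ValueError(f"unknown context: {context}")


def read_digits_in_french(text: str) -> str:
    """Spell out a digit string one digit at a time."""
    chunks = segment_for_speech(text, "digits")
    return " ".join(FRENCH_DIGITS[int(chunk)] for chunk in chunks)


print(segment_for_speech("1992", "year"))
print(segment_for_speech("1992", "price"))
print(read_digits_in_french("1992"))
```

Spelling out the year and price chunks would need a full French number-to-words routine, which is left out here on purpose.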
With text expansion, it helped to know that strings translated from English can expand 30-45% in length. This meant that a button 100 pixels wide, sized to fit its text label plus padding, could need up to 145 pixels when the user changed their language preference. German and the Scandinavian languages are notorious for this, for example.
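The arithmetic behind that button example can be sketched in a few lines. The function names are hypothetical, and the sketch scales the whole button width the way the example above does; a stricter version would expand only the text label and keep the padding fixed.

```python
def expanded_width_px(base_width_px: int, expansion_percent: int) -> int:
    """Return the width after expansion, rounded up to a whole pixel."""
    if base_width_px < 0 or expansion_percent < 0:
        raise ValueError("width and expansion must be non-negative")
    # Integer ceiling division avoids floating point rounding surprises
    return -(-base_width_px * (100 + expansion_percent) // 100)


def expansion_budget_px(base_width_px: int,
                        low_percent: int = 30,
                        high_percent: int = 45) -> tuple[int, int]:
    """Return the (minimum, maximum) width to plan for, using the 30-45% rule of thumb."""
    return (expanded_width_px(base_width_px, low_percent),
            expanded_width_px(base_width_px, high_percent))


print(expansion_budget_px(100))  # (130, 145)
```

Designing against the upper bound means the layout holds even for the longest expected translation.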
A Poem as a Case Study
Here's a translation case study. One of my aunts shared this poem, to which another aunt replied, "Stop complaining. Everyone has suffered."
This piece of text is a stunning example of linguistic complexity - and a lot like poetic rap:
Literal & Figurative Translations
Nowadays work will feed you because
The boss paints the pancake (draws unrealistic plans, makes empty promises)
The colleague stirs the water (slacks off)
The supervisor slings a pot (shifts blame or shirks responsibility)
The teammates make bread (make mistakes)
The performance yields eggs (yields nothing)
The customers find tea (make complaints)
The open wound's salt (adds insult to injury)
The self emits fire (you're infuriated)
The boss fries the curling squid legs (the boss fires you)
Plus trivial people add spice and oil (gossip and manipulate)
Idioms & Expressions
- 畫餅 *huà bǐng*: paints the pancake refers to “紙上談兵” *zhǐ shàng tán bīng* or idle theorizing on paper
- 划水 *huá shuǐ*: stirs the water refers to 混水摸魚 *hún shuǐ mō yú* or touching the fish, which is like slacking off at work
- 甩鍋 *shuǎi guō*: sling a pot or shirk responsibility when something goes wrong
- 出包 *chū bāo*: make bread, but this phrasing can also refer to mistakes. 包 *bāo* can also mean a bump, as in 腿長了一個包 *tuǐ zhǎng le yí ge bāo*, used when your drunkard friend falls and their injured leg develops a swollen bump
- Eggs visually symbolize nothing because the circular shape resembles the number zero (fun fact: an old-school Chinese hand symbol for "nothing" is 👌🏼)
- 找茶 *zhǎo chá*: found tea, with tea as a homophone for 碴 *chá* in 找碴 *zhǎo chá* or find fault, which also sounds like 查 *chá* in 檢查 *jiǎn chá* or examine, meaning complaining or being nit-picky like a Karen
- 發火 *fā huǒ*: get mad, literally meaning emit fire. 發 is the same *fā* in 恭喜發財! as in "Have a prosperous new year!" wherein the statement sends positive vibes
- 炒魷魚 *chǎo yóu yú*: fry squid, where the squid legs curling up is a visual for how you'd feel when you get fired
- 小人 *xiǎo rén*: small person, meaning trivial people or enemies
The lede states that you don't have to bring lunch to work since work will provide food for you. On the surface, the text positively describes kitchen colleagues collaborating to create a meal. Yet, the subsequent visual metaphors involving various ingredients and cookware double as animated satire about office life. Each line is actually rife with negative connotation, referring to the broadly relatable difficulties of many jobs, all the while using food - a major cultural value of Chinese-speaking countries - as a general theme.
The second-to-last line, where "the boss fires you," is especially visual because it says "the boss fries squid." When fried, squid legs curl up. This is similar to the English phrase "toe-curling," which indicates an unpleasant and embarrassed feeling. American English feels so inanimate by comparison.
If you were to ask me which option is better... I'd go with the squid.
What's Next
Nowadays, being meticulous about published text is paramount, while it remains difficult to detect how much of a given piece of writing is partially or mostly LLM-generated. Moreover, it's incredibly easy for technology pioneers working on the bleeding edge to over-pour computing complexity into every given puzzle. As humans, we are challenged by judgment calls all the time, and Large Language Models cannot always catch potential harms such as:
- social biases and stereotypes
- misinformation
- inhumane or unsafe ideas
- poor performance
- suggesting illegal or immoral activity
- security and privacy risks
- violations of copyright and other legal protections
- and more
Meanwhile, exciting developments include:
- faster compute times and loads
- light speed synthesis and translations
- reduced cost of human error
- automated routine, disliked, or under-resourced tasks
- excellence at summarization, clustering, or grouping
- major leaps by the month
- and more
As a consequence, users or customers have to:
- compensate for machine errors
- provide support for machine gaps
- expand on or contract specific outputs
- decide how much human labor to balance with machine labor
- decide which model or tool to invest in
- decide which output to publish
- figure out the desired "vibe" and match that to their target audience
- among other decisions
Taking a step back, though: not only has translation been directly relevant to my own upbringing, but I've also witnessed endless communication gaps, needs, and definitions throughout my adult life while working with people and traveling.
Traditionally, anybody's language fluency can vary significantly across the four skills of writing, speaking, listening, and reading. Even so, this categorical breakdown is American, and it is irrelevant for people who communicate with sign language, flags, or Morse code. Humans are inherently complex, which makes effective communication a continued problem. This is particularly true among international co-workers, where "good English" can reduce comprehension in a group of English as a Second Language speakers and is itself a minority among global majorities. Slang, pidgins, dialects, and colloquialisms are ever-evolving and more difficult to discern for language models made for business purposes, not to mention aphasia and other neurophysiological phenomena, or multilingual and sociopolitical dynamics such as linguistic hegemony.
If the purpose of translations via AI is simply short-term SFW entertainment, for example, it is possibly okay to make small mistakes. But seemingly small acts in software can have outsized consequences that affect us all:
- converting a value, such as a 64-bit floating point number to a 16-bit signed integer, has inadvertently led to an explosion (see: Ariane 5)
- releasing an updated security configuration file can disrupt millions of computers around the world
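To make the first of those small acts concrete, here is a minimal sketch of what can go wrong when a 64-bit float is squeezed into a 16-bit signed integer. It is not a reconstruction of any flight software. The function names are hypothetical, and the wraparound shown is just one common failure mode in languages that do not check the range.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def unchecked_to_int16(value: float) -> int:
    """Mimic an unchecked narrowing conversion that silently wraps around."""
    truncated = int(value)  # drop the fractional part
    # Keep only the low 16 bits, interpreted as a signed value
    return (truncated + 32768) % 65536 - 32768


def checked_to_int16(value: float) -> int:
    """Convert only when the value fits; otherwise fail loudly."""
    truncated = int(value)  # int() itself raises on NaN or infinity
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


print(unchecked_to_int16(40000.0))  # -25536, a plausible-looking but wrong number
print(checked_to_int16(123.9))      # 123

try:
    checked_to_int16(40000.0)
except OverflowError as error:
    print(f"rejected: {error}")
```

The checked version trades a silent wrong answer for an explicit error, which only helps if something upstream is prepared to handle that error.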
Verifying ("Are we building the thing right?") the inputs and outputs with human-in-the-loop reviews helps to iteratively validate ("Are we building the right thing?") various AI models and a widening definition of robots. Meanwhile, models and bots continue to develop and manifest in many corners of human life, at the risk of upending some of the most vulnerable populations.
Hopefully, as AI developments continue, people will understand that technology holds no value without its creators, customers, and users.