August 3, 2019

Below is a post from November, 2017, that I ghost-wrote for Morningside Translations. This translation services conglomerate had just acquired my father's company, Advanced Language Translation one year prior, and he must have been feeling too important to be stuck in the weeds blogging about web technologies.

HTML5: How a Developer’s Dream Turned into a Localization Engineer’s Good Fortune

Last month, the newest HTML standard from the World Wide Web Consortium (W3C) officially turned three years old. However, in a trend similar to most programming languages or front-end frameworks, HTML5 is only now starting to get the global use it deserves.

Its release three years ago sparked the usual curiosity from UX designers and web programmers, but now the development community is not the only audience excited about HTML5’s capabilities. Business analysts, small business owners, localization engineers, social media marketers, and search engine optimizers are all admitting that their companies are making great strides to become HTML5 compliant.

The release of this new “Living Standard,” as many are calling, has made content localization incredibly easy and immensely flexible for both creators and consumers on the web.


Keeping it simple

The most brilliant aspect of this move by W3C is its simplicity – technical updates are brief, intuitive and powerful. Sites that leverage HTML5 are now free from worrying over the differences between devices, browsers and the content the site itself owns. For the first time in web history, HTML markup itself can handle a shift between mobile and desktop without the aid of a larger styling framework like Bootstrap to handle the legwork.

Beyond that, HTML5 introduced a rich and robust API for natively hosting playback for video, audio content and vector-graphic animation in a site’s static code. By now, it’s not exactly news that HTML5 was the final nail in Flash’s coffin. The technical benefits, however, are just the tip of the iceberg.

Introducing New elements

Most aspects of the new standard that appeal to global content creators are primarily simple additions to the markup language’s element and attribute library. HTML5 offers new tags that allow for a logical organization of content based on type. For example, an entire block of elements dedicated to blogs and articles was given support.

Content creators now have a slew of native HTML elements to choose from to organize posts or informative pages within their domain. The gist of the elements <article/>, <section/>, <header/> and <footer/> is still simply a method of storing plain text in a document, but with better out-of-the-box styling and the ability to localize more effectively as compared to <span/> and <p/>. A developer may work with their business team to build requirements around placing all translatable content in <article/> elements, for instance. Other examples of elements that display HTML5’s modularity are <summary/>, <section/> and <description/>, whose purposes are fairly self-explanatory.


The only ironic part about adding new element sets to allow for content encapsulation is that a smaller component of the markup – attributes – is proving to be more powerful for people worrying about localization. Attributes now allow for any element that houses text to be tagged to identify:

  • The source language of the text
  • Whether or not it contains a term that should be translated
  • The intended directionality of the text.

While new modular element sets may help with organization, attributes such as these allow for both increased developer freedom and seamless localization capabilities.

Translation just got a whole lot easier

The concept of if a particular text block should be translated is huge; now the parsing engines of translation services can get by with filtering markup on one of two attributes: translate and its-term. The former’s purpose is to indicate if a medium-to-large body of text requires translation, and the latter is for indicating terminology definitions within a body of text.

The lang attribute is another simple addition that allows for indicating which source language is contained in an element. Pairing lang with the dir attribute has the potential to leave zero gaps in requirements when a content creator approaches a vendor for localization. The dir attribute, standing for “directionality,” explains whether the source language is right-to-left or vice versa.

A perhaps less obviously named, but just as powerful feature that appeared with HTML5 is the ruby annotation. The elements that the ruby annotation is comprised of allow for specifying the pronunciation of the text the annotation encapsulates. While this is applicable to all languages, it marks the first time that Asian languages have the web support for full coverage of their metadata. Gone are the days of “difficult languages” when performing localization and desktop publishing on web content.

A tool of revolution

HTML5 is the small and clairvoyant technical change that allowed for a huge step in the right direction in terms of globalizing web content. Beyond being a sturdy addition to a crucial software construct, it has already acted as a tool of revolution in the way companies think of their presence on the Web. Hopefully, over the next three years we will see the frameworks that are leveraging its sleek and lucid API make it easier for brilliant business minds to reach a wider audience with more accurate localization capabilities.