# Future Text API
Source: [https://wiesmann.codiferes.net/wordpress/archives/41710](https://wiesmann.codiferes.net/wordpress/archives/41710)

While commenting on [Ian Hickson’s UI framework](https://software.hixie.ch/ui-frameworks) document, the question of future text APIs popped up\. The way you interact with text has changed surprisingly little since the original Macintosh Toolbox: you provide a string, some attributes like the font and the size, and some styling bits \(bold, italic\), and you can either ask for the string to be drawn or just measured\. So it is only fair to think it will be the same for the foreseeable future\.
Text is intimately linked to culture, and there are cultural shifts under way, so I think text processing will change\. I’m not claiming I know what will happen; I can just give *some* possible ways things could change, with the understanding that those are *possibilities*, taken from my limited perspective\.
## Extended Grapheme Clusters
```
<em>🇨</em>🇭
```
The most probable evolution of text APIs is the migration to [Extended Grapheme Clusters instead of Code\-points](https://wiesmann.codiferes.net/wordpress/archives/41500)\. This avoids problems like the HTML sequence in the box above\.
That sequence is technically valid but meaningless: it puts emphasis on half of a Swiss flag, and browsers don’t really know how to render it\. You get a broken flag and an italic `C`: *🇨*🇭\. Swift already defines strings as sequences of grapheme clusters; I suppose that other languages and graphical APIs will follow suit\.
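To make the code point versus cluster distinction concrete, here is a minimal Python sketch. The helper `rough_clusters` is hypothetical and deliberately simplified: it only pairs regional indicators and attaches combining marks, which is far from the full segmentation rules of Unicode UAX #29. A real implementation should rely on a proper segmentation library.

```python
import unicodedata


def is_regional_indicator(ch: str) -> bool:
    """True for the 26 regional indicator symbols (U+1F1E6..U+1F1FF)."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def rough_clusters(text: str) -> list[str]:
    """Rough approximation of grapheme clusters (NOT full UAX #29).

    Only two rules are applied: regional indicators are grouped in
    pairs (flags), and combining marks join the preceding cluster.
    """
    clusters: list[str] = []
    for ch in text:
        prev = clusters[-1] if clusters else ""
        if prev and unicodedata.combining(ch):
            # Combining mark: attach to whatever came before.
            clusters[-1] += ch
        elif (is_regional_indicator(ch) and len(prev) == 1
              and is_regional_indicator(prev)):
            # Second half of a flag: complete the pair.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


flag = "\U0001F1E8\U0001F1ED"  # regional indicators C + H
print(len(flag))               # number of code points
print(rough_clusters(flag))    # the same text, grouped into clusters
```

With code points as the unit, an API can hand out half a flag; with clusters as the unit, the flag can only be styled or sliced as a whole.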
## Font features
The OpenType format allows for many features: colored characters, font variations, ligatures, variants, but most fonts don’t use these\. The reason is simply that building a feature\-rich font is a lot of work\. The result is that many graphical APIs don’t expose or leverage these features\.
One of the first uses of neural networks was actually character recognition\. Tools that leverage AI to generate fonts are already very common; sooner or later these will make it possible to generate feature\-rich fonts\. It is only a matter of time before someone converts the [Codex Gigas](https://en.wikipedia.org/wiki/Codex_Gigas) into a font with extensive features: historiated and decorated initials, scribal abbreviations, etc\.
As feature\-rich fonts become more common and more affordable, two things will become more important: APIs to access these features, and UI components that use them\.
## Feature introspection
One of the basic assumptions of computer typography is that when you apply an attribute, say italic, to some text, *something* will happen; ideally the text will be slanted\. This assumption is pretty reasonable for text, say *tiger* or *虎*, but won’t hold for an emoji like *🐅* \(or will it?\)\. The same goes for many other features, like color or title\-casing\. This means an API is needed to know which attributes actually *do* something\.
An alternative would be to introduce a level of indirection, like saying you want the font attributes that maximise contrast with a blue background\. This is something the original web did, with elements like `strong` and emphasis, which were meant to be rendered in the most expressive way available on the rendering system \(so for instance highlighted on text terminals\)\. In the case of HTML, this clearly failed; maybe such an approach could work now\.
## Dynamic Text Component
UI toolkits typically have a basic text element, which in turn is used to build controls like buttons or title panes\. While these days most layout components are flexible and can change their bounding box to fit the layout, text elements typically have a line height and a text width, and that’s it\. They can expand with white\-space but not contract\. That is pretty static when all the other UI elements around them can adjust their size\. Now you could add a bit of horizontal elasticity by adding space between the glyphs and making the spaces wider, but that’s just adding white\-space in a more distributed fashion\.
One of the few feature\-rich fonts on Mac OS X is Zapfino; it includes character variants, which could be used to expand or contract text horizontally\. Below is the compact version of my name:
[![Matthias [Zapfino, Compact]](https://wiesmann.codiferes.net/wordpress/wp-content/uploads/ZapfinoMatthiasCompact.png)](https://wiesmann.codiferes.net/wordpress/wp-content/uploads/ZapfinoMatthiasCompact.png)
And below is the expanded version, which is 24 pixels wider\.
[![Matthias [Zapfino Extended]](https://wiesmann.codiferes.net/wordpress/wp-content/uploads/ZapfinoMatthiasExpanded.png)](https://wiesmann.codiferes.net/wordpress/wp-content/uploads/ZapfinoMatthiasExpanded.png)
Now Zapfino has quite a special style, and I would certainly not advocate using it in UI controls, but the underlying typographic feature is available to any font, including serious\-looking *grotesk* fonts, and also to emojis\. If you consider the glyph for a family with two kids, 🧑🧑🧒🧒, you can have a wide version with everyone lined up, or a stacked version with the kids in front\.
Basically, this means you could design a text component that switches typographic variants based on the available width, and maybe also the height, by switching ascenders and descenders\.
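The selection logic for such a component could be as simple as the following Python sketch. Everything here is hypothetical: the `Variant` type, the `pick_variant` function and the absolute pixel widths are made up for illustration; only the 24 pixel difference between the two versions comes from the Zapfino example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    width: int  # rendered width in pixels (illustrative values)


def pick_variant(variants: list[Variant], available_width: int) -> Variant:
    """Return the widest variant that fits, else the narrowest one."""
    fitting = [v for v in variants if v.width <= available_width]
    if fitting:
        return max(fitting, key=lambda v: v.width)
    # Nothing fits: fall back to the variant that overflows the least.
    return min(variants, key=lambda v: v.width)


# Assumed widths; the expanded version is 24 pixels wider than the compact one.
variants = [Variant("compact", 200), Variant("expanded", 224)]

print(pick_variant(variants, 210).name)
print(pick_variant(variants, 230).name)
```

A layout engine would call something like this on every resize, the same way it already re-measures boxes, with the measured widths coming from the font shaper rather than from constants.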
## Sub glyph processing
It’s tempting to see text as a sequence of atomic units: characters, glyphs, grapheme clusters\. Features like ligatures and composed characters blur the line\. For the user, ½ is a compact representation of 1/2, yet I can color the top number red in the second case but not in the first\. If you write it in MathML, it works: $\frac{1}{2}$\. It is never a good sign when a feature’s presence depends on the representation\.
We tend to think that the dot on an `i` character is part of it, but the Turkic languages disagree\. So I can write the same character as a sequence of a dotless i and a combining dot: ı̇, with the dot colored red, at least in the markup\. At the time of writing, Chrome just renders the character in uniform black\.
In Arabic and Hebrew, vowels are written in the form of diacritical marks, called Harakat and Niqqud respectively\. These are often printed in a different color in religious or educational texts\. This feature is supported by page layout systems, but not by web tools\.
Now you might argue this is a minor case\. Except that most Chinese characters are compounds\. The character for forest is 林, which has the following Ideographic Description Sequence: ⿰ \(Left\-to\-Right\) \+ 木 \(tree\) \+ 木 \(tree\)\. Currently, Hanzi composition for rendering is not handled by Unicode, but by specialized libraries \(if at all\)\. Now if you think that such decompositions do not apply to Latin characters, let me introduce the CJK Unit Symbols\. You can write meter per second squared as a single character: ㎨ \(`U+33A8`\)\.
The fact is, in each case, you can apply attributes to the text in decomposed form, but not in composed form, and this is annoying \(and inconsistent\), in particular if you have complicated characters like 龖\.
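Part of this decomposition is already available today through Unicode compatibility normalization. The Python sketch below uses the standard `unicodedata` module; the helper name `compatibility_parts` is mine. It shows which of the characters above can be split into parts that could each carry their own attributes. Hanzi such as 林 come back unchanged, since Ideographic Description Sequences are not part of normalization.

```python
import unicodedata


def compatibility_parts(ch: str) -> list[str]:
    """Split one character into its NFKD (compatibility) decomposition."""
    return list(unicodedata.normalize("NFKD", ch))


for ch in ("½", "㎨", "林"):
    # Print the Unicode name of every part, so invisible differences
    # (fraction slash vs. plain slash) are easy to spot.
    names = [unicodedata.name(part) for part in compatibility_parts(ch)]
    print(ch, "->", names)
```

Note that the decomposition loses information: the parts of ½ no longer say that they form a fraction, so a text API would need to keep both the composed character and the structure of its parts.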
## Rotation
Matthias
Writing vertically is a traditional text layout in China, Japan and Korea, and probably many other locales\. This translates into a different approach for layouts, but also for hardware\. My Xiaomi band has a screen with a resolution of 192 × 490 pixels, narrow and tall\. This shape makes sense for a watch, and is no problem for displaying CJK text, but notifications in Latin text are a bit weird\.
This means text controls could have a dynamic layout that switches between horizontal and vertical alignment according to the layout constraints, but also things like device orientation\. Vertical text is less of a thing in western typography, but we have all seen a vertical sign for a hotel or a restaurant and our brains did not explode\.
The interesting thing is, some glyphs transform when in vertical mode; for instance in CJK text, parentheses are rotated by 90°\. National flags are another thing that can rotate by 90° depending on the circumstances, and the selection of glyph variants described above could kick in\. Fractions are a special case of rotation: if you consider the half fraction from above, the three representations ½, 1/2 and $\frac{1}{2}$ move around the digits and the fraction bar, which results in a different height/width tradeoff\.
Basically, in this model, a chunk of text is much closer to a layout component that displays a row \(or a column\) of glyphs, which in turn can adapt their layout\.
## Hyper\-localisation / personalization
One role of text in user interfaces is to present a human\-readable version of some data: dates, times, measured values, positions, etc\. This gets complicated rather quickly, as different people expect this information in different formats, and the way it is presented in text changes based on language, culture, and pesky things like grammar\.
To avoid this, most user interfaces use a contrived way of expressing things: instead of saying *you received 4 e\-mails*, they will say something like *received e\-mails: 4*, which sidesteps things like grammatical cases and correct plural handling\.
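For comparison, this is roughly what correct plural handling looks like with the classical tooling, here Python’s standard `gettext` module. The function name and message strings are illustrative, and no translation catalog is assumed to be installed, so `ngettext` falls back to the English singular/plural rule.

```python
import gettext


def received_mail_message(count: int) -> str:
    """Build a natural-language message with correct plural handling."""
    # ngettext picks the singular or plural template for `count`;
    # with a translation catalog installed it would apply the plural
    # rules of the target language instead of the English fallback.
    template = gettext.ngettext(
        "you received {n} e-mail",
        "you received {n} e-mails",
        count,
    )
    return template.format(n=count)


for count in (1, 4):
    print(received_mail_message(count))
```

Even this small case requires every message to be written twice and translated with language\-specific plural rules, which is why so many interfaces fall back on the *label: value* form.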
I think AI will change that\. LLMs use normal language to express these things, so expectations will change, and the complexity of handling language formatting is something that can be automated using AI\. This would mean that the API could offer facilities that instantiate a text component to display some value in natural language and handle all the localisation and formatting aspects, including personalisation and politeness, so nuclear physicists can get surfaces expressed in [ronabarns](https://en.wikipedia.org/wiki/Barn_(unit))\.
If an AI can generate one formatting template, it can also generate more\. Dates could always be formatted in a variety of lengths, from *2025/11/1* to *Saturday the first of November 2025*; this can certainly be generalized, as units can be abbreviated \(" for inches\), numbers simplified \(½ for 50%\), or changed into the canonical representation, like for instance Roman numerals for centuries in French\.
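As a rough illustration of such a ladder of representations, the Python sketch below produces several English renderings of one date, sorted from shortest to longest. The templates are hand\-written stand\-ins for what a generated formatter might offer, and the function name is hypothetical.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def date_representations(d: date) -> list[str]:
    """Return renderings of one date, from shortest to longest."""
    month = MONTHS[d.month - 1]
    forms = [
        f"{d.year}/{d.month}/{d.day}",
        f"{d.day} {month[:3]} {d.year}",
        f"{d.day} {month} {d.year}",
        # date.weekday() counts from Monday = 0.
        f"{WEEKDAYS[d.weekday()]} {d.day} {month} {d.year}",
    ]
    return sorted(forms, key=len)


for form in date_representations(date(2025, 11, 1)):
    print(form)
```

A text component could then pick from this list according to the space it is given, rather than truncating a single fixed format.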
## Complex, dynamic, page layout
The web did something strange to page layouts: on one hand the web is an omnipresent page layout system, on the other hand it is a pretty mediocre one, as it was never really designed for this [¹](https://wiesmann.codiferes.net/wordpress/archives/41710#footnote1), and the renderer is such a beast that there are very few implementations\. A few years back, I wrote that [HTML is the new X11](https://wiesmann.codiferes.net/wordpress/archives/30897), in the sense that it is omnipresent but largely dead, with few implementations, and everyone coding against higher\-level APIs\.
As graphics APIs evolve, they will sooner or later get to the point where they can offer a better page rendering experience than embedding a web page, which is currently the default\. One way to look at this is to say you want to be able to implement a page layout program using the API; the other is that you want an API at which you can throw a graph of nodes and ask it to lay them out, as this is already the API you use to do UI layout\.
The difference is that the container needs to handle proper layout rules, features like justification and word\-breaking, while taking into account that the various runs of text could change their dimensions to get a nicer layout\. The container would also have to handle non\-linear parts: floating figures and images, excerpts, notes and [ruby annotations](https://en.wikipedia.org/wiki/Ruby_character)\. Basically, a better LaTeX, where each part of the page is potentially dynamic\.
LLMs tend to output a lot of text, and they should be capable of providing additional structure, but also out\-of\-band information, so an API that can render their output will be most valuable\.