
Friday, March 8, 2024

Breaking the Cycle 🔗💥

by Jennifer Daniel

(This article was originally published on Jennifer’s Substack, January 17, 2023. Republished here with minor revision.)

Phoenix image
In the fall of 2022, the Unicode Technical Committee announced that the 2023 release of the Unicode Standard would be a “dot” release with limited character additions, with the next major release in 2024. This wasn’t without precedent: COVID slowed down the release of Unicode 14.0 in 2020 and the world seemed to survive 😉. Subcommittees were well prepared and adjusted accordingly, discussing what this meant for their respective areas of expertise.

For the Emoji Subcommittee (ESC), the group responsible for defining the rules, algorithms, and properties necessary to achieve interoperability between different platforms for those smiley faces that appear on your keyboard (Shout out 🥰🥹🤔🫣🫡😵‍💫!), this delay presented an opportunity. Sure, we were so close to exhaling a sigh of relief (the intake period for Emoji 16.0 proposals had just completed). But upon learning we couldn’t ship any new codepoints until 2024 we turned our energy towards recommending new emoji based on existing ones. (These are called emoji ZWJ sequences. That's when a combination of multiple emoji display as a single emoji … like 👩 🏽 + 🏭 = 🧑🏽‍🏭).
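To make the ZWJ idea concrete, here is a minimal sketch in Python that assembles the factory-worker example from code points that already exist. The helper names `join_with_zwj` and `code_points` are illustrative only, and whether the result draws as one glyph depends on the font and platform.

```python
# A minimal sketch: building an emoji ZWJ sequence from existing code points.
ZWJ = "\u200D"              # ZERO WIDTH JOINER
PERSON = "\U0001F9D1"       # person
MEDIUM_TONE = "\U0001F3FD"  # medium skin tone modifier
FACTORY = "\U0001F3ED"      # factory


def join_with_zwj(*parts):
    """Glue emoji together with ZWJ so a supporting font can draw one glyph."""
    return ZWJ.join(parts)


def code_points(text):
    """Return the code points of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The skin tone modifier attaches directly to the person; ZWJ joins the factory.
factory_worker = join_with_zwj(PERSON + MEDIUM_TONE, FACTORY)
print(factory_worker, code_points(factory_worker))
```

No new code point is involved: the "new" emoji is four existing code points in a row.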

When Less is More

An incredibly powerful aspect of written language is that it consists of a finite number of characters that can "do it all". And yet, as the emoji ecosystem has matured over time our keyboards have ballooned and emoji categories are about to hit or have hit a level of saturation. Upon reflecting on how emoji are used, the ESC has entered a new era where the primary way for emoji to move forward is not merely to add more of them to the Unicode Standard. Instead, the ESC approves fewer and fewer emoji proposals every year.

But our work is not done. Not by a long shot. Language is fluid and doesn’t stand still. There is more to do! This “off-cycle” gives us a chance to address some long-standing major pain points using emoji. The first one that came to mind: skin-tone.

What is a family?

The encoding of multi-person multi-tone support has matured over the years; however, the implementation can seem random to the average person. While it’s true that all people emoji have toned options (with the exception of characters where you can’t see skin, like 🤺), there are … misfits. Some two-person emoji offer tone support (🧑🏻‍❤️‍🧑🏿), others do not (👯). A few non-RGI emoji render with tone but with no affordance to change one of the two characters (for example, 🤼🏾‍♂ renders with skin tone on Android but as gold on iOS. WHY. This is why we standardize these things, people).

And then ... There is the suite of family emoji (👨‍👦 👨‍👦‍👦 👨‍👧 👨‍👧‍👦 👨‍👧‍👧 👩‍👦 👩‍👦‍👦 👩‍👧 👩‍👧‍👦 👩‍👧‍👧 👨‍👨‍👦 👨‍👨‍👦‍👦 👨‍👨‍👧 👨‍👨‍👧‍👦 👨‍👨‍👧‍👧 👩‍👩‍👦 👩‍👩‍👦‍👦 👩‍👩‍👧 👩‍👩‍👧‍👦 👩‍👩‍👧‍👧 👨‍👩‍👦 👨‍👩‍👦‍👦 👨‍👩‍👧 👨‍👩‍👧‍👦 👨‍👩‍👧‍👧 👪). These characters include two people, three people, sometimes four and none of them have any tone support (!). We seem to have a lot of family emoji and yet simultaneously not enough.
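Most of the family emoji listed above are themselves ZWJ sequences of individual people. A small Python sketch, with illustrative helper names, shows how one of them splits back into its members, which is also roughly what a platform without the combined glyph ends up displaying:

```python
# Sketch: a family emoji as a ZWJ sequence of individual people.
ZWJ = "\u200D"  # ZERO WIDTH JOINER
MAN, WOMAN, GIRL, BOY = "\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"

# man + ZWJ + woman + ZWJ + girl + ZWJ + boy
family = ZWJ.join([MAN, WOMAN, GIRL, BOY])


def members(sequence):
    """Split a ZWJ sequence into the emoji it is built from."""
    return sequence.split(ZWJ)


def fallback_rendering(sequence):
    """Approximate the fallback display: the members shown side by side."""
    return "".join(members(sequence))


print(len(family))                 # number of code points, joiners included
print(members(family))             # the four individual people
print(fallback_rendering(family))  # the people without the joiners
```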

The 26 “family” emoji can be broken down into four groups:

Families image

Despite the Unicode Standard containing 26 “family” emoji, each one of these glyphs is overly prescriptive with regard to delivering on a visual representation of a family. The inclusion of many permutations of families was well intentioned. But we can’t list them all, and by listing some of the combinations, it calls attention to the ones that are excluded.

What even is a family? For some, family is the people you were raised with. Others have embraced friends as their chosen family. Some families have children, other families have pets. There are multi-generational families, multiracial families and of course many families are any combination of all of these characteristics and more.

Fortunately, we don’t need to add 7000 variants to your keyboards (even this would fall short of capturing the breadth of "family" as a concept). Instead we can juxtapose individual emoji together to capture a concept with some reasonable level of specificity, not too unlike arranging letters together to create words to convey concepts 😉

Different families image
For emoji keyboards to advance in creating more intuitive and personalized experiences the Emoji Subcommittee is recommending a visual deprecation of the family emoji. This small set of emoji will be redesigned as part of a multi-phase effort to “complete the set” of toned variants for the remaining multi-person emoji. This of course raises the question: when there are as many families as there are people in the world, is there an effective way of conveying the concept of “family” without being overly prescriptive in defining what is and is not a family? Well, thankfully icons can do a lot of heavy lifting without requiring very much detail.

Family symbol image

When is an emoji running for the police or getting chased by them?

Another area the ESC is actively exploring is how the semantics of emoji sequences can differ when writing directionality changes. Some emoji characters have semantics that encode implicit directionality, but when the string is mirrored their meaning may be unintentionally lost or changed.

Left to Right emoji image
Left to Right Emoji Sequence: Quickly running towards an “exciting” police chase

Right to Left emoji image
Right to Left Emoji Sequence: Running away from the coppers

What, if anything, can we do to aid in ensuring that messages are meaningfully translated, be they tiny pictures or tiny letters? As part of 15.1 we’re proposing a small set of emoji with strong directionality (with an initial focus on people) to face the opposite direction. Soon you too can run towards or away from ... excitement.
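The post does not spell out the mechanism, so the sketch below is an assumption based on the Emoji 15.1 data as eventually published: a right-facing variant is written as the existing person emoji, a ZWJ, and the rightwards arrow with an emoji variation selector. The helper name `facing_right` is hypothetical.

```python
# Sketch (assumption: the base + ZWJ + rightwards-arrow pattern of Emoji 15.1).
ZWJ = "\u200D"                 # ZERO WIDTH JOINER
RIGHT_ARROW = "\u27A1\uFE0F"   # BLACK RIGHTWARDS ARROW + VARIATION SELECTOR-16
PERSON_RUNNING = "\U0001F3C3"  # person running (drawn facing left on most platforms)


def facing_right(person_emoji):
    """Append ZWJ + arrow to request the right-facing form of a person emoji."""
    return person_emoji + ZWJ + RIGHT_ARROW


runner_right = facing_right(PERSON_RUNNING)
print(runner_right, [f"U+{ord(ch):04X}" for ch in runner_right])
```

On a platform that does not know the sequence, the runner and the arrow appear side by side, which still hints at the intended direction.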

Emoji 15.1

Given that the intake cycle of emoji proposals for Unicode 16.0 ended last July, the Emoji Subcommittee has also decided to temporarily delay the intake of Unicode Version 17.0 proposals until April 2024. Fortunately, you won’t have to wait until then to get new emoji. (Note: I know it sounds like I’m talking about the past and future simultaneously ... the emoji lifecycle is looooong and as a result overlaps with multiple releases. Expect a future blog post about the Emoji 15.0 candidates landing early this year (Shout out goose, pink heart, and pushing hands). I’ve been holding off writing about this set until you can actually see them on your phones but given that we’re already talking about 2024 maybe it’s time I dust that blog post off).

Emoji 2023 timeline image

Anyways, the list of Emoji 15.1 recommendations for 2024 includes 578 characters (most of them the candidates described above to support directionality). The list also includes a few humble additions including a broken chain, a lime, a non-poisonous mushroom, a nodding and shaking face, and a phoenix bird. Each one of these leverages a unique valid ZWJ sequence of emoji so while they look like atomic characters made of a single codepoint they are composed of two or more codepoints.

Broken chain and other emoji image

Broken chain is the result of a 🔗💥 ZWJ sequence and carries a variety of meanings, such as freedom, breaking a cycle, or perhaps a broken url ;-). Shaking face and nodding face are composed with arrows to imply movement in a still image (🙂↔️ and 🙂↕️, respectively). Oh, and of course there is a phoenix rising from the ashes (🐦🔥), an ancient metaphor that captures the zeitgeist of today.
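A quick way to see that these "atomic-looking" emoji are really several code points is to list the character names inside each sequence. The sketch below uses Python's standard `unicodedata` module; the sequences are written as I understand them from the released Emoji 15.1 data (face or bird, ZWJ, then arrow or fire), and the dictionary labels are only descriptive.

```python
import unicodedata

# Sketch: three of the ZWJ sequences discussed above, spelled out by code point.
SEQUENCES = {
    "phoenix": "\U0001F426\u200D\U0001F525",
    "head shaking horizontally": "\U0001F642\u200D\u2194\uFE0F",
    "head shaking vertically": "\U0001F642\u200D\u2195\uFE0F",
}


def describe(sequence):
    """Return the Unicode character name of every code point in the sequence."""
    return [unicodedata.name(ch) for ch in sequence]


for label, sequence in SEQUENCES.items():
    print(f"{label}: {len(sequence)} code points -> {describe(sequence)}")
```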

The Unicode Technical Committee (UTC) will review the required documents at its first meeting of 2023 in January – and if these candidates move forward, you can expect an update from the UTC later this Spring and Summer.


Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock

Wednesday, November 1, 2023

What do a leafless tree, a fingerprint, and a harp have in common?

This is not a set up to a riddle. This is Emoji 16.0.

By Jennifer Daniel, Chair of the ESC


This week, the Unicode Technical Committee gathered for our last meeting of 2023 to discuss the encoding, data files, and list of characters related to digitizing the world’s languages. Amongst the topics discussed were emoji and as a result seven new characters are on their way for inclusion into the Unicode Standard, into your keyboards, and into your hearts ;-)

emoji table image
The final recommendations culminated in seven emoji: one emoji per major category.

An incredibly powerful aspect of written language is that it consists of a finite number of characters that can “do it all”. And yet, as the emoji ecosystem has matured over time our keyboards have ballooned and emoji categories are about to hit (or have hit) a level of saturation. Upon reflecting on how emoji are used, the Unicode Emoji Subcommittee (ESC) has entered a new era where the primary way for emoji to move forward is not merely to add more of them to the Unicode Standard, but to consider how the ones added provide the most linguistic flexibility. As a result, the ESC approves fewer and fewer emoji proposals every year.

The few that are added this year have demonstrated their adaptability in different contexts. Take, for example, fingerprint. It is commonly used to represent multiple concepts. Fingerprints are a symbol of identity (unique as you), security (as a passkey), and forensics (what crime show logo is complete without a fingerprint?). While we think of fingerprints as a relatively modern phenomenon, according to Forensics Digest the earliest use of fingerprints dates back to 1000 B.C.

In fact all of this year’s emoji candidates have deep roots in history. Harps have been known since antiquity in Asia, Africa, and Europe, dating back at least as early as 3000 BCE. Today the harp has political, sporting, corporate, and religious symbolism 👼. Leafless trees have been around as long as ... well, trees (and poetry!) I suppose. Leafless trees literally represent droughts or winter and metaphorically indicate a state of barrenness and death.

Shovel isn’t just another noun. Sure, yes, it’s a tool commonly found in your shed; in our keyboards, however, it’s also a verb. Digging yourself out of a hole, digging yourself into a hole, shoveling 💩, it does it all. But wait, there’s more. Splatter is one of those stealth emoji that when you look at you might be thinking, “really, another sex emoji?” (To be honest, show me someone who doesn’t think an emoji is a sex emoji and I’ll show you someone who lacks imagination). Splatter is a spill. Splatter is expressive. Splatter is soft: a perfect counterpoint to collision 💥, the bouba to 💥’s kiki.

When can you get these new emoji?

A simple question that deserves a simple answer. Alas, you’re dealing with Unicode so the answer is complex. Did you know it can take up to two years to encode an emoji? It’s true. If we want the symbols we digitize to truly “just work” across the entirety of not just the Internet but all digital surfaces … it takes time. So, don’t expect to see these characters anytime soon. In fact, despite the previous batch of emoji (phoenix, lime, broken chain, etc.) getting approved last year they still haven’t landed on your device of choice yet but are well on their way to pop up in the first half of 2024.

emoji at a glance
Emoji 16.0 has a long road ahead and will appear on most devices in May-June 2025.



Support Unicode
To support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation. As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.


Tuesday, September 12, 2023

Announcing The Unicode® Standard, Version 15.1


Version 15.1 of the Unicode Standard is now available. This minor version update includes updated code charts, data files and annexes. The core specification is unchanged from Unicode Version 15.0.

This version adds 627 characters, bringing the total number of characters to 149,813. The additions include 622 CJK unified ideographs in a new block, CJK Unified Ideographs Extension I. These new ideographs are urgently needed in China for use in public service databases, and are expected to be included in a forthcoming amendment to China’s GB 18030-2022 standard. The other new characters are five ideographic description characters that enhance the ability to describe rare or not-yet-encoded CJK ideographs.
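The character counts quoted here can be cross-checked with a few lines of arithmetic. This is only a sanity-check sketch: the 149,186 baseline is the Unicode 15.0 total quoted in the "Unicode in 2022" post further down this page, and the variable names are illustrative.

```python
# Sketch: cross-checking the Unicode 15.1 character counts quoted in the posts.
UNICODE_15_0_TOTAL = 149_186          # total quoted in the "Unicode in 2022" post
new_cjk_ideographs = 622              # CJK Unified Ideographs Extension I
new_description_characters = 5        # ideographic description characters

added_in_15_1 = new_cjk_ideographs + new_description_characters
unicode_15_1_total = UNICODE_15_0_TOTAL + added_in_15_1

print(added_in_15_1)       # should match the 627 additions announced
print(unicode_15_1_total)  # should match the 149,813 total announced
```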

There are six completely new emoji, such as for phoenix and lime and (finally) an edible mushroom. For 108 people emoji, you can now switch the direction that they are facing (for example, person walking facing right versus facing left).

Security-related updates have been made to UAX #9, Unicode Bidirectional Algorithm and UAX #31, Unicode Identifiers and Syntax along with updates to UTS #39, Unicode Security Mechanisms. These updates complement the release of a new Unicode Technical Standard, UTS #55, Unicode Source Code Handling.

The new characters are limited to three blocks, and the code charts for several other blocks have changed. The most significant change to charts is for the CJK Unified Ideographs, CJK Unified Ideographs Extension A and CJK Unified Ideographs Extension B blocks with the addition of representative glyphs and source references for over 24,000 KP-source (North Korea) ideographs. There are also many other glyph corrections and improvements; see the 15.1 delta code charts for details.

Significant updates have been made to UAX #14, Unicode Line Breaking Algorithm and UAX #29, Unicode Text Segmentation adding better support for scripts of South and Southeast Asia, including grapheme cluster support for aksaras and consonant conjuncts, and line breaking at orthographic syllable boundaries.

For complete details on Unicode Version 15.1, see https://www.unicode.org/versions/Unicode15.1.0/.




Monday, February 6, 2023

Announcing New Unicode Adopt-a-Character Site

[image]
The Adopt-a-Character program was launched in 2015. Since that time, AAC funds have supported Unicode's mission to ensure everyone can communicate in their own language. This includes preserving historical scripts such as Egyptian hieroglyphics and providing better language support for digitally disadvantaged and under-resourced languages such as Hanifi Rohingya used in Myanmar and Bangladesh.

Now you can more easily adopt a character and show off your hobby or business, favorite sport, or love – while also supporting a good cause. You can also give the gift of a letter to someone in your life. The possibilities are endless – and each adoption helps Unicode’s goal to support the world’s languages.

All character adoptions are permanent. Adoption of a specific character at the limited gold and silver levels is on a first-come-first-served basis. All sponsors receive a digital badge and are recognized on Unicode’s website, Twitter feed, and Friends of Unicode Facebook page.

To start your adoption, visit our new page!

Unicode, Inc. is a non-profit, 501(c)3 organization and contributions may be eligible for a tax deduction. Please consult with a tax expert for details.




Tuesday, January 17, 2023

What’s New in Emoji 15.1?

Doing more, with less

By: Jennifer Daniel, Chair of the Emoji Subcommittee

[image phoenix]

This past Fall, the Unicode Technical Committee announced the delay of Unicode 16.0. This wasn’t without precedent: COVID slowed down the release of Unicode 14.0 in 2020 and the world seemed to survive 😉. Subcommittees were well prepared and adjusted accordingly, discussing what this meant for their respective areas of expertise.

For the Emoji Subcommittee (ESC), the group responsible for defining the rules, algorithms, and properties necessary to achieve interoperability between different platforms for those smiley faces that appear on your keyboard (Shout out 🥰🥹🤔🫣🫡😵‍💫!), this delay presented an opportunity. Sure, we were so close to exhaling a sigh of relief (the intake period for Emoji 16.0 proposals had just completed). But upon learning we couldn’t ship any new codepoints until 2024 we turned our energy towards recommending new emoji based on existing ones. (These are called emoji ZWJ sequences. That's when a combination of multiple emoji display as a single emoji … like 👩 🏽 + 🏭 = 🧑🏽‍🏭).

When Less is More

An incredibly powerful aspect of written language is that it consists of a finite number of characters that can "do it all". And yet, as the emoji ecosystem has matured over time our keyboards have ballooned and emoji categories are about to hit or have hit a level of saturation. Upon reflecting on how emoji are used, the ESC has entered a new era where the primary way for emoji to move forward is not merely to add more of them to the Unicode Standard. Instead, the ESC approves fewer and fewer emoji proposals every year.

But our work is not done. Not by a long shot. Language is fluid and doesn’t stand still. There is more to do! This “off-cycle” gives us a chance to address some long-standing major pain points using emoji. The first one that came to mind: skin-tone.

What is a family?

The encoding of multi-person multi-tone support has matured over the years; however, the implementation can seem random to the average person. While it’s true that all people emoji have toned options (with the exception of characters where you can’t see skin, like 🤺), there are … misfits. Some two-person emoji offer tone support (🧑🏻‍❤️‍🧑🏿), others do not (👯). A few non-RGI emoji render with tone but with no affordance to change one of the two characters (for example, 🤼🏾‍♂).

And then … There is the suite of family emoji (👨‍👦 👨‍👦‍👦 👨‍👧 👨‍👧‍👦 👨‍👧‍👧 👩‍👦 👩‍👦‍👦 👩‍👧 👩‍👧‍👦 👩‍👧‍👧 👨‍👨‍👦 👨‍👨‍👦‍👦 👨‍👨‍👧 👨‍👨‍👧‍👦 👨‍👨‍👧‍👧 👩‍👩‍👦 👩‍👩‍👦‍👦 👩‍👩‍👧 👩‍👩‍👧‍👦 👩‍👩‍👧‍👧 👨‍👩‍👦 👨‍👩‍👦‍👦 👨‍👩‍👧 👨‍👩‍👧‍👦 👨‍👩‍👧‍👧 👪). These characters include two people, three people, sometimes four and none of them have any tone support (!). We seem to have a lot of family emoji and yet simultaneously not enough.

The 26 “family” emoji can be broken down into four groups:

[image families]

Despite the Unicode Standard containing 26 “family” emoji, each one of these glyphs is overly prescriptive with regard to delivering on a visual representation of a family. The inclusion of many permutations of families was well intentioned. But we can’t list them all, and by listing some of the combinations, it calls attention to the ones that are excluded.

What even is a family? For some, family is the people you were raised with. Others have embraced friends as their chosen family. Some families have children, other families have pets. There are multi-generational families, multi-racial families and of course many families are any combination of all of these characteristics and more.

Fortunately, we don’t need to add 7000 variants to your keyboards (even this would fall short of capturing the breadth of "family" as a concept). Instead we can juxtapose individual emoji together to capture a concept with some reasonable level of specificity, not too unlike arranging letters together to create words to convey concepts 😉

[image toned families]

For emoji keyboards to advance in creating more intuitive and personalized experiences the Emoji Subcommittee is recommending a visual deprecation of the family emoji. This small set of emoji will be redesigned as part of a multi-phase effort to “complete the set” of toned variants for the remaining multi-person emoji. This of course raises the question: when there are as many families as there are people in the world, is there an effective way of conveying the concept of “family” without being overly prescriptive in defining what is and is not a family? Well, thankfully icons can do a lot of heavy lifting without requiring very much detail.
[image before-after]

When is an emoji running for the police or getting chased by them?

Another area the ESC is actively exploring is how the semantics of emoji sequences can differ when writing directionality changes. Some emoji characters have semantics that encode implicit directionality, but when the string is mirrored their meaning may be unintentionally lost or changed.

[image rightwards]
Left to Right Emoji Sequence
Quickly running towards an “exciting” police chase


[image leftwards]
Right to Left Emoji Sequence
Running away from the coppers


What, if anything, can we do to aid in ensuring that messages are meaningfully translated, be they tiny pictures or tiny letters? As part of 15.1 we’re proposing a small set of emoji with strong directionality (with an initial focus on people) to face the opposite direction. Soon you too can run towards or away from … excitement.

Emoji 15.1

Given that the intake cycle of emoji proposals for Unicode 16.0 ended last July, the Emoji Subcommittee has also decided to temporarily delay the intake of Unicode Version 17.0 proposals until April 2024. Fortunately, you won’t have to wait until then to get new emoji. The list of recommendations includes 578 characters (most of them the candidates described above to support directionality). The list also includes a few humble additions including a broken chain, a lime, a non-poisonous mushroom, a nodding and shaking face, and a phoenix bird. Each one of these leverages a unique valid ZWJ sequence of emoji so while they look like atomic characters made of a single codepoint they are composed of two or more codepoints.

[image candidates]

Broken chain is the result of 🔗💥, with a variety of meanings, such as freedom, breaking a cycle, or perhaps a broken url ;-). Like the bi-directional emoji touched on above, shaking face and nodding face are the result of 🙂↔️ and 🙂↕️ respectively. Oh, and of course there is a phoenix rising from the ashes (🐦🔥), a perfect metaphor to capture where we are today.

The Unicode Technical Committee (UTC) will review the required documents at its first meeting of 2023 in January – and if these candidates move forward, you can expect an update from the UTC later this Spring and Summer.



Wednesday, December 21, 2022

Unicode in 2022

2022 Image

Hello Everyone!

As we go into the New Year, the Unicode team thought we’d share some highlights from this past year. From source-code spoofing to preserving indigenous languages, the Unicode team has had another full year, including expanding the number of characters that appear on billions of devices around the world.


Nearly 150,000 characters!

On the character side, we reached a total of just shy of 150,000 characters (149,186 to be exact). Of the 4,489 characters added in the 15.0 release, the biggest set was 4,192 ideographs for use in Chinese, Japanese, and Korean. There are also two new scripts, Nag Mundari and Kawi. Nag Mundari is a script used to write the Mundari language of India, a language with 1.1 million speakers. Kawi is an important historic script of insular Southeast Asia, found in inscriptions and on artifacts in several languages dating from the 8th to the 16th centuries, and is undergoing a revival today amongst enthusiasts.

And we can’t forget the 20 new emoji characters. We’re looking forward to seeing which are the most popular: shaking face? Goose? Maracas? Pink heart? If you’re involved in implementing emoji, you’ll also want to look at the latest changes in UTS #51 Unicode Emoji.

See the Unicode15.0.0 page for more details. We’re also changing how we do releases; for more, see 2023 Release Planning.

The Launch of ICU4X

ICU is used in every major device and operating system; it’s how you see a date or number on your phone, for example. This new project, ICU4X, was created to solve the needs of clients who wish to provide client-side internationalization for their products in resource-constrained environments and across many programming languages. After 2½ years of work by Google, Mozilla, Amazon, and community partners, the Unicode Consortium has published ICU4X 1.0, its first stable release. Built from the ground up to be lightweight, portable, and secure, ICU4X learns from decades of experience to bring localized date formatting, number formatting, collation, text segmentation, and more to devices that, until now, did not have a suitable solution. For details, see Announcing ICU4X 1.0.

When does i ≠ і?

Can you tell the difference between i and і? Yeah, most people can’t. The first set of changes to help counter source-code spoofing was included in the 15.0 versions of the UAX #9 Unicode Bidirectional Algorithm, UAX #31 Unicode Identifier and Pattern Syntax, and UTS #39 Unicode Security Mechanisms.

For 2023, there is a new draft UTS #55 Unicode Source Code Handling, providing guidance for programming language designers and tooling developers, and specifying mechanisms to avoid usability and security issues arising from improper handling of Unicode. More changes are on their way for UAX #9, UAX #31, and UTS #39 as well.
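The i versus і confusion is easy to demonstrate. The sketch below is a deliberately crude heuristic, not the mechanism defined in UTS #39 or UTS #55: it guesses a letter's script from the first word of its Unicode character name and flags identifiers that mix scripts. The function names are made up for illustration.

```python
import unicodedata

LATIN_I = "i"          # LATIN SMALL LETTER I
CYRILLIC_I = "\u0456"  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I


def script_hint(ch):
    """Crude script guess: the first word of the character's Unicode name."""
    return unicodedata.name(ch).split()[0]


def looks_mixed_script(identifier):
    """Flag identifiers whose letters appear to come from more than one script."""
    hints = {script_hint(ch) for ch in identifier if ch.isalpha()}
    return len(hints) > 1


print(LATIN_I == CYRILLIC_I)                     # the two characters differ
print(script_hint(LATIN_I), script_hint(CYRILLIC_I))
print(looks_mixed_script("admin"))               # all Latin
print(looks_mixed_script("adm" + CYRILLIC_I + "n"))  # Latin with one Cyrillic letter
```

A production check would use the Unicode Script property and the confusables data rather than character names.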

Åge Møller, Πέτρος Νικόλαος Καρατζής, ராஜேந்திர சோழன்

We’re making great progress on internationalized formatting of people’s names. What does that mean? Software needs to be able to format people's names, such as John Smith or 宮崎駿. The formatting can be surprisingly complicated: for example, people may have a different number of names, depending on their culture. They might have only one name (“Zendaya”), only two (“Albert Einstein”), or three or more. So the software needs to handle missing or extra name fields gracefully.

There are many more complexities; for more details, see Formatting people’s names.

You have 2 unread messages.

Or, you have 3 items in your cart. Whenever a computer needs to construct a sentence using “placeholders” such as 3, it is formatting a message. The current industry standard is ICU’s message formatting; a project started about 3 years ago, with the goal of improving on that to build a more robust and extensible mechanism. There is now a Tech Preview in ICU, and we’d urge developers to try it out!
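To illustrate what "formatting a message" involves, here is a toy Python sketch. It is not ICU's MessageFormat API or the new tech-preview syntax: the message table, the `format_message` helper, and the English-only plural rule are all assumptions made for the example.

```python
# Toy sketch of placeholder-based message formatting with plural selection.
# Not ICU's API: the table and helper names below are hypothetical.
MESSAGES = {
    "unread": {
        "one": "You have {count} unread message.",
        "other": "You have {count} unread messages.",
    },
    "cart": {
        "one": "You have {count} item in your cart.",
        "other": "You have {count} items in your cart.",
    },
}


def plural_category_en(count):
    """English-only rule: exactly 1 is 'one', everything else is 'other'."""
    return "one" if count == 1 else "other"


def format_message(key, count):
    """Pick the pattern for the plural category, then fill in the placeholder."""
    pattern = MESSAGES[key][plural_category_en(count)]
    return pattern.format(count=count)


print(format_message("unread", 2))
print(format_message("cart", 3))
print(format_message("cart", 1))
```

Other languages have more plural categories than English, which is one reason a shared, data-driven mechanism is preferable to hand-written rules like this one.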

See message-format-wg for details on the syntax and message2/package-summary.html for the API (note that ICU’s convention for tech previews is to mark them as Deprecated), and the test code in MessageFormat2Test.java for examples of usage.

(There are of course other fixes, upgrades and new features in ICU: see ICU 72 and ICU 71 for more details.)

Māori, Wolof, тоҷикӣ, کٲشُر, ትግርኛ, कॉशुर, মৈতৈলোন্, ᱥᱟᱱᱛᱟᱲᱤ

In CLDR, we now have 95 languages at the Modern level (suitable for full UI internationalization), 6 at the Moderate level (suitable for “document content” internationalization), and 29 at the Basic level (suitable for locale selection). We added a tech preview of formatting for person names, plus additions for Unicode 15.0 (emoji names and search keywords), names for new scripts, new CJK collation, and so on. For more information, see CLDR v42.

Revitalization and Preservation of Indigenous Languages

The Nattilik language community was unable to use their language reliably for even simple, everyday digital text exchanges such as email or text messaging. The Typotheque Syllabics Project, an initiative based out of Toronto and The Hague, Netherlands, undertook research with language keepers across various Syllabics-using Indigenous communities in Canada. By collaborating with Nattilik language keepers and elders in the community, key issues the Nattilik community of Western Nunavut faced were identified, and it was discovered that there were 12 missing syllabic characters from the Unicode Standard. The Consortium worked with the Typotheque Syllabics Project to add 16 characters to the script to support Nattilik and other languages in Unicode version 14.0, and improved the glyphs in Unicode version 15.0. See this blog post from June.

The Past and Future of Flag Emoji

Despite being the largest emoji category with a strong association tied to identity, flags are by far the least used. Flag emoji have always been subject to special criteria due to their open-ended nature, infrequent use, and burden on implementations. The addition of other flags and thousands of valid sequences into the Unicode Standard has not resulted in wider adoption. They don’t stand still, are constantly evolving, and due to the open-ended nature of flags, the addition of one creates exclusivity at the expense of others. Curious to learn more? Read more about the Past and Future of Flag Emoji.

Available Now! New YouTube Playlist and Technical Quick Start Guide

On September 28th, Unicode held a webinar on the “Overview of Internationalization and Unicode Projects” for Unicode enthusiasts. Unicode technical leadership and other experts shared background on our core projects with participants from more than 30 countries. If you missed the webinar, no worries! The recorded sessions are available on this YouTube playlist. And if you are new to Unicode and internationalization or simply want a refresh, you can also check out our Technical Quick Start Guide. This handy guide explains what Unicode is, including answering the question, “What is Internationalization and Why it Matters.” There are also useful links to more detailed information and how you can get involved. Read more here.

Support Unicode 💞💕💌💯✨🌟🤝🛟🎁

Finally, if you are already a contributor to, or member of, Unicode (or your company or organization is!), thank you, Danke, Děkuju, धन्यवाद, merci, 谢谢你, grazie, நன்றி, and gracias! What we have accomplished is only possible because of supporters like you.

And if you want to support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation. As Unicode is a US-based non-profit, 501(c)(3) organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.

Monday, April 4, 2022

Emoji Are Not Born, They Are Made

Unicode now accepting proposals for Emoji 16.0

It’s hard to believe that just as Emoji 14.0 begins to appear on your device of choice this year, the Unicode Emoji Subcommittee [ESC] has already begun to plan for Emoji 16.0. That’s right: as of today, April 4, 2022, applications to submit ideas for new emoji are open through July 31, 2022! 👁️📝👁️

So, how do you ensure your proposal is the best it can be? Well, here are some tips for consideration as you prepare it.

Check whether the emoji already exists!

✅ First: See if it’s already been approved.

🤔 Second, is it being reviewed?

🧑🏾‍🏫 Tip: Don’t skip any of the fields in the form! Incomplete proposals won’t be processed and will be returned. The ESC team members get a lot of submissions and complete proposals help them evaluate the submissions.

Be sure your proposal meets the criteria for consideration.

We recommend being faithful to the criteria for inclusion as much as possible and consulting the Emoji Subcommittee’s priorities, guidelines, strategies, reports, and audits. Many of the new provisional candidates for Emoji 15.0 are the result of these documents: pink heart, shaking face, rightwards pushing hand. The following are just some of the many considerations for writing a compelling proposal:
  • Multiple Uses
    Does the candidate emoji have significant metaphorical references or symbolism and not merely represent itself?
  • Use in sequences
    How is the emoji used with other emoji to communicate something new?
  • Breaking new ground
    Does the emoji represent something that is not already representable?
  • Distinctiveness
    Explain how and why this emoji represents a distinct, visually iconic entity that is relevant to a global audience.
  • Compatibility
    Is it needed for compatibility with frequently-used emoji in popular existing systems, such as WeChat, Twitter, etc.?
  • Frequency of Use
    Is there a high frequency of use? There should be empirical evidence of high usage in literature, movies, graphic novels, etc. worldwide.
Examples can be found on this page under “Selection Factors.”

Well, let’s get going! How do I propose an emoji?

📝 Submit a proposal

My proposal wasn’t selected :(

We recognize that it will come as a disappointment if your proposal is not one of the few selected for inclusion. 💕 There are loads of reasons why this may have happened.
  • ➕ It can already be represented by a sequence
    (Ex. Garbage fire 🗑️🔥, Can of worms 🥫🪱)
  • 🔍 It’s too specific
    We can’t add every type of flower, every breed of dog, every color of drink
  • 💰 Very few are selected
    Roughly thirty emoji characters are added each year
  • 🐣 It’s a transient concept
    Think less “memes” and more “stable long-standing concepts”. Can you cite how this concept has existed in a communicative manner such as literature, movies, graphic novels, etc.?
  • ♾️ It’s open-ended
    There is no compelling evidence to add it over others of a similar type
  • ❌ Many other factors for exclusion

Why can’t we make EVERYTHING an emoji?

Any emoji additions have to take into consideration usage frequency, trade-offs with other choices, font file size, and the burden on developers (and users!) to make it easier to send and receive emoji. That’s why the Emoji Subcommittee set out to reduce the number of emoji we encode in any given year.

Reconciling the rapid, transient nature of modern communication with the formal, methodical process required by a standards body like the Unicode Consortium is the name of the game these days. Until the sending and receiving of images is standardized in some manner so you can send any image in the world alongside your text messages, not just code points ... well, Unicode is here for the world’s emoji character needs. 🫂💖


Over 144,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages


Monday, March 28, 2022

The Past and Future of Flag Emoji

Emoji Flags are dead, long live Emoji Flags 🏁 🏁 🏁

By Jennifer Daniel, Unicode Emoji Subcommittee Chair

With Emoji 16.0 submissions open from April 4, 2022 through July 31, 2022, the Unicode Emoji Subcommittee members stand with open arms for your future hair pick, khanda, and pink heart emoji proposals (BTW, if you were planning to prepare proposals for those concepts, we have some good news for you: they are already Emoji 15.0 draft candidates!).

That being said, there is one particular type of emoji for which the Unicode Consortium will no longer accept proposals. Flag emoji of any category.

Flag emoji have always been subject to special criteria due to their open-ended nature, infrequent use, and burden on implementations. Today, nine of the ten original flag emoji are among the twenty most frequently shared flags. (The only outlier is Russia.) The addition of other flags and thousands of valid sequences into the Unicode Standard has not resulted in wider adoption. Flags don’t stand still, are constantly evolving, and due to their open-ended nature, the addition of one creates exclusivity at the expense of others.

Why do flag emoji exist in the first place?

Well, the shorter, more technical answer is: The country flags use a generative mechanism, and were encoded early on for compatibility reasons.

The longer answer requires a flashback to the 1990s. KDDI and SoftBank, two Japanese mobile phone carriers, had early emoji sets which included 10 country flags: 🇨🇳 🇩🇪 🇪🇸 🇫🇷 🇬🇧 🇮🇹 🇯🇵 🇰🇷 🇷🇺 🇺🇸¹. A possibly apocryphal explanation is that they were used to denote what to grab for dinner: "American 🇺🇸 or Italian 🇮🇹?" (Such an innocent time in emoji history, pre-hamburger 🍔 emoji). Alas, as Unicode stepped in to create meaningful interoperability between these carrier-specific encodings, it was presented with a problem: why should these 10 countries have flag emoji when others do not?

The original emoji set included ten flags (shown above).
¹ Interestingly, Windows has never supported flag emoji 🔮. So, if you are reading this on a Windows device and flags aren't displaying, simply refer to the image above of the ten original flag emoji.

Various ideas were considered. The Unicode Consortium isn’t in the business of determining what is a country and what isn’t. That’s when the Consortium chose ISO 3166-1 alpha-2 as the source for valid country designations. ISO 3166 is a widely accepted standard, and this particular mechanism represents each country with 2 letters, such as “US” (for United States), “FR” (France), or “CN” (China).
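As a rough illustration of that two-letter mechanism, a country flag is the pair of Regional Indicator Symbols (U+1F1E6 through U+1F1FF) that corresponds to the letters of the code. The sketch below is only an assumption-laden example: the function name `country_flag` is made up for illustration, and it checks only that the input is two ASCII letters, not that the code is actually assigned in ISO 3166-1.

```python
# Code point of REGIONAL INDICATOR SYMBOL LETTER A; B..Z follow consecutively.
REGIONAL_INDICATOR_A = 0x1F1E6


def country_flag(alpha2: str) -> str:
    """Map a two-letter code such as "US" to its flag emoji sequence.

    Hypothetical helper: it validates the shape of the code only,
    not whether the code is a real ISO 3166-1 assignment.
    """
    code = alpha2.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError(f"expected two ASCII letters, got {alpha2!r}")
    # Shift each letter into the Regional Indicator block.
    return "".join(chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code)


if __name__ == "__main__":
    for code in ("US", "FR", "CN"):
        print(code, country_flag(code))
```

Whether the pair renders as a flag or as two boxed letters depends on the font and platform, which is why the same string can look different on different devices.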

It wasn’t a perfect solution, but by allowing the 10 flag emoji (and the rest of the country flags) to be accurately interchanged between DoCoMo, KDDI, SoftBank, Google, Apple, and others, it worked just fine.

Why this flag emoji but not that one?

Today, the largest emoji category is flags. (Out of only ~3,600 emoji, there are over 200 flags!) But did you know that there are over 5,000 geographically-recognized regions that are also “valid”? These are known as subdivision regions and are based on ISO 3166-2. (These include states in the US, regions in Italy, provinces in Argentina, and so on.)

First, what does “valid” mean to the Unicode Standard? Well, think of it this way. Today, anyone could make a font of 5,000 emoji flags using these sequences. They are valid sequences. They are legit sequences. They won’t break. Any platform, application, or font can implement them. The significant difference here is that valid doesn’t mean they are recommended for implementation.

Back to ISO. ISO groups countries in a more formal way than, say, FIFA or the Olympics. For example, the four regions of the UK are regularly used in sport but not recognized in ISO 3166-1. In 2016, the Unicode Consortium started looking into solutions to support their inclusion (with the technical feasibility of adding more if needed in the future). This was the impetus for adding a general mechanism to make all ISO 3166-2 codes be valid for flags. However, only three of the 5,000 ISO 3166-2 codes have widely adopted emoji: England, Scotland, and Wales. (Northern Ireland remains in limbo until an “official flag” is formalized.)
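For readers curious what such a subdivision sequence looks like, here is a minimal sketch: a waving black flag (U+1F3F4), followed by invisible tag characters spelling the lowercased code without its hyphen, closed by a cancel tag (U+E007F). The helper name `subdivision_flag` is hypothetical, and the function does not check whether the code is a real ISO 3166-2 entry or whether any font actually draws it.

```python
BLACK_FLAG = "\U0001F3F4"   # WAVING BLACK FLAG, the base of the sequence
TAG_BASE = 0xE0000          # tag characters mirror ASCII at this offset
CANCEL_TAG = "\U000E007F"   # CANCEL TAG terminates the sequence


def subdivision_flag(iso_3166_2: str) -> str:
    """Build the emoji tag sequence for a code such as "GB-ENG".

    Illustrative only: validity of the code itself is not checked.
    """
    code = iso_3166_2.replace("-", "").lower()
    if not (code.isascii() and code.isalnum()):
        raise ValueError(f"unexpected subdivision code: {iso_3166_2!r}")
    tags = "".join(chr(TAG_BASE + ord(ch)) for ch in code)
    return BLACK_FLAG + tags + CANCEL_TAG


if __name__ == "__main__":
    for code in ("GB-ENG", "GB-SCT", "GB-WLS"):
        seq = subdivision_flag(code)
        print(code, seq, " ".join(f"U+{ord(ch):04X}" for ch in seq))
```

On a platform that does not support a given sequence, the tag characters are simply ignored and a plain black flag is shown, which is one reason a sequence can be valid without being recommended.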

Flags for England, Scotland, and Wales were included in Emoji 5.0

So, with so many “valid sequences” why hasn’t anyone taken advantage of this sweet sweet rich flag opportunity?

At the time, in 2016, adding a few flags seemed reasonable but in retrospect was short-sighted. If the Emoji Subcommittee recommends the addition of a Catalonia flag emoji, then it looks like favoritism unless all the other subdivisions of Spain are added. And if those are added, what about the subdivisions of Japan or Namibia, or the Cantons of Liechtenstein? The inclusion of new flags will always continue to emphasize the exclusion of others. And there isn’t much room for the fluid nature of politics: countries change but Unicode additions are forever. Once a character is added it can never be removed. (That being said, font designers can always update the designs as regimes change.)

How are flag emoji used?

Flags are very specific in what they mean, and they don’t represent concepts used multiple times a day or even multiple times a year. You could say flag emoji have transcended the messaging experience and are primarily found in more auto-biographical contexts. (Like your TikTok bio. Or, maybe you add a flag to your username on Twitter.) But, even then flags are not as commonly found in biographical spaces as you may expect. (The top five emoji found in Twitter bios? ❤️✨💙💜💛.)

Despite being the largest emoji category with a strong association tied to identity, flags are by far the least used. (There are exceptions: usage of the rainbow flag is above median!) That begs the question, “So, why not encode more identity flags?” Well, we have seen the same results for flags as we have seen for other emoji: a very long tail of rarely used options. They also tend to change over time! In the past six years since adding a Pride Flag to the Unicode Standard (2016) it’s already been redesigned. Many times. Identities are fluid and unstoppable, which makes them incompatible with a formal, unchanging, universal character set.

Why does usage matter in selecting emoji?

Any emoji additions have to take into consideration usage frequency, trade-offs with other choices, font file size, and the burden on developers (and users!) to make it easier to send and receive emoji. That’s why the Emoji Subcommittee set out to reduce the number of emoji we encode in any given year. Flags are also super hard to discern at emoji sizes: it’s quite easy to send a different flag than you intended (and with each additional flag the problem gets worse). The simple truth is that if more people used flags then there would be more of an argument to encode them. The Unicode Standard subset is just not a viable solution here for implementers or users. Fortunately, there are seemingly infinite other ways to exchange images of flags that are more flexible and decentralized, such as stickers, gifs, and image attachments.

What is Unicode doing about it?

We realize closing this door may come as a disappointment. After all, flags often serve as a rallying cry to be seen, heard, recognized, and understood.

The Internet is a different place now than it was in the ’90s: the distribution of imagery online is unstoppable! Given how flags are commonly used this is a reasonable path forward: If you care to denote your affiliation with a region be it geographic, political, or identity (or all three) you can add a flag to your avatar image, share videos, or send a gif or sticker to razz your friend during a sports game (and of course there is always ⚽ ⚽ ⚽ ⚽ ⚽).


The more emoji can operate as building blocks, the more versatile, fluid, and useful they become! Rather than relying on Unicode to add new emoji for every concept under the Sun (this is simply not attainable) the citizens of the world have proven to be infinitely creative and fluid: often using existing emoji like the colored hearts (❤️ 🧡 💛 💚 💙 💜 🤎 🖤 🤍) to express themselves. Hearts are among the most frequently used type of emoji and the nine colored hearts are often juxtaposed next to each other to denote markers of emotion (“I’m sorry 💙” or “love you ❤️”) and identity or affiliation that are not represented with atomic emoji in the Unicode Standard (ex. “Pan African pride ❤️💚🖤”, “Hi I’m bi 💖💙💜”, and yes even sports teams “Go Mets! 💙🧡”).

With this in mind, the Emoji Subcommittee has put forth a strategy to add a pink heart, a light blue heart, and a gray heart to the Unicode Standard. These are colors commonly found in gender flags (gender fluid pride flag), sexuality flags (bisexual pride flag), in sports team colors (Go Spurs!) and even some regional flags (Brussels). As of this year, these three heart emoji advanced as draft candidates, and you can expect them to land on your device of choice sometime next year.

In some ways we have returned to where we first started: Adding three new emoji to support a seemingly infinite number of concepts. This time if it fails, at least we’ll be left with lots of heart emoji that have multiple uses. ❤️🧡💛💚💙💜🤎🖤🤍



In light of this change, we’d like to clarify a few additional frequently asked questions with regard to emoji flags.

Wait, if a country gains independence and is recognised by ISO, does that mean no flag emoji for them?
Flags for countries with Unicode region codes are automatically recommended, with no proposals necessary! First their codes and translated names are added to Unicode’s Common Locale Data Repository [CLDR], and then the emoji become valid in the next version of Unicode. These emoji are also automatically recommended for general interchange and wide deployment.

What about flags that change designs for geopolitical reasons?
Unicode does not specify the appearance of flag emoji. It is the responsibility of font designers to update their fonts as politics change. For example, no Unicode changes were required for https://emojipedia.org/flag-mauritania/

My region was assigned a 3166-2 code. Do we have to submit a proposal?
No, the Emoji Subcommittee is no longer taking in any proposals for flags of any kind.

As a recent example, Kurdistan (a subdivision of Iraq) became an official subdivision in ISO 3166-2 (IQ-KR) on May 3, 2021. The corresponding Unicode subdivision code (iqkr) is slated for release in CLDR v41 on Apr 6, 2022. At that point the flag for Kurdistan will officially be valid: any platform, app, or font could support it. But that doesn’t mean it automatically gets in the queue for everyone’s phone. Only countries with ISO 3166-1 region codes are automatically recommended and require no proposal to move forward.

So what warrants an ISO 3166-1 assignment vs ISO 3166-2?
ISO 3166-1 is for countries recognized by the United Nations and ISO 3166-2 is for parts of countries.

Why is Antarctica part of ISO 3166-1 but Africa isn’t? There seems to be no rational explanation with regard to why islands with no inhabitants have a flag while regions with millions of people have no emoji flag.
It’s true, there are "Exceptional reservations." Antarctica has an ISO 3166-1 alpha-2 code: AQ. But WHY does it have an ISO 3166-1 code? Because ISO 3166 decided to (ages ago) include it, probably since the whole continent is "shared."

For historical reasons, you may see other exceptions like 🇦🇨 AC Ascension Island, 🇨🇵 CP Clipperton Island, or 🇩🇬 DG Diego Garcia.

Why don’t we have asexual, bisexual, pansexual, and non-binary pride flags? And if Wales 🏴󠁧󠁢󠁷󠁬󠁳󠁿 and Scotland 🏴󠁧󠁢󠁳󠁣󠁴󠁿 get Unicode flags, surely there’s room for the Aboriginal and Torres Strait Islander flags?
Before diving into the facts of why these flags are not part of the universal character set, we want to first take a moment to consider what people mean when they ask these questions and what Unicode means when they decline these flag proposals. Because this question is not one we take lightly. In the course of world history, groups have used flags as a rallying cry to be seen, heard, recognized, and understood. In the Unicode Consortium’s mission to digitize the world’s languages, improve communication online, and achieve meaningful interoperability between platforms, the requests for flags have become a lightning rod for these rallying cries.

When people ask for a new flag emoji, we recognize that the underlying request is about more than simply a new emoji. And when we say, “We aren’t adding more flags,” we are only saying changing the Unicode Standard is not an effective mechanism for this recognition.

What if I submit a proposal for a flag despite this policy?
Your proposal will not be processed.

Relevant docs/Further Reading
https://www.unicode.org/L2/L2021/21128-esc-recs.pdf
https://www.unicode.org/L2/L2021/21167.htm
https://www.unicode.org/L2/L2021/21172-esc-recs.pdf
https://www.unicode.org/emoji/proposals.html#Flags
http://www.unicode.org/L2/L2019/19084-trans-flag.pdf

Thursday, December 2, 2021

The Most Frequently Used Emoji of 2021

The Unicode Emoji Mirror Project

92% of the world’s online population use emoji, but which emoji are we using? The Unicode Consortium, the not-for-profit organization responsible for digitizing the world’s languages, gathers information about how frequently emoji are used. Looking at patterns of usage helps to determine what new emoji should be added to the Unicode Standard. As part of this effort, we are making that data available to the public.

The new Unicode Emoji Frequency page lists the Unicode v12.0 emoji ranked in order of how frequently they were used in 2021 and what has changed since 2019. Check it out for more analysis, insights and patterns that illustrate our collective experience during a global pandemic.

#UnicodeEmojiMirror



Wednesday, November 17, 2021

Unicode Emoji 15.0 Provisional Candidates

The Unicode Technical Committee has approved the list of provisional candidates for Emoji 15.0. They are slated for release in September 2022 together with Unicode 15.0. These candidates were identified by the Unicode Emoji Subcommittee after reviewing proposals ranked according to previously-determined selection factors.

The list of provisional emoji candidates can be found here. Note that they have not yet been assigned code points or properties. For comments on these candidates, please reference PRI #435 in your feedback.

How to Provide Feedback: For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the feedback and discussion instructions.

Feedback is reviewed by the relevant committee according to their meeting schedule.



Monday, July 12, 2021

Adopt a Character to Celebrate World Emoji Day

This week, the Unicode Consortium is excited to celebrate the calendar emoji, 📅, commonly displayed with July 17th. People are the power driving the popularity of emoji through their innovative use of them to share joy, activities, sports, individuality, and so much more.

Celebrate a favorite emoji or character this week by adopting a character! While many characters have been adopted since the program launched in December 2015, hundreds of emoji haven’t been adopted by anyone at any level, including fantastic ones like clapping hands (👏twelfth👏most👏used👏emoji👏), check box (for all your to-do list dreams), and the loudly crying emoji (I’m so proud of you! 😭). Imagine the possible messages you could send with a gift adoption! For example:
  • Congratulations!!!!! 🥂
  • Love You! 🖤
  • Kisses 💋
  • Did you see this👇🏽
  • Yes, I adopted this face in your name 🥴
  • My bad 😳
  • Happy Birthday! To 100 more! 🎂
When you celebrate World Emoji Day this week by sponsoring your favorite emoji or another character for yourself or as a gift, your donation helps the non-profit Unicode Consortium support the world’s many languages and make the digital world more inclusive. The Consortium is funded by membership fees and donations from individuals, corporations, and other organizations. Your donations help support the vital work of the Consortium, making modern software and computing systems support the widest range of human languages. The Consortium will use your donation to improve language support and to preserve digital heritage. For more details, see How Donations are Used.

Thursday, April 15, 2021

Now Accepting Unicode Emoji Proposals 🎉

When you last heard from the Unicode Emoji Subcommittee in April of 2020, the Unicode Consortium had just announced a 6-month delay to Unicode Version 14.0 due to COVID-19. Despite all of this :waves at the world: we’ve been busy.

What’s new? Great question!

During this pause in proposal submissions, the Unicode Emoji Subcommittee consulted with experts, developing a process that more completely reflects our criteria for inclusion in an effort to prioritize globally relevant emoji. We’ve looked for new ways to reconcile the rapid, transient nature of modern communication with the formal, methodical process required by a standards body like the Unicode Consortium.

Moving forward, the proposal review season will be open each year from April 15-August 31. To submit a proposal, first read these Guidelines and fill out this form.

Thanks to all our Unicode Emoji Subcommittee volunteers who made these improvements possible. The world would be without emoji if it weren’t for you!

Looking forward to 2021!
The Unicode Emoji Subcommittee


Over 140,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages


Wednesday, March 3, 2021

Emoji: There's more than meets the 👁️

A lot more goes into selecting and designing an emoji than you might expect. For some in-depth glimpses into the factors designers weigh when expanding the set of emoji characters, check out these videos on our Unicode Consortium YouTube channel:

When a Merperson is a Merman: Using Gender-Inclusive Design for Codepoints Which Don't Specify Gender

Race is Not a Skin Tone. Gender is Not a Haircut.

Hanmoji: Analyzing Chinese Radicals to Determine Semantic Gaps in Emoji



Friday, October 9, 2020

Unicode CLDR Locale Data v38 beta available for testing

The beta version of Unicode CLDR version 38 is now available. The data will not be changed except for showstoppers, but the LDML v38 spec can still be changed. The final release of v38 is planned for October 28, 2020. If you find any problems, please file a ticket.

Unicode CLDR provides an update to the key building blocks for software supporting the world's languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.

CLDR v38 includes:
  • Enhancements to existing locale data: adding support for units of measurement in inflected languages (phase 1), adding annotations (names and search keywords) for Unicode symbols that are non-emoji (~400), and annotations for Emoji v13.1.
  • Survey Tool upgrades: substantial performance improvements, plus structured forum entries to improve coordination among translators.
LDML v38 includes:
  • To make the canonicalization of locale identifiers clear and unambiguous, provided major restructuring of the specification for it. (This was done in concert with fixes to the alias data to work better with the specification.)
  • To support inflected units of measurement:
    • minimalPairs adds new elements
      caseMinimalPairs and genderMinimalPairs
    • unit adds a new element gender
    • grammaticalData adds new elements
      grammaticalDerivations, deriveCompound, and deriveComponent
    • unitPattern adds a new attribute case
    • grammaticalCase, grammaticalGender, grammaticalDefiniteness add a new attribute scope
    • compoundUnitPattern1 adds new attributes case and gender
    • compoundUnitPattern adds a new attribute case
  • To allow for overriding dictionary-based segmentation breaks, added the Unicode Dictionary Break Exclusion Identifier, with the new key “dx”.
  • For picking the correct units of measurement for locales, defined the userPreferences skeleton more precisely.
  • For accurate plural categories in compact numbers, added the 'c' operand to plural rules to provide formatting for languages such as French.
See additional details in the draft CLDR v38 Release note.

The overall changes to the data items were:

Added: 155,131; Deleted: 33,805; Changed: 45,895; Total: 2,175,821



Friday, September 18, 2020

Emoji 13.1: Now final, to be widely available in 2021

Emoji 13.1 is now final with 217 new emoji sequences! Of these, 210 are skin tone variants; the other seven new emoji are:

Most of the skin tone variants are for the multi-person emoji groupings couples with heart and couples kissing.
This minor release was created to add new emoji before 2022. The Unicode Consortium is a volunteer organization and we would be completely without new emoji in 2021 if it weren’t for the dedication of many volunteers who make this possible. Thank you! ✨

The new emoji are listed in Emoji Recently Added v13.1. The images provided on that page are just samples: vendors for mobile phones, PCs, and web platforms create their own images.

New emoji in this release should begin appearing on devices in the coming months. These new emoji will also be available for adoption. Donations for adoptions help the Unicode Consortium’s work on digitally disadvantaged languages.

For implementers:
  1. There are no new atomic characters. Instead, each emoji is a sequence of existing characters.
  2. UTS #51 and associated data files have been updated for Emoji 13.1.
  3. CLDR v38 alpha has also been updated for Emoji 13.1. This includes names, search keywords, and sort orderings for the new emoji, available for over 80 languages. It is scheduled for release at the end of October.
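To make the first point concrete, here is a small sketch of how such a sequence is put together from existing characters joined with a ZERO WIDTH JOINER (U+200D). The heart-on-fire sequence from this release is used as the example; the helper names `zwj_sequence` and `code_points` are made up for illustration and are not part of any Unicode data file or library.

```python
ZWJ = "\u200D"    # ZERO WIDTH JOINER glues existing emoji into one sequence
VS16 = "\uFE0F"   # VARIATION SELECTOR-16 requests emoji presentation


def zwj_sequence(*parts: str) -> str:
    """Join already-encoded emoji into a single ZWJ sequence."""
    return ZWJ.join(parts)


def code_points(text: str) -> str:
    """List the code points of a string for inspection."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Heart on fire: red heart (with emoji presentation) + ZWJ + fire.
heart_on_fire = zwj_sequence("\u2764" + VS16, "\U0001F525")

if __name__ == "__main__":
    print(heart_on_fire, code_points(heart_on_fire))
```

A font that knows the sequence draws one glyph; a font that does not falls back to showing the heart and the fire side by side, so older devices still display something meaningful.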


Tuesday, September 1, 2020

Emoji 15.0 Submissions Re-Open April 15, 2021

The Unicode Consortium is postponing the submissions of new emoji for Unicode version 15.0 until April 15, 2021. This delay follows on the postponement of the release of the upcoming Unicode 14.0 version from March to September 2021.

This delay impacts related specifications and data, such as new emoji characters. As a consequence, the deadline for submission of new emoji character proposals for Emoji 14.0 was extended until September 1, 2020.

Pausing Processing of New Emoji Proposals ⏸️

The Emoji Subcommittee is in the process of revising the submission form. Until the new submission form is ready on April 15, 2021, proposals will be returned to sender. During this period the committee will also be prioritizing Emoji 15.0 initiatives as described in document L2/20-197.

Submissions for Emoji 15.0 Open April 2021 ▶️

The Emoji Subcommittee will be accepting new emoji character proposals for Emoji 15.0 from April 15, 2021 onward. Any new emoji characters incorporated into Emoji 15.0 can be expected to appear on devices such as computers, phones, and tablets in 2023.

Edited 2021-03-31 to reflect modification of the opening date from April 2 to April 15.



Thursday, June 18, 2020

Unicode Regular Expressions v21 Released

Regular expressions are a powerful tool for using patterns to search and modify text, and are vital in many programs, programming languages, databases, and spreadsheets.

Starting in 1999, UTS #18: Unicode Regular Expressions has supplied guidelines and conformance levels for supporting Unicode in regular expressions. The new version 21 broadens the scope of properties for regular expressions (regex) to allow for properties of strings (such as for emoji sequences). For example, the following matches all emoji flags except the French flag:

/[\p{RGI_Emoji_Flag_Sequence}--\q{🇫🇷}]/
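Not every regex engine supports properties of strings or set subtraction yet. Python's built-in `re` module, for instance, has neither, so the sketch below only approximates the idea: it treats any pair of Regional Indicator Symbols as a flag (a looser set than RGI_Emoji_Flag_Sequence) and performs the subtraction as a filter in ordinary code. The function name `flags_except` is hypothetical.

```python
import re

# Any two consecutive Regional Indicator Symbols. This is an approximation:
# it also matches pairs that are not recommended (RGI) flag sequences.
FLAG_PAIR = re.compile("[\U0001F1E6-\U0001F1FF]{2}")

FRENCH_FLAG = "\U0001F1EB\U0001F1F7"  # regional indicators F + R


def flags_except(text: str, excluded: str = FRENCH_FLAG) -> list[str]:
    """Return flag-like pairs found in text, leaving out one flag.

    Pairs are consumed left to right, so adjacent flags stay aligned;
    the exclusion stands in for the set difference in the pattern above.
    """
    return [flag for flag in FLAG_PAIR.findall(text) if flag != excluded]


if __name__ == "__main__":
    sample = "\U0001F1EB\U0001F1F7\U0001F1EF\U0001F1F5 and \U0001F1FA\U0001F1F8"
    print(flags_except(sample))
```

Filtering after matching, rather than using a negative lookahead, avoids a subtle problem: a lookahead can cause the engine to skip one indicator and pair up the second half of one flag with the first half of the next.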

Among the improvements are:
  • Provides a new Annex D: Resolving Character Classes with Strings for handling negations of sets of strings.
  • Updates the full property list to include the latest UCD properties, plus Emoji properties and UTS #39 properties.
  • Removes obsolete text passages, and makes editorial changes for clarity.



Unicode Consortium Announces New Additions to Leadership Team

We are pleased to announce the following leadership additions at the Unicode Consortium. “Each of these individuals brings deep expertise in their field,” said Mark Davis, president of the Consortium. “They have already made significant improvements in their new roles.”

Unicode Emoji Subcommittee

Chair: Jennifer Daniel
Jennifer Daniel’s first contribution to Unicode was standardizing gender inclusive representations in emoji. As a designer, author and former graphics editor at the New York Times, she now explores communication and messaging through verbal, written, auditory and visual expression at a small ad company called Google. Jennifer is a co-author and illustrator of a number of graphics books including How to Be Human, Space!, and the Origins of Almost Everything. Her work has been recognized by the Walker Art Museum, Society of Illustrators and published in the New Yorker, The Washington Post, and Time Magazine to name a few. She has had the honor to serve as a judge for the Society of News Design, Online News Association, Society of Illustrators, American Illustration, Data is Beautiful and the Art Director's Club. She lives in Berkeley, California but also in cyberspace.

Vice Chair: Ned Holbrook
Ned Holbrook is a typographic engineer at Apple, specializing in text layout and fonts. He was one of the participants in the industry-wide effort to standardize variable font technology in OpenType. He previously worked on wireless networking, virtualization, digital audio, embedded graphics, and remote filesystems.

Unicode CLDR Committee

Vice Chair: Kristi Lee
Kristi Lee is vice chair of the CLDR technical committee, where she represents Microsoft. She joined Microsoft in 1997 and has worked in a number of different divisions and product development groups. Her focus has been delivering solutions to international customers in localization and internationalization. She holds a mathematics degree from the University of Washington. Currently, she is in the Corporate division at Microsoft and works with engineering groups across the company, including Windows, .NET, Office, and others, on topics relating to CLDR and i18n.

Executive Officer

General Counsel: Anne Gundelfinger
Anne is an experienced legal executive with 30 years in private practice and in-house legal roles. From 2013 to 2019 she served as vice president for global intellectual property for Swarovski, a global fashion jewelry brand based in central Europe. Before that she held various positions over a decade in the Intel legal department, including vice president for global public policy, vice president for global sales & marketing legal affairs, and director of trademarks & brands. Early in her career she was an associate at Fenwick & West and director of trademarks at Sun Microsystems. Since retiring from Swarovski, Anne has been a consultant and has served as a World Intellectual Property Organization domain name panelist under the Uniform Dispute Resolution Policy of ICANN. Anne has long been a leader in the global IP bar. She served on the Board of Directors of the International Trademark Association for nearly a decade and served as the Association’s president in 2005.

Mark Davis, the former chair of the emoji subcommittee, will continue to contribute to the emoji subcommittee and serve as president of the Unicode Consortium. “I’d also like to thank John Emmons for his many years of service as chair and vice chair of the CLDR technical committee,” said Davis. “Especially for his work in promoting support for digitally disadvantaged languages.”

