As of this week, Unicode has grown by 7,500 characters. The international character-encoding standard added support for at least six more languages, 72 new emojis, and 19 new symbols for the 4K television standard. Unicode 9.0 also comes with new and updated Unicode Technical Standards, including measures to reduce Unicode spoofing and to process non-ASCII URLs.
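To make the spoofing and non-ASCII URL concerns concrete, here is a minimal Python sketch. It is not the procedure defined by the Unicode Technical Standards: the helper names `scripts_used` and `to_ascii_hostname` are hypothetical, the script check is a crude heuristic based on character names, and Python's built-in `idna` codec follows the older IDNA 2003 rules rather than the updated standard.

```python
import unicodedata


def scripts_used(label):
    """Roughly guess which scripts appear in a label.

    Heuristic only: takes the first word of each letter's Unicode
    character name (e.g. 'LATIN', 'CYRILLIC') as its script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def to_ascii_hostname(hostname):
    """Convert a non-ASCII hostname to its ASCII (Punycode) form.

    Uses Python's built-in 'idna' codec, which implements IDNA 2003.
    """
    return hostname.encode("idna").decode("ascii")


genuine = "paypal"
spoofed = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A

print(scripts_used(genuine))  # one script
print(scripts_used(spoofed))  # two scripts: a hint that something is off
print(to_ascii_hostname("b\u00fccher.example"))
```

The two strings look identical in many fonts, yet the second mixes Latin and Cyrillic letters, which is the kind of lookalike trick anti-spoofing rules are meant to catch. The last line shows how a hostname containing "ü" is turned into an ASCII-only form that existing URL infrastructure can handle.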
Because Unicode has been a standard since 1992, it has become difficult to find languages that aren't yet covered by its 128,172 characters. This time, the additional languages were mostly regional, such as Osage, a Native American language, or Fulani, an African language. Other languages supported in Unicode 9.0 include the Bravanese dialect of Swahili, Nepal Bhasa, Tangut (a historical language of China), and the Warsh orthography for Arabic, which is popular in northern and western Africa.
Emojis also get a good going over in Unicode 9.0. New emojis include a half avocado, clinking glasses, a motor scooter, and a butterfly. The entire list of new emojis is published by the Unicode Consortium.
Unicode is by no means done adding characters, however. There is an effort within the standard to support Mayan hieroglyphs, sponsored by Adobe, IBM, the San Jose Earthquakes soccer team, and TCP/IP co-creator Vinton Cerf, to name a few.
While Unicode 9.0 itself has now been released, the text of the core specification is not yet final. A good deal of editing is required before the specification can be fully released to the public. The final release will likely arrive in August.