Emoji are encoded in Unicode based primarily on appearance, not on an intended semantic. In fact, many characters acquire multiple meanings based on their appearance: consider the peach (🍑) and the eggplant (🍆). But with embeddings from an "emoji2vec" repository, we can visualize emoji meaning in 3D space.
Researchers have found that about 25% of the time, people disagree about whether the emotion being conveyed by an emoji is positive, negative, or neutral, even when they are viewing the same emoji on the same type of device. Yet, examining Jose Berengueres' emoji sentiment dataset below, we do see a general correlation between sentiment of readers and writers of emoji: the symbols enable empathy, with negative emoji in the bottom left, and more positive, celebratory emoji in the top right.
Examining the difference between reader and writer sentiment below, however, some emoji do seem to mean more to one party than the other. Perfect communication would call for low difference values: we see that readers and writers align most on sentiment with the OK hand (👌).
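To make the "difference value" concrete, here is a minimal sketch of how such a gap could be computed and ranked. The scores, the -1 to 1 scale, and the field names `writer` and `reader` are all hypothetical placeholders for illustration; they are not values or column names from the dataset.

```python
# Hypothetical sentiment scores on a -1..1 scale (NOT taken from the dataset).
scores = {
    "ok hand": {"writer": 0.40, "reader": 0.38},
    "party popper": {"writer": 0.90, "reader": 0.70},
    "crying face": {"writer": -0.60, "reader": -0.85},
}


def sentiment_gaps(scores):
    """Return (emoji, |writer - reader|) pairs, smallest gap first."""
    gaps = [
        (name, abs(s["writer"] - s["reader"]))
        for name, s in scores.items()
    ]
    return sorted(gaps, key=lambda pair: pair[1])


ranked = sentiment_gaps(scores)
for name, gap in ranked:
    print(f"{name}: {gap:.2f}")
```

The absolute value treats over- and under-reading alike, so an emoji lands near the top of the ranking only when both parties score it almost identically.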
Originally meaning pictograph, the word emoji comes from Japanese: e (絵, "picture") + moji (文字, "character"). Today, emoji are a complex network of pictorial symbols that can be used in text: either as a standalone character, with a modifier (for instance, skin tone), or in a sequence, with a "zero width joiner" (or ZWJ, pronounced "zwidge"). Read on for a history of how these characters have evolved over time.
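The three forms described above (standalone, modified, and joined) can be sketched as plain string operations in Python. The helper names `with_skin_tone`, `zwj_sequence`, and `describe` are made up for this illustration, and whether a joined sequence renders as a single glyph depends on the fonts of the platform displaying it.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def with_skin_tone(base: str, modifier: str) -> str:
    """Append a skin-tone modifier code point directly after a base emoji."""
    return base + modifier


def zwj_sequence(*parts: str) -> str:
    """Join emoji into one sequence with a ZWJ between each pair."""
    return ZWJ.join(parts)


def describe(text: str) -> list:
    """List the code point and Unicode name of every character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# Standalone waving hand (U+1F44B) plus a medium skin-tone modifier (U+1F3FD).
waving = with_skin_tone("\U0001F44B", "\U0001F3FD")

# Man (U+1F468), woman (U+1F469), and girl (U+1F467) joined into a family.
family = zwj_sequence("\U0001F468", "\U0001F469", "\U0001F467")

for line in describe(family):
    print(line)
```

Although a supporting platform may draw the family as one picture, the string still holds five code points: three emoji and the two joiners between them.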
Encoded emoji are selected based on the following factors: (1) high expected usage, (2) distinctiveness, and (3) broad scope, or "useful ambiguity." Note that the overall process, from submitting an emoji to seeing it on your device, takes at least a year, and probably longer.
Proposals are first submitted to the emoji subcommittee, which then forwards reasonable proposals to the Unicode Technical Committee for voting. But even after being approved by vote, proposals must be considered "highest priority" in order to be developed (first by Unicode, then by platforms like Apple and Google) and appear on computers and mobile phones. If you're still interested, get started by checking out the submission form template and FAQ.
As the number of encoded emoji continues to grow, you may already struggle to find the right symbol quickly. The Unicode Consortium currently anticipates a 60-character "emoji budget" per year, until alternative solutions come into play. Such longer-term solutions include supporting embedded graphics, a.k.a. "stickers," like Bitmoji. Since such stickers aren't dependent on encoding, they would ultimately allow both faster implementation and more freedom of expression.