How to Build a Semantic Search Engine for Emojis

Author: Murphy  |  Time: 2025-03-22 23:24:36
Semantic search over emojis for "halloween" using a custom emoji search engine.

If you've ever used Google Docs or Slack, you may have noticed that when you type a ":" immediately followed by another character, a list of emojis pops up:

Since I discovered this, I've made heavy use of the feature. I add emojis to far more of my messages, blog posts, and other written works than I ever imagined I would. I got so accustomed to this way of adding emojis that I installed Rocket – a free app that brings the same emoji searchability to all text boxes and text editors on the computer. It's a game changer.

But as I've used these emoji search engines more and more, I've noticed a frustrating limitation: every search is matched against the exact text in the name and description of each emoji. Essentially, your query has to match that text almost exactly for any results to show up.

Here's an example: if we search for "audio", not a single result shows up:

This isn't because the set of emojis is lacking in the audio category. If we were to type in "music" or "speaker", we would get a long list of results. Instead, it has to do with the fact that the specific string of text "audio" does not show up in the name or textual description associated with any of the emojis.
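To make the limitation concrete, here is a minimal sketch of exact-text matching. It is not how Slack, Google Docs, or Rocket actually implement search; the `EMOJIS` table, its descriptions, and the `keyword_search` helper are all made up for illustration.

```python
# Toy emoji table: these names and descriptions are illustrative assumptions,
# not the metadata used by any real emoji picker.
EMOJIS = {
    "musical note": "a single eighth note, used for music and songs",
    "speaker high volume": "a speaker with sound waves, used for loud sound",
    "headphone": "a pair of headphones for listening to music",
    "pumpkin": "an orange pumpkin",
}


def keyword_search(query, emojis=EMOJIS):
    """Return emoji names whose name or description contains the query text."""
    q = query.lower()
    return [
        name
        for name, description in emojis.items()
        if q in name.lower() or q in description.lower()
    ]


print(keyword_search("music"))    # ['musical note', 'headphone']
print(keyword_search("speaker"))  # ['speaker high volume']
print(keyword_search("audio"))    # [] since "audio" never appears verbatim
```

Every emoji in the toy table except the pumpkin is plainly audio-related, yet the query "audio" matches none of them, because substring matching has no notion of meaning.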

This relatively minor inconvenience bothered me so much that I decided to build this:

By "this", I mean an open-source semantic emoji search engine, with both UI-centric and CLI versions. The Python CLI library can be found here, and the UI-centric version can be found [here](https://github.com/jacobmarks/emoji-search-plugin). You can also play around with a hosted (also free) version of the UI emoji search engine online [here](https://try.fiftyone.ai/datasets/emojis/samples).

Command line version of the Semantic Emoji Search Engine

Building this was not as simple or straightforward as I initially hoped. It took a lot of experimentation, and a lot of ideas I thought were quite clever fell essentially flat. But in the end, I was able to create an emoji search engine that works fairly well.

Here's how I built it, what worked, what didn't, and the lessons I learned along the way.

What Is an Emoji?

Before building a semantic search engine for emojis, it's worth briefly explaining what exactly an emoji is. The term emoji derives from the Japanese 絵 (e), meaning picture, and 文字 (moji), meaning letter or character. Essentially, this means that an emoji is etymologically a pictogram. Its resemblance to the English word emotion is coincidental, and it is not an "emotion icon" – that is an emoticon.

Along with alphanumeric characters, letters for African click consonants, mathematical and geometric symbols, dingbats, and computer control sequences, emojis can be represented as Unicode characters, making them computer-readable. Like every other Unicode character, emojis are standardized by the Unicode Consortium. For emojis, the consortium solicits proposals and regularly selects which new ones will be added to the standard.
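As a quick illustration of what "computer-readable" means here, the sketch below uses Python's standard `unicodedata` module to look up the code point and official Unicode name of each character in a string. The `describe` helper is a hypothetical name introduced just for this example.

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Written with escapes so the example does not depend on file encoding.
grinning = "\U0001F600"        # the grinning face emoji
flag = "\U0001F1EF\U0001F1F5"  # the flag of Japan

print(describe(grinning))  # [('U+1F600', 'GRINNING FACE')]
print(len(flag))           # 2
print(describe(flag))
```

The grinning face is a single code point, while the flag renders as one emoji but is stored as two regional indicator characters. That is why `len` reports 2 for it.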

At the time of writing, in November 2023, there are more than 3,600 recognized emojis, symbolizing a wide range of ideas and sentiments. Some emojis are represented by a single Unicode character, or code point. For example, the "grinning face" emoji,
