Homoglyph Detector
Detect Cyrillic, Greek, and fullwidth look-alike chars in text. Highlight homoglyphs by script, replace with ASCII, or compare strings at codepoint level.
Highlighted text
Detected homoglyphs
| Char | Codepoint | Unicode name | Script | Looks like | Count |
|---|---|---|---|---|---|
Cleaned text (ASCII only)
| Pos | String A char | A codepoint | String B char | B codepoint |
|---|---|---|---|---|
How to use
- Paste or type the text you want to inspect into the input area. For phishing detection, paste a suspicious URL, email address, or token name.
- Click Analyze. Every look-alike character is highlighted in color — red for Cyrillic, orange for Greek, blue for fullwidth, purple for mathematical variants.
- Review the Detected homoglyphs table to see each codepoint's Unicode name and the ASCII character it resembles. Click Clean to replace all homoglyphs with their ASCII equivalents and copy the cleaned text.
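The Analyze and Clean steps above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code (which runs as JavaScript in the browser): the `CONFUSABLES` map, `analyze`, and `clean` are hypothetical names, and the map covers just a handful of characters.

```python
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny map: homoglyph -> ASCII look-alike.
# A real detector would need a much larger table.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
    "\u03bd": "v",  # GREEK SMALL LETTER NU
}

def analyze(text):
    """Return one report row per distinct homoglyph found in text."""
    counts = Counter(ch for ch in text if ch in CONFUSABLES)
    return [
        {
            "char": ch,
            "codepoint": f"U+{ord(ch):04X}",
            "name": unicodedata.name(ch),
            "looks_like": CONFUSABLES[ch],
            "count": n,
        }
        for ch, n in counts.items()
    ]

def clean(text):
    """Replace every mapped homoglyph with its ASCII look-alike."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in text)

sample = "g\u03bf\u03bfgle.com"  # Greek omicrons in place of Latin "o"
rows = analyze(sample)
cleaned = clean(sample)
print(rows)
print(cleaned)
```

Characters missing from the map pass through `clean` unchanged, so the output is only guaranteed to be ASCII when the map covers every non-ASCII character in the input.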
What are homoglyphs?
Homoglyphs (also called confusables or look-alike characters) are characters from different Unicode scripts that are visually indistinguishable or nearly identical in most fonts. The Latin lowercase "a" (U+0061) and the Cyrillic small letter "а" (U+0430) look the same in virtually every web font, but they are different codepoints. Substituting one for the other is known as a homoglyph attack or, for domain names, an IDN homograph attack. It is widely used to register deceptive domain names, create look-alike wallet addresses, and spoof email senders.
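A short Python check makes the difference between the two letters concrete. The helper name `describe` is an assumption for illustration; `ord` and `unicodedata.name` are standard library calls.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, renders like Latin "a" in most fonts

def describe(ch):
    """Format a character as 'U+XXXX UNICODE NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

same = latin_a == cyrillic_a  # False: the glyphs match, the codepoints do not
print(describe(latin_a))
print(describe(cyrillic_a))
print(same)
```

String equality, hashing, and DNS lookups all operate on codepoints rather than glyphs, which is why the two letters are treated as unrelated even though a reader cannot tell them apart.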
Cyrillic homoglyphs
Cyrillic is the most common source of homoglyphs because it shares many letter shapes with the Latin alphabet. Phishing campaigns routinely use Cyrillic А (U+0410), О (U+041E), Р (U+0420), С (U+0421), Т (U+0422), and Х (U+0425), along with their lowercase forms, in place of their Latin counterparts. A look-alike domain such as "раура1.com", written with Cyrillic р, а, and у plus the digit 1, can pass for "paypal.com" at a glance and is often easy to register.
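One common way to flag such substitutions is to check which scripts appear in a label. Python's standard library does not expose the Unicode Script property, so the sketch below guesses the script from the first word of the character name. That is a rough heuristic, and `script_of`, `scripts_in`, and `is_mixed_script` are hypothetical helper names.

```python
import unicodedata

def script_of(ch):
    """Rough script guess: first word of the Unicode name, letters only."""
    if not ch.isalpha():
        return None  # digits and punctuation are shared across scripts
    name = unicodedata.name(ch, "")
    return name.split(" ", 1)[0] if name else None

def scripts_in(label):
    """Set of guessed scripts used by the letters in label."""
    return {s for s in map(script_of, label) if s}

def is_mixed_script(label):
    """True when letters from more than one script appear in label."""
    return len(scripts_in(label)) > 1

one_swap = "p\u0430ypal"                          # Latin with one Cyrillic "а"
whole_script = "\u0440\u0430\u0443\u0440\u04301"  # all Cyrillic letters plus "1"

print(scripts_in(one_swap), is_mixed_script(one_swap))
print(scripts_in(whole_script), is_mixed_script(whole_script))
```

The second label uses only Cyrillic letters, so a mixed-script check alone does not flag it. Catching that kind of whole-script look-alike needs a confusables table that maps each character to the ASCII letter it resembles.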
Fullwidth and mathematical variants
Unicode includes fullwidth forms of the ASCII characters (U+FF01–U+FF5E), designed for CJK typography, and a Mathematical Alphanumeric Symbols block (U+1D400–U+1D7FF), designed for mathematical notation. Both contain characters that closely resemble standard ASCII letters and digits but sit at different codepoints, which makes them a vector for name spoofing in usernames, package registries, and code identifiers.
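Unlike Cyrillic or Greek letters, these variants carry Unicode compatibility decompositions, so NFKC normalization folds them back to ASCII. The sketch below uses the standard `unicodedata.normalize` call; the sample strings are made up for illustration, and this is not necessarily how the tool itself handles them.

```python
import unicodedata

# "admin" written with fullwidth letters (U+FF41 and friends)
fullwidth = "\uff41\uff44\uff4d\uff49\uff4e"
# "admin" written with mathematical bold letters (U+1D41A and friends)
math_bold = "\U0001d41a\U0001d41d\U0001d426\U0001d422\U0001d427"

folded_fullwidth = unicodedata.normalize("NFKC", fullwidth)
folded_math = unicodedata.normalize("NFKC", math_bold)

# Cyrillic "а" has no compatibility mapping, so NFKC leaves it alone.
cyrillic_a = unicodedata.normalize("NFKC", "\u0430")

print(folded_fullwidth, folded_math, cyrillic_a == "\u0430")
```

Because NFKC leaves cross-script look-alikes untouched, normalization complements a confusables lookup but does not replace it.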
Frequently asked questions
- What is a homoglyph attack?
- A homoglyph (or confusable) attack substitutes ASCII characters with visually identical characters from other Unicode scripts — most commonly Cyrillic (е→e, о→o, р→p) or Greek (ο→o, ν→v). The resulting text looks identical in most fonts but contains different codepoints, making "google.com" and "gοοgle.com" (with Greek omicrons) entirely different domain names. This technique is widely used in phishing emails, crypto wallet scams, and malicious npm package names.
- Does my text leave my device?
- No. All analysis runs entirely in your browser using JavaScript. Your text — including private crypto wallet addresses or confidential documents — is never sent to any server.
- How do I use the Compare mode?
- Switch to Compare mode by clicking the "Compare two strings" tab. Paste one string in the left box and another in the right box, then click Compare. The tool walks through each Unicode codepoint side by side, highlighting any positions where the codepoints differ — even when the characters look identical to the eye. This is ideal for verifying that a crypto wallet address you copied matches the one you intend to send funds to.
- Which scripts does this tool detect?
- The tool detects Cyrillic (red), Greek (orange), fullwidth Latin/digit characters (blue), and mathematical bold/italic variants (purple). Each detected character maps to its ASCII lookalike and displays the Unicode codepoint name so you know exactly what is present.
- What does the Clean button do?
- Clean replaces every detected homoglyph with its ASCII equivalent — for example, Cyrillic о (U+043E) becomes Latin o (U+006F). The cleaned text is safe to copy into forms, code, or communications.
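The codepoint-level comparison described for Compare mode can be sketched as follows. The function names `cp` and `compare` are hypothetical, positions are zero-based here, and the tool's own implementation may differ.

```python
from itertools import zip_longest

def cp(ch):
    """Format a character as U+XXXX, or None past the end of a string."""
    return f"U+{ord(ch):04X}" if ch is not None else None

def compare(a, b):
    """Return (pos, a_char, a_codepoint, b_char, b_codepoint) for each mismatch."""
    diffs = []
    # zip_longest pads the shorter string with None so length differences show up too
    for pos, (ca, cb) in enumerate(zip_longest(a, b)):
        if ca != cb:
            diffs.append((pos, ca, cp(ca), cb, cp(cb)))
    return diffs

genuine = "google.com"
spoofed = "g\u03bf\u03bfgle.com"  # Greek omicrons at positions 1 and 2

for row in compare(genuine, spoofed):
    print(row)
```

An empty result means the two strings are identical codepoint for codepoint, which is the check you want before trusting a copied wallet address.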
Last updated: By jarvisbox