Check for BiDi Override Attack (Trojan Source)
Bidirectional Unicode control characters can reverse the visual order of text in code, making malicious logic appear as harmless comments — the CVE-2021-42574 "Trojan Source" vulnerability.
The Unicode Bidirectional Algorithm (UBA) controls how text with mixed writing directions (English, Arabic, Hebrew) is displayed. Its control characters are legitimate in internationalised text, but in source code they have been weaponised. A Right-to-Left Override (U+202E) or Right-to-Left Isolate (U+2067) placed inside a comment or string literal causes the characters that follow to render right-to-left. The compiler ignores display order and reads the characters in the order they are stored, so a code reviewer can see a harmless comment such as // Check if admin while the compiler processes an executable statement such as if isAdmin.
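To make the gap between stored order and displayed order concrete, here is a minimal Python sketch. The sample line and the helper name describe_hidden are hypothetical, chosen only for illustration. It builds a string containing an override and lists every Unicode format character (general category Cf) with its index, codepoint, and name.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Hypothetical source line: the comment text is stored reversed and wrapped
# in an override, so a BiDi-aware editor may display it in reading order.
line = "x = 1  # " + RLO + "nimda fi kcehC" + PDF


def describe_hidden(text):
    """Return (index, codepoint, name) for every format (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


for entry in describe_hidden(line):
    print(entry)
```

Note that category Cf is broader than the BiDi controls (it also covers zero-width joiners, for example), so this is a way to reveal hidden characters rather than a precise BiDi filter.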
The attack was publicly disclosed in November 2021 as CVE-2021-42574 "Trojan Source". The researchers demonstrated it in languages including C, C++, C#, JavaScript, Java, Rust, Go, and Python, whose compilers and interpreters accepted BiDi characters in comments and string literals without warning. GitHub, GitLab, and Bitbucket now warn on pull request diffs that contain these characters, but older repos and copy-pasted snippets remain at risk.
Use the Invisible Unicode Character Detector to scan any code snippet or text for BiDi control characters. The tool detects all 12 BiDi formatting characters (U+061C, U+200E, U+200F, U+202A–U+202E, U+2066–U+2069) and flags them as High risk with their exact codepoints and occurrence counts.
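For readers who want the same kind of check in a script, the following is a rough Python sketch of counting BiDi controls per codepoint. It is an independent illustration, not the tool's implementation. The names BIDI_CONTROLS and count_bidi_controls are made up for this example, and the set includes U+061C (Arabic Letter Mark), which this page lists in the steps below.

```python
from collections import Counter

# Codepoints treated as BiDi controls in this sketch (an assumption based on
# the ranges named on this page): ALM, LRM, RLM, embeddings/overrides, isolates.
BIDI_CONTROLS = frozenset(
    [0x061C, 0x200E, 0x200F]
    + list(range(0x202A, 0x202F))  # U+202A..U+202E
    + list(range(0x2066, 0x206A))  # U+2066..U+2069
)


def count_bidi_controls(text):
    """Map each BiDi control found in text to its occurrence count."""
    counts = Counter(ch for ch in text if ord(ch) in BIDI_CONTROLS)
    return {f"U+{ord(ch):04X}": n for ch, n in sorted(counts.items())}
```

An empty result means none of the listed codepoints are present. It does not rule out other invisible characters such as zero-width spaces.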
How to check for BiDi override characters
- Open the Invisible Unicode Character Detector. Paste the source code or text snippet you want to audit into the input area.
- Click Analyze. Any BiDi control characters (U+202A–U+202E, U+2066–U+2069, U+200E, U+200F, U+061C) are listed in the results table with a High risk badge and their exact codepoints and counts.
- If BiDi characters are found, do not use the code snippet. The cleaned output removes them, but you should also inspect the original source for logic that may have been intentionally obfuscated.
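The scan-then-clean steps above can also be approximated offline. The Python sketch below is one possible approach under stated assumptions: locate_bidi and strip_bidi are hypothetical helper names, and the character class mirrors the codepoints listed in the steps rather than any documented behaviour of the tool. It reports where each character sits so the original logic can be inspected before a cleaned copy is trusted.

```python
import re

# Character class covering the codepoints listed in the steps above.
BIDI_RE = re.compile("[\u061c\u200e\u200f\u202a-\u202e\u2066-\u2069]")


def locate_bidi(source):
    """Yield (line, column, codepoint) for each BiDi control, 1-based."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in BIDI_RE.finditer(line):
            yield lineno, match.start() + 1, f"U+{ord(match.group()):04X}"


def strip_bidi(source):
    """Return a copy of source with every BiDi control removed."""
    return BIDI_RE.sub("", source)


# Hypothetical snippet with an override hidden in a comment on line 2.
sample = "ok = True\nif x:  # \u202ehidden\u2069\n"
for hit in locate_bidi(sample):
    print(hit)
```

Stripping the characters only restores the stored order on screen. Whether the underlying logic is safe still has to be judged by reading the cleaned code.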
Related security checks
- Full invisible character scan — covers BiDi, zero-width, tag characters, and more
- Hidden character finder — general invisible character sweep
- Remove invisible characters — produce a clean copy
- Detect zero-width characters — ZWSP, ZWJ, ZWNJ focus