homoglyph-detector

Quality Score: 95/100

Stars (20%): 97
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Homoglyph Detector

Byte-level forensic analysis of code changes to detect Unicode homoglyph substitutions: characters that render like ASCII in editors and diff tools but have different codepoints, silently breaking string comparisons, dictionary lookups, and identifier resolution.

## Purpose

Homoglyph attacks (CVE-2021-42694, disclosed with the "Trojan Source" research alongside the bidirectional-override issue CVE-2021-42574) are among the stealthiest trojan techniques. A Cyrillic `р` (U+0440) is visually indistinguishable from a Latin `p` (U+0070) in most fonts, editors, and diff viewers. Visual review cannot catch it; detection requires inspecting the bytes or codepoints themselves. This skill pipes git diffs through `hexdump -C` and scans for multi-byte UTF-8 sequences where single-byte ASCII is expected, particularly in string literals used as dictionary keys, variable names, and identifiers.

## Capabilities

### Confusable Character Detection

Scans for these high-risk Unicode confusables:

| Latin (hex) | Cyrillic (UTF-8 hex) | Greek (UTF-8 hex) | UTF-8 length (Latin vs. confusable) |
|-------------|----------------------|-------------------|-------------------------------------|
| a (61) | а (D0 B0) | α (CE B1) | 1 vs 2 bytes |
| c (63) | с (D1 81) | none listed | 1 vs 2 bytes |
| e (65) | е (D0 B5) | ε (CE B5) | 1 vs 2 bytes |
| o (6F) | о (D0 BE) | ο (CE BF) | 1 vs 2 bytes |
| p (70) | р (D1 80) | ρ (CF 81) | 1 vs 2 bytes |
| x (78) | х (D1 85) | χ (CF 87) | 1 vs 2 bytes |
| y (79) | у (D1 83) | none listed | 1 vs 2 bytes |

### Zero-Width Character Detection

- U+200B: Zero-width space
- U+200C: Zero-width non-joiner
- U+200D: Zero-width joiner
- U+FEFF: Byte order mark (in non-BOM position)

### ...
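The confusable and zero-width checks listed above can be approximated at the codepoint level. The following is a minimal Python sketch, not the skill's actual implementation (which, per the description, works on `hexdump -C` output). The names `CONFUSABLES`, `ZERO_WIDTH`, and `scan_line` are hypothetical, and the character set is limited to the entries in the table and list above.

```python
import unicodedata

# Confusables from the table above, mapped to the Latin letter they imitate.
# (Hypothetical helper data; a real scanner would likely use a larger set.)
CONFUSABLES = {
    "\u0430": "a", "\u03b1": "a",   # Cyrillic a, Greek alpha
    "\u0441": "c",                  # Cyrillic es
    "\u0435": "e", "\u03b5": "e",   # Cyrillic ie, Greek epsilon
    "\u043e": "o", "\u03bf": "o",   # Cyrillic o, Greek omicron
    "\u0440": "p", "\u03c1": "p",   # Cyrillic er, Greek rho
    "\u0445": "x", "\u03c7": "x",   # Cyrillic ha, Greek chi
    "\u0443": "y",                  # Cyrillic u
}

# Zero-width characters from the list above.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def scan_line(text, line_no=1):
    """Return one finding per suspicious character in a single line of text."""
    findings = []
    for col, ch in enumerate(text, start=1):
        if ch in CONFUSABLES:
            kind = "confusable"
        elif ch in ZERO_WIDTH:
            # U+FEFF as the very first character of a file is a legitimate BOM.
            if ch == "\ufeff" and line_no == 1 and col == 1:
                continue
            kind = "zero-width"
        else:
            continue
        findings.append({
            "line": line_no,
            "col": col,
            "kind": kind,
            "codepoint": f"U+{ord(ch):04X}",
            "name": unicodedata.name(ch, "UNKNOWN"),
            # Same byte pairs that hexdump -C would show for this character.
            "bytes": ch.encode("utf-8").hex(" ").upper(),
            "looks_like": CONFUSABLES.get(ch),
        })
    return findings


if __name__ == "__main__":
    # The key below starts with Cyrillic U+0440, not Latin "p".
    for hit in scan_line('config["\u0440ort"] = 8080'):
        print(hit)
```

Flagging every listed character is deliberately blunt: legitimate Cyrillic or Greek text in comments or strings would also be reported, so a practical scanner would probably restrict the check to identifiers and dictionary keys, as the description above suggests.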

Details

Author: a5c-ai
Repository: a5c-ai/babysitter
Created: 4 months ago
Last Updated: today
Language: JavaScript
License: MIT
