Humanizer for German: Why AI Text Detection Is Language-Specific
German AI writing leaves different traces than English. The Humanizer detects 65 language-specific patterns with voice calibration, false-positive guardrails, and a typography audit.
AI-generated text has telltale patterns. But those patterns aren't universal.
When you read a text that feels wrong—too smooth, perfectly structured, every paragraph the same length—you can often sense that an LLM wrote it. But what you're sensing is mostly English. German AI text sounds different. It has its own tells.
The problem: most AI detection tools and guides are designed for English. They catch English patterns beautifully. They miss German ones entirely.
I built the Humanizer (Deutsch) to fix that. It's a free, open-source Claude Code skill that detects 65 German-specific AI writing patterns, ranks them by severity, and walks you through a structured five-pass cleanup — from artifact removal to rhythm work. New in v3: optional voice calibration that matches your personal writing style from samples. The newest additions cover abstract-noun stacking, fabricated first-person anecdotes, uniform document structure, and a deterministic rhythm linter — plus a clear-eyed take on what AI detectors actually measure (more below).

Try it
github.com/marmbiz/humanizer-de — MIT licensed, free, works as a Claude Code skill.
Easiest via the Claude Code plugin marketplace:
/plugin marketplace add marmbiz/humanizer-de
/plugin install humanizer-de@humanizer-de
Or the classic clone into your skills directory:
git clone https://github.com/marmbiz/humanizer-de ~/.claude/skills/humanizer-de
Then in Claude Code: /humanizer — done.
What the Humanizer Does
You call it with /humanizer or just say "Humanize this text for me." It gives you:
- A draft rewrite with the obvious AI patterns removed
- A quick anti-AI audit flagging remaining tells
- A final version after the closing self-audit
New in v3: Voice calibration. Provide a sample of your own writing and the skill analyzes your sentence rhythm, word choices, and quirks — then applies them to the rewrite. Instead of generic "clean" output, you get text that sounds like you.
Three modes adjust the correction to your context:
| Mode | When | What happens |
|---|---|---|
| Casual | Blog posts, social media, newsletters | Adds personality and rhythm |
| Neutral | Business reports, product docs, emails | Removes AI tells, keeps tone neutral |
| Formal | Academic papers, legal texts, technical docs | Only removes tells, preserves structure |
Default is Neutral when the context is unclear.
Recently added (details in the changelog below):
- v4.0 – standalone project: The Humanizer now follows its own versioning scheme without the fork suffix — its own roadmap instead of upstream tracking. Plus two new patterns: AI marker vocabulary (the German counterparts to "delve" and "tapestry") and copula avoidance ("fungiert als" instead of "ist").
- v3.8 – six new patterns, five-pass workflow, and a rhythm linter: abstract-noun stacking, fabricated first-person anecdotes, synonym rotation, isometric documents, markerless closure compulsion, and modal-particle anomalies. The cleanup now runs in five fixed passes (artifacts → lexis → structure → rhythm → self-audit), and a new measurement script delivers deterministic rhythm metrics instead of gut feeling.
- v3.7 – two new patterns and plugin install: aphoristic empty formulas ("X is the language of Y") and decorative Markdown structure (single-row tables, skipped heading levels, thematic breaks before headings). The skill now also installs directly from the Claude Code plugin marketplace.
- v3.6 – realistic about detectors: two new patterns (colon-title scheme, uniform sentence rhythm) and a clear stance on what online AI detectors actually measure — and what you therefore should not mangle in your text just to chase a score.
- v3.5 – leaner architecture: the pattern catalog, decision tables, and a dedicated Unicode/quote linter are split out; the skill loads only what it needs.
Severity ranking (HIGH / MEDIUM / LOW) for each pattern lets you focus on what matters most. HIGH patterns are almost always AI. LOW patterns only stand out when they cluster.
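The severity rule can be pictured as a small filter over findings. The sketch below is an illustration, not the skill's actual logic: the `Finding` type and `actionable` function are hypothetical names, the threshold of three comes from the changelog's "3+ clustering rule", and treating only LOW findings as cluster-gated (with MEDIUM always reported) is an assumption made for the example.

```python
from dataclasses import dataclass

# Assumption: the "3+ clustering rule" from the changelog, applied to LOW findings only.
CLUSTER_THRESHOLD = 3


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one detected pattern occurrence."""
    pattern: str
    severity: str  # "HIGH", "MEDIUM", or "LOW"
    line: int


def actionable(findings):
    """Keep HIGH and MEDIUM findings; keep LOW findings only when they cluster."""
    low_count = sum(1 for f in findings if f.severity == "LOW")
    keep_low = low_count >= CLUSTER_THRESHOLD
    return [f for f in findings if f.severity != "LOW" or keep_low]
```

With one HIGH and one LOW finding, only the HIGH one survives; once three LOW findings appear in the same text, all of them are reported.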
Why German AI Text Is Different
English and German AI text fail in different ways. The same model that produces flawless English can betray itself immediately in German, through patterns that English-speaking readers don't notice.
Take these examples:
- Participle-I constructions like "gewährleistend" or "hervorhebend" (ensuring, highlighting). In English, "-ing" forms are natural everywhere. In German, this construction screams LLM.
- Overused transition phrases like "Darüber hinaus" (furthermore) appearing three times per paragraph. Native German writers vary their transitions. LLMs repeat the same mechanical connectors.
- Em-dashes everywhere — a punctuation habit from English that German doesn't share natively.
- Vague authorities like "Experten sagen" (experts say) with no sources attached.
- Symbolic overload like "steht als Zeugnis für" (stands as testimony to) — nobody writes like this.
- Promotional tone with "atemberaubend" (breathtaking) in contexts where it doesn't belong.
- Chatbot artifacts like "Stand Januar 2024" (as of January 2024) appearing in articles written months later.
Before (LLM):
Die atemberaubende Stadt mit ihrem reichen kulturellen Erbe steht als Zeugnis für die künstlerische Brillanz vergangener Generationen.
"The breathtaking city with its rich cultural heritage stands as testimony to the artistic brilliance of past generations."
After (human):
Die Stadt hat eine lange Geschichte. Ihre Denkmäler zeigen die Handwerkskunst des Mittelalters.
"The city has a long history. Its monuments show medieval craftsmanship."
Less decoration, more substance.
65 Patterns in 10 Categories
The Humanizer detects patterns across ten categories:
1. Language & Tone (17 patterns, mostly HIGH)
Symbolic overload, promotional language, editorial comments, mechanical conjunctions, section summaries, participle-I constructions, vague authorities, forced conclusions, negative parallelisms (now including clipped negation fragments like "kein Raten.", "keine Kompromisse."), tricolon overuse, false extensions, misplaced "Fazit" sections, abstract-noun stacking ("verschiedene Maßnahmen" instead of the concrete thing), synonym rotation for the same entity ("die Hansestadt", "die Elbmetropole"), modal-particle anomalies (close-register German with zero "ja", "eben", "wohl" — or far too many), AI marker vocabulary ("beleuchten", "eintauchen", "spannend", "die digitale Landschaft" — the German counterparts to "delve" and "tapestry"), and copula avoidance ("fungiert als", "verfügt über" instead of plain "ist"/"hat").
2. Style (4 patterns, MEDIUM/LOW)
Excessive bold text, false lists, emojis before headings, em-dash overuse (now with a replacement hierarchy: period > comma > colon > semicolon > parentheses > rephrase, plus detection of paired inserts and dash variants).
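A dash scan along these lines could look like the sketch below. It is not the skill's own implementation: `dash_report` is a hypothetical name, and the choice to count unspaced en dashes as number ranges rather than as dashes is an assumption. The replacement order is copied from the hierarchy above.

```python
import re

# Replacement order as stated in the pattern description.
HIERARCHY = ("period", "comma", "colon", "semicolon", "parentheses", "rephrase")

# Em dash anywhere; en dash, horizontal bar, or double hyphen only when
# surrounded by whitespace, so ranges like "32-34" with an en dash are skipped.
DASH = re.compile(r"\u2014|(?<=\s)(?:\u2013|\u2015|--)(?=\s)")


def dash_report(text):
    """Count dash variants and list sentences that contain a paired insert."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    total = len(DASH.findall(text))
    # Two or more dashes in one sentence usually mark a parenthetical insert.
    paired = [s for s in sentences if len(DASH.findall(s)) >= 2]
    words = max(len(text.split()), 1)
    return {
        "dashes": total,
        "paired_sentences": paired,
        "per_100_words": round(100 * total / words, 1),
    }
```

The report only locates candidates. Choosing a replacement from `HIERARCHY` remains an editorial decision for each sentence.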
3. Communication (6 patterns, all HIGH)
Letter-style writing, collaborative chatbot phrases ("I hope this helps!"), knowledge cutoff references, prompt refusals, placeholder text, links to search queries.
4. Markup (6 patterns, all MEDIUM)
Markdown instead of wikitext, broken wikitext and AI tool artifacts (oaicite tags, contentReference spans, turn0search0 references), dead links, full citation fabrication (hallucinated publications, non-existent journals, utm_source parameters), incorrect reference formats, wrong categories.
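The tool artifacts named above are plain strings, so a regex pass is enough to locate them. The following is a minimal sketch under that assumption. The `ARTIFACTS` table and `find_artifacts` are hypothetical names, and the expressions are illustrative, not the skill's actual rules.

```python
import re

# Illustrative expressions for the artifacts listed in the pattern description.
ARTIFACTS = {
    "oaicite": re.compile(r"oaicite"),
    "contentReference": re.compile(r"contentReference"),
    "search_ref": re.compile(r"turn\d+search\d+"),
    "utm_source": re.compile(r"[?&]utm_source="),
}


def find_artifacts(text):
    """Return (line_number, artifact_name, matched_text) for every hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in ARTIFACTS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, name, match.group(0)))
    return hits
```

Hallucinated publications and dead links cannot be found this way. They need a human or an external check.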
5. Miscellaneous (3 patterns, LOW/MEDIUM)
Abrupt cutoffs, style shifts mid-text, first-person edit summaries.
6. Rhetoric & Structure (11 patterns)
| Pattern | Severity | Example |
|---|---|---|
| Persuasive authority phrases | MEDIUM | "Im Kern" (at its core), "In Wirklichkeit" (in reality) |
| Signposting | MEDIUM | "Schauen wir uns an" (let's look at), "Here's what you need to know" |
| Fragmented headings | LOW | Generic one-liner immediately after a heading |
| Rhetorical questions as fake engagement | MEDIUM | "Aber was bedeutet das?" (But what does this mean?) |
| Universal human experience opener | MEDIUM | "Seit jeher" (since time immemorial), "Seit Anbeginn der Zivilisation" |
| "In today's X world" framing | MEDIUM | "In der heutigen digitalen Welt" (in today's digital world) |
| Aspirational corporate closing | MEDIUM | "bestens aufgestellt" (well-positioned), "die Möglichkeiten sind grenzenlos" |
| Diff-anchored writing | MEDIUM | "wurde jetzt ergänzt" (has now been added) when the text should describe the current state |
| Aphorism formulas | MEDIUM | "X ist die Sprache des Y" (X is the language of Y), "X wird zur Falle" — a nice-sounding empty formula replacing a concrete claim |
| Isometric document | MEDIUM | Every paragraph 3–5 sentences, every section the same length, every aspect weighted equally |
| Markerless closure compulsion | MEDIUM | Every paragraph ends on an evaluative wrap-up sentence that adds nothing ("Damit ist die Grundlage gelegt.") |
7. Argumentation & Evidence (5 patterns)
| Pattern | Severity | Example |
|---|---|---|
| Passive constructions and subjectless fragments | MEDIUM | "wurde durchgeführt" (was carried out), "Keine Konfiguration nötig." (No configuration needed.) — hides the actor |
| Conditional stacking | MEDIUM | Piled-up "wenn/falls/sofern" (if/in case/provided that) clauses instead of stating what the analysis found |
| Miscalibrated epistemic confidence | MEDIUM | Swings between over-assertion ("grundlegend verändert", "zweifellos") and over-hedging ("scheint möglicherweise", "könnte eventuell") |
| Speculative gap-filling | HIGH | "hält sich bedeckt" (keeps a low profile), "vermutlich" (likely), despite missing sources |
| Fabricated first-person experience | HIGH | "Als ich letzte Woche mit einem Kunden sprach..." (When I talked to a client last week...) — an anecdote with no real owner |
LLMs hide the actor behind passive voice and subjectless sentences. They stack conditionals where a direct statement would do. Most telling: the swing between over-assertion ("fundamentally changed", "without doubt") and over-hedging ("seems possibly", "could perhaps") within the same paragraph. When sources are missing, they often add another tell: plausible filler where the text should simply say the point is not documented.
Fabricated first-person experience is the second-order tell: it often appears precisely when someone tries to make AI text sound "more human." Staged anecdotes and forced casualness ("Ehrlich gesagt", "Keine Sorge") are fabrication, not style. That's why the Humanizer never invents experience when rewriting — voice comes only from your writing sample or facts you explicitly provide.
Patterns 32–34 were adapted from upstream PR #39. Patterns 35–38 were adapted from upstream PR #67. Patterns 39–41 are from v3.1, adapted from upstream PRs #79, #80, #84, #85, #94, #96. All have German-specific phrasing and examples.
8. Additions (4 patterns, new in v3.2)
Four patterns drawn from the German Wikipedia's Erkennung KI-Einsatz guideline and its Schnelltest KI companion:
| Pattern | Severity | Example |
|---|---|---|
| Source incongruence | HIGH | Source exists but doesn't support the claim |
| Hidden Unicode characters | HIGH | Zero-Width Space (U+200B), Soft Hyphen, BOM, bidi controls |
| Standard chapters without substance | MEDIUM | A "Future perspectives" section padded with unsourced filler; the fix is to make it concrete or integrate it elsewhere, not to shorten it |
| Anglicism structures | MEDIUM | Hard calques & false friends: "am Ende des Tages" (from "at the end of the day"), "eventuell" used to mean "eventually" (it means "possibly"), "aktuell" used to mean "actually" (it means "currently") |
Source incongruence is particularly tricky: the source exists, the DOI validates, the author did publish, but the paper doesn't actually support the claim. A classic LLM hallucination pattern that simple fact-checking tools miss. False friends like "eventuell" (which means "possibly", not "eventually") are corrected regardless of mode because they are semantic errors.
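Hidden Unicode characters are the one pattern in this table that can be checked mechanically. The sketch below assumes the code points named above (zero-width space, soft hyphen, BOM, bidi controls) plus the invisible operators U+2061 to U+2064 mentioned in the changelog. `find_hidden` is a hypothetical name, not the skill's linter.

```python
import unicodedata

# Code points taken from the pattern description; the exact set is an assumption.
HIDDEN = (
    {0x200B, 0x00AD, 0xFEFF}          # zero-width space, soft hyphen, BOM
    | {0x200E, 0x200F}                # left-to-right / right-to-left marks
    | set(range(0x202A, 0x202F))      # bidi embeddings and overrides
    | set(range(0x2066, 0x206A))      # bidi isolates
    | set(range(0x2061, 0x2065))      # invisible mathematical operators
)


def find_hidden(text):
    """Return (index, code point, Unicode name) for each hidden character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) in HIDDEN
    ]
```

A soft hyphen can be legitimate in typeset text, so a hit is a prompt to look, not proof of AI origin.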
9. Typography & Format (7 patterns)
This category was added in v3.3. It catches texts that are convincing in substance but give themselves away through typographic anglicisms or decorative formatting.
| Pattern | Severity | Example |
|---|---|---|
| Incorrect German quotation marks | HIGH | German opener with U+201D/ASCII close instead of U+201C |
| English title-case capitalization | MEDIUM | "Die Zukunft Der Digitalen Transformation" |
| English decimal/date formats | LOW | "3.5 Prozent", "May 12, 2026" |
| English genitive apostrophe | MEDIUM | "Martin's Profil" instead of "Martins Profil" |
| Bullet-point punctuation | LOW | Periods on bare keywords, inconsistent lists |
| Obsessive parataxis | MEDIUM | 4+ same-shape main clauses without subordination |
| Markdown structure artifacts | MEDIUM | Single-row tables, skipped heading levels (H2→H4), thematic break --- right before a heading |
The quotation-mark problem is unusually stubborn: Claude picks the wrong closing German quote systematically, and prompting alone won't reliably fix it. The Humanizer flags those spots — the actual fix belongs in a post-processor or linter. Not every odd quote is a tell, though: the only hard AI signal is the asymmetry — a German opening quote (U+201E) paired with a wrong or straight closing mark instead of the correct U+201C. Consistently straight quotes, by contrast, are usually a CMS or editor artifact, not an AI tell; consistently English curly quotes are a weak signal at best. Treating every straight quote as AI just manufactures false positives.
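Because the hard signal is a specific pairing of code points, a post-processing check can be very small. This sketch flags only the asymmetry described above and deliberately ignores consistently straight or consistently English quotes. `asymmetric_quotes` is a hypothetical name, and the check is an illustration, not the skill's linter.

```python
import re

CORRECT_CLOSE = "\u201C"  # the correct German closing quote

# A German opener (U+201E), then text without quote characters, then the
# first closing candidate: U+201C (correct), U+201D, or an ASCII quote.
PAIR = re.compile("\u201E([^\u201E\u201C\u201D\"]*)([\u201C\u201D\"])")


def asymmetric_quotes(text):
    """Return (index, quoted span) where a German opener meets a wrong closer."""
    return [
        (m.start(), m.group(0))
        for m in PAIR.finditer(text)
        if m.group(2) != CORRECT_CLOSE
    ]
```

Replacing the flagged closer with U+201C is usually safe. Nested quotations are not handled here and would need their own rule.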
Obsessive parataxis is the opposite kind of tell: subtle. Each individual sentence is correct and readability scores are fine, but the monotony betrays the machine. The fix isn't "rewrite everything"; it's turning roughly every third sentence into a complex sentence with subordination. Exception: if staccato is the intended style (advertising, manifestos), the "don't touch" rule for soft-pattern clustering applies.
Before (LLM):
Das Team Analysierte Die Daten. Die Ergebnisse waren eindeutig. Die Conversion stieg um 3.5 Prozent. Das Projekt wurde im Budget abgeschlossen.
After (human):
Das Team analysierte die Daten und kam zu einem eindeutigen Ergebnis: Die Conversion stieg um 3,5 Prozent, während das Projekt im Budget blieb.
10. Title & Sentence Structure (2 patterns, new in v3.6)
Two patterns that only stand out when they cluster — and the only ones in this catalog that statistical detectors also measure (more on that below).
| Pattern | Severity | Example |
|---|---|---|
| Colon-title scheme | MEDIUM | Repeated "Keyword: explanatory tail" across titles and subheadings |
| Uniform sentence rhythm | MEDIUM | Sentences nearly all the same length, always subject-first |
What the Humanizer Does Not Flag
v3.4 adds a guardrail against over-editing. Not every polished or formally correct piece of writing is AI-generated.
These are not reliable tells on their own:
- perfect grammar and consistent style
- one dash or curly quotes
- dry prose without specific patterns
- one transition word such as "allerdings" or "zudem"
- unsourced claims that show no other source-related or speculation patterns
The skill also preserves positive human signals: concrete details, unresolved tension, era-bound references, genuine asides, self-corrections, and varied sentence length. The point is to remove AI artifacts, not flatten human authors.
Why AI Detectors Flag Clean Text as AI
There are tools online that scan a text and spit out an "AI probability." But these detectors do not check the patterns in this article. They estimate two statistical quantities:
- Perplexity — how predictable the next word is. Precise, smooth technical prose is highly predictable and scores low perplexity.
- Burstiness — how much sentence length and structure vary. Uniform sentences produce low burstiness.
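For reference, the two quantities can be written down as follows. Perplexity is given in its standard form, where p is the probability the detector's language model assigns to each token. Burstiness has no single agreed definition, so the coefficient of variation of sentence lengths shown here is an assumed proxy, with σ and μ the standard deviation and mean of the sentence lengths.

```latex
\mathrm{PPL}(w_1,\dots,w_N)
  = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\left(w_i \mid w_1,\dots,w_{i-1}\right)\right),
\qquad
B = \frac{\sigma_\ell}{\mu_\ell}
```

Predictable wording raises each p and therefore lowers PPL. Sentences of equal length shrink σ and push B toward zero.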
The pretty labels such tools attach — "mechanical precision," "impersonal tone," "robotic formality" — are just translations of those two numbers. And that's exactly the problem: these detectors often punish the very things that make a good technical text. Correct terminology lowers perplexity. Cleanly attributed sources sound "impersonal." A clear, factual structure reads as "mechanical."
So a text can be entirely human-written, every source accurate, every sentence earning its place — and still get flagged as "AI."
The wrong response is to deliberately degrade the text: water down terminology, loosen sources, sprinkle in artificial typos or slang just to push a number down. That makes the text worse, not more human.
The Humanizer goes the other way. It treats only two of these statistical findings — and only because they are genuine readability problems even without a detector:
- Colon-title scheme (pattern 54): When the H1, the caption, and several subheadings all follow the "Keyword: explanatory tail" shape, the result is a mechanical rhythm. A single colon title is perfectly fine — the clustering is the signal.
- Uniform sentence rhythm (pattern 55): When nearly every sentence is the same length and starts with the subject, the text turns monotone. The fix is not to insert errors but to deliberately spread sentence length — a short sentence next to a long, structured one.
Since v3.8 there is a third, substance-preserving lever: abstract-noun stacking (pattern 58). Replacing "verschiedene Maßnahmen zur Verbesserung der Verkehrssituation" (various measures to improve the traffic situation) with "two closed through-roads and a 30 km/h limit" raises word variance as a side effect of precision — the only way to treat low perplexity without making the text worse. And rhythm diagnosis is now measured, not felt: the new rhythm_lint.py script counts sentence-length spread, subject-first openings, paragraph lengths, and connector density deterministically and reports suspicions as numbers.
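To make "measured, not felt" concrete, here is a sketch of how such metrics can be computed deterministically. It is not the project's rhythm_lint.py: `rhythm_metrics` and the short `CONNECTORS` list are hypothetical, the sentence splitter is naive (abbreviations such as "z. B." will split wrongly), and subject-first detection is left out because it needs a parser.

```python
import re
import statistics

# Illustrative connector list; the real script's list is not known here.
CONNECTORS = ("darüber hinaus", "zudem", "allerdings")


def split_sentences(text):
    """Naive split on sentence-final punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def rhythm_metrics(text):
    """Compute sentence-length spread, paragraph lengths, and connector density."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    per_paragraph = [split_sentences(p) for p in paragraphs]
    sentences = [s for group in per_paragraph for s in group]
    if not sentences:
        raise ValueError("text contains no sentences")
    lengths = [len(s.split()) for s in sentences]
    mean = statistics.mean(lengths)
    # Coefficient of variation: 0.0 means every sentence has the same length.
    spread = statistics.pstdev(lengths) / mean
    connector_hits = sum(text.lower().count(c) for c in CONNECTORS)
    return {
        "sentences": len(sentences),
        "length_spread": round(spread, 2),
        "sentences_per_paragraph": [len(group) for group in per_paragraph],
        "connectors_per_sentence": round(connector_hits / len(sentences), 2),
    }
```

A spread near zero combined with identical paragraph sizes is a reason to reread the passage, not a verdict. Any thresholds would have to be calibrated on real text.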
Everything else a detector flags as "too clean" is not an AI tell but, usually, simply good writing. Leave it alone.
Why I Created the German Humanizer
I discovered Siqi Chen's original Humanizer and immediately saw the gap: it worked brilliantly for English, but German AI text has different patterns. Testing it on German text was like using an English spell-checker on German: not wrong, just missing the point.
The German Wikipedia maintains its own guide to AI-generated content indicators. The English Wikipedia has a comparable resource. Siqi's original pulls from the English one; the German version documents something different. I used both as the foundation.
The philosophy is the same as Siqi's tool — analysis, not auto-rewriting. But the patterns are German-specific. Since v4.0.0 the project stands on its own: it follows its own versioning scheme and roadmap, roughly half of its 65 patterns have no upstream counterpart, and it ships deterministic linters and a test suite the original doesn't have. The original remains both the inspiration and a source of ideas worth adapting.
Working with English content? Use Siqi Chen's original Humanizer. It's excellent for English text.
Working with German content? That's what the German adaptation is for.
If your goal is to disguise AI use, this is the wrong tool. The point is better writing, not camouflage.
Who Needs This
- German content creators using AI who want their writing to sound authentic
- Marketing teams reviewing copy for AI artifacts
- Wikipedia editors evaluating German submissions
- Bilingual teams where English editors need to catch German AI patterns
- Anyone learning how to recognize German AI-generated text
Credits and Open Source
The tool is MIT licensed and open source. It builds on:
- Pattern research: German Wikipedia's AI detection guide + English Wikipedia's AI writing signs
- Original concept & English Humanizer: Siqi Chen (blader)
- German adaptation: github.com/marmbiz/humanizer-de
I built the German version. Siqi built the original. Both Wikipedias documented the patterns.
Changelog
v4.0.0 (June 2026)
- Standalone release: own versioning scheme without the -de.FORK suffix; the project no longer tracks upstream versions, though blader/humanizer remains the inspiration and an idea source
- 2 new patterns (64–65), adapted from the English original for German: AI marker vocabulary (the German counterparts to "delve" and "tapestry": "beleuchten", "eintauchen", "spannend", "nahtlos", "die digitale Landschaft") and copula avoidance ("fungiert als", "verfügt über", "stellt dar" instead of "ist"/"hat")
- Pattern 58 sharpened: the vocabulary-trap list moved into pattern 64; 58 now focuses on hypernyms and nominal style
- 65 patterns across 10 categories
v3.8.0-de.1 (June 2026)
- 6 new patterns (58–63): Abstract-noun stacking and hypernym preference, fabricated first-person experience and forced casualness, synonym rotation for the same entity, isometric document, markerless closure compulsion, modal-particle anomaly
- Five-pass workflow: fixed order artifacts/evidence → lexis → structure → rhythm → self-audit; rhythm work (prefield rotation, sentence-length spreading, connector budget) is now the default in Casual and Neutral modes
- New measurement script scripts/rhythm_lint.py: deterministic burstiness/rhythm metrics (sentence-length spread, subject-initial ratio, paragraph lengths, connector density) feeding patterns 4/51/54/55/61 as suspicions
- Self-audit against new monotony: replacement strategies are rotated so fixes (e.g. em-dash → period) don't create new AI patterns themselves
- Golden corpus in tests/corpus/ with deterministically verifiable expectations
- 63 patterns across 10 categories
v3.7.0-de.1 (June 2026)
- 2 new patterns (56–57): Aphorism formulas (category "Rhetoric & Structure", now 9 patterns), Markdown structure artifacts (category "Typography & Format", now 7 patterns)
- Aphorism formulas: catches nice-sounding empty formulas like "X ist die Sprache des Y" ("X is the language of Y") that replace a concrete claim with a catchy template
- Markdown structure artifacts: bundles three format tells: single-row tables instead of prose, skipped heading levels (H2→H4), and decorative thematic breaks (---) right before headings
- Plugin install: installable via the Claude Code plugin marketplace (/plugin marketplace add marmbiz/humanizer-de)
- 57 patterns across 10 categories
v3.6.0-de.1 (June 2026)
- 2 new patterns (54–55) in a new "Title & Sentence Structure" category: Colon-title scheme, Uniform sentence rhythm
- Realistic about statistical detectors: a new guardrail noting that perplexity/burstiness findings usually hit legitimate technical language and are not an AI tell
- Pattern 46 sharpened: only the asymmetry (German opener + wrong closing mark) is a hard tell; consistently straight quotes are a CMS artifact
- 55 patterns across 10 categories
v3.5.0-de.1 (May 2026)
- Architecture overhaul: SKILL.md is now a lean router; the full pattern catalog lives in a dedicated reference file
- Decision tables for overlapping findings and a standalone Unicode/quote linter with conservative auto-fix
- Test suite added; no new patterns
v3.4.0-de.1 (May 2026)
- False-positive guardrails: New "What NOT to flag" section plus human-writing signals to preserve
- 2 new patterns (52–53): Diff-anchored writing, Speculative gap-filling
- Guardrails extended: Speculative filler is now treated as a source-based finding and, where it needs removal, as a substanceless AI artifact
- 53 patterns across 9 categories
v3.3.0-de.1 (May 2026)
- 6 new patterns (46–51) in a new "Typography and Format" category: Incorrect German quotation marks, English title-case capitalization, English decimal/date formats, genitive apostrophe errors, bullet-point punctuation, obsessive parataxis
- Pattern 43 extended: Unicode scanner now covers U+2061–U+2064 (invisible mathematical operators used as potential AI watermarks)
- 51 patterns across 9 categories
v3.2.4-de.1 (April 2026)
- 4 new patterns (42–45): Source incongruence, Hidden Unicode characters, Standard chapters without substance, Anglicism structures — new category "Additions"
- Additional sources: Now also builds on the Wikipedia guidelines Erkennung KI-Einsatz and Schnelltest KI
- Guardrails harmonized: "Never shorten substance" (instead of "Never shorten") with an explicit exception list for artifact cleanup
- 3+-clustering rule limited to soft stylistic patterns; HIGH patterns, structural findings, source-based findings, and false friends are corrected on every occurrence
- Mode system made consistent: "Add voice" full in Casual, moderate in Neutral, none in Formal
- Operational clarifications for patterns 21, 22, 25, 26: external research is out of scope for the skill; it flags the passage instead
- 45 patterns across 8 categories
v3.1.0-de.1 (April 2026)
- 3 new patterns (39–41): Passive constructions, Conditional stacking, Miscalibrated epistemic confidence — new category "Argumentation & Evidence"
- 4 expanded patterns: Negative parallelisms (+clipped negation fragments), Dashes (replacement hierarchy), Broken wikitext (+AI artifacts), DOIs (→full citation fabrication)
- "Never shorten" rule: Output must cover everything the original contains
- Dash scan: Dedicated workflow step
- Quick checklist: 7-point pre-output audit
- 41 patterns in 7 categories
- Integrates 6 PRs from the English original (blader/humanizer): #79, #80, #84, #85, #94, #96
v3.0.0-de.1 (March 2026)
- Voice calibration: Match the user's personal writing style from samples
- 4 new patterns (35–38): Rhetorical fake questions, Universal human experience openers, "In today's X world" framing, Aspirational corporate closings (adapted from upstream PR #67)
- 38 patterns total
v2.3.0-de.1 (March 2026)
- 3 new patterns (32–34): Persuasive authority phrases, Signposting, Fragmented headings (adapted from upstream PR #39)
- Severity ranking (HIGH / MEDIUM / LOW) for all 34 patterns (inspired by upstream PR #51)
- Mode system: Casual / Neutral / Formal
- Quick reference table for fast scanning
- "Don't touch" rules and guardrails
v2.2.0-de.2 (February 2026)
- 2-pass workflow instead of one-shot cleanup: Draft -> Quick audit -> Final
- More emphasis on voice: rhythm, perspective, natural variation
- Cleaner review format: three separated output blocks
This post was written with AI assistance. But reviewed with language-specific awareness. Because the patterns that reveal AI aren't just in what you write — they're in which language you write it in.