
Humanizer for German: Why AI Text Detection Is Language-Specific

AI-generated text has telltale patterns. But those patterns aren't universal.

When you read a text that feels wrong—too smooth, perfectly structured, every paragraph the same length—you can often sense that an LLM wrote it. But what you're sensing is mostly English. German AI text sounds different. It has its own tells.

The problem: most AI detection tools and guides are designed for English. They catch English patterns beautifully. They miss German ones entirely.

I built the Humanizer (Deutsch) to fix that. It's a free, open-source Claude Code skill that detects 45 German-specific AI writing patterns, ranks them by severity, and walks you through a structured 2-pass cleanup. New in v3: optional voice calibration that matches your personal writing style from samples.

Try it

github.com/marmbiz/humanizer-de — MIT licensed, free, works as a Claude Code skill.

git clone https://github.com/marmbiz/humanizer-de ~/.claude/skills/humanizer-de

Then in Claude Code: /humanizer — done.

What the Humanizer Does

You call it with /humanizer or just say "Humanize this text for me." It gives you:

  1. A draft rewrite with the obvious AI patterns removed
  2. A quick anti-AI audit flagging remaining tells
  3. A final version after the second pass

New in v3: Voice calibration. Provide a sample of your own writing and the skill analyzes your sentence rhythm, word choices, and quirks — then applies them to the rewrite. Instead of generic "clean" output, you get text that sounds like you.

Three modes adjust the correction to your context:

  • Casual (blog posts, social media, newsletters): adds personality and rhythm
  • Neutral (business reports, product docs, emails): removes AI tells, keeps the tone neutral
  • Formal (academic papers, legal texts, technical docs): only removes tells, preserves structure

Default is Neutral when the context is unclear.

New in v3.2: Four additional patterns and structural guardrails:

  • Source incongruence (HIGH): The source exists, but doesn't actually support the claim in the text. Classic LLM hallucination signal.
  • Hidden Unicode characters (HIGH): Zero-Width Space, Soft Hyphen, bidi controls — invisible artifacts from LLM output.
  • Standard chapters without substance (MEDIUM): Generic headings like "Future perspectives" with unsourced filler. Don't shorten — concretize, integrate, or re-title.
  • Anglicism structures (MEDIUM): Hard calques ("am Ende des Tages") and false friends ("eventuell" in the sense of "eventually/finally" rather than "possibly").

Plus harmonized guardrails: "Never shorten substance" instead of "Never shorten" (artifacts may be removed), the 3+-clustering rule applies only to soft stylistic patterns, and mode-dependent voice calibration is now consistent throughout.

Severity ranking (HIGH / MEDIUM / LOW) for each pattern lets you focus on what matters most. HIGH patterns are almost always AI. LOW patterns only stand out when they cluster.
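The clustering rule for LOW patterns can be sketched in a few lines. This is my own illustrative approximation, not the skill's actual implementation; the function name, the tuple shape, and the threshold of 3 are assumptions:

```python
# Hypothetical sketch of the severity/clustering rule described above.
# HIGH and MEDIUM findings always pass; LOW findings are reported only
# when three or more LOW findings cluster in the same text.

def filter_findings(findings, cluster_threshold=3):
    """findings: list of (pattern_name, severity) tuples."""
    low = [f for f in findings if f[1] == "LOW"]
    keep_low = len(low) >= cluster_threshold
    return [f for f in findings if f[1] != "LOW" or keep_low]

findings = [
    ("em-dash overuse", "LOW"),
    ("participle-I construction", "HIGH"),
    ("fragmented heading", "LOW"),
]
# Only two LOW findings here, so just the HIGH one survives.
print(filter_findings(findings))
```

With a third LOW finding added, all of them would be reported, since the cluster threshold is reached.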

Why German AI Text Is Different

English and German diverge in their vulnerabilities to LLMs. The same model that produces flawless English can betray itself immediately in German through patterns that native English speakers don't notice.

Take these examples:

  • Participle-I constructions like "gewährleistend" or "hervorhebend" (ensuring, highlighting). In English, "-ing" forms are natural everywhere. In German, this construction screams LLM.
  • Overused transition phrases like "Darüber hinaus" (furthermore) appearing three times per paragraph. Native German writers vary their transitions. LLMs repeat the same mechanical connectors.
  • Em-dashes everywhere — a punctuation habit from English that German doesn't share natively.
  • Vague authorities like "Experten sagen" (experts say) with no sources attached.
  • Symbolic overload like "steht als Zeugnis für" (stands as testimony to) — nobody writes like this.
  • Promotional tone with "atemberaubend" (breathtaking) in contexts where it doesn't belong.
  • Chatbot artifacts like "Stand Januar 2024" (as of January 2024) appearing in articles written months later.
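A crude version of the transition-phrase check is easy to sketch. The connector list below is a small illustrative sample, not the skill's full pattern set, and the threshold of two per paragraph is my own assumption:

```python
import re

# Illustrative check for mechanical German connectors. Native writers
# vary their transitions; repeated stock connectors in one paragraph
# are an AI tell.
CONNECTORS = ["Darüber hinaus", "Des Weiteren", "Zudem", "Außerdem"]

def connector_density(text):
    """Return {paragraph_index: count} for paragraphs with 2+ connectors."""
    counts = {}
    for i, para in enumerate(text.split("\n\n")):
        n = sum(len(re.findall(re.escape(c), para)) for c in CONNECTORS)
        if n >= 2:  # two or more in one paragraph reads mechanical
            counts[i] = n
    return counts

text = ("Darüber hinaus ist das System schnell. "
        "Darüber hinaus ist es sicher. Zudem skaliert es gut.")
print(connector_density(text))  # {0: 3}
```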

Before (LLM):

Die atemberaubende Stadt mit ihrem reichen kulturellen Erbe steht als Zeugnis für die künstlerische Brillanz vergangener Generationen.

"The breathtaking city with its rich cultural heritage stands as testimony to the artistic brilliance of past generations."

After (human):

Die Stadt hat eine lange Geschichte. Ihre Denkmäler zeigen die Handwerkskunst des Mittelalters.

"The city has a long history. Its monuments show medieval craftsmanship."

Less decoration, more substance.

45 Patterns in 8 Categories

The Humanizer detects patterns across eight categories:

1. Language & Tone (12 patterns, mostly HIGH)

Symbolic overload, promotional language, editorial comments, mechanical conjunctions, section summaries, participle-I constructions, vague authorities, forced conclusions, negative parallelisms (now including clipped negation fragments like "kein Raten.", "keine Kompromisse."), tricolon overuse, false extensions, misplaced "Fazit" sections.

2. Style (4 patterns, MEDIUM/LOW)

Excessive bold text, false lists, emojis before headings, em-dash overuse (now with a replacement hierarchy: period > comma > colon > semicolon > parentheses > rephrase, plus detection of paired inserts and dash variants).

3. Communication (6 patterns, all HIGH)

Letter-style writing, collaborative chatbot phrases ("I hope this helps!"), knowledge cutoff references, prompt refusals, placeholder text, links to search queries.

4. Markup (6 patterns, all MEDIUM)

Markdown instead of wikitext, broken wikitext and AI tool artifacts (oaicite tags, contentReference spans, turn0search0 references), dead links, full citation fabrication (hallucinated publications, non-existent journals, utm_source parameters), incorrect reference formats, wrong categories.
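The tool-artifact patterns in this category lend themselves to plain regex scanning. A minimal sketch, with regexes I assembled from the artifact names listed above rather than taken from the skill itself:

```python
import re

# Regexes for common AI tool artifacts; illustrative approximations,
# not the skill's exact rules.
ARTIFACTS = {
    "oaicite tag": re.compile(r"oaicite"),
    "contentReference span": re.compile(r"contentReference"),
    "turn0search reference": re.compile(r"turn\d+search\d+"),
    "utm_source parameter": re.compile(r"[?&]utm_source="),
}

def scan_artifacts(text):
    """Return the names of all artifact patterns found in text."""
    return [name for name, rx in ARTIFACTS.items() if rx.search(text)]

sample = "Siehe Quelle turn0search0 und https://example.com/?utm_source=chatgpt.com"
print(scan_artifacts(sample))
```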

5. Miscellaneous (3 patterns, LOW/MEDIUM)

Abrupt cutoffs, style shifts mid-text, first-person edit summaries.

6. Rhetoric & Structure (7 patterns)

  • Persuasive authority phrases (MEDIUM): "Im Kern" (at its core), "In Wirklichkeit" (in reality)
  • Signposting (MEDIUM): "Schauen wir uns an" (let's look at), "Here's what you need to know"
  • Fragmented headings (LOW): a generic one-liner immediately after a heading
  • Rhetorical questions as fake engagement (MEDIUM): "Aber was bedeutet das?" (But what does this mean?)
  • Universal human experience openers (MEDIUM): "Seit jeher" (since time immemorial), "Seit Anbeginn der Zivilisation" (since the dawn of civilization)
  • "In today's X world" framing (MEDIUM): "In der heutigen digitalen Welt" (in today's digital world)
  • Aspirational corporate closings (MEDIUM): "bestens aufgestellt" (well-positioned), "die Möglichkeiten sind grenzenlos" (the possibilities are limitless)

7. Argumentation & Evidence (3 patterns)

  • Passive constructions and subjectless fragments (MEDIUM): "wurde durchgeführt" (was carried out), "Keine Konfiguration nötig." (No configuration needed.); both hide the actor
  • Conditional stacking (MEDIUM): piled-up "wenn/falls/sofern" (if/in case/provided that) clauses instead of a direct statement of what the analysis found
  • Miscalibrated epistemic confidence (MEDIUM): swings between over-assertion ("grundlegend verändert" = fundamentally changed, "zweifellos" = without doubt) and over-hedging ("scheint möglicherweise" = seems possibly, "könnte eventuell" = could perhaps)

LLMs hide the actor behind passive voice and subjectless sentences. They stack conditionals where a direct statement would do. Most telling: the swing between over-assertion ("revolutionary", "without doubt") and over-hedging ("seems possibly", "could perhaps") within the same paragraph.
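A naive approximation of the passive check is a regex for a form of "werden" followed by a ge- participle in the same clause. This sketch is deliberately crude and is mine, not the skill's: real passive detection needs a parser, and this pattern both misses cases and over-matches:

```python
import re

# Crude werden-passive detector: a form of "werden" plus a participle
# (stem containing "ge", ending in -t or -en) before the clause ends.
# Illustrative only; separable prefixes, sein-passive etc. are missed.
PASSIVE = re.compile(
    r"\b(wird|wurde|werden|wurden)\b[^.!?]*\b\w*ge\w+(?:t|en)\b",
    re.IGNORECASE,
)

def find_passives(text):
    """Return every matched passive span in text."""
    return [m.group(0) for m in PASSIVE.finditer(text)]

print(find_passives("Die Analyse wurde sorgfältig durchgeführt."))
```

An active rewrite ("Wir führten die Analyse durch.") would produce no match, which is exactly the point of the pattern: name the actor.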

Patterns 32–34 were adapted from upstream PR #39. Patterns 35–38 were adapted from upstream PR #67. Patterns 39–41 are from v3.1, adapted from upstream PRs #79, #80, #84, #85, #94, #96. All have German-specific phrasing and examples.

8. Additions (4 patterns, new in v3.2)

Four patterns drawn from the German Wikipedia's Erkennung KI-Einsatz guideline and its Schnelltest KI companion:

  • Source incongruence (HIGH): the source exists but doesn't support the claim
  • Hidden Unicode characters (HIGH): Zero-Width Space (U+200B), Soft Hyphen, BOM, bidi controls
  • Standard chapters without substance (MEDIUM): "Future perspectives" plus unsourced filler; don't shorten, concretize or integrate instead
  • Anglicism structures (MEDIUM): hard calques and false friends: "am Ende des Tages" (a calque of "at the end of the day"), "eventuell" misused to mean "eventually/finally" (it means "possibly"), "aktuell" misused to mean "actually" (it means "currently")

Source incongruence is particularly tricky: the source exists, the DOI validates, the author did publish; the paper just doesn't support the claim. It's a classic LLM hallucination pattern that simple fact-checking tools miss. False friends like "eventuell" used to mean "eventually" (it actually means "possibly") are corrected regardless of mode because they are semantic errors, not style choices.
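Hidden Unicode characters are the most mechanically checkable of the four. A minimal detector, using only the characters named in this post (a production version would scan the whole Unicode "Cf" format category):

```python
# Minimal detector for the hidden-Unicode pattern described above.
# The character set here follows the examples in this post.
HIDDEN = {
    "\u200b": "Zero-Width Space",
    "\u00ad": "Soft Hyphen",
    "\ufeff": "Byte Order Mark",
    "\u202a": "LRE (bidi control)",
    "\u202b": "RLE (bidi control)",
    "\u202c": "PDF (bidi control)",
    "\u202d": "LRO (bidi control)",
    "\u202e": "RLO (bidi control)",
}

def find_hidden(text):
    """Return (index, name) for every hidden character in text."""
    return [(i, HIDDEN[ch]) for i, ch in enumerate(text) if ch in HIDDEN]

sample = "Ein sauber aussehender\u200b Satz."
print(find_hidden(sample))
```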

Why I Created the German Humanizer

I discovered Siqi Chen's original Humanizer and immediately saw the gap: it worked brilliantly for English, but German AI had different patterns. Testing it on German text was like using an English spell-checker on German — not wrong, just missing the point.

The German Wikipedia maintains its own guide to AI-generated content indicators. The English Wikipedia has a comparable resource. Siqi's original pulls from the English one; the German version documents something different. I used both as the foundation.

The philosophy is the same as Siqi's tool — analysis, not auto-rewriting. But the patterns are German-specific.

Working with English content? Use Siqi Chen's original Humanizer. It's excellent for English text.

Working with German content? That's what the German adaptation is for.

If your goal is to disguise AI use, this is the wrong tool. The point is better writing, not camouflage.

Who Needs This

  • German content creators using AI who want their writing to sound authentic
  • Marketing teams reviewing copy for AI artifacts
  • Wikipedia editors evaluating German submissions
  • Bilingual teams where English editors need to catch German AI patterns
  • Anyone learning how to recognize German AI-generated text

Credits and Open Source

The tool is MIT licensed and open source. It builds on Siqi Chen's original Humanizer and on the German and English Wikipedia guides to AI-generated content. I built the German version. Siqi built the original. Both Wikipedias documented the patterns.


Changelog

v3.2.4-de.1 (April 2026)
  • 4 new patterns (42–45): Source incongruence, Hidden Unicode characters, Standard chapters without substance, Anglicism structures — new category "Additions"
  • Additional sources: Now also builds on the Wikipedia guidelines Erkennung KI-Einsatz and Schnelltest KI
  • Guardrails harmonized: "Never shorten substance" (instead of "Never shorten") with an explicit exception list for artifact cleanup
  • 3+-clustering rule limited to soft stylistic patterns; HIGH patterns, structural findings, source-based findings, and false friends are corrected on every occurrence
  • Mode system made consistent: "Add voice" full in Casual, moderate in Neutral, none in Formal
  • Operational precisions in patterns 21, 22, 25, 26: External research is out of scope for the skill; mark instead
  • 45 patterns across 8 categories
v3.1.0-de.1 (April 2026)
  • 3 new patterns (39–41): Passive constructions, Conditional stacking, Miscalibrated epistemic confidence — new category "Argumentation & Evidence"
  • 4 expanded patterns: Negative parallelisms (+clipped negation fragments), Dashes (replacement hierarchy), Broken wikitext (+AI artifacts), DOIs (→full citation fabrication)
  • "Never shorten" rule: Output must cover everything the original contains
  • Dash scan: Dedicated workflow step
  • Quick checklist: 7-point pre-output audit
  • 41 patterns in 7 categories
  • Integrates 6 PRs from the English original (blader/humanizer): #79, #80, #84, #85, #94, #96
v3.0.0-de.1 (March 2026)
  • Voice calibration: Match the user's personal writing style from samples
  • 4 new patterns (35–38): Rhetorical fake questions, Universal human experience openers, "In today's X world" framing, Aspirational corporate closings (adapted from upstream PR #67)
  • 38 patterns total
v2.3.0-de.1 (March 2026)
  • 3 new patterns (32–34): Persuasive authority phrases, Signposting, Fragmented headings (adapted from upstream PR #39)
  • Severity ranking (HIGH / MEDIUM / LOW) for all 34 patterns (inspired by upstream PR #51)
  • Mode system: Casual / Neutral / Formal
  • Quick reference table for fast scanning
  • "Don't touch" rules and guardrails
v2.2.0-de.2 (February 2026)
  • 2-pass workflow instead of one-shot cleanup: Draft -> Quick audit -> Final
  • More emphasis on voice: rhythm, perspective, natural variation
  • Cleaner review format: three separated output blocks

This post was written with AI assistance, but reviewed with language-specific awareness. The patterns that reveal AI aren't just in what you write; they're in which language you write it in.
