removeDiacritics

Module Export

text |> text
text except text |> text

Removes diacritic marks (accents) from a text value, replacing accented characters with their base equivalents. Optionally preserves specific characters using the except keyword.

removeDiacritics strips combining diacritic marks from characters by decomposing them into their base letter and combining accent codepoints, then discarding the accents. This is useful when working with systems that only support a limited character set.

Characters that do not decompose into a base letter + combining mark (e.g. ø) are left unchanged.
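
To illustrate the difference between characters that decompose and those that do not, here is a small Python sketch using the standard `unicodedata` module. It is not part of this library; the helper name `decompose` is made up for illustration. It prints the code points produced by canonical decomposition (NFD) for é and for ø.

```python
import unicodedata

def decompose(ch: str) -> list[str]:
    """Return the code points of ch after canonical decomposition (NFD)."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]

# é (U+00E9) splits into the base letter e plus a combining acute accent.
print(decompose("\u00e9"))  # ['U+0065', 'U+0301']

# ø (U+00F8) has no canonical decomposition, so it comes back as-is.
print(decompose("\u00f8"))  # ['U+00F8']
```

Because ø stays a single code point, there is no combining mark to discard, which is why such characters pass through unchanged.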

Examples

Remove all diacritics

import { removeDiacritics } from 'text'
from 'résumé' removeDiacritics
// Returns: 'resume'

Preserve specific characters using except

Pass a text value containing the characters that should pass through unchanged. This is useful when a target system supports some extended characters but not others, for example a system that handles Swedish å, ä, ö but not other accented letters.

import { removeDiacritics } from 'text'
from 'résumé åäö señor' removeDiacritics except 'åäöÅÄÖ'
// Returns: 'resume åäö senor'

Notes

  • Uses Unicode canonical decomposition (FormD) to separate base characters from their combining marks
  • After decomposition, characters whose Unicode category is NonSpacingMark (general category Mn) are removed
  • Characters that do not decompose (e.g. ø, ß) are left unchanged regardless of the except list
  • The except keyword accepts any text value; each character in that value is treated individually as a preserved character
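
The behavior described in these notes can be approximated in Python as a rough sketch. This is not the library's implementation: the function name `remove_diacritics` and the `keep` parameter are hypothetical, and checking preserved characters before decomposition is an assumption about how except works. Input is assumed to use precomposed characters.

```python
import unicodedata

def remove_diacritics(text: str, keep: str = "") -> str:
    """Strip combining marks from text, leaving characters in keep untouched."""
    preserved = set(keep)  # each character of keep is preserved individually
    out = []
    for ch in text:
        if ch in preserved:
            # Assumption: preserved characters bypass decomposition entirely.
            out.append(ch)
            continue
        # Canonical decomposition separates base letters from combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        # Drop nonspacing marks (general category Mn); keep everything else.
        out.append("".join(c for c in decomposed if unicodedata.category(c) != "Mn"))
    return "".join(out)

# 'résumé' -> 'resume'
print(remove_diacritics("r\u00e9sum\u00e9"))

# 'résumé åäö señor' with åäöÅÄÖ preserved -> 'resume åäö senor'
print(remove_diacritics("r\u00e9sum\u00e9 \u00e5\u00e4\u00f6 se\u00f1or",
                        keep="\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6"))
```

In this sketch ø and ß come out unchanged whether or not they appear in `keep`, since NFD gives them no combining marks to remove.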