Transforms: decomposeDiacriticalMarks

farrowtj · March 27, 2023, 4:10pm

In our current solution we built a PowerShell script to strip diacritical marks, and we are now moving to SailPoint IdentityNow. We noticed that the decomposeDiacriticalMarks transform performs the same function.

Can we get the substitution list this transform uses? If its substitutions differ from ours, we want to prepare our users for the change.

colin_mckibben · March 27, 2023, 6:21pm

Welcome to the developer community, Todd.

We don’t have a complete list of substitution mappings, but I can tell you which library and methods the transform uses.

The decomposeDiacriticalMarks transform uses Java's Normalizer class (java.text.Normalizer) to decompose the diacritical marks. It specifically uses Normalization Form KD (NFKD), as described in Sections 3.6, 3.10, and 3.11 of The Unicode Standard, also summarized under Annex 4: Decomposition.

Once the text is decomposed, the transform uses a regex replace to remove every character in the Unicode Combining Diacritical Marks block (U+0300 to U+036F), which Java matches with \p{InCombiningDiacriticalMarks} (e.g., replaceAll("[\\p{InCombiningDiacriticalMarks}]", "")).
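To make the two steps concrete, here is a minimal sketch that prints the code points before and after decomposition. The class and method names (DecomposeDemo, codePoints, strip) are made up for illustration and are not the transform's actual source; only the Normalizer call and the regex come from the description above.

```java
import java.text.Normalizer;

public class DecomposeDemo {
    // Returns the code points of s as space-separated U+XXXX strings.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Hypothetical helper: NFKD decomposition followed by mark removal.
    public static String strip(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        return decomposed.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    }

    public static void main(String[] args) {
        String e = "\u00E9"; // precomposed e with acute accent
        String decomposed = Normalizer.normalize(e, Normalizer.Form.NFKD);
        System.out.println(codePoints(e));          // U+00E9
        System.out.println(codePoints(decomposed)); // U+0065 U+0301
        System.out.println(strip(e));               // e

        // A character with no decomposition passes through unchanged
        // (o with stroke, U+00F8).
        System.out.println(codePoints(strip("\u00F8"))); // U+00F8

        // NFKD also applies compatibility mappings, such as the fi ligature.
        System.out.println(strip("\uFB01"));        // fi
    }
}
```

Two things to watch for when comparing against a hand-built substitution list: letters such as ø have no Unicode decomposition, so this approach leaves them as they are, and NFKD rewrites compatibility characters (ligatures, for example) even when no diacritic is involved.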

That’s probably more technical than you wanted, but it will hopefully give you some idea of what’s going on under the hood.

If you want to run some tests in code, you can use the following Java code to compare the transform's output with the output of your PowerShell script.

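import java.text.Normalizer;

public class DiacriticsTest {
    public static void main(String[] args) {
        String input = "Crème Brûlée"; // sample value; substitute your own test data

        // Decompose each character into its base character plus combining marks
        input = Normalizer.normalize(input, Normalizer.Form.NFKD);

        // Remove the combining marks
        input = input.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");

        System.out.println(input);
    }
}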
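Since there is no published mapping list, one option is to generate one yourself by running the same two steps over a range of characters and recording every character whose output differs. The sketch below does that; SubstitutionTable, strip, and build are hypothetical names, and the range (Latin-1 Supplement through Latin Extended-A) is only an example, so widen it to cover whatever your data contains. It reproduces the logic described above, not the transform itself, so treat the result as a starting point for testing rather than an official list.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;

public class SubstitutionTable {
    // Same two steps as above: NFKD decomposition, then mark removal.
    public static String strip(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        return decomposed.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    }

    // Maps each character in [first, last] whose output differs from its input
    // to that output, in code point order.
    public static Map<String, String> build(int first, int last) {
        Map<String, String> table = new LinkedHashMap<>();
        for (int cp = first; cp <= last; cp++) {
            String in = new String(Character.toChars(cp));
            String out = strip(in);
            if (!out.equals(in)) {
                table.put(in, out);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Example range only: U+00C0 through U+017F.
        build(0x00C0, 0x017F).forEach((in, out) ->
            System.out.printf("U+%04X %s -> %s%n", in.codePointAt(0), in, out));
    }
}
```

Characters missing from the printed table are the ones this logic leaves untouched, which is where differences from a custom PowerShell mapping are most likely to show up.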
