Removing of accent characters and ligature characters from name attributes

shrutisangrulkar · October 10, 2025, 2:12pm


We have a cloud rule that generates a unique username for each user. While generating the username, we need to replace accented characters and ligatures in the first and last name with their plain English (ASCII) equivalents, and then proceed with the username generation.

I added the code below to replace the ligature characters. It works fine when I run it on my local machine, but when I merge it into the cloud rule and run the SailPoint rule validator, I get a failure message.

public String removeDiacritics(String inputString) {
    HashMap<Integer, String> ligatureMap = new HashMap<>();
    ligatureMap.put(0x00C6, "AE");
    ligatureMap.put(0x00D0, "Dh");
    ligatureMap.put(0x00F0, "dh");
    ligatureMap.put(0x0110, "Dj");
    ligatureMap.put(0x0111, "dj");
    ligatureMap.put(0x0126, "H");
    ligatureMap.put(0x0127, "h");
    ligatureMap.put(0x0130, "I");
    ligatureMap.put(0x0131, "i");
    ligatureMap.put(0x0141, "L");
    ligatureMap.put(0x013F, "L");
    ligatureMap.put(0x0142, "l");
    ligatureMap.put(0x0140, "l");
    ligatureMap.put(0x014A, "Ng");
    ligatureMap.put(0x014B, "ng");
    ligatureMap.put(0x00D8, "O");
    ligatureMap.put(0x0152, "OE");
    ligatureMap.put(0x00F8, "o");
    ligatureMap.put(0x0153, "o");
    ligatureMap.put(0x0138, "q");
    ligatureMap.put(0x00DE, "Th");
    ligatureMap.put(0x00FE, "th");
    ligatureMap.put(0x00DF, "ss");
    ligatureMap.put(0x00E6, "a");
    for (Map.Entry<Integer, String> entry : ligatureMap.entrySet()) {
        int codePoint = entry.getKey();
        char[] chars = Character.toChars(codePoint);
        String character = new String(chars);
        inputString = inputString.replace(character, entry.getValue());
    }
    String normalized = Normalizer.normalize(inputString, Normalizer.Form.NFD);
    Pattern pattern = Pattern.compile("\\p{M}");
    return pattern.matcher(normalized).replaceAll("");
}

If I remove the code above, the validator returns SUCCESS; with the code added, it FAILS with the following message:

No beanshell method called ‘removeDiacritics’ was found with signature: (java.lang.String)

sukanta_biswas · October 10, 2025, 2:42pm

@shrutisangrulkar ,

You’re running into two classic “Beanshell-in-ISC” gotchas:

The validator is looking for a top-level Beanshell method with signature (String) but can’t see yours. In SailPoint rules, don’t mark helper methods public (or static)—just declare them as plain functions in the top scope and before you call them.
Hidden syntax landmines: smart quotes, missing imports, regex escaping, and (occasionally) generics. The web UI often pastes “smart quotes” (“ ”) which break the validator. Also make sure Normalizer/Pattern are imported, and escape the \p{M} class properly.
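
As a quick way to check pasted rule source for the smart-quote problem described above, here is a minimal plain-Java sketch. The class and method names (SmartQuoteScan, findSmartQuotes) are made up for illustration and are not part of any SailPoint tooling; it only reports where curly quote characters occur so they can be replaced by hand.

```java
import java.util.ArrayList;
import java.util.List;

public class SmartQuoteScan {
    // Returns the zero-based offsets of typographic (curly) quote characters.
    // Hypothetical helper for illustration; run it locally on the rule source.
    static List<Integer> findSmartQuotes(String source) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            // U+2018 and U+2019 are curly single quotes,
            // U+201C and U+201D are curly double quotes.
            if (c == '\u2018' || c == '\u2019' || c == '\u201C' || c == '\u201D') {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // One line as it might look after being pasted through a rich-text editor.
        String pasted = "map.put(0x00C6, \u201CAE\u201D);";
        System.out.println("Smart quotes at offsets: " + findSmartQuotes(pasted));
    }
}
```

An empty result means the source contains none of these four characters; it does not rule out other non-ASCII characters.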

Below is a clean, validator-friendly version you can drop into your rule’s <Source><![CDATA[ ... ]]></Source> block. It:

Replaces common ligatures and special letters (Æ, Ð, ð, Đ, đ, Ħ, ħ, İ, ı, Ł/ł, Ŀ/ŀ, Ŋ/ŋ, Ø/ø, Œ/œ, ĸ, Þ/þ, ß, æ) with ASCII.
Strips all remaining diacritics via Unicode normalization (NFD) + remove combining marks.
Avoids access modifiers, uses ASCII quotes only, imports everything explicitly, and keeps regex escaping correct.
Works in Beanshell (and Java) without relying on generics.

import sailpoint.object.*;
import sailpoint.api.*;
import sailpoint.tools.*;
import org.apache.commons.lang.StringUtils;

import java.util.Map;
import java.util.HashMap;
import java.text.Normalizer;
import java.util.regex.Pattern;

// Max length for your AD/SAM or whatever target (adjust as needed)
final int MAX_USERNAME_LENGTH = 12;

/**
 * Helper: replace ligatures/special letters with ASCII and strip diacritics.
 * Keep as a top-level Beanshell function (no 'public' / 'static').
 */
String removeDiacritics(String input) {
    if (input == null) return null;

    // 1) Replace specific ligatures / special letters first
    Map map = new HashMap(); // avoid generics to keep Beanshell happy
    map.put(Integer.valueOf(0x00C6), "AE");  // Æ
    map.put(Integer.valueOf(0x00D0), "Dh");  // Ð
    map.put(Integer.valueOf(0x00F0), "dh");  // ð
    map.put(Integer.valueOf(0x0110), "Dj");  // Đ
    map.put(Integer.valueOf(0x0111), "dj");  // đ
    map.put(Integer.valueOf(0x0126), "H");   // Ħ
    map.put(Integer.valueOf(0x0127), "h");   // ħ
    map.put(Integer.valueOf(0x0130), "I");   // İ (Latin capital I with dot)
    map.put(Integer.valueOf(0x0131), "i");   // ı (dotless i)
    map.put(Integer.valueOf(0x0141), "L");   // Ł
    map.put(Integer.valueOf(0x013F), "L");   // Ŀ
    map.put(Integer.valueOf(0x0142), "l");   // ł
    map.put(Integer.valueOf(0x0140), "l");   // ŀ
    map.put(Integer.valueOf(0x014A), "Ng");  // Ŋ
    map.put(Integer.valueOf(0x014B), "ng");  // ŋ
    map.put(Integer.valueOf(0x00D8), "O");   // Ø
    map.put(Integer.valueOf(0x0152), "OE");  // Œ
    map.put(Integer.valueOf(0x00F8), "o");   // ø
    map.put(Integer.valueOf(0x0153), "oe");  // œ
    map.put(Integer.valueOf(0x0138), "q");   // ĸ (kra) — approximate
    map.put(Integer.valueOf(0x00DE), "Th");  // Þ
    map.put(Integer.valueOf(0x00FE), "th");  // þ
    map.put(Integer.valueOf(0x00DF), "ss");  // ß
    map.put(Integer.valueOf(0x00E6), "ae");  // æ

    String s = input;
    for (Object k : map.keySet()) {
        int codePoint = ((Integer) k).intValue();
        String repl = (String) map.get(k);
        String ch = new String(Character.toChars(codePoint));
        s = s.replace(ch, repl);
    }

    // 2) Strip remaining accents by decomposing and removing combining marks
    String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
    Pattern combiningMarks = Pattern.compile("\\p{M}+");
    String noDiacritics = combiningMarks.matcher(normalized).replaceAll("");

    // 3) Return the result. Characters with neither a mapping above nor a
    // canonical decomposition are left unchanged; alnum() below drops them.
    return noDiacritics;
}

/**
 * Sanitizes to [A-Za-z0-9] only.
 */
String alnum(String s) {
    if (s == null) return null;
    return s.replaceAll("[^A-Za-z0-9]", "");
}

/**
 * Example username generator using the helpers above.
 * Adjust the uniqueness logic for your tenant (e.g., check identities, append counters, etc.).
 */
String generateUsername(String firstName, String lastName) throws GeneralException {
    String otherName = identity != null ? identity.getStringAttribute("otherName") : null;

    firstName = StringUtils.trimToNull(firstName);
    lastName  = StringUtils.trimToNull(lastName);
    otherName = StringUtils.trimToNull(otherName);

    if (firstName != null) firstName = removeDiacritics(firstName);
    if (lastName  != null) lastName  = removeDiacritics(lastName);
    if (otherName != null) otherName = removeDiacritics(otherName);

    if (otherName != null) {
        firstName = otherName; // business rule: prefer otherName
    }

    if (firstName == null || lastName == null) {
        log.debug("UsernameGen | Missing first/last name; cannot generate username.");
        return null;
    }

    // Keep alphanumeric only
    firstName = alnum(firstName);
    lastName  = alnum(lastName);

    if (firstName.length() == 0 || lastName.length() == 0) {
        log.debug("UsernameGen | First/last reduced to empty after cleanup.");
        return null;
    }

    // Simple pattern: first initial + last name
    String base = ("" + firstName.charAt(0) + lastName).toLowerCase();

    // Enforce length
    String candidate = base.length() > MAX_USERNAME_LENGTH ? base.substring(0, MAX_USERNAME_LENGTH) : base;

    // TODO: add uniqueness check in your environment (e.g., query existing usernames and append a counter).
    // For illustration we return candidate as-is.
    return candidate;
}

// ===== Entry point for AttributeGenerator (example) =====
// You'll typically call generateUsername(...) from the rule's logic below.
String firstName = (String) identity.getAttribute("givenName");
String lastName  = (String) identity.getAttribute("sn"); // or "surname" etc.

return generateUsername(firstName, lastName);

Why this fixes your validator error

No public on removeDiacritics → Beanshell exposes it as a top-level function the validator can find.
ASCII quotes only → avoids hidden character issues.
Proper imports for Normalizer and Pattern.
\\p{M}+ is correctly escaped for a Java/Beanshell string.
Avoids generics in the Map declaration to keep the Beanshell parser happy.
Declared before it’s called.
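
To see why the explicit map is needed in addition to normalization, here is a small plain-Java sketch that can be run locally, outside the rule. The class name DiacriticsDemo and the reduced replacement list are only for illustration. Letters such as ø, Ł and ß have no canonical decomposition, so NFD plus removal of combining marks leaves them untouched, while ñ decomposes and is reduced to n.

```java
import java.text.Normalizer;

public class DiacriticsDemo {
    // Decompose (NFD), then drop combining marks (Unicode category M).
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // A deliberately small subset of the explicit replacements from the rule.
    static String replaceSpecials(String s) {
        return s.replace("\u00C6", "AE")   // AE ligature, capital
                .replace("\u00E6", "ae")   // ae ligature, small
                .replace("\u00D8", "O")    // O with stroke
                .replace("\u00F8", "o")    // o with stroke
                .replace("\u0141", "L")    // L with stroke
                .replace("\u0142", "l")    // l with stroke
                .replace("\u00DF", "ss");  // sharp s
    }

    public static void main(String[] args) {
        String name = "S\u00F8ren \u0141ukasz Mu\u00F1oz Stra\u00DFe";
        // Normalization only: the n-tilde is fixed, the other specials remain.
        System.out.println(stripMarks(name));
        // Explicit replacements first, then normalization.
        System.out.println(stripMarks(replaceSpecials(name)));
    }
}
```

The order (map first, then normalize) matches the rule above; for this subset the order would not change the result, since none of the mapped letters decompose.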

Cheers!!!

j_place · October 10, 2025, 3:01pm

Hi @shrutisangrulkar, I appreciate this isn’t the question you’re asking, but why not use the out-of-the-box (OOB) Decompose Diacritical Marks transform?

The only issue I found was that Swedes don’t like umlauted vowels being suffixed with an e (i.e. Jörg = Jorg in Sweden, but Jörg = Joerg in Germany), but that can be caught prior to the transform.

pradeep1602 · October 15, 2025, 7:21am

@j_place to handle that type of scenario as well, you can use a Replace All transform first and then pass the result to the Decompose Diacritical Marks transform.
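
The "replace first, then decompose" ordering can be sketched in plain Java as follows. This is only an illustration of the order of operations, not the transforms themselves; the class UmlautPolicy, the germanStyle flag, and the choice of which letters to expand are all assumptions.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;

public class UmlautPolicy {
    // Hypothetical German-style expansions, applied before generic stripping.
    static final Map<String, String> GERMAN = new LinkedHashMap<>();
    static {
        GERMAN.put("\u00E4", "ae"); // a with diaeresis
        GERMAN.put("\u00F6", "oe"); // o with diaeresis
        GERMAN.put("\u00FC", "ue"); // u with diaeresis
        GERMAN.put("\u00C4", "Ae");
        GERMAN.put("\u00D6", "Oe");
        GERMAN.put("\u00DC", "Ue");
    }

    // Decompose (NFD), then drop combining marks.
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // germanStyle is an assumed flag, e.g. derived from a country attribute.
    static String toAscii(String name, boolean germanStyle) {
        // Compose first so the precomposed keys above also match decomposed input.
        String s = Normalizer.normalize(name, Normalizer.Form.NFC);
        if (germanStyle) {
            for (Map.Entry<String, String> e : GERMAN.entrySet()) {
                s = s.replace(e.getKey(), e.getValue());
            }
        }
        return stripMarks(s);
    }

    public static void main(String[] args) {
        System.out.println(toAscii("J\u00F6rg", true));  // German-style expansion
        System.out.println(toAscii("J\u00F6rg", false)); // plain stripping
    }
}
```

Because the expansion runs before the marks are stripped, names that should keep the plain vowel simply skip the first step.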

nkandu01 · October 17, 2025, 5:03am

Hi Sukanta Biswas,

Thanks for checking. We (I am from Shruti’s team) tried this as suggested, but it is still not working for us.

Thanks,
Naresh

pradeep1602 · October 17, 2025, 5:44am

Hi Naresh,

It is hard to tell why it is not working for you without seeing what you tried and what failures you are getting.

E.g., you could try something like this:
{
  "name": "RemoveAccentsTransform",
  "type": "string",
  "attributes": {
    "input": {
      "type": "string",
      "source": "accountAttribute",
      "name": "displayName"
    },
    "replaceAll": [
      { "match": "[àáâãäå]", "replacement": "a" },
      { "match": "[èéêë]", "replacement": "e" },
      { "match": "[ìíîï]", "replacement": "i" },
      { "match": "[òóôõö]", "replacement": "o" },
      { "match": "[ùúûü]", "replacement": "u" },
      { "match": "[ýÿ]", "replacement": "y" },
      { "match": "[ñ]", "replacement": "n" },
      { "match": "[ç]", "replacement": "c" },
      { "match": "[ÀÁÂÃÄÅ]", "replacement": "A" },
      { "match": "[ÈÉÊË]", "replacement": "E" },
      { "match": "[ÌÍÎÏ]", "replacement": "I" },
      { "match": "[ÒÓÔÕÖ]", "replacement": "O" },
      { "match": "[ÙÚÛÜ]", "replacement": "U" },
      { "match": "[ÝŸ]", "replacement": "Y" },
      { "match": "[Ñ]", "replacement": "N" },
      { "match": "[Ç]", "replacement": "C" }
    ]
  }
}
Thanks
Pradeep
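
For comparison, the match/replacement table idea can be tried out locally in plain Java before committing to a transform. This is a rough sketch under assumptions: the class name ReplaceTable is made up, only a few of the entries above are included, and it says nothing about how the transform engine itself evaluates the table. It does show one limitation of any table approach: only the listed characters are handled.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReplaceTable {
    // Ordered regex -> replacement pairs, a subset of the table above.
    // Unicode escapes keep this source file ASCII-only.
    static final Map<String, String> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("[\u00E0-\u00E5]", "a"); // a with grave .. a with ring
        TABLE.put("[\u00E8-\u00EB]", "e"); // e with grave .. e with diaeresis
        TABLE.put("[\u00F1]", "n");        // n with tilde
        TABLE.put("[\u00E7]", "c");        // c with cedilla
    }

    static String apply(String input) {
        // Compose first: the character classes only match precomposed letters.
        String s = Normalizer.normalize(input, Normalizer.Form.NFC);
        for (Map.Entry<String, String> e : TABLE.entrySet()) {
            s = s.replaceAll(e.getKey(), e.getValue());
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(apply("Fran\u00E7ois Mu\u00F1oz"));
        // Not in the table, so it passes through unchanged.
        System.out.println(apply("\u0141ukasz"));
    }
}
```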

system · December 16, 2025, 5:44am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
