Removing accent and ligature characters from name attributes

We have a cloud rule that generates a unique username for each user. Before generating the username, we need to replace the accented characters and ligatures in the first and last name with plain English (ASCII) characters.

I added the code below to replace the ligature characters. It works fine when I run it on my local machine, but when I merge it into the cloud rule and run the SailPoint validator, I get a failure message.

public String removeDiacritics(String inputString) {
HashMap<Integer, String> ligatureMap = new HashMap<>();
ligatureMap.put(0x00C6, “AE”);
ligatureMap.put(0x00D0, “Dh”);
ligatureMap.put(0x00F0, “dh”);
ligatureMap.put(0x0110, “Dj”);
ligatureMap.put(0x0111, “dj”);
ligatureMap.put(0x0126, “H”);
ligatureMap.put(0x0127, “h”);
ligatureMap.put(0x0130, “I”);
ligatureMap.put(0x0131, “i”);
ligatureMap.put(0x0141, “L”);
ligatureMap.put(0x013F, “L”);
ligatureMap.put(0x0142, “l”);
ligatureMap.put(0x0140, “l”);
ligatureMap.put(0x014A, “Ng”);
ligatureMap.put(0x014B, “ng”);
ligatureMap.put(0x00D8, “O”);
ligatureMap.put(0x0152, “OE”);
ligatureMap.put(0x00F8, “o”);
ligatureMap.put(0x0153, “o”);
ligatureMap.put(0x0138, “q”);
ligatureMap.put(0x00DE, “Th”);
ligatureMap.put(0x00FE, “th”);
ligatureMap.put(0x00DF, “ss”);
ligatureMap.put(0x00E6, “a”);
for (Map.Entry<Integer, String> entry : ligatureMap.entrySet()) {
int codePoint=entry.getKey();
char[] chars = Character.toChars(codePoint);
String character = new String(chars);
inputString = inputString.replace(character, entry.getValue());
}
String normalized = Normalizer.normalize(inputString, Normalizer.Form.NFD);
Pattern pattern = Pattern.compile(“\p{M}”);
return pattern.matcher(normalized).replaceAll(“”);
}

If I remove the above code, the validator returns SUCCESS. With the code added, it FAILS with this message:

No beanshell method called ‘removeDiacritics’ was found with signature: (java.lang.String)

@shrutisangrulkar ,

You’re running into two classic “Beanshell-in-ISC” gotchas:

  1. The validator is looking for a top-level Beanshell method with signature (String) but can’t see yours. In SailPoint rules, don’t mark helper methods public (or static)—just declare them as plain functions in the top scope and before you call them.

  2. Hidden syntax landmines: smart quotes, missing imports, regex escaping, and (occasionally) generics. The web UI often pastes “smart quotes” (“ ”) which break the validator. Also make sure Normalizer/Pattern are imported, and escape the \p{M} class properly.

Below is a clean, validator-friendly version you can drop into your rule’s <Source><![CDATA[ ... ]]></Source> block. It:

  • Replaces common ligatures and special letters (Æ/æ, Ð/ð, Đ/đ, Ħ/ħ, İ, ı, Ł/ł, Ŀ/ŀ, Ŋ/ŋ, Ø/ø, Œ/œ, ĸ, Þ/þ, ß) with ASCII.

  • Strips all remaining diacritics via Unicode normalization (NFD) + remove combining marks.

  • Avoids access modifiers, uses ASCII quotes only, imports everything explicitly, and keeps regex escaping correct.

  • Works in Beanshell (and Java) without relying on generics.

import sailpoint.object.*;
import sailpoint.api.*;
import sailpoint.tools.*;
import org.apache.commons.lang.StringUtils;

import java.util.Map;
import java.util.HashMap;
import java.text.Normalizer;
import java.util.regex.Pattern;

// Max length for your AD/SAM or whatever target (adjust as needed)
final int MAX_USERNAME_LENGTH = 12;

/**
 * Helper: replace ligatures/special letters with ASCII and strip diacritics.
 * Keep as a top-level Beanshell function (no 'public' / 'static').
 */
String removeDiacritics(String input) {
    if (input == null) return null;

    // 1) Replace specific ligatures / special letters first
    Map map = new HashMap(); // avoid generics to keep Beanshell happy
    map.put(Integer.valueOf(0x00C6), "AE");  // Æ
    map.put(Integer.valueOf(0x00D0), "Dh");  // Ð
    map.put(Integer.valueOf(0x00F0), "dh");  // ð
    map.put(Integer.valueOf(0x0110), "Dj");  // Đ
    map.put(Integer.valueOf(0x0111), "dj");  // đ
    map.put(Integer.valueOf(0x0126), "H");   // Ħ
    map.put(Integer.valueOf(0x0127), "h");   // ħ
    map.put(Integer.valueOf(0x0130), "I");   // İ (Latin capital I with dot)
    map.put(Integer.valueOf(0x0131), "i");   // ı (dotless i)
    map.put(Integer.valueOf(0x0141), "L");   // Ł
    map.put(Integer.valueOf(0x013F), "L");   // Ŀ
    map.put(Integer.valueOf(0x0142), "l");   // ł
    map.put(Integer.valueOf(0x0140), "l");   // ŀ
    map.put(Integer.valueOf(0x014A), "Ng");  // Ŋ
    map.put(Integer.valueOf(0x014B), "ng");  // ŋ
    map.put(Integer.valueOf(0x00D8), "O");   // Ø
    map.put(Integer.valueOf(0x0152), "OE");  // Œ
    map.put(Integer.valueOf(0x00F8), "o");   // ø
    map.put(Integer.valueOf(0x0153), "oe");  // œ
    map.put(Integer.valueOf(0x0138), "q");   // ĸ (kra) — approximate
    map.put(Integer.valueOf(0x00DE), "Th");  // Þ
    map.put(Integer.valueOf(0x00FE), "th");  // þ
    map.put(Integer.valueOf(0x00DF), "ss");  // ß
    map.put(Integer.valueOf(0x00E6), "ae");  // æ

    String s = input;
    for (Object k : map.keySet()) {
        int codePoint = ((Integer) k).intValue();
        String repl = (String) map.get(k);
        String ch = new String(Character.toChars(codePoint));
        s = s.replace(ch, repl);
    }

    // 2) Strip remaining accents by decomposing and removing combining marks
    String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
    Pattern combiningMarks = Pattern.compile("\\p{M}+");
    String noDiacritics = combiningMarks.matcher(normalized).replaceAll("");

    // 3) Return the cleaned string. Any character still outside
    //    A-Z / a-z / 0-9 is removed later by alnum().
    return noDiacritics;
}

/**
 * Sanitizes to [A-Za-z0-9] only.
 */
String alnum(String s) {
    if (s == null) return null;
    return s.replaceAll("[^A-Za-z0-9]", "");
}

/**
 * Example username generator using the helpers above.
 * Adjust the uniqueness logic for your tenant (e.g., check identities, append counters, etc.).
 */
String generateUsername(String firstName, String lastName) throws GeneralException {
    String otherName = identity != null ? identity.getStringAttribute("otherName") : null;

    firstName = StringUtils.trimToNull(firstName);
    lastName  = StringUtils.trimToNull(lastName);
    otherName = StringUtils.trimToNull(otherName);

    if (firstName != null) firstName = removeDiacritics(firstName);
    if (lastName  != null) lastName  = removeDiacritics(lastName);
    if (otherName != null) otherName = removeDiacritics(otherName);

    if (otherName != null) {
        firstName = otherName; // business rule: prefer otherName
    }

    if (firstName == null || lastName == null) {
        log.debug("UsernameGen | Missing first/last name; cannot generate username.");
        return null;
    }

    // Keep alphanumeric only
    firstName = alnum(firstName);
    lastName  = alnum(lastName);

    if (firstName.length() == 0 || lastName.length() == 0) {
        log.debug("UsernameGen | First/last reduced to empty after cleanup.");
        return null;
    }

    // Simple pattern: first initial + last name
    String base = ("" + firstName.charAt(0) + lastName).toLowerCase();

    // Enforce length
    String candidate = base.length() > MAX_USERNAME_LENGTH ? base.substring(0, MAX_USERNAME_LENGTH) : base;

    // TODO: add uniqueness check in your environment (e.g., query existing usernames and append a counter).
    // For illustration we return candidate as-is.
    return candidate;
}

// ===== Entry point for AttributeGenerator (example) =====
// You'll typically call generateUsername(...) from the rule's logic below.
String firstName = (String) identity.getAttribute("givenName");
String lastName  = (String) identity.getAttribute("sn"); // or "surname" etc.

return generateUsername(firstName, lastName);
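The uniqueness step left as a TODO above could look roughly like the following plain-Java sketch. It is only an illustration: the `taken` set is a hypothetical stand-in for whatever lookup your tenant provides for existing usernames, and `UniqueUsernameDemo` and `makeUnique` are made-up names, not part of any SailPoint API.

```java
import java.util.Set;

class UniqueUsernameDemo {
    /**
     * Returns base (truncated to maxLength) if it is free; otherwise appends
     * 1, 2, 3, ... and shortens the base so the result still fits maxLength.
     * 'taken' is a hypothetical stand-in for a real "does this username exist" check.
     */
    static String makeUnique(String base, Set<String> taken, int maxLength) {
        String candidate = base.length() > maxLength ? base.substring(0, maxLength) : base;
        int counter = 1;
        while (taken.contains(candidate)) {
            String suffix = String.valueOf(counter++);
            int keep = Math.min(base.length(), maxLength - suffix.length());
            if (keep < 1) {
                // The counter no longer fits next to any part of the base name
                throw new IllegalStateException("No free username within " + maxLength + " characters");
            }
            candidate = base.substring(0, keep) + suffix;
        }
        return candidate;
    }
}
```

With `taken` containing `jsmith` and `jsmith1`, `makeUnique("jsmith", taken, 12)` returns `jsmith2`. In a real rule, the `taken.contains(...)` call would be replaced by the actual uniqueness check available in your environment.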

Why this fixes your validator error

  • No public on removeDiacritics → Beanshell exposes it as a top-level function the validator can find.

  • ASCII quotes only → avoids hidden character issues.

  • Proper imports for Normalizer and Pattern.

  • \\p{M}+ is correctly escaped for a Java/Beanshell string.

  • Avoids generics in the Map declaration to keep the Beanshell parser happy.

  • Declared before it’s called.
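If you want to check the cleanup logic on your local machine before pasting it into the rule, the two steps (hand-mapped special letters first, then NFD decomposition and removal of combining marks) can be exercised as plain Java. This is a minimal sketch with only a subset of the table above; the class name `DiacriticsDemo` is made up for the example, and it uses generics and `static`, so it is for local testing only, not for pasting into the rule.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class DiacriticsDemo {
    // Letters that NFD cannot decompose, mapped by hand (subset of the full table)
    private static final Map<String, String> SPECIAL = new LinkedHashMap<>();
    static {
        SPECIAL.put("\u00C6", "AE"); // Æ
        SPECIAL.put("\u00E6", "ae"); // æ
        SPECIAL.put("\u00D8", "O");  // Ø
        SPECIAL.put("\u00F8", "o");  // ø
        SPECIAL.put("\u0141", "L");  // Ł
        SPECIAL.put("\u0142", "l");  // ł
        SPECIAL.put("\u00DF", "ss"); // ß
    }

    // Matches one or more Unicode combining marks
    private static final Pattern MARKS = Pattern.compile("\\p{M}+");

    static String removeDiacritics(String input) {
        if (input == null) return null;
        String s = input;
        // Step 1: replace the hand-mapped letters
        for (Map.Entry<String, String> e : SPECIAL.entrySet()) {
            s = s.replace(e.getKey(), e.getValue());
        }
        // Step 2: split accented letters into base letter + mark, then drop the marks
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeDiacritics("J\u00F6rg"));      // Jorg
        System.out.println(removeDiacritics("\u0141ukasz"));    // Lukasz
        System.out.println(removeDiacritics("Strau\u00DF"));    // Strauss
        System.out.println(removeDiacritics("S\u00F8ren"));     // Soren
    }
}
```

The Unicode escapes (`\u00F6` and so on) keep the source file pure ASCII, which sidesteps file-encoding problems when copying the test between editors.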

Cheers!!!

Hi @shrutisangrulkar, I appreciate this isn’t the question you’re asking, but why not use the OOB Decompose Diacritical Marks transform?

The only issue I found was that Swedes don’t like umlauted vowels being suffixed with an e (i.e. Jörg = Jorg in Sweden, but Jörg = Joerg in Germany), but that can be caught prior to the transform.

@j_place to handle that kind of scenario as well, you can use a Replace All transform first and then pass its output to the Decompose Diacritical Marks transform.
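The ordering matters: the locale-specific replacement has to run before the marks are stripped, because afterwards the umlaut is gone. Here is a rough plain-Java analogue of that "replace first, then decompose" chain. It is a sketch of the idea, not how the transforms are implemented, and the `germanStyle` flag and the class name `UmlautDemo` are hypothetical.

```java
import java.text.Normalizer;

class UmlautDemo {
    // German convention: expand umlauted vowels to vowel + e
    static String expandGermanUmlauts(String s) {
        return s.replace("\u00C4", "Ae").replace("\u00D6", "Oe").replace("\u00DC", "Ue")
                .replace("\u00E4", "ae").replace("\u00F6", "oe").replace("\u00FC", "ue");
    }

    // Decompose accented letters and drop the combining marks
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    static String toAscii(String name, boolean germanStyle) {
        if (name == null) return null;
        // Compose first so that "o + combining diaeresis" is seen as a single ö
        String s = Normalizer.normalize(name, Normalizer.Form.NFC);
        if (germanStyle) {
            s = expandGermanUmlauts(s); // must happen before the marks are removed
        }
        return stripMarks(s);
    }
}
```

Here `toAscii("Jörg", true)` gives `Joerg`, while `toAscii("Jörg", false)` gives `Jorg`.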

Hi Sukanta Biswas,

Thanks for checking. We (I am on Shruti’s team) tried as suggested, but it is still not working for us.

Thanks,
Naresh

Hi Naresh,

It is hard to tell why it is not working for you unless you share what you tried and which failures you are getting.

For example, you could try something like this:
{
  "name": "RemoveAccentsTransform",
  "type": "string",
  "attributes": {
    "input": {
      "type": "string",
      "source": "accountAttribute",
      "name": "displayName"
    },
    "replaceAll": [
      { "match": "[àáâãäå]", "replacement": "a" },
      { "match": "[èéêë]", "replacement": "e" },
      { "match": "[ìíîï]", "replacement": "i" },
      { "match": "[òóôõö]", "replacement": "o" },
      { "match": "[ùúûü]", "replacement": "u" },
      { "match": "[ýÿ]", "replacement": "y" },
      { "match": "[ñ]", "replacement": "n" },
      { "match": "[ç]", "replacement": "c" },
      { "match": "[ÀÁÂÃÄÅ]", "replacement": "A" },
      { "match": "[ÈÉÊË]", "replacement": "E" },
      { "match": "[ÌÍÎÏ]", "replacement": "I" },
      { "match": "[ÒÓÔÕÖ]", "replacement": "O" },
      { "match": "[ÙÚÛÜ]", "replacement": "U" },
      { "match": "[ÝŸ]", "replacement": "Y" },
      { "match": "[Ñ]", "replacement": "N" },
      { "match": "[Ç]", "replacement": "C" }
    ]
  }
}
Thanks
Pradeep
