We are seeing a lot of errors in Certifications at the moment. We have a few active certifications with over a million certification items each. This is a screenshot of a few errors users are getting while attempting to save the certification. What could be the cause of this? Is this because of slow processing? We are also seeing severe issues with UI performance.
If you have a single Certification object with over a million CertificationItems associated with it, then I’m not surprised. Processing that many CertificationItems under a single Certification is going to be at best extremely time-consuming and at worst impossible given the memory requirements.
I believe the message you’re seeing appears when the user tries to save a Certification while it is locked.
I would:
Check whether any tasks that process Certifications are running, especially “Check Expired Work Items” and tasks created using the “System Maintenance” task template that have any of the Certification-related options checked.
If there are no Certification-related tasks running, collect thread dumps from all of your servers and look for Certification-related classes. I search for “Cert” in the stack traces.
If there is nothing Certification-related running in your environment, then whatever locked the Certifications has probably died from memory exhaustion. In this case, the lock should expire after at most a couple of hours.
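For the thread dump step, something like the following could save some scrolling. It is only a rough sketch: it assumes the dumps are plain-text files in the usual jstack layout (a quoted thread name on the first line of each entry, stack frames starting with "at", entries separated by blank lines). The function name find_cert_threads is my own and is not part of IdentityIQ or any SailPoint tooling.

```python
import re
import sys


def find_cert_threads(dump_text, needle="Cert"):
    """Return (thread_name, matching_frames) for every thread whose stack mentions needle.

    Assumes jstack-style text: thread entries separated by blank lines,
    each starting with a quoted thread name, frames prefixed with "at ".
    """
    results = []
    for block in re.split(r"\n\s*\n", dump_text):
        lines = block.strip().splitlines()
        # Skip headers, JNI summaries, and anything that is not a thread entry.
        if not lines or not lines[0].startswith('"'):
            continue
        match = re.match(r'"([^"]*)"', lines[0])
        name = match.group(1) if match else lines[0]
        frames = [
            line.strip()
            for line in lines[1:]
            if line.strip().startswith("at ") and needle in line
        ]
        if frames:
            results.append((name, frames))
    return results


if __name__ == "__main__":
    # Usage (hypothetical file names): python find_cert_threads.py server1.txt server2.txt
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for name, frames in find_cert_threads(handle.read()):
                print(f"{path}: {name}")
                for frame in frames:
                    print(f"    {frame}")
```

The match is case-sensitive, so it picks up class names such as anything containing "Certification" but not lowercase package names. If it prints nothing for any server, that points toward the case where whatever took the lock is no longer running.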
All of that said, if you have a single Certification object with over a million CertificationItems, I don’t think there is any hope of IdentityIQ performing smoothly in that situation. I think you may need to contact SailPoint support to get assistance with purging this Certification from your system. After that, you have to address a more fundamental problem, which is that there is no way that a single certifier or small group of certifiers is going to be able to review that many line items at once. They will have to rubber-stamp things, which will not satisfy an auditor who is paying attention. You will have to figure out a way to break this up into smaller chunks and ensure that the workload is distributed across a sufficient number of individuals.