Hi Sailors,
We are seeing a problem in our production environment and would appreciate input from the community.
The Perform Maintenance task, with only the option below enabled, runs for more than 7-8 hours to process around 3000 workflow events. Is there any way to speed up the processing of these events?
Impact: Requesters are getting access very late. It takes almost 10-12 hours for the approval process to kick in, and provisioning then takes another few hours.
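As a rough back-of-the-envelope check, the figures quoted above can be turned into a throughput estimate. This is only a sketch: it assumes the midpoint of the reported 7-8 hour range and treats every event as equally expensive, and the function name is hypothetical.

```python
def events_per_hour(total_events: int, hours: float) -> float:
    """Average number of workflow events processed per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_events / hours


# Figures quoted in the post: around 3000 events in 7-8 hours.
EVENTS = 3000
RUN_HOURS = 7.5  # assumption: midpoint of the reported 7-8 hour range

rate = events_per_hour(EVENTS, RUN_HOURS)   # events per hour
seconds_per_event = 3600 / rate             # wall-clock seconds per event

print(f"{rate:.0f} events/hour, {seconds_per_event:.1f} s per event")
```

Under these assumptions the task clears roughly 400 events per hour, or one event every 9 seconds of wall-clock time across all workflow threads combined.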
Perform Maintenance task settings (scheduled to run every 5 minutes, with only the option below enabled):
Process background workflow events: enabled
Number of background workflow threads: 4 (increased to 8, but the processing time is still the same)
Partitioning: enabled
Server specs: 6 task servers with a good amount of CPU and RAM
Other information
Application account and group aggregations and identity refresh run for almost 12-13 hours every day.
We can see a small number of WorkflowCases and WorkItems getting locked via the iiq console (we unlocked them using the iiq console unlock command).
We ran the database performance tests as suggested in the IdentityIQ KB article (stats below): https://community.sailpoint.com/t5/Other-Documents/IdentityIQ-Database-Performance-Tests/ta-p/78060
Meter IIQDB-Test-DataSet-1k-Item: 1000 calls, 20887 milliseconds, 7 minimum, 118 maximum, 20 average, top five [118,68,66,62,59]
Meter IIQDB-Test-DataSet-4k-Item: 1000 calls, 17841 milliseconds, 10 minimum, 86 maximum, 17 average, top five [86,78,56,51,47]
Meter IIQDB-Test-DataSet-8k-Item: 1000 calls, 20312 milliseconds, 13 minimum, 56 maximum, 20 average, top five [56,55,50,37,37]
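To compare the three meter lines above more easily, they can be parsed and the per-call average recomputed from the total time and call count. The sketch below is a minimal helper written for this post; `parse_meter` and the regular expression are hypothetical, not part of IdentityIQ, and they assume the line layout shown above.

```python
import re

# Assumed layout, taken from the three meter lines quoted in the post.
METER_RE = re.compile(
    r"Meter (?P<name>\S+): (?P<calls>\d+) calls, (?P<total_ms>\d+) milliseconds, "
    r"(?P<min>\d+) minimum, (?P<max>\d+) maximum, (?P<avg>\d+) average"
)


def parse_meter(line: str) -> dict:
    """Parse one meter line and add the average recomputed from total/calls."""
    match = METER_RE.search(line)
    if match is None:
        raise ValueError(f"not a meter line: {line!r}")
    fields = match.groupdict()
    stats = {"name": fields.pop("name")}
    stats.update({key: int(value) for key, value in fields.items()})
    stats["computed_avg_ms"] = stats["total_ms"] / stats["calls"]
    return stats


LINES = [
    "Meter IIQDB-Test-DataSet-1k-Item: 1000 calls, 20887 milliseconds, 7 minimum, 118 maximum, 20 average, top five [118,68,66,62,59]",
    "Meter IIQDB-Test-DataSet-4k-Item: 1000 calls, 17841 milliseconds, 10 minimum, 86 maximum, 17 average, top five [86,78,56,51,47]",
    "Meter IIQDB-Test-DataSet-8k-Item: 1000 calls, 20312 milliseconds, 13 minimum, 56 maximum, 20 average, top five [56,55,50,37,37]",
]

for line in LINES:
    s = parse_meter(line)
    print(f"{s['name']}: {s['computed_avg_ms']:.1f} ms/call "
          f"(reported {s['avg']}, max {s['max']})")
```

For these three lines, the reported average matches the total divided by the call count with the fraction dropped, so the recomputed value only adds a little precision.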
Access request workload: 1500 per day (role/entitlement requests)
We can't enable logs and are not able to reproduce the load in other environments.
Please suggest what the next steps should be.
How many servers do you have for processing? - 6 task servers with 8 CPUs and 32 GB RAM
How many threads do you have available to you? -
Aggregate Partition: max threads - 8
Identity Refresh Partition: max threads - 8
Role Propagation Partition: max threads - 1
WorkItem Maintenance Request: max threads - 1
Certification Builder: max threads - 4
Certification Maintenance Request: max threads - 1
Are you running things in Foreground or Background? - background
Perform Maintenance is processing, but it is very slow. Right now the number of background workflow threads available to process events is set to 4.
I don't think there is any thread contention.
How can we find out whether there is any?
We resolved the issue by changing the following Perform Maintenance task options:
Number of background workflow threads: increased from 4 to 7
Finisher threads: also increased from 4 to 7
Workflow thread timeout (seconds): 600
We added the timeout parameter because we saw that some workflow cases were taking a long time to process, as they are larger than 5 MB.
Partitioning enabled: yes
With the above options, the queue was cleared.
This combination of options was needed to resolve the issue.
With the timeout value set on the Perform Maintenance task, the larger access-request workflows get interrupted, so Perform Maintenance completes faster.
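To illustrate why a per-workflow timeout can shorten the overall run, here is a toy queue model. It is not how IdentityIQ schedules work internally; `makespan`, the workload mix, and the task durations are all hypothetical, and the model ignores what happens to an interrupted workflow afterwards.

```python
import heapq


def makespan(durations, threads, timeout=None):
    """Seconds needed to drain a queue of tasks on a fixed pool of threads.

    Each task is handed to whichever thread becomes free first. If `timeout`
    is given, a task is cut off after that many seconds, which models an
    oversized workflow being interrupted instead of holding its thread.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    free_at = [0.0] * threads          # time at which each thread is next free
    heapq.heapify(free_at)
    for duration in durations:
        if timeout is not None:
            duration = min(duration, timeout)
        start = heapq.heappop(free_at)  # earliest available thread
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical mix: a few very slow (large) workflow cases ahead of many small ones.
workload = [7200.0] * 10 + [5.0] * 2990

before = makespan(workload, threads=4)               # original settings
after = makespan(workload, threads=7, timeout=600)   # settings from the fix

print(f"4 threads, no timeout:    {before / 3600:.2f} h")
print(f"7 threads, 600 s timeout: {after / 3600:.2f} h")
```

In this model a handful of oversized cases can occupy every thread while the small events wait behind them; capping each case at 600 seconds bounds how long any one of them can hold a thread.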