We’ve been working with SailPoint IdentityIQ for the last couple of weeks trying to stabilize Azure AD delta aggregation for a tenant with 400K+ targeted user accounts, and I wanted to validate our approach with the broader community.
What we have tried so far:
Removing PIM / privileged role aggregation (important for identity governance, but removed to isolate whether PIM was causing the issue)
Running delta on a single task server
Running delta on multiple task servers with user partitioning
Tuning thread counts and connector parameters in application XML
Validating that partitioning is successfully created and ROProcessors are running
Despite all of the above, delta aggregation does not consistently converge and remains in “DeltaSync in progress” state for extended periods, while full aggregation reliably completes within ~24 hours.
Based on this behavior, the recommendation we’ve received is to:
Create multiple logical Azure AD applications
Split the user population (e.g., core users vs privileged users, or segmented filters)
Use full aggregation for large populations and delta aggregation only for smaller, bounded scopes
My questions to the community:
For large Azure AD tenants (300K–500K+ users), is this a known and accepted architecture?
Have others found Azure AD delta aggregation to be unreliable at scale, even with partitioning?
Is relying on full aggregation (or a split full+delta model) considered best practice in these scenarios?
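To make the "DeltaSync in progress" symptom concrete, here is a minimal Python sketch of how a delta-token loop behaves. It assumes the connector's delta mode is built on Microsoft Graph delta queries, which I have not confirmed for our connector version. The `@odata.nextLink` and `@odata.deltaLink` keys are the ones Graph returns; `drain_delta`, `FAKE_PAGES` and the `max_pages` cap are hypothetical names for illustration only, and the canned pages stand in for real HTTP calls.

```python
from typing import Callable, Dict, List, Optional, Tuple


def drain_delta(
    fetch_page: Callable[[str], Dict],
    start_url: str,
    max_pages: int = 1000,
) -> Tuple[List[dict], Optional[str], int]:
    """Follow @odata.nextLink pages until a page carries @odata.deltaLink.

    Returns (changes, delta_link, pages_read). delta_link is None when the
    page cap is hit first, i.e. the sync round did not finish.
    """
    changes: List[dict] = []
    url = start_url
    pages_read = 0
    while pages_read < max_pages:
        page = fetch_page(url)
        pages_read += 1
        changes.extend(page.get("value", []))
        if "@odata.deltaLink" in page:
            # Round finished: this link is the token for the next delta run.
            return changes, page["@odata.deltaLink"], pages_read
        url = page["@odata.nextLink"]
    return changes, None, pages_read


# Hypothetical canned responses standing in for Graph HTTP calls.
FAKE_PAGES = {
    "start": {"value": [{"id": "u1"}, {"id": "u2"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "u3"}], "@odata.deltaLink": "delta-token-1"},
}

changes, delta_link, pages_read = drain_delta(FAKE_PAGES.__getitem__, "start")
print(len(changes), delta_link, pages_read)
```

If that assumption holds, a delta run only completes once a page with a delta link arrives, so counting pages and changes per run would at least show whether our sync is progressing slowly or not progressing at all.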
We have a similar issue: delta aggregation remains unreliable or takes a very long time to complete. Please let me know if you find a solution. Thank you!
For comparison, we have 68,000 Azure AD accounts.
A full aggregation, without partitioning and no changed accounts, takes just over an hour.
A delta aggregation when there are no changed accounts takes a few seconds.
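As a rough back-of-the-envelope comparison, the numbers quoted in this thread can be turned into accounts per second. This is only a sketch: it treats "just over an hour" as 1.0 hour, "400K+" as 400,000 and "~24 hours" as 24.0 hours, and the two environments differ in configuration (partitioning, PIM, schema), so the ratio is indicative at best. The function name is made up for illustration.

```python
def accounts_per_second(accounts: int, hours: float) -> float:
    """Average full-aggregation throughput in accounts per second."""
    return accounts / (hours * 3600)


# Figures taken from this thread, rounded as described above.
small_tenant = accounts_per_second(68_000, 1.0)    # 68,000 accounts in about 1 hour
large_tenant = accounts_per_second(400_000, 24.0)  # 400K+ accounts in about 24 hours

print(f"68K tenant:  {small_tenant:.1f} accounts/s")
print(f"400K tenant: {large_tenant:.1f} accounts/s")
print(f"ratio:       {small_tenant / large_tenant:.1f}x")
```

On those assumptions the smaller tenant processes roughly four times as many accounts per second, which suggests the 24-hour full aggregation is slower than tenant size alone would explain.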
Yes. Our lab tenant is close to 10% the size of production, and delta works well there because changes in the non-production environment are minimal, sometimes none at all. That cannot be compared with the volume of changes in real production data. I do not see a “Manage Exchange Online” schema anywhere in the application configuration for EntraID. Is there a specific way to find it?
Would you be able to share the information below, so we can help you further optimize the performance of the EntraID aggregations?
Total number of task servers in your environment.
Max threads configured for aggregation partitions.
Have you reviewed your partition definitions to ensure that the account load is evenly distributed across the partitions? We tuned this and saw a significant performance improvement. We have ~700k active accounts in EntraID.
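One quick way to check the distribution is to compare the largest partition against the average. Below is a minimal Python sketch; the partition labels and per-partition counts are hypothetical placeholders (only the ~700k total matches our tenant), and `partition_skew` is a made-up helper, not anything from IdentityIQ.

```python
from typing import Dict


def partition_skew(counts: Dict[str, int]) -> float:
    """Ratio of the largest partition to the mean size (1.0 = perfectly even)."""
    if not counts:
        raise ValueError("no partitions given")
    mean = sum(counts.values()) / len(counts)
    if mean == 0:
        return 1.0
    return max(counts.values()) / mean


# Hypothetical account counts per partition filter; replace with real numbers.
example = {"A-F": 310_000, "G-L": 150_000, "M-R": 140_000, "S-Z": 100_000}

skew = partition_skew(example)
print(f"largest partition is {skew:.2f}x the average")
```

A partitioned run cannot finish before its largest partition does, so a skew well above 1.0 means extra threads or task servers sit idle while one partition is still running.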
We have 6 task servers in total, the max threads configured for aggregation partitions is 8, and the EntraID application has “max-thread-account-membership” set to value="10", as recommended by SailPoint.
Right now, our plan is to configure user partitioning with the PIM module disabled and the schema attributes trimmed (specifically removing lastlogontimestamp), then run full and delta aggregation in a multi-server task configuration with partitioning enabled.
Yes Neelamadhav, we removed lastlogontimestamp as our first attempt and didn’t see much progress. What do you mean by risk schema attributes here?
We had a customization rule for lastlogontimestamp, but at one point we removed that as well and tried delta. All this time we were running delta from a single server because of the PIM requirement: SailPoint’s recommendation is to use a single node when PIM is enabled. Now that we know the PIM API calls put a heavy load on aggregation, we have decided to remove PIM from our configuration completely, enable only the necessary attributes with user partitioning, and see whether full and delta performance improve.
We don’t have any Exchange-related settings configured for Entra in any of our configurations. In fact, as I said earlier, I don’t even see those schemas in application.xml.
@RajeshPalani By risk attributes I meant riskLevel, riskState, riskDetails, etc. I think IIQ makes a chained API call internally to fetch these attributes, which adds more delay.
@RajeshPalani Is it possible for you to give it a try with the Web Services connector? I have seen cases where OOTB connectors were not working as expected and clients configured the integration using Web Services instead, for example for Coupa and Slack.
Try opening a case with SailPoint as well, in case they can provide support.
Thanks Neel, we can absolutely try the Web Services connector as an alternative, but I am also hearing from other customers and organizations that the OOTB connector for EntraID works well with large user bases of around 400k accounts. We have also submitted a case with SailPoint and received some advice and recommendations; however, we haven’t been able to make much progress with aggregation.
Thanks for your recommendations. Appreciate it. Will keep you posted if anything changes.