TaskResult Monitor UpdateProgress seems to save in memory objects?

TrailBear · January 16, 2022, 11:45am

We are seeing some unexpected behavior when a rule executes taskMonitor.updateProgress. The in-memory configuration of an altered Microsoft Active Directory application gets saved over the version in the database.

I think this method must be saving anything cached in memory when it executes. Is that expected behavior, a side effect?

Details: The application has 14 domains configured. The code knows which domain to search, and is processing several batches of accounts sorted by domain. So, we get the application, narrow down the options, do the work and repeat for each group. We do this for groups of accounts in different domains, referring back to the original application’s settings each time, and selecting the correct domainSettings to match the accounts being processed.

This works as expected until we insert a progress report via the monitor’s updateProgress method between each pass. At this point the altered configuration gets saved to the database.

Pragmatically, once we narrowed down the behavior, we proactively decache the application right before reporting progress, then get a new reference to it afterwards.

We’d like to understand what is going on, and whether this is a signal that we’re doing something wrong. Our worry is that perhaps a similar thing is happening to other objects in scenarios where the consequences might be more subtle. Which in turn may be causing some insidious bug that we have yet to uncover.

adam_creaney · January 18, 2022, 8:35pm

hi @TrailBear - welcome to the community!

Are you able to post the full code of the rule (with any sensitive information redacted) so I can take a look at what kind of rule is executing, as well as your logic?

TrailBear · January 18, 2022, 10:30pm

The actual full code is about 900 lines long, but here’s a similar issue where the use of monitor.updateProgress seems to destroy an iterator. I’ll try to prepare separate test code that replicates the original issue, but I’m up against a deadline just now.

The following rule needs to be run from a “run rule” task.

Near the end is a commented-out line that calls updateProgress.
Run with that line commented out until the rule succeeds.
- Then the task result should contain a list of the managed attributes in your system.
Then uncomment the line and run again.
- In our case, we see the error GenericJDBCException: could not get next iterator

  import java.util.ArrayList;
  import java.util.List;

  import java.util.Iterator;
  import sailpoint.object.Application;
  import sailpoint.object.Bundle;
  import sailpoint.object.ManagedAttribute;
  import sailpoint.object.QueryOptions;
  import sailpoint.object.Filter;
  import sailpoint.tools.Message;
  import sailpoint.tools.GeneralException;
  import sailpoint.tools.Util;

  import sailpoint.task.TaskMonitor;

  import org.apache.log4j.Logger;
  import org.apache.log4j.Level;

  Logger log = Logger.getLogger("your logger name");
  TaskMonitor monitor = new TaskMonitor(context, taskResult);
  
  String appName = "your app name";

  QueryOptions qo = new QueryOptions();
  Filter filter = Filter.and(Filter.eq("application.name", appName), Filter.eq("attribute", "memberOf"));
  qo.addFilter(filter);
  qo.setOrderBy("value");

  Iterator iterMAs = null;
  try 
  { // prepare an iterator for all MAs in scope
    iterMAs = context.search(ManagedAttribute.class, qo);  
  } 
  catch (GeneralException e) 
  { 
    String errMsg = "Exception occurred while reading the managed attributes from IIQ: " + e.getMessage() ;
    log.debug(errMsg);
    return null;  
  }

  List maPropertiesList = new ArrayList();
  Integer maCount = 0;
  while (iterMAs.hasNext()) 
  { 
    maCount++;
    ManagedAttribute thisMA = (ManagedAttribute) iterMAs.next();
    String msg = "Found: " + thisMA.getNativeIdentity();
    taskResult.addMessage(new Message(Message.Type.Info, msg, null));
    // when uncommented the following line destroys the iterator
    // monitor.updateProgress(msg); // run with this line commented, then without
    context.decache(thisMA);
  }
  //flush iterator
  sailpoint.tools.Util.flushIterator(iterMAs);

  return true;

adam_creaney · January 19, 2022, 6:07pm

@TrailBear Okay - I think this is a pretty complex issue, so I’ll try to explain what is going on here.

Long story short - there was an update made to hibernate which causes code that utilizes ‘commitTransaction’ inside of an iterative loop to actually close the iterator. While your code isn’t specifically calling ‘context.commitTransaction()’ method, the ‘monitor.updateProgress(String msg)’ method does call commitTransaction.

I have a few suggestions - first, if you are only interested in getting the string for the ManagedAttribute that represents the native ID of the object - we can adjust your query to bring just that field back from the managed attribute objects, instead of the whole object itself. This would be an efficiency improvement - you can get more details here. We could then store those strings in a temporary in-memory collection ourselves and, once the iterator is exhausted, loop over that collection and call updateProgress() there.

Or, I believe you can also adjust your code like this:

  QueryOptions qo = new QueryOptions();
  Filter filter = Filter.and(Filter.eq("application.name", appName), Filter.eq("attribute", "memberOf"));
  qo.addFilter(filter);
  qo.setOrderBy("value");
  qo.setCloneResults(true); //TRY THIS ARGUMENT

  ArrayList colsToRead = new ArrayList(); //SPECIFY JUST THE NATIVEID TO RETURN
  colsToRead.add("nativeIdentity");

  Iterator iterMAs = null;
  try 
  { // prepare an iterator for all MAs in scope
    iterMAs = context.search(ManagedAttribute.class, qo, colsToRead);  
  } 

...

  while (iterMAs.hasNext()) 
  { 
    maCount++;
    Object[] thisRecord = (Object[]) iterMAs.next(); // a projection search returns one Object[] per row
    String thisMa = (String) thisRecord[0]; // nativeIdentity is the only requested column, so it is at index 0
    String msg = "Found: " + thisMa;
    taskResult.addMessage(new Message(Message.Type.Info, msg, null));
    monitor.updateProgress(msg);
    // no decache here: the projection returns plain values, not ManagedAttribute objects
  }

jeff_larson · January 19, 2022, 7:02pm

There are two things going on here. The original problem is most likely that Hibernate by default will flush to disk any object currently in the session cache that is detected as “dirty”, meaning a change of some kind was made to it. Once you modified the Application, and left it in the cache, any commit will flush it. There are a few ways around this; one is to decache the Application after (or before) you modify it so Hibernate will no longer look at it. Depending on what you are doing with this Application object you may need to call app.load() on it to bring in all the pieces that Hibernate would normally load on demand. I’m not sure what you are trying to accomplish here; modifying an object temporarily is unusual. Can you describe a bit about why you are doing this?

Stock tasks use a special option to prevent this from happening so I suspect you are using a custom task. If you want it to work more like how aggregation/refresh works do this when the task starts.

                PersistenceOptions ops = new PersistenceOptions();
                ops.setExplicitSaveMode(true);
                _context.setPersistenceOptions(ops);

What this does is disable automatic flushing by Hibernate. Any time you want an object saved
you must explicitly call _context.saveObject(obj) on it. Just modifying it will no longer be enough.

PersistenceOptions is in the sailpoint.object package.
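
To make the difference between the two flush behaviours concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions: `ToySession`, `Item`, `seedAndLoad` and `stored` are hypothetical stand-ins invented for illustration, not SailPoint or Hibernate classes, and real dirty checking is considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: nothing here is an IdentityIQ or Hibernate API.
final class ExplicitSaveSketch {

    static final class Item {
        final String id;
        String value;
        Item(String id, String value) { this.id = id; this.value = value; }
    }

    static final class ToySession {
        private final Map<String, String> database = new HashMap<>(); // id -> stored value
        private final List<Item> cache = new ArrayList<>();           // objects the session tracks
        private final Set<Item> explicitlySaved = new HashSet<>();
        private boolean explicitSaveMode = false;

        void setExplicitSaveMode(boolean on) { explicitSaveMode = on; }

        // Creates a row and returns a cached in-memory object for it.
        Item seedAndLoad(String id, String value) {
            database.put(id, value);
            Item item = new Item(id, value);
            cache.add(item);
            return item;
        }

        // Marks an object as one the caller really wants written.
        void saveObject(Item item) { explicitlySaved.add(item); }

        // Default mode: every dirty cached object is written.
        // Explicit mode: only dirty objects passed to saveObject are written.
        void commit() {
            for (Item item : cache) {
                boolean dirty = !item.value.equals(database.get(item.id));
                if (dirty && (!explicitSaveMode || explicitlySaved.contains(item))) {
                    database.put(item.id, item.value);
                }
            }
            explicitlySaved.clear();
        }

        String stored(String id) { return database.get(id); }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.setExplicitSaveMode(true);
        Item app = session.seedAndLoad("app", "14 domains");
        app.value = "1 domain";   // temporary in-memory change
        session.commit();         // stands in for the commit inside updateProgress
        System.out.println(session.stored("app"));
    }
}
```

In this model, a commit in the default mode writes the temporary change, while in explicit mode the stored value stays untouched until `saveObject` is called.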

TrailBear · January 19, 2022, 7:03pm

Right, this approach works for scenarios where iterators are involved. Depending on the data needed, I had modified my code in some places to do a projection query, and in other places to iterate over the objects. In both cases we build a disconnected structure of data to iterate over. Sometimes it is a set, sometimes a map, sometimes a set/array of maps.

In the case of working with an altered application, the only work around I came up with is batching the work. So, while iterating over a map of maps I have broken the data into batches of about 200 accounts. For each batch the code ensures the Application is decached, calls updateProgress, alters the application, Aggregates the batch, then decaches the modified Application.
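
For anyone wanting the batching part in isolation, a minimal sketch in plain Java follows. The `partition` helper and the class name are hypothetical, and the per-batch steps appear only as comments because they depend on the IdentityIQ calls discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most batchSize items.
final class BatchSketch {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> accounts = new ArrayList<>();
        for (int i = 0; i < 450; i++) {
            accounts.add("account-" + i);
        }
        for (List<String> batch : partition(accounts, 200)) {
            // Per batch, in the order described above:
            // 1. make sure the Application is decached
            // 2. report progress
            // 3. alter the Application for this batch's domain
            // 4. aggregate the batch
            // 5. decache the modified Application
            System.out.println("batch of " + batch.size());
        }
    }
}
```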

I also learned that no matter what I do to the Application in terms of copying it, it seems I’m really only getting a new pointer to the same Application in memory. Anything I do to one copy affects any copies or clones. An update to one, updates all. Decaching one, decaches all.

jeff_larson · January 19, 2022, 7:05pm

The second problem is the result of a Hibernate upgrade changing behavior. I believe we have a workaround for that, but I’m waiting for some details from a colleague.

jeff_larson · January 19, 2022, 7:14pm

Sort of. When you call context.getObject() on an object for the first time, it will be loaded from Hibernate and placed in what it calls the “session cache”. It will remain there until you decache it or return the context/session to the pool. If you call getObject on that object again, you will get a pointer to the object that is already in the cache. That’s the point of the cache, to prevent you from hitting the database every time you want a handle to the object.

The other state an object can be in is “detached”. The object still exists in memory and you can use it but it will not be in the session cache. If you call context.getObject on that it will load a NEW copy of the object since Hibernate does not know about detached objects. You have to be careful when you get into this state. The detached object must not be allowed to be put back into the session, such as by calling context.saveObject on it. This will result in a Hibernate exception along the lines of “Two representations of the same object found in session”, and the transaction cannot be committed.

To detach an object, call context.decache on it. But we’re getting into potentially dangerous territory here. context.decache(obj) will take the root object out of the session, but it has been known to leave some interior child objects and collections behind. In system code, the best practice is: before you start iterating, fetch all the objects you want to use for the duration, call obj.load() on them, then do a full context.decache(). This will ensure that the context/session starts out completely clean when you start iterating.
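
The cached-versus-detached distinction can be pictured with a toy model in plain Java. This is a sketch under stated assumptions, not how Hibernate is implemented: `SessionCacheSketch`, `App`, `insertRow` and `storedFilter` are hypothetical names, and the toy `commit` simply writes back everything still in the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: not an IdentityIQ or Hibernate API.
final class SessionCacheSketch {

    static final class App {
        final String name;
        String searchFilter;
        App(String name, String searchFilter) { this.name = name; this.searchFilter = searchFilter; }
    }

    private final Map<String, String> database = new HashMap<>(); // name -> stored filter
    private final Map<String, App> sessionCache = new HashMap<>();

    void insertRow(String name, String filter) { database.put(name, filter); }

    // First call loads from the "database"; later calls return the same cached instance.
    App getObject(String name) {
        App cached = sessionCache.get(name);
        if (cached != null) {
            return cached;
        }
        if (!database.containsKey(name)) {
            return null;
        }
        App loaded = new App(name, database.get(name));
        sessionCache.put(name, loaded);
        return loaded;
    }

    // Detaches the object: it stays usable in memory but the session forgets it.
    void decache(App app) { sessionCache.remove(app.name, app); }

    // Writes back whatever is still in the session cache.
    void commit() {
        for (App app : sessionCache.values()) {
            database.put(app.name, app.searchFilter);
        }
    }

    String storedFilter(String name) { return database.get(name); }

    public static void main(String[] args) {
        SessionCacheSketch session = new SessionCacheSketch();
        session.insertRow("AD", "(all domains)");
        App detached = session.getObject("AD");
        session.decache(detached);
        detached.searchFilter = "(one domain)"; // temporary change on the detached copy
        session.commit();
        System.out.println(session.storedFilter("AD"));
    }
}
```

In this model, two `getObject` calls hand back the same instance, which matches the "an update to one updates all" observation, and a detached copy can be modified freely without a later commit persisting it.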

TrailBear · January 19, 2022, 7:28pm

Right, I agree that this is unusual. Yes, this is using the Run Rule parent task, and passing a domain specific list of resource objects to an Aggregator’s aggregate(Application, List) method. To get that list of resource objects we are customizing the Application to return only the accounts in scope, using this to get a connector ConnectorFactory.getConnector(accountOptimizedApp, null) and then using the connector’s iterateObjects(“account”, null, ops) method.

We need to aggregate only a small fraction of the accounts and groups in our Active Directory, but the population is determined by whether the person has come to our system asking for permission. We (appropriately) only have read permissions to Active Directory. This means there’s nothing natively in Active Directory to use as criteria save for the name of the account or group.

There are 14 domains in four forests to look through, but we know the domain before we do the aggregation. So, rather than search 14 domains, we alter the application to search just one domain, and set its iterateSearchFilter to criteria that will locate the account(s)/group(s). Typically using a combination of distinguishedName, or sAMAccountName, and other properties like UserAccountControl, objectCategory, and/or objectClass.

Maybe there’s a better way to use the connector, or to get status back from the Aggregator, or put that status in the UI.

I think I hear you saying that if I decache an object, that I can reload it with .load()? That might be an interesting approach.

jeff_larson · January 19, 2022, 7:29pm

There are a few ways around the iteration with commit problem. I’m waiting for
details on one of them but I can describe two others.

The first is to use sailpoint.api.IdIterator instead of calling context.search() and iterating over that result set.

class sailpoint.api.IdIterator {

public <T extends SailPointObject> IdIterator(SailPointContext context, Class<T> cls, QueryOptions ops)
    throws GeneralException {

The interface is similar to context.search(): you provide the class you want to iterate on
and a QueryOptions object that contains any filters you may want. In this example I won’t include
any filters.

 IdIterator it = new IdIterator(context, Identity.class, null);

What this does is read in ALL of the id columns for rows in the Identity table. You can
then iterate over this committing as you go and the result will be preserved because we
no longer have an open cursor.

while (it.hasNext()) {

    String id = it.next();
    Identity obj = context.getObjectById(Identity.class, id);

    ... do something with the identity

}

For any type of iteration over a large number of objects it is important to do a periodic
full session decache in order to prevent cache bloat. IdIterator will do this automatically
after every 100 objects. This can be changed.

A slightly easier interface is sailpoint.api.IncrementalObjectIterator. This wraps IdIterator and saves the step of calling getObject().

 IncrementalObjectIterator it = new IncrementalObjectIterator(context, Identity.class, options);
 while (it.hasNext()) {

    Identity obj = it.next();

    ...

 }

With either of these, you will always have a stable result to iterate over and can commit or rollback within the loop.
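
Why reading the ids up front gives a stable result can be shown with a small toy model in plain Java. It is a sketch only: `ToySession`, `openCursor` and both processing methods are hypothetical stand-ins, and the "commit closes open cursors" rule is hard-coded to mimic the behaviour described in this thread rather than taken from Hibernate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model only: not an IdentityIQ or Hibernate API.
final class SnapshotIterationSketch {

    static final class ToySession {
        private final List<String> ids;
        private int commits = 0; // incremented on every commit

        ToySession(List<String> ids) { this.ids = new ArrayList<>(ids); }

        void commit() { commits++; }

        // A cursor that becomes unusable once a commit happens after it was opened.
        Iterator<String> openCursor() {
            final int openedAt = commits;
            final Iterator<String> inner = ids.iterator();
            return new Iterator<String>() {
                @Override public boolean hasNext() { check(); return inner.hasNext(); }
                @Override public String next() { check(); return inner.next(); }
                private void check() {
                    if (commits != openedAt) {
                        throw new IllegalStateException("cursor closed by commit");
                    }
                }
            };
        }
    }

    // Commits inside the loop while the cursor is still open: fails on the second pass.
    static int commitWhileCursorOpen(ToySession session) {
        int processed = 0;
        Iterator<String> cursor = session.openCursor();
        while (cursor.hasNext()) {
            cursor.next();
            processed++;
            session.commit();
        }
        return processed;
    }

    // Drains the cursor into a list first, then commits freely while walking the list.
    static int commitAfterSnapshot(ToySession session) {
        List<String> snapshot = new ArrayList<>();
        Iterator<String> cursor = session.openCursor();
        while (cursor.hasNext()) {
            snapshot.add(cursor.next());
        }
        int processed = 0;
        for (String id : snapshot) {
            processed++;
            session.commit(); // safe: no cursor is open any more
        }
        return processed;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession(Arrays.asList("id-1", "id-2", "id-3"));
        System.out.println("processed " + commitAfterSnapshot(session));
    }
}
```

The trade-off is memory: the snapshot holds every id at once, which is usually acceptable for id strings but not for whole objects.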

jeff_larson · January 19, 2022, 7:33pm

Not exactly, obj.load() walks over the object hierarchy and makes sure that everything is fully brought in to the session cache. You typically do that before you decache it. That means you have a full copy of the object that Hibernate will not mess with. You can modify the object if you want but you normally do not save it. If you need to make a persistent change to the object, fetch a new object using context.getObject(), modify it and commit it. You can just let the detached object be garbage collected.

jeff_larson · January 19, 2022, 7:38pm

If I’m following this, you want to filter the results returned by the connector, but what determines that can’t be stored in the Application object. So you are wrapping Aggregator. First you do some magic to determine what the filter should be, then you load the Application, modify it with the necessary filter and aggregate that. I think this should work as long as you fully load the Application then call context.decache(). Now modify it however many times you need to and call Aggregator.

Having said that, I’m pretty sure there are customization rules that allow you to get control over what will be returned that is not part of the static Application definition. I’ll check on that.

jeff_larson · January 19, 2022, 7:50pm

Here’s a more formal article on using IncrementalObjectIterator:

https://community.sailpoint.com/t5/IdentityIQ-Wiki/IdentityIQ-8-0-and-commitTransaction-While-Using-an-Iterator/ta-p/143225

TrailBear · January 25, 2022, 8:47pm

Iteration

I will have a look at the IncrementalObjectIterator to see where that could be applied in our code.

Decaching

The way we are doing things now (decache() before updateProgress()) doesn’t save the altered configuration, but I see the modified date change.
I think you may be saying that, by calling the .load() on this application object just before decaching, Hibernate would ignore it and perhaps the modified date would be unaffected. Is that correct?

Application Customization

At this point the only way we know of to limit the number of domains that get searched, or to specify an LDAP query is to alter the application.

Domains to Search

We’d like to be able to specify the Application’s “domains to search” in two places

When using the connector to get Resource Objects.
a) By DN - the simple case, because the DN contains the domain.
b) By SAM Account Name, where we know which domain to search from some other interaction.
When Aggregating the list of Resource Objects.

LDAP Filter

We’d like to be able to specify the LDAP query for this domain. For example, to search by Distinguished Name we do something like this today, and then update the Iterate Search Filter for the one remaining domain in the customized app.

  StringBuilder iterateSearchFilterStr = new StringBuilder();   
    // !(UserAccountControl:1.2.840.113556.1.4.803:=2) means "not disabled"
    iterateSearchFilterStr.append("(&(objectCategory=user)(objectClass=user)(!(UserAccountControl:1.2.840.113556.1.4.803:=2))(|");
    for (String memberDN : memberDNs) 
    { // add the encoded distinguishedName of each account
      iterateSearchFilterStr.append("(distinguishedName="+memberDN.replace("\\", "\\5C").replace("*", "\\2A").replace("(", "\\28").replace(")", "\\29").replace("\000", "\\00") +")");
    }
    iterateSearchFilterStr.append("))");
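
The same filter construction can be written as a small standalone helper. This is a sketch, not the code from the rule: the class and method names are hypothetical, and it escapes the five characters handled above (backslash, `*`, `(`, `)`, NUL) in a single pass so the result does not depend on the order of chained `replace` calls.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper names; only the filter text mirrors the snippet above.
final class LdapFilterSketch {

    // Escapes the characters that are special inside an LDAP filter value.
    static String escapeFilterValue(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': out.append("\\5c"); break;
                case '*':  out.append("\\2a"); break;
                case '(':  out.append("\\28"); break;
                case ')':  out.append("\\29"); break;
                case '\0': out.append("\\00"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Builds "enabled user AND (dn1 OR dn2 OR ...)" for the given distinguished names.
    static String buildDnFilter(List<String> memberDNs) {
        StringBuilder filter = new StringBuilder(
            "(&(objectCategory=user)(objectClass=user)"
            + "(!(UserAccountControl:1.2.840.113556.1.4.803:=2))(|");
        for (String dn : memberDNs) {
            filter.append("(distinguishedName=").append(escapeFilterValue(dn)).append(")");
        }
        filter.append("))");
        return filter.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildDnFilter(Arrays.asList(
            "CN=Smith\\, Jane (Admin),OU=Users,DC=example,DC=com")));
    }
}
```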


mikechung · March 28, 2022, 1:45pm

Hello @jeff_larson,

Thanks a lot for your great explanation. May I ask: when we are using IdIterator, do we also need to call Util.flushIterator on it, or is calling it on the original iterator sufficient?

Thanks and Regards,
Mike
