Why the server choked
| Step in your rule | What really happens | Cost |
|---|---|---|
| `context.search("sql … LIKE '%NativeChangeDetection%'")` | Hibernate executes the raw SQL and hydrates a full Identity object for every matching row, not just the id. | ≈ 100 k objects × dozens of columns ⇒ tens of GB on-heap |
| `IteratorUtils.toList(iterate)` | Pulls the whole result set into a Java ArrayList all at once. | 1 object wrapper + 1 array slot + 1 Hibernate proxy per row ⇒ heap blow-up |
| No decache / batching | All those objects remain pinned in the session cache; GC thrashes, CPU spikes, OOM kills the JVM. | Crash |
The log line never printed because the JVM spent its time allocating, paging and finally dying before it reached that statement.
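For reference, the failing shape reduces to something like the sketch below. This is an illustrative reconstruction, not your actual rule: the query text and the string-based context.search overload are assumptions on my part (written in HQL form so the full-object hydration from the table is visible), but the IteratorUtils.toList call is the step that pins everything on the heap.

```java
import java.util.Iterator;
import java.util.List;
import org.apache.commons.collections.IteratorUtils;
import sailpoint.object.Identity;

// Illustrative query only; the original rule's SQL is not reproduced here.
// An object-level query like this makes Hibernate hydrate a full Identity per matching row.
Iterator results = context.search(
        "from Identity i where i.attributes like '%NativeChangeDetection%'", null, null);

// The fatal step: the entire result set is copied into one in-memory list,
// so roughly 100 k hydrated Identities (plus Hibernate proxies) sit on the heap at once.
List cubes = IteratorUtils.toList(results);

for (Object o : cubes) {
    Identity cube = (Identity) o;
    // ...business logic...
    // no decache, no intermediate commits: the session cache only ever grows
}
```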
A pattern that will survive 100 k+ cubes
```java
import sailpoint.api.IdIterator;
import sailpoint.object.Filter;
import sailpoint.object.Identity;
import sailpoint.object.QueryOptions;
import sailpoint.tools.Util;

QueryOptions qo = new QueryOptions();
// identical semantics, but let Hibernate build the SQL for you
qo.addFilter(Filter.like("attributes", "NativeChangeDetection"));

IdIterator it = new IdIterator(context, Identity.class, qo);   // streams only the ID column
int processed = 0;
try {
    while (it.hasNext()) {
        String id = (String) it.next();                        // just a GUID, negligible memory
        Identity cube = context.getObjectById(Identity.class, id);

        // …your business logic here…

        context.decache(cube);                                 // keep the session small
        if (++processed % 500 == 0) {                          // batch size; tune as needed
            context.commitTransaction();                       // lets other threads run, releases locks
        }
    }
} finally {
    Util.flushIterator(it);                                    // closes the DB cursor
}
```
Key changes
1. Stream, don’t collect
   IdIterator (or its convenience wrapper IncrementalObjectIterator) fetches only the primary-key column, automatically decaches every 100 objects, and lets you commit/rollback inside the loop without losing your place (see “TaskResult Monitor UpdateProgress seems to save in memory objects? - #10 by jeff_larson”). A sketch of the IncrementalObjectIterator variant follows this list.
2. Projection, not hydration
   If you prefer context.search, call the overload that lists only the columns you need:
   `Iterator<Object[]> iter = context.search(Identity.class, qo, "id");`
   This avoids materialising full Identity objects until you explicitly call getObjectById (see “Context.search: Care and feeding of iterators” on Compass).
3. Decache aggressively
   context.decache(obj) (or a periodic full context.decache() every N records) prevents the session cache from ballooning.
4. Batch commits
   A small commitTransaction every few hundred objects shortens database locks and gives the GC a chance to clean up.
5. Flush the iterator
   Always call Util.flushIterator (or use the try/finally above) so JDBC cursors are closed even on exceptions (see “Context.search: Care and feeding of iterators” on Compass).
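As promised in item 1, here is a minimal sketch of the IncrementalObjectIterator variant for anyone who prefers working with whole objects instead of ids. The constructor signature is my assumption from memory, so verify it against the javadoc of your IIQ version:

```java
import sailpoint.api.IncrementalObjectIterator;
import sailpoint.object.Filter;
import sailpoint.object.Identity;
import sailpoint.object.QueryOptions;

QueryOptions qo = new QueryOptions();
qo.addFilter(Filter.like("attributes", "NativeChangeDetection"));

// Loads the id list first, then fetches each Identity on demand,
// so only a handful of objects are attached to the session at any time.
IncrementalObjectIterator<Identity> cubes =
        new IncrementalObjectIterator<Identity>(context, Identity.class, qo);

int processed = 0;
while (cubes.hasNext()) {
    Identity cube = cubes.next();
    // …your business logic here…
    context.decache(cube);               // item 3: keep the session small
    if (++processed % 500 == 0) {
        context.commitTransaction();     // item 4: batch commits
    }
}
```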
Optional refinements
| Technique | When it helps |
|---|---|
| `qo.setCloneResults(true)` | You must call `commitTransaction()` inside the loop but are stuck with a standard iterator (see the sketch below the table). |
| Partition the workload (multiple “Run Rule” tasks, each filtering on an id range or shard key) | Clustered deployments where you want predictable runtimes and lower per-node memory. |
| Create a database index on the attributes column (or split the flag into its own column) | If the `LIKE '%NativeChangeDetection%'` scan itself is the bottleneck. |
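To make the first row concrete, here is a rough sketch of the setCloneResults(true) refinement with the standard projection search: cloning copies each row out of the live result set, which is what lets the commit inside the loop happen without invalidating the iterator. Treat it as a sketch, not a drop-in:

```java
import java.util.Iterator;
import sailpoint.object.Filter;
import sailpoint.object.Identity;
import sailpoint.object.QueryOptions;
import sailpoint.tools.Util;

QueryOptions qo = new QueryOptions();
qo.addFilter(Filter.like("attributes", "NativeChangeDetection"));
qo.setCloneResults(true);   // rows are detached from the cursor, so mid-loop commits are safe

Iterator<Object[]> rows = context.search(Identity.class, qo, "id");   // projection: id column only
int processed = 0;
try {
    while (rows.hasNext()) {
        String id = (String) rows.next()[0];
        Identity cube = context.getObjectById(Identity.class, id);
        // …your business logic here…
        context.decache(cube);
        if (++processed % 500 == 0) {
            context.commitTransaction();
        }
    }
} finally {
    Util.flushIterator(rows);            // close the cursor even on exceptions
}
```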
TL;DR
Load only what you need, process in a streaming loop, decache, flush, and batch-commit.
Following the pattern above, 100 k, 1 M or even more cubes can be processed without exhausting heap or hanging the server.
Cheers!!!