Tricks and Tips for custom report performance?

Hi Developers!

I’ve been trying to troubleshoot a custom report, which is based on an out-of-the-box report with a few customizations. The report contains about 14K results and starts off running extremely fast, but as more and more results are generated, performance slows down, almost exponentially. Any ideas on things to look out for to keep performance steady for the entire duration of the run?

What are the customizations, and which report is it based on?

It’s based on the Revocation Live Report, but our team has highly customized it. I’m currently going through the report to make any small performance improvements I can, but my bigger question is why the report starts out running fairly fast and then gets slower and slower as it continues. I think there is also a memory leak, in that I occasionally get Java errors saying there is no heap space left or that we’re out of memory. Our heap is huge, something like 16GB. I’m not a huge fan of nulling out objects in Java, as I prefer to let the GC do its job, but it seems the scope of some of these variables may require me to null objects out.
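For anyone hitting the same heap symptoms: a pattern that often helps in long-running IIQ custom code is to release references to large objects explicitly and periodically decache the context inside the loop. Below is only a minimal sketch, assuming a `SailPointContext` named `context` is in scope the way it is in datasource/column scripts, and that `getObjectById()` and `decache()` behave as they do in recent IIQ versions; the class and loop structure are made up for the example.

```java
import java.util.List;

import sailpoint.api.SailPointContext;
import sailpoint.object.WorkItemArchive;
import sailpoint.tools.GeneralException;

public class RowBuilderSketch {

    // Hypothetical long-lived reference; fields like this are the ones worth
    // nulling out so the GC can actually reclaim the big objects.
    private WorkItemArchive lastArchive;

    public void processResults(SailPointContext context, List<String> archiveIds)
            throws GeneralException {
        int count = 0;
        for (String id : archiveIds) {
            lastArchive = context.getObjectById(WorkItemArchive.class, id);

            // ... build the report row from lastArchive ...

            lastArchive = null;           // release the reference once the row is built
            if (++count % 100 == 0) {
                context.decache();        // clear the Hibernate session cache periodically
            }
        }
    }
}
```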

In large reports like that, the biggest memory hogs and time sinks are often the generation of the CSV and, especially, the PDF artifacts. I’m not sure if there’s a way to bypass that using the OOB report executors.

The reason it slows down as it generates is that it’s manipulating report result objects that keep getting larger, which takes increasing time and memory to process (and the increase is not linear).
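To put a rough number on “not linear”: if each new row causes the full set of rows generated so far to be re-touched (a purely hypothetical illustration, not the actual executor code), the total work for N rows is on the order of 1 + 2 + … + N ≈ N²/2 row-visits, which matches the “fast at first, crawling later” shape being described.

```java
import java.util.ArrayList;
import java.util.List;

public class QuadraticSketch {

    // Hypothetical anti-pattern only, to illustrate the shape of the slowdown:
    // if every appended row causes the accumulated list to be walked again,
    // 14K rows cost roughly 14,000^2 / 2 ≈ 98 million row-visits instead of 14,000.
    static long quadraticWork(List<Object> results) {
        long visits = 0;
        List<Object> rows = new ArrayList<>();
        for (Object next : results) {
            rows.add(next);
            for (Object earlier : rows) {
                visits++;   // stands in for any per-append work that re-touches earlier rows
            }
        }
        return visits;
    }
}
```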

@brian_weigel Do you know if concurrency is involved in generating report results? By that I mean: an initial mining of the data for the data source occurs, including any specified filtering, and an initial result set of objects is generated. Does the report executor then spin up multiple threads to process the columns for each result, or is it single-threaded? If it’s multi-threaded, my gut says it could be a database contention issue. I can see the progress bar on the results sometimes chug right along, processing a few hundred per minute, but then it just hangs for 10 to 15 minutes where it appears nothing is processed. My next step is going to be adding some detailed logging to the report columns to determine where the bottleneck may be.
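For the logging step, something like the following is what I had in mind: a minimal sketch of a column script wrapped with a timer, assuming Commons Logging is available the way it normally is in IIQ (the logger name and the `resolveColumn` helper are made up for the example).

```java
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ColumnTimingSketch {

    // Logger name is arbitrary; pick something easy to grep for in the IIQ logs.
    private static final Log perfLog = LogFactory.getLog("custom.report.revocation.perf");

    // resolveColumn() is a placeholder for whatever the column script actually does;
    // the wrapper just times it so the slow columns stand out in the log.
    Object timedColumn(String columnName, Object value) {
        long start = System.currentTimeMillis();
        Object result = resolveColumn(columnName, value);
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > 500) {              // only log the slow ones to keep the log readable
            perfLog.warn("Column '" + columnName + "' took " + elapsed + " ms");
        }
        return result;
    }

    private Object resolveColumn(String columnName, Object value) {
        return value;                     // stand-in for the real column logic
    }
}
```

Once the slow columns are obvious in the log, the tuning effort can focus there instead of guessing.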

@brian_weigel One more thing: you mentioned “especially PDF artifacts”, but we never use PDF, always CSV. Is there any way to turn off generation of the PDF artifacts?

As best I can tell, the OOB report generation is all single-threaded, but I’ll admit I haven’t dug that far into it in a long time.

And I believe I noted it above, but I’m pretty sure that the OOB report executors will always generate BOTH the CSV and PDF. The report option you see doesn’t affect that part as best I can tell - it only affects what file attachments are included in the report result notification email. If you view the report Task Result, you can still download both the CSV and PDF.

As a side note - It’s worth considering using the IIQ Data Export task to sync certification data to a separate database, which can then be used with a 3rd-party app to construct reports and dashboards. I’ve known several customers that have taken a similar approach (most notably syncing to a DB for ingestion and analytics via PowerBI).

After several hours of analysis, I was able to significantly improve the performance of this report by creating what I’ll call a “hybrid datasource”. The report itself is based on the OOTB Revocation Report, but we had requirements to add several additional fields that rely on larger data objects in the system, such as the WorkItemArchive, IdentityRequest, and IdentityEntitlement objects. We were of course using the API, projection queries, and Filters; however, when the rubber hits the road, we had no control over exactly how the query would be issued to the database.

My solution: for the report columns I suspected were hindering performance, I hand-crafted SQL queries, fetched a JDBC connection from the SailPointContext, and ran the queries against the DB directly. Add proper connection handling, error handling, and a couple of new DB indexes, and voilà, report performance improved by over 90 percent. Writing the SQL myself gave me much more control, and it’s probably something I’ll add to my toolbox moving forward. Granted, if the IIQ database structure were ever to change I’d have a problem on my hands, but the tables I’m querying are rather stable in my opinion.
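Roughly, the pattern looked like this. This is only a sketch: the table, column, and method names are placeholders (not my real queries), and it assumes `SailPointContext.getJdbcConnection()` hands back the context’s pooled connection, which is why only the statement and result set are closed, not the connection itself.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import sailpoint.api.SailPointContext;
import sailpoint.tools.GeneralException;

public class HybridDatasourceSketch {

    // Sketch: resolve one report column value with hand-written SQL.
    // The table and column names are placeholders, not the real schema or query.
    public String getRequesterForItem(SailPointContext context, String workItemId)
            throws GeneralException, SQLException {
        String sql = "select requester from spt_work_item_archive where id = ?";
        Connection con = context.getJdbcConnection();   // the context's pooled connection: don't close it
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, workItemId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```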

One thing that might impact that is the addition of PostgreSQL as a new DB type in 8.4.

I’d humbly suggest considering refactoring your SQL queries to instead be HQL (which runs the query against the Hibernate layer instead of the DB directly). This approach has two key benefits (which are related); see the sketch after the list:

  1. Instead of being dependent on the back-end DB structure, which can change, you’re querying against the IIQ object model (which is far more stable)
  2. Since you’re querying against the IIQ object model at the Hibernate layer, it’s agnostic to the DB back-end (Hibernate handles any needed translation for you)
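A minimal sketch of what that could look like, assuming the three-argument `context.search(String, Map, QueryOptions)` overload that accepts an HQL string is available in your version (worth double-checking against the javadocs; the query and property names are just an illustration, not the real column logic):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import sailpoint.api.SailPointContext;
import sailpoint.object.QueryOptions;
import sailpoint.tools.GeneralException;

public class HqlColumnSketch {

    // Same idea as the JDBC sketch above, but expressed as HQL against the IIQ
    // object model so Hibernate handles the translation to whichever DB is in use.
    public String getRequesterForItem(SailPointContext context, String workItemId)
            throws GeneralException {
        String hql = "select a.requester from WorkItemArchive a where a.id = :id";
        Map<String, Object> args = new HashMap<>();
        args.put("id", workItemId);

        Iterator<?> it = context.search(hql, args, new QueryOptions());
        if (it != null && it.hasNext()) {
            Object row = it.next();
            // a single-column projection may come back as the scalar itself
            // or wrapped in an Object[], depending on the version
            return (row instanceof Object[]) ? (String) ((Object[]) row)[0] : (String) row;
        }
        return null;
    }
}
```

The nice part is that the query is written against WorkItemArchive the object rather than spt_work_item_archive the table, so a back-end change like the PostgreSQL addition mentioned above shouldn’t break it.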

Hi, so with 8.4 and PostgreSQL the DB structure will change?

It’s hard to say, but I’d wager it’s pretty likely that the actual DB structure will be a bit different for PostgreSQL vs. the other SQL DBs.
