Introduction
One of the most common needs when working programmatically with the Identity Security Cloud (ISC, formerly IdentityNow) API is to get information about Identities. There are two different ways to get this data via API.
/identities API endpoint
While technically a pre-prod API, there is an /identities endpoint. It was originally released as https://<orgName>.api.identitynow.com/beta/identities and is now available at /v2025/identities with the X-SailPoint-Experimental header set to true.
This works in a standard RESTful way:
List Identities
GET /identities
lists all the identities, with the standard pagination using offset (>=0) and limit (<=250)
Identity Details
GET /identities/:id
lets you get one specific identity using the object guid (32-character hexadecimal string) as the value for the path variable :id
NOTE: the first endpoint returns an array of JSON objects, while the second just returns a single object. If the filter HTTP query parameter you apply to "List Identities" causes the endpoint to return only one identity, the result will still be an array with one element.
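To make the offset/limit pagination concrete, here is a minimal Python sketch. It assumes a hypothetical fetch_page(offset, limit) callable that wraps GET /identities and returns the parsed JSON array; that name, and the in-memory fake used to exercise it, are illustrations only and not part of any SailPoint SDK.

```python
from typing import Callable, Iterator

MAX_LIMIT = 250  # page-size ceiling stated above for /identities


def iter_identities(fetch_page: Callable[[int, int], list],
                    limit: int = MAX_LIMIT) -> Iterator[dict]:
    """Yield every identity by walking offset/limit pages until a short page."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        # Hypothetical wrapper around GET /identities?offset=<offset>&limit=<limit>
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:  # a short (or empty) page means we reached the end
            return
        offset += limit


# Stand-in for the real HTTP call so the sketch runs offline (made-up data).
_FAKE = [{"id": f"{i:032x}"} for i in range(600)]


def fake_fetch(offset: int, limit: int) -> list:
    return _FAKE[offset:offset + limit]


identities = list(iter_identities(fake_fetch))
print(len(identities))  # 600
```

In real code, fetch_page would issue the HTTP request with your bearer token and the experimental header; it is kept abstract here so the paging logic can be read on its own.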
Response Schema
Let's take a look at the response schema and point out some relevant details (see the example JSON below):
- alias: the "technical name" of the Identity (for those coming from IIQ: the "cube" name)
- emailAddress: this appears as a top-level attribute on the Identity, but it may also appear as one or more "attributes" based on how you configured the IdentityProfile [see "attributes" below]
- managerRef: the manager relationship is represented as a pointer to another Identity object. Each reference contains the minimum information needed to be useful to both machines and humans, namely managerRef.id and managerRef.name
- "status" properties (lifecycleState/processingState/identityStatus): both the set of properties returned and their possible values have changed over time, most notably when SailPoint introduced the concept of Identity State in addition to lifecycle state. As this is technically an experimental endpoint, I expect this to continue to change as concepts of status evolve. However, all of these are top-level properties in each result.
- attributes: this is a JSON object containing all the attributes defined in the Identity Profile that was used to create this Identity. I've truncated the attributes Map in the example, but this object can get very long.
{
"alias": "NCC1701",
"emailAddress": "[email protected]",
"isManager": true,
"managerRef": {
"type": "IDENTITY",
"id": "ad75c841dceb4a208507fdce9da03f5a",
"name": "Dee Bossa"
},
"lastRefresh": "2025-01-13T22:25:33.501Z",
"lifecycleState": {
"stateName": "active",
"manuallyUpdated": false
},
"processingState": null,
"identityStatus": "ACTIVE",
"attributes": {
"lastname": "Sandhu",
"firstname": "Vip",
"email": "[email protected]",
"uid": "[email protected]",
"title": "Tamyrlin Seat",
"department": "CISO",
//...
},
"id": "3d02660e4edc4c45a4dd54c37ccde6e2",
"name": "Vip Sandhu",
"created": "2020-02-06T0:10:48.869Z",
"modified": "2025-01-13T22:25:37.415Z"
}
Search
What if we wanted to get manager information on 10,000 identities? With a maximum limit of 250 identities returned per page, that would take 40 pages' worth of queries against the /identities endpoint. Now imagine we need to pull that information for 100,000 users. The wild truth is that you can do it in 10 queries, and probably do it faster than hitting the /identities endpoint for 10,000 users. The secret is in specifying what information you want to know about each identity.
The Deep Identity-Retrieval Magic
What has been a constant way to retrieve Identity information programmatically is Search. It is an extremely powerful way to find information - not just about Identities, but that's the "index" that I will discuss in this article. The other indices are: Access Profiles, Account Activities, Entitlements, Events, and Roles. The /search endpoint is present in these currently available versions of the API: v3, v2024, and v2025. Search can seem daunting, but I'll simplify its use by keeping the scope of this blog post to searching for Identity data in a way that unifies search in the UI and in the API.
Compose the POST body
Searching using an API query requires a POST operation. The request body is a JSON object with a variable number of properties, discussed in detail below. At a minimum, you should put two properties in the request body:
- While the search index (the indices property in the POST body) is optional, I recommend that you specify it to speed up your queries and guarantee that only Identities are returned. Specifically, this value is an array of strings, and the array should have just one element, the all-lowercase string "identities".
- The query property is a JSON object. To keep it simple, put just one property in that object: the query.query string itself. Any valid search query that you could use in the UI at https://<orgname>.identitynow.com/ui/search will work.
Example of minimal search POST body, looking for Active Employee Identities created since the beginning of 2020:
{
"indices": ["identities"],
"query": {
"query": "status:ACTIVE AND attributes.workerType:Employee AND created:>=2020-01-01"
}
}
When copy/pasting your search query from the UI, the only thing to keep in mind is that if your query contains double quotes, you must deal with them using either of two approaches:
1. Escape the double-quotes with backslashes: If your search query is going into a double-quoted string, any double-quotes inside the search query must be escaped with a backslash
{
"indices": ["identities"],
"query": {
"query": "firstName:\"Jim Bob\""
}
}
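2. Use outer single quotes: If you wrap the query.query property value in single quotes, double-quotes inside the search query will work. Note that this only applies when the body is written as a string or object literal in a language that accepts single-quoted strings (such as JavaScript or Python); raw JSON requires double-quoted strings, so a literal JSON body must use the first approach.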
{
"indices": ["identities"],
"query": {
"query": 'firstName:"Jim Bob"'
}
}
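A third option, if you are building the request in code, is to let a JSON serializer do the escaping for you. The Python sketch below uses only the standard json module; search_body is a hypothetical helper name, and the query string is the same one used in the two examples above.

```python
import json


def search_body(query: str) -> str:
    """Serialize a minimal Identity search body; json.dumps escapes inner quotes."""
    body = {"indices": ["identities"], "query": {"query": query}}
    return json.dumps(body)


# The query is pasted exactly as it appears in the search UI.
raw = search_body('firstName:"Jim Bob"')
print(raw)
# {"indices": ["identities"], "query": {"query": "firstName:\"Jim Bob\""}}
```

The serialized string is valid JSON with the inner double quotes backslash-escaped, so it matches the first approach without any hand editing.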
Default properties in the Search result schema
A note on the example results you will see below: search always includes 3 properties in each returned object, regardless of index: "type", and two properties that, according to the common JavaScript naming convention of a leading underscore, should be treated as "private" members for internal use only: "_type" and "_version". They are only mentioned here for completeness and to reduce confusion.
{
//...
"_type": "identity",
"type": "identity",
"_version": "v2"
}
Two additional default properties show up when you include nested results (see two sections below for an explanation of nested results):
- "pod": <environment><serialNumber>-<aws region of the tenant>
- "org": <orgname> (identitynow.com subdomain)
How to Sort and Fetch more than 10,000 Results
If you're potentially going to retrieve thousands of Identities and want to boost the performance of your code, first make sure you specify the page limit in the HTTP query parameters. The default value is 250, but on this /search endpoint, the maximum value is 10000. That is, you should query the endpoint using the following URL pattern:
https://<orgName>.api.identitynow.com/<version>/search?limit=10000
An example of a broad search query that can return a very large number of Identities:
{
"indices": ["identities"],
"query": {
"query": "status:ACTIVE"
},
"sort": [ "id" ]
}
Notice that I have included a new property in the POST body, sort. It takes an array of property names and lets you perform tiered sorting. For example, if you wanted to find the longest-serving person in every department, you could use the following, which would sort first by department then within each department by start date:
"sort": [ "attributes.department", "attributes.startDate"]
By default, search returns the results in ascending order, but you can specify the direction by prefixing the property name with "+" for ascending, or "-" for descending order. For example, if you wanted to see the newest person in each department first, you could use the following (note that the "+" is redundant but valid syntax):
"sort": [ "+attributes.department", "-attributes.startDate"]
Technically, sorting the results can be done even if you expect a small number of them. However, it's mandatory when paginating results beyond 10,000, because the offset HTTP query parameter doesn't work to paginate results beyond the first 10K. (I'm making an educated guess here, but I infer that on the backend, Elasticsearch grabs up to 10K results at a time, and the way for us to "talk past" the API HTTP server to the Elasticsearch engine is via the POST body.) Instead of using offset, you sort the results, then specify that you want to "searchAfter" the last result of the first 10,000. For this reason, you want to ensure that you're sorting on a unique value. It is recommended that you always include "id" in the sort array to guarantee a fixed sort order. Since the most likely approach you'll use to deal with 10,000+ results is to handle them programmatically, sorting only on "id" is a valid approach.
Let's say we run the search in the full JSON POST body above, i.e. searching for all users with an ACTIVE status. These are the summarized results (just the relevant properties of the first and last result of each 10,000):
First 10,000:
[
{
"displayName": "William Edward Kaiyode",
"id": "001ad3e7d10e418a834de5fa7e9d0902",
"_type": "identity",
"type": "identity",
"_version": "v2"
},
//...
{
"displayName": "Zod Kryptonovich",
"id": "2c91808471516cc6017154cebe745bca",
"_type": "identity",
"type": "identity",
"_version": "v2"
}
]
We should search after Zod Kryptonovich, whose id is 2c91808471516cc6017154cebe745bca. The searchAfter property's value is an array that mirrors the structure of the sort array. If we had sorted by displayName, then by id, and Zod Kryptonovich had still been the last result on a page, we would put in both values for the result after which we want to search:
"sort": [ "displayName", "id" ],
"searchAfter": [ "Zod Kryptonovich", "2c91808471516cc6017154cebe745bca" ]
Since these results were received sorting only on id (which is the most common use case), we only need to specify Zod's id as below:
{
"indices": ["identities"],
"query": {
"query": "status:ACTIVE"
},
"sort": [ "id" ],
"searchAfter": ["2c91808471516cc6017154cebe745bca"]
}
This returns the next 10,000 results, similarly summarized. Notice that the first id is very close (in terms of guids) to the last of the previous page, only 0x06700008 apart (or 108,003,336 in decimal):
[
{
"displayName": "Jack Sparrow",
"id": "2c91808471516cc6017154cec4e45bd2"
},
//...
{
"displayName": "J Jonah Jameson",
"id": "ffb6b887a4cc4475bf149c5717698143"
}
]
If we wanted to get the next 10,000, we would just search after "ffb6b887a4cc4475bf149c5717698143".
Include Nested Properties in Search Results
Search allows you to optionally include 3 nested object properties with the Identity results: accounts, apps, and access. Each is an array of objects. Including them is the default behavior, and it can be turned off by setting "includeNested" to false in the POST body. For example:
{
"indices": ["identities"],
"query": {
"query": "status:ACTIVE"
},
"sort": [ "id" ],
"includeNested": false
}
Nested objects greatly "bulk up" the search results and drop performance. So, while the maximum value for the limit HTTP query parameter is 10,000, that's only really usable without returning the nested arrays. Empirically, an experiment I ran showed that searches that just specify an index, a query and a sort, which have "includeNested": true by default, have their best performance when limit is 125. (See last section: Addendum - Performance Experiments.)
Let's discuss each of those nested object properties in the order they appear in the results schema:
Accounts [ ]
These are summaries of the accounts owned by the Identity. They do not provide account attributes like the /accounts endpoint does. Rather, they give a rundown of the Account (Link) object and the entitlements found on it. Here's an example of what an Active Directory account would look like as part of this array property:
[
{
"id": "075bd3f0036d41af92d9eb3c8195f6d8",
"name": "wkaiyode",
"accountId": "CN=William Edward Kaiyode,OU=Explosive Testers,OU=Users,DC=acme,DC=com",
"source": {
"id": "c12900d1168e4b17b9966d3a795515e1",
"name": "AD - Acme.com",
"type": "Active Directory - Direct"
},
"disabled": false,
"locked": false,
"privileged": false,
"manuallyCorrelated": false,
"passwordLastSet": "2024-12-09T07:27:55.107Z",
"entitlementAttributes": {
"memberOf": [
"CN=Trauma Team,OU=Benefits,OU=Groups,DC=acme,DC=com",
"CN=Jet-Powered Pogo Stick,OU=R&D,OU=Groups,DC=acme,DC=com",
"CN=Giant Magnet,OU=Customer Beta Testers,OU=Groups,DC=acme,DC=com"
]
},
"created": "2025-01-13T18:31:08.212Z"
},
//...
]
Apps [ ]
These are summaries of the (Logical) Applications to which the Identity has access. Unlike the data returned by the /source-apps endpoint, the information returned is less about how the Application is wired up, and more about why this Identity gets access to the App. The nested "account" object inside each app in the apps array represents the Source account which grants logical access to the Application:
- the account.id is the account (Link) object's id, and can be used to get the Account Details at /accounts/:id
- the account.accountId is actually the nativeIdentity of the account object.
[
{
"id": "22193",
"name": "Acme R&D",
"source": {
"id": "c12900d1168e4b17b9966d3a795515e1",
"name": "AD - Acme.com"
},
"account": {
"id": "075bd3f0036d41af92d9eb3c8195f6d8",
"accountId": "CN=William Edward Kaiyode,OU=Explosive Testers,OU=Users,DC=acme,DC=com"
}
},
//...
]
Access [ ]
Access Items come in 3 types: Roles, Access Profiles, and Entitlements. The nested array contains summaries of those granted to the Identity, with certain features, some shared and some unique to a specific type:
- Common features: id, name, displayName, and type (this value is in UPPER_SNAKE_CASE)
- Roles and Access Profiles: have a description, an owner (an Identity reference), and a revocable attribute
- Access Profiles and Entitlements: have a source reference.
- Roles: indicate whether they are disabled
- Entitlements: show the attribute/value pair whose presence on an account in the relevant Source was used by ISC to determine that the Identity has that entitlement. They also indicate whether they are standalone or granted to this Identity as part of a Role or Access Profile.
[
{
"id": "243621ccb36c41d898f42ae38bca02dc",
"name": "Birthright - Super Genius",
"displayName": "Birthright - Super Genius",
"type": "ROLE",
"description": "Business Role for those who are Super Geniuses - those who embrace 'Have Brain, Will Travel'",
"owner": {
"id": "ac09f514d6524b2280556b1a2da5e140",
"name": "admin007",
"displayName": "Vip Sandhu"
},
"disabled": false,
"revocable": false
},
{
"id": "515d3266e02b4158a74a6e1d15f6d446",
"name": "Acme - Research & Development - Jet-Powered Pogo Stick",
"displayName": "Acme - Research & Development - Jet-Powered Pogo Stick",
"type": "ACCESS_PROFILE",
"description": "Application access Acme's R&D UI - allows user to edit Field notes on the Jet-Powered Pogo Stick",
"source": {
"id": "c12900d1168e4b17b9966d3a795515e1",
"name": "AD - Acme.com"
},
"owner": {
"id": "ac09f514d6524b2280556b1a2da5e140",
"name": "admin007",
"displayName": "Vip Sandhu"
},
"revocable": true
},
{
"id": "eda0a94523df431f86b6b1a749e8a6c8",
"name": "Jet-Powered Pogo Stick",
"displayName": "Acme.com - R&D - Jet-Powered Pogo Stick",
"type": "ENTITLEMENT",
"source": {
"id": "c12900d1168e4b17b9966d3a795515e1",
"name": "AD - Acme.com"
},
"privileged": false,
"attribute": "memberOf",
"value": "CN=Jet-Powered Pogo Stick,OU=R&D,OU=Groups,DC=acme,DC=com",
"standalone": false
},
//...
]
GET individual identities
All the same information is available using Get Document By Id, with the index "identities" in the URL:
GET https://<orgname>.api.identitynow.com/<version>/search/identities/:id
- This searches for a single indexed object by id, so we don't need to POST a search query body.
- The GET operation always includes nested objects, which isn't too bad of a performance hit for 1 object. However, given the overhead of making each network call, GET search should be used very sparingly. If multiple Identities need retrieval, it is far more performant to build up a list of ids and get them all in a combined search query.
- Because you're getting a single object, the response body will not be an array.
- Additionally, the default properties will not be present. (See section "Default properties in the Search result schema" above.)
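As a sketch of the "combined search query" idea, the Python helper below turns a list of ids into a single id:(a || b || ...) query string, which is the same syntax used in the queryResultFilter example later in this post. The helper names and the batch size are assumptions for illustration; I have not verified any upper bound on query length, so tune the batch size for your tenant.

```python
from typing import Iterator, List


def ids_query(ids: List[str]) -> str:
    """Build one search query that matches any of the given Identity ids."""
    return "id:(" + " || ".join(ids) + ")"


def batched(ids: List[str], size: int) -> Iterator[List[str]]:
    """Split a long id list into batches so each query stays a manageable length."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


wanted = [
    "001ad3e7d10e418a834de5fa7e9d0902",
    "2c91808471516cc6017154cebe745bca",
    "ffb6b887a4cc4475bf149c5717698143",
]
bodies = [
    {"indices": ["identities"], "query": {"query": ids_query(batch)}, "sort": ["id"]}
    for batch in batched(wanted, size=2)  # size=2 only to show the batching
]
print(bodies[0]["query"]["query"])
# id:(001ad3e7d10e418a834de5fa7e9d0902 || 2c91808471516cc6017154cebe745bca)
```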
"Count" properties
Search provides a lot of "count" properties. Each of them counts an array property on the Identity, where the count property is named <arrayProperty>Count and the corresponding array property is named <arrayProperty>. Note that 3 of them count the nested object arrays that can optionally be included in the results (accounts, access, and apps); the count properties will be returned even if you don't includeNested. Interestingly, setting includeNested to false changes the order of the properties, but not their presence or values (this mostly matters if you're visually inspecting the results).
- accountCount: # of Accounts
- sourceCount: # of Sources on which those accounts are found
- appCount: # of Applications to which the Identity has access
- accessCount: # of Access items
- entitlementCount: # of Access items that have type: ENTITLEMENT
- roleCount: # of Access items that have type: ROLE
- accessProfileCount: # of Access items that have type: ACCESS_PROFILE
- ownsCount: # of types of objects found in the owns array property, where types are among {sources, accessProfiles, roles, governanceGroups, and apps}. If there are any objects of a given type, owns.<type> will be an array containing the name and id of each. (Author's note: this may not be the intended behavior, since it counts types instead of items)
- tagsCount: # of tags applied to the Identity, which are found in the tags array property.
- visibleSegmentCount: # of Access Request Segments of which the Identity is a part, allowing that person to see certain access items in the Request Center that are not visible to the general population of Identities.
Shaping the Response Schema
One of the great powers of Search is the ability to retrieve a lot of results with a wide variety of data returned for each Identity in the response. The downside is that by transferring so much data, your requests can get bogged down. Which brings us to the next great power of Search: a queryResultFilter gives you the ability to declare exactly what data you want returned for each Identity that matches your search query. This is analogous to the benefit of GraphQL. In a real-world test, the queryResultFilter below dropped the response time per Identity from 12.5 to 2.5 milliseconds when fetching 250 Identities, with even bigger gains for 10,000 Identities (dropping to 0.6 ms per Identity). The direct comparison only covered 250 Identities because including nested objects with a full response schema actually causes the per-Identity response time to rise when asking for more than 125 Identities. (See Addendum - Performance Experiments.)
{
"indices": ["identities"],
"query": {
"query": "id:(001ad3e7d10e418a834de5fa7e9d0902 || 2c91808471516cc6017154cebe745bca || 2c91808471516cc6017154cec4e45bd2 || ffb6b887a4cc4475bf149c5717698143)"
},
"includeNested": true,
"sort": [ "id" ],
"queryResultFilter": {
"includes": [ "displayName", "name", "email", "id", "manager", "apps" ],
"excludes": [ "manager.displayName", "apps.source" ]
}
}
If a term appears in both lists, excludes wins: that property will not appear. The best use of this is to exclude sub-fields from nested objects in the results. For example, the "manager" property on an Identity is an object reference with 3 properties:
"manager": {
"displayName": "Chuck Jones",
"name": "100N3Y",
"id": "733f61f8cff94281bbe928bdafab9546"
}
Let's say your code didn't care about the id of the manager's Identity object because it was generating a report that would only be consumed by humans. So, all you need is their displayName and their employeeNumber, which you have wisely chosen to make their identity name. You could leave the manager's object id out of the results.
{
//...
"queryResultFilter": {
"includes": [ "displayName", "name", "email", "id", "manager", "apps" ],
"excludes": [ "manager.id", "apps.source" ]
}
}
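To make the includes/excludes semantics easier to reason about, here is a small local emulation in Python. It is only an illustration of the behavior described above (includes selects, excludes wins, dotted paths reach into nested objects and arrays); it is not how the search service implements the filter, and shape and _remove are hypothetical helper names. The sample document reuses values from the examples in this post.

```python
import copy


def _remove(node, parts):
    """Delete the dotted path `parts` from a dict, descending through lists."""
    if isinstance(node, list):
        for item in node:
            _remove(item, parts)
    elif isinstance(node, dict) and parts[0] in node:
        if len(parts) == 1:
            del node[parts[0]]
        else:
            _remove(node[parts[0]], parts[1:])


def shape(doc: dict, includes: list, excludes: list) -> dict:
    """Keep the included top-level properties, then strip every excluded path."""
    # Only top-level include names are handled, which is all this example needs.
    kept = {k: copy.deepcopy(v) for k, v in doc.items() if k in includes}
    for path in excludes:  # applied last, so excludes wins over includes
        _remove(kept, path.split("."))
    return kept


identity = {
    "id": "001ad3e7d10e418a834de5fa7e9d0902",
    "displayName": "William Edward Kaiyode",
    "manager": {"displayName": "Chuck Jones", "name": "100N3Y",
                "id": "733f61f8cff94281bbe928bdafab9546"},
    "apps": [{"id": "22193", "name": "Acme R&D",
              "source": {"id": "c12900d1168e4b17b9966d3a795515e1"}}],
    "attributes": {"department": "R&D"},
}
shaped = shape(identity,
               includes=["displayName", "name", "email", "id", "manager", "apps"],
               excludes=["manager.id", "apps.source"])
print(shaped["manager"])  # {'displayName': 'Chuck Jones', 'name': '100N3Y'}
```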
Note: the above doesn't work as intended unless you also have includeNested set to true, because otherwise the nested apps property won't be returned.
As a final note, sort order holds even if the properties on which you are sorting are not returned with the result. (See section "How to Sort and Fetch more than 10,000 Results" above for a review of sorting.) In the queryResultFilter above, you aren't returning any Identity attributes that aren't promoted to top-level properties in the search results (like firstName and lastName are), but if we were to include the following sort property in the POST body, the results would still be returned in order by department, then by startDate within each department, even though the results don't include those properties.
"sort": [ "attributes.department", "attributes.startDate"]
Performance Takeaways
- The real advantage of the experimental /identities endpoint is the optimization of developer time, not execution time.
- Grab results 10,000 at a time, not in pages of 250. This is only possible through /search, and by overriding the default limit.
- If you need the unabridged Identity objects and also need to includeNested, drop the limit to 125 for best server-side performance.
- Search allows you to simplify the response schema using a queryResultFilter, speeding up server-side processing and reducing the amount of data flying over the network to your client code. This is far more impactful on performance than trying to fine-tune the limit (see Addendum - Performance Experiments)
Disclaimer
The information here is current as of the time of publication. If you find any errors or things that have changed, let me know and I'll update the post.
Addendum â Performance Experiments
Methods
To determine the optimal limit value (how big the page size should be), I ran two experiments: one where I shaped the results using a queryResultFilter (slim schema arm), and one where I didn't (full schema arm).
Other than that one difference, both experiments used the following POST body:
{
"indices": ["identities"],
"query": {
"query": "attributes.cloudLifecycleState:active AND identityProfile.name:\"<Main_HR_Profile>\""
},
"includeNested": true,
"queryResultFilter": {
"includes": [ "displayName", "name", "email", "id", "manager", "apps" ],
"excludes": [ "manager.displayName", "apps.source" ]
}
}
To ensure that I got good statistics, I ran 10 trials at each limit value. To ensure that random effects that varied over time didn't covary with the limit value, I ran through each limit value before coming back to a given value. The pseudocode for this approach is:
for trial in 1 to 10
    for limit in MIN_LIMIT to MAX_LIMIT
        POST search query and record timings
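For readers who want to repeat the measurement, the pseudocode above could be sketched in Python roughly as follows. This is not the script used for the results in this post; post_search(limit) is a hypothetical callable that issues the search with the given limit, and the fake version below only exists so the harness runs offline.

```python
import time
from statistics import mean, stdev


def run_experiment(post_search, limits, trials=10):
    """Time post_search(limit) once per limit per trial, trials in the outer loop."""
    timings = {limit: [] for limit in limits}
    for _ in range(trials):        # outer loop: trials
        for limit in limits:       # inner loop: every limit before repeating one
            start = time.perf_counter()
            post_search(limit)     # hypothetical wrapper around POST /search
            timings[limit].append((time.perf_counter() - start) * 1000)
    return timings


def summarize(timings):
    """Mean, standard deviation, and per-Identity mean (mean / limit) in ms."""
    return {
        limit: {
            "mean_ms": mean(ms),
            "stdev_ms": stdev(ms),
            "per_identity_ms": mean(ms) / limit,
        }
        for limit, ms in timings.items()
    }


def fake_post_search(limit):
    # Stand-in that just builds a list of the requested size.
    return [{"id": i} for i in range(limit)]


timings = run_experiment(fake_post_search, limits=[1, 125, 250], trials=10)
summary = summarize(timings)
```

Keeping the trial loop on the outside is what spreads slow periods (network hiccups, a busy laptop) across all limit values instead of concentrating them on one.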
Because I very quickly found an optimal limit for the full schema arm, I only had to set MAX_LIMIT to 250, starting from 1. For the slim schema, I had to set MAX_LIMIT all the way up to 10000, but I ran the script in 3 runs: 1 to 250, 251 to 500, then 501 to 10000. I did this because I was looking for a local minimum, but found none up to a limit of 500, so I changed tactics and gathered data on all possible values.
Full Schema Results
I started by just grabbing the mean and standard deviation of the response time in milliseconds (ms) across the 10 trials at each limit value, and noticed a roughly linear shape, but with a noticeable downward bend. The relatively linear standard deviation curve provides confidence that the pattern wasnât affected by extreme outliers.
Looking at the response time per Identity showed a hyperbolic shape, indicating that some fixed cost was being amortized over the identities as the limit grew. It also showed a horizontal asymptote that was so low relative to the initial values that the downwards flexure of the first graph couldn't be appreciated with the initial graph window.
Zooming in on the vertical axis shows the detail of the downward flexure, showing a clear minimum at a limit of 125, followed by an increasing time per Identity.
This kind of shape implies that there are two competing processes: A constant overhead that has less effect on the per-Identity time cost as the limit grows, and some costs that increase as the limit grows. I have created model curves for the two (320ms/n and 2.8ms + 0.035ms*n, respectively, where n is the limit). Summing those two produces a curve whose shape is very similar to the observed results. I did not concern myself with creating a perfect fit because demonstrating that a fitting curve could be the sum of a hyperbolic and linear function was my goal.
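As a rough check on those two model curves, the Python sketch below sums them and finds where the sum is smallest. The constants come straight from the paragraph above; the function and variable names are illustrative. Setting the derivative of 320/n + 2.8 + 0.035n to zero gives n = sqrt(320 / 0.035).

```python
import math

FIXED_MS = 320.0   # constant overhead, amortized as 320 ms / n
BASE_MS = 2.8      # per-Identity floor of the linear term
SLOPE_MS = 0.035   # per-Identity cost growth per unit of limit


def per_identity_ms(n: float) -> float:
    """Modeled response time per Identity (ms) at page limit n."""
    return FIXED_MS / n + BASE_MS + SLOPE_MS * n


# d/dn (320/n + 2.8 + 0.035 n) = 0  =>  n = sqrt(320 / 0.035)
n_star = math.sqrt(FIXED_MS / SLOPE_MS)

print(f"{n_star:.1f}")                   # 95.6
print(f"{per_identity_ms(n_star):.2f}")  # 9.49
print(f"{per_identity_ms(125):.2f}")     # 9.74
print(f"{per_identity_ms(250):.2f}")     # 12.83
```

The modeled minimum lands near a limit of 96 rather than the observed 125, which is consistent with the curves being a shape demonstration and not a precise fit; the modeled values at 125 and 250 are still close to the roughly 10 ms and 12 ms figures reported in this post.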
Slim Schema Results
I once again began by just grabbing the mean and standard deviation across the 10 trials at each limit value. This time, however, I noticed a more logarithmic shape. Indeed, the logarithmic best-fit curve produced very good results.
Looking at the per-Identity response times, a clear hyperbolic shape was once again noted, but this time with a much lower asymptote.
Zooming in on the vertical axis and plotting the best-fit curve ("normalized" by dividing by the limit x) showed exactly how good the logarithmic fit was for the asymptote. This curve is monotonically decreasing (always has a negative slope), meaning that the performance increases as the limit value increases. However, this per-Identity graph revealed some deviation for limits under 2000 that was worth investigating, specifically to rule out a local minimum.
This data for limits < 2000 was a lot noisier. I ran this arm of the experiment in 3 runs (1-250, 251-500, 501-10000) on my laptop, which explains the discontinuity of the data points in the limit 251-500 range: some external factors like open applications, WiFi strength, etc. must have pushed those response times up, because the first and third chunks seem to continue the same curve. The other issue was the existence of sharp upward deflections around the limit values of 300 and 750. I co-plotted the Coefficient of Variation (Standard Deviation divided by Average), which showed that those deflections are explainable by long outlier response times, so the "local minima" on either side can be safely ignored. Finally, to determine local minima in noisy data, it's best to analyze a smoothed curve. By smoothing over a running average of +/-50 limit values, we can see the only local minimum is near 750, which we've already explained as an artifact of outliers. This means the whole graph is effectively a monotonically decreasing function.
Conclusions
So, I conclude that if we use a reasonable queryResultFilter to shape the response schema, then Search gets increasingly efficient with page limits up to the maximum limit of 10,000 Identities.
When comparing the two arms, we can see a dramatic improvement in per-Identity response times. When the limit is 250, the full schema costs 12 ms per Identity, and the slim schema only 2.5 ms. Even the "optimum" limit of 125 with the full schema only lowers that to 10 ms per Identity. Compare that to the optimal value of maxing out the limit to 10,000 where we see the per-Identity time drop to an amazing 0.6 ms.