/v3/search results are not accurate with limit and offset unless sort criteria are specified

What problem are you observing?

Is this expected behavior?
When using /v3/search with offset and limit, you need to specify sort; otherwise the results are not accurate: there are duplicates and missing results. See the example below.

What is the correct behavior?

I would like search results to be accurate without having to specify sort criteria. This makes me lose trust in the search results.

What product feature is this related to?

The ISC Search API. I only tried v3, not v2024 or the beta APIs.

What are the steps to reproduce the issue?

Using a script (say, Ruby), compare the search results of the two requests below. In my example, I am getting all identities that have an account on source X.

Without sort, I had duplicates in the results, and after removing the duplicates I had fewer rows than with the sort option (which had no duplicates, as expected).

  objList = Array.new
  offset = 0
  count  = 250
  # This is the payload without sort
  payload = "{\"indices\":[\"identities\"],\"query\":{\"query\":\"@accounts(source.id:#{mySource})\"}}"
  while count == 250
    response = IDNAPI.post_json("#{$config['baseUrl']}/v3/search?limit=250&offset=#{offset}", $config['access_token'],payload)
    responseBody = JSON.parse( response.body )
    count = responseBody.length
    objList = objList.concat(responseBody)
    offset = offset + count
    puts "Count: #{count}, offset #{offset}, length #{objList.length()}"
  end
  puts "identitiesList size: #{objList.length}"
  objList.each do | identity |
    writeToFile(identity)
  end

With sort:

  objList = Array.new
  offset = 0
  count  = 250
  # Note: Without the sort, it was not giving accurate returns, there were duplicates and missing entries
  payload = "{\"indices\":[\"identities\"],\"sort\": [\"+name\"],\"query\":{\"query\":\"@accounts(source.id:#{mySource})\"}}"
  while count == 250
    response = IDNAPI.post_json("#{$config['baseUrl']}/v3/search?limit=250&offset=#{offset}", $config['access_token'],payload)
    responseBody = JSON.parse( response.body )
    count = responseBody.length
    objList = objList.concat(responseBody)
    offset = offset + count
    puts "Count: #{count}, offset #{offset}, length #{objList.length()}"
  end
  puts "identitiesList size: #{objList.length}"
  objList.each do | identity |
    writeToFile(identity)
  end
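To quantify the discrepancy between the two runs, the two collected arrays can be compared by identity id. This is a hypothetical helper, not part of the original script; the `"id"` field name and the hit shape are assumptions about the search response:

```ruby
# Hypothetical helper: given the result arrays collected by the two loops
# above, report which ids were returned more than once and which ids the
# unsorted run missed. Assumes each hit is a Hash with an "id" key.
def report_discrepancy(without_sort, with_sort)
  ids_without = without_sort.map { |hit| hit["id"] }
  ids_with    = with_sort.map { |hit| hit["id"] }

  {
    duplicates: ids_without.tally.select { |_, n| n > 1 }.keys,
    missing:    ids_with - ids_without
  }
end
```

Running this against the two `objList` arrays makes the duplicate and missing counts explicit instead of comparing file sizes by eye.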

Do you have any other information about your environment that may help?

N/A

Hi Jason,

The Search API has an upper limit of 10,000 records per call, although the default limit is 250. Offset pagination is not strictly necessary here and leads to more API calls than needed. I recommend setting the limit higher, up to 10,000, to avoid paginating at all. If you need to paginate past 10,000 records, you will have to use searchAfter, which is documented here:
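A sketch of how searchAfter pagination could look in the same Ruby style. Assumptions: the sort is by a single field, the response is the usual array of hits, and `fetch` is a stand-in for the real `IDNAPI.post_json` call against `/v3/search`:

```ruby
# Keyset-style pagination sketch: instead of an offset, each request
# carries the sort-field value of the previous page's last hit in
# "searchAfter". `fetch` is a stand-in for the real HTTP call.
def search_all(fetch, base_payload, sort_field:, page_size: 250)
  results = []
  payload = base_payload.dup
  loop do
    page = fetch.call(payload)
    results.concat(page)
    break if page.length < page_size
    # Resume strictly after the last hit we already have.
    payload = base_payload.merge("searchAfter" => [page.last[sort_field]])
  end
  results
end
```

With `base_payload` containing `"sort" => ["+name"]`, the next request's `searchAfter` carries the last returned `name`, so no page boundary depends on an unstable default ordering.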

All of this is to say that this is not a bug; it is working as intended. Search simply behaves differently because of how it is implemented under the hood.
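To illustrate that last point: if two requests see the same ten records in two different but equally valid orderings (no sort means no guaranteed tie-break), offset pagination can fetch one record twice and never fetch another. A toy sketch, not the actual server behavior:

```ruby
# Toy illustration: the same ten records, ordered two different ways
# across two successive requests with no sort specified.
order_req1 = %w[a b c d e f g h i j]
order_req2 = %w[b a d c f e h g j i]  # same records, different tie-break

page1 = order_req1[0, 5]  # first request:  offset 0, limit 5
page2 = order_req2[5, 5]  # second request: offset 5, limit 5

combined   = page1 + page2
duplicated = combined.tally.select { |_, n| n > 1 }.keys  # => ["e"]
missed     = order_req1 - combined                        # => ["f"]
puts "duplicated: #{duplicated.inspect}, missed: #{missed.inspect}"
```

A stable sort pins every record to one position across requests, which is exactly why adding `"sort"` fixed the results.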

Hi Colin,
I tried 10,000 and also 1,000, but the response size / HTTP content size is too large.

If you have access, please see case CS0354934

Response if I try 10,000:

{
    "detailCode": "500.1.503 Downstream service unavailable",
    "trackingId": "f5a8f1039cb64e65a81bff07ac155329",
    "messages": [
        {
            "locale": "und",
            "localeOrigin": "REQUEST",
            "text": "A downstream resource was unavailable."
        },
        {
            "locale": "en-US",
            "localeOrigin": "DEFAULT",
            "text": "A downstream resource was unavailable."
        }
    ],
    "causes": []
}

Try limiting the fields returned by your search query to reduce the response size. Please see this post for an example of how to do it.
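As a sketch of what that could look like: the search body can carry a result filter so only the needed fields come back. The `"queryResultFilter"`/`"includes"` keys and the chosen fields here are assumptions to verify against the linked post, and `my_source` is a placeholder:

```ruby
require 'json'

my_source = "SOURCE_ID"  # placeholder for the real source id

# Sketch (assumed body shape): ask search to return only the fields
# the export actually needs, shrinking the response size per page.
payload = {
  "indices" => ["identities"],
  "sort"    => ["+name"],
  "queryResultFilter" => { "includes" => ["id", "name"] },
  "query"   => { "query" => "@accounts(source.id:#{my_source})" }
}.to_json
```

Building the body as a Hash and calling `to_json` also avoids the hand-escaped string that hid the stray comma in the first script.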


Thanks Colin. I will try that.