# Request Resiliency

Cornerstone's business objective for DEAPI and RAPI is to provide clients access to their growing reporting data programmatically. The goal is to deliver a solution that not only meets the clients' functional business needs but is also robust, secure, and scalable. However, robust and scalable does not mean that errors will never occur. You should always code defensively: expect transient errors and handle them gracefully.

In general, responses with a 2xx status are successful, but there are circumstances where this is not the case. This section covers the most common types of errors and the special case in which a 200 response is not actually successful.

# Client Errors 4xx

The following errors require attention on your side. Resolving them usually involves fixing some aspect of your request before reissuing it.

400 Bad Request / validation

Incorrect URL or parameters, such as a misspelled OData keyword ($select, $filter, etc.).

401 Unauthorized / Unauthenticated

The request is missing the right credentials or uses an expired token. Check the values you are sending and correct them accordingly.

403 Access Denied

The user account does not have the correct privileges to access the requested resource. If you encounter this error, you will need to log in to your portal and assign the correct permissions to the user account being used to make the API calls.

404 Not Found

This error occurs when an incorrect resource is requested or the predicate for a resource does not match a record. For example:

  • Incorrect resource: requesting vw_rpt_usr instead of vw_rpt_user

  • Incorrect predicate: requesting vw_rpt_user/1234, when a user record with that id does not exist

429 Limit Exceeded

This indicates that you have exceeded your throttling limit for the API. Implement back-off logic to stagger your calls within the limits.

{
  "status": "429",
  "timeStamp":"2018-05-15T18:06:55+0000",
  "error": {
      "errorId": "77ed61e0-7052-4fad-b265-c52679ab2cac",
      "message": "CSOD Too many requests.", 
      "code":"429",
      "description": null, 
      "details": null
  }
}

Note: two API response headers, x-csod-throttling-rate-limit-urilevel and x-csod-requests-remaining-urilevel, can be inspected to see the throttling status. The former indicates the throttling limit (per hour), while the latter indicates how many requests are left in the current hour. However, throttling limits are actually enforced at the minute level: the per-minute limit is the value of x-csod-throttling-rate-limit-urilevel divided by 60. If you hit that limit, you will be throttled for the remainder of that minute. If you continue to make calls while throttled, those calls still count against your hourly and 24-hour limits.
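The header arithmetic described in the note can be sketched in Python as follows. This is a minimal illustration, not part of the API: the function names are hypothetical, the header values are assumed to be integer strings, and rounding the per-minute limit down is an assumption made to stay on the safe side.

```python
RATE_LIMIT_HEADER = "x-csod-throttling-rate-limit-urilevel"
REMAINING_HEADER = "x-csod-requests-remaining-urilevel"


def throttle_status(headers):
    """Return (per_minute_limit, remaining_this_hour) from response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    hourly_limit = int(lowered[RATE_LIMIT_HEADER])
    remaining_this_hour = int(lowered[REMAINING_HEADER])
    # Assumption: round the per-minute limit down to stay under the real limit.
    per_minute_limit = hourly_limit // 60
    return per_minute_limit, remaining_this_hour


def min_spacing_seconds(per_minute_limit):
    """Smallest gap between calls that keeps an even pace under the minute limit."""
    if per_minute_limit <= 0:
        raise ValueError("per-minute limit must be positive")
    return 60.0 / per_minute_limit
```

For instance, with a purely illustrative hourly limit of 1200, the per-minute limit works out to 20 requests, so spacing calls at least 3 seconds apart keeps an evenly paced client under it.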

# Server Errors 5xx

Server errors are generally the result of a malfunction on our servers. In most cases they are transient, and the recommendation is to retry the request until it succeeds. You should cap the number of retries; once the cap is reached, stop and report the error to Global Customer Support (GCS).

500 Server Error

An error occurred which may be transient. You should examine the details of the error:

{
  "status": "500",
  "timeStamp":"2018-05-15T18:06:55+0000",
  "error": {
      "errorId": "77ed61e0-7052-4fad-b265-c52679ab2cac", 
      "message": "error message",
      "code":"xxx", 
      "description": null, 
      "details": null
  }
}

In some cases, the error could be due to a timeout on the server. This could be due to a couple of reasons:

  • The load on the data source resulted in an excessive time before it responded to the request

  • The page size requested was too large, resulting in a timeout between the data source and the service

In both cases, retrying the request could succeed. To increase your rate of success, implement back-off logic when retrying.
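One common way to implement a capped retry with back-off is exponential back-off with random jitter, sketched below. The document does not prescribe a specific algorithm, so the strategy, the default values, and all names here (call_with_backoff, send_request, RetriesExhausted) are assumptions for illustration. The send_request argument stands in for whatever function issues your HTTP call and returns a (status, body) pair.

```python
import random
import time


class RetriesExhausted(Exception):
    """Raised when the retry cap is reached; report the error to GCS."""


def call_with_backoff(send_request, max_retries=5, base_delay=1.0,
                      max_delay=60.0, retryable=(500,), sleep=time.sleep):
    """Call send_request() until it returns a non-retryable status.

    send_request is a hypothetical callable returning (status, body).
    """
    status, body = None, None
    for attempt in range(max_retries + 1):
        status, body = send_request()
        if status not in retryable:
            return status, body
        if attempt == max_retries:
            break  # retry cap reached, do not sleep again
        # Exponential growth (1s, 2s, 4s, ...) capped at max_delay.
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        # Random jitter spreads out retries from concurrent clients.
        sleep(random.uniform(0, ceiling))
    raise RetriesExhausted(
        f"gave up after {max_retries} retries, last status {status}")
```

The sleep parameter is injectable so the function can be exercised in tests without actually waiting.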

You might also want to progressively reduce your page size. The count feature can help you make that determination: if the count returned indicates a very large number of records (for example, over a million) and the records are very wide, i.e. have a large number of columns, then you should reduce the page size and apply the optimizations suggested earlier in this document.
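Progressively reducing the page size could look like the sketch below, which halves the page size each time a request comes back with a 500. The halving step, the minimum page size, and the names (fetch_with_shrinking_page, fetch_page) are assumptions, not values recommended by the API. The fetch_page argument is a hypothetical callable that requests one page of the given size and returns (status, body). Delays between attempts are omitted here for brevity.

```python
def fetch_with_shrinking_page(fetch_page, page_size, min_page_size=500):
    """Retry a page request with smaller page sizes after a 500.

    Returns (page_size_used, status, body) for the first non-500 response.
    """
    size = page_size
    while True:
        status, body = fetch_page(size)
        if status != 500:
            return size, status, body
        if size <= min_page_size:
            # Already at the floor: shrinking further is unlikely to help.
            raise RuntimeError(
                f"still failing at minimum page size {min_page_size}")
        # Halve the page size, but never go below the configured floor.
        size = max(min_page_size, size // 2)
```

Returning the page size that finally worked lets the caller reuse it for the remaining pages instead of starting from the original size again.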

503 Service Unavailable

This is likely to occur when the system is down for maintenance. Cornerstone sends clients advance notifications that specify the maintenance windows. The recommendation is to pause your process until after the maintenance window has elapsed. If you receive this error outside an announced maintenance window, you should notify Global Customer Support (GCS) immediately so they can investigate.
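The handling guidance for the status codes covered so far can be summarised as a small dispatch function. This is only a sketch: the function name and the action labels are hypothetical, and how you act on each label is up to your client.

```python
def recommended_action(status):
    """Map an HTTP status code to the handling suggested in this section."""
    if 200 <= status < 300:
        return "process"      # normally a success
    if status == 429:
        return "back_off"     # throttled: stagger calls around the limits
    if status == 503:
        return "pause"        # likely maintenance: wait for the window to end
    if 500 <= status < 600:
        return "retry"        # likely transient: retry with a capped count
    if 400 <= status < 500:
        return "fix_request"  # needs attention on the client side
    return "report"           # anything else is unexpected


for code in (200, 404, 429, 500, 503):
    print(code, recommended_action(code))
```

The 429 and 503 checks come before the generic 4xx and 5xx ranges because their handling differs from the rest of their class.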

# Special Case: 200 not a Success

As stated earlier, the API uses a streaming protocol to allow it to support large volumes of data. The results are sent back as they become available from the database. You can think of it as a fire hose -- the data flows directly from the source, rather than being accumulated entirely into a container, then shipped as a whole. Errors are handled the same way as well.

This can sometimes cause the API to initially send back a 200 success response, but then time out as you continue to page through additional records. How would this error manifest itself? Let's say you issue a request with a page size of 10,000 records. The server processes the query successfully and begins to send back results. Because a number of records are successfully sent, the response status is automatically set to 200 OK.

You continue to receive valid records, but at some point an internal server error occurs due to resource exhaustion, which typically manifests itself as a timeout. HTTP does not allow the status of a response to be changed once it has been sent, so the client is unaware that an error occurred.

On the receiving end, you continue to process the records, but the data in the stream no longer represents a valid record. Your JSON parser errors out when it encounters data that is not consistent with a record. The response may look as follows:

{
"@odata.context": "https://{environment}.csod.com/services/api/x/dataexporter/api/objects/$metadata#transcript_core",
"value": [
    {valid records}
    error message, could be html
}

If you are collecting the entire JSON payload before you begin parsing it, the payload is likely no longer valid JSON. If you are parsing the data stream on the fly, you will run into the error in real time.

In both scenarios, the error is most likely transient, and the recommendation is to follow the same approach as for a 500 Server Error: retry with back-off logic.
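For the case where the whole payload is collected before parsing, the check could be sketched as below. The names parse_odata_page and TruncatedResponse are hypothetical, and the sketch assumes a well-formed response is a JSON object whose "value" member is a list, as in the sample above. A caller would treat TruncatedResponse like a 500 and retry.

```python
import json


class TruncatedResponse(Exception):
    """A 200 response whose body is not a complete, valid result page."""


def parse_odata_page(body_text):
    """Return the list of records, or raise TruncatedResponse."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError as exc:
        # The stream was cut short or an error message was appended mid-body.
        raise TruncatedResponse(f"invalid JSON in 200 response: {exc}") from exc
    if not isinstance(payload, dict) or not isinstance(payload.get("value"), list):
        # Valid JSON, but not shaped like a result page.
        raise TruncatedResponse("200 response has no 'value' array")
    return payload["value"]
```

Treating the status code and the body as two separate checks keeps the 200 special case from slipping through as a success.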