# Paging

Both APIs implement the OData server-driven paging protocol, which splits large result sets into pages. You can control the page size (the number of records per page) through the Prefer header in your request. For example:

Prefer: odata.maxpagesize=10000

The maximum accepted page size is 10,000. If the Prefer header is not included in your request, the API uses the default page size of 1,000 records. When the total record count is greater than the page size, the service includes in the page response a link to retrieve the next page of results.

 {
 "@odata.context": "https://{environment}.csod.com/services/api/x/api/x/dataexporter/api/objects/$metadata#transcript_core",
 "value": [...],
 "@odata.nextLink": "https://{environment}.csod.com/services/api/x/dataexporter/api/objects/transcript_core?..."
 }
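The paging loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the documented client: `iter_records` and the `fetch` callable are hypothetical names, and `fetch(url, headers)` is assumed to perform an authenticated GET and return the decoded JSON body as a dict. Sending the Prefer header on every page request is also an assumption.

```python
from typing import Callable, Dict, Iterator

# Hypothetical signature: fetch(url, headers) -> decoded JSON body of one page.
Fetch = Callable[[str, Dict[str, str]], dict]


def iter_records(first_url: str, fetch: Fetch, page_size: int = 1000) -> Iterator[dict]:
    """Yield every record, following @odata.nextLink until it is absent."""
    if not 1 <= page_size <= 10000:
        raise ValueError("page_size must be between 1 and 10000")
    headers = {"Prefer": f"odata.maxpagesize={page_size}"}
    url = first_url
    while url:
        page = fetch(url, headers)
        yield from page.get("value", [])
        # Use the link exactly as returned; it is never parsed or rebuilt.
        url = page.get("@odata.nextLink")
```

Because the function is a generator, records can be processed as each page arrives, and the loop ends on the first page that carries no next link.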

WARNING

"@odata.nextLink" should be treated as opaque. Parsing or manipulating it could produce incorrect results or even errors.

# Streaming

Both APIs use a streaming protocol: they send data to the client as soon as it becomes available rather than waiting to receive all the records from the database before sending them. This lets the APIs provide access to very large data sets efficiently, with a relatively small memory footprint on our servers, which in turn allows us to serve many more simultaneous requests.

# Record Duplication

To provide you with the most up-to-date data, RTDW data is synced from the transactional database at intervals of at most fifteen minutes. This is important to consider because it means the data the API operates on can change at any time, even while you are executing your queries.

As you retrieve records page after page, newly synced data can cause record positions to shift. A previously retrieved record can reappear in a subsequent page and cause errors as you process it on your end.

We highly recommend that you first stage the data you retrieve in temporary storage and run any validation logic you deem necessary, including checking for duplicate records, before sending it to its final destination in your pipeline.
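The duplicate check on staged data might look like the sketch below. It is illustrative only: `stage_unique` is a hypothetical helper, and the fields that uniquely identify a record depend on the object you query, so `key_fields` is an assumption you must supply. Keeping the later copy of a repeated record is also an assumption (that it reflects the more recent sync).

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def stage_unique(records: Iterable[dict], key_fields: Sequence[str]) -> Tuple[List[dict], int]:
    """Return (unique records, number of duplicates dropped).

    When a key is seen again, the later copy replaces the earlier one.
    """
    staged: Dict[tuple, dict] = {}
    duplicates = 0
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in staged:
            duplicates += 1
        staged[key] = record  # later copy wins, original position is kept
    return list(staged.values()), duplicates
```

The duplicate count is returned alongside the records so that it can be logged or compared against the number of records received.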

# Record Count

Now that you are aware that the data may shift across pages while you are retrieving it, it is good practice to get a count of records at the beginning of your process to verify against at the end. The record count can also help you choose a page size. If the total count is relatively low but higher than the default page size, you may elect to increase the page size to reduce the number of pages you go through. The record count is a snapshot at the time of the request and reflects the total number of records that meet the filtering criteria you specified, so it is important to specify the same filtering criteria you will use when retrieving your records.
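To make the page-size trade-off concrete, the short sketch below computes how many page requests a given count requires. `pages_needed` is a hypothetical helper, and the calculation assumes every page except the last is full and that the data does not shift during retrieval.

```python
import math


def pages_needed(total_count: int, page_size: int) -> int:
    """Number of page requests needed to retrieve total_count records."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    return math.ceil(total_count / page_size)


print(pages_needed(31379, 1000))   # 32 requests at the default page size
print(pages_needed(31379, 10000))  # 4 requests at the maximum page size
```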

For example, consider the following use case: retrieve all transcript records completed in the last 24 hours.

Get the total record count.

GET https://{environment}.csod.com/services/api/x/dataexporter/api/objects/transcript_core?$filter=user_lo_comp_dt ge cast('currentDate - 24 hours', Edm.DateTimeOffset)&$count=true&$top=0

It's important to specify $top=0 when retrieving the total count, so that the service only returns a count. Calculating the total count can be a very expensive operation on the system, especially when the object has a very large record set.

Parse the response.

 {
 "@odata.context":"https://.../services/api/x/dataexporter/api/objects/$metadata#transcript_core",
 "@odata.count": 31379, 
 "value": []
 }

Proceed to retrieve the records.

GET https://{environment}.csod.com/services/api/x/dataexporter/api/objects/transcript_core?$filter=user_lo_comp_dt ge cast('currentDate - 24 hours', Edm.DateTimeOffset)&$select=transc_user_id,transc_object_id,reg_num,user_lo_score,user_lo_comp_dt

Response:

 {
 "@odata.context": "https://.../services/api/x/api/x/dataexporter/api/objects/$metadata#transcript_core",
 "value": [...],
 "@odata.nextLink": "https://{environment}.csod.com/services/api/x/dataexporter/api/objects/transcript_core?...&$skiptoken=..."
 }

Repeat the call with @odata.nextLink to retrieve additional records.

You can compare your accumulated record count to the initial count you retrieved, but the two may not be identical because data can be added or changed during the streaming/paging process (i.e., your final received record count should be greater than or equal to the initial record count).
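The final comparison can be sketched as a small check. `check_counts` is a hypothetical helper; it reads `@odata.count` from the count response shown earlier and treats a shortfall as an error, which is one possible policy rather than a requirement of the API.

```python
def check_counts(count_response: dict, received: int) -> int:
    """Return how many more records were received than initially counted.

    Raises ValueError when fewer records arrived than the initial count.
    """
    initial = count_response["@odata.count"]
    if received < initial:
        raise ValueError(f"received {received} records, expected at least {initial}")
    return received - initial
```

A positive return value is expected when new data was synced while you were paging; whether to count records before or after removing duplicates is a choice for your pipeline.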

# Throttling

To ensure the best possible performance of the APIs, CSOD has implemented throttle limits. The current DEAPI throttle limit is 120 calls per minute in total. The default RAPI throttle limit is 400 requests per minute.

For DEAPI, this means you may, for example, call 2 objects 60 times each in one minute (2 x 60 = 120) or 12 objects 10 times each in one minute (12 x 10 = 120).

If you exceed the call limit, you'll receive a 429 error. Throttle limits apply to all authenticated API calls. Any request made after the throttle limit is reached fails with this HTTP error, which your application should handle gracefully. We recommend that interface applications queue such requests until the throttle period has passed and then re-send them.

TIP

This is another area where DEAPI change tracking techniques can work to your advantage. By its nature, change tracking results in smaller data sets, fewer pages of data, and hence fewer requests for that data.
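The wait-and-resend recommendation for throttled requests can be sketched as below. This is an illustration under assumptions: `send_with_retry` and the `send` callable are hypothetical names, `send()` is assumed to perform one request and return `(status_code, body)`, and the 60-second wait assumes the per-minute limit resets within a minute.

```python
import time
from typing import Callable, Tuple

# Hypothetical signature: send() performs one request and returns (status_code, body).
Send = Callable[[], Tuple[int, object]]


def send_with_retry(send: Send, max_attempts: int = 5, wait_seconds: float = 60.0,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, object]:
    """Re-send a throttled request after waiting out the throttle period."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            # Any other status, success or failure, is returned to the caller.
            return status, body
        if attempt < max_attempts:
            sleep(wait_seconds)  # assumed: the limit resets within a minute
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

The `sleep` parameter exists so the wait can be replaced in tests or by a scheduler-based queue in a real pipeline.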