Rotating/circular caching

Development at: Stats groundwork improvements by jtc42 · Pull Request #533 · renalreg/ukrdc-fastapi

Problem: Cached just-in-time (JIT) calculation

Currently, stats are calculated on request (JIT) and then cached for some time. During that cache time, all requests retrieve the data quickly from a Redis database. Once the cache expires, the entry is deleted from the Redis database, and the next API request triggers a new calculation.

Crucially, this first request after the cache expires will not return until the calculation has finished. This essentially means that the first person to log on from a renal unit on any given day will have to wait the full calculation time to see any stats, but all subsequent requests that day will not.
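The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual ukrdc-fastapi code: `TTLCache` is a hypothetical in-memory stand-in for Redis so the example runs without a server, and `get_stats_jit`, the `stats:` key prefix, and the `calculate` callback are all assumed names.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for Redis: values expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries disappear, as Redis keys with a TTL do.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_stats_jit(
    cache: TTLCache,
    facility: str,
    calculate: Callable[[str], Any],
    ttl_seconds: float,
) -> Any:
    """Return cached stats, calculating them on request if the cache is empty."""
    key = f"stats:{facility}"  # assumed key layout
    cached = cache.get(key)
    if cached is not None:
        return cached  # fast path: served straight from the cache
    # Slow path: the request blocks here until the calculation finishes.
    value = calculate(facility)
    cache.set(key, value, ttl_seconds)
    return value
```

The slow path is the problem: whichever request arrives first after expiry pays the full calculation cost.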

For renal units with smaller cohorts (fewer patients included in the statistics data sets), this is less of an issue since both querying the database and calculating statistics are relatively quick. For larger units, however (>5000 records), these calculations can take 5-10 seconds or more in some cases. A delay of this length in receiving a response from the API can cause both technical and UX issues.

Solution: Cached ahead-of-time (AOT) calculation for large cohorts

To alleviate this issue, we will move instead to cached ahead-of-time calculations for renal units with a large number of patients. A large number of total patients generally corresponds to larger cohorts in statistics calculations, which in turn corresponds to slow calculations.

The API application (ukrdc-fastapi) already includes functionality for tasks to run in the background on a schedule. This is currently used to query and cache metadata from our Mirth Connect instances. This will be extended to include a subset of statistics calculations for a subset of renal units.

These tasks should repeat such that the next calculation completes approximately as the previous cached values expire.
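As a rough illustration of the timing requirement, the delay before starting the next calculation can be derived from the cache lifetime and the expected calculation time. The function name, the safety margin, and the convention of measuring the delay from the moment the previous result was cached are all assumptions made for this sketch.

```python
def next_refresh_delay(
    ttl_seconds: float,
    expected_calc_seconds: float,
    safety_margin_seconds: float = 0.0,
) -> float:
    """Seconds to wait, after caching a result, before starting the next run.

    The next run should finish at roughly the moment the previous value
    expires, so it must start one calculation time (plus an optional margin)
    before the TTL runs out.
    """
    delay = ttl_seconds - expected_calc_seconds - safety_margin_seconds
    if delay <= 0:
        # The calculation takes at least as long as the cache lifetime,
        # so the cache cannot be kept warm with these settings.
        raise ValueError("TTL too short for the expected calculation time")
    return delay
```

For example, with a one-hour TTL, a 10-second calculation, and a 30-second margin, the next run would start 3560 seconds after the previous result was cached. A margin is worth including because calculation times vary with database load, and finishing slightly early is cheaper than letting a request fall through to a JIT calculation.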

Some calculations, such as basic demographics statistics, probably need not be calculated AOT since they are acceptably fast to calculate on-request.

Selective pre-calculation

As mentioned, smaller cohorts likely need not be calculated AOT either. For this reason, we will set a cutoff on UKRDC patient count (counting only patient records included in stats calculations, i.e. those coming from an RDA feed), below which AOT calculation will be skipped.

This cutoff should be configurable at runtime as we will likely need to update it in response to the number of sites sending RDA files changing over the coming months.
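One possible shape for this, sketched under assumptions: the cutoff is read from an environment variable each time it is needed, so it can be changed without a code change. The variable name `STATS_AOT_MIN_PATIENTS`, the default of 5000 (borrowed from the ">5000 records" figure above), and both function names are hypothetical.

```python
import os
from typing import Dict, List, Mapping

# Assumed default, taken from the ">5000 records" figure above.
DEFAULT_AOT_CUTOFF = 5000


def get_aot_cutoff(env: Mapping[str, str] = os.environ) -> int:
    """Read the cutoff on every call so config changes are picked up."""
    raw = env.get("STATS_AOT_MIN_PATIENTS")  # hypothetical variable name
    if raw is None:
        return DEFAULT_AOT_CUTOFF
    return int(raw)


def select_aot_facilities(patient_counts: Dict[str, int], cutoff: int) -> List[str]:
    """Return the facility codes whose patient count meets the cutoff."""
    return sorted(
        code for code, count in patient_counts.items() if count >= cutoff
    )
```

Facilities below the cutoff are simply left to the existing JIT path, so lowering the cutoff later only adds work to the background tasks and does not change the API behaviour.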

Multi-processing

Currently, background tasks are running in Python threads. This may degrade API performance due to the limitations of the GIL. However, both database querying and stats calculations could run in separate processes since they need not share any resources with the main API process.

In the near future we should explore options to have the AOT calculation background tasks run in a separate process, to make full use of additional CPU cores and reduce their impact on the main API thread responsible for handling requests.

See for example using Celery:

Alternatively, since our traffic is fairly low, we can continue using FastAPI BackgroundTask for request-time tasks, and just create a separate container instance that only runs the scheduled tasks, and not the actual API listener. Effectively, the codebase would have two entry points: one starts the API, and the other only runs scheduled tasks.
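The two-entry-point idea could look something like the sketch below. Everything here is illustrative: `run_api` and `run_scheduled_tasks` are placeholders that only return a label, where a real implementation would start the ASGI server or the scheduler loop respectively, and the `api`/`tasks` mode names are assumptions.

```python
import argparse
from typing import Callable, Dict, List


def run_api() -> str:
    # Placeholder: a real entry point would start the API listener here.
    return "api"


def run_scheduled_tasks() -> str:
    # Placeholder: a real entry point would start the AOT task scheduler here.
    return "tasks"


# Hypothetical mode names mapped to their entry points.
ENTRY_POINTS: Dict[str, Callable[[], str]] = {
    "api": run_api,
    "tasks": run_scheduled_tasks,
}


def main(argv: List[str]) -> str:
    """Dispatch to exactly one entry point based on the first argument."""
    parser = argparse.ArgumentParser(description="Select which service to run")
    parser.add_argument("mode", choices=sorted(ENTRY_POINTS))
    args = parser.parse_args(argv)
    return ENTRY_POINTS[args.mode]()
```

Each container would then call `main` with a different argument (for instance `main(["api"])` in one and `main(["tasks"])` in the other), so both share one image and one codebase while the scheduled calculations run in their own process.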
