Currently, the process has issues with the size of the data that is trying to be passed through it. The proposed solution is to break the process up and have the tracing and the extraction handled separately.

Extraction

The extraction will take the form of a simple query that will grab the AKI alerts for a set period. A simple UI could be built that would allow the user to specify the time range and would output a CSV fit for use by the stats team. It might be the case that the query is run directly on the database by a member of the stats team if access to the data is granted. The extraction itself requires further planning but in essence, it should be very simple to get the data out with a query and build a tool later if it is required/desired.

Pre Extraction

Before the extraction can take place, the data supplied with each alert needs to be compared with the data returned as a result of running NHS tracing. The proposed solution to get around the volume of alerts is to run the tracing daily(or weekly depending on numbers). A new table would be created that would store data in cases where the data from an alert and the data from a trace do not match. This would address the temporal state of traced data at the time of the alert and allow the data supplied by the lab to remain unaltered.

For example, if an alert contains the first name Andy but the result of a trace on the record returns first name Andrew then a row would be generated in the new table that would hold the column name of the incorrect value, plus the correct traced value and a reference to the alert that had the issue. At the time of extraction, a row in the CSV would hold both the raw data supplied by the lab and the corrected trace data.

Database Changes

To accommodate the above proposed changes the aki_reg database will require some alterations. Firstly the ALERT_RAW database will require some extra data points to assist with keeping track of tracing results and audit. A new table will be created to hold any data points that didn't match the raw data. This table will hold the traced data and a link back to the corresponding alert.

Use Case

An alert is raised with the following (uncomplete for simplicity) data.

ALERT_ID	NHSNO	FORENAME	SURNAME	POSTCODE	CREATININE	TRACE_STATUS	TRACE_ID	TRACE_DATE
1005	1111111111	JOHN	SMITH	BS1 1AA	300	NULL	NULL	NULL

A scheduled query will search for any alerts that have a traced status of NULL and create a tracing file with the results. A TRACE_ID will be set that will link all the alerts returned with the trace file. Once the tracing file has been traced, the results will be compared with the alerts with matching TRACE_ID's. For the majority of alerts a traced status and date will be added to the alert and no further action will be required because the tracing data will match the data of the alert. The extract will reflect that the data was identical.

For the alert above the tracing file contains the following data

NHSNO	FORENAME	SURNAME	POSTCODE
1111111111	JON	SMITH	BS1 1AB

Here there are two data points picked up by tracing that don't match, FORENAME and POSTCODE. This would generate two new rows in the TRACED_DATA table as follows

ID	DATA_POINT	VALUE	ALERT_ID
27	FORENAME	JON	1005
28	POSTCODE	BS1 1AB	1005

When the extract is run it will grab all of the raw data plus check for any differing traced data and produce a CSV that will look something like this

RAW_NHSNO	TRACED_NHSNO	RAW_FORENAME	TRACED_FORENAME	RAW_SURNAME	TRACED_SURNAME	RAW_POSTCODE	TRACED_POSTCODE	CREATININE	DATE_OFALERT
1111111111	1111111111	JOHN	JON	SMITH	SMITH	BS1 1AA	BS1 1AB	300	24/08/2020

Opt-Out

All patient records will need to be put through the opt-out tracing. All data pertaining to opted-out patients will be filtered from the extract to ensure that IG is not comprimised but will persist in the database in complience with IG guidelines. The stats and IG team have both expressed a desire to track the percentage of patients that are excluded as a result of opt-out.

Anonymised Data Points

NHS numbers will continue to be encrypted and hashed as they are now. There is a need to better engineer the hashing process to reduce the time it takes to complete. The intent is to complete the hashing of NHS numbers at the same time as the regular tracing and to simplify the process while keeping in mind any IG concerns. Dates of birth will continue to be anonymised as they are now using the 15th of the month for all records. Postcodes will no longer be anonymised and LOSA are not required.

Dates of Death

One of the primary reasons to run the NHS tracing is to find any dates of death that have occured in a set time frame. All patinets that have had an alert within the past two years will need to be traced to check for occurances of death. This date of death needs to be added to the DEAD_PATIENT table as it is now. Once a patient is decesed there is no longer any point in tracing them so this should be considered when creating any queries for the extract.

Related Jira Tickets

Error rendering macro 'jira' : Unable to locate Jira server for this macro. It may be due to Application Link configuration.

Unable to locate Jira server for this macro. It may be due to Application Link configuration.

Unable to locate Jira server for this macro. It may be due to Application Link configuration.

Unable to locate Jira server for this macro. It may be due to Application Link configuration.

Unable to locate Jira server for this macro. It may be due to Application Link configuration.

Browser not supported