How to Sync Account Transactions From Open Banking APIs Without Unique Transaction IDs
TL;DR
A Robust Synchronization Algorithm: Fedor Tyurin, our CTO, outlines a step-by-step algorithm to improve synchronization, focusing on a "day-by-day" approach and using "double-check" verification to reduce errors and avoid duplicate or missed transactions. The algorithm matches transactions based on consistent properties (e.g., booking date, amount) and uses fuzzy matching to resolve ambiguities.
The first step for handling account transactions is to store them in an internal database - a must for many use cases, from invoice reconciliation to cash flow forecasting. While storing transactions might seem straightforward, the limitations of PSD2 APIs for fetching transaction data introduce challenges that can make reliable synchronization more complex than expected.
The most significant challenge arises when unique identifiers for transactions are either missing entirely or only provided for some entries. Unlike data formats such as camt.053 or MT940, which mandate unique identifiers, PSD2 and the associated regulatory technical standards do not require ASPSPs (banks and similar financial institutions) to provide them. Without unique identifiers, matching transactions accurately becomes difficult. The complexity grows further if the order of transactions is not guaranteed, or if some transaction properties, such as booking date or amount, might change after they’re initially fetched.
Accordingly, synchronising transactions fetched from PSD2 APIs requires careful consideration of these challenges. Without unique transaction IDs, the risk of errors like duplicates or missed updates increases. Moreover, if transaction properties change, simple tracking methods may fail. For instance, you can't just take a hash of the transaction data and use it as an identifier.
To address these challenges, a more robust synchronisation strategy is necessary – one that considers both the potential absence of unique identifiers and the possibility that some property values may change. Below, we outline one possible algorithm designed to improve synchronisation robustness by using a "day-by-day" approach and introducing a "double-check" verification process. This algorithm helps ensure that your internal database remains accurate, even when dealing with the limitations of PSD2 APIs.
The algorithm described below uses the "terminology" (i.e. field names) of Enable Banking’s API:
entry_reference: A unique transaction identifier (which, as mentioned earlier, might be "null").
booking_date: The date when the transaction is officially recorded by the ASPSP (bank).
credit_debit_indicator: Indicates the direction of the transaction, either credit (incoming) or debit (outgoing).
transaction_amount: The monetary value of the transaction, represented as an absolute value without a sign, along with its currency (as recorded by the ASPSP).
This algorithm synchronises only "accounted" transactions i.e. those with a status of BOOK in the Enable Banking API. These transactions are considered stable because their "fundamental values" – transaction_amount, credit_debit_indicator, and booking_date – remain constant. Although some transaction properties may vary due to the multiple layers involved in processing, the "fundamental values" remain constant. Ideally, the remaining fields should not change, but some form of fuzzy matching can be used to ensure accurate synchronisation, even if some properties have changed.
Algorithmic Steps to Synchronise “Accounted” Transactions from Open Banking Data
1. Fetch transaction data: retrieve transactions for a requested period, ensuring all pages are fetched. If there are multiple pages, all are received. Select a transaction from the list you fetched.
2. When transaction IDs are present: If the transaction has an entry_reference, search in your database for a matching transaction with the same entry_reference.
2.1. A match is found: If a matching transaction is found, update the transaction details in your database. Only one transaction should be found, otherwise, something is wrong and this needs to be logged for a manual review. It is also good at this early stage to assert that the values of the field booking_date, credit_debit_indicator and transaction_amount have not changed (these values should never change).
2.2. No match is found: If no transactions matching the entry_reference are found, search the database for transactions matching booking_date, credit_debit_indicator and transaction_amount.
2.2.1. No matches are found: Insert the transaction into your database.
2.2.2. One match found: update the transaction with the newly received values (including entry_reference).
2.2.3. Multiple matches found: If two or more matches are found, add an additional field to the fuzzy matching criteria and repeat the search until only one matching transaction is identified. This is the trickiest part of the matching process and may require using a combination of different fields to find a single match.
3. When Transaction IDs are missing: If the transaction does not have an entry_reference, try to find transactions matching booking_date, credit_debit_indicator and transaction_amount instead.
3.1. Matched transactions should not have an entry_refrence value, if one of them has, there is a problem.
3.2. If a unique match is not found, follow repeat steps 2.2.1 to 2.2.3 to refine the match criteria.
Python Pseudo-Code for Transaction Synchronization
Below is a "pseudo-code" in Python implementing the algorithm, which might be easier to read than the explanation above:
def process_transactions(transactions):
for transaction in transactions:
entry_reference = transaction.get('entry_reference')
booking_date = transaction['booking_date']
credit_debit_indicator = transaction['credit_debit_indicator']
transaction_amount = transaction['transaction_amount']
if entry_reference:
matching_transactions = find_transactions_by_entry_reference(entry_reference)
if len(matching_transactions) == 1:
# Check if values of booking_date, credit_debit_indicator, and transaction_amount have changed
db_transaction = matching_transactions[0]
if db_transaction['booking_date'] == booking_date and \
db_transaction['credit_debit_indicator'] == credit_debit_indicator and \
db_transaction['transaction_amount'] == transaction_amount:
update_transaction_details(db_transaction, transaction)
else:
# Handle unexpected changes in the transaction details
handle_error("Transaction details have changed")
elif len(matching_transactions) == 0:
insert_transaction(transaction)
else:
# Handle multiple matching transactions
handle_error("Multiple transactions with the same entry_reference")
else:
matching_transactions = find_transactions_by_criteria(booking_date, credit_debit_indicator, transaction_amount)
if len(matching_transactions) == 0:
insert_transaction(transaction)
elif len(matching_transactions) == 1:
if matching_transactions[0].get('entry_reference'):
# Handle unexpected entry_reference in the matching transaction
handle_error("Unexpected entry_reference in a matching transaction")
else:
update_transaction_details(matching_transactions[0], transaction)
else:
# Attempt to find exactly one matching transaction using additional fields
matching_transaction = find_single_matching_transaction(matching_transactions)
if matching_transaction:
update_transaction_details(matching_transaction, transaction)
else:
handle_error("Multiple matching transactions, unable to determine the correct one")
def find_transactions_by_entry_reference(entry_reference):
# Implement logic to fetch transactions from the database based on entry_reference
pass
def find_transactions_by_criteria(booking_date, credit_debit_indicator, transaction_amount):
# Implement logic to fetch transactions from the database based on booking_date, credit_debit_indicator, and transaction_amount
pass
def insert_transaction(transaction):
# Implement logic to insert a new transaction into the database
pass
def update_transaction_details(db_transaction, new_transaction):
# Implement logic to update the transaction details in the database
pass
def find_single_matching_transaction(transactions):
# Implement logic to find exactly one matching transaction using additional fields
pass
def handle_error(message):
# Implement error handling logic, such as logging or raising an exception
Pass
Why Don’t We Handle Synchronization on Our End?
At Enable Banking, we’re here to empower our partners, not to control or profit from their data. We don’t store, share, or monetize the data we provide. All data processing occurs in real time—there’s no caching or storage involved on our side. This means we can't review a period of transactions and assign IDs based on the logic outlined above. Instead, we focus solely on secure, transparent access to real-time data directly from banks through official PSD2 APIs. By avoiding any storage or interference with your data, we ensure that our partners have complete ownership and control, free from concerns about data misuse or hidden costs.
Our vision is to enable our partners to shape the future of finance. We are here to support the creation of the next wave of innovators. This is why we focus exclusively on bank connectivity and data harmonization, giving you access to quality, compliant data while you build unique, transformative solutions. We’ll stay focused on enabling, so you can keep innovating.
We’d love to hear your feedback – whether you've tried implementing this algorithm or have developed your own solutions to tackle these challenges. Feel free to share your experiences and insights in the comments!