Mastercard combats fraud with new tabular foundation model

Mastercard is stepping up its fight against fraud in digital payments with a new foundation model, a large tabular model (LTM), trained not on text or images, as most foundation models are, but on tabular data about transactions.

An LTM for payment security

The company has trained a foundation model on billions of transactions, with the goal of expanding the training set to hundreds of billions. The datasets include payment events and associated data such as merchant location, authorization flows, fraud incidents, chargebacks, and loyalty activity. Mastercard emphasizes that personal data is removed before training and that the model analyzes behavioral patterns rather than individual identities.

By excluding personal data, the technology reduces privacy risks that may affect other forms of AI in the financial services sector. The scale and richness of the data allow the model to infer commercially valuable patterns despite the absence of per-user information. Mastercard claims that sufficiently large volumes of behavioral data compensate for the loss of more specific data.

How an LTM works

The LTM architecture differs from that of large language models, which are trained on unstructured inputs and work by predicting the next token in a sequence. Mastercard's LTM instead examines the relationships between fields in multi-dimensional data tables, which places the technology closer to conventional machine learning than to what is popularly called artificial intelligence.

The model learns directly from raw inputs which relationships are predictable, so it can identify anomalous patterns that predefined rules do not capture. Mastercard describes the LTM as an "insights engine" that can be embedded in existing products to improve current workflows. The technical infrastructure for the LTM comes from Nvidia and Databricks.
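To make the idea of learning field relationships from data more concrete, here is a deliberately tiny sketch. It is not how Mastercard's LTM works internally, which the company has not detailed; it only estimates how likely one field value is given another from simple co-occurrence counts, so that rare combinations stand out without anyone writing a rule for them. The field names (`merchant_category`, `channel`) and the smoothing value are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def fit_pair_counts(rows, field_a, field_b):
    """Count how often each value of field_b appears alongside each value of field_a."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[field_a]][row[field_b]] += 1
    return counts

def conditional_prob(counts, a_value, b_value, smoothing=1.0):
    """Estimate P(field_b = b_value | field_a = a_value) with additive smoothing."""
    seen = counts.get(a_value, Counter())
    # All field_b values observed anywhere in the training rows.
    vocab = {b for per_a in counts.values() for b in per_a}
    total = sum(seen.values()) + smoothing * max(len(vocab), 1)
    return (seen[b_value] + smoothing) / total

# Hypothetical training rows: two fields from a transaction table.
rows = (
    [{"merchant_category": "grocery", "channel": "in_store"}] * 98
    + [{"merchant_category": "grocery", "channel": "online"}] * 2
    + [{"merchant_category": "electronics", "channel": "online"}] * 50
)

counts = fit_pair_counts(rows, "merchant_category", "channel")
p_common = conditional_prob(counts, "grocery", "in_store")
p_rare = conditional_prob(counts, "grocery", "online")
```

A low estimated probability marks a combination the data rarely shows. A real tabular foundation model would learn far richer relationships across many fields at once, but the principle of deriving expectations from data rather than from hand-written rules is the same.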

Implementation and future plans

Cybersecurity is the first area at Mastercard to see active deployment of the technology. Like many institutions, Mastercard operates several fraud detection systems that examine transaction data. These require human input at the outset and continuous tuning to define what constitutes suspicious behavior. Such behavior might include a sudden increase in transaction frequency or purchases made in different parts of the world within a short period of time.
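The two kinds of rule mentioned above can be sketched in a few lines. This is an illustrative sketch only, not Mastercard's rule set: the `Txn` record, the function names, and every threshold (five transactions in ten minutes, 900 km/h) are assumptions chosen for the example, and they are exactly the sort of values a human would have to set and keep tuning.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

@dataclass
class Txn:
    card_id: str
    ts: datetime
    lat: float  # merchant latitude in degrees
    lon: float  # merchant longitude in degrees

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def velocity_flag(history, new, window=timedelta(minutes=10), max_count=5):
    """Flag when the new transaction pushes the count inside the window above max_count."""
    recent = [t for t in history if timedelta(0) <= new.ts - t.ts <= window]
    return len(recent) + 1 > max_count

def impossible_travel_flag(prev, new, max_kmh=900.0):
    """Flag when the implied travel speed between two purchases exceeds max_kmh."""
    # Clamp the gap to at least one second to avoid dividing by zero.
    hours = max((new.ts - prev.ts).total_seconds(), 1.0) / 3600.0
    return haversine_km(prev.lat, prev.lon, new.lat, new.lon) / hours > max_kmh

# Hypothetical example: a purchase in Rome followed an hour later by one in New York.
base = datetime(2024, 1, 1, 12, 0)
rome = Txn("card-1", base, 41.9, 12.5)
new_york = Txn("card-1", base + timedelta(hours=1), 40.7, -74.0)
flagged = impossible_travel_flag(rome, new_york)
```

Rules like these are easy to audit, but each threshold encodes a human judgment about what counts as suspicious, which is the manual effort the article describes.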

Early results indicate improved performance over conventional techniques in specific cases. The company cites high-value, low-frequency purchases, which traditional models can flag as anomalies; the new model appears to distinguish such legitimate events more accurately.

Mastercard plans to implement hybrid systems that combine established procedures with the new model, a degree of caution that reflects the heavily regulated environment in which it operates. It recognizes that no single model performs well in all scenarios, so the LTM will take its place alongside the existing tools in this area.
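One way to picture such a hybrid arrangement is a decision function that takes both a rule outcome and a model score. The sketch below is an assumption about how the two signals could be combined, not a description of Mastercard's system: the function name, the score range of 0 to 1, the three outcomes, and all thresholds are hypothetical.

```python
def hybrid_decision(rule_flagged, model_score,
                    decline_at=0.9, review_at=0.6, clear_below=0.2):
    """Combine a rule outcome with a model risk score in [0, 1].

    All thresholds are illustrative assumptions, not values from Mastercard.
    """
    if not 0.0 <= model_score <= 1.0:
        raise ValueError("model_score must be between 0 and 1")
    # A very high model score declines regardless of the rules.
    if model_score >= decline_at:
        return "decline"
    # A rule hit is only cleared when the model considers the event low risk,
    # for example a rare but legitimate high-value purchase.
    if rule_flagged:
        return "approve" if model_score < clear_below else "review"
    # Without a rule hit, the model can still escalate on its own.
    if model_score >= review_at:
        return "review"
    return "approve"

# A rule fires on an unusual purchase, but the model scores it as low risk.
outcome = hybrid_decision(rule_flagged=True, model_score=0.05)
```

Keeping the rules in the loop means the model can only override them within explicit bounds, which fits the cautious, auditable approach the article attributes to the company.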

Mastercard claims the model can also be used to scan activity on loyalty programs, for portfolio management, and for internal analysis, all areas with large volumes of structured data. In current operations, companies often deploy many models, each adapted to a single activity, which multiplies training costs as well as validation and monitoring effort. A single foundation model that can be adapted to different activities could simplify processes and contain costs.

Mastercard hopes to increase the scale of the data used to train the model and its overall sophistication. It is also planning API access and SDKs to allow internal teams to develop new applications. Mastercard emphasizes the responsibilities that come with the LTM, mentioning privacy and transparency, model explainability, and auditability. Regulatory scrutiny is to be expected of any system that influences credit decisions or fraud outcomes, as well as of the data practices involved in operating the LTM.

Highly structured data, as opposed to text or images, is at the heart of the LTM. Large tabular models could be the beginning of a new generation of AI systems in banking and payments infrastructure.