Open Code Repository
Curated non-proprietary suptech code used by financial authorities
Title | Description | Publisher | Source(s) | Additional Relevant Links |
---|
Sentiment Analysis with Twint & Textblob (POC) |
This code extracts Tweets from Twitter profiles without using Twitter's API and applies NLP-based sentiment analysis to them. Twint uses Twitter's search operators to scrape Tweets by user, topic, hashtag, or trend.
|
GitHub |
Andrew Schleiss |
View Open Data |
TabFormer |
The repository provides modules for hierarchical transformers for tabular data, along with a synthetic credit card transaction dataset. Adaptations include a modified adaptive softmax for masking and a modified DataCollatorForLanguageModeling for tabular data. The modules are built within the transformers library from HuggingFace.
|
IBM |
IBM |
View Open Data |
AMLSim |
This project's goal is to construct a multi-agent simulator dedicated to anti-money laundering (AML) and provide access to synthetically generated data. The aim is to enable researchers to devise and deploy their innovative algorithms using a uniform dataset.
|
IBM |
IBM |
View Open Data |
Code for the Bank of England Staff Working Paper 848 |
This code creates predictive models for anticipating financial crises using machine learning on macro-financial data spanning 17 countries from 1870 to 2016. In comparison to traditional logistic regression, machine learning models exhibit superior performance in predicting crises beyond the sample period. The code employs a unique approach based on Shapley values to uncover economic factors influencing the machine learning models.
|
Bank of England |
Bank of England |
View Open Data |
Code for the Bank of England Staff Working Paper 905 |
This code implements the model discussed in the Bank of England Staff Working Paper 905. The model assesses the effectiveness of multiple requirements in bank regulation using a rule-based methodology.
|
Bank of England | Bank of England | |
Elliptic Plusplus |
This repository presents an applied data science approach to fraud detection on the Bitcoin network using the Elliptic++ dataset. It analyses four graph types: transaction-to-transaction, address-to-address interaction, address-transaction, and user entity graphs. Diverse machine learning algorithms are trained on these graphs to demonstrate detection of both illicit transactions and illicit addresses.
|
GIT DISL |
Git Disl |
View Open Data |
BERT4ETH |
This GitHub repository houses code (TensorFlow version) and datasets for the paper "BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection," accepted at the ACM Web Conference (WWW) 2023, along with the presentation slides. An update (Section 5.5) discussing multi-hop modeling has been added to the arXiv paper.
|
GIT DISL |
Youssef Elmougy and Ling Liu |
View Open Data |
A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context |
This notebook explores the FinTabNet dataset and builds a framework for joint table identification and cell structure recognition using visual context, aimed at capturing financial data.
|
IBM |
IBM |
View Open Data |
Finance Proposition Bank Notebook |
This notebook explores the Finance Proposition Bank dataset, known as FinProp, which features proposition bank-style annotations applied to sentences from the legal domain extracted from past IBM annual financial reports. The data comprises around 1,000 sentences, each annotated with a set of "universal" semantic role labels covering parts of speech, argument labeling, and predicate labeling.
|
IBM |
IBM |
View Open Data |
Android KYC Scan |
Open code repository by Accura Scan, an identity fraud prevention company, for customer onboarding and eKYC processes with real-time user authentication. It offers optical character recognition (OCR) for English, Latin, Chinese, Korean, and Japanese languages. Face biometrics compare images, verifying the user's selfie against the document image. User authentication and a liveness check verify and authenticate customers, guarding against identity theft and spoofing attacks. The liveness check uses both active and passive selfie techniques.
|
Accura Scan |
Accura Scan |
View Vendor |
Consumer Complaints Classification |
An open source layer for classifying consumer complaints data.
|
Shubham Chouksey |
FIS Global Solutions |
View Open Data |
Presidio |
Presidio helps to ensure sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.
|
Microsoft |
Microsoft |
|
Consumer Complaints |
The model analyses consumer complaints filed against companies for various financial products, such as credit card payment problems or debt collection tactics. For each financial product and year, it determines the total number of complaints, the number of companies receiving complaints, and the highest percentage of complaints directed at a single company. The analysis aims to show how consumer complaints are distributed and concentrated across companies in the financial sector.
|
Mahzad Khoshlessan, University of Michigan |
Mahzad Khoshlessan |
View Open Data |
Text Mining |
A generative probabilistic model used in natural language processing (NLP) and machine learning. It is specifically designed for topic modeling, a technique used to identify the topics present in a collection of text documents.
|
Stephen Hansen, University of Oxford |
Stephen Hansen |
View Solution |
Finra Trace |
Research project on the academic version of the Financial Industry Regulatory Authority (FINRA) Trade Reporting and Compliance Engine (TRACE). The model analyses interaction and trading behaviour among dealers in the over-the-counter (OTC) corporate bond market. Topic modeling techniques, mostly latent Dirichlet allocation (LDA), are used to analyse the bonds traded by each dealer on each day. Preliminary results show that LDA has the flexibility to analyse trading interaction in multiple dimensions.
|
Raymond Chen |
Raymond Chen |
|
ccdb5-api |
The API supports searching and retrieving complaint data, including search, suggestions based on input, and retrieval of complaints by ID. Its dependencies are batch-installed via pip: Django as the web framework, django-localflavor for country-specific Django helpers, Django REST framework for the REST API, elasticsearch as a low-level client for interacting with Elasticsearch, and requests for making HTTP requests to obtain data in various formats.
|
Consumer Financial Protection Bureau |
Consumer Financial Protection Bureau |
View Solution |
Regdown |
A Python-Markdown extension for interactive regulation text
|
Consumer Financial Protection Bureau |
Consumer Financial Protection Bureau |
|
ARX |
Open source software for anonymizing sensitive personal data. It has been designed from the ground up to provide high scalability, ease of use and a tight integration of the many different aspects relevant to data anonymization.
|
ARX |
ARX |
|
Black-it |
Black-it is a user-friendly toolbox for calibrating the parameters of agent-based models and simulations (ABMs). It uses advanced methods to explore the parameter space effectively. The black-box calibrator uses a loss function and a sequence of chosen search algorithms to estimate the desired parameters. It comes with a set of ready-to-use example models, loss functions and search algorithms. Custom models and functions can also be implemented for use with the calibrator.
|
Banca d'Italia |
Banca d'Italia |
|
Heavy Nodes in a Small Neighborhood: Algorithms and Applications |
Open code repo by King's College researchers that aims to isolate suspected money laundering activity by analysing series of interrelated transactions for 'smurfing'.
|
Society for Industrial and Applied Mathematics |
Society for Industrial and Applied Mathematics |
|
Hapi Multi Mongo |
A plugin code repo for MongoDB, a document-oriented (non-relational) database suited to large or complex datasets such as financial data sets. The plugin provides access to multiple MongoDB servers and databases during the request/reply life cycle. It accepts complex configuration options and exposes (decorates) the connections object on the server object.
|
Alyne |
Mitratech |
View Vendor |
Data Protection Framework |
A Python library and command-line application for identification, anonymization and de-anonymization of personally identifiable information (PII). The framework detects PII in two ways: pattern matching with regular expressions, and NLP-based named entity recognition (NER).
|
ThoughtWorks Datakind |
ThoughtWorks Datakind |
|
API-based Prudential Reporting System |
An Application Programming Interface (API) and back-office reporting and visualization application that (a) allows financial institutions to submit high-quality, granular data to the financial authority digitally, automatically, and at higher frequency; (b) enables supervisory staff to validate data faster and sharpen their analysis by generating customized reports in different formats for supervisory and policy development purposes; and (c) by improving data quality and access and developing new tools for data visualization and analysis, helps supervisors implement a risk-based supervisory approach that reduces compliance costs and promotes financial inclusion while ensuring financial stability and integrity.
|
Cambridge SupTech Lab |
R2A |
View Vendor View Solution |
R2A_AML_Supervision |
An API program that provides data infrastructure for AML compliance. For this project, the Mexican National Banking and Securities Commission (CNBV) collaborated with R2A to revamp its data infrastructure, aiming to enhance its anti-money laundering (AML) supervisory capabilities and accommodate the expanding fintech sector. The objectives included enabling digital submission of AML compliance information by financial institutions, improving the volume, granularity, and quality of AML-related data, importing historical records into a central platform, and enhancing AML-related data validation and analysis.
|
Cambridge SupTech Lab |
R2A |
View Vendor |
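
The Consumer Complaints entry above describes a per-product, per-year summary: total complaints, number of companies receiving complaints, and the highest percentage of complaints directed at a single company. The following is a minimal sketch of that aggregation, not the repository's own code. The function name `summarize_complaints`, the `(product, year, company)` input shape, and the sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def summarize_complaints(records):
    """Summarize complaints per (product, year).

    `records` is an iterable of (product, year, company) tuples, one per
    complaint. This input shape is hypothetical, chosen for the sketch.
    Returns {(product, year): (total, company_count, top_share_percent)}.
    """
    # Count complaints per company within each (product, year) group.
    per_group = defaultdict(Counter)
    for product, year, company in records:
        per_group[(product, year)][company] += 1

    summary = {}
    for key in sorted(per_group):
        companies = per_group[key]
        total = sum(companies.values())
        # Share of the group's complaints filed against the single
        # most-complained-about company, rounded to a whole percent.
        top_share = round(100 * max(companies.values()) / total)
        summary[key] = (total, len(companies), top_share)
    return summary


# Made-up sample data, for illustration only.
sample = [
    ("Credit card", 2019, "Bank A"),
    ("Credit card", 2019, "Bank A"),
    ("Credit card", 2019, "Bank B"),
    ("Debt collection", 2020, "Collector C"),
]
result = summarize_complaints(sample)
```

Grouping with a `Counter` per `(product, year)` key keeps the three statistics derivable from one pass over the data. A real pipeline would also need to normalize product and company names, which this sketch leaves out.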