Graph databases

Graph databases are one of the “hottest” topics in the world of data management. They follow a different concept than relational databases: information is stored in nodes (the counterpart of a record in an RDBMS) connected by relationships (the counterpart of links between tables in an RDBMS). Each node and relationship can have attributes (the RDBMS counterpart of columns), but the list of attributes can differ even between nodes or relationships of the same type. AZ Frame offers Neo4j, the graph database from the leading provider of such solutions, as well as its own tool, GraphIQ.
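
As a minimal illustration of the model described above, the Python sketch below represents nodes and relationships as plain objects with free-form property dictionaries. The class names `Node` and `Relationship`, the labels and the sample data are hypothetical; this is not the Neo4j or GraphIQ API, only a picture of the idea that two nodes of the same type may carry different attributes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """A graph node: a label plus free-form properties (no fixed column list)."""
    node_id: int
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed link between two nodes, with its own properties."""
    start: int
    end: int
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


# Two nodes of the same type ("Person") with different attribute sets.
jan = Node(1, "Person", {"name": "Jan Kowalski", "phone": "600100100"})
anna = Node(2, "Person", {"name": "Anna Nowak", "email": "anna@example.com"})
alfa = Node(3, "Company", {"name": "Alfa"})

# Relationships may carry attributes too, or none at all.
relationships = [
    Relationship(jan.node_id, alfa.node_id, "OWNS", {"share_percent": 60}),
    Relationship(anna.node_id, alfa.node_id, "WORKS_FOR"),
]

# Attributes present on one Person node but missing on the other.
only_on_jan = set(jan.properties) - set(anna.properties)
print(only_on_jan)
```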

Anti-Fraud and Anti-Money Laundering: A key application of graph databases

The design of Neo4j and GraphIQ databases allows data structures to be adapted to changing requirements (e.g. anti-fraud tactics).

Financial crimes are getting harder to detect. Fraudsters create:

artificial identities – new identities are created based on real data from various sources,
networks of connections, both small and large. The greater the number of related persons or companies, the greater the scam.

Standard fraud detection mechanisms (including those based on machine learning) largely fail to detect such situations, because they analyze the data itself rather than the relationships within it. If we connect people who use the same ID number or PESEL number, the same telephone number or e-mail address, or the same residential address, these people turn out to be related, even though in a conventional database they appear unrelated to each other. Algorithms built into the graph database detect such links and groups, which in turn makes it possible to uncover chains of criminal activity.
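
The linking step described above can be sketched in plain Python. The example below is only an illustration under stated assumptions: the customer records and field names are invented, and a simple union-find structure stands in for the algorithms built into a graph database. Customers who share any identifier value end up in the same group, even when the link is indirect.

```python
from collections import defaultdict

# Hypothetical customer records; as table rows they look unrelated.
customers = [
    {"id": "C1", "pesel": "90010112345", "phone": "600100100", "email": "a@example.com"},
    {"id": "C2", "pesel": "85050554321", "phone": "600100100", "email": "b@example.com"},
    {"id": "C3", "pesel": "77070798765", "phone": "600200200", "email": "b@example.com"},
    {"id": "C4", "pesel": "66060611111", "phone": "600300300", "email": "d@example.com"},
]

# Union-find: every customer starts as the root of its own group.
parent = {c["id"]: c["id"] for c in customers}


def find(x):
    """Return the representative of x's group (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Link customers that share the same value of any identifier.
first_seen = {}
for c in customers:
    for key in ("pesel", "phone", "email"):
        token = (key, c[key])
        if token in first_seen:
            union(first_seen[token], c["id"])
        else:
            first_seen[token] = c["id"]

# Collect groups and keep only those with more than one member.
groups = defaultdict(list)
for c in customers:
    groups[find(c["id"])].append(c["id"])

related_groups = sorted(g for g in groups.values() if len(g) > 1)
print(related_groups)
```

In this sample, C1 and C2 share a phone number and C2 and C3 share an e-mail address, so all three land in one group although C1 and C3 have no identifier in common.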

Features of graph databases:

Efficiency – for queries involving many tables, the design of graph databases can deliver performance several orders of magnitude better than relational databases. This is achieved without dedicated indexes.

Fast development – there is no predefined data model; the model is created as data flows in. The data in the model can be “read” intuitively by both IT specialists and business users.

Business responsiveness – because there is no predefined schema, change requests (CRs) in applications can be handled quickly. The data in the graph database can be shaped as required.

From laptop to clusters – graph databases offer high availability, transaction support and scalability. They can store billions of nodes and relationships, and can run locally on a laptop or workstation as well as in corporations as large-scale data stores.

Can be used by IT and business – thanks to the tools built into the database, data extraction and analysis can be carried out by people without IT knowledge.

Relations between the entities

Based on information about the beneficial owners of companies and their relations with these companies, we can create a network of connections in the form of a graph. We can search for connections between different people and/or companies by defining a condition (a specific first name and surname or company name, or, for example, the number of relationships leaving a node) and the maximum number of relationships separating the objects (e.g. we look for the shortest path between the user Jan Kowalski and the company Alfa with a maximum length of 5 relationships). We can create and analyze a graph containing millions of nodes and relationships (Neo4j supports many billions of nodes in one database).
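
The bounded shortest-path search mentioned above can be illustrated with a breadth-first search in plain Python. This is a sketch under assumptions: the edge list is invented, relationships are treated as undirected, and the helper `shortest_path` is hypothetical rather than a Neo4j or GraphIQ function.

```python
from collections import deque

# Hypothetical ownership and board relations, treated as undirected links.
edges = [
    ("Jan Kowalski", "Beta"),
    ("Beta", "Anna Nowak"),
    ("Anna Nowak", "Gamma"),
    ("Gamma", "Alfa"),
    ("Jan Kowalski", "Delta"),
]

# Build an adjacency map: node -> set of directly connected nodes.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def shortest_path(graph, source, target, max_hops):
    """Breadth-first search. Returns the shortest path as a list of nodes,
    or None if the target is not reachable within max_hops relationships."""
    if source not in graph or target not in graph:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # this path already uses max_hops relationships
        for neighbor in sorted(graph[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(graph, "Jan Kowalski", "Alfa", max_hops=5)
print(path)
```

With this sample data the connection runs through four relationships, so it is found with a limit of 5 but not with a limit of 3.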

We can also add to the graph people or relationships that do not follow directly from court registers, e.g. kinship, neighborhood, close cooperation between formally unrelated companies, or supply chain transactions. Additional relationships can be created by defining rules on the data in the database and applying them. The data may come from the institution’s own database or from external sources.
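
One way to picture rule-based relationships is the small Python sketch below. Everything in it is hypothetical: the person records, the relationship names `SHARES_ADDRESS` and `NEIGHBOR_OF`, and the rule format. It only shows the principle that a rule is a condition on a pair of records, and that each pair satisfying it yields a new relationship.

```python
from itertools import combinations

# Hypothetical person records from an institution's own database.
people = [
    {"name": "Jan Kowalski", "street": "Polna", "building": "1"},
    {"name": "Ewa Kowalska", "street": "Polna", "building": "1"},
    {"name": "Piotr Zielinski", "street": "Polna", "building": "3"},
    {"name": "Adam Nowak", "street": "Lipowa", "building": "7"},
]

# Each rule maps a relationship type to a condition on a pair of records.
rules = {
    "SHARES_ADDRESS": lambda a, b: (a["street"], a["building"])
    == (b["street"], b["building"]),
    "NEIGHBOR_OF": lambda a, b: a["street"] == b["street"]
    and a["building"] != b["building"],
}

# Apply every rule to every pair and record the derived relationships.
derived = []
for a, b in combinations(people, 2):
    for rel_type, applies in rules.items():
        if applies(a, b):
            derived.append((a["name"], rel_type, b["name"]))

for relationship in derived:
    print(relationship)
```

Comparing every pair is quadratic in the number of records, which is acceptable for a sketch but would need grouping by key for large data sets.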

Support in risk management

By defining appropriate conditions, e.g.:

Carrying out the same activity,
Trade in the same goods,
Same collateral issuer,
Relationships within capital groups:
- identifying a control relationship (e.g. preparation of consolidated statements, voting rights, decision-making powers),
- identifying economic dependence (e.g. mutual guarantees, a significant relationship with a single recipient or client, joint owners),

we can create, with Neo4j or GraphIQ, a clear graph containing the nodes and relationships that match the indicated transactions and conditions, and subject it to further analysis. It is also possible to use data from external sources.

Use of libraries and functions of Neo4j databases

The Neo4j database has over 500 functions implemented, and over 600 additional functions are available in the APOC library. Thanks to the design of graph databases and the mechanisms implemented in them (the Neo4j Graph Data Science Library), one can:

identify unrelated groups that use the same identifiers (e.g. phone numbers) (Louvain Modularity Algorithm);
identify groups that often come into contact with each other (e.g. ZUS compensation payments) (Connected Components Algorithm (Union Find));
examine the similarity of accounts or the similarity of chains of connections (Jaccard Algorithm);
investigate influence on others and deal size (PageRank Algorithm);
find additional relationships and add them to your data (e.g. neighborhoods, use of the same IP addresses) (Common Neighbors Algorithm);
find transactions or relationships with very short paths (Shortest Path Algorithm).
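
To give a feel for two of the measures listed above, the sketch below computes common neighbors and Jaccard similarity in plain Python. It is not a call into the Neo4j Graph Data Science Library; the account names, identifiers and helper functions are hypothetical, and the code only shows what the two measures calculate.

```python
# Hypothetical data: identifiers (phones, IP addresses) used by each account.
neighbors = {
    "account_A": {"phone_1", "ip_1", "ip_2"},
    "account_B": {"phone_1", "ip_1", "ip_3"},
    "account_C": {"phone_2", "ip_4"},
}


def common_neighbors(graph, a, b):
    """Identifiers that both accounts are connected to."""
    return graph[a] & graph[b]


def jaccard(graph, a, b):
    """Size of the shared neighbor set divided by the size of the combined set."""
    combined = graph[a] | graph[b]
    if not combined:
        return 0.0
    return len(graph[a] & graph[b]) / len(combined)


shared = common_neighbors(neighbors, "account_A", "account_B")
score_ab = jaccard(neighbors, "account_A", "account_B")
score_ac = jaccard(neighbors, "account_A", "account_C")

print(sorted(shared), score_ab, score_ac)
```

Accounts A and B share two of the four identifiers they use between them, giving a similarity of 0.5, while A and C share none.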