Anti-money laundering

Dedect relationship between A and E by tracing through payments with similar amounts and identifying payment chains.

For example:

lag means lag(daystamp,-1) over (partitin by accname, Cntpty_Acct_Name order by daystamp )

accname Event_Dt Tx_Amt Cntpty_Acct_Name daystamp id lag
a 2020-01-01 20.0 b 7305 5068 7306.0
a 2020-01-02 300.0 b 7306 5069 7307.0
a 2020-01-03 180.0 b 7307 5070 Infinity
b 2020-01-03 40.0 c 7307 5071 7307.0
b 2020-01-03 500.0 c 7307 5072 7308.0
b 2020-01-04 10.0 c 7308 5073 Infinity
b 2020-01-03 150.0 d 7307 5074 Infinity
c 2020-01-04 50.0 e 7308 5075 Infinity
d 2020-01-04 150.0 e 7308 5076 Infinity

You can run spark-submit aml.py to get the payment chains seem like money laundering. The same ‘batch_id’ indicates these transactions belong to a complete chain. The field ‘depth’ indicates the length of the chain.

id batch_id src dst amount_sum depth accname Event_Dt Tx_Amt Cntpty_Acct_Name
5068 0 a e 200.0 4 a 2020-01-01 20.0 b
5070 0 a e 200.0 4 a 2020-01-03 180.0 b
5071 0 a e 200.0 4 b 2020-01-03 40.0 c
5073 0 a e 200.0 4 b 2020-01-04 10.0 c
5074 0 a e 200.0 4 b 2020-01-03 150.0 d
5075 0 a e 200.0 4 c 2020-01-04 50.0 e
5076 0 a e 200.0 4 d 2020-01-04 150.0 e
5068 1 a c 500.0 3 a 2020-01-01 20.0 b
5069 1 a c 500.0 3 a 2020-01-02 300.0 b
5070 1 a c 500.0 3 a 2020-01-03 180.0 b
5072 1 a c 500.0 3 b 2020-01-03 500.0 c

Enjoy it !
?

GitHub

View Github