EmailDomainFinder

EmailDomainFinder is an OSINT program that helps you uncover a censored domain in an email address.
For example, you can find an account's censored email address by using the password recovery feature on sites such as Instagram or Twitter.
The email address will appear like this: h**********6@y***o.com.
Enter this email in EmailDomainFinder, and you'll get this result:
[Match] y***o.com matches with yahoo.com

For commonly used email providers this is not especially useful, but if your target uses a rarely used domain, uncovering it has never been this fast.
This program uses a frequently updated list of 6,100+ domains (https://gist.github.com/ammarshah/f5c2624d767f91a7cbdc4e54db8dd0bf).

*The 6,100+ domain list can be replaced by a list of 15,926,668 domains. See the end of the README for details.

Installation

git clone https://github.com/novitae/EmailDomainFinder
cd EmailDomainFinder
pip install -r requirements.txt
python3 refgen.py

Running refgen.py downloads the domain list and generates the reference list that the program needs to work.
To update the domain list, run python3 refgen.py again; it deletes the existing list and downloads the latest version.

If you want to install the first version, where the program asks you directly for the email/domain and whether you want to export,
clone the repository as above, then run git checkout b8b4f708be771a66a32e38e6d37bc35b17fa54e6 inside the EmailDomainFinder folder
(git clone does not accept a /tree/ URL), and continue with the same remaining steps as for the current version.

Running

To see the available options, run python3 emaildomainfinder.py -h. To check an email, run python3 emaildomainfinder.py -c <your_mail>.
To export the match results, run python3 emaildomainfinder.py -c <your_mail> -e.
Exported results are written to a file named DomainsCorrelation.txt in the folder that contains emaildomainfinder.py.
You can also enter a censored domain name on its own instead of a full email.
Every censored character in the entry must be a *, not x, ?, or any other placeholder.
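
The input rules above can be sketched as a small helper. This is only an illustration, not the code in emaildomainfinder.py: the function name extract_masked_domain is hypothetical, and the set of characters it accepts (letters, digits, "-", "." and "*") is an assumption.

```python
def extract_masked_domain(entry: str) -> str:
    """Return the (possibly censored) domain part of an email or a bare domain."""
    # Keep only what follows the last "@"; a bare domain is returned unchanged.
    domain = entry.strip().rsplit("@", 1)[-1]
    if "." not in domain:
        raise ValueError(f"not a domain: {domain!r}")
    # Assumption: besides "*", only letters, digits, "-" and "." are accepted.
    for ch in domain:
        if not (ch.isalnum() or ch in "-.*"):
            raise ValueError(
                f"unexpected character {ch!r}: censored characters must be '*'"
            )
    return domain


print(extract_masked_domain("h**********6@y***o.com"))  # y***o.com
```

Note that a check like this cannot detect an x used as a placeholder, since x is also a valid domain character.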

How it works

First, I'm a beginner in Python, so please don't trash-talk me if you find what follows terrible. Thanks.

refgen.py downloads the list of public email domain providers. It then takes each line of the list and creates what I call a "reference list".
For example, 0-mail.com is turned into (6, 3);0-mail.com;0#1 -#2 m#3 a#4 i#5 l#6 .#7 c#8 o#9 m#10.

Here, (6, 3) is the "reference": the length of each part of the domain on either side of the dots.
For example, len("0-mail") equals 6, and len("com") equals 3.
Next, 0-mail.com is simply the domain itself, kept so it can be retrieved later and written out.
Finally, 0#1 -#2 m#3 a#4 i#5 l#6 .#7 c#8 o#9 m#10 are the "character references".
They spell out the domain, with the position of each character after the #.
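
The line format described above can be reproduced with a few lines of Python. This is a minimal sketch rather than the actual refgen.py code; the function name make_reference_line is hypothetical.

```python
def make_reference_line(domain: str) -> str:
    """Build one 'reference;domain;character references' line for a domain."""
    # Lengths of the dot-separated parts, e.g. (6, 3) for "0-mail.com".
    reference = tuple(len(part) for part in domain.split("."))
    # Each character followed by "#" and its 1-based position in the domain.
    char_refs = " ".join(f"{ch}#{pos}" for pos, ch in enumerate(domain, start=1))
    return f"{reference};{domain};{char_refs}"


print(make_reference_line("0-mail.com"))
# (6, 3);0-mail.com;0#1 -#2 m#3 a#4 i#5 l#6 .#7 c#8 o#9 m#10
```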

When you launch emaildomainfinder.py, only the part of the entered email after the @ is kept, so that only the domain remains. The program builds a reference for it (('i', 'i'), or ('i', 'i', 'i') if the domain has three dot-separated parts), then spells out the domain in the same way as before to build its "character references". These are written to a sort of "cache" txt file.
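
As a rough sketch of this step, the snippet below keeps the domain part and builds character references for it. It assumes that censored characters (*) are skipped, so that only the known characters are compared later; the source does not state this explicitly, and the function name masked_character_references is hypothetical.

```python
def masked_character_references(email_or_domain: str) -> list:
    """Character references for the visible characters of a censored domain."""
    # Keep only the part after the "@" (a bare domain is used as-is).
    domain = email_or_domain.rsplit("@", 1)[-1]
    # Assumption: "*" positions are unknown, so they produce no reference.
    return [f"{ch}#{pos}" for pos, ch in enumerate(domain, start=1) if ch != "*"]


print(masked_character_references("h**********6@y***o.com"))
# ['y#1', 'o#5', '.#6', 'c#7', 'o#8', 'm#9']
```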

The lines of the "cache" file are then turned into a list, and the "character references" part of each entry in the domain list is turned into a list too. The two lists are compared: if every element of the entered domain's list is present in the "character references" of a listed domain, the name of that domain is printed and/or exported.
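
The comparison can be illustrated with Python sets. This is a sketch under assumptions, not the program's actual code: the helper names and the short candidate list are made up for the example, and requiring the part lengths (the "reference") to be equal is my reading of how the reference is used.

```python
def character_references(domain: str) -> set:
    # "char#position" for every visible character; "*" is skipped (assumption).
    return {f"{ch}#{pos}" for pos, ch in enumerate(domain, start=1) if ch != "*"}


def part_lengths(domain: str) -> tuple:
    # The "reference": length of each dot-separated part.
    return tuple(len(part) for part in domain.split("."))


def find_matches(masked_domain: str, candidates: list) -> list:
    wanted = character_references(masked_domain)
    shape = part_lengths(masked_domain)
    return [
        c for c in candidates
        # Same part lengths, and every known character sits at the same position.
        if part_lengths(c) == shape and wanted <= character_references(c)
    ]


# Hypothetical candidate list, standing in for the real domain list.
candidates = ["yahoo.com", "yandex.com", "ymail.com", "yahoo.fr"]
for match in find_matches("y***o.com", candidates):
    print(f"[Match] y***o.com matches with {match}")
# [Match] y***o.com matches with yahoo.com
```

Without the part-length check, a longer domain that happens to start with the same visible characters would also pass the subset test, which is why the sketch compares both.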

Replacing the 6,100+ domains with 15,926,668 domains

For a much larger search, you can replace the original 6,100+ domain list with a list of 15,926,668 domains scraped from https://www.email-format.com/

Procedure

Follow the instructions in the README of the repository the domain list comes from: https://github.com/novitae/TitanEmailsDomainList
If you want to go faster:

pip install termcolor
git clone https://github.com/novitae/TitanEmailsDomainList
cd TitanEmailsDomainList
python3 fullrefgen.py

When it asks for the path of the file, drag full_domain_list.txt and drop it into your terminal window.
This should write the file's path for you. Then press Enter.

You should then see many lines scrolling quickly in your terminal window, like this:

Generating reference for email.com
Generating reference for email.com
Generating reference for email.com

This is normal. Generating the reference list can take a long time, so you can do something else in the meantime.
It took me about 50 minutes on a 2.3 GHz dual-core Intel Core i5.

Once full_domain_list.txt has been generated, copy it into the EmailDomainFinder folder, replacing the old one.
If there is no old one, don't run refgen.py in EmailDomainFinder, or it will erase the full domain list and replace it with the 6,100-domain one.