Extract email addresses from a website

Last updated: January 1, 2023

For one reason or another, do you want to extract all the email addresses from a website? The Harvester script automates the task and searches for email addresses using a dozen sources and search engines.

The Harvester lets you quickly and accurately retrieve email addresses, as well as the subdomains linked to a website.

It is a kind of web crawler or web spider, a program that automatically explores the Internet in search of targeted content.

Application areas of The Harvester tool

The Harvester is often used by spammers to collect email addresses to send spam to, but it can also be used for more noble tasks:

  • Retrieve a company's email addresses in order to send CVs.
  • Find the email address of an old acquaintance.
  • Coupled with Maltego, audit and test a company's information system.
  • Test your own website to prevent spam or mailbombing.

How does The Harvester work?

The Harvester script searches the web for email addresses by looking for the @ character and then analyzing the characters before and after it. If the program considers the address valid, it is added to the database. This is why it is not recommended to write your email address "in plain text" on forums, blogs, etc. You may have noticed that on certain sites the contact addresses are displayed as JPEG images, or are written without the @ character, for example contact_gmail.com.
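To make the idea concrete, here is a minimal Python sketch of the approach described above: find each @ character, widen the selection to the left and to the right while the characters could belong to an address, then validate the candidate. It is an illustration only and not The Harvester's actual code; the function name find_emails, the accepted character sets and the validation pattern are all assumptions made for this example.

```python
import re

# Assumed, simplified character sets (real address syntax is more permissive).
LOCAL_CHARS = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._%+-")
DOMAIN_CHARS = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-")

# Assumed validation rule: local part, "@", then a domain ending in a dotted TLD.
VALID_EMAIL = re.compile(
    r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}$"
)


def find_emails(text):
    """Return the distinct email-like strings found in text, in order of appearance."""
    found = []
    seen = set()
    for i, ch in enumerate(text):
        if ch != "@":
            continue
        # Walk left over characters that may belong to the local part.
        start = i
        while start > 0 and text[start - 1] in LOCAL_CHARS:
            start -= 1
        # Walk right over characters that may belong to the domain.
        end = i + 1
        while end < len(text) and text[end] in DOMAIN_CHARS:
            end += 1
        # Drop surrounding dots, such as the full stop ending a sentence.
        candidate = text[start:end].strip(".")
        key = candidate.lower()
        if VALID_EMAIL.match(candidate) and key not in seen:
            seen.add(key)
            found.append(candidate)
    return found


if __name__ == "__main__":
    sample = "Write to contact@example.com or Sales@Example.org. Not found: contact_gmail.com"
    print(find_emails(sample))
```

Note that an address written as contact_gmail.com contains no @ character, so a scan of this kind never considers it, which is exactly why that obfuscation trick works against simple harvesters.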

How to use The Harvester?

The Harvester is included with Kali Linux. The easiest way to access it is to open a terminal window and run the following command: theharvester.

If you are using an operating system other than Kali Linux, you can download the tool directly from the site: http://www.edge-security.com.

To install it, open a Linux terminal and clone the Git repo:

git clone https://github.com/laramies/theHarvester

Then go to the created folder:

cd theHarvester

We then need to install the libraries required by the version of Python in use:

pip install requests

To make the script executable, run:

chmod +x theHarvester.py

Finish by running a command such as this one:

./theHarvester.py -d www.funinformatique.com -b all

On Kali Linux:

theharvester -d www.funinformatique.com -b all

This command extracts the email addresses for the website www.funinformatique.com from all the search engines and social networks known to The Harvester.

Let's take a closer look at this command line:

  • The -d option allows you to specify the target website.
  • The -b option specifies the search engine used to find email addresses.

We have several choices, including Google, Bing, Baidu, LinkedIn, Twitter and others. In my case, I chose the all option, which searches all of the sources mentioned above.
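If you want to launch the same search from a script, for example to test several of your own domains in a row, the command line can be assembled in Python. This is a minimal sketch that uses only the -d and -b options described above; the helper names build_command and run_harvester are hypothetical, the script path assumes you are in the cloned folder, and the exact spelling of source names other than all is an assumption to check against the tool's help.

```python
import subprocess


def build_command(domain, source="all", executable="./theHarvester.py"):
    """Build the argument list: -d is the target site, -b the source to query."""
    return [executable, "-d", domain, "-b", source]


def run_harvester(domain, source="all", executable="./theHarvester.py"):
    """Run the tool and return its raw text output (requires the tool to be installed)."""
    result = subprocess.run(
        build_command(domain, source, executable),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text
        check=False,          # do not raise if the tool exits with an error code
    )
    return result.stdout


if __name__ == "__main__":
    # Only show the command here; call run_harvester() once the tool is installed.
    print(" ".join(build_command("www.funinformatique.com")))
```

On Kali Linux, passing executable="theharvester" would match the command shown earlier.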

After running this command, this is what I get:

I was able to recover 4 email addresses displayed in plain text on the Web and 5 subdomains.

The Harvester turns out to be very useful for extracting email addresses from a website. It is one to add to your pentester's toolkit.