Extract email addresses from a website


Last updated: January 1, 2023

For one reason or another, do you want to extract all the email addresses from a website? The Harvester script will automate the task and search for emails using a dozen sources and search engines.

The Harvester lets you quickly and accurately retrieve the email addresses, as well as the subdomains, linked to a website.

It is a kind of web crawler or web spider, a program that automatically crawls the Internet for targeted content.

Areas of application of The Harvester

The Harvester is often used by spammers to collect email addresses to which they then send spam, but it can also be used for more noble tasks:

  • Retrieve a company's addresses in order to send it your CV.
  • Find the email address of an old acquaintance.
  • Coupled with Maltego, audit and test a company's information system.
  • Test your own website to protect it against spam or mail bombing.

How does The Harvester work?

The Harvester script searches the web for email addresses by looking for the @ character and then analyzing the characters before and after it. If the program considers the address valid, it is added to the results. This is why it is not recommended to write your email address in plain text on forums, blogs, etc. You may have noticed that some sites display their contact addresses as JPEG images, or leave out the @ character by writing, for example, contact_gmail.com.
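To make the idea concrete, here is a minimal Python sketch of this kind of matching. It is not The Harvester's actual code: the pattern, the extract_emails function and the sample text are assumptions made up for illustration, and the pattern is deliberately simpler than real email syntax.

```python
import re

# Simplified pattern (an assumption for this sketch): a local part,
# the @ character, then a dotted domain name.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return the unique addresses found in text, in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(text):
        address = match.lower()  # treat Sales@... and sales@... as the same address
        if address not in seen:
            seen.append(address)
    return seen


# Hypothetical page content: two plain-text addresses and one obfuscated one.
page = "Write to contact@example.com or Sales@example.com. Obfuscated: contact_gmail.com"
print(extract_emails(page))
```

The obfuscated form contact_gmail.com contains no @ character, so a simple pattern like this one does not pick it up, which is exactly why some sites write their addresses that way.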

How to use The Harvester?

The Harvester comes preinstalled on Kali Linux. The easiest way to access it is to open a terminal window and run the following command: theharvester.

If you are using an operating system other than Kali Linux, you can download the tool directly from the site: http://www.edge-security.com.

To install it, open a Linux terminal and clone the Git repository:

git clone https://github.com/laramies/theHarvester

Then go to the created folder:

cd theHarvester

Next, install the libraries the script needs for the version of Python you are using:

pip install requests

To make the script executable, run:

chmod +x theHarvester.py

Finally, run a command such as this one:

./theHarvester.py -d www.funinformatique.com -b all
On Kali Linux: theharvester -d www.funinformatique.com -b all

This command extracts the email addresses of the website www.funinformatique.com from all the search engines and social networks known to The Harvester.

Let's take a closer look at this command line:

  • The -d option specifies the target website.
  • The -b option specifies the search engine used to find email addresses.

There are several choices, including Google, Bing, Baidu, LinkedIn, Twitter, and others. In my case, I chose the all option, which runs the search across all the sources the tool supports.
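If you want to try several sources one after the other, the command line can be assembled from a small script. The Python sketch below only builds and prints the commands. The build_command helper is hypothetical, and the lowercase source names google and bing are assumptions: the names actually accepted by -b depend on the version of The Harvester you have installed.

```python
import shlex


def build_command(domain, source="all", executable="theharvester"):
    """Assemble the argument list for one run against one domain and one source."""
    # -d and -b are the two options described above.
    return [executable, "-d", domain, "-b", source]


# Source names here are assumptions; check the list your installed version accepts.
for source in ("google", "bing", "all"):
    print(shlex.join(build_command("www.funinformatique.com", source)))
```

Each printed line can be pasted into a terminal. To launch a run from Python instead, the same list can be passed to subprocess.run.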

After running this command, this is what I get:

I was able to retrieve 4 email addresses displayed in plain text on the web, as well as 5 subdomains.

The Harvester proves very useful for extracting email addresses from a website. One to add to your pentesting toolkit.