The information-gathering steps of footprinting and scanning are the most important steps before hacking. Good information gathering can make the difference between a successful penetration test and one that fails to provide maximum benefit to the client.
Information is a weapon: a successful penetration test or hacking process needs a lot of relevant information, which is why information gathering, also called footprinting, is the first step of hacking. Gathering valid login names and email addresses is therefore one of the most important parts of penetration testing.
TheHarvester was developed in Python by Christian Martorella. It is a tool that gathers e-mail accounts, user names and hostnames/subdomains from different public sources such as search engines and PGP key servers.
This tool is designed to help the penetration tester at an early stage of the test; it is effective, simple and easy to use. The sources supported are:
Google – emails, subdomains/hostnames
Google profiles – Employee names
Bing search – emails, subdomains/hostnames, virtual hosts
PGP key servers – emails, subdomains/hostnames
LinkedIn – Employee names
Exalead – emails, subdomains/hostnames
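To illustrate the kind of data collected from these sources, here is a minimal Python sketch that pulls email addresses and hostnames for one domain out of a block of raw text (for example, a saved search results page). It is not theHarvester's own code; the function name extract_for_domain and the regular expressions are assumptions made for this example.

```python
import re


def extract_for_domain(text, domain):
    """Return (emails, hosts) found in text that belong to the given domain.

    Hypothetical helper for illustration; not part of theHarvester.
    """
    escaped = re.escape(domain)
    # Must not be followed by more hostname characters (avoids "example.company").
    tail = r"(?![A-Za-z0-9-])"
    # Local part, "@", optional subdomain labels, then the target domain.
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + escaped + tail,
        re.IGNORECASE,
    )
    # At least one subdomain label followed by the target domain.
    # The lookbehind skips matches that start inside an email address or a longer name.
    host_re = re.compile(
        r"(?<![A-Za-z0-9@.-])(?:[A-Za-z0-9-]+\.)+" + escaped + tail,
        re.IGNORECASE,
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


if __name__ == "__main__":
    sample = (
        "Contact John.Doe@example.com or visit mail.example.com "
        "and www.example.com. Unrelated: bob@other.org"
    )
    print(extract_for_domain(sample, "example.com"))
```

The results are lower-cased and de-duplicated so that the same address found in several sources is only reported once.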
New features:
Time delays between requests
XML results export
Search a domain in all sources
Virtual host verifier
Getting Started:
If you are using Kali Linux, open a terminal and run the command theharvester.
If it is not available in your distribution, you can download it from http://code.google.com/p/theharvester/downlaod and extract it.
Give execute permission to theHarvester.py with [chmod 755 theHarvester.py].
Then simply run ./theHarvester.py; it will display the version and the other options that can be used with this tool, each with a detailed description.
#theHarvester -d [url] -l 300 -b [search engine name]
#theHarvester -d sixthstartech.com -l 300 -b google
-d [url] is the target domain from which you want to fetch the juicy information.
-l limits the search to the specified number of results.
-b specifies the search engine (data source) to query.
From the email addresses returned by this search, we can identify the pattern of the email addresses assigned to the employees of the organization.
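Identifying the email pattern can be sketched in Python as follows. This is only an illustration: it assumes you already have a few employee names (for instance from the LinkedIn source) alongside the harvested addresses, and the PATTERNS table and the function guess_email_pattern are hypothetical names introduced here, not part of theHarvester.

```python
from collections import Counter

# Candidate naming conventions (an assumed, non-exhaustive list).
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "firstlast": lambda f, l: f"{f}{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "first": lambda f, l: f,
    "last.first": lambda f, l: f"{l}.{f}",
}


def guess_email_pattern(emails, employees):
    """Return the pattern name that explains the most harvested addresses.

    emails    -- iterable of addresses such as "jane.doe@example.com"
    employees -- iterable of (first_name, last_name) tuples
    Returns None when no candidate pattern matches any address.
    """
    local_parts = {e.split("@", 1)[0].lower() for e in emails}
    counts = Counter()
    for first, last in employees:
        f, l = first.strip().lower(), last.strip().lower()
        if not f or not l:
            continue  # skip incomplete names
        for name, build in PATTERNS.items():
            if build(f, l) in local_parts:
                counts[name] += 1
    return counts.most_common(1)[0][0] if counts else None


if __name__ == "__main__":
    found = ["jane.doe@example.com", "john.smith@example.com", "info@example.com"]
    staff = [("Jane", "Doe"), ("John", "Smith"), ("Alan", "Turing")]
    print(guess_email_pattern(found, staff))
```

Once the dominant pattern is known, the same table can be used to build likely addresses for employees whose emails were not found by the search.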
#theHarvester -d sixthstartech.com -l 300 -b all
This command will grab the information from all of the sources supported by the installed version of theHarvester.
Save the results in an HTML file. Command:
#theHarvester.py -d sixthstartech.com -l 300 -b all -f pentest
The -f parameter is used to save the results in an HTML file, as shown in this example.
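If you run the tool repeatedly, the commands above can be assembled from a small Python helper. The sketch below only builds the argument list from the four options covered in this article (-d, -l, -b, -f); the function name build_harvester_command and the default executable name are assumptions, and other options of theHarvester are deliberately left out.

```python
import shlex


def build_harvester_command(domain, limit=300, source="google",
                            outfile=None, executable="theHarvester"):
    """Build the argument list for a theHarvester run.

    Hypothetical helper: it only knows the -d, -l, -b and -f options
    described in this article.
    """
    if not domain:
        raise ValueError("domain must not be empty")
    if int(limit) <= 0:
        raise ValueError("limit must be a positive number")
    args = [executable, "-d", domain, "-l", str(int(limit)), "-b", source]
    if outfile:
        # -f names the file the results are saved to.
        args += ["-f", outfile]
    return args


if __name__ == "__main__":
    cmd = build_harvester_command("sixthstartech.com", 300, "all", "pentest")
    # Print a shell-safe version of the command for review before running it.
    print(" ".join(shlex.quote(a) for a in cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) if theHarvester is installed and on your PATH; passing a list rather than a single string avoids shell quoting problems with unusual domain or file names.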