Preliminary steps to gather information about a target website

Do you want to hack a website?

Follow these steps first to gather some information about it.

Try to get the following:

  • IP address
  • Domain info
  • Technology used on the website (programming language, db, …)
  • Other websites on the same server
  • DNS records
  • Unlisted files, subdomains, etc.

IP address

We can start with a whois lookup (https://who.is/, https://whois.domaintools.com/) to find information about the domain and its owner.
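
The IP address itself can also be looked up locally by resolving the hostname. The following is a minimal sketch, assuming Python 3.9 or later; the function name resolve_ipv4 is illustrative and does not come from any of the tools mentioned here.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) pair.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_ipv4 with the target's domain name returns one or more addresses; a site behind a CDN or load balancer may return several.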

Technologies

To find out which technologies the website uses, we can check the Netcraft website (https://www.netcraft.com/tools/).
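
Some of this information is also visible in the HTTP response headers the site sends. The sketch below is not how Netcraft works; it only shows one simple signal, under the assumption that the server has not stripped headers such as Server or X-Powered-By. The names HINT_HEADERS, technology_hints and fetch_headers are hypothetical.

```python
from urllib.request import Request, urlopen

# Response headers that commonly reveal server-side technology.
HINT_HEADERS = ("Server", "X-Powered-By", "X-Generator", "Via")


def technology_hints(headers: dict[str, str]) -> dict[str, str]:
    """Pick out the headers that hint at the technology stack.

    Header names are matched case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in HINT_HEADERS
        if name.lower() in lowered
    }


def fetch_headers(url: str, timeout: float = 10.0) -> dict[str, str]:
    """Send a HEAD request and return the response headers as a plain dict."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

A typical use would be technology_hints(fetch_headers(url)) for the target's URL. An empty result only means the headers were absent, not that the technology cannot be identified.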

DNS records

To get DNS information, use the website https://www.robtex.com/

Other websites on the same server

In some cases a website is hosted on a server that also hosts many other websites. So besides your target website, you can also try to access the other websites on that server. Basically, all the websites on the same server share the same IP address.
robtex.com can list them. Bing can show them too: search for the IP address of the target website, and the results will show the other websites hosted on that server.
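
Because websites on the same server share an IP address, a list of candidate hostnames (for example, collected from the search results) can be checked by grouping them by the address they resolve to. This is a rough sketch, assuming Python 3.9 or later; group_by_ip is a hypothetical helper, and the resolver is passed in as a parameter so it can be swapped out.

```python
import socket
from collections import defaultdict
from typing import Callable


def group_by_ip(
    hostnames: list[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> dict[str, list[str]]:
    """Group hostnames by the IPv4 address they resolve to."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in hostnames:
        try:
            groups[resolve(name)].append(name)
        except OSError:
            # The name did not resolve (socket.gaierror is an OSError); skip it.
            continue
    return dict(groups)
```

Hostnames that end up under the same key as the target are candidates for being on the same server. A shared address can also belong to a CDN or load balancer, so treat the grouping as a hint rather than proof.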

Subdomains

Knowing the subdomains can help you find extra information about the target website.
To enumerate the subdomains we can use a Linux tool called knock (you need Python installed).

git clone https://github.com/guelfoweb/knock.git
cd knock
pip3 install -r requirements.txt

python3 knockpy.py <targetwebsite>

The result is a list of the subdomains that knock found.
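
To make the idea concrete, tools of this kind typically try candidate names from a wordlist and keep the ones that resolve. The sketch below is not knock's code; it is a minimal illustration assuming Python 3.9 or later, and the name find_subdomains and its parameters are hypothetical.

```python
import socket
from typing import Callable, Iterable


def find_subdomains(
    domain: str,
    words: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> dict[str, str]:
    """Return {subdomain: ip} for every wordlist entry that resolves."""
    found: dict[str, str] = {}
    for word in words:
        word = word.strip()
        if not word:
            # Skip blank lines in the wordlist.
            continue
        candidate = f"{word}.{domain}"
        try:
            found[candidate] = resolve(candidate)
        except OSError:
            # The candidate does not exist in DNS; try the next one.
            continue
    return found
```

With a wordlist such as ["www", "mail", "dev"], only the names that actually exist come back. A domain with wildcard DNS resolves every candidate, so results from such a domain need a second check.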

Unlisted files and folders

Finding files and folders can be very helpful, because they may contain usernames, passwords, or other sensitive information.
To discover files and folders exposed on the website we can use a tool named “dirb”. It’s a Linux app that uses brute force: it tries each name from a wordlist to find hidden files and folders.
The dirb wordlist contains many default file names, such as robots.txt and config.ini. These can reveal the paths that the website owner doesn’t want search engines to index, or the database configuration.
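
The wordlist approach can be sketched in a few lines. This is not dirb's implementation, only an illustration of the idea, assuming Python 3.9 or later and assuming that any status other than 404 is worth a closer look; http_status and discover_paths are hypothetical names.

```python
from typing import Callable, Iterable, Optional
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; keep their code.
        return error.code
    except (URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def discover_paths(
    base_url: str,
    names: Iterable[str],
    status_of: Callable[[str], Optional[int]] = http_status,
) -> dict[str, int]:
    """Return {url: status} for every wordlist name that is not a 404."""
    found: dict[str, int] = {}
    for name in names:
        url = f"{base_url.rstrip('/')}/{name.lstrip('/')}"
        status = status_of(url)
        if status is not None and status != 404:
            found[url] = status
    return found
```

A 403 is kept on purpose: it suggests the path exists even though access is denied. Some sites answer 200 for every path, which makes this simple check unreliable for them.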