What happens when you type
https://www.mywebsite.com in your browser and press Enter
To understand what happens, we first need to talk about domain names.
What is a Domain Name?
A domain name is the address of your website that people type into the browser’s URL bar to visit it.
In simple terms, if your website were a house, its domain name would be its address.
A more detailed explanation:
The Internet is a giant network of computers connected to each other through a global network of cables. Each computer on this network can communicate with other computers.
To identify them, each computer is assigned an IP address: a series of numbers that identifies a particular computer on the internet. A typical IP address looks like this:
Now an IP address like this is quite difficult to remember. Imagine if you had to use such numbers to visit your favorite websites.
Domain names were invented to solve this problem.
This is how a domain name is built:
Hypertext Transfer Protocol Secure HTTPS:
Is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS) or, formerly, Secure Sockets Layer (SSL).
It’s important to know that the final S in HTTPS means our traffic is encrypted, so anyone who intercepts it cannot read your information. This matters most when you send sensitive data, for example when you pay with your credit card.
World Wide Web www:
Technically, it’s a subdomain traditionally used to indicate that a site is part of the web, as opposed to some other part of the Internet like Gopher or FTP.
There is nothing special about the “www” subdomain other than tradition. Any string of letters can be used in a subdomain. Take a look in the address bar of your browser as you read this article. It starts with “blog” because our blog lives on a subdomain of our primary nexcess.net domain. The main site lives on a “www” subdomain.
The domain name com is a top-level domain (TLD) in the Domain Name System of the Internet. Added in 1985, its name is derived from the word commercial, indicating its original intended purpose for domains registered by commercial organizations. Later, the domain opened for general purposes.
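To make these parts concrete, here is a minimal Python sketch that splits the example address with the standard library's urllib.parse. The function name split_address is made up for this post, and the naive split on dots is an assumption that only holds for simple names like this one (it would misread suffixes such as co.uk).

```python
from urllib.parse import urlsplit

def split_address(url):
    """Break a URL into scheme, subdomain, domain name and top-level domain."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "scheme": parts.scheme,             # https
        "subdomain": ".".join(labels[:-2]),  # www
        "domain": labels[-2],                # mywebsite
        "tld": labels[-1],                   # com
    }

print(split_address("https://www.mywebsite.com"))
```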
We also need to know a few more terms to understand this “simple” request. Check this diagram to see how it works:
Domain Name System DNS:
is a hierarchical and decentralized naming system for computers, services, or other resources connected to the Internet or a private network.
The DNS directory that matches names to numbers isn’t located all in one place in some dark corner of the internet.
When your computer wants to find the IP address associated with a domain name, it first makes its request to a recursive DNS server, also known as recursive resolver. A recursive resolver is a server that is usually operated by an ISP or other third-party provider, and it knows which other DNS servers it needs to ask to resolve the name of a site with its IP address. The servers that actually have the needed information are called authoritative DNS servers.
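The chain of questions a recursive resolver asks can be sketched with a toy example. Everything here is in-memory and hypothetical: the server names are invented, the IP address is just the placeholder used later in this post, and the root and TLD levels are the usual layers of the DNS hierarchy rather than something spelled out above. A real resolver sends these questions over the network.

```python
# Toy data only: real resolvers query these servers over the network.
ROOT_SERVERS = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"mywebsite.com": "ns.mywebsite.com"}}
AUTHORITATIVE = {"ns.mywebsite.com": {"www.mywebsite.com": "100.95.224.1"}}

def recursive_resolve(hostname):
    """Follow root -> TLD -> authoritative referrals for a hostname."""
    labels = hostname.split(".")
    tld = labels[-1]
    zone = ".".join(labels[-2:])
    tld_server = ROOT_SERVERS[tld]               # 1. root refers us to the TLD server
    auth_server = TLD_SERVERS[tld_server][zone]  # 2. TLD server refers us to the authoritative server
    return AUTHORITATIVE[auth_server][hostname]  # 3. authoritative server holds the record

print(recursive_resolve("www.mywebsite.com"))
```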
Transmission Control Protocol/Internet Protocol TCP/IP:
is a suite of communication protocols used to interconnect network devices on the internet. TCP/IP can also be used as a communications protocol in a private computer network (an intranet or extranet).
A Firewall is a network security device that monitors and filters incoming and outgoing network traffic based on an organization’s previously established security policies. At its most basic, a firewall is essentially the barrier that sits between a private internal network and the public Internet. A firewall’s main purpose is to allow non-threatening traffic in and to keep dangerous traffic out.
Load balancers receive requests and distribute them to a particular server based on a configured algorithm. Some industry-standard algorithms are:
- Round robin
- Weighted round robin
- Least connections
- Least response time
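Two of these algorithms are easy to sketch in Python. This is only an illustration under simple assumptions: the class names and server names (app-1, app-2, app-3) are made up, and a real load balancer also handles health checks, weights, and timeouts.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out servers one after another, starting over at the end of the list."""
    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)

class LeastConnectionsBalancer:
    """Pick the server that currently has the fewest active connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to the server is closed.
        self.active[server] -= 1

rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([rr.pick() for _ in range(4)])
```

Round robin ignores how busy each server is, while least connections needs the balancer to track open connections, which is why it has a release step.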
A web server is hardware or software through which a computer can host a website. A web server can run in kernel mode or user mode: in kernel mode it runs as part of the operating system kernel, while in user mode it runs like any other installed app or program, which is typically slower. One well-known web server is Apache, which runs well on a variety of popular operating systems.
Web servers communicate with clients (those who are accessing their hosted websites) through the Transmission Control Protocol and Internet Protocol. Typically web servers are programmed to allow a certain amount of traffic, or a certain number of requests, for a period of time. This is set to protect the server from being overloaded, which in some cases can temporarily make it inoperable.
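One simple way such a limit could work is a fixed-window counter. The sketch below is only an illustration, not how Apache or any specific server implements it; the class name, the limit, and the window length are all made up.

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds."""
    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock            # injectable so the limiter can be tested
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has started: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False                  # over the limit: reject the request

limiter = FixedWindowLimiter(limit=2, window=1.0)
print([limiter.allow() for _ in range(3)])
```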
A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity (concurrent users) and reliability of applications. They improve the overall performance of applications by decreasing the burden on servers associated with managing and maintaining application and network sessions, as well as by performing application-specific tasks.
An application server is a program that resides on the server side; it is a server program that provides the business logic behind an application. This server can be part of a network or a distributed network.
Now, if we would like to know the purpose of a server program, it goes this way:
Ideally, server programs are used to provide their services to the client program that either resides on the same machine or lies on a network.
An application server is a type of server designed to install, operate, and host applications. In the early days of application servers, the number of applications brought to the Internet grew hugely. Those applications grew bigger as more and more functionality was demanded of them, and they became more complex to run and maintain. There was a need for some kind of program on the network that would share application capabilities in an efficient and organized manner.
A database, also called an electronic database, is any collection of data, or information, that is specially organized for rapid search and retrieval by a computer. Databases are structured to facilitate the storage, retrieval, modification, and deletion of data in conjunction with various data-processing operations. A database management system (DBMS) extracts information from the database in response to queries.
Now, what happens when you type https://www.mywebsite.com
1. You enter the URL in the browser.
Suppose you want to visit www.mywebsite.com, so you type www.mywebsite.com in the address bar of your browser. When you type a URL, you basically want to reach the server where the website is hosted.
2. The browser looks up the IP address of the domain name in the DNS (Domain Name System).
DNS is a list of domain names and their corresponding IP addresses, just like a telephone book lists the phone numbers corresponding to people’s names. We could access the website directly by typing its IP address, but imagine remembering a group of numbers for every website. So we only remember the name of the website, and DNS maps the name to the IP address.
The lookup checks the following places for the IP address, in order:
- Browser Cache: The browser keeps a cache of DNS records for a fixed amount of time. It is the first place checked.
- OS Cache: If the browser cache doesn’t contain the record, the browser asks the underlying operating system, which also maintains a cache of DNS records.
- Router Cache: If your computer doesn’t have the record cached, the query goes to the router, which also keeps a cache of DNS records.
- ISP (Internet Service Provider) Cache: If the IP address is not found in the three places above, the ISP’s cache of DNS records is searched. If it is not found there either, the ISP’s DNS server performs a recursive search: it queries several other DNS servers in turn to find the IP address.
So the domain name you entered is converted into an IP address. Suppose www.mywebsite.com has the IP address 100.95.224.1. Then, in principle, typing https://100.95.224.1 in the browser would reach the same server.
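The order of the cache checks can be sketched as a simple loop. This is a toy model: the caches are plain dictionaries, the function names are invented for this post, and the recursive search is faked so the example runs without a network.

```python
def find_ip(hostname, caches, recursive_search):
    """Check each cache in order; fall back to the recursive search on a miss."""
    for place, records in caches:
        if hostname in records:
            return records[hostname], place
    return recursive_search(hostname), "recursive search"

# Only the ISP cache knows the answer in this made-up scenario.
caches = [
    ("browser cache", {}),
    ("OS cache", {}),
    ("router cache", {}),
    ("ISP cache", {"www.mywebsite.com": "100.95.224.1"}),
]

def fake_recursive_search(hostname):
    # Stand-in for the ISP's recursive search; always returns the example address.
    return "100.95.224.1"

print(find_ip("www.mywebsite.com", caches, fake_recursive_search))
```

In a real program you would not walk these caches yourself: a call such as socket.getaddrinfo asks the operating system, which takes care of the rest.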
3. The Browser initiates a TCP connection with the server.
Once the browser has the IP address, it opens a connection to the server. The protocol most commonly used for this is TCP. The connection is established using a three-way handshake. It is a three-step process.
- Step 1 (SYN): The client wants to establish a connection, so it sends a SYN (Synchronize Sequence Number) packet, which tells the server that the client wants to start communicating.
- Step 2 (SYN + ACK): If the server is ready to accept connections and has open ports, it acknowledges the packet sent by the client with a SYN-ACK packet.
- Step 3 (ACK): In the last step, the client acknowledges the server’s response by sending an ACK packet. A reliable connection is now established and data transmission can start.
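In code you never send SYN or ACK packets yourself; the operating system performs the handshake when a program asks for a connection. Here is a minimal Python sketch using a throwaway server on the local machine so it needs no internet access. A browser would instead connect to the IP address found through DNS, usually on port 443 for HTTPS.

```python
import socket

# A throwaway local server so the example runs without internet access.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

# The OS carries out SYN, SYN-ACK and ACK while this call is connecting.
client = socket.create_connection((host, port), timeout=5)
conn, peer = server.accept()    # the server's end of the established connection

# With the connection established, data transmission can start.
client.sendall(b"hello")
received = conn.recv(1024)
print(received)

for s in (client, conn, server):
    s.close()
```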
4. The browser sends an HTTP request to the server.
The browser sends a GET request to the server asking for the www.mywebsite.com webpage. It also sends the cookies that the browser holds for this domain. Cookies are designed to let websites remember stateful information (such as items in the shopping cart or wishlist on a site like Amazon) or to record the user’s browsing activity. The request also carries header fields, such as User-Agent, that allow the client to pass information about the request and about itself to the server. Other header fields, like Accept-Language, tell the server which languages the client is able to understand. All these header fields are added together to form an HTTP request.
Sample HTTP request: Now let’s put it all together. The HTTP request below fetches the abc.htm page from the web server running on afteracademy.com:
GET /abc.htm HTTP/1.1
Host: afteracademy.com
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Accept-Encoding: gzip, deflate
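If you want to see how such a request is put together, here is a small Python sketch that assembles the text by hand. The helper name build_request is made up; it also adds the Host header, which HTTP/1.1 requires so the server knows which site is being asked for. Real code would normally use a library such as http.client instead of building strings.

```python
def build_request(method, path, host, headers):
    """Assemble an HTTP/1.1 request: request line, headers, then a blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF, and an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"

request = build_request("GET", "/abc.htm", "afteracademy.com", {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE5.01; Windows NT)",
    "Accept-Encoding": "gzip, deflate",
})
print(request)
```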
5. The server handles the incoming request and sends an HTTP response.
The server handles the HTTP request and sends a response. The first line of the response is called the status line. It consists of the protocol version (e.g. HTTP/1.1), followed by a numeric status code (e.g. 200) and its associated textual phrase (e.g. OK). The status code is important because it tells the client the outcome of the request:
- 1xx: Informational: It means the request was received and the process is continuing.
- 2xx: Success: It means the action was successful.
- 3xx: Redirection: It means further action must be taken in order to complete the request. It may redirect the client to some other URL.
- 4xx: Client Error: It means there is some sort of error on the client’s part.
- 5xx: Server Error: It means there is some error on the server-side.
The response also contains header fields like Server, Location, etc. These header fields give information about the server and the response. The Content-Length header is a number giving the exact length of the HTTP body in bytes. All these headers, along with some additional information, are added together to form an HTTP response.
Sample HTTP response: Now let’s put it all together to form an HTTP response for a request to fetch the abc.htm page from the web server running on afteracademy.com.
HTTP/1.1 200 OK
Date: Tue, 28 Jan 2020 12:28:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Mon, 22 Jul 2019 19:15:56 GMT
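On the receiving side, the browser has to take this text apart again. The sketch below parses the status line and headers of a shortened version of the sample response and looks up the status class from the list above. The function name and the dictionary are made up for illustration; real clients rely on a full HTTP library.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def parse_response_head(raw):
    """Split the head of an HTTP response into version, code, phrase and headers."""
    status_line, *header_lines = raw.split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        if not line:
            break  # a blank line marks the end of the headers
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return version, int(code), phrase, headers

raw = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Tue, 28 Jan 2020 12:28:53 GMT\r\n"
    "Server: Apache/2.2.14 (Win32)\r\n"
    "\r\n"
)
version, code, phrase, headers = parse_response_head(raw)
print(code, STATUS_CLASSES[code // 100], headers["Server"])
```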
6. The browser displays the HTML content.
All these steps happen each time we enter a URL, in the background and within milliseconds. That’s it for this blog. Hope you enjoyed reading it.