How Google Search Engine Works


Google Search Engine

A search engine is a set of programs that crawls web pages, stores them on servers, and indexes the stored pages so that it can show the pages relevant to a user’s query on the search engine results page (SERP). There are three major search engines, Google, Bing and Yahoo, but only Google has changed our lives completely.

The Google search engine changed the way we find things on the internet. When the internet boom started, it was hard to find websites with good content, but Google made it easy. It’s like a book with a proper index that tells us where the content is located.

Google has evolved a lot, from an academic project by Larry Page and Sergey Brin into a multi-billion dollar company. Trillions of web pages are available on the internet, and finding the most relevant pages for your search query within seconds is fascinating. Have you ever wondered how Google does that?

It’s a three-stage process consisting of:

1. Crawling

2. Indexing

3. Presenting Results


Crawling: Crawling is the process of browsing web pages with the help of web crawlers such as “Googlebot”.

  1. Crawlers scan web pages, collect data and identify the links on those pages.
  2. Crawlers move to other web pages by following the links contained in the page being scanned.
  3. The collected data is sent back to Google’s servers.

Google’s crawl process starts with a list of web page addresses discovered in previous crawls, or with the XML sitemaps submitted by webmasters. Whenever crawlers visit a website, they look for links to other web pages to visit. The crawlers watch for changes to existing websites, dead links and new websites so that the index can be updated. They also determine which sites to crawl, how often, and how many pages to retrieve from each site. Google does not charge you to crawl your website or to rank it in its search results.
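The crawl loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Googlebot is actually built: the names `crawl`, `LinkExtractor`, `fetch` and `TOY_WEB` are made up for this example, and the “web” is a small in-memory dictionary so the code runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls.

    fetch(url) is a hypothetical helper that returns the page HTML,
    or None when the link is dead.
    """
    frontier = deque(seed_urls)   # addresses waiting to be visited
    seen = set(seed_urls)         # addresses already queued
    store = {}                    # url -> HTML "sent back to the server"
    while frontier and len(store) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:          # dead link: nothing to store
            continue
        store[url] = html
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:  # follow links found on this page
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return store


# A toy, made-up web of three pages and one dead link.
TOY_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/missing">gone</a>',
    "http://example.test/b": '<a href="/">home</a>',
}

pages = crawl(["http://example.test/"], TOY_WEB.get)
```

A real crawler would also respect robots.txt, limit its request rate and schedule revisits, all of which are left out here.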

Indexing: After crawling through trillions of web pages and storing them, it’s time to create an index so that results can be served quickly to users. The Google index consists of information about words and their locations. While indexing, Google reads the title tags, heading tags, permalink and the ALT attributes of images to better understand a web page. Googlebot may not be able to read content that sits behind dynamic web pages or inside rich media files.

Google search does not return results by simple query matching alone. It relies on complex calculations done by algorithms running behind the search engine to show you the results you expect. For example, if you search for “airplane”, you don’t want to see a page that is just filled with the word airplane. You probably want information about airplanes, or images or videos of them.

Indexing systems also consider the date on which a page was published and what information, images or videos it contains. With the help of the Knowledge Graph, Google is trying to give you a much better search experience for people, places, things and so on.
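An index of “words and locations” is commonly built as an inverted index. The sketch below is a minimal illustration of that idea, assuming plain-text pages and made-up page names (`p1`, `p2`). The functions `build_index` and `search` are hypothetical, and the matching is deliberately naive compared with what the article describes Google doing.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the list of (page, position) pairs where it occurs."""
    index = defaultdict(list)
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].append((url, position))
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    page_sets = [{url for url, _ in index.get(word, [])} for word in words]
    return set.intersection(*page_sets)


# Two made-up pages for illustration.
docs = {
    "p1": "Airplane history and airplane design",
    "p2": "Train history",
}
index = build_index(docs)
matches = search(index, "airplane history")
```

Because the index is keyed by word, a lookup only touches the entries for the query words instead of scanning every stored page, which is why an index makes serving results fast.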

Presenting Results:

Google handles billions of searches each day, and retrieving relevant results in a fraction of a second is amazing. When you type a search query, Google looks in its index for matching pages and shows you the most relevant results. Relevancy is calculated from more than 200 factors, one of which is PageRank. The PageRank algorithm measures the importance of a page based on the incoming links from other websites. Basically, each link to your site from a page on another site improves your website’s PageRank. It is not the only algorithm used by Google, but it was the first one. Google keeps improving its detection of spam sites, which can negatively affect search results. Links are valuable when they are given on the basis of content quality.
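To make the incoming-links idea concrete, here is a simplified PageRank computed by repeated iteration over a tiny, made-up link graph. It is a textbook-style sketch, not Google’s production ranking: the function name `pagerank`, the damping factor of 0.85 and the three pages `a`, `b` and `c` are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly over its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Made-up graph: a and b both link to c, and c links back to a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
```

In this toy graph, `c` ends up with the highest score because two pages link to it, `a` comes next because the important page `c` links to it, and `b` comes last because nothing links to it.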

In order to rank high in Google search, Google should be able to crawl your website properly. To improve crawling, you can submit an XML sitemap to Google Webmaster Tools. Webmaster Tools can also help you improve the ranking of your website because it provides data about your site, such as where traffic is coming from, which pages users visit most, and how much time they spend on your website.
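For readers who have not seen one, an XML sitemap is simply a list of page addresses in a standard format. The sketch below builds a minimal one with Python’s standard library. The function name `build_sitemap` and the example URLs are made up, and real sitemaps often carry optional fields such as last-modified dates that are omitted here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site with two pages.
sitemap_xml = build_sitemap([
    "http://yourwebsitename.com/",
    "http://yourwebsitename.com/about",
])
```

The resulting text would typically be saved as sitemap.xml on the website and then submitted through Webmaster Tools.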

If you want to see which pages of your website are indexed, just type site:yourwebsitename.com into Google search.

Photo credit: MoneyBlogNewz via photopin cc

Neeraj Kaple is a software engineer based in Mumbai. He writes about software, blogging tools, internet marketing and how-to guides. He is the founder of Techmazic.com.