How Search Engines Work - Crawling, Indexing, & Ranking

The Process Behind Search Engines

Whether we are searching for a local business on a map application, a video to stream, or an answer to a general question, most of us rely on search engines countless times each day. Even so, few people know how search engines choose the web pages they show us, or why they work the way they do.

The process has three crucial steps: crawling, indexing, and ranking. Understanding these steps helps you optimize your website for search engines, which can ultimately bring in more users and grow your organic source of revenue.

Crawling

Crawling is the process by which a crawler discovers new and updated pages to be added to a search engine index. Crawling can also be called web crawling or spidering.

By definition: A crawler is a bot (a program or automated script) that systematically browses the web.

In layman’s terms: A crawler is a search engine’s right-hand man.

Types of Search Engine Crawlers:

Googlebot is a web crawler used by Google.
Bingbot is a web crawler deployed by Microsoft in 2010 to supply information to its Bing search engine. It replaced what used to be the MSN bot.
Slurp Bot is the web crawler used by Yahoo, although a lot of Yahoo search is now powered by Bing.

Crawlers may also be called spiders, web spiders, spiderbots, or robots.

The Crawling Process

The crawling process starts with a list of web addresses from past crawls and sitemaps provided by website owners.
Crawlers use links on those sites to discover other pages.
Crawlers send the information they find back to the search engine’s servers, where it is recorded in a search index.
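
The steps above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine crawler is built: the FAKE_WEB dictionary, the crawl function, and the fake_fetch_links helper are all hypothetical names, and the "web" here is an in-memory stand-in so the example runs without network access.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
# A real crawler would download each page over HTTP and parse its HTML instead.
FAKE_WEB = {
    "https://example.com/": ["https://example.com/about", "https://example.com/blog"],
    "https://example.com/about": ["https://example.com/"],
    "https://example.com/blog": ["https://example.com/blog/post-1"],
    "https://example.com/blog/post-1": [],
}


def fake_fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return FAKE_WEB.get(url, [])


def crawl(seed_urls, fetch_links, max_pages=100):
    """Breadth-first crawl: start from seed URLs, follow links, visit each page once."""
    frontier = deque(seed_urls)  # URLs waiting to be visited
    seen = set(seed_urls)        # URLs already queued, so nothing is visited twice
    crawled = []                 # visit order; stands in for data sent to the index

    while frontier and len(crawled) < max_pages:
        url = frontier.popleft()
        crawled.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return crawled


pages = crawl(["https://example.com/"], fake_fetch_links)
print(pages)
```

The seed list plays the role of "web addresses from past crawls and sitemaps", and the frontier queue is how links on known pages lead the crawler to new ones.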

Indexing

A web index is a database of information on the Internet. Having your page indexed is the next step after it gets crawled.

Search engines use these databases to store billions of pages of information. So, when you use a search engine, you aren’t actually searching everything that exists on the Internet. You are searching that search engine’s index of stored pages and information.

What is the Purpose of Indexing?

The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query.

Without an index, the search engine would have to scan every document in the corpus for each query, which would require considerable time and computing power.

For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours.
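
To make the difference concrete, here is a minimal sketch of an inverted index, one common way to build a search index. The DOCS sample data and the function names are made up for illustration, and real search indexes store far more than word-to-document mappings.

```python
from collections import defaultdict

# Hypothetical sample corpus: document id -> text.
DOCS = {
    1: "search engines crawl the web",
    2: "an index makes search fast",
    3: "crawlers follow links between pages",
}


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_with_index(index, query):
    """Look up each query word in the index and intersect the results."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def search_by_scanning(docs, query):
    """Sequential scan: read every document for every query (slow at scale)."""
    words = query.lower().split()
    if not words:
        return set()
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(word in text.lower().split() for word in words)
    }


index = build_inverted_index(DOCS)
print(search_with_index(index, "search fast"))
print(search_by_scanning(DOCS, "search fast"))
```

Both functions return the same documents, but the indexed version only touches the entries for the query words, while the scan has to read every document each time.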

How to Get Your Website Indexed by Search Engines

Add your site to Google Search Console
Create a sitemap
Create a robots.txt
Create internal links
Earn inbound links
Encourage social sharing
Create a blog
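
As a small illustration of the robots.txt and sitemap items above, the sketch below uses Python's standard urllib.robotparser module to check how a crawler would read a robots.txt file. The file contents, the www.example.com domain, and the /admin/ path are invented for the example; your own rules will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block crawlers from /admin/, allow everything else,
# and point them at the sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the text directly, no network request

blog_allowed = parser.can_fetch("Googlebot", "https://www.example.com/blog/")
admin_allowed = parser.can_fetch("Googlebot", "https://www.example.com/admin/login")
sitemaps = parser.site_maps()  # available in Python 3.8 and later

print(blog_allowed, admin_allowed, sitemaps)
```

Running a check like this before publishing a robots.txt file can help catch a rule that accidentally blocks pages you want indexed.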

Ranking

Search engine ranking is the position at which a website appears in the results of a search engine query.

Each page of the search results typically lists 10 websites, although the page is often heavily augmented with local listings, videos, and images.

Three Mechanisms for Ranking Websites

Scoring: Major search engines use hundreds of factors spread across many algorithms. Think of it like an onion and its layers. All too often, people refer to “the Google algorithm” when in fact Google uses many algorithms. The combined score from all of these algorithms provides the initial rankings.
Boosting: A boost is an additional element or signal that can raise a page’s position in the rankings. One example is Google’s statement that fast mobile sites are given a boost in mobile search. Various forms of personalization also use a boosting element to re-rank results.
Dampening: Not to be confused with a penalty, a dampening factor is an element that lowers a page’s ranking after the initial scoring process. Examples include the now-infamous Google Penguin and Panda algorithms. While their effect may look like a penalty, it is in fact a dampening element.
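
One way to picture how the three mechanisms fit together is the toy model below. It is purely an assumption for illustration: the Page class, the signal names, and the idea of applying boosts and dampeners as simple multipliers are all hypothetical, and no search engine has published its ranking in this form.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    signal_scores: dict                            # hypothetical per-algorithm scores
    boosts: list = field(default_factory=list)     # multipliers above 1 raise the score
    dampeners: list = field(default_factory=list)  # multipliers below 1 lower the score


def final_score(page):
    """Scoring first, then boosting, then dampening (a simplified assumption)."""
    score = sum(page.signal_scores.values())  # combined initial score
    for boost in page.boosts:
        score *= boost
    for dampener in page.dampeners:
        score *= dampener
    return score


def rank(pages):
    """Order pages from highest to lowest final score."""
    return sorted(pages, key=final_score, reverse=True)


pages = [
    # Strong initial score, but dampened (for example, by thin content).
    Page("https://a.example/", {"content": 0.6, "links": 0.3}, dampeners=[0.5]),
    # Weaker initial score, but boosted (for example, a fast mobile page).
    Page("https://b.example/", {"content": 0.5, "links": 0.2}, boosts=[1.2]),
    # No boosts or dampeners applied.
    Page("https://c.example/", {"content": 0.4, "links": 0.4}),
]

ranked_urls = [page.url for page in rank(pages)]
print(ranked_urls)
```

In this made-up data, the page with the highest initial score ends up last because of its dampener, while the boosted page moves to the top.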

Factors That Contribute to Search Engine Positioning

Google is commonly said to take over 200 ranking factors into consideration.

Some high-level signals are:

On-Page Signals such as Content & User Engagement
Off-Page Signals such as Links & Social Reputation

Algorithm Updates

An algorithm is a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer. Search engines use algorithms to select pages from their index and rank them by how relevant they are likely to be to the search terms you used.

On Google, other elements like personalized and universal results may also change your search results. In personalized results, the search engine uses additional information it knows about the user to return results tailored to that user’s interests.

Universal search results combine:

Videos
Images
Maps
Shopping/Product Listing Ads (PLA)
News

Algorithm Update Factors

Each year, Google is estimated to change its search algorithm around 500–600 times.

As one webmaster says of Google: “They move the toilet mid-stream.”

While most of these changes are minor, Google occasionally rolls out a “major” algorithmic update (such as Google Panda and Google Penguin) that affects search results or rankings in a significant way.

Current Google Algorithm Updates

May 2020 Core Update - May 4th, 2020
January 2020 Core Update - January 13th, 2020
BERT Update - October 25th, 2019
Broad Core Update - September 24th, 2019
June 2019 Core Update - June 2nd, 2019
Florida 2 Update - March 12th, 2019

Monitoring the trends in algorithm updates can provide insight into how to optimize your website for the eyes of search engine crawlers. Your business can benefit from search engine optimization.