How crawl works in SharePoint || How indexing works || Basic Concept

Hey Friends,

Today I am here with a conceptual topic: how search works in SharePoint. For search to work, SharePoint must first index the content sources.

So, how does a crawl index the content in MOSS?

The main thing is that it can crawl and index almost anything stored on a server, in any format: PDF, ZIP, Word, Excel, TXT, HTML, RTF, other MS Office formats, etc. Indexing content in non-Microsoft formats is a little more complicated, but interesting. The basic process is the same for all of them, so let’s have a look at it.

1) When the crawl (index) schedule runs, the crawler visits every location you have defined, which is what SharePoint calls a content source.

2) When it finds a file there, it looks at the file's extension and checks in the SharePoint SSP whether that file type is set to be indexed or not.

3) Once SharePoint confirms the file type, it looks for an IFilter to read the file. An IFilter is a piece of software that knows how to read one file format and extract its text. Every file type needs its own IFilter.

4) If SharePoint finds an IFilter for the file, it opens the file and starts scanning it. Words that are not useful in search and do not need to be indexed (noise words, for example numerals such as 1 and 2) are removed.

5) After scanning the whole file, it writes the content to the index file along with a pointer to the name and location of the file.

6) Once a file has gone through the full process, the crawler moves on to the next file and repeats the same steps.
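
The six steps above can be sketched in a few lines of Python. This is only a rough illustration of the idea, not how SharePoint is actually implemented: the names ALLOWED_EXTENSIONS, FILTERS, NOISE_WORDS and crawl are all made up for this example, the "IFilters" are plain Python functions, and the content source is just a dictionary of path to file bytes.

```python
import os
import re
from collections import defaultdict

# Hypothetical stand-in for the file types enabled for indexing (step 2).
ALLOWED_EXTENSIONS = {".txt", ".html", ".pdf"}

# Hypothetical stand-ins for IFilters: one reader per file type (step 3).
def read_plain_text(data: bytes) -> str:
    return data.decode("utf-8", errors="ignore")

def read_html(data: bytes) -> str:
    # Very rough tag stripping, good enough for a sketch.
    return re.sub(r"<[^>]+>", " ", data.decode("utf-8", errors="ignore"))

# Note: ".pdf" is allowed above but has no reader here,
# just like a server where the PDF IFilter is not installed.
FILTERS = {".txt": read_plain_text, ".html": read_html}

# Hypothetical noise word list (step 4).
NOISE_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def tokenize(text: str) -> list:
    """Split text into lowercase words, dropping noise words and pure numerals."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in NOISE_WORDS and not w.isdigit()]

def crawl(content_source: dict):
    """Walk a content source (path -> bytes) and build a word -> paths index."""
    index = defaultdict(set)
    skipped = []
    for path, data in content_source.items():          # steps 1 and 6
        ext = os.path.splitext(path)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:               # step 2
            skipped.append((path, "file type not enabled"))
            continue
        reader = FILTERS.get(ext)                       # step 3
        if reader is None:
            skipped.append((path, "no filter installed"))
            continue
        for word in tokenize(reader(data)):             # step 4
            index[word].add(path)                       # step 5
    return index, skipped
```

Calling crawl() with a small dictionary of files returns the index plus a list of skipped files with the reason each one was skipped. A .pdf file ends up in the skipped list with "no filter installed", which is the same situation as a real server without the PDF IFilter.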

Now, to index and search the file types we commonly use, such as PDF, we need to install an IFilter for each such type, because these do not come by default. We can also put the icon images for these file types in the SharePoint IMAGES folder (12 hive) so that documents show up with their icons in the search results. 🙂

Hope I was able to describe the concept based on my knowledge and learning.

Feel free to rate and provide feedback if you find this post useful.

Hope this helps
Ashi

19 Responses to How crawl works in SharePoint || How indexing works || Basic Concept

Riti says:

January 9, 2013 at 3:12 pm

Very Nice content..

Rajni says:

January 9, 2013 at 6:20 pm

Its really good article Ashish….Now I can tell whats happening in backend..

ashishbanga says:

January 9, 2013 at 6:58 pm

Hi Rajni,

Thanks for the compliment. You are surely welcome to Share facts, which can increase my knowledge.

ashishbanga says:

January 9, 2013 at 7:04 pm

Thanks Riti

Pingback: How Search works in SharePoint || Basic Concept « AshishBanga
Abhishek says:

January 21, 2013 at 1:05 pm

Awesome way of Explaining the content. Keep it up

Pingback: How to find and install common IFilters||For Search||Crawl « AshishBanga
Nick says:

February 1, 2013 at 2:11 am

great ideas. have just found you here, and will bookmark to come back

juvesiio says:

December 14, 2013 at 4:21 am

You really make it seem so easy with your presentation but I find this topic to be really something that I think I would never understand. It seems too complex and extremely broad for me. I’m looking forward for your next post, I will try to get the hang of it!

- ashishbanga says:
  
  December 25, 2013 at 12:42 am
  
  Hello Juvesiio,
  
  Thanks.
  I would surely love to help. In case you feel any issue. Please let me know with ur mail id.
  Will try my best to answer
  
  Ashi
  
marcelachitiva says:

November 12, 2014 at 8:04 pm

Reblogged this on El blog de Chiti.

Nafi mohammad says:

January 9, 2015 at 7:34 am

really good it is very useful

- ashishbanga says:
  
  January 29, 2015 at 2:09 pm
  
  Thanks Nafi
  
rajeev patidar says:

January 21, 2015 at 12:07 pm

very nice article……i will give 10/10. thank u somuch

- ashishbanga says:
  
  January 29, 2015 at 2:09 pm
  
  Thanks Rajeev
  
Sir Poon says:

August 3, 2015 at 10:28 pm

This is quite an over-simplification. It especially does not deal with the second (and so on) passes of the crawler as well as incremental and continuous crawls. You have just covered one tiny aspect which is essentially what an IFilter does. It would make your article a lot better if you talked about how the individual items are stored in the crawl database and what happens when the items are changed, unchanged, and moved. Also, the continuous crawling in 2013 is a big improvement because of how it staggers search threads.

I hope this feedback is helpful!

- ashishbanga says:
  
  August 21, 2018 at 7:42 pm
  
  Sure I will do the same..& come up with the post
  
Gopi says:

June 5, 2016 at 11:13 am

Good explanation bro

- ashishbanga says:
  
  August 21, 2018 at 7:40 pm
  
  Thank You Gopi