Wednesday, April 2, 2008

Lucene.net - Index Format

Directory, Document, Fields

Index, Segment, Files
----------------------------------------------------------------------------
Index: a container for all of the index files
Segment: a sub-index that is fully independent and can be searched on its own
File: holds the physical data of a segment
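
To make the index/segment relationship concrete, here is a minimal in-memory sketch. It is not how Lucene.net stores or searches data; the Segment and Index classes, the "_0"/"_1" segment names, and the postings layout are all assumptions made for illustration. It only shows the idea that each segment can answer a query by itself and that the index combines the per-segment answers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Segment:
    """Hypothetical stand-in for one segment: a self-contained sub-index."""
    name: str
    # term -> ids of the documents (local to this segment) that contain it
    postings: Dict[str, Set[int]] = field(default_factory=dict)

    def search(self, term: str) -> Set[int]:
        # A segment answers a query without looking at any other segment.
        return self.postings.get(term, set())


@dataclass
class Index:
    """Hypothetical container: an index is just a collection of segments."""
    segments: List[Segment] = field(default_factory=list)

    def search(self, term: str) -> List[Tuple[str, int]]:
        # Query every segment independently, then merge the results.
        hits = []
        for segment in self.segments:
            for doc_id in segment.search(term):
                hits.append((segment.name, doc_id))
        return sorted(hits)


index = Index([
    Segment("_0", {"lucene": {0, 2}, "index": {1}}),
    Segment("_1", {"lucene": {1}}),
])
result = index.search("lucene")
print(result)
```

Hits are reported here as (segment name, local document id) pairs so that documents from different segments cannot be confused with each other.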

Inverted index: organized around Terms and stored in these files:
  • Field names (.fnm)
  • Term dictionary (.tis)
  • Term frequencies (.frq)
  • Term positions (.prx)
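
The four kinds of data listed above can be sketched with plain Python structures. This is only a rough in-memory analogy, not the on-disk format: the function name build_inverted_index, the whitespace tokenizer, and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Build four in-memory structures that loosely mirror
    .fnm, .tis, .frq and .prx (an analogy, not the real file layout).

    docs: list of dicts mapping a field name to its text.
    """
    field_names = []              # like .fnm: the field names seen so far
    postings = defaultdict(dict)  # (field, token) -> {doc_id: [positions]}

    for doc_id, doc in enumerate(docs):
        for field_name, text in doc.items():
            if field_name not in field_names:
                field_names.append(field_name)
            # Naive tokenizer (assumption): lowercase and split on whitespace.
            for position, token in enumerate(text.lower().split()):
                postings[(field_name, token)].setdefault(doc_id, []).append(position)

    # Like .tis: the sorted list of all terms.
    term_dictionary = sorted(postings)
    # Like .frq: for each term, how often it occurs in each document.
    frequencies = {
        term: {doc_id: len(pos) for doc_id, pos in postings[term].items()}
        for term in term_dictionary
    }
    # Like .prx: for each term, where it occurs inside each document.
    positions = {term: dict(postings[term]) for term in term_dictionary}
    return field_names, term_dictionary, frequencies, positions


docs = [
    {"title": "lucene index", "body": "index format of lucene index"},
    {"body": "segment files"},
]
field_names, term_dictionary, frequencies, positions = build_inverted_index(docs)
print(field_names)
print(frequencies[("body", "index")])
print(positions[("body", "index")])
```

Looking up a term first and then following it to documents and positions is what makes the index "inverted": the lookup goes from term to documents rather than from document to terms.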

Files:
Some definitions:
  • A Document is a sequence of Fields;
  • A Field is a named sequence of Terms;
  • A Term is a string; the same string in two different Fields counts as two different Terms.
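
These definitions map directly onto a small data model. The sketch below is illustrative only: the Field and Document classes and the make_field helper are hypothetical names, not the Lucene.net API, and splitting on whitespace is an assumed stand-in for a real analyzer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """A named sequence of terms (each term is just a string)."""
    name: str
    terms: Tuple[str, ...]


@dataclass(frozen=True)
class Document:
    """A sequence of fields."""
    fields: Tuple[Field, ...]


def make_field(name: str, text: str) -> Field:
    # Hypothetical helper: split on whitespace to get the terms.
    return Field(name, tuple(text.split()))


doc = Document((
    make_field("title", "Lucene.net index format"),
    make_field("body", "index segment files"),
))

# The string "index" appears in both fields; paired with its field name,
# it yields two distinct terms.
qualified_terms = {(f.name, t) for f in doc.fields for t in f.terms}
print(sorted(qualified_terms))
```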
