Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.
Apache Lucene is an open source project available for free download.
Main features:
Scalable, High-Performance Indexing
- Over 800GB/hour on modern hardware
- Small RAM requirements -- only 1MB heap
- Incremental indexing as fast as batch indexing
- Index size roughly 20-30% of the size of the text indexed
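To make the idea of incremental indexing concrete, here is a minimal sketch of an in-memory inverted index in plain Java. It is an illustration only and does not use Lucene's API or reflect Lucene's on-disk data structures; the class name `TinyInvertedIndex` and its methods are hypothetical. The point it shows is that a document added after earlier searches becomes searchable without rebuilding what was already indexed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy in-memory inverted index (hypothetical; not Lucene's implementation). */
final class TinyInvertedIndex {
    // term -> ascending list of ids of the documents that contain the term
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Adds one document and returns its id; it is searchable immediately. */
    int add(String text) {
        int docId = nextDocId++;
        // Lowercase, then split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            List<Integer> docs = postings.computeIfAbsent(token, t -> new ArrayList<>());
            // Doc ids only grow, so checking the last entry is enough to avoid duplicates.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                docs.add(docId);
            }
        }
        return docId;
    }

    /** Returns the ids of the documents containing the term, in insertion order. */
    List<Integer> search(String term) {
        return List.copyOf(postings.getOrDefault(term.toLowerCase(Locale.ROOT), List.of()));
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lucene is a search library");
        index.add("Full-text search in Java");
        System.out.println(index.search("search")); // prints [0, 1]

        // Incremental update: only the new document's terms are touched.
        index.add("Incremental indexing keeps search available");
        System.out.println(index.search("search")); // prints [0, 1, 2]
    }
}
```

A real engine additionally persists postings in compressed segments and merges them in the background; this sketch leaves all of that out.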
Powerful, Accurate and Efficient Search Algorithms
- Ranked searching -- best results returned first
- Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
- Fielded searching (e.g. title, author, contents)
- Nearest-neighbor search for high-dimensionality vectors
- Sorting by any field
- Multiple-index searching with merged results
- Allows simultaneous update and searching
- Flexible faceting, highlighting, joins and result grouping
- Fast, memory-efficient and typo-tolerant suggesters
- Pluggable ranking models, including the Vector Space Model and Okapi BM25
- Configurable storage engine (codecs)
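As an illustration of ranked searching with Okapi BM25, the following plain-Java sketch scores documents with the textbook BM25 formula. It is not Lucene's implementation and does not call Lucene's API; the class `Bm25Sketch`, its methods, and the parameter values `K1 = 1.2` and `B = 0.75` (commonly used defaults) are assumptions made for this example, and Lucene's own scoring may differ in detail.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Textbook Okapi BM25 over a tiny in-memory corpus (hypothetical sketch). */
final class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (assumed default)
    static final double B = 0.75;  // strength of length normalization (assumed default)

    /** Lowercases the text and splits it on non-alphanumeric characters. */
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    /** Scores one tokenized document against a tokenized query. */
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(0.0);
        double total = 0.0;
        for (String term : query) {
            long tf = doc.stream().filter(term::equals).count();
            if (tf == 0) {
                continue; // a term absent from the document contributes nothing
            }
            long docFreq = corpus.stream().filter(d -> d.contains(term)).count();
            // Rare terms get a larger weight than common ones.
            double idf = Math.log(1.0 + (corpus.size() - docFreq + 0.5) / (docFreq + 0.5));
            // Documents longer than average are penalized, shorter ones boosted.
            double lengthNorm = 1.0 - B + B * doc.size() / avgLen;
            total += idf * tf * (K1 + 1.0) / (tf + K1 * lengthNorm);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
                tokenize("fast search"),
                tokenize("search across many large text collections"),
                tokenize("index storage formats"));
        List<String> query = tokenize("search");
        for (List<String> doc : corpus) {
            System.out.println(doc + " -> " + score(query, doc, corpus));
        }
    }
}
```

In this corpus both matching documents contain "search" once, so the shorter one ranks first because of length normalization, and the document without the term scores zero. Sorting documents by descending score yields the "best results returned first" behavior.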
Cross-Platform Solution
- Available as Open Source software under the Apache License, which lets you use Lucene in both commercial and Open Source programs
- 100%-pure Java
- Index-compatible implementations available in other programming languages