3 March 2007
Luke 1.7 is out

After a year or so of shut-eye, Luke, the Lucene query tool, is back. New features listed below. Time to dust off some patches I’ve got for an earlier version and send them along for 1.8.

Luke 1.7 features:

  • Added pagination of results, especially useful for very large result sets.
  • Added support for the new Field flags. They are now displayed in the Document details, and can also be set on edited documents.
  • Added a function to add new documents to the index.
  • Low-level index operations (such as detecting unused files and cleaning up the index directory) use the newly exposed Lucene classes instead of duplicating their internals in Luke.
  • A side-effect of the above is the ability to properly clean up all supported index formats, including the new lockless and single-norm indexes.
  • Added a function to copy the list of top terms to clipboard.
  • Added a function to copy the term vector to clipboard.
  • Added a function to close and/or re-open the current index.
  • In the Documents tab, pressing “First Term” now positions the term enumeration at the first term for the selected field.
  • Added a field vocabulary analysis plugin by Mark Harwood, with some modifications.
  • Overall UI cleanup - improved layout in some places, added graphics instead of ASCII art, etc.

Nicely done, Andrzej.

In other Lucene-related news, it looks like some interesting things are brewing in the Hadoop project.

Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.
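The Map/Reduce paradigm is easy to sketch in plain Python, no Hadoop required. This is only an illustration of the idea, not Hadoop's actual API: a map step emits key/value pairs from each input split, a shuffle groups the pairs by key (the framework does this between the two stages), and a reduce step folds each group into a result. All function names here are made up for the example.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input split.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does transparently between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: fold each group of values into a single result.
    return {word: sum(counts) for word, counts in groups.items()}

# Two "splits" standing in for data spread across cluster nodes.
splits = ["the quick brown fox", "the lazy dog"]
pairs = [pair for doc in splits for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
# counts["the"] == 2; every other word appears once
```

Because each map call touches only its own split and each reduce call only its own key group, any fragment can be rerun independently on another node after a failure, which is exactly the property the Hadoop framework exploits.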
