JSSINDEX: The JavaScript Search Engine
JSS is a simple search engine designed for CD-ROM or Web-based document
collections. The documents to be indexed can be in HTML, PostScript
(.ps and .ps.gz), PDF, or DjVu format. The main feature of JSS is that
the query engine and the index are entirely in JavaScript, and
therefore require no software other than a JavaScript-enabled Web
browser.
What is the advantage? If you are distributing a collection of
documents on CD-ROM, you can provide platform-independent full-text
search without asking your users to install any software on their
machine. If you publish a collection of documents on the web, you
don't need to install any server-side scripts: search queries run
entirely in the user's web browser.
INSTALLATION
jssindex was tested on GNU/Linux. The indexer should run wherever Lush
and DjVuLibre run (this includes Solaris, Irix, and Windows under
Cygwin). The query engine produced by jssindex runs on any
JavaScript-enabled Web browser.
To install:
- 0 - Get the latest version of jssindex from here.
- 1 - Download and install Lush (jssindex is written in Lush).
      Lush is included with many Linux distros. It can also be
      downloaded from http://lush.sf.net.
      (If you don't know about Lush, read below.)
- 2 - If you want JSS to be able to index DjVu documents, make sure
      you have djvused in your path. djvused is part of DjVuLibre,
      which is included with many Linux distros. It can also be
      obtained at http://djvu.sf.net.
      (If you don't know about DjVu, read below.)
- 3 - Copy jssindex somewhere in your path (e.g. /usr/local/bin)
      and make it executable by typing "chmod a+x jssindex".
- 4 - Enjoy.
jssindex uses two other programs: ps2ascii and zcat. Make sure you
have those in your shell path if you want jssindex to index documents
in PostScript (.ps), PDF (.pdf), and gzipped PostScript (.ps.gz).
ps2ascii is part of the Ghostscript package (also known as gs), and
zcat is part of gzip. Both packages are installed by default in most
Linux distros.
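The requirements above can be checked before the first run with a
short shell sketch. It assumes a POSIX shell and that the Lush
executable is named lush; check_tools is a hypothetical helper name
and is not part of jssindex.

```shell
#!/bin/sh
# Report which of the named programs can be found in the shell path.
# check_tools is a hypothetical helper, not something jssindex provides.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero status if at least one program was not found.
    return "$missing"
}

# lush is always needed (assumed executable name). djvused is only needed
# for DjVu, ps2ascii for PostScript and PDF, and zcat for .ps.gz files.
check_tools lush djvused ps2ascii zcat ||
    echo "Some programs are missing; see the installation notes."
```

A missing djvused, ps2ascii, or zcat only matters if the collection
contains documents in the corresponding format.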
USAGE
Call jssindex with no arguments to get the full documentation. Here is
a simple example of usage: let's assume that the content of your Web
site is in the directory web-root, and that the collection of
documents you want to index is under web-root/mydocs.
Type the following at the shell prompt:
% cd web-root
% jssindex mydocs
Now point your Web browser to web-root/jss-index.html.
If you want the JSS files neatly put in their own directory, do:
% mkdir jss
% cd jss
% jssindex ../mydocs
Now point your Web browser to web-root/jss/jss-index.html.
That's it.
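Before running the indexer, it can help to preview which files in the
collection have a name matching one of the supported formats. The
sketch below is only an illustration: list_indexable is a hypothetical
helper, the .html, .htm, and .djvu suffixes are assumptions (only .ps,
.ps.gz, and .pdf are named in this document), and jssindex itself may
select files differently.

```shell
#!/bin/sh
# Print, in sorted order, the files under directory $1 whose names end
# in a suffix associated with HTML, PostScript, PDF, or DjVu.
# list_indexable is a hypothetical helper, not part of jssindex.
list_indexable() {
    find "$1" -type f \( \
        -name '*.html' -o -name '*.htm' -o \
        -name '*.ps' -o -name '*.ps.gz' -o \
        -name '*.pdf' -o -name '*.djvu' \) | sort
}
```

With the layout used in the example above, typing "list_indexable
mydocs" from web-root prints one path per line.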
CREDITS
jssindex was written by Yann LeCun and Florin Nicsa, with
contributions from Leon Bottou. Lush was created by Leon Bottou and
Yann LeCun. DjVu was created by Leon Bottou, Yann LeCun, Patrick
Haffner, and a large cast of characters.
CONTACTS
For Yann's contact info, visit http://yann.lecun.com
WHAT ARE LUSH AND DJVU?
Lush is an interpreter/compiler for a dialect of Lisp.
More info can be obtained at http://lush.sf.net.
jssindex is written in Lush. Lush is required for running
jssindex, but end users do not need it to perform search queries
(that part merely requires a Web browser).
DjVu is a file format and compression technology for documents
(particularly scanned documents) and images (particularly
high-resolution ones). DjVuLibre is an open source implementation
of DjVu that includes viewers, decoders, utilities, and simple encoders.
The simplest way to produce DjVu documents from originals (scanned or
digitally produced) in TIFF, PS, PDF, or other formats is to use
one of the free on-line conversion servers, for example:
http://any2djvu.djvuzone.org
(single document conversion with OCR while-U-wait).
LICENSE AND LEGAL STUFF
jssindex is Copyright (C) 2003 Yann LeCun and is distributed under the
GNU General Public License.