The search engine is the application that searches the data and returns the results to the client, usually by generating an HTML page in a specified format.
Most search engines search within an index created by a SpiderBot application. A few search the files directly in real time, but that can become very slow.
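The difference between searching an index and searching the files themselves can be sketched as follows. This is only a minimal illustration, assuming a simple word-to-pages index; the function names and the sample pages are hypothetical and do not come from any particular search tool.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Done once, ahead of time: map each word to the pages containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search_index(index, query):
    """Answer a query from the index alone, without reading any page text."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def search_files(pages, query):
    """Real-time alternative: rescan the text of every page on every query."""
    words = set(tokenize(query))
    if not words:
        return set()
    return {name for name, text in pages.items() if words <= set(tokenize(text))}


# Hypothetical sample data standing in for pages gathered by an indexer.
PAGES = {
    "index.html": "Welcome to our site search tools guide",
    "cgi.html": "CGI programs search the index and return results",
    "perl.html": "Perl scripts for site search",
}
INDEX = build_index(PAGES)
```

Both functions return the same pages, but `search_files` does work proportional to the total size of the site on every query, while `search_index` only looks up the query words.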
To send a search to the search engine, most systems include forms. The site visitor enters their search terms in a text field, and may select appropriate settings in the form. When they click the "Search" button, the server passes that data to the search engine application.
Types of Site Search Engines
CGI Programs
The Common Gateway Interface (CGI) standard allows a web server to communicate with external programs. Most site search CGIs are invoked by a site visitor filling in data and clicking a Search or Submit button on an HTML form. They take the data from a form as parameters, search for the terms, limit the results according to any other settings, and return the result list as an HTML page.
CGI programs can be written in anything from C to Perl to AppleScript, depending on the web server and the platform. Many CGIs are portable across Unix, Windows, and Macs, depending on the language and libraries they use. CGIs are compatible with many different web servers, but there is some overhead in sending the data back and forth, and under heavy load CGI programs can become overwhelmed. See also Server Plug-Ins.
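As a rough sketch of the flow described above, the following shows how a search CGI might read the form data, apply a result limit, and return an HTML page. It assumes a GET form, so the web server passes the form data in the `QUERY_STRING` environment variable. The field names `q` and `max` and the `DOCUMENTS` table are hypothetical, and the matching is deliberately simplistic.

```python
import html
import os
import sys
from urllib.parse import parse_qs

# Hypothetical stand-in for the searchable data (URL -> page text).
DOCUMENTS = {
    "/perl.html": "Perl scripts for site search",
    "/java.html": "Java servlets run in the Java Virtual Machine",
    "/cgi.html": "CGI programs search an index",
}


def handle_request(query_string):
    """Turn the raw form data into a complete CGI response (header + HTML)."""
    params = parse_qs(query_string)
    terms = params.get("q", [""])[0].lower().split()
    try:
        limit = max(0, int(params.get("max", ["10"])[0]))
    except ValueError:
        limit = 10  # fall back to a default when the setting is not a number

    # Naive substring match: a page is a hit if it contains every term.
    hits = [
        url
        for url, text in DOCUMENTS.items()
        if terms and all(term in text.lower() for term in terms)
    ][:limit]

    # Escape everything that came from the visitor before echoing it back.
    items = "".join(
        f'<li><a href="{html.escape(url)}">{html.escape(url)}</a></li>' for url in hits
    )
    heading = html.escape(" ".join(terms))
    body = f"<html><body><h1>Results for {heading}</h1><ul>{items}</ul></body></html>"
    # A CGI response is a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(handle_request(os.environ.get("QUERY_STRING", "")))
```

Keeping the logic in `handle_request` and leaving only the environment lookup in the entry point makes the script testable without a web server.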
Perl Scripts
Perl is an interpreted scripting language: unlike C or Pascal, it is not compiled to binary object code. It has its own syntax and libraries, and communicates with web servers using the CGI standard. You can use Perl scripts on most platforms and with most web servers. Several web site search tools are written in Perl: see the Perl listing for details.
Server Plug-Ins
For better data interchange, less overhead, and more flexibility, web server companies have defined APIs (Application Programming Interfaces) for their servers. These allow third-party developers to create modules that run inside the server process. Several web site search tools are written to various server APIs. They are rarely portable and are generally compiled to binary object code.
Java Applications
These are applications written in the Java language, which run in the Java Virtual Machine. Applets are small Java applications that run inside the browser program.
Java Servlets
These are applications written in Java using the Java Servlet API. Many web servers now exchange data with Java applications using this interface, much like the CGI system. Because Java is designed to be cross-platform, many Java Servlets can run almost anywhere.
Search Servers
Some search engines run as separate servers. The form data is passed as part of the URL, as in any other web request, but the search engine application runs as a separate HTTP server, often on a different machine. This reduces the load on the main web server substantially.
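A separate search server can be sketched with Python's standard `http.server` module. This is an illustration only, not how any particular product works: the `/search` path, the `q` parameter, the port, and the `DOCUMENTS` table are all assumptions. The search form on the main site would simply point its `action` at this server's address.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for the search index (URL on the main site -> text).
DOCUMENTS = {
    "http://www.example.com/perl.html": "Perl scripts for site search",
    "http://www.example.com/cgi.html": "CGI programs search an index",
}


def render_results(query):
    """Build the HTML result page for a query string such as 'site search'."""
    terms = query.lower().split()
    hits = [
        url
        for url, text in DOCUMENTS.items()
        if terms and all(term in text.lower() for term in terms)
    ]
    items = "".join(f'<li><a href="{html.escape(u)}">{html.escape(u)}</a></li>' for u in hits)
    return f"<html><body><h1>{len(hits)} result(s)</h1><ul>{items}</ul></body></html>"


class SearchHandler(BaseHTTPRequestHandler):
    """Answers GET /search?q=... itself, so the main web server never sees it."""

    def do_GET(self):
        parts = urlsplit(self.path)
        if parts.path != "/search":
            self.send_error(404)
            return
        # The form data arrives as the query part of the URL.
        query = parse_qs(parts.query).get("q", [""])[0]
        body = render_results(query).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8081):
    """Create (but do not start) the search server; the port is an arbitrary choice."""
    return HTTPServer((host, port), SearchHandler)

# To run it: make_server().serve_forever()
```

With this running, a request such as `http://127.0.0.1:8081/search?q=perl` would be handled entirely by the search server's own process.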
Compatibility
Search Options
- Natural Language Processing
- Boolean Operators
- Vectors
- Fuzzy Matching
- Phrase Searching
- Proximity Matching
- Concept Browsing & Automatic Matching
- Thesaurus
- Query By Example
- Stemming & Substitutions
- Non-English character matching
- Special features (price-range searching, for example)
- Spelling error tolerance
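A few of the options listed above can be illustrated with short sketches. The functions below are simplified, hypothetical versions of Boolean operators, phrase searching, proximity matching, and spelling error tolerance; real search engines implement these far more efficiently, usually against an index rather than raw text. The spelling example uses `difflib` from the Python standard library as a stand-in for a real spelling-tolerance algorithm.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """Boolean operators: AND (all_of), OR (any_of), NOT (none_of)."""
    words = set(tokenize(text))
    all_of, any_of, none_of = (
        [term.lower() for term in group] for group in (all_of, any_of, none_of)
    )
    return (
        all(term in words for term in all_of)
        and (not any_of or any(term in words for term in any_of))
        and not any(term in words for term in none_of)
    )


def phrase_match(text, phrase):
    """Phrase searching: the words must appear adjacent and in order."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return n > 0 and any(words[i:i + n] == target for i in range(len(words) - n + 1))


def proximity_match(text, first, second, max_gap=3):
    """Proximity matching: both words occur within max_gap positions of each other."""
    words = tokenize(text)
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in pos_first for j in pos_second)


def fuzzy_terms(term, vocabulary, cutoff=0.75):
    """Spelling error tolerance: suggest indexed words similar to a misspelled term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)
```

For example, `phrase_match("the result list", "result list")` is true while `phrase_match("the result list", "list result")` is false, whereas `boolean_match` with `all_of=["result", "list"]` accepts the text regardless of word order.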