
9.4 The Design

Our application is beginning to take shape. Figure 9-2 shows the entire design of the Simple Spider. It has layers that present a model, the service API, and two public interfaces. There is not yet a controller layer to separate the interfaces and logic. We'll integrate a controller in the next chapter.

Figure 9-2. The Simple Spider design


We need to provide a configuration service to our application. I prefer to encapsulate the configuration in its own service to decouple the rest of the application from its details. This way, the application can switch configuration systems later without much editing of the code. For this version of the application, the Configuration service will consist of two classes, ConfigBean and IndexPathBean. ConfigBean returns configuration settings for the application as a whole, and IndexPathBean returns the current path to the index files. The two are separate classes because finding the path to the index is a more complex task than simply reading a configuration file (see the implementation details below). The configuration settings are stored in property files, accessed through java.util.Properties.
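To make the decoupling concrete, here is a minimal sketch of a configuration wrapper around java.util.Properties. It is not the book's ConfigBean: the class name ConfigSketch, its methods, and the property keys are all assumptions made for illustration. The point is only that callers ask for settings by name and never see the Properties object itself.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: not the book's ConfigBean, only the same idea.
class ConfigSketch {
    private final Properties props = new Properties();

    // Load settings from any Reader so callers never touch Properties directly.
    ConfigSketch(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Return the configured value, or the fallback when the key is absent.
    String get(String key, String fallback) {
        return props.getProperty(key, fallback);
    }

    // Parse an integer setting, using the fallback when the key is absent.
    int getInt(String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // The key names below are assumptions made for this example.
        String text = "initial.page=http://localhost/index.html\nmax.links=25\n";
        ConfigSketch config = new ConfigSketch(new StringReader(text));
        System.out.println(config.get("initial.page", "none"));
        System.out.println(config.getInt("max.links", 10));
    }
}
```

Because the rest of the application would depend only on methods like get and getInt, replacing the property file with another configuration source would mean changing this one class.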

The crawler/indexer service is based on two classes: IndexLinks, which controls the configuration of the service in addition to managing the individual pages in the document domain, and IndexLink, a class that models a single page in the search domain and lets us parse it for links to other pages. We will use Lucene (http://jakarta.apache.org/lucene) as our indexer (and searcher) because it is fast, open source, and widely adopted in the industry today. The search service is provided through two more classes, QueryBean and HitBean. The former models the search input/output mechanisms, while the latter represents a single result from a larger result set. Sitting on top of the collection of services are the two specified user interfaces: the console version (ConsoleSearch) and a web service (SearchImpl and its WSDL file).
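The link-discovery step described for IndexLink can be sketched with nothing but the JDK. This is not the book's IndexLink class: LinkSketch and extractLinks are hypothetical names, and the regular expression is a deliberate simplification that matches only double-quoted href attributes. A real crawler would more likely use an HTML parser.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of link discovery; not the book's IndexLink class.
class LinkSketch {
    // Simplification: matches double-quoted href attributes only.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Return the absolute URLs linked from the given HTML, in document order,
    // with duplicates removed.
    static Set<String> extractLinks(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        Set<String> links = new LinkedHashSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                // Resolve relative links against the URL of the page itself.
                links.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"about.html\">About</a> "
                    + "<a HREF=\"http://example.org/\">External</a>";
        for (String link : extractLinks("http://example.com/docs/index.html", html)) {
            System.out.println(link);
        }
    }
}
```

A class in the role of IndexLinks could keep a set of pages already visited and feed each newly discovered URL back into the same routine, which is what keeps the crawl from looping over pages that link to each other.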
