
10.4 Setting Up the Indexer

Now that the search service is integrated into the application, we'll configure the indexer to automatically update against the current version of the web site on a regular basis. If you recall from the previous chapter, both the console application and the web service have mechanisms that let you launch the indexer service instead of the search service. The question is, how should the indexer be integrated with jPetStore?

10.4.1 Embed in jPetStore or Launch Externally?

The first approach is to make the indexer part of the jPetStore application itself; in other words, to add code to jPetStore that invokes the indexer. jPetStore could invoke the indexer at the request of a user or on a schedule. Both methods have problems: if we expose a user interface for launching the indexer, we have to wrap it in some kind of secured section of the site for administrative users only. Currently, jPetStore has no such security built in. Building it just to wrap around the indexer seems like a major stretch—too much complexity, not enough payoff. Which means a manual access point is out.

The other option is to build a scheduler into the jPetStore application. Regardless of how it is architected, a scheduler would require the jPetStore application to be running for indexing to occur. Since jPetStore is a web- and container-based application, its lifecycle is entirely dependent on its external hosts. If the web server software is turned off for any reason, jPetStore shuts down as well. If the indexer's scheduled time falls in that window, the indexer doesn't run. In addition, writing scheduling code is completely outside of the problem domain for jPetStore, just as it was for the Simple Spider. The jPetStore application should do one thing: display animals in a web catalog.

We have no option but to invoke the indexer from some other location. A good strategy is to leverage an existing scheduler system: on Windows it's schtasks and on Linux it's cron. Let's implement the scheduled indexer on Windows.

10.4.2 Using the System Scheduler

For ease of use, we create a batch file for actually launching the service. We want to invoke the Java runtime to run our ConsoleSearch class's main method, passing in the starting point for jPetStore. The command (and, therefore, the contents of our batch file) looks like this:

java c:\the\path\to\ConsoleSearch /i:http://localhost/jpetstore

We store that in a file called jpetstoreIndexer.bat. For simplicity's sake, we'll store it in c:\commands.

To schedule the indexer to run every night at 2:00 a.m., issue the following command (while logged in as a local administrator):

c>schtasks /create /tn "jpetstore Indexer" /tr c:\commands\jpetstoreIndexer.bat /sc daily /st 02:00:00

The /tn flag gives the task a unique name; /tr points to the actual command to invoke; /sc is the time interval; and /st is the specific time to launch the indexer on that interval.

Similarly, on Linux, edit the crontab file and launch the cron daemon to accomplish the same thing.
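
As a rough sketch of the Linux equivalent, a crontab entry for a nightly 2:00 a.m. run might look like the following. It assumes the java command from the batch file has been placed in a shell script; the script name and location (/usr/local/bin/jpetstoreIndexer.sh) are hypothetical and not part of the original setup.

```
# Open the current user's crontab for editing with: crontab -e
# Fields: minute  hour  day-of-month  month  day-of-week  command
# Run the (hypothetical) wrapper script every day at 02:00.
0 2 * * * /usr/local/bin/jpetstoreIndexer.sh
```

The five time fields play the role of the /sc and /st flags in the schtasks command: the 0 and 2 fix the minute and hour, and the three asterisks mean every day of every month.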

10.4.3 Smell the Roses

The beauty of this solution is that our application, the Simple Spider, has been repurposed to run in both a container-based environment (Spring) and a direct runtime environment (via the scheduler calling the Java runtime directly) without any extra code whatsoever. Because of its simple architecture and loosely coupled services, the Spider itself can operate just fine in both environments simultaneously. We didn't have to write a new access point or code a new UI or even make any configuration changes. Even better, we were able to take a single application from our first chapter and repurpose its internal services to two different endpoints without much work. It's good to step back every now and again and smell the roses, just to realize what a little forethought and adherence to simple principles gets you.

10.4.4 Principles in Action

  • Keep it simple: use system-provided scheduler and existing console-based access point to application

  • Choose the right tools: schtasks, cron, ConsoleSearch

  • Do one thing, and do it well: neither the Spider nor jPetStore worries about scheduling the indexer; the scheduler worries only about launching the indexer, not the rest of the functionality

  • Strive for transparency: the scheduler knows nothing about the implementation details of the indexer or even where the results of the indexing will end up: it's all handled in configuration files

  • Allow for extension: none
