LPA - Case Based Reasoning Toolkit
The LPA Case Based Reasoning (CBR) Toolkit is a collection of routines, supplied in the form of an API, which support the retrieval of similar cases within relational databases such as Access, Oracle and SQL Server.
Four Phases of CBR
The key stages of CBR as supported by the LPA CBR toolkit are:
- Selection of the Data Source: The first stage is to select the appropriate data for analysis. The toolkit assumes that all joins, exclusions and views are handled outside the toolkit, and it works from a single table. You need to designate which columns are to be included in the later stages. You can also specify how the notion of 'closeness' is treated within columns containing non-numeric information. For instance, you may wish to say that 'Turkish Coffee' is similar to 'Greek Coffee'.
- Constructing an Input Query: The input query determines what to look for in the database and establishes how many similar records to retrieve. Unlike conventional database retrieval, a CBR search returns similar records based on its understanding of 'closeness'. So, whilst we might say that we ideally want to buy a red car with 4 wheel drive and less than 100K miles on the clock for less than 10K USD, we may well end up having to choose between an orange car with 2 wheel drive and 90K miles for 12.5K, a yellow car with 4 wheel drive and 120K miles for 20K, and a black car with 2 wheel drive and 150K miles for 9K.
- Retrieving the Records: The database is searched, initially for an exact match, and then the search is widened iteratively until enough records have been identified which satisfy the initial query requirements. The records are then downloaded for local evaluation.
- Re-ordering the Retrieved Records: Here the user can re-sort the retrieved records in a variety of ways, by applying weights to columns to indicate their relative importance and/or by specifying and applying rules.
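The retrieval phase can be illustrated with a minimal sketch. It is written in Python with the standard sqlite3 module purely as a stand-in; it is not the toolkit's Prolog API, and the table name cars, the function widening_search and the 10% widening step are all assumptions made for this example. The page does not say how the toolkit widens its search, so the tolerance scheme here is only one plausible reading.

```python
import sqlite3


def widening_search(conn, target_price, target_miles, wanted, max_rounds=10):
    """Widen numeric tolerances round by round until `wanted` rows match.

    Hypothetical helper: round 0 asks for an exact match, and every later
    round relaxes both numeric columns by a further 10% of the target value.
    """
    rows = []
    for step in range(max_rounds + 1):
        tol = 0.10 * step
        rows = conn.execute(
            "SELECT id, colour, price, miles FROM cars "
            "WHERE price BETWEEN ? AND ? AND miles BETWEEN ? AND ? "
            "ORDER BY id",
            (
                target_price * (1 - tol), target_price * (1 + tol),
                target_miles * (1 - tol), target_miles * (1 + tol),
            ),
        ).fetchall()  # the matching rows are pulled back for local evaluation
        if len(rows) >= wanted:
            break
    return rows


# In-memory table holding the three cars from the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER, colour TEXT, price REAL, miles REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?, ?)",
    [
        (1, "orange", 12500, 90000),
        (2, "yellow", 20000, 120000),
        (3, "black", 9000, 150000),
    ],
)

matches = widening_search(conn, target_price=10000, target_miles=100000, wanted=1)
print(matches)
```

With these figures no car matches exactly, and the orange car is the first to fall inside the widened ranges. The database does the pruning; only the surviving rows are brought back to the client.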
What's in the LPA CBR toolkit?
The LPA CBR toolkit contains:
- API Routines: A collection of routines for dealing with the four phases of case based reasoning as described above. These routines are presented as Prolog predicates and can be combined with most other LPA products and features. The routines often return lists and structures which can be manipulated easily by the Prolog developer. This makes the toolkit an ideal basis for Prolog application developers to build their own CBR applications or to include a CBR component within their existing applications.
- Source Code Example: A fully documented source code example is supplied which shows you how to build an interactive CBR oriented application using the API routines and a set of dialogs designed using the Dialog Editor utility which comes with LPA Prolog.
- Sample CBR Application: A stand-alone, point-and-click desktop application, based on the source code example described above, is also supplied. You can use it 'out-of-the-box' to demonstrate and explore the CBR concepts described here.
Strategy and Philosophy
The aim of the LPA CBR toolkit is to use the power and performance of database technology to prune a potentially very large number of records in a corporate data store to a more manageable number. These are then cached locally where they can be further examined. The toolkit makes use of the underlying Prolog and Flex to provide a rules-based interface for specifying conditions and situations where the standard measure of 'closeness' between records should be overruled.
How does the LPA CBR toolkit work?
For any given input query, the LPA CBR toolkit measures how far away other records are by considering each dimension of a multi-dimensional query in isolation. For each individual column there is a formula which computes how near or far apart any two values are. This formula can be overridden under program control. The total distance between any two records is based upon the sum of all the individual column distances.
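As a rough illustration of summing per-column distances, the Python sketch below scores the three cars from the earlier example against the ideal query. It is not the toolkit's own formula, which this page does not give: the range-scaled numeric distance, the colour table, the column ranges and every function name are assumptions chosen for the example, and the query's "less than" limits are treated as simple target values.

```python
# Hypothetical closeness table for a non-numeric column:
# 0.0 means identical, 1.0 means unrelated. Unlisted pairs count as 1.0.
COLOUR_DISTANCE = {
    frozenset(["red", "orange"]): 0.2,
    frozenset(["red", "yellow"]): 0.5,
}


def numeric_distance(a, b, value_range):
    """Absolute difference scaled by an assumed column range, capped at 1.0."""
    return min(abs(a - b) / value_range, 1.0)


def symbolic_distance(a, b, table):
    """Look up the distance between two symbols, in either order."""
    if a == b:
        return 0.0
    return table.get(frozenset([a, b]), 1.0)


def record_distance(query, record):
    """Total distance: each column is scored in isolation, then summed."""
    return (
        symbolic_distance(query["colour"], record["colour"], COLOUR_DISTANCE)
        + numeric_distance(query["miles"], record["miles"], 200_000)
        + numeric_distance(query["price"], record["price"], 20_000)
    )


query = {"colour": "red", "miles": 100_000, "price": 10_000}
cars = [
    {"colour": "orange", "miles": 90_000, "price": 12_500},
    {"colour": "yellow", "miles": 120_000, "price": 20_000},
    {"colour": "black", "miles": 150_000, "price": 9_000},
]

# Nearest record first.
ranked = sorted(cars, key=lambda car: record_distance(query, car))
for car in ranked:
    print(car["colour"], round(record_distance(query, car), 3))
```

Scaling each numeric column by a range keeps miles and dollars comparable before they are added; replacing one of the per-column functions is the analogue of overriding a column's formula.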
Normal scoring and weighting algorithms tend to give a linear behaviour. In the real world, however, we quite often wish to exclude certain situations altogether, say where the cost is just too high, or where the cost and the mileage are both more than a certain distance from our ideal choices. Rules can be supplied using a Prolog syntax and applied to the current set of retrieved records. This will result in a different subset of the records being suggested by the system.
The LPA CBR toolkit generates a large number of SQL queries to analyse the database. By utilising the performance of the database engine, the LPA CBR toolkit offers a truly scalable and robust architecture.
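The effect of such rules can be sketched as follows. In the toolkit the rules are written in a Prolog syntax; the Python below is only a stand-in showing the idea of non-linear exclusions applied to records that have already been retrieved. The two rules, their thresholds and the names too_expensive, far_on_cost_and_mileage and apply_rules are all hypothetical.

```python
def too_expensive(query, record):
    """Assumed rule: reject anything costing over 1.5 times the ideal price."""
    return record["price"] > 1.5 * query["price"]


def far_on_cost_and_mileage(query, record):
    """Assumed rule: reject a record only if it is far off on BOTH columns."""
    return (
        abs(record["price"] - query["price"]) > 2_000
        and abs(record["miles"] - query["miles"]) > 40_000
    )


RULES = [too_expensive, far_on_cost_and_mileage]


def apply_rules(query, records, rules):
    """Keep the records that no exclusion rule fires on."""
    return [r for r in records if not any(rule(query, r) for rule in rules)]


query = {"price": 10_000, "miles": 100_000}
retrieved = [
    {"colour": "orange", "price": 12_500, "miles": 90_000},
    {"colour": "yellow", "price": 20_000, "miles": 120_000},
    {"colour": "black", "price": 9_000, "miles": 150_000},
]

kept = apply_rules(query, retrieved, RULES)
print([car["colour"] for car in kept])
```

Here the yellow car is dropped outright for its price, however close it may be on other columns, which is behaviour a purely additive score cannot express.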
What is Required?
The LPA CBR toolkit uses ODBC and SQL to query databases. You need to ensure that you have the correct ODBC drivers installed and that you have set up your data files as data sources.
Run-time Deployment and Application Deployment
The LPA CBR toolkit can be integrated with most other LPA products and technology. By combining the LPA CBR toolkit and the Intelligence Server, it is possible to present the case based reasoning algorithms as a COM object for embedding within, say, a VB-oriented application. By combining with ProWeb, it is possible to develop a web-based CBR application.
Integration with WIN-PROLOG and its Toolkits
WIN-PROLOG is the central product in a series of programming tools that work across Windows XP, 2000, NT, ME, 98 and 95; the series also includes flex, Flint, the Data Mining toolkit and the ProData Database Interface toolkit. The Windows series uses incremental compilation of user programs to provide the execution speed of a compiler with the interactive behaviour of an interpreter. This allows programs to be debugged and edited in-line.