Friday, June 11, 2010

Goal Systems

Continues this.

It seems fairly clear that backwards-chaining should be seen as a result of goal-oriented behaviour, whereas forward-chaining should be more a result of the reactive reasoning layer. This thought arises from several considerations (including efficiency), but mainly from the idea that backtracking is simply an attempt to achieve a goal by trying different possible (sets of) subgoals in turn.
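
To make the backtracking picture concrete, here is a minimal sketch (in Python, with an invented rule/fact representation) of a backward chainer that achieves a goal by trying alternative sets of subgoals in turn:

```python
# Minimal backward chainer: a goal is proven by trying, in turn, each rule
# whose head matches it, and recursively proving that rule's subgoals.
# The rule and fact structures here are invented purely for illustration.

FACTS = {"wet_ground"}
RULES = [
    ("slippery", ["wet_ground"]),          # wet_ground -> slippery
    ("slippery", ["icy"]),                 # icy -> slippery
    ("dangerous", ["slippery", "steep"]),  # slippery & steep -> dangerous
]

def prove(goal):
    """Return True if `goal` can be achieved; backtrack over rules otherwise."""
    if goal in FACTS:
        return True
    for head, subgoals in RULES:
        if head == goal:
            # Try this candidate set of subgoals; if any fails, fall through
            # to the next matching rule (this falling-through is the backtracking).
            if all(prove(sub) for sub in subgoals):
                return True
    return False

print(prove("slippery"))   # True, via wet_ground
print(prove("dangerous"))  # False: "steep" can't be proven, and both rules are exhausted
```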

A simple production-rules-vs-logic-programming view would see the issue as two different ways of interpreting the conditional from propositional logic: A->B either means "if you have A, conclude B" (if you're a production rules person) or "if you want B, try A" (if you're a logic programming person). I'm denying that this is what's going on. A->B in the knowledge base means just what it means in the declarative logic, and doesn't by itself trigger any inferences. For inferences to be triggered, procedural knowledge must be present. Procedural knowledge can always be put in a series of if-then type statements which are executed by forward-chaining; both "if you have A, conclude B" and "if you want B, try A" are just examples of this.
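
One way to picture this separation (a sketch using my own made-up representation, not any existing system): the declarative knowledge base just stores A->B, while separate procedural rules, all executed by the same forward-chaining loop over working memory, decide whether to use it in the "conclude" direction, the "subgoal" direction, or both.

```python
# Toy forward-chaining engine. Working memory holds facts like ("have", "A")
# and goals like ("want", "B"). The declarative implication A->B sits inertly
# in the knowledge base; only the procedural rules below actually fire, and
# both are run by the same forward-chaining loop.

KB = [("A", "B")]  # declarative implications: A -> B

def modus_ponens(memory):
    """Procedural rule: if you have A and A->B is in the KB, conclude B."""
    return {("have", b) for (a, b) in KB if ("have", a) in memory}

def subgoaling(memory):
    """Procedural rule: if you want B and A->B is in the KB, want A."""
    return {("want", a) for (a, b) in KB if ("want", b) in memory}

def forward_chain(memory, procedural_rules):
    """Fire every procedural rule until no new working-memory items appear."""
    changed = True
    while changed:
        new = set().union(*(rule(memory) for rule in procedural_rules))
        changed = not new <= memory
        memory |= new
    return memory

print(forward_chain({("have", "A")}, [modus_ponens]))  # derives ("have", "B")
print(forward_chain({("want", "B")}, [subgoaling]))    # derives ("want", "A")
```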

Now, for probabilistic-relational systems, goal-oriented reasoning is set within the framework of Bayesian utility theory. The mathematical theory is simple and elegant (although there have been refinements over time, such as causal decision theory, timeless decision theory, and updateless decision theory); but, like pure probabilistic reasoning, it isn't immediately obvious how to efficiently implement/approximate it. Much of the relevant research here comes from the domain of reinforcement learning; also, an algorithm that's generating a lot of research right now is Monte Carlo tree search. It seems reasonable to follow these threads of research for a goal-system implementation.
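
For concreteness, the core of the Bayesian picture is just expected-utility maximization: choose the action a maximizing the sum over outcomes s of P(s|a)U(s). A tiny sketch of that computation, with invented probabilities and utilities:

```python
# Expected-utility action selection. The outcome distributions and utilities
# are made-up numbers, just to show the shape of the computation:
# pick the action a maximizing sum over outcomes s of P(s|a) * U(s).

UTILITY = {"win": 10.0, "draw": 1.0, "lose": -5.0}

OUTCOME_DIST = {  # P(outcome | action)
    "aggressive": {"win": 0.5, "draw": 0.1, "lose": 0.4},
    "cautious":   {"win": 0.2, "draw": 0.7, "lose": 0.1},
}

def expected_utility(action):
    return sum(p * UTILITY[s] for s, p in OUTCOME_DIST[action].items())

best = max(OUTCOME_DIST, key=expected_utility)
print(best, expected_utility(best))  # aggressive: 0.5*10 + 0.1*1 + 0.4*-5 = 3.1
```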

For me, it seems natural to replace the on/off subgoals & backtracking which appear in crisp systems with a more continuous goal structure. This is based on the analogy to the way the truth values become continuous in these systems, which may be a bit hasty. Still, here is my rough outline:

I've been considering for some time a system which allocates time to different reasoning processes based on a currency of "effort". Effort could take two forms: a longer-term effort would represent a percentage of processing time, while a shorter-term effort would represent an absolute amount of time. There would be long-ish cycles in which every goal that has long-term effort attached to it gets a corresponding amount of short-term effort assigned to it. A goal can then spawn subgoals and assign some of its effort to them.
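
A rough sketch of what I have in mind (all goal names and numbers here are invented for illustration): long-term effort is a share of processing time per goal; each cycle converts it into a short-term budget in seconds; a goal can then delegate part of its budget to subgoals it spawns.

```python
import time

# Sketch of the two kinds of effort. Long-term effort is a fraction of total
# processing time attached to each top-level goal; each cycle converts it
# into a short-term budget measured in seconds, which a goal may split among
# any subgoals it spawns. All names and numbers are made up.

CYCLE_SECONDS = 1.0
long_term_effort = {"prove_theorem": 0.7, "tidy_memory": 0.3}  # shares of CPU time

def work_on(goal, budget):
    """Spend up to `budget` seconds on `goal`, possibly delegating to subgoals."""
    deadline = time.monotonic() + budget
    if goal == "prove_theorem":
        # Delegate half of the remaining budget to a spawned subgoal.
        work_on("find_lemma", (deadline - time.monotonic()) / 2)
    # Spend whatever budget remains on direct work.
    while time.monotonic() < deadline:
        pass  # placeholder for an interruptible reasoning step

def run_cycle():
    """One long-ish cycle: convert long-term shares into short-term budgets."""
    for goal, share in long_term_effort.items():
        work_on(goal, share * CYCLE_SECONDS)

run_cycle()  # each cycle, prove_theorem gets ~0.7s and tidy_memory ~0.3s
```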


This may or may not be a good idea. :) The shorter-term effort seems like a natural concept; it's been invented independently by two different people I've talked with (Ben Goertzel's group, and Russell Wallace). It seems useful for complex combinations of approximation algorithms that can be given more or less time to give better or worse answers, especially if they may employ each other or themselves in a loopy manner. Programs based on this sort of effort would be guaranteed to terminate in whatever amount of time they are given (though perhaps not with a very good answer).
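
In the same spirit, here is the kind of anytime behaviour I mean (a hypothetical example, not any particular system): a procedure takes a time budget, refines its answer in a loop, and returns whatever it has when the budget runs out.

```python
import time

# An "anytime" computation in the short-term-effort sense: it is handed an
# amount of time, improves its answer while time remains, and is guaranteed
# to return by the deadline, just with a rougher answer if the budget was
# small. Estimating pi by a series is only a stand-in for a reasoning process.

def estimate_pi(budget_seconds):
    deadline = time.monotonic() + budget_seconds
    total, k = 0.0, 0
    while time.monotonic() < deadline:
        total += (-1) ** k / (2 * k + 1)   # Leibniz series term
        k += 1
    return 4 * total  # whatever precision the budget allowed

print(estimate_pi(0.01))  # rough estimate
print(estimate_pi(0.5))   # better estimate
```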

The longer-term type of effort is not so obviously useful; hard to say.
