
Date: <2024-12-27 Fri>

Different Conceptions of Learning
Function Approximation vs. Self-Organization


A paper by Pei Wang. [Paper: cis.temple.edu] [Presentation Slides: cis.temple.edu]

Learning can be of two types (there might be others):

Inferential learning appeared in the early days of ML but has since lost favor to deep learning. However, AGI research still faces many challenges, and the author argues that, compared to deep neural networks, inferential learning may provide a better alternative as a learning paradigm for AGI [Page 9 / Conclusions].

Differences:

| Aspect             | Algorithmic Learning (NN)       | Inferential Learning (NARS)                                               |
|--------------------|---------------------------------|---------------------------------------------------------------------------|
| Representation     | Vectors; distributed            | Sentences of a formal language; in-between local and distributed (Page 7) |
| Network            | Layered network; fixed topology | Graph network; dynamic topology                                           |
| Task               | Input/output mapping            | Any question can be asked                                                 |
| Learning           | Training phase                  | Lifelong learning                                                         |
| Learning algorithm | Backprop, gradient descent      | Inference rules of a logic                                                |

Algorithmic learning is "using an algorithm to learn an algorithm". Inferential learning, in contrast, involves a dynamic interaction between multiple algorithms, which is more general but less predictable than the single-algorithm input-output mapping of algorithmic learning.

1. NARS - Non Axiomatic Reasoning System

NARS is an example of a system that does inferential learning. It is based on the following definition of intelligence:

“Intelligence” is the ability for a system to adapt given insufficient knowledge and resources. That is, the system must depend on finite resources to make real-time response while being open to unanticipated problems and events.

Consequently, the system’s solutions are usually not absolutely optimal, but the best the system can find at the time, and the system could always do better if it had more knowledge and resources. [Different Conceptions of Learning.pdf: Page 3]

NARS consists of a knowledge base, a collection of inference rules, and a control mechanism that applies the rules and updates and queries the knowledge base.

  • Its knowledge base is represented as a graph of
    • Nodes = terms
    • Links = statements about those terms (with weight = truth value)
    • Along with priority values on the nodes and links that affect how terms and statements are chosen for inference.
  • As input arrives:
    • new nodes and links are formed,
    • the weights of old links are updated,
    • and the priorities of statements are updated.
  • The statements of NARS have a truth value of belief assigned to them, which is a pair of two numbers: frequency and confidence.
    • Frequency = ratio of positive evidence among total evidence
    • Confidence = ratio of current evidence to the total evidence expected after a constant amount of future evidence arrives (so confidence grows with the amount of evidence)
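
The frequency/confidence pair can be sketched as follows, assuming the standard NAL definitions f = w+/w and c = w/(w + k), where w+ is positive evidence, w is total evidence, and k is a system constant (the "evidential horizon", taken here as 1; the exact value is an assumption of this sketch):

```python
def truth_value(positive, total, k=1.0):
    """Return (frequency, confidence) from evidence counts."""
    frequency = positive / total        # share of positive evidence
    confidence = total / (total + k)    # grows toward 1 as evidence accumulates
    return frequency, confidence

# 9 positive observations out of 10:
f, c = truth_value(positive=9, total=10)   # f = 0.9, c ≈ 0.909
```

Note that confidence never reaches 1: the system stays open to revision by future evidence.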

Thus NARS does not do purely deductive inference but also other types of logical inference. It can also compose terms (using operations similar to set operations: union, intersection, difference) to create new terms, and do inference on statements about them [Page 5].
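
As a toy illustration of term composition (the terms and instances here are hypothetical, and terms are naively modeled as sets of instances, which is only an approximation of NARS's term logic):

```python
# Compound terms formed by set-like operations on naive extensions.
bird = frozenset({"tweety", "robin", "penguin"})
swimmer = frozenset({"penguin", "goldfish"})

swimming_bird = bird & swimmer        # intersection: a new compound term
non_swimming_bird = bird - swimmer    # difference
bird_or_swimmer = bird | swimmer      # union
```

Once composed, such terms can appear as subjects or predicates of new statements, which the inference rules then operate on like any other.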

An example of how generalization works in NARS: say the system gets an observation "Tweety flies" (Tweety is a cartoon character, a bird). Then the system can make the following generalizations:

  1. "Canaries fly" using the information that "Tweety is a canary".
  2. "Birds fly" using the information that "Tweety is a bird".
  3. "Animals fly" using the information that "Tweety is an animal".

The last is an over-generalization, which will lose priority due to a low frequency of evidence (or through negative evidence); the first is an under-generalization, which will lose priority through low confidence (from less total evidence). The second, "Birds fly", is a proper generalization and will get high priority due to a higher frequency of positive evidence than the other two.
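
This can be made concrete with the truth-value definitions f = w+/w and c = w/(w + 1) and some hypothetical evidence counts (the counts below are invented for illustration, not from the paper):

```python
def truth(positive, total, k=1.0):
    """(frequency, confidence) from evidence counts, as in NAL."""
    return positive / total, total / (total + k)

canaries_fly = truth(2, 2)      # little evidence: f = 1.0 but c is low
birds_fly    = truth(45, 50)    # much positive evidence: f and c both high
animals_fly  = truth(50, 200)   # mostly negative evidence: f is low
```

"Birds fly" ends up with both high frequency and high confidence, so it keeps priority while the other two fade.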

Other properties of NARS are:

  • Statements are of subject-copula-predicate form (a copula 1 is a connecting word). Statements can take different forms, denoting inheritance, equivalence, or implication.

    E.g. some statements are of the form \(S \rightarrow P \langle t \rangle\), which means `S` is a specialization of `P` (equivalently, `P` is a generalization of `S`). Here \(t\) is the truth value of the belief, and \(\rightarrow\) is the inheritance copula 1.

  • Statements are themselves terms too. So there can be higher order statements and inferences.
  • Since resources are not infinite, any real-time system that is open to new information needs to forget. [Page 4]
    • Absolute forgetting: some concepts are deleted entirely.
    • Relative forgetting: as some concepts are used infrequently, their priority values keep decreasing, so they are chosen less often in the inference process.
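
Relative forgetting can be sketched as priority decay (this is an assumed mechanism for illustration, not NARS's actual control code; the decay rate and boost are made-up parameters):

```python
DECAY = 0.9  # per-cycle priority decay factor (hypothetical)

class Concept:
    def __init__(self, name, priority=1.0):
        self.name = name
        self.priority = priority

    def tick(self):
        """One system cycle passes without this concept being used."""
        self.priority *= DECAY    # relative forgetting

    def use(self):
        """Concept selected for inference: boost its priority."""
        self.priority = min(1.0, self.priority + 0.2)

tweety = Concept("Tweety")
for _ in range(20):
    tweety.tick()                 # unused for 20 cycles: priority fades toward 0
```

Under absolute forgetting, a concept whose priority falls below some threshold would simply be deleted to reclaim resources.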

Footnotes:

1

Copula means a connecting word, in particular a form of the verb be connecting a subject and complement.

