OutOfBrain cognitive architecture
Shown in the diagram below is a schematic overview of the OutOfBrain architecture.
There is no need to understand the diagram at this point.
We invite you to read along as we describe each component in the following paragraphs.
The network represents the structure of the acquired knowledge.
The network is a Directed Acyclic Graph (DAG), consisting of nodes and connections.
Each node has a number of connections to other nodes.
The connections define how the nodes relate to each other.
As will be shown later, each node represents a specific meaning, condition or piece of knowledge.
The network is managed by a group of agents.
Each node in the network is owned and managed by one of the agents.
Agents are simple autonomous computer programs, each with their own specialized task or goal.
As we will see later, agents can create new nodes and connect them to other nodes.
Each new node represents a new piece of acquired knowledge.
In this way, agents work together by building upon each other's knowledge, collectively stored in the network.
The network then grows as the system learns.
We will talk more about the agents later.
Let's first take a look at the nodes.
Each node has an activity function.
There are different types of activity functions, depending on the agent that created or owns the node.
The activity function is like an algorithm.
It is the algorithm and the connections that give the node its specific meaning and purpose.
The activity function continuously "listens" to the connected nodes.
Under certain conditions the activity function "fires" and the node generates an output value.
We refer to such events as "activity".
The output is a numeric (real) value.
Most nodes output values between 0.0 and 1.0.
In general, the value says how "true" the current state of the node is, where 1.0 means "completely true".
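The node and activity-function mechanics described so far can be sketched in Python. This is a minimal interpretation, not the framework's actual API: the class layout, the `listen` method, and the AND-like example activity function are our assumptions.

```python
class Node:
    def __init__(self, name, activity_fn=None):
        self.name = name
        self.connections = []       # links to other nodes
        self.last_value = None      # most recent output value
        self.activity_fn = activity_fn

    def listen(self):
        # Run the activity function over the connected nodes; it
        # returns a truth value in [0.0, 1.0] when it fires, else None.
        if self.activity_fn is None:
            return None
        value = self.activity_fn(self.connections)
        if value is not None:
            self.last_value = value
        return value

# A hypothetical AND-like activity function: the node is only as
# "true" as the least true of its connected nodes.
def all_true(connections):
    values = [c.last_value or 0.0 for c in connections]
    return min(values) if values else None

a, b = Node("a"), Node("b")
a.last_value, b.last_value = 1.0, 0.5
combined = Node("combined", all_true)
combined.connections = [a, b]
print(combined.listen())  # -> 0.5
```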
Each node in the network is connected to the memory.
Whenever a node outputs a value, the value is written to the memory and marked with the current time.
The memory contains a stream of previous node values and when they happened.
Since every node in the network outputs its activity to the memory, the memory contains a record of the past of the entire network.
The memory is like an EEG of the network, where each signal corresponds with the activity of a single node.
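The memory as described, a time-stamped stream of node activity queryable per node like an EEG channel, could be sketched like this. The `Memory` class and its method names are hypothetical.

```python
import time

class Memory:
    def __init__(self):
        self.events = []  # stream of (timestamp, node_name, value)

    def record(self, node_name, value, timestamp=None):
        self.events.append((time.time() if timestamp is None else timestamp,
                            node_name, value))

    def activity_of(self, node_name):
        # One "EEG channel": a single node's past values over time.
        return [(t, v) for (t, n, v) in self.events if n == node_name]

mem = Memory()
mem.record("input", 104, timestamp=0.0)
mem.record("input", 105, timestamp=0.1)
mem.record("value_h", 1.0, timestamp=0.0)
print(mem.activity_of("input"))  # -> [(0.0, 104), (0.1, 105)]
```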
Patterns & sequences
The combined memory contains a long stream of past activity of each node.
Within this stream of events there are a lot of interesting patterns.
Patterns are forms of regularity in the memory.
Patterns are interesting because they contain information about how events relate to each other.
Understanding the relationships between events helps organize the memory and helps understand the outside world.
As we will see later, learning to detect patterns is useful for predicting the future.
Many of the agents are specialized in the discovery of new patterns.
These "pattern agents" continuously look at activity entering the memory and try to detect new patterns.
There are many different types of patterns, and for each type of pattern there is an agent that is specialized in only that pattern.
The idea is as follows: whenever an agent discovers a new pattern in the activity of certain nodes,
the agent will then store this pattern in the network as a new node.
The new node and its connections are a description of the pattern.
The node itself then listens to the connected nodes and reacts when this particular pattern is detected.
To see how this works, we will look at an example.
One of the simplest types of patterns is a "sequence".
A sequence is a particular order of events where each event corresponds with a particular node activity.
When a new sequence is discovered, the "sequence agent" will create a new "sequence node".
The sequence node is described in detail in the following paragraphs.
A sequence node describes a simple sequence of activity in memory.
The sequence of activity is described by the connections of the sequence node.
The connections are ordered to describe the time-based order of the sequence.
The idea is that the sequence node will fire if the sequence is detected.
The sequence node listens to the activity of the connected nodes entering the memory.
The activity of other nodes in the network is ignored.
When a complete sequence is detected, the sequence node will fire and output a value of 1.0 to memory.
The value 1.0 means "sequence fully detected".
Note that the partial values are also written to memory. In the example shown in the diagram, the intermediate
values of 0.3 and 0.6 correspond with the partial sequence being one-third and two-thirds complete, respectively.
The sequence node only outputs a value of 1.0 if the sequence is fully completed.
The diagram shows an example of an incomplete sequence.
If the sequence is not completed, then the sequence node is reset such that it listens for the activity of the first connected node.
The sequence node therefore acts like a state machine, where a value of 1.0 corresponds with the final state.
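The state-machine behaviour described above can be sketched as follows. This is our reading of the description: the class and field names are assumptions, and the partial values are rounded thirds rather than the diagram's 0.3 and 0.6.

```python
class SequenceNode:
    def __init__(self, expected):
        self.expected = expected   # ordered names of connected nodes
        self.state = 0             # index of the next expected event

    def observe(self, node_name):
        if node_name == self.expected[self.state]:
            self.state += 1
            fraction = round(self.state / len(self.expected), 2)
            if self.state == len(self.expected):
                self.state = 0     # final state reached: start over
            return fraction        # partial or full completion value
        self.state = 0             # mismatch: reset to the first node
        return None

seq = SequenceNode(["a", "b", "c"])
outputs = [seq.observe(n) for n in ["a", "b", "c"]]
print(outputs)  # -> [0.33, 0.67, 1.0]
```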
The connections may have optional properties called filters.
Filters tell the sequence node to ignore activity that does not match the filter condition.
An example of a filter is shown in the diagram, where activity must equal the filter value.
Some filters are more complex and may describe a range of values.
Others contain optional time-based constraints, instructing the sequence node that the
connected nodes must fire within a certain time range.
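One way to picture such filters is as predicate functions over a value and the time elapsed since the previous event in the sequence. The three filter shapes below mirror the ones described (exact match, range, time constraint); their exact form here is our assumption.

```python
# Hypothetical filter constructors: each returns a predicate over
# (value, dt), where dt is the time since the previous sequence event.

def equals(target):
    # Activity must equal the filter value exactly.
    return lambda value, dt: value == target

def in_range(lo, hi):
    # Activity must fall within a range of values.
    return lambda value, dt: lo <= value <= hi

def within(max_dt):
    # Activity must arrive within a certain time range.
    return lambda value, dt: dt <= max_dt

f = in_range(0.5, 1.0)
print(f(0.7, 0.0), f(0.2, 0.0))  # -> True False
```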
It is possible for multiple connections to point to the same node.
This is used when a sequence involves the same node multiple times.
In the example shown in the diagram, the sequence is complete when
the connected node fires three times.
The memory of a sequence node tells when a particular sequence happened in the past.
The memory of sequence nodes makes the original sequences of events in memory redundant, because
this information is also present in simpler form in the sequence node's memory.
As we will see later, the memory agent may decide to remove these redundant pieces of memory.
The memory of sequence nodes can be used for prediction.
Suppose at some point in time a particular sequence is partially complete and we want
to determine the probability of the sequence advancing to the next state.
In other words, we want to know if the next node in the sequence will fire.
In the diagram shown, this corresponds with the probability of the sequence node firing
a value of 1.0 after firing a value of 0.6.
So we only need to look at the sequence node's memory, and not at the memory of the connected nodes.
Let S(n) be the value of the node's current state.
Let S(n+1) be the value of the node's next state.
A simple approximation of the probability P of advancing to the next state is the number of occurrences in memory of the next value S(n+1)
divided by the number of occurrences in memory of the current value S(n):
P(S(n+1)) = #S(n+1) / #S(n)
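A toy worked example of this estimate, applying the formula literally to an invented slice of a sequence node's memory:

```python
# One sequence node's recorded values (invented for illustration).
memory = [0.3, 0.6, 1.0, 0.3, 0.6, 0.3, 0.6, 1.0]

count_next = memory.count(1.0)     # #S(n+1): occurrences of the next value
count_current = memory.count(0.6)  # #S(n): occurrences of the current value
p = count_next / count_current     # P(S(n+1)) = #S(n+1) / #S(n)
print(round(p, 2))  # -> 0.67
```

In words: the sequence reached 0.6 three times but fully completed only twice, so the estimated probability of advancing from 0.6 to 1.0 is about two-thirds.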
Sequence nodes are created by the "sequence agent".
The sequence agent is specialized in discovering frequent sequences in memory.
It does so by selecting a group of nodes, looking at their memory, and determining whether they produce frequently occurring sequences.
If they do, a sequence node is added for each discovered sequence, representing that sequence.
When a sequence node is added, its complete memory is filled retrospectively by the sequence agent.
It is as if the sequence node always existed.
This is possible because the sequence node's memory can be calculated from the connected nodes' memory, which contains all the necessary information.
Loosely speaking, this ensures that newly learned knowledge is automatically applied to the past.
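Retrospective backfill can be sketched by replaying the connected nodes' recorded history through a simple sequence matcher. The `backfill` function and its data layout are hypothetical illustrations, not the framework's code.

```python
def backfill(history, expected):
    """Reconstruct a sequence node's memory from the connected nodes'
    history. history: list of (timestamp, node_name) events; expected:
    the ordered node names of the sequence. Returns (timestamp,
    completion) pairs, as if the sequence node had always existed."""
    state, out = 0, []
    for t, name in history:
        if name == expected[state]:
            state += 1
            out.append((t, round(state / len(expected), 2)))
            if state == len(expected):
                state = 0          # sequence completed: start over
        else:
            state = 0              # mismatch: reset to the first node
    return out

hist = [(0, "a"), (1, "b"), (2, "c"), (3, "x"), (4, "a")]
print(backfill(hist, ["a", "b", "c"]))
# -> [(0, 0.33), (1, 0.67), (2, 1.0), (4, 0.33)]
```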
There are different types of pattern agents and consequently different pattern nodes, each specialized in a particular type of pattern.
Different types of pattern nodes (such as sequence nodes) can be stacked such that they describe patterns of patterns.
In this way, complex patterns can be constructed with high abstraction.
During learning, complex patterns are broken down into smaller parts, i.e. nodes.
This mechanism is similar to the concept of "chunking" in psychology.
The types of nodes determine the "grammar" of the network, where the nodes and connections determine its syntax.
The grammar determines how expressive the system is.
The grammar in OutOfBrain is Turing complete such that the network can represent an algorithm or computer program.
The network is like a computer program which grows as the system learns.
In this way the system has self-programming properties.
The agents are like plugins.
The framework allows a programmer to define and add new agents, to extend the grammar of the network.
Extending the grammar is useful for optimizing the system to operate well in specialized environments.
When creating a new agent, the programmer writes algorithms that interface with the system, typically performing these tasks:
- Search memory for patterns.
- Create new nodes for each pattern.
- Manage nodes and memory.
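The plugin interface implied by this task list might look like the base class below. The method names and signatures are our invention; the framework's real interface may differ.

```python
class Agent:
    """Hypothetical base class a programmer would subclass to add a
    new agent and thereby extend the grammar of the network."""

    def search_patterns(self, memory):
        """Search memory for patterns of this agent's type."""
        raise NotImplementedError

    def create_nodes(self, patterns, network):
        """Create a new node for each discovered pattern."""
        raise NotImplementedError

    def manage(self, network, memory):
        """Housekeeping over the nodes and memory this agent owns."""
        raise NotImplementedError

# A toy concrete agent: its "pattern" is any value seen at least twice.
class RepeatAgent(Agent):
    def search_patterns(self, memory):
        return sorted({v for v in memory if memory.count(v) >= 2})

print(RepeatAgent().search_patterns([1, 2, 2, 3, 3, 3]))  # -> [2, 3]
```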
Agents are given full access to the entire network and memory, other agents and external resources.
When designing a new agent, the programmer must take care of the computational cost of each algorithm,
because the system runs these algorithms very frequently.
A special type of node is the input node.
Input nodes are usually found at the bottom of the network.
Input nodes do not have connections of their own, and therefore do not connect to other nodes by themselves.
The activity function of an input node fires when it is fed a numeric value from an external input, from outside the network (served by the input agent).
The output of the activity function is simply the value that was fed into it, which ends up in the input node's memory.
The purpose of the input nodes is to feed activity from the system's environment into the network and memory.
The input activity function accepts any value that can be represented as a number.
This can be any signal, such as temperature, location, stock market prices and so on.
As an example, consider the input values coming from a chat console, where human input
text is fed as ASCII values to an input node.
The letters 'h' and 'i' will get converted to values 104 and 105, respectively,
and fed to the input node.
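The chat-console example can be sketched as follows; the `InputNode` class and its `feed` method are hypothetical names.

```python
class InputNode:
    def __init__(self):
        self.memory = []  # fed values end up in the input node's memory

    def feed(self, value):
        # The activity function simply passes the external value through.
        self.memory.append(value)
        return value

node = InputNode()
for ch in "hi":           # 'h' -> 104, 'i' -> 105 (ASCII)
    node.feed(ord(ch))
print(node.memory)  # -> [104, 105]
```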
The value nodes are typically found just above the input nodes, i.e. the second layer from the bottom of the network.
They can be found in other layers of the network, but for now we will assume the value nodes to be in the second layer only.
Their purpose is to fire when the connected input node generates a specific value or range.
Value nodes have just one connection, which links to an input node.
Typically there are numerous value nodes that link to a single input node.
The value node's connection has a filter value. This is a fixed value, telling the
value node's activity function to fire only when this particular value is generated by the connected input node.
The value node therefore acts as a filter.
Note that there are several types of filters.
For example, some value nodes hold two filter values, to signify a range.
For simplicity we will assume simple single-valued filter values.
A value node "listens" to a connected input node for a particular value.
The value node's activity function fires when the connected input node generates the same value as the connection's filter value.
When the value node's activity function fires, it outputs a value of 1.0.
The value 1.0 in this case means "fully detected".
If the input node is connected to a chat console, then typically there will be connected
value nodes for each letter of the (ASCII) alphabet.
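A single-valued value node might be sketched like this; the class and method names are assumed, not the framework's API.

```python
class ValueNode:
    def __init__(self, filter_value):
        self.filter_value = filter_value  # the connection's fixed filter

    def listen(self, input_value):
        # "Fully detected" (1.0) only on an exact match; silent otherwise.
        return 1.0 if input_value == self.filter_value else None

h_node = ValueNode(104)  # fires on the ASCII code for 'h'
print([h_node.listen(v) for v in [104, 105, 104]])  # -> [1.0, None, 1.0]
```

In the chat-console setup, one such node per letter of the alphabet would turn a stream of ASCII codes into per-letter activity.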
Value nodes are created by the "value agent".
The value agent is specialized in recognizing frequent values in memory that are generated by certain nodes.
An output node is a node which is linked with the outside world.
For example, it could be connected to a chat console or to a motor driver.
Output nodes have a single connection, simply copying the value of the connected node.
An output node is usually connected to a so-called voluntary node.
Voluntary nodes will be discussed in a later section.
Evaluation nodes represent evaluative and emotional properties such as goals, rewards, happiness, penalties or punishment.
The activity of evaluation nodes tells the system whether a situation is desirable or not.
Evaluation nodes can attach to any type of node, for example input nodes, value nodes or sequence nodes.
The activity of evaluation nodes is monitored by the evaluation agent.
The evaluation agent uses the combined activity to determine the overall state of success.
There are two main types of evaluation nodes: primary evaluation nodes and inferred evaluation nodes.
The primary evaluation nodes are fixed nodes that are assigned by the user or programmer.
The system itself is not authorized to change or remove these fixed primary evaluation nodes.
The user or programmer can place evaluation nodes (at any time) to instruct the system to obey rules or strive towards achieving goals.
This is done by simply telling the system whether certain activities are desirable, undesirable or not allowed.
The inferred evaluation nodes are added and maintained by the evaluation agent.
These nodes serve as heuristics, helping the future agent and voluntary agent detect the desirability of possible futures as early as possible.
The future agent will be discussed in the next section.
The voluntary agent will be discussed in a later section.
The memory is a limited resource.
Obviously we cannot store more memories than the maximum capacity of the memory allows.
To prevent memory overload, the memory is managed by the "memory agent".
The task of the memory agent is to remove parts of the memory that are either redundant or the least interesting.
For example, the memory agent may decide to remove redundant lower-level node activity from memory that is already covered by higher-level "chunks".
There are many other examples of activity that may be removed from memory.
One is so-called "noise activity": activity that does not produce any patterns of interest.
The future memory is a virtual memory space that represents the future, structured as a tree of possible futures.
The future memory is managed by the future agent.
The future agent continuously builds possible futures and adjusts them as time goes by.
The future agent attempts to predict possible futures by looking for nodes which may possibly react next.
For example, a sequence may be partially complete and may indicate what activity is likely to happen next.
Based on prior experience there is a probability that the sequence will fully complete.
If the probability is sufficiently high, then a new possible future is created to represent this.
The probability of this future happening is attached to the new future.
The future agent may also use voluntary nodes to generate new futures.
Voluntary nodes will be discussed in the next section.
For each new future, the future agent communicates with the evaluation agent and requests it to calculate its evaluation score.
The evaluation score is then attached to the future.
For each new future, the future agent attempts to further determine possible futures from there.
A new future presents a new network that may trigger particular sequence nodes to react, spawning yet more new futures, and so on.
The result is a tree of possible futures, with probabilities and evaluation scores.
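The resulting tree of futures might be modelled as below. The `Future` class, its fields, and the greedy probability-times-score selection rule are our assumptions for illustration; the real future agent may select differently.

```python
class Future:
    def __init__(self, description, probability, score):
        self.description = description
        self.probability = probability  # chance of this future happening
        self.score = score              # evaluation agent's desirability
        self.children = []              # futures branching from here

    def best_path(self):
        # Greedy walk: at each step follow the child with the highest
        # probability-weighted evaluation score.
        path, node = [self.description], self
        while node.children:
            node = max(node.children, key=lambda f: f.probability * f.score)
            path.append(node.description)
        return path

root = Future("now", 1.0, 0.0)
a = Future("sequence completes", 0.7, 0.8)
b = Future("sequence aborts", 0.3, 0.1)
root.children = [a, b]
print(root.best_path())  # -> ['now', 'sequence completes']
```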
A voluntary node does not have an activity function.
Instead, the voluntary node is directly activated by the voluntary agent.
The voluntary agent may activate any voluntary node at will, if it sees fit.
To the best of its ability, the voluntary agent tries to select the best sequence of voluntary activity to achieve the most desirable situation.
The voluntary agent does this by communicating with the future agent, to select the most preferred possible future.
The voluntary agent is like a planning agent, because it tries to plan a set of sequences that will result in a desirable situation.
If the voluntary agent has no idea what to do or cannot predict what will happen, then it will simply "try stuff out",
provided that it does not violate any rules imposed by the evaluation agent.
The usual setup is such that output nodes are connected to voluntary nodes, either directly or indirectly.
So the output of voluntary nodes is fed to the external environment through the outputs.
As the voluntary agent interacts with the environment through the outputs, the system will
analyze the impact by trying to detect patterns through the inputs.
The system therefore learns automatically as it interacts with the environment.
Putting it all together
Every node in the network has memory.
The total memory contains patterns.
By understanding these patterns, the agents work together to learn, think, predict, plan and achieve.
So how far will this scale? We do not know ourselves yet as we are still exploring this architecture.
In the meantime, if you have any questions or opinions, please feel free to contact us by e-mail: