Smarter SP parameters #536

Draft: wants to merge 1 commit into base: master

73 changes: 43 additions & 30 deletions src/htm/algorithms/SpatialPooler.hpp
@@ -64,15 +64,23 @@ class SpatialPooler : public Serializable
 {
 public:
   SpatialPooler();
-  SpatialPooler(const vector<UInt> inputDimensions, const vector<UInt> columnDimensions,
-                UInt potentialRadius = 16u, Real potentialPct = 0.5f,
-                bool globalInhibition = true, Real localAreaDensity = DISABLED,
-                Int numActiveColumnsPerInhArea = 10u,
-                UInt stimulusThreshold = 0u, Real synPermInactiveDec = 0.008f,
-                Real synPermActiveInc = 0.05f, Real synPermConnected = 0.1f,
-                Real minPctOverlapDutyCycles = 0.001f,
-                UInt dutyCyclePeriod = 1000u, Real boostStrength = 0.0f,
-                Int seed = 1, UInt spVerbosity = 0u, bool wrapAround = true);
+  SpatialPooler(const vector<UInt> inputDimensions,
+                const vector<UInt> columnDimensions,
+                const UInt potentialRadius = 16u,
+                const Real potentialPct = 0.5f,
+                const bool globalInhibition = true,
+                const Real localAreaDensity = 0.02f,
+                const Int numActiveColumnsPerInhArea = -1u,
+                const UInt stimulusThreshold = 3u,
+                const Real synPermInactiveDec = 0.008f,
+                const Real synPermActiveInc = 0.05f,
+                const Real synPermConnected = 0.1f,
+                const Real minPctOverlapDutyCycles = 0.001f,
+                const UInt dutyCyclePeriod = 1000u,
+                const Real boostStrength = 0.0f,
+                const Int seed = 1,
+                const UInt spVerbosity = 0u,
+                const bool wrapAround = false);
 
   virtual ~SpatialPooler() {}

@@ -99,7 +107,7 @@ class SpatialPooler : public Serializable
columns use 2000, or [2000]. For a three dimensional
topology of 32x64x16 use [32, 64, 16].

-  @param potentialRadius This parameter deteremines the extent of the
+  @param potentialRadius This parameter deteremines the extent of the //TODO change this to potentialRadiusPct 0.0..1.0
Review comment (Member, Author):

Replace with a parameter of similar meaning, expressed as a relative fraction [0.0, 1.0] of the input dimensions.

The current "receptive field radius is 16 [input bits]" does not transfer between input sizes, while "receptive field is 10% [of the input field]" works with any size.

Overall, move from absolute units to relative percentages everywhere.
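
A minimal sketch of how such a relative parameter could map back to today's absolute `potentialRadius`. The helper name `potentialRadiusFromPct` is hypothetical and not part of this PR, and applying the fraction to the largest input dimension is an assumption; other conventions are possible.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of this PR): converts a relative radius,
// given as a fraction of the input extent, into the absolute potentialRadius
// that the current constructor expects.
// Assumption: the fraction is applied to the largest input dimension.
inline std::uint32_t potentialRadiusFromPct(
    const std::vector<std::uint32_t>& inputDimensions,
    double potentialRadiusPct) {
  if (inputDimensions.empty()) {
    return 0u;
  }
  const std::uint32_t largest =
      *std::max_element(inputDimensions.begin(), inputDimensions.end());
  // Clamp the fraction to the proposed range 0.0..1.0.
  const double pct = std::min(1.0, std::max(0.0, potentialRadiusPct));
  return static_cast<std::uint32_t>(std::lround(pct * largest));
}
```

Under these assumptions, a fraction of 0.1 on a 160-bit input gives a radius of 16, which matches the current default, and the same 0.1 scales to other input sizes without retuning.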

input that each column can potentially be connected to. This
can be thought of as the input bits that are visible to each
column, or a 'receptive field' of the field of vision. A large
@@ -109,13 +117,14 @@ class SpatialPooler : public Serializable
column will have a max square potential pool with sides of
length (2 * potentialRadius + 1).

-  @param potentialPct The percent of the inputs, within a column's
-  potential radius, that a column can be connected to. If set to
-  1, the column will be connected to every input within its
+  @param potentialPct The percent of the inputs, within a column's //TODO make this "automated" depending on #potentialRadius & numColumns.
Review comment (Member, Author):

  • rename to columnInputOverlapPct

  • remove it and make it a function Fn(#columns, input area, potential radius, local area pct, *prefer-local-vs-global); below, "+" means the argument increases and the sign after Fn is the direction in which Fn moves:

    • #columns + -> Fn -
    • area + -> Fn -
    • pot radius + -> Fn +
    • local area pct + -> Fn +
    • prefer local + -> Fn +
  • Fn represents "prefer local, details" (over global, holistic)

  • new smart param "prefer local", range 0.0..1.0

Review comment (Member, Author):

this works well, see #533 for successful demonstration

+  @ref `potentialRadius`, that a column can be connected to. //TODO change to similar `columnOverlapPct` 0..1.0
+  If set to 1.0, the column will be connected to every input within its
   potential radius. This parameter is used to give each column a
   unique potential pool when a large potentialRadius causes
-  overlap between the columns. At initialization time we choose
-  ((2*potentialRadius + 1)^(# inputDimensions) * potentialPct)
+  overlap between the columns.
+  At initialization time we choose
+  `((2*potentialRadius + 1)^(# inputDimensions) * potentialPct)`
input bits to comprise the column's potential pool.

@param globalInhibition If true, then during inhibition phase the
@@ -130,13 +139,12 @@ class SpatialPooler : public Serializable
internally calculated inhibitionRadius, which is in turn
determined from the average size of the connected potential
pools of all columns). The inhibition logic will insure that at
-  most N columns remain ON within a local inhibition area, where
-  N = localAreaDensity * (total number of columns in inhibition
-  area).
+  most `N` columns remain ON within a local inhibition area, where
+  `N = localAreaDensity * (total number of columns in inh area)`.
   If localAreaDensity is set to any value less than 0,
-  output sparsity will be determined by the numActivePerInhArea.
+  output sparsity will be determined by the @ref numActivePerInhArea.

-  @param numActiveColumnsPerInhArea An alternate way to control the sparsity of
+  @param numActiveColumnsPerInhArea An alternate way to control the sparsity of //TODO remove this method of operation?!
Review comment (Member, Author):

I propose to completely remove this param, and switch to using localAreaDensity only.

All optimized models (mnist, hotgym) use the localAreaDensity.

When using this method, as columns
learn and grow their effective receptive fields, the
inhibitionRadius will grow, and hence the net density of the
active columns will decrease. This is in contrast to the

I especially dislike this part; the density of the SP should remain constant.
Removing it would rid us of a mutually exclusive parameter pair (a "mutex"), making parameter optimization easier.

Are there any use cases where this mode of operation would be favorable?
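
To illustrate the difference between the two modes, here is a small sketch based on the formulas in the doc comment. The function names are hypothetical, and rounding `N` to the nearest integer is an assumption; the doc comment only gives the product.

```cpp
#include <cmath>
#include <cstdint>

// Density mode: N = localAreaDensity * (columns in the inhibition area).
// Assumption: the product is rounded to the nearest integer.
inline std::uint32_t activeColumnsFromDensity(double localAreaDensity,
                                              std::uint32_t columnsInArea) {
  return static_cast<std::uint32_t>(
      std::lround(localAreaDensity * columnsInArea));
}

// Fixed-count mode: the effective density is count / area size, so it
// shrinks whenever the inhibition area grows.
inline double effectiveDensity(std::uint32_t numActiveColumnsPerInhArea,
                               std::uint32_t columnsInArea) {
  if (columnsInArea == 0u) {
    return 0.0;
  }
  return static_cast<double>(numActiveColumnsPerInhArea) / columnsInArea;
}
```

In density mode the number of winners follows the area size, so the sparsity stays at the configured value. In fixed-count mode the same 10 winners spread over a growing area, which is the density drift described in the quoted doc text.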

active columns. When numActivePerInhArea > 0, the inhibition logic will insure that
at most 'numActivePerInhArea' columns remain ON within a local
inhibition area (the size of which is set by the internally
@@ -148,20 +156,20 @@ class SpatialPooler : public Serializable
columns the same regardless of the size of their receptive
fields.
If numActivePerInhArea is specified then
-  localAreaDensity must be < 0, and vice versa.
+  @ref localAreaDensity must be < 0, and vice versa.

-  @param stimulusThreshold This is a number specifying the minimum
+  @param stimulusThreshold This is a number specifying the minimum //TODO replace with `robustness` 0..1.0, which will affect this & synPermInc/Dec
Review comment (Member, Author):

stimulusThreshold represents the "robustness" (to noise) of a segment well.

  • bump the default so that it is not too small (to 2, 3, 4, ...?)
  • it must not be too high, or no segment will be able to satisfy it and no learning will occur -> automatically check that the number of potential synapses on a segment is x times (2 times?) larger than the threshold
  • in the "smart" parameters, replace it with "robustness" [0.0..1.0]
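
The automatic check from the second bullet could look roughly like this. Both function names are hypothetical, the pool size uses the formula from the `potentialPct` doc comment, and the sketch assumes the pool is not clipped by the input borders. The factor of 2 is the tentative value from the comment, not a tuned constant.

```cpp
#include <cstddef>
#include <cstdint>

// Potential pool size per the constructor docs:
// (2 * potentialRadius + 1)^(# inputDimensions) * potentialPct.
// Assumption: the pool is not clipped by the borders of the input.
inline double potentialPoolSize(std::uint32_t potentialRadius,
                                std::size_t numInputDimensions,
                                double potentialPct) {
  const double side = 2.0 * potentialRadius + 1.0;
  double volume = 1.0;
  for (std::size_t d = 0; d < numInputDimensions; ++d) {
    volume *= side;
  }
  return volume * potentialPct;
}

// Hypothetical sanity check: the pool must hold at least
// safetyFactor times as many potential synapses as the threshold requires.
inline bool stimulusThresholdIsReachable(std::uint32_t stimulusThreshold,
                                         double poolSize,
                                         double safetyFactor = 2.0) {
  return poolSize >= safetyFactor * stimulusThreshold;
}
```

With the defaults in this diff (radius 16, one input dimension, `potentialPct` 0.5) the pool holds 16.5 synapses on average, so the new default threshold of 3 passes the check while a threshold of 10 would not.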

number of synapses that must be active in order for a column to
turn ON. The purpose of this is to prevent noisy input from
activating columns.

-  @param synPermInactiveDec The amount by which the permanence of an
+  @param synPermInactiveDec The amount by which the permanence of an //TODO make fixed and only depend on robustness?
Review comment (Member, Author):

Make it fixed? It would then depend only on the robustness modifier (robustness increases -> both the increment and the decrement get smaller).

Review comment (Collaborator):

synPermActiveInc and synPermInactiveDec can be reformulated as learningRate and coincidenceThreshold, where:

  • coincidenceThreshold = inc / dec
  • learningRate = 1 / inc, which is the maximum number of cycles it takes for a synapse's permanence to saturate.

Review comment (Member, Author):

That might be better, learningRate for sure.
What is the meaning of coincidenceThreshold? I take it that it sets whether it is easier to learn new patterns (and "fill" the memory faster) or to forget them when they are not repeated (so stable vs "one-shot" patterns?), with the two balanced when inc and dec are equal (is this a golden middle?).

So for long time series I'd choose more forgetting, and for short, new, relatively rare events more learning?
The SP does not really unlearn (it could, but its capacity is just huge)?

inactive synapse is decremented in each learning step.

-  @param synPermActiveInc The amount by which the permanence of an
+  @param synPermActiveInc The amount by which the permanence of an //TODO ditto
Review comment (Member, Author):

TODO: how are ActiveInc and InactiveDec related? Something like "prefer forgetting, or learning new?", which comes down to the relative ratio of the two.

active synapse is incremented in each round.

-  @param synPermConnected The default connected threshold. Any synapse
+  @param synPermConnected The default connected threshold. Any synapse //TODO remove, hard-coded in Connections, raise to 0.5 from 0.2?
Review comment (Member, Author):

Definitely make it hard-coded.

I propose changing it to 0.5 (or the midpoint between minPermanence and maxPermanence). Is there a reason why it is set closer to the minimum everywhere? (0.2 and 0.1 are common defaults; 0.5 performs well on MNIST.)

whose permanence value is above the connected threshold is
a "connected synapse", meaning it can contribute to
the cell's firing.
@@ -171,10 +179,11 @@ class SpatialPooler : public Serializable
stimulusThreshold active inputs. Periodically, each column looks
at the overlap duty cycle of all other column within its
inhibition radius and sets its own internal minimal acceptable
-  duty cycle to: minPctDutyCycleBeforeInh * max(other columns'
-  duty cycles). On each iteration, any column whose overlap duty
+  duty cycle to:
+  `minPctDutyCycleBeforeInh * max(other columns' duty cycles)`.
+  On each iteration, any column whose overlap duty
   cycle falls below this computed value will get all of its
-  permanence values boosted up by synPermActiveInc. Raising all
+  permanence values boosted up by @ref synPermActiveInc. Raising all
permanences in response to a sub-par duty cycle before
inhibition allows a cell to search for new inputs when either
its previously learned inputs are no longer ever active, or when
@@ -183,9 +192,12 @@ class SpatialPooler : public Serializable
@param dutyCyclePeriod The period used to calculate duty cycles.
Higher values make it take longer to respond to changes in
boost. Shorter values make it potentially more unstable and
-  likely to oscillate.
+  likely to oscillate. //TODO do not allow too small
+  //TODO make this to dutyCyclePeriodPct 0..1.0, which uses
+  //TODO new `samplesPerEpoch`, if known. For MNIST (image dataset) this would be #image samples,
+  //for stream with a weekly period this would be #samples per week.
Review comment (Member, Author):

  • check for too small values
  • make it a relative fraction of a new "epochSize" (samplesPerEpoch, period)
  • epochSize is an estimate of the periodicity of the data, e.g.:
    • weekly recurring time series: number of samples per week
    • MNIST dataset: number of samples in the training set
    • unknown (time series): 0, meaning infinity
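
A sketch of how the relative period might be resolved to an absolute `dutyCyclePeriod`. The function name and the `minPeriod` and `fallbackPeriod` parameters are hypothetical: the lower bound of 10 is an arbitrary placeholder for the "too small" check, and falling back to the current default of 1000 when the epoch size is unknown (0) is an assumption, since the comment only says that 0 stands for infinity.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: resolves a relative duty-cycle period against the
// estimated periodicity of the data (samplesPerEpoch).
// Assumptions: samplesPerEpoch == 0 means "unknown" and maps to
// fallbackPeriod; minPeriod is an arbitrary lower bound.
inline std::uint32_t dutyCyclePeriodFromPct(double dutyCyclePeriodPct,
                                            std::uint32_t samplesPerEpoch,
                                            std::uint32_t minPeriod = 10u,
                                            std::uint32_t fallbackPeriod = 1000u) {
  if (samplesPerEpoch == 0u) {
    return fallbackPeriod;
  }
  // Clamp the fraction to the proposed range 0.0..1.0.
  const double pct = std::min(1.0, std::max(0.0, dutyCyclePeriodPct));
  const std::uint32_t period =
      static_cast<std::uint32_t>(std::lround(pct * samplesPerEpoch));
  // Reject periods that are too small to give a stable duty cycle.
  return std::max(minPeriod, period);
}
```

The lower bound is applied after scaling, so a very small fraction on a short epoch cannot produce a period that makes boosting oscillate.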


-  @param boostStrength A number greater or equal than 0, used to
+  @param boostStrength A number greater or equal than 0, used to //TODO no biological background(?), remove
Review comment (Member, Author):

  • verify the biological background for boosting, and remove it altogether if there is none (boosting does help somewhat on MNIST; see whether this can be mitigated with a new param config?)

  • if not removed, make it fixed (2.0), or derive it from robustness (boost = 2.0 * <inverse ratio of robustness>)

control boosting strength. No boosting is applied if it is set to 0.
The strength of boosting increases as a function of boostStrength.
Boosting encourages columns to have similar activeDutyCycles as their
@@ -202,6 +214,7 @@ class SpatialPooler : public Serializable
@param wrapAround boolean value that determines whether or not inputs
at the beginning and end of an input dimension are considered
neighbors for the purpose of mapping inputs to columns.
+  //TODO does it hurt to set this to always true? We could rm NonWrappingNeighbourhood
Review comment (Member, Author):

whether or not inputs
at the beginning and end of an input dimension are considered
neighbors for the purpose of mapping inputs to columns

Biologically, if we assume a hierarchy, a Region we model with the SP is a portion (a "rectangle") of a 2D sheet. Its input field is another 2D sheet (or a retina, ...), so inputs on one side are not close to those on the other. So we should leave this OFF?
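
What the flag changes can be shown with a one-dimensional distance. This is an illustrative sketch only: `distance1D` is a hypothetical function, not the neighbourhood code used by the library, and it assumes both positions are smaller than `size`.

```cpp
#include <algorithm>
#include <cstdint>

// Distance between two positions along one input dimension.
// Assumption: a < size and b < size.
inline std::uint32_t distance1D(std::uint32_t a, std::uint32_t b,
                                std::uint32_t size, bool wrapAround) {
  const std::uint32_t direct = (a > b) ? (a - b) : (b - a);
  if (!wrapAround) {
    return direct;
  }
  // With wrapping, the path across the border may be shorter.
  return std::min(direct, size - direct);
}
```

On a dimension of 100 inputs, positions 0 and 99 are neighbours at distance 1 with wrapping and 99 apart without it, which is the sheet-versus-torus distinction raised in the comment.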


*/
virtual void