NM500 User’s Manual

Data Sheet

Revision History
Version 0.1 Revised December 12, 2016
Version 0.91 Revised June 13, 2017
Version 0.91 Revised July 4, 2017

NM500 is a product manufactured by nepes (www.nepes.kr).

NM500 is subject to a license from General Vision for the NeuroMem® technology. For information about the NeuroMem® digital neuromorphic technology, refer to the General Vision web site at www.general-vision.com and in particular the NeuroMem Technology Reference Guide.

Table of Contents
1. Overview
1.1. Architecture
1.2. A chain of identical neurons
2. Programming sequence
2.1. Vector broadcasting
2.2. Recognize a vector
2.3. Learn a vector
2.4. Reading the number of committed neurons
2.5. Saving and restoring the committed neurons
2.6. Reading the contents of a single specific neuron
2.7. Typical operation latency
3. The control registers
3.1. Neuron registers in detail
3.2. Remark: NCOUNT for the chain of more than 65535 neurons
3.3. Remark: Usage of the TEST registers
4. The NeuroMem buses and control lines
4.1. Clocks, power-up and reset
4.2. The NeuroMem bus
5. Timing considerations
5.1. Register Access Latency
5.2. Typical Timings Constraints
6. Interconnecting chips
7. Physical specifications
7.1. Pinout
7.2. Mechanical specifications
7.3. Electrical Specifications
8. FAQ
8.1. Hardware issues
8.2. Functional issues

1. Overview
The NM500 chip is a fully parallel silicon neural network: it is a chain of identical elements (i.e. neurons) addressed in parallel and which have their own “genetic” material to learn and recall patterns without running a single line of code and without reporting to any supervising unit. In addition, the neurons fully collaborate with each other through a bi-directional and parallel neuron bus which is the key to accuracy, adaptivity and speed performance. Indeed, each neuron incorporates information from all the other neurons into its own learning logic and into its response logic.

The neurons can learn and recognize input vectors autonomously and in parallel. If several neurons recognize a pattern (i.e. “fire”), their responses can be retrieved automatically in increasing order of distance (equivalent to a decreasing order of confidence). The information which can be read from a firing neuron includes its distance, category and neuron identifier. If the responses of several or all firing neurons are polled, this data can be consolidated to make a more sophisticated decision, for example one that weighs the cost of uncertainty.
1.1. Architecture
The NM500 chip is composed of the following modules:
- Neuron Interconnect module
- A cluster of neurons, which is instantiated several times in the chip

1.1.1. Neuron Interconnection
- Synchronizes communication between the clusters of neurons
- Inter-module communication is made through a bi-directional parallel bus of 26 wires: data strobe (DS), read/write (RWn), 4-bit register (REG), 16-bit data (DATA), ready (RDY)
- Inter-neuron communication also uses two additional lines indicating the global status of the neural network: identified recognition (IDn), uncertain recognition (UNCn).

1.1.2. Cluster of Neurons
- Group of 24 identical neurons operating in parallel.
- All neurons have the same behavior and execute the instructions in parallel, independently of the cluster or even the chip they belong to.
- No controller or supervisor
- Selection of one out of two classifiers: K-Nearest Neighbor (KNN) or Radial Basis Function (RBF) and more precisely a Restricted Coulomb Energy (RCE) neural network
- Recognition time is independent of the number of neurons in use
- Automatic model generator built into the neurons
- Save and Restore of the contents of the neurons in 258 clock cycles per neuron
- Simple Register Transfer Level instruction set through 15 registers
1.1.3. Inter-chip cascadability (series connection)
The NM bus establishes intra-chip and inter-chip connectivity.

1.2. A chain of identical neurons
1.2.1. The basic neuron entity
The neuron cell is composed of a memory and a set of 6 registers as described in the diagram below.

Three of the registers are read only.

1.2.2. Neuron daisy-chaining
A neuron can have three states in the chain: IDLE, Ready-To-Learn (RTL) or COMMITTED. It becomes committed as soon as it learns a pattern and its category register is written with a value different from 0. Its Daisy-Chain-Out (DCO) control line automatically rises, changing its status from Ready-To-Learn to Committed. The next neuron in the chain then becomes the Ready-To-Learn neuron: it has its Daisy-Chain-In (DCI) high and its Daisy-Chain-Out (DCO) low.

The transfer of the DCI-DCO from one neuron to the next is activated the same way whether or not the two consecutive neurons belong to the same cluster, or even to the same chip.

1.2.3. Parallel access to the neurons
All the neurons decode and execute the commands received through the neuron bus in parallel. This bus also allows all the firing neurons to interact with one another such that the “winner takes all” in the case of a recognition and such that only novelty commits a new neuron in the case of learning. This is the key enabler to deliver a recognition time independent of the number of committed neurons in the chain.
1.2.4. Sequential access to the neurons
The contents of the committed neurons are a representation of the knowledge they have built autonomously by learning examples. This knowledge can be read and written in a mode called Save and Restore, which accesses the neurons sequentially.
2. Programming sequence
This chapter describes the typical programming sequences to use the neurons in standard mode and in Save and Restore mode:
- Broadcast a vector to all the neurons (whether to learn or recognize it)
- Recognize the last broadcast vector
- Learn the last broadcast vector
- Save the content of all the neurons
- Read the content of a specific neuron
- Load the content of the neurons
2.1. Vector broadcasting
The memory of the neurons is 256 bytes long so the vectors to learn or recognize can be composed of up to 256 components of 8-bit value.
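The broadcast described above can be sketched as follows. This is a minimal illustration, not vendor code: it assumes a hypothetical `write_reg(reg, value)` callback provided by the host platform, and uses the COMP and LCOMP addresses from the control register table (all components but the last go to COMP, the last one goes to LCOMP).

```python
COMP, LCOMP = 0x01, 0x02  # register addresses from the control register table


def broadcast(write_reg, vector):
    """Send all components but the last with Write COMP, the last with Write LCOMP.

    write_reg is a hypothetical callback (reg, value) supplied by the host platform.
    """
    if not 1 <= len(vector) <= 256:
        raise ValueError("a vector holds 1 to 256 components")
    if any(not 0 <= c <= 0xFF for c in vector):
        raise ValueError("each component must fit in 8 bits")
    for component in vector[:-1]:
        write_reg(COMP, component)
    write_reg(LCOMP, vector[-1])


# Usage with a stand-in bus that only records the register accesses.
log = []
broadcast(lambda reg, value: log.append((reg, value)), [11, 12, 13])
print(log)
```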

2.2. Recognize a vector
A vector broadcast to the neuron bus is evaluated by all the committed neurons in parallel. The neural network can exercise two types of classifiers: K-Nearest Neighbor (KNN) or Radial Basis Function (RBF) and more precisely a Restricted Coulomb Energy (RCE) neural network.

The KNN classifier always returns a response, whereas the RBF classifier discriminates between cases of positive identification, uncertainty and unknown.
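The difference between the two classifiers can be illustrated with a small software model. This is only a sketch: it uses the L1 (Manhattan) distance, and it assumes that a neuron fires when its distance is strictly smaller than its influence field, which this manual does not state explicitly. The function names and the `(model, category, influence field)` tuples are illustrative.

```python
def l1_distance(model, vector):
    """Manhattan (L1) distance between a stored model and an input vector."""
    return sum(abs(m - v) for m, v in zip(model, vector))


def recognize_rbf(neurons, vector):
    """RBF/RCE model: only neurons whose distance is below their influence field fire."""
    firing = []
    for model, category, aif in neurons:
        dist = l1_distance(model, vector)
        if dist < aif:  # assumption: firing condition is dist < influence field
            firing.append((dist, category))
    firing.sort()  # increasing distance = decreasing confidence
    categories = {category for _, category in firing}
    if not firing:
        status = "unknown"
    elif len(categories) == 1:
        status = "identified"
    else:
        status = "uncertain"
    return status, firing


def recognize_knn(neurons, vector, k=1):
    """KNN model: every committed neuron answers, so a response is always returned."""
    ranked = sorted((l1_distance(model, vector), category) for model, category, _ in neurons)
    return ranked[:k]


neurons = [([10, 10], 1, 5), ([20, 20], 2, 5)]  # (model, category, influence field)
print(recognize_rbf(neurons, [11, 10]))  # close to the first model
print(recognize_rbf(neurons, [15, 15]))  # outside both influence fields
print(recognize_knn(neurons, [15, 15]))  # KNN still answers
```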

The response of the firing neurons can be accessed by a succession of (Read DIST, followed by Read CAT and optionally Read NID registers). The first distance quantifies the difference between the input vector and the neuron with the closest pattern. The category of this neuron is the category with the highest confidence level. The second distance quantifies the difference between the input vector and the neuron with the second closest pattern. The category of this neuron is the category with the second highest confidence level, and so on. In the case of the RBF classifier, all the firing neurons have been read when Read DIST returns the value 0xFFFF.

The following diagram illustrates the three levels of response which can be delivered by the neurons through the readout of the registers NSR, DIST, CAT and NID. They are listed per increasing number of system cycles:
- Conformity, or status of the recognition (identified, uncertain or unknown)
- Best match in distance and its associated category
- All possible matches listed per increasing distance values.

Conformity detection
Detect if a pattern is recognized, or not, by reading the network status register (NSR) or the ID and UNC lines of the chip. This response can be sufficient in the case of presence/absence detection, pass/fail, etc.
Sequence: Read NSR (1 cycle)

Best match
If an application has a low cost of mistake, reading the distance and category of the neuron with the best match can be sufficient. The ID line or NSR bit 3 indicates if other neurons are firing and are, or not, in disagreement.
Sequence: Read Dist (18 cycles), Read Cat (19 cycles), Read NID (1 cycle, optional)

Detailed match
If an application has a high cost of mistake, the response of all the firing neurons might be of interest to obtain a better accuracy. A global response can then be established using probability functions, dispersion of the distances, minimum number of aggregates, etc.
Sequence:
Do
Read Dist (18 cycles)
Read Cat (19 cycles)
Read NID (1 cycle, optional)
k++
While dist[k]!=0xFFFF

Remarks:

If two neurons fire with the same distance but different categories, their individual responses are read as follows: Read Dist, Read Cat, Read Dist, Read Cat. The second Read Dist returns the same value as the first Read Dist but is necessary to access the category register of the second neuron.

If two neurons fire with the same distance and same category, only the response of the first one is read. The first Read Dist will notify both neurons to stay in query, but both will output their category at the following Read Cat and therefore exclude themselves from the next query. A second Read Dist will return the next higher distance value if applicable.

A Write Category command can be executed immediately after a Read Distance + Read Category sequence without having to re-enter the vector. This can be useful for applications such as in predictive maintenance or target tracking where you want to know what is recognized prior to learning a novelty.

If the category value is greater than 0x8000 (32768), i.e. bit 15=1, this is a warning that the neuron is “degenerated”. The real category value can be obtained by clearing bit 15 (AND with 0x7FFF). The degenerated flag simply indicates that the neuron was prevented from shrinking its AIF to a smaller value during training and that its response should be weighted with care, or simply differently than the response of a neuron which is not degenerated.
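The readout loop and the handling of the degenerated flag described in this section can be sketched as follows. The `read_reg(reg)` callback and the `fake_bus` helper are hypothetical stand-ins for the host platform; only the register addresses and the 0xFFFF / bit 15 conventions come from this manual.

```python
DIST, CAT, NID = 0x03, 0x04, 0x0A  # register addresses from the control register table


def read_matches(read_reg, max_matches=None, with_nid=False):
    """Read the firing neurons in increasing order of distance.

    Stops when Read DIST returns 0xFFFF, or after max_matches responses
    (useful with the KNN classifier, where every committed neuron fires).
    """
    matches = []
    while max_matches is None or len(matches) < max_matches:
        dist = read_reg(DIST)
        if dist == 0xFFFF:  # no more firing neurons
            break
        cat = read_reg(CAT)  # must follow Read DIST
        match = {"dist": dist, "cat": cat & 0x7FFF, "degenerated": bool(cat & 0x8000)}
        if with_nid:
            match["nid"] = read_reg(NID)  # optional, after Read CAT
        matches.append(match)
    return matches


def fake_bus(firing):
    """Stand-in for the hardware: serves (dist, cat) pairs in order."""
    queue = list(firing)

    def read_reg(reg):
        if reg == DIST:
            return queue[0][0] if queue else 0xFFFF
        if reg == CAT:
            return queue.pop(0)[1] if queue else 0xFFFF
        raise ValueError(reg)

    return read_reg


print(read_matches(fake_bus([(5, 2), (9, 0x8003)])))
```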
2.3. Learn a vector
All the neurons have their internal learning logic and teaching a vector is as simple as broadcasting its components and then writing its category value. Optionally, the PowerSave register can be written to set the data lines in tri-state mode so they do not draw current.

Remarks
If the pair (vector and category) represents novelty, a new neuron is committed.

If some firing neurons recognize the vector to learn as belonging to a different category, they automatically reduce their influence field to prevent such erroneous recognition in the future.

If the network is full, a learning operation will have no effect. You can detect that all the neurons of the network are already committed by executing the Read NCOUNT command which will then return the value 0xFFFF.

If the AIF of a neuron reaches the Minimum Influence Field, bit 15 of its category register is set to 1. The neuron is said to be “degenerated”. It still reacts to input patterns as any other committed neuron, but bit 15 of its category indicates that the neuron was prevented from shrinking its AIF to a smaller value during training and that its response should be weighted differently than the response of another firing neuron which is not degenerated.
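A learning operation, including the check for a full network, can be sketched as below. The `write_reg` / `read_reg` callbacks and the `FakeChain` class are hypothetical; the fake commits a neuron on every non-zero Write CAT, whereas the real neurons only commit on novelty.

```python
COMP, LCOMP, CAT, NCOUNT = 0x01, 0x02, 0x04, 0x0F  # addresses from the register table


def learn(write_reg, read_reg, vector, category):
    """Broadcast a vector, write its category, report whether a neuron was committed."""
    if not 0 <= category <= 0x7FFE:
        raise ValueError("category must be between 0 and 0x7FFE")
    before = read_reg(NCOUNT)
    if before == 0xFFFF:  # all neurons committed: learning would have no effect
        raise RuntimeError("the network is full")
    for component in vector[:-1]:
        write_reg(COMP, component)
    write_reg(LCOMP, vector[-1])
    write_reg(CAT, category)
    return read_reg(NCOUNT) != before


class FakeChain:
    """Stand-in for the chip: commits one neuron per non-zero Write CAT (simplification)."""

    def __init__(self):
        self.count = 0
        self.log = []

    def write_reg(self, reg, value):
        self.log.append((reg, value))
        if reg == CAT and value != 0:
            self.count += 1

    def read_reg(self, reg):
        return self.count if reg == NCOUNT else 0


chain = FakeChain()
print(learn(chain.write_reg, chain.read_reg, [1, 2, 3], 5))
```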
2.4. Reading the number of committed neurons
The NCOUNT register returns the number of committed neurons in the chain, EXCEPT when the chain is full meaning that all the neurons are committed, in which case NCOUNT=0xFFFF.

If the number of chips daisy-chained in the system, N, is known, the readout of NCOUNT=0xFFFF becomes a simple indication that the number of committed neurons is actually N*576.

If N is unknown, due to a reconfigurable and/or stackable hardware architecture, the readout of NCOUNT=0xFFFF can trigger the following sequence of operations in order to obtain the number of committed neurons: switch the network to Save and Restore mode, point at the first neuron of the chain and iterate, reading the neurons’ categories sequentially until a category 0 or 0xFFFF is reached. The number of iterations is equal to the number of committed neurons. Calling this function might take a few seconds if your platform includes thousands of neurons.
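The fallback sequence above can be sketched as follows, assuming hypothetical `read_reg` / `write_reg` callbacks and a stand-in `FullChain` class. The value written to RESETCHAIN (0) is an assumption, since the manual does not specify one, and writing 0x00 to NSR on exit also clears the KNN bit.

```python
CAT, RESETCHAIN, NSR, NCOUNT = 0x04, 0x0C, 0x0D, 0x0F  # addresses from the register table


def count_committed(read_reg, write_reg):
    """Return the number of committed neurons, even when NCOUNT reads 0xFFFF."""
    count = read_reg(NCOUNT)
    if count != 0xFFFF:
        return count
    write_reg(NSR, 0x10)       # switch to Save and Restore mode
    write_reg(RESETCHAIN, 0)   # point at the first neuron (written value assumed irrelevant)
    count = 0
    while read_reg(CAT) not in (0, 0xFFFF):  # each Read CAT moves to the next neuron
        count += 1
    write_reg(NSR, 0x00)       # back to normal mode (also clears the KNN bit)
    return count


class FullChain:
    """Stand-in for a chain in which every neuron is committed."""

    def __init__(self, categories):
        self.categories = categories
        self.pointer = 0

    def write_reg(self, reg, value):
        if reg == RESETCHAIN:
            self.pointer = 0

    def read_reg(self, reg):
        if reg == NCOUNT:
            return 0xFFFF
        if reg == CAT:
            if self.pointer >= len(self.categories):
                return 0xFFFF
            self.pointer += 1
            return self.categories[self.pointer - 1]
        raise ValueError(reg)


chain = FullChain([1, 2, 1, 3, 2])
print(count_committed(chain.read_reg, chain.write_reg))
```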

2.5. Saving and restoring the committed neurons
The content of the committed neurons describes a knowledge which can be saved and restored. This functionality is useful for backup purposes, but also to transfer and duplicate knowledge between NeuroMem networks.

Both functions require setting the neurons in Save_and_Restore mode and pointing to the first neuron of the chain. For each neuron, you can read or write its components, context, minimum influence field and active influence field in any order, except for the category register, which must be read or written last because it moves to the next neuron in the chain. Finally, when the neurons have been saved or restored, the last operation consists of setting the neurons back to their normal operation mode.

[Figures: Save sequence; Restore sequence]

Remarks
Note that in Save_and_Restore mode the last component is written to the COMP register and not to the LCOMP register.

If it is known that all neurons hold a pattern with only M significant components with M<256, the number of Read COMP can be limited to M, thus speeding the Save operation.

If an application does not use the notion of context, saving the context register might not be necessary, saving one clock cycle per saved neuron.

Saving the MINIF is necessary if it is known that additional training will be done at a later time to complete or expand the knowledge.

You can proceed in two ways to detect that all committed neurons have been read and stop the iterations: (1) read the NCOUNT register prior to switching to Save_and_Restore mode and set the number of iterations to this value; (2) iterate until you read a category 0, which indicates that you are pointing at the ready-to-learn neuron of the chain and that the last committed neuron was the previous one.
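The Save and Restore sequences described in this section can be sketched as below. The `read_reg` / `write_reg` callbacks and the `FakeChain` model are hypothetical, and the value written to RESETCHAIN is an assumption. The save loop uses the second stop condition (category 0), plus 0xFFFF for a full chain as mentioned in section 2.4.

```python
NCR, COMP, CAT, AIF, MINIF = 0x00, 0x01, 0x04, 0x05, 0x06
RESETCHAIN, NSR = 0x0C, 0x0D


def save_neurons(read_reg, write_reg, length=256):
    """Read the committed neurons one by one in Save_and_Restore mode."""
    write_reg(NSR, 0x10)      # enter Save_and_Restore mode
    write_reg(RESETCHAIN, 0)  # point to the first neuron (value assumed irrelevant)
    neurons = []
    while True:
        ncr = read_reg(NCR)
        comp = [read_reg(COMP) & 0xFF for _ in range(length)]
        aif = read_reg(AIF)
        minif = read_reg(MINIF)
        cat = read_reg(CAT)   # read last: moves to the next neuron
        if cat in (0, 0xFFFF):
            break
        neurons.append(dict(ncr=ncr, comp=comp, aif=aif, minif=minif, cat=cat))
    write_reg(NSR, 0x00)      # back to normal mode
    return neurons


def restore_neurons(write_reg, neurons):
    """Write neuron contents back; in this mode the last component also goes to COMP."""
    write_reg(NSR, 0x10)
    write_reg(RESETCHAIN, 0)
    for n in neurons:
        write_reg(NCR, n["ncr"])
        for component in n["comp"]:
            write_reg(COMP, component)
        write_reg(AIF, n["aif"])
        write_reg(MINIF, n["minif"])
        write_reg(CAT, n["cat"])  # written last: moves to the next neuron
    write_reg(NSR, 0x00)


class FakeChain:
    """Stand-in for the chip in Save_and_Restore mode (illustrative only)."""
    FIELDS = {NCR: "ncr", AIF: "aif", MINIF: "minif"}

    def __init__(self, size):
        self.neurons = [dict(ncr=0, comp=[0] * 256, aif=0, minif=0, cat=0) for _ in range(size)]
        self.ptr = self.idx = 0

    def _next_neuron(self):
        self.ptr += 1
        self.idx = 0

    def write_reg(self, reg, value):
        if reg == NSR:
            return
        if reg == RESETCHAIN:
            self.ptr = self.idx = 0
        elif reg == COMP:
            self.neurons[self.ptr]["comp"][self.idx] = value
            self.idx += 1
        elif reg == CAT:
            self.neurons[self.ptr]["cat"] = value
            self._next_neuron()
        else:
            self.neurons[self.ptr][self.FIELDS[reg]] = value

    def read_reg(self, reg):
        n = self.neurons[self.ptr]
        if reg == COMP:
            self.idx += 1
            return n["comp"][self.idx - 1]
        if reg == CAT:
            self._next_neuron()
            return n["cat"]
        return n[self.FIELDS[reg]]


knowledge = [
    dict(ncr=1, comp=[1, 2, 3], aif=10, minif=2, cat=7),
    dict(ncr=1, comp=[4, 5, 6], aif=8, minif=2, cat=0x8003),
]
chip = FakeChain(size=4)
restore_neurons(chip.write_reg, knowledge)
print(save_neurons(chip.read_reg, chip.write_reg, length=3) == knowledge)
```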
2.6. Reading the contents of a single specific neuron
Reading the contents of a specific neuron is made in the following order:
- The first operation consists of setting the chip in Save_and_Restore mode and pointing to the first neuron of the chain
- In order to point to the ith neuron in the chain, (i-1) consecutive Read CAT are necessary
- You can then read the ith neuron’s components, context, minimum influence field and active influence field in any order. The category register must be read last because the instruction automatically points to the next neuron in the chain.
- Finally, the last operation consists of setting the chip back to the normal mode.
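The steps above can be sketched as follows. The `read_reg` / `write_reg` callbacks and the `make_fake` helper are hypothetical stand-ins for the host platform, and the value written to RESETCHAIN is an assumption.

```python
NCR, COMP, CAT, AIF, MINIF = 0x00, 0x01, 0x04, 0x05, 0x06
RESETCHAIN, NSR = 0x0C, 0x0D


def read_neuron(read_reg, write_reg, i, length=256):
    """Return the contents of the i-th neuron of the chain (i starts at 1)."""
    if i < 1:
        raise ValueError("neuron positions start at 1")
    write_reg(NSR, 0x10)      # Save_and_Restore mode
    write_reg(RESETCHAIN, 0)  # first neuron of the chain (value assumed irrelevant)
    for _ in range(i - 1):
        read_reg(CAT)         # each Read CAT moves to the next neuron
    neuron = {
        "ncr": read_reg(NCR),
        "comp": [read_reg(COMP) & 0xFF for _ in range(length)],
        "aif": read_reg(AIF),
        "minif": read_reg(MINIF),
    }
    neuron["cat"] = read_reg(CAT)  # read last
    write_reg(NSR, 0x00)      # back to normal mode
    return neuron


def make_fake(neurons):
    """Stand-in for the chip: returns (read_reg, write_reg) over a list of neuron dicts."""
    state = {"ptr": 0, "idx": 0}

    def write_reg(reg, value):
        if reg == RESETCHAIN:
            state["ptr"] = state["idx"] = 0

    def read_reg(reg):
        n = neurons[state["ptr"]]
        if reg == COMP:
            state["idx"] += 1
            return n["comp"][state["idx"] - 1]
        if reg == CAT:
            state["ptr"] += 1
            state["idx"] = 0
            return n["cat"]
        return n[{NCR: "ncr", AIF: "aif", MINIF: "minif"}[reg]]

    return read_reg, write_reg


stored = [
    dict(ncr=1, comp=[1, 2], aif=10, minif=2, cat=7),
    dict(ncr=1, comp=[3, 4], aif=6, minif=2, cat=9),
]
read_reg, write_reg = make_fake(stored)
print(read_neuron(read_reg, write_reg, 2, length=2))
```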

2.7. Typical operation latency
Times are given in microseconds at 27 MHz, for L=256, N=576 and K=3.

Operation | Clock cycles | Time (microseconds)
Broadcast a vector of length L | L+3 | 9.56
Learn a vector of length L | L+3+18 | 10.26
Status of a vector of length L | L+3+1 | 9.59
Best match of a vector of length L | L+3+37 | 10.93
Get the K top matches of a vector of length L | L+3+K*37 | 13.67
Save N neurons | 4+(260)*N | 5,546.67
Restore N neurons | 4+(260)*N | 5,546.67
3. The control registers
Under Normal mode, the NM500, or a chain of NM500, can learn and recognize patterns. In recognition, the neurons can behave as a K-Nearest Neighbor (KNN) or Radial Basis Function (RBF) and more precisely a Restricted Coulomb Energy (RCE) neural network.

Under the SR mode, the automatic model generator and search-and-sort logic are disabled. The neurons become dummy memories but can be read or written in the least amount of time. This SR mode is essential to transfer knowledge bases between hardware platforms, or to make a backup prior to learning additional examples.

The following table describes the 15 registers controlling the entire behavior of the neurons under the Normal and Save-and-Restore mode.

Register Description Addr (8-bit) Normal mode SR mode Default (16-bit)

NSR Network Status Register 0x0D RW W 0x0000
GCR Global Context Register 0x0B RW 0x0001
MINIF Minimum Influence Field 0x06 RW RW 0x0002
MAXIF Maximum Influence Field 0x07 RW 0x4000
NCR Neuron Context Register 0x00 RW 0x0001
COMP Component 0x01 W RW 0x0000
LCOMP Last Component 0x02 W 0x0000
INDEXCOMP Component index 0x03 W W 0x0000
DIST Distance register 0x03 R R 0xFFFF
CAT Category register 0x04 RW RW 0xFFFF
AIF Active Influence Field 0x05 RW 0x4000
NID Neuron Identifier 0x0A R R 0x0000
POWERSAVE PowerSave 0x0E W n/a
FORGET Forget 0x0F W n/a
NCOUNT Count of committed neurons 0x0F R R 0x0000
RESETCHAIN Points to the first neuron 0x0C W n/a
TESTCOMP Test Component 0x08 W 0x0000
TESTCAT Test Category 0x09 W 0x0000

3.1. Neuron registers in detail
Each register is listed with its abbreviation, its name and bit layout, and its behavior in Normal mode and in SR mode.

NSR: Network Status Register
Bit[1:0], reserved
Bit[2], UNC (Uncertain)
Bit[3], ID (Identified)
Bit[4], SR mode
Bit[5], KNN classifier
Normal mode: The ID and UNC bits are updated internally after each Write Last Comp command. ID is high if all firing neurons report the same category. UNC is high if several neurons fire but disagree on the category. Note that this is always the case if the mode is KNN and 2 committed neurons have different categories. KNN is a recognition mode and should not be active while learning. Indeed, any pattern would be recognized whatever its distance from a neuron, and the learning would only create a single neuron per new category.
SR mode: Writing Bit 4 to 1 switches the chain of neurons to SR mode and points directly to the RTL neuron.

GCR: Global Context and also partial identifier of the RTL neuron
Bit[6:0] = Context
Bit[7] = LSUP Norm
Bit[23:16] = Identifier[23:16]
Normal mode: Context in use for any new learning or recognition. If the Norm is not set to LSUP, the default is the L1 norm or Manhattan distance. Accessing the 3rd upper byte of the RTL neuron is needed if the chain of neurons is larger than 65535 neurons.
SR mode: N/A

MINIF: Minimum Influence Field
Normal mode: Value in use for any new neuron commitment.
SR mode: Value of the pointed neuron at the time it was committed.

MAXIF: Maximum Influence Field
Normal mode: Value in use for any new neuron commitment.
SR mode: N/A

NCR: Neuron Context Register
Normal mode:
Bit[15:8] = 0x00
Bit[7:0] = Identifier[23:16] of the RTL neuron
SR mode: Value of the pointed neuron
Bit[15:8] = Identifier[23:16]
Bit[7] = LSUP Norm
Bit[6:0] = Context [0, 127]

COMP: Component
Bit[15:8] = unused
Bit[7:0] = byte component
Normal mode: Each Write COMP stores the component at the current INDEXCOMP value and updates the DIST register of the committed neurons with NCR=GCR and also of the RTL neuron. INDEXCOMP is automatically incremented.
SR mode: After each Read or Write, moves to the next INDEXCOMP of the pointed neuron.

LCOMP: Last Component
Bit[15:8] = unused
Bit[7:0] = byte component
Normal mode: Write LCOMP stores the component at the current INDEXCOMP value and updates the DIST register of the committed neurons with NCR=GCR and also of the RTL neuron. INDEXCOMP is set to 0. The IDn and UNCn lines are updated to report the recognition status. If the IDn line is low, the “identified category” is available on the DATA bus.
SR mode: N/A

INDEXCOMP: Component index
Common index pointing to the neurons’ memory between [0, 255]. Write INDEXCOMP moves to a specific index value, but does not reset the DIST register.
This value is incremented automatically after each Read COMP or Write COMP. It is reset after a Write LCOMP.

DIST: Distance register
Value between [0, 65535]. DIST=0 means that the vector matches exactly the model of the firing neuron. The higher the distance, the farther the vector from the model.
Normal mode: This register is updated by the neuron during the broadcast of components (Write COMP and Write LCOMP). Read DIST returns the distance of the top firing neuron. This “winner” neuron pulls out of the race, so the next Read DIST will be answered by the next top firing neuron, etc. DIST=0xFFFF means that there are no more firing neurons. Must be read after Write LCOMP and before Read CAT.
SR mode: N/A

CAT: Category register
Bit 15 = Degenerated flag (read-only)
Bits[14:0] = Category value between 0 and 32766 (0x7FFE)
CAT greater than 32768 means that the responding neuron is degenerated. The value must be masked with 0x7FFF to report the original category of the neuron.
Normal mode: Write CAT of 0 does not commit a new neuron, but may force existing committed neurons to reduce their influence fields. Read CAT returns the category of the top firing neuron. CAT=0xFFFF means that there are no more firing neurons. Must be read after the DIST register except if the IDn line is low and the NID register does not need to be read after the CAT register.
SR mode: Category of the pointed neuron. Read or Write CAT automatically moves to the next neuron index in the chain.

AIF: Active Influence Field
Normal mode: This register is updated automatically by all the firing neurons during learning operations (i.e. Write CAT).
SR mode: Influence field of the pointed neuron.

NID: Neuron Identifier or index of the neuron in the chain
Bit[15:0] = 2 lower bytes of a 3-byte neuron identifier. The upper byte is stored in the NCR register. Its access is only necessary when the chain of neurons is larger than 65535.
Normal mode: This register is assigned automatically when the RTL neuron gets committed after a Write CAT. Read NID returns the identifier of the firing neuron with the least distance and least category. It must be read after a Read CAT. (1)
SR mode: This register is assigned automatically when the pointed neuron gets assigned a category different from 0 with a Write CAT.

POWERSAVE: PowerSave mode
Writing this register resets the DATA lines to a tri-state mode and ensures that they do not draw current from the pull-up resistors.

FORGET
Normal mode: Uncommit all neurons by clearing their category register. Note that the neuron’s memory is not cleared, but its index is reset to point at the first component. Also resets the MINIF, MAXIF and GCR to their default values.
SR mode: N/A

NCOUNT: Count of committed neurons
Bit[15:0] = 2 lower bytes of the count
Normal mode: NCOUNT=0xFFFF means that all neurons of the chain are committed. If the chain of neurons is greater than 65535 neurons, this can also mean that 65535 neurons are indeed committed. Reading the upper byte of the NCR register can extend the count to a 3-byte value.
SR mode: Index of the neuron pointed in the chain. Write RESETCHAIN points to the first neuron. If it is committed, NCOUNT will be equal to 1, otherwise 0.

RESETCHAIN
Normal mode: N/A
SR mode: Points to the first neuron of the chain.

TESTCOMP: Write the pointed component of all neurons with the input value.

TESTCAT: Write the same category to all the neurons. Useful for test routines to commit all neurons in one clock cycle.

(1) If the content of the neurons has been built using their model generator, there should be no occurrences of firing neurons with the same distance and same category. As a result, reading the NID returns the identifier of the sole firing neuron. If, on the contrary, the content of the neurons has been loaded in Save-and-Restore mode and is such that multiple neurons can fire with the same distance and same category, reading the NID will return an “irrelevant” value which is the AND of all their identifiers.
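As an illustration of the NSR bit layout given at the start of this section, the following sketch decodes a 16-bit NSR readout into the recognition status and the mode flags. The function names are illustrative and not part of any vendor library.

```python
NSR_UNC = 1 << 2  # Bit[2], Uncertain
NSR_ID = 1 << 3   # Bit[3], Identified
NSR_SR = 1 << 4   # Bit[4], SR mode
NSR_KNN = 1 << 5  # Bit[5], KNN classifier


def decode_nsr(nsr):
    """Split an NSR readout into its documented flags."""
    return {
        "uncertain": bool(nsr & NSR_UNC),
        "identified": bool(nsr & NSR_ID),
        "sr_mode": bool(nsr & NSR_SR),
        "knn": bool(nsr & NSR_KNN),
    }


def recognition_status(nsr):
    """Return 'identified', 'uncertain' or 'unknown' for the last broadcast vector."""
    flags = decode_nsr(nsr)
    if flags["identified"]:
        return "identified"
    if flags["uncertain"]:
        return "uncertain"
    return "unknown"


print(recognition_status(0x08), recognition_status(0x04), recognition_status(0x00))
```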
3.2. Remark: NCOUNT for the chain of more than 65535 neurons
The DATA bus being 16-bit wide, the single readouts of the NCOUNT and NID registers are insufficient for a chain of more than 65,535 (0xFFFF) neurons. In such a case, both the neuron count and identifier values must be reported on 24 bits as follows:

Operation | If chain <= 65535 neurons | If chain > 65535 neurons
Report the number of committed neurons | Ncount = Read NCOUNT | N1 = Read GCR; N2 = Read NCOUNT; Ncount = N1[15:8]*0xFFFF+N2[15:0]
Report the identifier of the next firing neuron | Nid = Read NID | N1 = Read NCR[7:0] (the bits reporting the identifier are shifted to the lower byte at the time of the readout; the context information is dropped); N2 = Read NID[15:0]; Nid = N1[15:8]*0xFFFF+N2[15:0]

3.3. Remark: Usage of the TEST registers
Usage 1: Clear the neurons’ memory
- Write NSR 0x10 (set the SR mode)
- For i=0 to 255
o Write TESTCOMP 0
- Write TESTCAT 0
- Write NSR 0x00 (cancel the SR mode)

Usage 2: Count the neurons in the chain
- Write NSR 0x10 (set the SR mode)
- Write TESTCAT Value (commit all the neurons with the same category value)
- Write RESETCHAIN (point to the 1st neuron in the chain)
- Ncount=0
- Do Loop
o Read CAT, cat
o Ncount++
- Until cat=0xFFFF ((Ncount-1) is the number of neurons in the chain)
- Write NSR 0x00 (cancel the SR mode)

4. The NeuroMem buses and control lines
This chapter describes the buses, control lines and interrupt lines.

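Group | Symbol | Type | Description
Configuration lines | VCC | | Core power supply (1.2 V)
Configuration lines | VCCIO | | IO power supply line (3.3 V)
Configuration lines | GND | | Ground line
Configuration lines | DCI | Input | Daisy Chain In
Configuration lines | DCO | Output | Daisy Chain Out
Clock and Reset | GCLK | Input | System clock
Clock and Reset | GRESETn | | Hardware reset
Clock and Reset | CSn | Input | Enable chip activity
NeuroMem bus | DS | Bidir | Data strobe line
NeuroMem bus | RWn | Bidir | Read/Write
NeuroMem bus | REG[3:0] | Bidir | Register
NeuroMem bus | DATA[15:0] | Bidir | Data
NeuroMem bus | UNCn | Bidir | Uncertain_low line
Neuron output lines | IDn | Bidir | Identified_low line
Neuron output lines | RDY | Bidir | Ready line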
4.1. Clocks, power-up and reset
4.1.1. GRESETn, global reset
The chip is reset at power-up by pulling down the GRESETn pin for a minimum of 5 clock cycles. An internal reset signal is then sustained for 255 clock cycles to filter any bouncing of the GRESETn external pulse. It is propagated internally to the neurons so all registers are set to their default values. In a multi-chip configuration, the same GRESETn must be connected to all chips.
4.1.2. GCLK, system clock
The chip operates at a typical system clock of 37 MHz. If multiple chips are connected in parallel the typical system clock is 16 MHz.
4.1.3. CSn, power saving control line
The CSn line controls the propagation of the system clock GCLK to the neurons. It is pulled low by default letting the clock run continuously.

Pulling up the CSn line when the chip is unused considerably reduces its power consumption (from 500 mW to 25 mW). On the other hand, the timing to pull CSn back down and let the system clock pass through must be accurate: (1) it must be pulled down on a negative edge of GCLK, at the latest when the external data strobe (DS) is pulled up; (2) it must be released on the negative edge of the system clock following the rise of the RDY signal at the earliest or the fall of the B_BSY signal.

4.1.4. DCI
Until the DCI line of a chip is high, its neurons are idle. As soon as the DCI line rises, the neurons of the chip become active, meaning ready to learn and recognize.

In a configuration with multiple chips, the Daisy-Chain-In (DCI) line of the first chip must be high. For the subsequent chips, the connection between their DCO and DCI lines physically arranges them in a chain. The DCI line of a chip must be connected to the DCO of the previous chip in the chain. Its status is then controlled by the neurons of the previous chip.
4.1.5. DCO
The Daisy-Chain-Out (DCO) line of a chip must be connected to the DCI of the next chip in the chain, if applicable. It is low by default and will rise when the last neuron of the chip gets committed. If this line is connected to the DCI of another chip, the latter will wake its neurons, which become Ready-To-Learn.
4.2. The NeuroMem bus
The neurons receive and execute instructions simultaneously through a bi-directional parallel bus composed of 26 lines:

Symbol | Description
DS | Data strobe line
RWn | Read/Write line (default is Read=1)
REG | 4-bit register address
DATA | 16-bit register data
RDY | Ready control line mixing the RDY output signal of all the neurons in the chain and indicating that the neurons are all ready to execute a new command
IDn | Control line mixing the IDn output signal of all the neurons in the chain and indicating that neurons have identified the last vector and that these neurons are all in agreement on its classification
UNCn | Control line mixing the UNCn output signal of all the neurons in the chain and indicating that neurons have identified the last vector but that these neurons are in disagreement on its classification. This line is bidirectional because it is used as an input during the execution of certain Write register commands.

4.2.1. Timings
The neurons sample these signals on the positive edge of the system clock GCLK. Their setup time must be at least 5 nanoseconds before the positive edge of GCLK. The hold time must be at least 5 nanoseconds after the positive edge of the clock. The signals have to be released before the next positive edge of the clock to ensure that the data bus becomes bi-directional for proper execution of the commands requiring snooping of the bus.

Depending on the register to access and the status of the neurons, the Read and Write commands can take between 1 and 19 clock cycles.

The neurons sample a new command on the positive edge of the system clock and pull down their RDY line for the duration of its execution. Upon completion, the RDY line is pulled back up on the positive edge of the system clock.

A Write command (DS, RWn=0, REG, DATA) must be stable on the positive edge of the system clock and released before the next positive edge of the system clock.

A Read command (DS, RWn=1, REG) must be stable on the positive edge of the system clock and released before the next positive edge of the system clock. DATA is stable when the RDY control line is pulled high.

Write in one cycle
(REG 0x06 is the MINIF register)

Read in one clock cycle
(REG 0x04 is the CAT register, read in this case in SR mode)

Write in two cycles
(REG 0x02 is the LCOMP register)

Remark:
When the DS signal is asserted, the DATA bus must hold the input value (i.e. 0x000b). It is then switched to a tri-state mode (i.e. 0xFFFF). During the second and last cycle of the Write LCOMP, the firing neurons output their category value and DATA represents their resulting bit-per-bit AND combination (i.e. 0x0001). If this value is different from the category of one of the firing neurons, the UNCn line is pulled down (not the case illustrated in the above timing diagram).
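The bit-per-bit AND combination and the resulting uncertainty condition can be modeled in a few lines. This is a software illustration of the rule stated in the remark, not a description of the silicon; the function names are illustrative.

```python
from functools import reduce


def bus_and(categories):
    """Value seen on DATA when several neurons drive it: the bitwise AND of all outputs.

    With no driver the released bus reads 0xFFFF, matching the tri-state default.
    """
    return reduce(lambda a, b: a & b, categories, 0xFFFF)


def is_uncertain(categories):
    """UNCn condition: the merged value differs from at least one neuron's category."""
    merged = bus_and(categories)
    return any(category != merged for category in categories)


print(hex(bus_and([0x0001, 0x0001])), is_uncertain([0x0001, 0x0001]))
print(hex(bus_and([0x0005, 0x0006])), is_uncertain([0x0005, 0x0006]))
```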

Read in sixteen cycles
(REG 0x03 is the DIST register)

4.2.2. DS
The data strobe line, DS, must be asserted and de-asserted at the negative edge of GCLK. It must be asserted only when the RDY line is high.
4.2.3. RWn
The Read/Write line, RWn, must be low to write and high to read. It is high (Read) by default. This signal is sampled on the positive edge of GCLK when DS is high. In the case of a Write command, it must be pulled low only for the duration of the DS high and then immediately released to allow the interconnectivity of the neurons during a Write Last Component or a Write Category.
4.2.4. REG[3:0]
The four Register lines, REG, represent the 4-bit address of the register to read or write. They are sampled on the positive edge of GCLK when DS is high and must not be released before the rise of the RDY line.
4.2.5. DATA[15:0]
The 16 DATA lines are connected to open collectors and can have three different states:
- During a write operation (RWn is low and DS is high), DATA is the 16-bit value to write to the selected register. It is sampled by the neurons at the positive edge of GCLK when DS is high and RWn is low.
- At the end of a read operation (RWn is high and RDY is rising), DATA is the 16-bit value of the selected register. It can be read on or after the rising edge of RDY after the fall of DS. The default output value is 0xFFFF.
- During the execution of the commands which last more than one clock cycle, the DATA lines must be released to allow the mixing and snooping of the responses of all the neurons connected in parallel in the same chain. These operations are the Write LCOMP, Write CAT, Read DIST and Read CAT.
4.2.6. RDY
The Ready line, RDY, is pulled down by the neurons during the execution of a command and released upon its termination. It is updated at the positive edge of the system clock GCLK whether or not the command is recognized by the neurons.
4.2.7. IDn
The Identified line, IDn, is pulled down when all the neurons recognizing the last input vector are in agreement and return the same category. This line is updated each time the last component of a vector is broadcast to the neurons through a Write LCOMP command. The actual update occurs at the 3rd negative edge of the clock during the execution of the Write LCOMP. The IDn line is released at the next Write COMP.

The IDn line is also continuously latched in bit [3] of the NSR register of the chip at the positive edge of the clock.
4.2.8. UNCn
The Uncertain line, UNCn, is bidirectional and shall not be driven. It is an output during a recognition operation and an input during a learning operation.

UNCn is pulled down when the neurons recognizing the last input vector have different categories. This update occurs each time a Write LCOMP is executed. The actual update occurs at the 3rd negative edge of the clock during the execution of the Write LCOMP. The UNCn line is released at the next Write COMP.

Note that UNCn is always pulled down if the mode is KNN and 2 committed neurons have different categories.

The UNCn line is also continuously latched in bit [2] of the NSR register of the chip at the positive edge of the clock.

During a Write CAT, this line is asserted by the neurons if the last input vector is recognized as a novelty and must be stored into a new neuron.
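Since IDn and UNCn are latched in bits [3] and [2] of the NSR register, a controller can read the recognition status from a single register read instead of sampling the two lines. The sketch below is a minimal illustration; the function name `decode_nsr` and the dictionary keys are hypothetical, and the polarity of the latched bits is not stated in this section, so the raw bit values are returned without interpretation.

```python
NSR_UNC_BIT = 2  # UNCn latch position, per section 4.2.8
NSR_ID_BIT = 3   # IDn latch position, per section 4.2.7


def decode_nsr(nsr):
    """Extract the latched IDn and UNCn bits from a raw NSR value.

    The bits are returned as read (0 or 1). Whether 1 means "line pulled
    down" or "line released" must be checked against the device itself.
    """
    return {
        "id_bit": (nsr >> NSR_ID_BIT) & 1,
        "unc_bit": (nsr >> NSR_UNC_BIT) & 1,
    }


print(decode_nsr(0x08))
```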
5. Timing considerations
5.1. Register Access Latency
The following table reports the number of clock cycles (cc) necessary to read and write the registers of the chip. The cycles are counted from the first rising edge of the system clock after DS is received to the rising edge of the RDY signal at the end of the command.

Addr Register Description Learn and Recognition mode (Write cycles, Read cycles) Save and Restore mode (Write cycles, Read cycles)
0x00 NCR Neuron Context Register 1 1
0x01 COMP Component 1 1 1
0x02 LCOMP Last Component 1 if no neurons, 3 otherwise
0x03 INDEXCOMP Component Index 1 1
0x03 DIST Distance 18 1
0x04 CAT Category 1 if ID, 19 otherwise 3 if ID, 19 otherwise 1 1
0x05 AIF Active Influence Field 1 1
0x06 MINIF Minimum Influence Field 1 1 1
0x07 MAXIF Maximum Influence Field 1 1
0x08 TESTCOMP Test Component 1
0x09 TESTCAT Test Category 1
0x0A NID Neuron Identifier 1 1
0x0B GCR Global Context Register 1 1
0x0C RESETCHAIN 1
0x0D NSR Network Status Register 1 1
0x0F FORGET Clear the neurons 1
0x0F NCOUNT Committed neurons 1 1

5.1.1. Commands executing in multiple cycles (LCOMP, CAT and DIST)
Accessing most registers takes a single clock cycle. In Learn and Recognition mode, reading and writing the LCOMP, DIST and CAT registers can take between 3 and 19 clock cycles depending on the content of the neurons at the time of execution. This means that two neurons can execute the same instruction in a different number of clock cycles, depending on their status and the values of their internal registers. For example, a neuron which does not recognize an input pattern executes the Read DIST instruction in 1 cycle, whereas a neuron which recognizes the pattern (i.e. fires) participates in the Search and Sort race for up to 16 clock cycles. The Ready line of the chip indicates when all the neurons have finished executing an instruction and can receive a new one.

Write LCOMP (0x02), Read DIST (0x03), and Read and Write CAT (0x04) are “snooping” commands, meaning that the neurons mix their responses on the open-collector bus and read the result back. Releasing the DATA lines, as well as the IDn and UNCn lines, after the fall of the DS signal is critical for the neurons to snoop properly.
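A controller driver can encode the list of snooping commands so that it always releases the shared lines when required. The sketch below is an illustration under assumptions: the names `SNOOPING_COMMANDS` and `must_release_lines` and the "read"/"write" strings are hypothetical, while the register addresses are those given in the table above.

```python
LCOMP, DIST, CAT = 0x02, 0x03, 0x04  # register addresses from the latency table

# (operation, register) pairs that the text lists as snooping commands
SNOOPING_COMMANDS = {
    ("write", LCOMP),
    ("read", DIST),
    ("read", CAT),
    ("write", CAT),
}


def must_release_lines(operation, register):
    """Return True if DATA, IDn and UNCn must be released after DS falls."""
    return (operation, register) in SNOOPING_COMMANDS


print(must_release_lines("write", LCOMP))  # snooping command
print(must_release_lines("write", 0x01))   # Write COMP is not listed
```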

5.1.2. Multiple read/write to the COMP register
Broadcasting a vector to the neurons consists of a succession of Write COMP commands (up to 255) terminated by a Write LCOMP. The series of Write COMP can be executed with a sustained DS signal provided that the data is updated and stable at each new positive edge of the system clock. For reference, the waveforms shown under the paragraph “Recognizing a vector received through the digital video bus” illustrate the use of a sustained DS signal during the feed of all but the last component value.
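The broadcast rule can be sketched as a function that turns a vector into the ordered list of register writes. This is a minimal illustration, not a driver: the function name `broadcast_sequence` and the (register, value) tuple format are assumptions, and the 256-component limit is derived from the "up to 255 Write COMP plus one Write LCOMP" statement above.

```python
COMP, LCOMP = 0x01, 0x02  # register addresses from the latency table
MAX_COMPONENTS = 256      # up to 255 Write COMP plus one Write LCOMP


def broadcast_sequence(vector):
    """Return the (register, value) writes needed to broadcast a vector.

    Every component except the last goes to COMP; the last one goes to
    LCOMP, which is the command that triggers the neurons' response.
    """
    if not 1 <= len(vector) <= MAX_COMPONENTS:
        raise ValueError("vector must hold between 1 and 256 components")
    writes = [(COMP, value) for value in vector[:-1]]
    writes.append((LCOMP, vector[-1]))
    return writes


print(broadcast_sequence([11, 12, 13, 14]))
```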
5.2. Typical Timings Constraints
In the example below, a vector of 8 components is learned and then recognized. The resolution of the diagrams does not allow reading the DATA values of the components and the category, but this is not important for understanding the timing constraints of the chip.

The DS, RWn, REG and DATA signals are updated at the negative edge of the system clock (GCLK) so that they are stable when the neurons read them at the next positive edge of GCLK. The RDY signal is then immediately pulled down by the neurons and released at the first positive edge of GCLK following the completion of the command. The duration during which the RDY signal is low represents the execution time of the command.

In the case of a Read command, the output DATA is ready to be read when RDY rises.
5.2.1. Learn a vector
The sequence of instructions consists of 3 Write COMP, 1 Write LCOMP, and 1 Write CAT.

When REG is equal to 01, each DS pulse triggers a Write COMP lasting one cycle of GCLK. The RDY pulse has the same duration as the DS pulse, shifted by half a clock cycle.

When REG is equal to 02, the DS pulse triggers a Write LCOMP. The RDY signal is pulled down for 3 cycles. The fact that both lines IDn and UNCn are pulled up indicates that the input vector is not recognized by any existing neuron. The subsequent Write CAT command will necessarily commit a new neuron.

When REG is equal to 04, the DS pulse triggers a Write CAT. The RDY signal is pulled down for 3 cycles.

5.2.2. Recognize a vector
The sequence of instructions consists of 3 Write COMP, 1 Write LCOMP, 1 Read DIST and 1 Read CAT.

When REG is equal to 01, each DS pulse triggers a Write COMP. The RDY signal is pulled down for one cycle.

When REG is equal to 02, the DS pulse triggers a Write LCOMP. The RDY signal is pulled down for 3 cycles. The UNCn is pulled down at the last negative edge of GCLK before RDY is pulled back up. This indicates that the input vector is recognized by more than one neuron and that different categories are identified.

When REG is equal to 03 and RWn remains high, the DS pulse triggers a Read DIST. The RDY signal is pulled down for 18 cycles which is the duration of the Search and Sort looking for the firing neuron with the smallest distance value. This distance is equal to 08.

When REG is equal to 04 and RWn remains high, the DS pulse triggers a Read CAT. The RDY signal is pulled down for 19 cycles which is the duration of the Search and Sort looking for the firing neuron with a distance register equal to 08 and the smallest category value. This category is equal to 01.

Remark: Since it is known that the recognition status is uncertain (UNCn is low), executing another series of Read DIST followed by Read CAT would report the distance and category of the next neuron with the smallest distance.
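The RDY-low durations quoted in this recognition example can be added up to estimate the execution time of the whole sequence. The sketch below is only an estimate under stated assumptions: the function names are hypothetical, idle cycles between commands are ignored, and the 36 MHz figure is the maximum single-chip clock frequency given in the electrical specifications.

```python
def recognition_cycles(n_write_comp, lcomp=3, read_dist=18, read_cat=19):
    """Sum the RDY-low cycles of one recognition sequence.

    Defaults are the durations quoted in the example: 1 cycle per
    Write COMP, 3 for Write LCOMP, 18 for Read DIST, 19 for Read CAT.
    """
    return n_write_comp * 1 + lcomp + read_dist + read_cat


def cycles_to_microseconds(cycles, clock_hz):
    """Convert a number of clock cycles to microseconds."""
    return cycles / clock_hz * 1e6


total = recognition_cycles(3)  # 3 Write COMP, as in the sequence above
print(total, "cycles")
print(round(cycles_to_microseconds(total, 36_000_000), 3), "us at 36 MHz")
```

For the sequence above this gives 3 + 3 + 18 + 19 = 43 cycles, or roughly 1.19 microseconds at 36 MHz.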
6. Interconnecting chips
One of the benefits of the NM500 architecture is that you can cascade multiple chips in parallel to expand the size of the neural network in increments of 576 neurons. The behavior of the neurons is the same in a single-chip or multiple-chip configuration.

A chain of multiple chips is defined by connecting their NeuroMem bus together and also to external pull-up resistors when applicable (refer to the pinout table for details).

The external controller sending Read/Write commands to a chain of chips must be careful to release the bidirectional lines as soon as the Ready signal falls. Failure to do so will prevent the proper execution of commands interconnecting all the neurons together and using the bi-directional lines of DATA, IDn, UNCn, RDY as output on the negative edge of the clock and input at the next positive edge of the clock.

Also in the case of commands taking more than one clock cycle, the RWn line must be asserted only for the duration of the DS pulse.

When the IDn lines of multiple chips are merged, the merged IDn must also be treated as high whenever the merged UNCn line is low. Indeed, within each chip UNCn prevails over IDn among all its neurons, but across multiple chips this precedence must be enforced by firmware.
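The precedence of UNCn over IDn across chips can be sketched in firmware-style logic as follows. This is an illustration under assumptions: the function name `merged_status` is hypothetical, line levels are represented as 0 (pulled down) and 1 (released), and the merged lines are assumed to behave as wired-AND, which is the usual behavior of open-drain lines with pull-ups.

```python
def merged_status(idn_levels, uncn_levels):
    """Return the effective (IDn, UNCn) levels for a chain of chips.

    idn_levels, uncn_levels: per-chip line levels, 0 = pulled down,
    1 = released. A merged line is low if any chip pulls it down.
    If the merged UNCn is low, IDn is forced high so that an uncertain
    status on one chip overrides an identified status on another.
    """
    uncn = 0 if any(level == 0 for level in uncn_levels) else 1
    idn_wired = 0 if any(level == 0 for level in idn_levels) else 1
    idn = 1 if uncn == 0 else idn_wired
    return idn, uncn


# Chip 0 reports "identified" but chip 1 reports "uncertain"
print(merged_status([0, 1], [1, 0]))
```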
7. Physical specifications
7.1. Pinout
The following table describes the pins of the NM500, their type as I/O or bi-directional lines:

**Bottom view

Pin # Symbol Description Type Internal If daisy chained
D8 CSn Chip Select Input
F2 DATA[0] Data line 0 Bidir PU 45KΩ PU external
E1 DATA[1] Data line 1 Bidir PU 45KΩ PU external
E2 DATA[2] Data line 2 Bidir PU 45KΩ PU external
D1 DATA[3] Data line 3 Bidir PU 45KΩ PU external
C2 DATA[4] Data line 4 Bidir PU 45KΩ PU external
C1 DATA[5] Data line 5 Bidir PU 45KΩ PU external
A2 DATA[6] Data line 6 Bidir PU 45KΩ PU external
B2 DATA[7] Data line 7 Bidir PU 45KΩ PU external
A3 DATA[8] Data line 8 Bidir PU 45KΩ PU external
B3 DATA[9] Data line 9 Bidir PU 45KΩ PU external
A4 DATA[10] Data line 10 Bidir PU 45KΩ PU external
B4 DATA[11] Data line 11 Bidir PU 45KΩ PU external
A5 DATA[12] Data line 12 Bidir PU 45KΩ PU external
B5 DATA[13] Data line 13 Bidir PU 45KΩ PU external
A6 DATA[14] Data line 14 Bidir PU 45KΩ PU external
B7 DATA[15] Data line 15 Bidir PU 45KΩ PU external
E7 DCI Daisy Chain In Input
F1 DCO Daisy Chain Out Output
G2 DS Data strobe line Input
E8 GCLK Master clock Input
D7 GRESETn Global reset_low
H3 IDn Identified_low Bidir PU 45KΩ PU external
H4 RWn Read/Write_low Input
G1 RDY Ready line Output PU 45KΩ PU external
H7 REG[0] Register line 0 Input
H6 REG[1] Register line 1 Input
G6 REG[2] Register line 2 Input
H5 REG[3] Register line 3 Input
G4 UNCn Uncertain_low line Bidir PU 45KΩ PU external
A1, A8, C3, C6, D4, D5, E4, E5, F3, F6, H1, H8 GND System ground
B6, C5, C7, D3, D6, E3, E6, F4, F5, F7, G5 VCC Core power supply (1.2 V)
A7, B1, B8, C4, C8, D2, F8, G3, G7, G8, H2 VCCIO IO power supply line (3.3 V)
7.2. Mechanical specifications

Die size …………………………………………………………………… 4.6 x 4.4 x 0.5 mm
Packaging: 64-pin CSP, 4.6 x 4.4 mm
Ball size …………………………………………………………………… 0.3 mm
Ball pitch …………………………………………………………………. 0.5 mm
7.3. Electrical Specifications
All signals are LVTTL (3.3 volts)

Vcc IO, power supply for IO 3.3 V
Vcc Core, power supply for core 1.2 V
Max operating clock frequency 36 MHz in single-chip configuration
16 MHz in multiple-chip configuration
Operating temperature range (Tj) -40 ~ 125 °C
Leakage Power …………………………………….. 6.5 mW
Total Power …………………………………………. 153 mW
Open Drain Max Sink Current (IOL): …………. 12 mA
Output Capacitance ………………………………. 1.7 pF
Interface levels …………………………………….. LVTTL
Fan out ……………………………………………….. 8 chips
7.3.1. Power saving tips
Since the DATA bus is composed of 16 lines with internal pull-ups, broadcasting a value other than 0xFFFF on this bus draws current until another command is executed that releases its lines in whole or in part. The POWERSAVE register allows the DATA bus to be released (back to 0xFFFF) when no other Write command is expected.
8. FAQ
8.1. Hardware issues
The neurons do not learn
• The neurons will not learn if the UNCn line is driven. Verify that it is in tri-state during a learning operation.

Standalone mode On/Off
• How low can you run VCCIO?
• volts would work, provided that the core stays above 1.2 volts.

• Do the neurons retain data when STDBY is asserted?
• Yes, STDBY cuts the internal clock and puts the neuron RAM in a very low power state. As long as the core remains at 1.2 volts, the neurons’ content is kept.

• How long does it take for the neurons to be ready after STDBY is deasserted?
• They are ready at the next clock cycle.

• How much power does the chip consume in standby?
• It should decrease by at least a factor of 10 according to the specifications.

8.2. Functional issues
The neurons neither learn nor recognize my vectors when I know they should
• Verify that the neurons are not in Save-and-Restore mode by reading the Network Status Register (NSR). If it is equal to 16 (0x10), the neurons behave as dummy memories and can neither learn nor recognize.
• Verify that the Global Context Register (GCR) is set to the proper value. If you have learned your vectors while the GCR was equal to A, they will not be recognized if the GCR at the time of the recognition is different from A or 0.
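The two checks above can be combined into a small diagnostic helper. This is a sketch under assumptions: the function name `recognition_blockers` and its messages are hypothetical, the register values are expected to have been read by the caller, and testing bit 4 of NSR rather than strict equality with 0x10 is an assumption made because other NSR bits carry the IDn and UNCn status.

```python
NSR_SAVE_RESTORE = 0x10  # value reported by NSR in Save-and-Restore mode


def recognition_blockers(nsr, gcr_at_learning, gcr_now):
    """List the reasons why the neurons may fail to learn or recognize.

    nsr: raw Network Status Register value.
    gcr_at_learning: Global Context Register value used while learning.
    gcr_now: Global Context Register value at recognition time.
    """
    problems = []
    if nsr & NSR_SAVE_RESTORE:  # assumption: bit test instead of equality
        problems.append("neurons are in Save-and-Restore mode")
    if gcr_now not in (gcr_at_learning, 0):
        problems.append("GCR differs from the learning context and is not 0")
    return problems


print(recognition_blockers(nsr=0x10, gcr_at_learning=5, gcr_now=3))
```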
