Opportunity
The most energy-efficient deep neural network (DNN) ICs, such as CNN and MLP designs, benefit from in-memory computing (IMC) and reach energy efficiencies of ~1000 TOPS/W. In contrast, recent LSTM ICs have yet to surpass 10 TOPS/W, two orders of magnitude below state-of-the-art CNN ICs. The main reason is that published LSTM ICs use digital processing element (PE) based architectures. Our proposed method can break this efficiency wall by implementing nonlinear operators within memory, reaching energy efficiency above 100 TOPS/W for LSTM ICs.
Technology
This IP enables scalar and vector nonlinear functions to be implemented within SRAM. In-memory computing is more energy efficient than conventional digital implementations, where reading data from memory consumes considerable energy. Scalar nonlinearities such as sigmoid and tanh, and vector nonlinearities such as softmax, are widely used in neural networks that achieve state-of-the-art performance in various applications.
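For reference, the nonlinearities named above can be written out in software. The following is only a plain-Python sketch of their standard mathematical definitions, not a description of the in-memory circuit; the function names are illustrative and do not come from the IP itself.

```python
import math


def sigmoid(x):
    """Scalar nonlinearity: maps a real value into the range (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Scalar nonlinearity: maps a real value into the range (-1, 1)."""
    return math.tanh(x)


def softmax(values):
    """Vector nonlinearity: maps a list of reals to weights that sum to 1."""
    # Subtracting the maximum leaves the result unchanged and avoids overflow.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

Sigmoid and tanh act on each value independently, while softmax needs the whole input vector at once because every output depends on the sum over all inputs.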
Advantages
- Low latency
- High energy efficiency
Applications
- Keyword spotting
- Speech separation and speech enhancement
