Adiabatic Capacitive Artificial Neuron (ACAN)
Artificial intelligence is a cornerstone of modern technology, underpinning everything from predictive texting on mobile phones to complex logistics spanning the globe. However, bigger and better AI systems require increasingly large amounts of power to operate at full capacity, straining global energy supplies and limiting the democratisation of AI. Project ACAN uses a circuit design technique called "charge recovery" to implement AI systems with far higher power efficiency, aiming to lower their power consumption by a factor of ten.
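To make the potential savings concrete, the sketch below compares the textbook first-order dissipation of conventional versus quasi-adiabatic (ramped) charging of a single capacitive node; the component values are illustrative assumptions, not ACAN design parameters. Conventional charging of a capacitance C to a voltage V dissipates CV²/2 regardless of the series resistance R, whereas ramping the supply over a time T dissipates roughly (RC/T)·CV², which shrinks as T grows relative to RC.

```python
# Back-of-the-envelope comparison of conventional vs. adiabatic (ramp)
# charging losses for a single capacitive node. Values are illustrative
# assumptions; the formulas are standard first-order results, not the
# ACAN circuit itself.

C = 10e-15   # node capacitance: 10 fF (assumed)
V = 0.8      # supply voltage: 0.8 V (assumed)
R = 1e3      # series resistance: 1 kOhm (assumed)

# Conventional (abrupt) charging dissipates half the delivered energy,
# independent of R: E = C*V^2 / 2.
e_conventional = 0.5 * C * V**2

# Quasi-adiabatic ramp charging over a time T dissipates roughly
# E ~ (R*C / T) * C * V^2, vanishing as T grows relative to R*C.
for ramp_time in (1e-9, 10e-9, 100e-9):
    e_adiabatic = (R * C / ramp_time) * C * V**2
    print(f"T = {ramp_time * 1e9:5.0f} ns: "
          f"adiabatic/conventional = {e_adiabatic / e_conventional:.4f}")
```

With these assumed values, even a 1 ns ramp already cuts dissipation well below a tenth of the conventional case, which is the kind of headroom charge recovery aims to exploit.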
On-The-Fly In-Memory Stochastic Computing AI Architecture (OISCA)
We are presently witnessing the end of Moore’s law due to the physical limits of technology scaling. At the same time, the AI revolution has introduced new challenges through the ever-increasing performance demands of its data-intensive applications, such as computer vision, speech recognition, and natural language processing. Classic Von Neumann architectures were not designed to deal with these massive workloads, so new beyond-Von-Neumann architectures are urgently needed to close the emerging AI compute gap. In another direction, matrix-matrix multiplication is the major computational core of AI models, and its computational complexity forms a performance bottleneck for any AI architecture. This situation has spotlighted the urgent need to explore not only novel data-centric architectures but also unconventional computing domains that reduce the computational complexity of AI models.
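As a general illustration of the stochastic computing domain (not OISCA's actual architecture), the sketch below shows why this domain can reduce multiplication complexity: in unipolar stochastic computing, a value p in [0, 1] is encoded as the probability of observing a 1 in a random bitstream, so multiplying two independent streams reduces to a single AND gate per bit pair. The stream length and seed are arbitrary choices for the demo.

```python
import random

def to_stream(p, length, rng):
    """Encode p in [0, 1] as a unipolar stochastic bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(a, b, length=4096, seed=0):
    """Approximate a * b with one AND per bit pair of two bitstreams."""
    rng = random.Random(seed)
    sa = to_stream(a, length, rng)
    sb = to_stream(b, length, rng)
    # For independent streams: P(sa AND sb = 1) = P(sa = 1) * P(sb = 1) = a * b.
    return sum(x & y for x, y in zip(sa, sb)) / length

print(sc_multiply(0.5, 0.8))   # ~0.40 (exact value 0.40)
print(sc_multiply(0.3, 0.3))   # ~0.09
```

The trade-off is characteristic of the domain: a hardware multiplier collapses to a single gate, at the cost of precision that improves only with longer bitstreams.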
Transformers’ AI Architectures
Natural Language Processing (NLP) models are gaining increasing attention due to their remarkable ability to mimic human reasoning. At the core of NLP, Transformers represent cutting-edge AI models that have led to a new AI revolution across various application domains. These models have achieved significant success in natural language understanding, translation, and generation, leading to greater humanization of AI skills. However, these data-intensive models place even more critical performance demands on classic computing architectures. As an example, the Generative Pre-trained Transformer (GPT) model, the core of the well-known ChatGPT, has billions of parameters and performs massive matrix-matrix multiplication workloads. This scales the AI performance gap up to a new critical level, where Von Neumann architectures face not only a memory gap but also a compute gap.
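A first-order FLOP count makes this compute gap concrete. A dense (m × k) by (k × n) matrix product costs roughly 2·m·k·n floating-point operations; the layer dimensions below are hypothetical GPT-like values chosen for illustration, not the parameters of any published model.

```python
# First-order cost of the matrix-matrix multiplications in a single
# Transformer feed-forward block. All dimensions are illustrative
# assumptions, not the configuration of any specific GPT model.

def matmul_flops(m, k, n):
    """A dense (m x k) @ (k x n) product costs ~2*m*k*n FLOPs."""
    return 2 * m * k * n

d_model = 4096        # hidden width (assumed)
d_ff = 4 * d_model    # feed-forward width (common 4x convention)
seq_len = 2048        # tokens processed per forward pass (assumed)

# Two projections per feed-forward block: up (d_model -> d_ff) and
# down (d_ff -> d_model), applied to every token in the sequence.
flops = (matmul_flops(seq_len, d_model, d_ff)
         + matmul_flops(seq_len, d_ff, d_model))
print(f"~{flops / 1e12:.2f} TFLOPs per feed-forward block per pass")
```

Even this single block costs on the order of half a teraflop per forward pass under these assumptions, and a full model stacks dozens of such blocks, which is why matrix-matrix multiplication dominates the workload.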
Systolic arrays are emerging spatial architectures that have been adopted by commercial AI computing platforms (such as Google TPUs). Thanks to their ability to maximize data reuse, systolic arrays mitigate the memory bottleneck by increasing the number of operations per memory access. Moreover, systolic arrays are scalable architectures that can form a computing fabric of tens of thousands of processing elements (PEs). These PEs are specialized lightweight cores that are efficiently tailored to handle matrix-matrix multiplication workloads. However, these spatial architectures suffer the penalty of input and output synchronization hardware, which has a negative impact on both energy efficiency and throughput.
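For intuition, here is a minimal cycle-level sketch (a generic output-stationary design, not CEF's architecture) of how a systolic array computes C = A × B: each PE multiply-accumulates the operands arriving from its left and top neighbours and forwards them onward, and the inputs must be fed in a skewed "staircase" schedule, which is exactly the input/output synchronization overhead described above.

```python
# Minimal cycle-level sketch of an output-stationary systolic array
# computing C = A @ B. Each PE multiply-accumulates the values arriving
# from its left and top neighbours, then forwards them right and down.
# Note the skewed ("staircase") input schedule at the array edges -- the
# synchronization overhead the text refers to. A generic illustration.

def systolic_matmul(A, B):
    n = len(A)                                # assume square n x n operands
    acc = [[0] * n for _ in range(n)]         # one accumulator per PE
    a_reg = [[0] * n for _ in range(n)]       # value held from the left
    b_reg = [[0] * n for _ in range(n)]       # value held from above

    for cycle in range(3 * n - 2):            # enough cycles to drain the array
        for i in reversed(range(n)):          # update far PEs first so data
            for j in reversed(range(n)):      # moves exactly one hop per cycle
                # Left input: neighbour's register, or skewed edge feed A[i][cycle - i].
                a_in = a_reg[i][j - 1] if j > 0 else (
                    A[i][cycle - i] if 0 <= cycle - i < n else 0)
                # Top input: neighbour's register, or skewed edge feed B[cycle - j][j].
                b_in = b_reg[i - 1][j] if i > 0 else (
                    B[cycle - j][j] if 0 <= cycle - j < n else 0)
                acc[i][j] += a_in * b_in
                a_reg[i][j], b_reg[i][j] = a_in, b_in
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))   # [[19, 22], [43, 50]]
```

The staircase skew means an n × n array spends roughly 2n extra cycles filling and draining, and real implementations need dedicated registers to realize it; architectures that remove this interfacing hardware recover both cycles and energy.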
CEF develops novel systolic array architectures that eliminate the input and output synchronization penalty, resulting in a significant improvement in both throughput and energy efficiency. Our ambition is to build massive-scale Transformers’ AI Architectures with high-performance, energy-efficient systolic array cores, using advanced technology nodes (such as 22 nm). Emerging technologies, like RRAM memristors, are also utilised for designing high-density, energy-efficient global buffers for the systolic array cores, whilst leveraging the non-volatility of RRAMs for cost-effective on-chip parameter management.