We have previously discussed the concepts of integrated storage and computing and of in-memory computing. Particularly in the context of AI development, the storage wall (often called the "memory wall") has increasingly become a bottleneck to continued gains in computing power. In response, the industry has proposed non-von Neumann architectures, which move away from the traditional, compute-centric von Neumann architecture, change the computing paradigm, and push part of the computation down into storage. Hence the term in-memory computing.
This concept has been implemented in different ways. Generally, an integrated storage-and-computing structure can be understood as embedding the computation in the memory: the storage cells themselves have computing capability, which in theory eliminates the latency and power cost of data access. Such chips are particularly well suited to neural networks.
At yesterday's "core" product roadshow at the 2nd China (Shanghai) Free Trade Zone Lingang New Area Semiconductor Industry Development Summit Forum, Wang Shaodi, CEO of Beijing Zhicun Technology Co., Ltd., presented the company's WTM2101 storage-and-computing integrated chip and its strategic plan from this year onward.
Solving the storage wall problem
The following picture, presented by Wang Shaodi, gives some typical data on the storage wall problem. It mainly shows that as process technology advances, processors become more powerful, operations become faster, and storage capacity keeps growing, but memory bandwidth struggles to keep growing year over year.
As computing power increases, the number of processor cores rises and the bandwidth available to each core falls, which limits overall speed. "Transferring data has become a considerable bottleneck. At the same time, energy consumption is also a problem. The energy consumed in moving data from external memory and on-chip storage is huge, by a factor of thousands."
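The per-core bandwidth argument can be illustrated with a small sketch. It assumes an idealized, even split of a fixed total bandwidth across cores. The function name and the 100 GB/s figure are hypothetical, chosen only for illustration, and do not come from the talk.

```python
def bandwidth_per_core(total_bandwidth_gbs: float, cores: int) -> float:
    """Idealized even split of a fixed memory bandwidth across cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_bandwidth_gbs / cores


# Hypothetical total bandwidth, used only to show the trend.
TOTAL_BANDWIDTH_GBS = 100.0

for cores in (1, 4, 16, 64):
    share = bandwidth_per_core(TOTAL_BANDWIDTH_GBS, cores)
    print(f"{cores:>2} cores -> {share:.2f} GB/s per core")
```

Under these assumptions, adding cores without adding bandwidth shrinks each core's share in direct proportion, which is the squeeze described above.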
“That’s why a storage-computing integrated solution is needed. The most fundamental solution to the storage wall is to integrate storage and computing, and use storage units for computing.” Wang Shaodi said.
"'In-memory computing' may be the better name. It means that memory is used for computing, and the chip as a whole is still a computing chip. Its computing medium is memory, not logic units." The above picture compares the structures of the two approaches. In it, the circles represent memory cells.
In a traditional computing architecture, the storage subsystem activates one row at a time and reads the data out in turn. The storage-and-computing integrated architecture, by contrast, activates multiple rows and multiple columns at the same time. "The horizontal axis is no longer the selection signal, it is actually the processed data." The computation is carried out in analog circuitry: as we mentioned earlier, a single device performs a multiplication via Ohm's law, and Kirchhoff's law is then used to accumulate the results along each column. In this way, the storage cells themselves complete the multiply-accumulate calculation.
"One memory operation cycle can complete multiply-accumulate operations on 1 million parameters, and efficiency is improved by 50-100 times." Obviously, this is quite valuable for AI.
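The multiply-accumulate scheme described above can be sketched numerically. This is a minimal model of an idealized crossbar, not Zhicun's actual circuit. It assumes each cell stores a weight as a conductance G, each row carries an input voltage V, each cell contributes a current I = G * V (Ohm's law), and the currents on a column add up (Kirchhoff's current law). The function name and the sample values are hypothetical.

```python
def crossbar_mac(conductances, voltages):
    """Column currents of an idealized crossbar array.

    conductances[r][c] is the stored weight (a conductance) at row r, column c.
    voltages[r] is the input applied to row r.
    """
    rows = len(conductances)
    if rows == 0 or len(voltages) != rows:
        raise ValueError("need one input voltage per row")
    cols = len(conductances[0])
    currents = [0.0] * cols
    for r in range(rows):
        for c in range(cols):
            # Ohm's law gives the cell current; summing per column
            # models Kirchhoff's current law on the shared column wire.
            currents[c] += conductances[r][c] * voltages[r]
    return currents


# Hypothetical 2x2 array, arbitrary units.
weights = [[1.0, 2.0],
           [3.0, 4.0]]
inputs = [0.5, 1.0]
print(crossbar_mac(weights, inputs))  # [3.5, 5.0]
```

Each column current is the dot product of the input vector with that column's weights, so activating all rows at once yields a full vector-matrix product in a single step. A real device would also need digital-to-analog and analog-to-digital conversion, which this sketch omits.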
In terms of more specific usage scenarios, chips that integrate storage and computing seem to have a natural advantage in elastically scaling computing power for AI workloads from the device side to the cloud. "Memories of 2MB, 4MB, and 8MB have relatively low computing power and can be used in end-side devices; storage-and-computing integrated arrays of up to 128MB can be used on the edge side; and storage capacities of 1GB, 2GB, and 4GB can provide more than 1000TOPS of computing power for cloud scenarios."
"It will take 5-8 years for the integration of storage and computing to cover AI computing scenarios, starting from the end side and the edge side," Wang Shaodi said. The above picture also clearly shows the AI chips for different scenarios, along with their future market size and development potential, which should already be familiar to readers.
From Zhicun 1.0 to Zhicun 3.0
Judging from Zhicun's development history, its two founders gained early experience working on storage-and-computing integration research and development projects in the United States. Zhicun Technology was founded in 2017 and received angel-round financing the following year, so it is still a fairly young company. In 2019, Zhicun "carried out IP research and development cooperation with internationally renowned companies, completed the tape-out of IP and SoC test chips, and released the world's first integrated storage and computing chip".
In 2020, Zhicun completed "the world's first mass-production tape-out of an integrated storage and computing chip" and "the world's first integrated storage and computing SoC chip verification". The WTM2101 storage-and-computing integrated chip introduced by Wang Shaodi is expected to enter mass production in the fourth quarter of this year. It seems that the concept of "integration of storage and computing" is developing much faster than we expected.
The left side of the above picture is the WTM2101 chip architecture diagram. Its main part is the storage-and-computing integrated memory, where "most operations are completed through the integration of storage and computing"; in addition, the chip includes a RISC-V CPU to handle non-matrix operations.
"Compared with existing market solutions in computing power and power consumption, the WTM2101 has an advantage of more than 10 times." Although it is not clear what the "existing market solutions" in this picture are, both the algorithm complexity and the power consumption listed in the WTM2101 column are indeed quite impressive.
In addition to introducing the product, Wang Shaodi also discussed Zhicun's strategic plan. The period from the company's founding to 2020 is the "Zhicun 1.0 era", a stage devoted to "research and development of storage and computing integrated technology, and application of the voice scene". Wang Shaodi again emphasized that "we are the first company in the world to implement the technology."
The period from this year to 2024 is planned as Zhicun 2.0. "We will advance further, to 128MB, reaching a computing power level of 64-100TOPS and covering end-side and edge-side scenarios. We will choose advantageous scenarios in which to deploy applications."
After 2024, the chip will be "pushed to the cloud", with a capacity of 1GB delivering computing power in the range of 500-2000TOPS, and the product will reach vehicle-grade reliability. "After 2025, it is planned to launch standardized products. After that, the products are no longer oriented to application scenarios, but provide different capacities like memories, integrate with existing computing systems, and complete such integration with advanced packaging technology. At the same time, we have also launched a corresponding tool chain, which is fully adapted to the integrated storage and computing technology and supports mainstream artificial intelligence algorithms."
Note that the revenue on the horizontal axis and the market value forecast on the vertical axis of this graph both indicate that the company is quite confident about its near-term development. Perhaps this picture also reflects the development trend of storage-and-computing integration technology over the next few years.