A startup company announces the development of a PPU that it claims can increase CPU performance by up to 100 times

Flow Computing claims that its PPU has ushered in a new era of the "super CPU."

What is the bottleneck of modern computers? Finnish startup Flow Computing points to the CPU. The company claims that its Parallel Processing Unit (PPU) chip technology can increase the processing power of the CPU by up to 100 times and can meet many of the computational demands of current AI development.

In computer science, parallel processing technology is a method of improving computational efficiency and performance by executing multiple tasks simultaneously.

Increase CPU performance by up to 100 times without rewriting existing application code

Flow Computing, a spin-off from the Finnish national research organization VTT, claims that its PPU has ushered in a new era of the "super CPU." Although the CPU has advanced enormously over the past few years, its basic processing model has not changed: a core can still do only one thing at a time. Modern chips switch between tasks billions of times per second across multiple cores and pipelines, but because of the fundamental limitation that one task must complete before the next can begin, time is wasted and the CPU becomes a bottleneck.


Flow Computing's PPU removes this limitation, in effect turning the CPU's single lane into multiple lanes. The CPU still handles only one task at a time, but the PPU performs on-chip traffic management at the nanosecond level, moving tasks into and out of the processor faster than before.
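The difference between strictly serialized and overlapped execution is easiest to see with a toy software analogy. The sketch below is plain Python and entirely hypothetical; it has nothing to do with Flow's actual hardware. It simulates tasks that stall while waiting for data: running them one after another leaves the worker idle during every stall, while keeping independent tasks in flight hides that idle time.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(i):
    # Simulate a task that stalls (e.g. waiting for data) before finishing.
    time.sleep(0.1)
    return i * i

N = 20

# Strictly serialized: the next task cannot start until the previous one ends.
start = time.perf_counter()
serial = [task(i) for i in range(N)]
print(f"serial:     {time.perf_counter() - start:.2f}s")

# Overlapped: independent tasks are kept in flight, so the stalls no longer
# dominate the total runtime.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    overlapped = list(pool.map(task, range(N)))
print(f"overlapped: {time.perf_counter() - start:.2f}s")
```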

Flow Computing says its PPU can be applied to any CPU architecture while maintaining full backward compatibility. Performance is expected to improve further when software is rebuilt and recompiled to work with the PPU-CPU combination: according to Flow, modifying code to take advantage of the technology (without rewriting it entirely) can yield up to a 100-fold performance increase. The company provides recompilation tools for developers who want to optimize their software for Flow-enabled chips.

The core idea of parallel processing is to divide a task into multiple subtasks and execute them simultaneously, making full use of the computing resources of a multi-core processor or a distributed system. Parallel processing is commonly divided into two types: process-based and thread-based. In process-based parallelism, the operating system schedules multiple independent processes onto different processor cores; in thread-based parallelism, a single process is split into multiple threads that share the same address space and run concurrently.
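As a rough illustration of the two flavors (a minimal Python sketch using the standard library, unrelated to Flow's technology), the same CPU-bound function can be farmed out either to threads, which share one process, or to worker processes, which the operating system schedules onto separate cores.

```python
import math
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # A small CPU-heavy kernel: sum of square roots.
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    work = [2_000_000] * 8

    # Thread-based: workers share one process and its memory.
    # (In CPython the GIL limits CPU-bound speedup; threads shine for I/O.)
    with ThreadPoolExecutor(max_workers=4) as pool:
        thread_results = list(pool.map(cpu_bound, work))

    # Process-based: independent processes are scheduled onto separate cores,
    # so CPU-bound work can genuinely run in parallel.
    with ProcessPoolExecutor(max_workers=4) as pool:
        process_results = list(pool.map(cpu_bound, work))

    print(thread_results == process_results)  # True
```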

The benefits are therefore not limited to CPU performance: other connected units (NPUs, GPUs, and so on) can also benefit indirectly from the PPU's gains.

Although this kind of parallelization has been demonstrated at the research level, it has not been practical because it required rewriting application code from scratch. If Flow Computing's claims hold up, the same goal can be achieved without changing any code.

However, to put it into practice, Flow Computing's PPU must be integrated during the chip design phase. For now, Flow appears to indicate that the technology runs in FPGA-based test setups.

Flow has now raised 4 million euros (about 4.3 million US dollars) in seed funding, led by Butterfly Ventures with participation from FOV Ventures, Sarsia, Stephen Industries, Superhero Capital, and Business Finland. The company plans to collaborate with leading CPU companies such as AMD, Apple, Arm, Intel, NVIDIA, and Qualcomm to develop next-generation advanced CPU computing, and is also seeking to work with smaller CPU startups such as Tenstorrent, led by the legendary chip designer Jim Keller.

However, it is worth noting that to achieve efficient parallel computing in practical applications, the following factors should be considered:

Task Division: Reasonably dividing the task into multiple subtasks is key to achieving efficient parallel computing. It is necessary to develop appropriate division strategies based on the specific characteristics of the task and the computing resources.

Communication and Synchronization: In parallel computing, communication and synchronization between subtasks are required to ensure the correctness of the computation. It is necessary to choose appropriate communication protocols and synchronization mechanisms to reduce communication overhead and improve computational efficiency.

Load Balancing: In distributed systems, the computing capabilities of various nodes may vary. To fully utilize system resources, effective load balancing strategies are needed to distribute computing tasks.

Performance Optimization: For specific computing tasks and application scenarios, corresponding performance optimization measures need to be taken, such as algorithm optimization, data compression, prefetching techniques, etc.

Parallel Programming Models and Frameworks: Choosing the right parallel programming models and frameworks can improve development efficiency and application performance. Common parallel programming models and frameworks include OpenMP, MPI, MapReduce, etc.
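Several of these points can be illustrated in one compact sketch. The example below is a hedged illustration, not a recipe: it uses Python's standard multiprocessing module as a stand-in for heavier frameworks such as MPI or MapReduce, and the chunking strategy and names (`map_count`, `parallel_word_count`) are arbitrary choices. The input text is divided into chunks (task division), the chunks are distributed across a process pool (load balancing), and the partial counts are merged in a reduce step.

```python
from collections import Counter
from multiprocessing import Pool

def map_count(chunk):
    # Map step: count words in one chunk independently.
    return Counter(chunk.split())

def parallel_word_count(text, n_chunks=4, n_workers=4):
    # Task division: split the input into roughly equal chunks on line
    # boundaries so each worker gets a comparable amount of work.
    lines = text.splitlines()
    size = max(1, len(lines) // n_chunks)
    chunks = ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]

    # Load balancing: the pool distributes chunks across worker processes.
    with Pool(n_workers) as pool:
        partial = pool.map(map_count, chunks)

    # Reduce step: merge the partial counts from all workers.
    total = Counter()
    for counts in partial:
        total.update(counts)
    return total

if __name__ == "__main__":
    sample = "to be or not to be\nthat is the question\n" * 1000
    print(parallel_word_count(sample).most_common(3))
```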