SiFive Blog

The latest insights and in-depth technical analysis from RISC-V experts

January 13, 2020

Part 1: Fast Access to Accelerators: Enabling Optimized Data Transfer with RISC-V

This is the first in a series of blog posts about domain-specific accelerators (DSAs), which are becoming increasingly common in systems-on-chip (SoCs). A DSA provides higher performance per watt than a general-purpose processor by optimizing the specialized function it implements. Examples of DSAs include compression/decompression units, random number generators, and network packet processors. A DSA is typically connected to the core complex using a standard IO interconnect, such as an AXI bus.

[Figure: IO Caching Slide]

Unfortunately, data transfers between DSAs and the core complex, which are often critical, can be inefficient in traditional SoCs. Figure 1 shows that the datapath between a DSA and the core complex in a traditional SoC traverses multiple interconnects and bridges. This increases the latency between the cores and a DSA to hundreds of cycles, which makes fine-grain interaction between cores and a DSA difficult.

RISC-V offers a unique opportunity to optimize such fine-grain communication between cores and DSAs. For example, as Figure 2 shows, a DSA can export a per-core DSA cache sitting next to a RISC-V core. The RISC-V core can poll status changes out of the DSA cache, thereby reducing the latency of interaction between the core and DSA to tens of cycles. The DSA can update status changes in the DSA cache through a sideband network. Others [1, 2] have argued for similar mechanisms.
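The polling pattern described above can be sketched in C. This is a minimal software model, not SiFive's interface: the type `dsa_cache_line_t`, the `DSA_STATUS_DONE` bit, and both function names are hypothetical, and the status word is ordinary memory standing in for a location that the DSA would update over its sideband network.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status layout: bit 0 set means the DSA finished a job. */
#define DSA_STATUS_DONE 0x1u

/* Stand-in for one status word held in the per-core DSA cache.
 * Here it is ordinary memory so the sketch runs on any host. */
typedef struct {
    _Atomic uint32_t status;
} dsa_cache_line_t;

/* Poll the status word up to max_polls times. Returns true as soon as
 * the DONE bit is observed, false if the poll budget is exhausted. */
static bool dsa_poll_done(dsa_cache_line_t *line, unsigned max_polls)
{
    assert(line != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        /* Acquire load: pairs with the release in dsa_signal_done(), so
         * anything published before DONE is visible after a hit. */
        uint32_t s = atomic_load_explicit(&line->status,
                                          memory_order_acquire);
        if (s & DSA_STATUS_DONE)
            return true;
    }
    return false;
}

/* Models the DSA side publishing completion into the status word. */
static void dsa_signal_done(dsa_cache_line_t *line)
{
    assert(line != NULL);
    atomic_fetch_or_explicit(&line->status, DSA_STATUS_DONE,
                             memory_order_release);
}
```

In this model each poll is a load from nearby memory rather than a round trip across the IO interconnect, which is the property the per-core DSA cache is meant to provide.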

The DSA cache can further improve core-DSA interaction performance by prefetching data from the DSA and merging small IO writes into bigger chunks. A RISC-V core could even integrate some of these mechanisms within the core pipeline, if one were to design a custom RISC-V core.
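Merging small IO writes can be illustrated with a toy write-combining buffer. The sketch below is an assumption-laden model rather than a description of any SiFive design: the 64-byte merge granularity, the `wc_buffer_t` type, and the `wc_*` functions are all hypothetical, and a "flush" only increments a counter where real hardware would issue one larger transfer to the DSA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WC_LINE_BYTES 64u   /* assumed merge granularity */

typedef struct {
    uint64_t base;                  /* line-aligned address being merged */
    uint8_t  data[WC_LINE_BYTES];   /* pending bytes for that line */
    uint64_t valid;                 /* bit i set: data[i] is pending */
    unsigned flushes;               /* merged transfers issued so far */
} wc_buffer_t;

static void wc_init(wc_buffer_t *wc)
{
    memset(wc, 0, sizeof *wc);
}

/* Issue all pending bytes as one transfer (only counted in this model). */
static void wc_flush(wc_buffer_t *wc)
{
    if (wc->valid != 0) {
        wc->flushes++;
        wc->valid = 0;
    }
}

/* Record a small write. The write must not cross a line boundary. */
static void wc_write(wc_buffer_t *wc, uint64_t addr,
                     const void *src, size_t len)
{
    uint64_t base = addr & ~(uint64_t)(WC_LINE_BYTES - 1);
    size_t off = (size_t)(addr - base);
    assert(len > 0 && off + len <= WC_LINE_BYTES);

    /* A write to a different line evicts whatever is pending. */
    if (wc->valid != 0 && wc->base != base)
        wc_flush(wc);

    wc->base = base;
    memcpy(&wc->data[off], src, len);
    for (size_t i = 0; i < len; i++)
        wc->valid |= (uint64_t)1 << (off + i);

    /* A fully written line is sent immediately. */
    if (wc->valid == UINT64_MAX)
        wc_flush(wc);
}
```

With this model, eight consecutive 8-byte writes to the same 64-byte line produce a single flush instead of eight separate IO transactions.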

[1] J. McCalpin, "Notes on Cached Access to Memory-Mapped IO Regions," https://sites.utexas.edu/jdm4372/2013/05/29/notes-on-cached-access-to-memory-mapped-io-regions

[2] Mukherjee, et al., "Coherent Network Interfaces for Fine-Grain Communication," Proceedings of the 23rd Annual International Symposium on Computer Architecture, May 1996.

For more details about SiFive’s standard cores, or to customize and build domain-specific RISC-V cores, please visit sifive.com/risc-v-core-ip.

Shubu Mukherjee

Chief SoC Architect, SiFive
