PicoRio User Manual¶
General Documentation¶
Introduction¶
What is PicoRio¶
PicoRio is an open-source project stewarded by the RISC-V International Open Source (RIOS) laboratory, a nonprofit research lab at Tsinghua-Berkeley Shenzhen Institute (TBSI). The RIOS Lab focuses on elevating the RISC-V software and hardware ecosystem collaboratively with both academia and industry. In PicoRio, we create an open, affordable, Linux-capable RISC-V hardware platform to help software developers port modern applications that require Javascript or GPUs. PicoRio will build upon high-quality IPs and software components contributed by experts from industry and academia. PicoRio is not proprietary to any specific vendor or platform, and will have complete documentation that can help users build high-quality products in a short amount of time.
Motivation¶
- A system is more than processors
  - Licensing the other IPs in an SoC is costly: cache, interconnects, graphics, camera ISP, etc.
  - An attractive open-source platform for experimenting with new hardware ideas
  - Full-system support is indispensable to security and trusted execution
  - RISC-V hardware extensions: JIT runtime, vectorization, etc.
- The community lacks affordable RISC-V hardware platforms that are capable of running diverse software
  - Few low-cost, software-capable boards for the long tail of developers
  - Developers won’t spend $1000 on new hardware just for software development
Highlights¶
Independently Maintained: The RIOS Lab is an independent nonprofit organization that governs the architecture development, ensures compliance, and will publish the design. The RIOS Lab will be the gatekeeper for both hardware and software, from SoC and firmware/drivers to high-level software and documentation. PicoRio will be vendor agnostic and non-proprietary. The RIOS Lab will work with academic and commercial organizations that will commit to its expansion and volume manufacturing.
Open Source: PicoRio will open source as many components as possible, including the CPU and main SoC design, chip package, board design files, device drivers, and firmware. The exceptions are foundry related IPs (e.g., TSMC SRAM configurations), commercial high-speed interfaces, and complex commercial IP blocks like GPU. Nevertheless, our goal is to reduce the commercial closed source IPs for each successive release of PicoRio, with the long term goal of having a version that is as open as possible.
High-Quality IPs: A major goal of the RIOS lab is developing open source, hardware IPs with industrial quality to boost the growth of RISC-V ecosystem and compete with those of existing, proprietary ISAs. Thus, PicoRio aims at a high-quality silicon release using open-source IPs. Such IPs will have gone through rigorous tapeout verifications that meet industry quality. The openness of PicoRio will not come at the cost of lower quality IP blocks. In addition, we will open source our verification process, which can further enhance transparency and trustworthiness.
Modern Software Stack Support: PicoRio utilizes a heterogeneous multicore architecture and it is Linux-capable (RV64GC). We also designed PicoRio hardware to run modern managed languages such as JavaScript/WebAssembly as well as graphical applications like the Chrome web browser. In the RIOS Lab, PicoRio is also the hardware platform for several other open-source software projects, such as the RISC-V ports for the V8 Javascript engine and the Chromium OS.
Low-Power and Low-Cost: The target metrics of PicoRio are low power dissipation and cost, which is a perfect match to the target of RISC-V system design.
Project Roadmap¶
Three Phases of the PicoRio Development¶
We aim to incrementally improve PicoRio with each new release. We divide the development of PicoRio into three phases:
First Phase (PicoRio 1.0): We include a basic 64-bit quad-core cache-coherent design (RV64GC) that runs full Linux. We have already booted a Chromium OS kernel in command line mode. A standalone version of Chrome V8 Javascript engine will run directly on the kernel. We expect an early beta release late this year. This “headless” version of PicoRio should be fine for software development.
Second Phase (PicoRio 2.0): In addition to the hardware improvement of the PicoRio v1.0, we are working with Imagination™ to include a complete display pipeline (including a GPU) with video encode/decode capabilities to run graphics intensive applications like web browsers.
Third Phase (PicoRio 3.0): Building upon the v2.0 hardware, we plan to further improve the CPU performance to bring PicoRio to the level of a tablet computer or laptop.
FAQ¶
How does PicoRio compare to Raspberry Pi?¶
Inspired by the Raspberry Pi, we propose the PicoRio project, whose goal is to produce RISC-V based small-board computers at an affordable price point. PicoRio has differences in the following aspects:
Open Source: Unlike Raspberry Pi, which uses proprietary Broadcom SoCs, PicoRio will open source as many components as possible, including the CPU and main SoC design, chip package and board design files, device drivers, and firmware. Nevertheless, our goal is to reduce the commercial closed source IPs for each successive release of PicoRio, with the long term goal of having a version that is as open as practical.
Low-Power and Low-Cost: The target metrics of PicoRio are long battery life and low cost, which is a better match to RISC-V today, instead of high performance and large memory. In contrast, Raspberry Pi uses more power hungry ARM processors. For example, the idle power consumption has risen from 0.4 Watts to 2.7 Watts in the latest version of Raspberry Pi.
Hardware Projects¶
This section describes the specification of PicoRio hardware components. We have grouped all the components into 4 general classes according to their respective functionalities. The development status is also listed.
RRV64¶
Overview¶
RRV64 is a 64-bit RISC-V core designed for embedded applications. It has a 5-stage in-order pipeline and a multi-level cache system including L1 and L2 I/D caches. RRV64 supports the RV64IMAC instruction set, the Sv39 virtual address format, and legal combinations of privilege modes in conjunction with Physical Memory Protection (PMP). It is capable of running a full-featured operating system like Linux. The core is compatible with all applicable RISC‑V standards.
RRV64 is designed to feature a flexible memory system: its L1 caches, L2 caches, bus interfaces, and memory maps can all be adapted for SoC integration.
![]()
Fig. 1 Core Overview¶
Fig. 1 illustrates a simplified RRV64 pipeline.
Repository Organization¶
The following shows the main folders in RRV64 repository and their usage:
.
├─rtl                    --RRV64 RTL description using SystemVerilog code
│  ├─common              --Macro and parameter definition files
│  ├─lib                 --Components used in RRV64, such as FIFO, RAM, etc
│  └─rrv64               --RRV64 Core
└─tb                     --Benchmarks, testbenches and Makefile for simulation
   ├─rrv64               --The testbench of the top-level module for simulation
   ├─perfect_mem_model   --The testbench of an ideal L2Cache
   └─test_program        --Benchmarks for testing the CPU
      └─benchmarks
Getting Started¶
Get the Source Code¶
You can clone the source code of RRV64 along with its simulator using git:
$ git clone https://gitlab.com/picorio/rrv64.git
Prerequisites¶
Several tools are needed to build the project.
1. Verilator : SystemVerilog Translator and simulator¶
On Ubuntu, executing the following command should suffice:
$ sudo apt-get install verilator
For other OS, you can install Verilator with Git. See here for more information.
2. GTKWave : Wave viewer¶
To make use of Verilator waveform tracing, you will need to have GTKWave installed.
3. RISC-V GNU Compiler Toolchain¶
Choose Newlib for installation.
For RRV64, the configuration should be:
./configure --prefix=/opt/riscv --with-arch=rv64gc --with-abi=lp64d
To add the toolchain to PATH, assuming you chose, say, /opt/riscv as the prefix:
$ vim ~/.bashrc
Append
export PATH=$PATH:/opt/riscv/bin
to the .bashrc file, save and exit, then run:
$ source ~/.bashrc
Compile & Run simulation¶
With VCS¶
To compile RRV64 with VCS
$ cd rrv64/tb
$ make vcs
Once VCS has compiled the design, run the simulation:
$ make vcs_run
The default program to be executed is Dhrystone.
With Verilator¶
Verilator is an open-source simulator that provides Verilog/SystemVerilog compilation similar to VCS.
Build RRV64 and run a program on it with Verilator:
$ cd rrv64/tb
$ make ver
Once Verilator has compiled the design, run the simulation:
$ make ver_run
The Dhrystone program is executed by default. You will see the execution result of Dhrystone in about one minute.
To change the program running on the RRV64 processor, edit the file rrv64/tb/rrv64/top.sv and enter the path to the binary file you want to execute.
With the argument +trace after ./Vtestbench, the program will produce a waveform file with the suffix .vcd in the folder logs under its corresponding folder prefixed with sim_.
To view the waveform file, use GTKWave. For example, if the .vcd file is named vlt_dump.vcd:
$ gtkwave vlt_dump.vcd
Core design¶
Fetch¶
Instruction Fetch (rrv64_fetch) is the first pipeline stage in RRV64. This block initiates requests for instruction data by sending requests to the instruction buffer and the loop buffer. If either buffer hits, the instruction data is available in the next cycle. Otherwise, the instruction buffer sends a request to the I-Cache to obtain the instruction data, which takes several cycles. The IF module is also responsible for generating the address of the next instruction. It receives PC requests from other pipeline stages and arbitrates among them using a fixed-priority scheme. The modules that act as PC sources are listed below, from the highest priority to the lowest.
rrv64_csr: Sends PC on exceptions, interrupts and trap return instructions.
rrv64_execute: Sends PC when a branch instruction is taken.
rrv_mem_access: Sends PC when completing a fence.i instruction and when certain CSR registers have been modified. For fences, the PC request is delayed until all fetches before the fence instruction are completed and the I-Cache is flushed. This guards against self-modifying code. For CSR modifications, delaying the PC request ensures that the CSR operation will use the correct values.
rrv_fetch: Sends PC for the normal case (next PC=PC+4 or PC+2 for compressed instructions), immediate jumps and register jumps.
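The fixed-priority arbitration described above can be sketched in a few lines of Python. This is only a behavioral illustration, not the RTL: the function name `select_next_pc` and the dictionary-based request format are assumptions made for this example, while the source names and their order come from the list above.

```python
from typing import Dict, Optional

# PC sources, highest priority first, in the order listed above.
PC_SOURCE_PRIORITY = ("rrv64_csr", "rrv64_execute", "rrv_mem_access", "rrv_fetch")


def select_next_pc(requests: Dict[str, int]) -> Optional[int]:
    """Return the PC of the highest-priority requesting source.

    `requests` maps a source name to the PC it is requesting; sources that
    are not requesting in this cycle are simply absent (hypothetical format).
    """
    for source in PC_SOURCE_PRIORITY:
        if source in requests:
            return requests[source]
    return None  # no source is requesting a PC this cycle
```

For example, if rrv_fetch proposes the sequential PC while rrv64_execute reports a taken branch in the same cycle, the branch target is selected.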
Interfaces¶
if2ic/ic2if: These interfaces are used for sending PC fetch requests from IF to instruction buffer and loop buffer. This interface uses an enable signal to send requests. This enable signal is held high until a response is received. There are 2 signals in if2ic interface:
pc: The address of the requested instruction.
valid: If this request is valid.
On the response side (ic2if), the main signals are:
inst/rvc_inst: The instruction data.
valid: Whether this response is valid.
is_rvc: Whether the instruction is RVC or not.
excp_cause: Contains the exception cause of the instruction, if any.
excp_valid: Whether this instruction was found to have an exception.
if2id: This interface contains all the data that is passed from IF to ID. It works using a valid/ready handshake. There are 2 signals in this interface.
inst: The instruction data.
pc: The PC of the instruction.
cs2if_npc/ma2if_npc/ex2if_npc/id2if_npc: These interfaces are used for sending PC redirection request to IF. They work using a valid/ready handshake. There are 2 signals in these interfaces.
pc: The new value of the PC register.
valid: Whether the request is valid.
Decode¶
Decode (ID) is the second stage in RRV64’s pipeline. It receives instruction data from the IF stage and holds it if necessary, expands C-extension instructions, decodes instruction data to set the control signals, and sends read requests to the regfile. When encountering an illegal instruction, the decoder will generate an exception signal, which will be handled when the current instruction reaches the MA stage.
RRV64 implements the standard compressed extension to the RISC-V architecture, which allows 16-bit instructions in addition to the normal 32-bit instruction size. To handle this instruction size, ID contains a submodule that takes a 16-bit instruction and expands it to its 32-bit equivalent. This module acts as the first layer of decoding.
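As an illustration of what expansion means, the following Python sketch expands one compressed instruction, C.ADDI, into its 32-bit ADDI equivalent using the encodings from the RISC-V specification. It covers only this single instruction and is not derived from the RRV64 RTL; the function name `expand_c_addi` is hypothetical.

```python
def expand_c_addi(inst16: int) -> int:
    """Expand a 16-bit C.ADDI encoding into the 32-bit ADDI encoding."""
    inst16 &= 0xFFFF
    # C.ADDI lives in quadrant 01 with funct3 == 000.
    if inst16 & 0x3 != 0b01 or (inst16 >> 13) & 0x7 != 0b000:
        raise ValueError("not a C.ADDI encoding")
    rd = (inst16 >> 7) & 0x1F  # rd is also rs1 for C.ADDI
    imm6 = (((inst16 >> 12) & 0x1) << 5) | ((inst16 >> 2) & 0x1F)
    # Sign-extend the 6-bit immediate to the 12-bit I-type immediate.
    imm12 = (imm6 - 64 if imm6 & 0x20 else imm6) & 0xFFF
    opcode_op_imm = 0x13  # ADDI: opcode OP-IMM, funct3 == 000
    return (imm12 << 20) | (rd << 15) | (0b000 << 12) | (rd << 7) | opcode_op_imm
```

Here `expand_c_addi(0x0505)` (c.addi a0, 1) yields `0x00150513` (addi a0, a0, 1).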
After ID has the final instruction data, either the expanded compressed instruction, or the initial instruction data, it will begin to decode the instruction to determine how to set the control signals that will be used throughout the pipeline. In the RTL, you can find a case statement that will call different functions depending on the instruction’s opcode, funct7 field, funct5 field, etc. These functions will output the appropriate control signals. If the instruction needs to read the register, ID will asynchronously read the registers in rrv64_regfile (IRF). Since IRF doesn’t contain a real entry for x0, ID will instead substitute this read with a hardwired 0 signal.
If ID decodes its current instruction as a JAL instruction, it will calculate the destination address and send a redirect request to the IF stage. If it is a fence_i, mret, or a csr operation on the PMP related registers, the ID will stall the IF stage until the instruction is retired.
There is a Regfile Scoreboard in this stage. Its purpose is to track which registers still have pending writes. This is used to resolve data hazards. When ID decodes that its instruction will eventually write to the regfile, it indexes into the scoreboard using rd (the index of the destination register) and marks that entry, to signal that there is a pending write, and thus a possible data hazard. When that instruction eventually writes to the regfile, that scoreboard entry is cleared. If ID has an instruction for which one or both source registers have pending writes, it will use the data forwarded from the EX stage or wait for the data retrieved from memory.
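A minimal behavioral sketch of such a scoreboard is shown below. The class and method names are hypothetical, and the choice to never mark x0 as pending is an assumption based on x0 being hardwired to zero; the RTL may handle this differently.

```python
class RegfileScoreboard:
    """One pending-write flag per architectural register."""

    def __init__(self, num_regs: int = 32) -> None:
        self.pending = [False] * num_regs

    def mark(self, rd: int) -> None:
        """Called when ID decodes an instruction that will write rd."""
        if rd != 0:  # assumption: x0 reads as zero, so it never has a pending write
            self.pending[rd] = True

    def clear(self, rd: int) -> None:
        """Called when the instruction finally writes rd to the regfile."""
        self.pending[rd] = False

    def has_hazard(self, rs1: int, rs2: int) -> bool:
        """True if either source register still has a pending write."""
        return self.pending[rs1] or self.pending[rs2]
```

When `has_hazard` returns true, the pipeline either takes the value from the bypass network or waits, as described above.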
Interfaces¶
id2irf: This interface is for requesting the data in the IRF. There are 4 signals.
rs1_addr: The address of source register 1.
rs2_addr: The address of source register 2.
rs1_re: Control signal. High when read to rs1_addr is valid.
rs2_re: Control signal. High when read to rs2_addr is valid.
id2ex: This interface contains all the data passed from ID to EX. It works on a valid/ready handshake. There are 6 signals in this interface.
pc: The PC of the instruction.
inst: The instruction data.
rs1_addr: The address of source register 1.
rs2_addr: The address of source register 2.
is_rvc: Signals whether this instruction is RVC, used to calculate npc in EX and MA stage, if needed.
ex2id_bps/ma2id_bps: These interfaces are used for data forwarding: they send the execution result of the EX/MA stage back to the ID stage to resolve data hazards. There are 4 signals in each interface.
valid_addr: Indicates whether the address of the register or memory access is valid.
valid_data: Indicates whether the data of the register or memory access is valid.
addr: The address of the register or memory access. It is compared with the address to be accessed by the instruction in the ID stage.
data: The data of the register or memory access.
Execute¶
The execute stage is responsible for calculations and sending memory requests to the LSU. This stage consists of an arithmetic and logic unit (ALU), a pair of multi-cycle multiplier and divider, a branch address calculation unit and a load/store address calculation unit.
ALU: The ALU is responsible for additions, subtractions, shifts, data comparisons (for branches and slt instructions), and bit-wise logical operations (AND, OR, XOR). The ALU is fed with the operands as well as the operation type. The logic in ALU is purely combinational.
Multiplier: The multiplier is used for multiplications. It is fed the operands as well as the multiplication type. The start_pulse input of the multiplier is set to 1 for 1 cycle to trigger the multiplication operation. The complete output is set to 1 when the multiplication is done. For multiplications where only the lower 64 bits of the result are needed, the calculation completes in the same cycle the start_pulse is set to 1. For multiplications where the upper 64 bits of the result are needed, the calculation completes in 3 cycles.
Divider: The divider is used for division operations. It is fed the operands as well as the division type. The divider triggers the calculation when the start_pulse input is set to 1. The complete output is set to 1 when the division is done. The divider takes 17 cycles to complete a division operation.
The target address of a branch and the address of a load/store instruction are calculated by the branch address calculation unit and the load/store address calculation unit, respectively. For a branch instruction, if the branch is taken, a flush signal is sent to IF and ID to “flush” the instructions in those stages, and a redirection signal is sent to IF so that the PC changes accordingly. For a load/store instruction, the memory access request is sent to the D-Cache, so if the D-Cache hits, the memory access result is available at the MA stage in the next cycle.
Interfaces¶
ex2ma: This interface contains all the data passed from EX to MA. It works on a valid/ready handshake. There are 6 signals in this interface.
pc: The PC of the instruction.
inst: The instruction data.
ex_out: The result of EX’s calculation.
rd_addr: The address of destination register 1, if any.
csr_addr: The address of csr register, if any.
is_rvc: Whether this instruction is RVC.
ex2dc: This is the interface between EX and D-Cache, used for sending memory requests. It uses a valid/ready handshake. There are 5 signals in this interface.
rw: 1 if the request is a write, 0 if it is a read.
mask: The byte mask for Store operation.
addr: The memory request address.
wdata: The write data of the memory request.
width: The width of the operand of Load/Store operation.
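To make the relationship between addr, width and mask concrete, here is a small Python sketch of how a store byte mask could be derived. It assumes an 8-byte data bus with one mask bit per byte lane and naturally aligned accesses; none of these details are stated in the text above, and the function name `store_byte_mask` is hypothetical.

```python
def store_byte_mask(addr: int, width_bytes: int, bus_bytes: int = 8) -> int:
    """Return a byte-lane mask for a store of `width_bytes` at `addr`.

    Assumptions (not from the RRV64 documentation): the data bus is
    `bus_bytes` wide, bit i of the mask enables byte lane i, and the
    access is naturally aligned.
    """
    if width_bytes not in (1, 2, 4, 8):
        raise ValueError("unsupported store width")
    offset = addr % bus_bytes
    if offset % width_bytes != 0:
        raise ValueError("misaligned store")
    return ((1 << width_bytes) - 1) << offset
```

Under these assumptions, a 4-byte store to address 0x1004 enables the upper four byte lanes (mask 0xF0).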
Memory Access¶
This stage is responsible for receiving memory responses from D-Cache, interfacing with rrv_csr (CSR), sending redirection requests to IF in certain cases, and committing instructions and writing data to Register Files.
For load and store instructions, MA will receive memory responses from D-Cache. Only 1 memory response is accepted per instruction. Loads will respond with the data read from memory, while stores will respond with 0 data. The data will be forwarded to the ID stage through the bypass network to resolve possible data hazards.
For CSR instructions, the MA stage will read and write the CSR Registers.
For fence instructions or CSR operations on the PMP-related registers, MA will send an npc signal to the IF stage to release the stall state of the IF, ID and EX stages.
For instructions that have a destination register and raise no exception, the result is written to the regfile at the MA stage. Regfile writes are synchronous.
Interfaces¶
dc2ma: This interface is the memory response interface between D-Cache and MA. There are 4 signals in this interface.
rdata: The read data requested by load instructions.
excp_valid: Signals whether the memory access operation caused an exception (e.g. violated a PMP check).
excp_cause: Contains the exception cause of the instruction, if any.
valid: Whether the response is valid.
ma2cs/ma2cs_ctrl: These interfaces are used by MA for sending read/write requests to CSR. The ma2cs_ctrl interface controls transactions with CSR. There are 3 signals in ma2cs_ctrl:
csr_op: CSR operation type. It can be set to RRV64_CSR_OP_RW (read and write), RRV64_CSR_OP_RS (read and set), RRV64_CSR_OP_RC (read and clear) and CSR_OP_NONE if MA does not have a request to CSR.
ret_type: Return instruction type (mret or uret). It will be set to RET_TYPE_NONE if the instruction is not either of the ret type instructions mentioned.
is_wfi: Set to 1 if the instruction is a WFI instruction.
For ma2cs, there are 5 signals in this interface:
pc: PC of the current instruction. Used mainly for exception handling.
csr_addr: Request CSR address.
csr_wdata: The operand that is combined with the current CSR value according to csr_op; the result is written back to the CSR.
rs1_addr: rs1 address of the instruction. Used for checking if the CSR operation should be considered a write.
mem_addr: Memory address of the load or store instruction. Used for updating the MTVAL CSR on load/store PMP exceptions.
ma2irf: This interface is used by MA to send regfile writes to IRF. Writes will be validated using an active high write enable signal. Including the enable signal, there are 3 signals in this interface:
rd: Write data.
rd_addr: Regfile write address.
rd_we: Write enable.
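The three CSR operation types carried by csr_op follow the standard RISC-V CSRRW/CSRRS/CSRRC semantics, including the rule that a set or clear with rs1 = x0 does not write the CSR (which is why rs1_addr is passed along). The sketch below models that behavior in Python; the string operation names and the function name `csr_next_value` are illustrative stand-ins, not the RTL enum values.

```python
MASK64 = (1 << 64) - 1


def csr_next_value(csr_op: str, old: int, wdata: int, rs1_addr: int) -> int:
    """Return the CSR value after the operation (the old value is what gets read).

    `csr_op` is one of "RW", "RS", "RC", "NONE" (hypothetical names).
    """
    if csr_op == "NONE":
        return old
    if csr_op == "RW":
        return wdata & MASK64
    if csr_op in ("RS", "RC") and rs1_addr == 0:
        return old  # set/clear with rs1 = x0 is a pure read, no write occurs
    if csr_op == "RS":
        return (old | wdata) & MASK64
    if csr_op == "RC":
        return old & ~wdata & MASK64
    raise ValueError(f"unknown csr_op: {csr_op}")
```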
Instruction Buffer¶
The instruction buffer is mainly used to prefetch instructions from the L1 Cache. In addition to the instruction requested by IF, the instruction buffer also fetches the instructions of the next two cache lines. If the execution flow is sequential, or there is a forward jump that spans less than two cache lines, the instruction buffer hits and returns the instruction data within one cycle because it has already been fetched. When a branch or jump instruction is taken and the instruction at the destination address is not currently in the instruction buffer, the instruction buffer is flushed and sends a request to the I-Cache.
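The hit condition can be pictured as a window of three consecutive cache lines: the requested line plus the two prefetched ones. The Python sketch below models that window; the 32-byte line size is taken from the L1 instruction cache parameters, while the idea that the buffer tracks a single base line address, and the name `ibuf_hits`, are assumptions made for illustration.

```python
LINE_BYTES = 32      # L1 instruction cache line size
PREFETCH_LINES = 2   # lines fetched beyond the requested one


def ibuf_hits(requested_addr: int, pc: int) -> bool:
    """True if `pc` falls in the line of `requested_addr` or the next two lines."""
    base = requested_addr - requested_addr % LINE_BYTES
    return base <= pc < base + (1 + PREFETCH_LINES) * LINE_BYTES
```

With a buffer filled from address 0x80000000, any PC up to 0x8000005F hits, while a backward jump or a jump to 0x80000060 and beyond misses and triggers a flush.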
Loop Buffer¶
The loop buffer is a high-speed, D-Cache-type memory that holds up to 64 of the most recently fetched instructions. It is maintained by the IF stage of the pipeline. When a branch instruction is taken, the loop buffer is checked first to see whether the target instruction is present. If the loop buffer hits, the instruction data is returned to IF within a cycle. If not, the loop buffer waits for the instruction data to be fetched from the instruction buffer or the L1 Cache and uses this instruction to replace the oldest instruction in the loop buffer.
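The replace-the-oldest policy can be modeled with an insertion-ordered map. The following Python sketch is a behavioral illustration only; the class name `LoopBuffer` and its methods are hypothetical, and indexing entries by PC is an assumption about how lookups work.

```python
from collections import OrderedDict
from typing import Optional


class LoopBuffer:
    """Holds the most recently fetched instructions, evicting the oldest."""

    def __init__(self, capacity: int = 64) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[int, int]" = OrderedDict()  # pc -> instruction

    def lookup(self, pc: int) -> Optional[int]:
        """Return the instruction at `pc` on a hit, or None on a miss."""
        return self.entries.get(pc)

    def fill(self, pc: int, inst: int) -> None:
        """Insert a newly fetched instruction, replacing the oldest if full."""
        if pc not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
        self.entries[pc] = inst
```

A loop whose body fits in 64 instructions stays resident after its first iteration, so the backward branch at the end of the loop hits in the buffer.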
Address Translation¶
To support an operating system, RRV64 features full hardware support for address translation via a Memory Management Unit (MMU). It has separate configurable data and instruction TLBs. The TLBs are fully associative memories. On each instruction and data access, they are checked for a valid address translation. If none exists, RRV64’s hardware page table walker (PTW) queries the main memory for a valid address translation. The replacement strategy for TLB entries is Pseudo Least Recently Used (PLRU).
Both the instruction cache and the data cache are virtually indexed and physically tagged (VIPT) and fully parametrizable. The address is split into the page offset (lower 12 bits) and the virtual page number (bits 12 through 38 under Sv39). The page offset is used to index into the cache while the virtual page number is simultaneously used for address translation through the TLB. In case of a TLB miss, the pipeline is stalled until the translation is valid.
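The address split can be written out explicitly. Under Sv39, as defined by the RISC-V privileged specification, a virtual address has a 12-bit page offset and a 27-bit virtual page number made of three 9-bit fields. The helper below is an illustrative Python sketch; the function name `split_sv39` is hypothetical.

```python
from typing import List, Tuple


def split_sv39(va: int) -> Tuple[int, int, List[int]]:
    """Split an Sv39 virtual address into (page_offset, vpn, [vpn0, vpn1, vpn2])."""
    page_offset = va & 0xFFF                 # bits 11:0
    vpn = (va >> 12) & ((1 << 27) - 1)       # bits 38:12
    # Each page-table level is indexed by 9 bits of the VPN.
    vpn_levels = [(vpn >> (9 * level)) & 0x1FF for level in range(3)]
    return page_offset, vpn, vpn_levels
```

For the address 0x80001234 this gives a page offset of 0x234 and a VPN of 0x80001, with per-level indices VPN[0] = 1, VPN[1] = 0 and VPN[2] = 2.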
Exception Handling¶
Exceptions can occur throughout the pipeline and are hence linked to a particular instruction. The first exception can occur during instruction fetch when the PTW detects an illegal TLB entry or the address is not aligned. During decoding, exceptions can occur when the decoder detects an illegal instruction. As soon as an exception has occurred, the corresponding instruction is marked and auxiliary information is saved. The excepting instruction is then handled by the exception handler at the MA stage.
Interrupts are asynchronous exceptions. In RRV64, they are synchronized to a particular instruction. Like exceptions, interrupts are processed in the MA stage.
Privileged Extensions¶
The privileged specification defines more CSRs governing the execution mode of the hart. The base supervisor ISA defines an additional interrupt stack for supervisor mode interrupts as well as a restricted view of machine mode CSRs. Accesses to these registers are restricted to the same or a higher privilege level.
CSR accesses are executed in the MA stage. Furthermore, a CSR access can have side-effects on subsequent instructions which are already in the pipeline e.g. altering the address translation infrastructure. This makes it necessary to completely flush the pipeline on such accesses.
Cache¶
Cache overview¶
So far, the RRV64 core is equipped with private L1 instruction and data caches and a unified L2 cache. A coherent L1 data cache is in progress.
The overall design of our internal memory hierarchy is illustrated in the following block diagram.
L1 Cache¶
L1 Data Cache¶
As part of the memory hierarchy, the L1 data cache helps cut down the memory access time of the CPU. Because the L1 D-Cache is private, cache coherence among multiple cores is a major problem to settle. The design and implementation of the cache coherence scheme and other design details are work in progress.
Parameter¶
The parameters of the L1 data cache are as follows:
| Cache capacity | Cache line numbers | Cache line capacity | Mapping method |
|---|---|---|---|
| 32 KBytes | 512 | 32 Bytes | 2-way set associative |
L1 Instruction Cache¶
As part of the memory hierarchy, the L1 instruction cache helps cut down the latency of CPU instruction fetching.
The parameters of the L1 instruction cache are as follows:
| Cache capacity | Cache line numbers | Cache line capacity | Mapping method |
|---|---|---|---|
| 8 KBytes | 128 | 32 Bytes | 2-way set associative |
L2 Cache¶
Overview¶
The L2 cache is a 256KB, 4-bank, 4-way set associative shared cache. The hit latency of the L2 cache is 4 cycles. The L2 cache RAM read and write processes are pipelined into 4 stages to reduce RAM accesses and reach a higher frequency. The L2 cache is designed as a non-blocking cache that can handle hit-under-miss and miss-under-miss using Miss Status Holding Registers (MSHRs). With a non-blocking L2 cache design, the memory system can execute out of order and hide more latency.
![]()
Fig.1 L2 cache bank connection¶
Parameter¶
The parameters of the L2 cache are as follows:
| Cache capacity | Cache line numbers | Cache line capacity | Mapping method |
|---|---|---|---|
| 256 KBytes | 512 | 32 Bytes | 4-way set associative |
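The "Cache line numbers" values in the parameter tables do not equal capacity divided by line size. One reading that is consistent with all three caches is that the column counts sets (per bank, in the case of the 4-bank L2). This is an inference from the arithmetic rather than something the documentation states, and the helper name `cache_sets` below is hypothetical.

```python
def cache_sets(capacity_bytes: int, line_bytes: int, ways: int, banks: int = 1) -> int:
    """Number of sets per bank for a set-associative cache."""
    total_lines = capacity_bytes // line_bytes
    return total_lines // ways // banks


# Parameters taken from the cache tables in this section.
L1_DCACHE_SETS = cache_sets(32 * 1024, 32, ways=2)
L1_ICACHE_SETS = cache_sets(8 * 1024, 32, ways=2)
L2_SETS_PER_BANK = cache_sets(256 * 1024, 32, ways=4, banks=4)
```

Under this reading the L1 data cache has 512 sets, the L1 instruction cache has 128 sets, and each L2 bank has 512 sets, matching the tabulated values.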
L2 cache pipeline¶
The L2 cache is designed as a 4-stage pipeline for low power and high frequency. In the first 3 stages, the valid, tag, lru, dirty and data RAMs are checked serially, so a RAM is not accessed at all when the information from earlier stages tells the control logic that the access is unnecessary.
The Miss Status Holding Registers sit in stage 4. They can hold multiple outstanding cache-miss requests to the next level of memory without blocking the whole pipeline. This is a key feature for an out-of-order memory system.
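The way MSHRs keep the cache non-blocking can be sketched as a small table keyed by cache-line address: a miss to a new line allocates an entry, a miss to a line that is already outstanding is merged into the existing entry, and the pipeline stalls only when every entry is in use. This Python model is a simplified illustration; the class name `MshrFile`, the entry count and the return strings are assumptions, not details of the RRV64 L2 implementation.

```python
from typing import Dict, List


class MshrFile:
    """Tracks outstanding misses so later requests need not wait behind them."""

    def __init__(self, num_entries: int = 4, line_bytes: int = 32) -> None:
        self.num_entries = num_entries          # hypothetical entry count
        self.line_bytes = line_bytes
        self.entries: Dict[int, List[int]] = {}  # line address -> waiting request ids

    def on_miss(self, addr: int, req_id: int) -> str:
        line = addr - addr % self.line_bytes
        if line in self.entries:
            self.entries[line].append(req_id)   # miss-under-miss to the same line
            return "merged"
        if len(self.entries) >= self.num_entries:
            return "stall"                      # no free MSHR, request must retry
        self.entries[line] = [req_id]
        return "allocated"

    def on_refill(self, line_addr: int) -> List[int]:
        """The next-level memory returned a line; release the waiting requests."""
        return self.entries.pop(line_addr, [])
```

While a line is outstanding, requests that hit in the cache bypass this table entirely, which is the hit-under-miss case.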
![]()
Fig.2 L2 cache pipeline overview¶
Contributing¶
We highly appreciate community contributions. If you want to contribute to the project, please:
Create your own branch to commit your changes and then open a Pull Request.
Split large contributions into smaller commits addressing individual changes or bug fixes. Include only one change per commit.
Write meaningful commit messages. For more information, please check out the commit guide.
If asked to modify your changes, fix up your commits and rebase your branch to maintain a clean history.
Commit guide¶
Create your branch to commit your changes and then create a Pull Request.
Separate subject from body with a blank line.
Capitalize the subject line.
Use the present tense (“Add feature” not “Added feature”).
Use the body to explain what and why and how.
| Component | Description |
|---|---|
| Pygmy_ES1Y EVB User Guide | |
| RRV64 core used in PicoRio: a single-issue, in-order, 5-stage-pipeline, 64-bit RISC-V core. | |
| Graphics | Collection of display pipeline in PicoRio™. This includes the GPU, display core, and video encoder and decoder. |
| Private L1 instruction & data cache and unified L2 cache. | |
| System Control | System control related features and units |
| IO | Collection of input and output interfaces in PicoRio hardware. |
The overall PicoRio™ hardware block diagram (future work included):
Software Projects¶
This section describes the software projects that PicoRio supports. We put all projects in a dashboard and list their current development status.
Firmware¶
Debug socket introduction¶
Debug-socket¶
Debug-socket is a proxy that runs on the host and interacts with the target. Its role in software development is shown in the following picture.

Fig.1 Socket debug in SW development¶
According to the riscv-debug specification, if a core contains a standard debug module, simply follow “RISC-V external debugging support version xxx”. For the standard debug module:

Fig.2 RISC-V debug overview¶
We choose to use a software-based debug socket instead of a standard debug module to implement the debug function, both of which have the same effect and can be used for debugging of the soc. For our debug-socket, see debug-socket connections overview.

Fig.3 Debug socket connection overview¶
Basically, the debug-socket implements basic functions required by gdb, with the help of hardware-provided breakpoint, watchpoint, trace buffer, and many other features.
Debug-socket supported command list¶
Full-stack debug tool development is under way; for now, you can use the raw debug-socket interface to debug. Debug-socket offers a long list of commands, but the following are the ones used most frequently:
| Command | Usage | Function |
|---|---|---|
| b0 | b0 addr | set a breakpoint at hw breakpoint 0 with addr |
| b1 | b1 addr | set a breakpoint at hw breakpoint 1 with addr |
| b2 | b2 addr | set a breakpoint at hw breakpoint 2 with addr |
| b3 | b3 addr | set a breakpoint at hw breakpoint 3 with addr |
| d0 | d0 | disable breakpoint at hw breakpoint 0 |
| d1 | d1 | disable breakpoint at hw breakpoint 1 |
| d2 | d2 | disable breakpoint at hw breakpoint 2 |
| d3 | d3 | disable breakpoint at hw breakpoint 3 |
| wp (not supported for now) | wp | show watchpoint configuration |
| bp | bp | show breakpoint configuration |
| c | c | continue to run |
| stall | stall | make cpu stall |
| step N | step N | run next N instructions |
| gpr (not supported for now) | gpr | print all general purpose registers |
| q | q | quit debug-socket |
| wb_pc | wb_pc | show current execute instruction pc |
| if_pc | if_pc | show current fetch instruction pc |
| minstret | minstret | show m-mode executed instruction count |
| mstatus | mstatus | show mstatus value |
| mcause | mcause | show mcause value |
| mepc | mepc | show mepc value |
| mip | mip | show mip value |
| mie | mie | show mie value |
| hpmcounter_3~hpmcounter_10 | hpmcounter_3 hpmcounter_4 hpmcounter_5 hpmcounter_6 hpmcounter_7 hpmcounter_8 hpmcounter_9 hpmcounter_10 | show PMU counter values |
| dump | dump 0x00f00000 0x00f00080 rb/dma | dump content from start address to end address |
| read | read 0x00f00000 rb/dma | read content from specified address, rb for device register & dma for memory |
| write | write 0x00f00008 1 rb/dma | write value to specified address, rb for device register & dma for memory |
| uart1 | uart1 | show uart1 cfg |
| gpio | gpio | show gpio cfg |
| rtc | rtc | show rtc cfg |
| wdt | wdt | show wdt cfg |
| i2c0 | i2c0 | show i2c controller’s cfg |
Classical debug process¶
When encounter some error in program, you can use debug-socket to debug the program:
1. type ‘minstret’ twice to analysis if the CPU is stall or not, if the two values of minstret is the same value, the CPU is stalled
: minstret
Do Read to Addr 0x1002b0 (minstret), Got Data 0x2409734f
Please enter command: (All Data in HEX no matter 0x is added or not)
: minstret
Do Read to Addr 0x1002b0 (minstret), Got Data 0x240aa177
Please enter command: (All Data in HEX no matter 0x is added or not)
:
if the CPU is not stalled, type ‘wb_pc’
: wb_pc
Do Read to Addr 0x100258 (wb_pc), Got Data 0x80009430
Please enter command: (All Data in HEX no matter 0x is added or not)
:
use ‘b0 addr’ to set a breakpoint, the program will stop when run into addr
: b0 80008e48
add breakpoint0, pc_addr = 0x80008e48
Please enter command: (All Data in HEX no matter 0x is added or not)
:
then, you can use ‘read addr dma’ to check some var value
: read 800102c4 dma
Do Read to Addr 0x800102c4, Got Data 0x6ffffffff
Please enter command: (All Data in HEX no matter 0x is added or not)
:
Type ‘step N’ to execute N instructions.
: step 10
pc = 0x80000300
pc = 0x80000304
pc = 0x80000308
pc = 0x8000030c
pc = 0x80000310
pc = 0x80000314
pc = 0x80000318
pc = 0x8000031c
pc = 0x80000320
pc = 0x80000324
Please enter command: (All Data in HEX no matter 0x is added or not)
:
Re-check the variable.
: read 800102c4 dma
Do Read to Addr 0x800102c4, Got Data 0x6ffffffff
Please enter command: (All Data in HEX no matter 0x is added or not)
:
Continue running until the breakpoint is hit again.
: c
Continue
Please enter command: (All Data in HEX no matter 0x is added or not)
:
Disable the breakpoint.
: d0
del hw breakpoint1
Please enter command: (All Data in HEX no matter 0x is added or not)
:
Continue execution.
: c
Continue
Please enter command: (All Data in HEX no matter 0x is added or not)
:
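The stall check from the first step can also be scripted on the host. The following is a minimal sketch, assuming the reply lines have exactly the "Got Data" format shown in the transcripts above; the helper names parse_got_data and minstret_stall_check are hypothetical and not part of debug-socket.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Extract the hex value after "Got Data " in one debug-socket reply line,
 * e.g. "Do Read to Addr 0x1002b0 (minstret), Got Data 0x2409734f".
 * Returns true and stores the value in *out on success. */
bool parse_got_data(const char *line, uint64_t *out)
{
    static const char marker[] = "Got Data ";
    const char *p = strstr(line, marker);
    unsigned long long value;

    if (p == NULL)
        return false;
    /* %llx accepts the digits with or without a leading 0x. */
    if (sscanf(p + strlen(marker), "%llx", &value) != 1)
        return false;
    *out = (uint64_t)value;
    return true;
}

/* Compare two successive minstret replies.
 * Returns 1 if the counter did not advance (CPU stalled),
 * 0 if it advanced, and -1 if either line could not be parsed. */
int minstret_stall_check(const char *first, const char *second)
{
    uint64_t a, b;

    if (!parse_got_data(first, &a) || !parse_got_data(second, &b))
        return -1;
    return a == b ? 1 : 0;
}
```

Feeding the two minstret replies from the transcript above into minstret_stall_check returns 0, because the counter advanced between the reads.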
ES1Y SDK v1.0 Introduction¶
The ES1Y Software Development Kit currently runs on Linux hosts; support for other host operating systems, such as Windows, is planned. The SDK provides FreeRTOS APIs for application development and includes several system test demos to help new users get started quickly.
1. Getting started¶
This chapter covers preparing the development environment, building a binary and running it on the ES1Y SoC, and using the debugging tools that rvSDK provides.
Init SDK
First, follow the README.md file in the SDK v1.0 root directory.
Compile & Run
- Build the gcc toolchain
$ cd build
$ make gcc
- Build the fesvr & debug proxy
$ make fesvr
$ make driver
- Build freertos and application code
# clean if needed
$ make freertos-clean
$ make freertos
- Run Vivado to use the FPGA as a debug tool, and run the debug proxy at the same time
# this command only needs to be executed once in the whole debug process
$ make run-vivado # or, for short, 'make rv'
- Download and run FreeRTOS firmware through debug proxy
$ make run-rtos # or, for short, 'make rvt'
# when you have finished debugging and want to exit,
# press Ctrl+C twice
- To simplify the build process, you can use the single command below instead of the steps above
$ make freertos-all
- One additional command is provided to speed up debugging after editing source code
# this command is equivalent to make freertos && make run-rtos
$ make re-comp-run-rtos # or, for short, 'make rvrt'
- After running some IO tests, the default code in rvSDK v1.0 starts two tasks: one prints ‘TEST’ every second and the other prints ‘DEMO’ every 2 seconds:
*************************************************
Welcome enter FreeRTOS on pygmy_e platform
*************************************************
TEST DEMO for IO functions ...
TEST IO functions done ...
------- TEST -------
Demo task ...
------- Demo -------
------- TEST -------
------- TEST -------
------- Demo -------
------- TEST -------
------- TEST -------
------- Demo -------
- Debug
- Console by UART
See the separate document that describes the USB-UART dongle connection between host and target.
For print debugging, configure the UART for a baud rate of 500000, 8N1.
- Command Line Interface(CLI)
- The debugging CLI in rvSDK v1.0 with debug-spi-base.o has some limitations; richer debugging tools are planned for future releases.
1. Debug tool startup interface
$ cd software/host/driver/pygmy_e
$ ./debug-socket.o
serverPort = 8800
Please enter command: (All Data in HEX no matter 0x is added or not)
:
2. Read the current PC value
: if_pc
Do Read to Addr 0x100238 (if_pc), Got Data 0x8000943c
Please enter command: (All Data in HEX no matter 0x is added or not)
: wb_pc
Do Read to Addr 0x100258 (wb_pc), Got Data 0x80009430
Please enter command: (All Data in HEX no matter 0x is added or not)
:
3. Read a device register
: read 80bff8 rb
Do Read to Addr 0x80bff8, Got Data 0xbccb5ade85
Please enter command: (All Data in HEX no matter 0x is added or not)
:
4. Read memory
: read 8000f798 dma
Do Read to Addr 0x8000f798, Got Data 0x20656e6f6420736e
Please enter command: (All Data in HEX no matter 0x is added or not)
:
See the debug-socket introduction for more debug commands.
2. How to code¶
In rvSDKv1/target/src/Demo/pymgy_e/, app_entry.c is the main application entry file. Implement rvHalCB_app_entry() in it according to your own requirements.
By default, the demo program runs when #define APP_SYSTEM_TEST is enabled in app_entry.c; otherwise your own program runs.
The demo code in this file covers I2C, GPIO and SPI flash. Each demo is controlled by #define TEST_DEMO_GPIO, #define TEST_DEMO_I2C and #define TEST_DEMO_SPI_FLASH in target/src/Demo/pymgy_e/hal/config/pygmy_e/system_config.h.
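The compile-time switch between the demo and user code can be pictured as follows. This is a minimal sketch, not the SDK's actual app_entry.c: the signature of rvHalCB_app_entry() (no arguments, no return value) is an assumption, and run_system_test_demo, run_user_application and the two counters are hypothetical placeholders added so that the sketch can be exercised on a host.

```c
#include <stdio.h>

/* Comment this out to run your own application instead of the demo. */
#define APP_SYSTEM_TEST

/* Counters exist only so the sketch can be checked on a host. */
int demo_runs;
int user_runs;

/* Hypothetical stand-ins for the SDK demo code and for user code. */
void run_system_test_demo(void)
{
    demo_runs++;
    puts("system test demo");
}

void run_user_application(void)
{
    user_runs++;
    puts("user application");
}

/* Assumed signature; check app_entry.c in the SDK for the real one. */
void rvHalCB_app_entry(void)
{
#ifdef APP_SYSTEM_TEST
    run_system_test_demo();
#else
    run_user_application();
#endif
}
```

Because the selection happens in the preprocessor, only one of the two branches is compiled into the firmware image, so the unused code adds no size to the binary.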
3. Programming API¶
- FreeRTOS API
The FreeRTOS quick start guide is available at https://www.freertos.org/FreeRTOS-quick-start-guide.html
ES1Y API¶
OS API¶
The official FreeRTOS API references can be found here: https://www.freertos.org/a00106.html
UART API¶
Only module initialization and printf are supported for now; more functions are under development.
/********************************
uart
*********************************/
/*!
* @discussion initialize uart module.
*/
void __rvHal_uart_init(void);
/*!
* @discussion print log through uart.
* @param fmt format string.
* @param ... arguments corresponding to the % specifiers in the format string.
* This is a simplified version of the standard libc printf;
* it supports only the following format specifiers:
* %d, %u, %ld, %lu, %lld, %llu, %o, %x, %lo, %lx, %llo, %llx, %s, %c, %%
* Width and padding are also supported for the specifiers above.
*/
int printf(const char* fmt, ...);
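As an illustration of the supported specifiers, the sketch below formats a register line using only %s, %llx and %u with width and zero padding. It uses the host libc snprintf as a stand-in so it can be run off-target; the assumption is that the SDK's printf formats these specifiers the same way. The helper name format_reg_line is hypothetical.

```c
#include <stdio.h>

/* Format "name: 0xADDRESS = VALUE" into buf using only specifiers from
 * the supported list: %s, %llx with zero padding to 8 digits, and %u
 * with a field width of 5. Returns the number of characters written
 * (excluding the terminating NUL), as snprintf does. */
int format_reg_line(char *buf, size_t len, const char *name,
                    unsigned long long addr, unsigned int value)
{
    return snprintf(buf, len, "%s: 0x%08llx = %5u", name, addr, value);
}
```

On the target, the same format string could be passed directly to printf. Specifiers outside the list above, such as floating-point conversions, should be avoided because the documentation does not list them as supported.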
GPIO API¶
The GPIO API is shown in the following code snippet:
/********************************
gpio
*********************************/
struct irq_gpio_handler_t
{
void *context;
void (*hook)(void *context);
};
struct gpio_desc
{
unsigned int pin;
struct irq_gpio_handler_t handler;
};
enum RVHAL_gpio_type
{
GPIO_PIN_INPUT = 0,
GPIO_PIN_OUTPUT,
};
enum RVHAL_gpio_int_type
{
GPIO_INT_TYPE_LEVEL = 0,
GPIO_INT_TYPE_EDGE,
};
enum RVHAL_gpio_int_polarity
{
GPIO_INT_POLARITY_LOW = 0,
GPIO_INT_POLARITY_HIGH,
};
/*!
* @discussion initialize gpio module.
*/
void __rvHal_gpio_init(void);
/*!
* @discussion initialize gpio pin descriptor.
* @param dgpio gpio descriptor.
* @param pin pin number[0, 31].
* @param type see enum RVHAL_gpio_type.
* @param value if type is GPIO_PIN_OUTPUT, it is [0, 1] by default.
*/
void rvHal_gpio_init( struct gpio_desc *dgpio, unsigned int pin, unsigned int type, unsigned int value );
/*!
* @discussion set gpio pin interrupt attributes.
* @param dgpio gpio descriptor.
* @param level see enum RVHAL_gpio_int_type.
* @param polarity see enum RVHAL_gpio_int_polarity.
* @param irqHandler gpio pin callback handler.
* @param context context param for this gpio pin.
*/
void rvHal_gpio_set_interrupt( struct gpio_desc *dgpio, unsigned int level, unsigned int polarity, void (*irqHandler)(void*), void *context);
/*!
* @discussion remove gpio pin interrupt attributes.
* @param dgpio gpio descriptor.
*/
void rvHal_gpio_remove_interrupt( struct gpio_desc *dgpio );
/*!
* @discussion set gpio pin output level.
* @param dgpio gpio descriptor.
* @param value [0, 1].
*/
void rvHal_gpio_write( struct gpio_desc *dgpio, unsigned int value );
/*!
* @discussion read gpio pin input level.
* @param dgpio gpio descriptor.
* @return value [0, 1].
*/
unsigned int rvHal_gpio_read( struct gpio_desc *dgpio );
/*!
* @discussion toggle gpio pin output level.
* @param dgpio gpio descriptor.
*/
void rvHal_gpio_toggle( struct gpio_desc *dgpio );
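A typical use of these calls is to drive an output pin from an interrupt on an input pin. The following is a minimal sketch under stated assumptions: the declarations are copied from the listing above because the SDK header name is not given here, the function bodies are hypothetical host-side stubs (on the board the SDK supplies the real implementations, which may store the handler differently), and pins 3 and 5 are arbitrary choices for illustration.

```c
#include <stddef.h>

/* --- Declarations copied from the listing above --- */
struct irq_gpio_handler_t { void *context; void (*hook)(void *context); };
struct gpio_desc { unsigned int pin; struct irq_gpio_handler_t handler; };
enum RVHAL_gpio_type { GPIO_PIN_INPUT = 0, GPIO_PIN_OUTPUT };
enum RVHAL_gpio_int_type { GPIO_INT_TYPE_LEVEL = 0, GPIO_INT_TYPE_EDGE };
enum RVHAL_gpio_int_polarity { GPIO_INT_POLARITY_LOW = 0, GPIO_INT_POLARITY_HIGH };

/* --- Hypothetical host-side stubs so the sketch runs off-target --- */
static unsigned int stub_level[32]; /* simulated level of pins 0..31 */

void rvHal_gpio_init(struct gpio_desc *dgpio, unsigned int pin,
                     unsigned int type, unsigned int value)
{
    dgpio->pin = pin;
    dgpio->handler.context = NULL;
    dgpio->handler.hook = NULL;
    if (type == GPIO_PIN_OUTPUT)
        stub_level[pin] = value & 1u;
}

void rvHal_gpio_set_interrupt(struct gpio_desc *dgpio, unsigned int level,
                              unsigned int polarity,
                              void (*irqHandler)(void *), void *context)
{
    (void)level;    /* trigger type is ignored by the stub */
    (void)polarity; /* polarity is ignored by the stub */
    dgpio->handler.hook = irqHandler;
    dgpio->handler.context = context;
}

void rvHal_gpio_write(struct gpio_desc *dgpio, unsigned int value)
{
    stub_level[dgpio->pin] = value & 1u;
}

unsigned int rvHal_gpio_read(struct gpio_desc *dgpio)
{
    return stub_level[dgpio->pin];
}

void rvHal_gpio_toggle(struct gpio_desc *dgpio)
{
    stub_level[dgpio->pin] ^= 1u;
}

/* --- Application-side usage --- */

/* Interrupt callback: context carries the LED descriptor registered below. */
static void on_button(void *context)
{
    rvHal_gpio_toggle((struct gpio_desc *)context);
}

/* Configure an LED output (initially low) and a button input whose
 * rising-edge interrupt toggles the LED. */
void demo_gpio_setup(struct gpio_desc *led, struct gpio_desc *button)
{
    rvHal_gpio_init(led, 3, GPIO_PIN_OUTPUT, 0);
    rvHal_gpio_init(button, 5, GPIO_PIN_INPUT, 0);
    rvHal_gpio_set_interrupt(button, GPIO_INT_TYPE_EDGE,
                             GPIO_INT_POLARITY_HIGH, on_button, led);
}
```

Passing the LED descriptor as the context pointer keeps the callback free of global state, which is the purpose of the context parameter in rvHal_gpio_set_interrupt.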
Projects |
Project Description |
---|---|
Debug-socket is a proxy that runs on the host and interacts with the target during software development. |
|
ES1Y SDK v1.0 provides FreeRTOS APIs for application development. rvSDK also includes several system test demos to help new users get started quickly. |
|
Includes OS API, UART API, GPIO API |
V8-RISCV¶
Welcome to the v8-riscv wiki
This is an ongoing project to enhance the RISC-V backend for the V8 JavaScript Engine. The initial port has been upstreamed (https://chromium.googlesource.com/v8/v8.git/). The RISC-V backend is fully functional and can run the full test suites as well as common benchmarks, but it still needs performance improvements and additional features. We have established a sustainable porting methodology and development best practices, so we feel confident inviting broader community participation. We welcome you to join our development effort. Plenty of support is still needed for a complete and high-performing V8 on RISC-V.
This repo will be the community home even though it is now available upstream. This provides us a shared space for developing larger changes here before pushing them upstream, as well as a stable branch that will always work for RISC-V, as upstream may still break the RISC-V port from time to time. For general V8 information, see V8 Dev. The rest of the wiki is specific to the RISC-V V8 backend.
RISC-V ISA specification is found here, and RISC-V standard ABI can be found here.
Getting Started¶
Project Management¶
Upstream Workflow
For Developers¶
RISC-V Backend Design Doc¶
How to develop a new backend
Community operation¶
Attend our bi-weekly developer Zoom Meeting
| Meeting Info | Description |
|-|-|
| Next meeting | 17/03/2021 (US) |
| Time | every other Wednesday, 5pm PT (Thursday 9am Beijing Time) |
| Meeting ID | 876 4151 0603 |
| Passcode | 714793 |
| Meeting agenda | Meeting agenda (03/03) |
| Last meeting minutes | Meeting minutes (03/03) |
Projects |
Development State |
Project Description |
Link |
---|---|---|---|
ES1Y Firmware includes Debug socket, ES1Y SDK and ES1Y API. |
|||
V8 is a commonly used JavaScript engine in popular web browsers. PicoRio provides support for RISC-V V8. |
|||
Chromium OS |
Chromium OS is an open-source operating system built around the Chromium web browser, with strong web application support and a rich software ecosystem. This project is a RISC-V port of Chromium OS and is in development. |