A tiny bootloader for LXP32

Introduction

If a program is stored in a slow external memory device (e.g. a serial Flash chip), it must be copied to RAM before it can be executed. While it is possible to design a dedicated RAM loader in Verilog or VHDL, the CPU itself can also do this job, provided that the program RAM is accessible from the data bus.

To do this we need a small ROM of eight 32-bit words (32 bytes) that the CPU can access via the instruction bus; it does not have to be connected to the data bus. Such a tiny ROM can be implemented in LUTs without using block RAM.

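To make the bootloader as small as possible, we place a few limitations on the system address space layout: the bootloader ROM occupies the last 32 bytes of the address space (0xFFFFFFE0 to 0xFFFFFFFF), where the CPU must start executing after reset, and the program RAM starts at address 0x00000000.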

Let's also assume that the source address is 0x04000000 (any address whose lower 24 bits are zero will work) and that the program size is 2048 words, i.e. 8192 bytes (any size below 1048576 bytes will work).
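The two constraints above are easy to get wrong when the constants are changed later. The following Python sketch checks them on the host before the values are written into the assembly source. The function name check_boot_parameters is hypothetical. The size bound is taken from the text, and the immediate-operand limits of the assembler itself are not modelled.

```python
def check_boot_parameters(source_address: int, program_size: int) -> int:
    """Validate the bootloader constants and return SOURCE_ADDRESS_MSB."""
    # The source pointer is built by shifting one byte left by 24 bits,
    # so only bits 31..24 of the address can be non-zero.
    if source_address & 0x00FFFFFF:
        raise ValueError("source address must have its lower 24 bits clear")
    if not 0 <= source_address <= 0xFFFFFFFF:
        raise ValueError("source address must fit in 32 bits")
    # Bound stated in the text: the size in bytes must stay below 1048576.
    if not 0 < program_size < 1048576:
        raise ValueError("program size must be between 1 and 1048575 bytes")
    return source_address >> 24


if __name__ == "__main__":
    msb = check_boot_parameters(0x04000000, 8192)
    print(f"#define SOURCE_ADDRESS_MSB 0x{msb:02X}")
```

For the values used in this article the function returns 0x04, which matches the SOURCE_ADDRESS_MSB definition in the listing.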

// The most significant byte of the source address
// (others are assumed to be zero)
#define SOURCE_ADDRESS_MSB 0x04

// Program size in bytes
#define PROGRAM_SIZE 8192

// Register variables
#define src r1
#define dst r2
#define size r3
#define loop_ptr r4

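	sl src, SOURCE_ADDRESS_MSB, 24 // src = SOURCE_ADDRESS_MSB << 24
// dst (r2) is already 0 after reset
	lcs size, PROGRAM_SIZE // number of bytes to copy
	lcs loop_ptr, Loop // address of the loop start
Loop:
	lw r0, src // read a word from the source
	sw dst, r0 // store it to the program RAM
	add src, src, 4
	add dst, dst, 4
	cjmpul loop_ptr, dst, size // repeat while dst < size (unsigned)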

The source address is loaded using the sl instruction instead of lc since the former occupies only one word. The cjmpul instruction is the last word of the ROM, at address 0xFFFFFFFC. Once the required number of words has been copied, the jump is not taken, so the instruction pointer advances past the end of the address space, wraps around to 0x00000000 and execution continues at the start of the program RAM.
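The wrap-around is easier to see in a small model. The Python sketch below mirrors the loop register by register; it is not a CPU simulator. It assumes 32-bit wrapping arithmetic and a ROM based at 0xFFFFFFE0, and the function name run_bootloader and the dictionary-based memories are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def run_bootloader(flash, source_address=0x04000000, program_size=8192,
                   rom_base=0xFFFFFFE0):
    """Model the copy loop.

    flash maps byte addresses to 32-bit words.
    Returns (ram, next_ip): the copied RAM contents and the address
    of the first instruction executed after the loop ends.
    """
    ram = {}
    src = source_address
    dst = 0                           # r2 is zero after reset
    size = program_size
    cjmpul_address = rom_base + 7 * 4  # last of the eight ROM words

    while True:
        ram[dst] = flash.get(src, 0)  # lw r0, src / sw dst, r0
        src = (src + 4) & MASK32      # add src, src, 4
        dst = (dst + 4) & MASK32      # add dst, dst, 4
        if dst < size:                # cjmpul taken: back to Loop
            continue
        # Jump not taken: fall through to the next word, modulo 2**32.
        next_ip = (cjmpul_address + 4) & MASK32
        return ram, next_ip


if __name__ == "__main__":
    flash = {0x04000000 + 4 * i: i for i in range(2048)}
    ram, next_ip = run_bootloader(flash)
    print(f"copied {len(ram)} words, next instruction at 0x{next_ip:08X}")
```

With the default parameters the model copies 2048 words and reports 0x00000000 as the next instruction address. Like the assembly code, it always copies at least one word because the condition is tested at the end of the loop.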

To assemble the code, we must specify the base address explicitly, for example:

lxp32asm -b 0xFFFFFFE0 -f hex bootrom.asm

As a result, the following machine code will be generated:

70010418
A0032000
BF04FFEC
22000100
33000200
42010104
42020204
CB040302

ROM code in Verilog

module bootrom(
	input clk_i,
	input [2:0] addr_i,
	output reg [31:0] data_o
);

always @(posedge clk_i) begin
	case (addr_i)
		3'b000: data_o <= 32'h70010418;
		3'b001: data_o <= 32'hA0032000;
		3'b010: data_o <= 32'hBF04FFEC;
		3'b011: data_o <= 32'h22000100;
		3'b100: data_o <= 32'h33000200;
		3'b101: data_o <= 32'h42010104;
		3'b110: data_o <= 32'h42020204;
		default: data_o <= 32'hCB040302;
	endcase
end

endmodule
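Typing the case arms by hand invites copy errors whenever the bootloader changes. As a minimal sketch, the Python helper below turns the assembler's hex output into the arms of the case statement. It assumes the output holds one 32-bit word per line, as in the listing shown earlier. The function names are hypothetical, and it emits non-blocking assignments, the usual choice for clocked logic.

```python
def parse_hex_words(text):
    """Parse assembler output with one hexadecimal 32-bit word per line."""
    return [int(line, 16) for line in text.splitlines() if line.strip()]


def verilog_case_arms(words):
    """Return the case arms for an eight-word ROM as a single string."""
    if len(words) != 8:
        raise ValueError("the boot ROM must contain exactly eight words")
    arms = []
    for index, word in enumerate(words[:-1]):
        arms.append(f"\t\t3'b{index:03b}: data_o <= 32'h{word:08X};")
    # The last word is emitted as the default arm, as in the module above.
    arms.append(f"\t\tdefault: data_o <= 32'h{words[-1]:08X};")
    return "\n".join(arms)


if __name__ == "__main__":
    hex_text = """70010418
A0032000
BF04FFEC
22000100
33000200
42010104
42020204
CB040302
"""
    print(verilog_case_arms(parse_hex_words(hex_text)))
```

The same approach could produce the when branches of the VHDL version by changing the two format strings.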

ROM code in VHDL

library ieee;
use ieee.std_logic_1164.all;

entity bootrom is
	port(
		clk_i: in std_logic;
		addr_i: in std_logic_vector(2 downto 0);
		data_o: out std_logic_vector(31 downto 0)
	);
end entity;

architecture rtl of bootrom is

begin

process(clk_i)
begin
	if rising_edge(clk_i) then
		case addr_i is
		when "000" =>
			data_o<=X"70010418";
		when "001" =>
			data_o<=X"A0032000";
		when "010" =>
			data_o<=X"BF04FFEC";
		when "011" =>
			data_o<=X"22000100";
		when "100" =>
			data_o<=X"33000200";
		when "101" =>
			data_o<=X"42010104";
		when "110" =>
			data_o<=X"42020204";
		when others =>
			data_o<=X"CB040302";
		end case;
	end if;
end process;

end architecture;