Discussion:
[N8VEM: 17910] My Z80 FPGA computer
William R Sowerbutts
2014-04-26 13:50:12 UTC
Hello everyone

This is not strictly N8VEM related, but I've seen your thread about FPGA
systems on chip ... I built a similar system last year and I thought some of
you may be interested. I should start by saying this was only my second FPGA
project and was also my first attempt at writing code for a Z80, so the
quality of my code is probably not very good! The machine works well though
and I've had a great deal of fun with it.

I built my system using a Papilio Pro FPGA board:
http://papilio.cc/index.php?n=Papilio.PapilioPro

The PPro is a great board and I thoroughly recommend it. It has a Xilinx
Spartan 6 LX9 FPGA, 8MB of SDRAM, 8MB of SPI flash memory, and an FTDI USB
interface that is used to connect JTAG and UART to a host PC. My main
criticism of the board is that the serial link via the UART has no flow
control lines hooked up to the FPGA -- the FTDI has a deep FIFO (several KB)
and you can build a receive FIFO inside the FPGA, but at high data rates you
will inevitably overflow these eventually.

I also have a Pipistrello FPGA board which is based on the same Papilio form
factor. It has the UART flow-control hooked up, has a larger and faster DDR
SDRAM chip as well as a much larger LX45 FPGA. You can use the Xilinx on-chip
memory controller block to drive the DDR SDRAM chip. I've not yet had time to
get this board working.

The Papilio form factor is very hardware-hacker friendly; all the IO pins are
broken out on 0.1" headers so you can easily pop a bit of veroboard on top
and solder up a MAX3232 or SD card or LEDs or whatever.

I started my Z80 system with the open-source T80 CPU core, a UART that I'd
written for an earlier project, and some of the on-chip block SRAM for
memory. I then wrote a simple monitor program for it.

Xilinx have a "data2mem" tool that you can use to quickly replace the data
loaded into a block RAM without resynthesising the FPGA design (which is
rather slow), so you can assemble your monitor program, use data2mem to have
the code loaded into block RAM, then reprogram the FPGA which will run the
code when it comes out of reset. This affords a very quick edit/compile/test
cycle (about 3 seconds from hitting enter to running code).

Once I had a monitor program running I imported Mike Field's brilliant SDRAM
controller to drive the 8MB SDRAM chip on the board:
http://hamsterworks.co.nz/mediawiki/index.php/Simple_SDRAM_Controller

This gave me access to far more memory than the Z80 could address, so I added
a 4K paged MMU to translate the 16-bit (64K) logical address space into a
26-bit (64MB) physical address space. Each 4KB logical page can be mapped
independently to any 4KB physical page. There's also what I call the "17th
page" which allows you to access physical memory addresses without mapping
them into the CPU address space -- it has a 26-bit pointer in the MMU and an
I/O port that translates I/O cycles into memory cycles, automatically
incrementing the pointer after each cycle so you can use the INIR and OTIR
instructions with it to do block copies of unmapped physical memory to/from
mapped memory.

The SDRAM takes on the order of 10 cycles to supply data after a read request
so I implemented a 16KB direct-mapped cache using the on-chip block SRAM in
order to conceal this latency. This works very well. The FPGA block SRAM is
36 bits wide which allows for a 4-byte wide cache line plus 4 bits to
indicate the validity of each byte. You can also use it in a 9-bit wide mode
which turned out to be perfect for storing the cache address tags.
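
To make the cache geometry concrete, here is a rough Python sketch of how a physical address might be split for a 16KB direct-mapped cache with 4-byte lines and per-byte valid bits. The 23-bit address width (8MB of SDRAM) and all the names are assumptions for illustration, not taken from the actual design. Under those assumptions the tag works out to 9 bits, which would fit the 9-bit block RAM mode mentioned above.

```python
# Hypothetical model of a 16KB direct-mapped cache with 4-byte lines.
# Assumes a 23-bit physical address (8MB SDRAM); names are illustrative only.

CACHE_BYTES = 16 * 1024
LINE_BYTES = 4
ADDR_BITS = 23                                    # assumption: 8MB of SDRAM

OFFSET_BITS = LINE_BYTES.bit_length() - 1         # 2
NUM_LINES = CACHE_BYTES // LINE_BYTES             # 4096
INDEX_BITS = NUM_LINES.bit_length() - 1           # 12
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS   # 9


def split_address(addr):
    """Return (tag, index, offset) for a physical byte address."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Each line holds a tag, 4 data bytes and 4 per-byte valid bits."""

    def __init__(self):
        self.tags = [0] * NUM_LINES
        self.valid = [[False] * LINE_BYTES for _ in range(NUM_LINES)]
        self.data = [[0] * LINE_BYTES for _ in range(NUM_LINES)]

    def read(self, addr):
        """Return the cached byte, or None on a miss."""
        tag, index, offset = split_address(addr)
        if self.tags[index] == tag and self.valid[index][offset]:
            return self.data[index][offset]
        return None

    def fill(self, addr, value):
        """Store one byte, invalidating the line first if the tag differs."""
        tag, index, offset = split_address(addr)
        if self.tags[index] != tag:
            self.tags[index] = tag
            self.valid[index] = [False] * LINE_BYTES
        self.data[index][offset] = value
        self.valid[index][offset] = True
```

The per-byte valid bits mean a single-byte write can be cached without first fetching the other three bytes of the line.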

Debugging the cache was a pain. I ended up writing several programs to
exercise and test the memory in various ways; when I found a fault it often
took some head-scratching to determine if it was a bug in the hardware or the
software! This is doubly hard when the software is itself executing from
unreliable memory, so I added a 4K block of SRAM to the system; I use the MMU
to map that wherever I want and store the memory test program in there.

The Xilinx synthesis tools tell me my design is good for about 64MHz. I've
always run it at 128MHz without problems -- I think the critical timing paths
are probably on the address bus and so have an extra cycle to propagate.
I've not figured out how to tell the Xilinx tools about this nor how to
interpret their output to understand which are the critical paths. The Z80 is
rather fast at 128MHz and the cache all but eliminates the need for
wait states.

Once I had the hardware working I had a lot of fun writing software for it,
extending the hardware capabilities as the software grew more sophisticated.

I wrote a CP/M-2.2 BIOS and got CP/M running. There's so much RAM in the
system that I just used the top 6MB as three 2MB RAM disks, which hugely
simplified writing storage drivers (you just map the relevant page of RAM
disk into the address space, copy the data, and then map back the original
page -- this was before I implemented the "17th page" trick). I wrote SPI
master hardware and some routines in the monitor ROM to copy the RAM disk
contents to and from the on-board 8MB SPI flash for persistent storage.
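
As a rough illustration of why the RAM disk driver is so simple, the sketch below works out which 4KB physical page holds a given 128-byte CP/M record. It assumes the three 2MB disks are packed linearly into the top 6MB of the 8MB SDRAM; that layout and the function name are guesses for illustration, and the real BIOS may differ.

```python
# Hypothetical sketch: locate a 128-byte CP/M record inside a RAM disk.
# Assumes three 2MB RAM disks packed into the top 6MB of an 8MB SDRAM and a
# simple linear record layout; the real BIOS may be organised differently.

SDRAM_BYTES = 8 * 1024 * 1024
DISK_BYTES = 2 * 1024 * 1024
NUM_DISKS = 3
RAMDISK_BASE = SDRAM_BYTES - NUM_DISKS * DISK_BYTES   # 0x200000
RECORD_BYTES = 128
PAGE_BYTES = 4096


def record_location(disk, record):
    """Return (physical_page, offset_in_page) for a record on a RAM disk."""
    if not 0 <= disk < NUM_DISKS:
        raise ValueError("no such RAM disk")
    if not 0 <= record < DISK_BYTES // RECORD_BYTES:
        raise ValueError("record out of range")
    phys = RAMDISK_BASE + disk * DISK_BYTES + record * RECORD_BYTES
    return phys // PAGE_BYTES, phys % PAGE_BYTES
```

Because 4096 is a multiple of 128, a record never straddles a page boundary, so mapping a single page is always enough for one transfer.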

Once I had CP/M working I wrote an MP/M-II XIOS and got MP/M-II running. I
added a second UART and a simple interval timer and got interrupt-driven
serial and pre-emptive multitasking working. I was really very impressed with
MP/M-II; I had not realised that these Z80 systems could multitask and
support multiple concurrent users (and all before I was even born!)

Once I had that working I got a bit ambitious and decided to port UZI, Doug
Braun's 8-bit UNIX-like operating system. There's little or no documentation
so this was harder than writing the BIOS/XIOS where there is a clear
specification of what you need to do. I started with the P112 UZI-180 port
which uses the Hi-Tech CP/M C compiler. I ported the kernel to ANSI C and
made it build with the modern SDCC compiler, added drivers for my MMU, UART,
RAM disk, an SD card interface, and removed the Z180 instructions. I modified
the context switching mechanism to make it much more efficient by eliminating
all the memory copying. I also increased the amount of memory available to
processes -- a native UZI process can use up to 0xF900 (62.25KB) and a CP/M
process running under emulation has a 60KB TPA (larger than under real CP/M!)

I've not yet come up with a good solution for building UZI userspace
applications. I'd really like to write a Z-machine interpreter that runs
under UZI and can play the newer Z-machine formats (eg versions 5 and 8).
These require much more memory than fits in the 64K logical address space and
I can't come up with a clean way to expose the MMU's abilities through
standard UNIX system calls (neither mmap nor shared memory segments are a
good match).

Anyway, if anyone else has a Papilio Pro and is interested, please let me
know and I'll send you an FPGA bit stream so you can have a play yourself. If
there's interest I'd consider tidying up the source code enough to share it.

I've recently ordered one of John's Mark IV SBC boards and I'm hoping to
assemble that and then port my UZI kernel over -- that software was the most
work and I'd love to see it used. The Z180 MMU is less capable than my
synthetic hardware but I believe I've come up with a workable scheme.

Has anyone considered building an ECB board to carry an FPGA? You'd need some
tri-state buffers that handle level shifting down to 3.3V on the FPGA side,
but the right design would allow you to implement either peripherals or a CPU
on the FPGA (or both). FPGAs are generally in TQFP or BGA packages but a few
rows of 0.1" header pins would allow you to plug on a Papilio or Pipistrello
daughterboard. I do realise this is somewhat contrary to the N8VEM philosophy
of using original parts, but it might be fun to have one board that can be
reconfigured quickly to work as a Z80, 68K, MIPS, ARM or even a bespoke CPU
board that talks to your N8VEM peripherals.

Hope someone found something interesting buried in there.

Will

_________________________________________________________________________
William R Sowerbutts will-***@public.gmane.org
"Carpe post meridiem" http://sowerbutts.com
main(){char*s=">#=0> ^#X@#@^7=",c=0,m;for(;c<15;c++)for
(m=-1;m<7;putchar(m++/6&c%3/2?10:s[c]-31&1<<m?42:32));}
James Moxham
2014-04-26 14:21:46 UTC
Hi Will,

All I can say is... Wow!

This is all the things we are trying to do - and you have done them all!

Ok, technical question regarding the MMU. Say you map 16 4k blocks in
order and that is the first 64k, and now you want to shuffle them all up
by one 4k block? Would you need to write out 16 new block addresses, or
could you do it with one instruction?

I'm thinking of how you would use such an advanced MMU system to emulate
simpler systems with larger block sizes (including the various N8VEM
boards).

I'm bursting with more questions about your work.

Are you able to write this up and/or share code by any chance?

Cheers, James Moxham


William R Sowerbutts
2014-04-26 16:05:43 UTC
Post by James Moxham
This is all the things we are trying to do - and you have done them all!
Ah, but I assure you, there is PLENTY of scope to do them better! :)
Post by James Moxham
Ok, technical question regarding the MMU. Say you map 16 4k blocks in
order and that is the first 64k, and now you want to shuffle them all
up by one 4k block? Would you need to write out 16 new block
addresses, or could you do it with one instruction?
It's unlikely you'd want to shuffle the whole map upwards by 4K. Apart from
anything else you'd be remapping the page containing currently executing
code. You'd need to rewrite all 16 page mappings. You're more likely to
entirely replace a subset rather than shuffle around stuff you can already
access.

In principle you could update all 16 mappings at once but that would consume
more hardware resources -- you'd need 16 adders to compute the 16 addresses
from the base address supplied, or a little state machine to count up as you
updated them one by one. I think you'd find updating each page separately is
simpler, more flexible, and doesn't really take very long.

Assuming your code was executing in the frame starting at 0xF000 and you
wanted to remap the lower 60KB of memory, your code would look like the code
below, which is a slightly modified version of my code for remapping process
memory in UZI:

-------------------------------------------------------
MMU_SELECT      .equ 0xF8
MMU_PAGE17      .equ 0xFA
MMU_PERM        .equ 0xFB
MMU_FRAMEHI     .equ 0xFC
MMU_FRAMELO     .equ 0xFD

; map lower 60KB of virtual address space
; first physical page number is in HL
mmu_map_process:
        ; examine 0 page, see if we're already mapped in
        xor a
        out (MMU_SELECT), a
        ; if HL is the mapping already loaded, abort early
        in a, (MMU_FRAMEHI)
        cp h
        jr nz, mmu_map_process_go
        in a, (MMU_FRAMELO)
        cp l
        ret z
mmu_map_process_go:
        ; now setup the MMU. frame 0 is already selected.
        ld b, #15       ; remap lower 15 pages (0x0000...0xEFFF)
        ld c, #1        ; this is the next frame to be remapped
mmu_map_process_next:
        ; write the new translation
        ld a, h
        out (MMU_FRAMEHI), a
        ld a, l
        out (MMU_FRAMELO), a
        ; select the next frame
        ld a, c
        out (MMU_SELECT), a
        inc c
        inc hl
        djnz mmu_map_process_next
        ret
-------------------------------------------------------

I should probably explain the MMU operation in a little more detail.

The MMU takes a 16-bit virtual address from the CPU and divides it into a
4-bit frame number and a 12-bit offset. The frame number is used as an index
into an array of sixteen registers which contain the hardware page numbers
(the translation table). The physical address is then formed from the
hardware page number concatenated with the 12-bit offset. My hardware page
numbers are 14 bits long because I wanted a 64MB physical address space but
you could use any length you wished.

So if the virtual address 0xABCD is accessed, we'd divide that into frame
number 0xA (decimal 10), offset 0xBCD. If MMU translation register 10
contains 0x1234 then the translated physical address would be 0x1234BCD.
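
The worked example above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not the hardware; the table contents other than the one entry from the example are made up.

```python
# Minimal model of the 4K paged translation described above.
# The table contents are made up for illustration.

PAGE_SHIFT = 12                       # 4KB pages -> 12-bit offset
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def translate(vaddr, table):
    """Map a 16-bit virtual address to a physical address.

    table is a list of sixteen physical page numbers, indexed by frame.
    """
    frame = (vaddr >> PAGE_SHIFT) & 0xF
    offset = vaddr & OFFSET_MASK
    return (table[frame] << PAGE_SHIFT) | offset


table = list(range(16))               # identity mapping as a starting point
table[0xA] = 0x1234                   # frame 0xA -> physical page 0x1234
print(hex(translate(0xABCD, table)))  # 0x1234bcd
```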

My MMU is programmed using 8 I/O ports in the range 0xF8 through 0xFF (all my
peripherals get a block of 8 ports).

The port at 0xF8 is effectively a mux which selects the function of ports
0xFB through 0xFF.

Writing 0x00 through 0x0F to the function register at 0xF8 allows you to
read/write one of the 16 MMU translation registers. With one of these selected:
- 0xFC contains the high byte of the physical page number
- 0xFD contains the low byte of the physical page number
- 0xFB contains permission bits (read/write/execute, currently ignored)

So updating a mapping generally requires just three I/O writes: one to 0xF8
to select which frame to modify, and one each to 0xFC and 0xFD to write out
the new translation. The permission bits are programmable but currently
ignored; I had planned to add some level of memory protection to UZI one day.
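
For anyone who finds a software model easier to follow, here is a hedged Python sketch of the select-then-write protocol just described. The port numbers come from the text; the class, the identity map at reset, and the decision to store all 16 bits of the page number are assumptions of the sketch rather than facts about the hardware.

```python
# Hypothetical software model of the MMU's programming ports.
# Port numbers follow the description above; the class itself is illustrative.

MMU_SELECT, MMU_PERM, MMU_FRAMEHI, MMU_FRAMELO = 0xF8, 0xFB, 0xFC, 0xFD


class MMUPorts:
    def __init__(self):
        self.selected = 0
        self.pages = list(range(16))     # assumed reset state: identity map
        self.perms = [0] * 16            # stored but ignored, as in the text

    def out(self, port, value):
        """Model an OUT to one of the MMU ports (frame registers only)."""
        value &= 0xFF
        if port == MMU_SELECT:
            self.selected = value
        elif self.selected < 16:
            f = self.selected
            if port == MMU_FRAMEHI:
                self.pages[f] = (self.pages[f] & 0x00FF) | (value << 8)
            elif port == MMU_FRAMELO:
                self.pages[f] = (self.pages[f] & 0xFF00) | value
            elif port == MMU_PERM:
                self.perms[f] = value


def map_frame(mmu, frame, page):
    """Three I/O writes: select the frame, then high and low page bytes."""
    mmu.out(MMU_SELECT, frame)
    mmu.out(MMU_FRAMEHI, page >> 8)
    mmu.out(MMU_FRAMELO, page & 0xFF)
```

Calling `map_frame(mmu, 10, 0x1234)` leaves frame 10 pointing at physical page 0x1234 and every other frame untouched.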

The "17th page" was a bit of a hack I added. The lazy programmer in me finds
it much more convenient to sometimes access memory without remapping a frame;
in particular you don't have to choose which frame to remap so as to avoid
remapping the memory pointed to by PC, SP or your source/target pointer.

Writing 0xFF to port 0xF8 selects the "17th page pointer". This is a 26-bit
register but again could be wider/narrower as required. With this selected
ports 0xFC, 0xFD, 0xFE, 0xFF are the register contents (with the high byte in
0xFC, low byte in 0xFF).

When the CPU reads or writes the I/O port 0xFA the MMU translates the I/O
operation into a memory operation. The physical memory address accessed is
the address contained in the 17th page pointer register. After the memory
operation completes the 17th page pointer register is incremented so that
repeated accesses to 0xFA walk forward through memory.
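
The auto-incrementing pointer can be modelled in a few lines of Python. This is a sketch under assumptions: physical memory is just a bytearray, only the ports mentioned above are handled, and the class and helper names are invented for illustration.

```python
# Hypothetical model of the "17th page" pointer and its data port (0xFA).
# Physical memory is a plain bytearray here; sizes are for illustration.

MMU_SELECT, MMU_PAGE17 = 0xF8, 0xFA
PTR_PORTS = (0xFC, 0xFD, 0xFE, 0xFF)    # high byte first, low byte last
PTR_MASK = (1 << 26) - 1                # 26-bit pointer


class Page17:
    def __init__(self, memory):
        self.memory = memory
        self.select = 0
        self.pointer = 0

    def out(self, port, value):
        value &= 0xFF
        if port == MMU_SELECT:
            self.select = value
        elif port == MMU_PAGE17:
            # I/O write becomes a memory write, then the pointer advances
            self.memory[self.pointer] = value
            self.pointer = (self.pointer + 1) & PTR_MASK
        elif self.select == 0xFF and port in PTR_PORTS:
            shift = 8 * (3 - PTR_PORTS.index(port))
            cleared = self.pointer & ~(0xFF << shift)
            self.pointer = (cleared | (value << shift)) & PTR_MASK

    def inp(self, port):
        if port != MMU_PAGE17:
            raise ValueError("only the data port is modelled for reads")
        value = self.memory[self.pointer]
        self.pointer = (self.pointer + 1) & PTR_MASK
        return value


def block_read(mmu, phys_addr, count):
    """Set the pointer, then read count bytes, much as INIR would."""
    mmu.out(MMU_SELECT, 0xFF)
    for i, port in enumerate(PTR_PORTS):
        mmu.out(port, (phys_addr >> (8 * (3 - i))) & 0xFF)
    return bytes(mmu.inp(MMU_PAGE17) for _ in range(count))
```

None of the sixteen translation registers is touched, which is the point: the code, stack and buffers stay mapped where they are during the copy.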
Post by James Moxham
I'm thinking of how you would use such an advanced MMU system to
emulate simpler systems with larger block sizes (including the
various N8VEM boards).
It's easy to emulate systems with larger page sizes as you can see. Pages
smaller than 4K are not useful IMHO.
Post by James Moxham
I'm bursting with more questions about your work.
Ask away. I'm pleased someone else is interested!
Post by James Moxham
Are you able to write this up and/or share code by any chance?
Yes, it's largely a case of finding the time and tidying up the more
egregious embarrassing mistakes! There's quite a lot of messy logic in there
to deal with a mechanism I implemented to slow the CPU down by disabling the
CPU clock some proportion of the time while the peripherals continue to run
at the same speed (slowing them down would break the UART baud rate generator
or the SDRAM timing and refresh interval, for example).

Anyway I've attached a copy of the MMU code from before I added the 17th page
pointer, as well as the more complex version. The former is much easier to
understand, I hope.

Will

--
You received this message because you are subscribed to the Google Groups "N8VEM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to n8vem+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To post to this group, send email to n8vem-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
Visit this group at http://groups.google.com/group/n8vem.
For more options, visit https://groups.google.com/d/optout.
Nikolay Dimitrov
2014-04-26 16:54:12 UTC
Hi Will,
Post by William R Sowerbutts
The PPro is a great board and I thoroughly recommend it.
Yep, it's cool. The FPGA has 64 KiB BRAM, so you can easily put there
any standard 8-bit system.
Post by William R Sowerbutts
I also have a Pipistrello FPGA board which is based on the same Papilio form
factor.
This is nice board. I was looking at it, but couldn't find a good way to
buy.
Post by William R Sowerbutts
Xilinx have a "data2mem" tool that you can use to quickly replace the data
loaded into a block RAM without resynthesising the FPGA design (which is
rather slow), so you can assemble your monitor program, use data2mem to have
the code loaded into block RAM, then reprogram the FPGA which will run the
code when it comes out of reset.
Do you know whether this tool works with inferred BRAMs?
Post by William R Sowerbutts
There's also what I call the "17th
page" which allows you to access physical memory addresses without mapping
them into the CPU address space -- it has a 26-bit pointer in the MMU and an
I/O port that translates I/O cycles into memory cycles, automatically
incrementing the pointer after each cycle so you can use the INIR instruction
with it to do block copies of unmapped physical memory to/from mapped memory.
Nice, I really like this idea!
Post by William R Sowerbutts
The Xilinx synthesis tools tell me my design is good for about 64MHz. I've
always run it at 128MHz without problems -- I think the critical timing paths
are probably on the address bus and so have an extra cycle to propagate.
I've not figured out how to tell the Xilinx tools about this nor how to
interpret their output to understand which are the critical paths.
I've seen a critical path in the Timing Report, but for the older designs
on Spartan-3E. I checked for Spartan-6, but there's no good info for the
critical path there. Maybe I'm looking at the wrong place.
Post by William R Sowerbutts
Anyway, if anyone else has a Papilio Pro and is interested, please let me
know and I'll send you an FPGA bit stream so you can have a play yourself. If
there's interest I'd consider tidying up the source code enough to share it.
That would be great. I have both the PPro and interest in trying your FPGA
build. Also, the code would be definitely interesting to see. Also, if you
have pictures of your system running, don't be shy and share!
Post by William R Sowerbutts
Has anyone considered building a ECB board to carry an FPGA?
We had some discussions about this here, but not solid answer yet. Feel free
to share any ideas and experiments you've done!

Regards,
Nikolay
William R Sowerbutts
2014-04-26 17:15:59 UTC
Post by Nikolay Dimitrov
Post by William R Sowerbutts
The PPro is a great board and I thoroughly recommend it.
Yep, it's cool. The FPGA has 64 KiB BRAM, so you can easily put there
any standard 8-bit system.
Indeed. My aim was to use as little BRAM as possible -- I think I used about
a third of the LX9 BRAMs. I was saving the rest for video memory but I never
got around to implementing this.
Post by Nikolay Dimitrov
Post by William R Sowerbutts
I also have a Pipistrello FPGA board which is based on the same Papilio form
factor.
This is nice board. I was looking at it, but couldn't find a good way
to buy.
Just write to magnus-0SoenXc1oMtWk0Htik3J/***@public.gmane.org and he'll sort you out. I found him to be
very helpful.
Post by Nikolay Dimitrov
Post by William R Sowerbutts
Xilinx have a "data2mem" tool that you can use to quickly replace the data
loaded into a block RAM without resynthesising the FPGA design (which is
rather slow), so you can assemble your monitor program, use data2mem to have
the code loaded into block RAM, then reprogram the FPGA which will run the
code when it comes out of reset.
Do you know whether this tool works with inferred BRAMs?
Yes it does, providing it infers a synchronous RAM (you want this anyway).
You need to write a "BMM" file which includes the name of the BRAM and the
address space which you've mapped to it. You can find the name by digging
around in the synthesised project in ISE. You include this BMM in your
project and when it places/routes the project it generates a second BMM file
which describes the specific BRAM resources in which your memory is stored. I
found it painful to make work, probably my unfamiliarity with the tools, but
an absolute blessing once I had it.
Post by Nikolay Dimitrov
Post by William R Sowerbutts
Anyway, if anyone else has a Papilio Pro and is interested, please let me
know and I'll send you an FPGA bit stream so you can have a play yourself. If
there's interest I'd consider tidying up the source code enough to share it.
That would be great. I have both the PPro and interest in trying your FPGA
build. Also, the code would be definitely interesting to see. Also, if you
have pictures of your system running, don't be shy and share!
I'll put something together for you when I have a little free time.

W

James Moxham (Dr_Acula)
2014-04-27 01:06:05 UTC
I think I am getting the hang of what you are doing. Very clever. I hope it
is ok if I copy your MMU - maybe slightly simplified?





MMU_SELECT .equ 0xF8
MMU_PAGE17 .equ 0xFA
MMU_PERM .equ 0xFB
MMU_FRAMEHI .equ 0xFC
MMU_FRAMELO .equ 0xFD

So just to check, if you had an MP/M machine and the top 16k didn't change,
and you wanted to map in a new user in the lower 48K, that would be 12
translation registers to change, so 24 numbers to send out? Plus the
selects?

Ok, Grant's board only has 512k, which is 128 pages of 4k, so a translation
register needs only 7 bits; FRAMEHI and bit 7 of FRAMELO won't change. But for
code compatibility I may as well leave it the same as your code, as fpga
computers run so fast.
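
A quick back-of-envelope check of these figures, as a small Python sketch. The helper names are invented here, and the write count assumes one select write plus two page-number bytes per frame, as in the three-write scheme quoted below.

```python
# Back-of-envelope figures for a 4KB-page MMU; the helper names are made up.

PAGE_BYTES = 4 * 1024


def page_number_bits(ram_bytes):
    """Bits needed to number every 4KB page in ram_bytes of memory."""
    pages = ram_bytes // PAGE_BYTES
    return (pages - 1).bit_length()


def io_writes(frames, bytes_per_page_number):
    """One select write plus the page-number bytes for each frame."""
    return frames * (1 + bytes_per_page_number)


print(page_number_bits(512 * 1024))        # 7  (128 pages)
print(page_number_bits(64 * 1024 * 1024))  # 14 (16384 pages)
print(io_writes(12, 2))                    # 36 writes to remap the lower 48K
```

With a 512k RAM only the low byte is needed, so `io_writes(12, 1)` gives 24 writes for the same remap.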



So updating a mapping generally requires just three I/O writes: One to
0xF8 to select which frame to modify, and one each to 0xFC and 0xFD to
write out the new translation

So when writing to this, it is like a little ram chip with a 16 bit data
bus and a 4 bit address bus.

I think I can start to see how to turn this into vhdl.

I like this memory model - it is more flexible but yet fully backwards
compatible. It probably would take quite a few TTL chips to do in hardware
- this is where fpga chips really are useful :)
James Moxham (Dr_Acula)
2014-04-27 12:07:24 UTC
Hi Will,

I think it is working! I simplified your MMU a bit - I have kept the low
address, but the high address port isn't needed as the ram chip is not big
enough.

This is my little MBASIC program - pick a block, write to it, then change
the block, read it back, then restore it and check the original number is
still there. Do it several times.

Ok
LIST
10 FOR A=0 TO 10
20 POKE &H9000,A
30 PRINT PEEK (&H9000);
40 OUT &HF8,9
50 OUT &HFD,127
60 POKE &H9000,A+30
70 PRINT PEEK (&H9000);
80 OUT &HF8,9
90 OUT &HFD,0
100 PRINT PEEK (&H9000)
110 NEXT A
Ok
RUN
0 30 0
1 31 1
2 32 2
3 33 3
4 34 4
5 35 5
6 36 6
7 37 7
8 38 8
9 39 9
10 40 10
Ok

The mmu entity is below, and then my working vhdl program to test it.

-- Memory management unit

-- based on code by Will Sowerbutts
-- designed for Grant Searle's minicomputer
-- The Z80 has 64k of ram which is divided up into 16 x 4k blocks
-- each block has an address which maps this block to a physical location
-- in external ram

--MMU_SELECT .equ 0xF8 -- use 4 bits to select which one of 16 4k blocks to change
--MMU_PAGE17 .equ 0xFA -- not used
--MMU_PERM .equ 0xFB -- not used
--MMU_FRAMEHI .equ 0xFC -- not used as ram not big enough!
--MMU_FRAMELO .equ 0xFD -- use 7 bits to remap to one of 128 locations in sram


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_unsigned.all;

entity MMU is
port (
n_wr : in std_logic;
n_rd : in std_logic;
n_reset: in std_logic;
dataIn : in std_logic_vector(7 downto 0); -- data from the cpu
cpuAddress: in std_logic_vector(15 downto 0); -- address from the cpu
mmuAddressOut : out std_logic_vector(18 downto 0) -- new address sent back
to the chip
);

end MMU;

architecture rtl of MMU is
type mmu_entry_type is
record
frame: std_logic_vector(6 downto 0); -- 128 blocks of 4k in a 512k ram
end record;
type mmu_entry_array is array(natural range <>) of mmu_entry_type;
signal mmu_entry : mmu_entry_array(0 to 15);
signal cpu_entry_select : std_logic_vector(7 downto 0); -- MMU_SELECT .equ
0xF8
alias page_number : std_logic_vector( 3 downto 0) is cpuAddress(15 downto
12); -- break up the incoming virtual address
alias page_offset : std_logic_vector(11 downto 0) is cpuAddress(11 downto
0);

begin
mmuAddressOut <= mmu_entry(to_integer(unsigned(page_number))).frame &
page_offset; -- decode address
process (n_wr, n_reset) begin -- reset
if (n_reset = '0') then
mmu_entry( 0).frame <= "0000000"; -- set first 64k to startup values
mmu_entry( 1).frame <= "0000001";
mmu_entry( 2).frame <= "0000010";
mmu_entry( 3).frame <= "0000011";
mmu_entry( 4).frame <= "0000100";
mmu_entry( 5).frame <= "0000101";
mmu_entry( 6).frame <= "0000110";
mmu_entry( 7).frame <= "0000111";
mmu_entry( 8).frame <= "0001000";
mmu_entry( 9).frame <= "0001001";
mmu_entry(10).frame <= "0001010";
mmu_entry(11).frame <= "0001011";
mmu_entry(12).frame <= "0001100";
mmu_entry(13).frame <= "0001101";
mmu_entry(14).frame <= "0001110";
mmu_entry(15).frame <= "0001111";
elsif (rising_edge(n_wr)) then -- write to ports
case cpuAddress(2 downto 0) is
when "000" =>
cpu_entry_select <= dataIn; -- MMU_SELECT .equ 0xF8
--when "011" =>
-- mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).can_read
<= cpu_data_in(0);
-- mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).can_write
<= cpu_data_in(1);
--when "100" =>
-- mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).frame(13
downto 0) <= cpu_data_in(5 downto 0) &
mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).frame(7
downto 0);
when "101" => --MMU_FRAMELO .equ 0xFD
mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).frame(6
downto 0) <= dataIn(6 downto 0);
when others =>
end case;
end if;
end process;


end rtl;

and the microcomputer code

-- This file is copyright by Grant Searle 2014
-- You are free to use this file in your own projects but must never charge
-- for it nor use it without acknowledgement.
-- Please ask permission from Grant Searle before republishing elsewhere.
-- If you use this file or any part of it, please add an acknowledgement to
-- myself and a link back to my main web site http://searle.hostei.com/grant/
-- and to the "multicomp" page at
-- http://searle.hostei.com/grant/Multicomp/index.html
--
-- Please check on the above web pages to see if there are any updates
-- before using this file.
-- If for some reason the page is no longer available, please search for
-- "Grant Searle" on the internet to see if I have moved to another web
-- hosting service.
--
-- Grant Searle
-- eMail address available on my main web page link above.

library ieee;
use ieee.std_logic_1164.all;
use IEEE.STD_LOGIC_ARITH.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity Microcomputer is
port(
n_reset : in std_logic;
clk : in std_logic;

sramData : inout std_logic_vector(7 downto 0);
sramAddress : out std_logic_vector(18 downto 0);
n_sRamWE : out std_logic;
n_sRamCS : out std_logic;
n_sRamOE : out std_logic;

rxd1 : in std_logic;
txd1 : out std_logic;
rts1 : out std_logic;

rxd2 : in std_logic;
txd2 : out std_logic;
rts2 : out std_logic;

videoSync : out std_logic;
video : out std_logic;

videoR0 : out std_logic;
videoG0 : out std_logic;
videoB0 : out std_logic;
videoR1 : out std_logic;
videoG1 : out std_logic;
videoB1 : out std_logic;
hSync : out std_logic;
vSync : out std_logic;

ps2Clk : inout std_logic;
ps2Data : inout std_logic;

sdCS : out std_logic;
sdMOSI : out std_logic;
sdMISO : in std_logic;
sdSCLK : out std_logic;

realworldPinsA : inout std_logic_vector(7 downto 0);
realworldPinsB : inout std_logic_vector(7 downto 0);
realworldPinsC : inout std_logic_vector(7 downto 0);

driveLED : out std_logic :='1'

);
end Microcomputer;

architecture struct of Microcomputer is

signal n_WR : std_logic;
signal n_RD : std_logic;
signal cpuAddress : std_logic_vector(15 downto 0);
signal cpuDataOut : std_logic_vector(7 downto 0);
signal cpuDataIn : std_logic_vector(7 downto 0);

signal basRomData : std_logic_vector(7 downto 0);
signal internalRam1DataOut : std_logic_vector(7 downto 0);
signal internalRam2DataOut : std_logic_vector(7 downto 0);
signal interface1DataOut : std_logic_vector(7 downto 0);
signal interface2DataOut : std_logic_vector(7 downto 0);
signal sdCardDataOut : std_logic_vector(7 downto 0);
signal realworldDataOut : std_logic_vector(7 downto 0);

signal n_memWR : std_logic :='1';
signal n_memRD : std_logic :='1';

signal n_ioWR : std_logic :='1';
signal n_ioRD : std_logic :='1';

signal n_MREQ : std_logic :='1';
signal n_IORQ : std_logic :='1';

signal n_int1 : std_logic :='1';
signal n_int2 : std_logic :='1';

signal n_externalRamCS : std_logic :='1';
signal n_internalRam1CS : std_logic :='1';
signal n_internalRam2CS : std_logic :='1';
signal n_basRomCS : std_logic :='1';
signal n_interface1CS : std_logic :='1';
signal n_interface2CS : std_logic :='1';
signal n_sdCardCS : std_logic :='1';
signal n_realworldCS :std_logic := '1';
signal n_mmuCS : std_logic := '1';

signal serialClkCount : std_logic_vector(15 downto 0);
signal cpuClkCount : std_logic_vector(5 downto 0);
signal sdClkCount : std_logic_vector(5 downto 0);
signal cpuClock : std_logic;
signal serialClock : std_logic;
signal sdClock : std_logic;
signal n_RomActive : std_logic := '0';

begin
-- Disable ROM if out 38. Re-enable when (asynchronous) reset pressed
process (n_ioWR, n_reset) begin
if (n_reset = '0') then
n_RomActive <= '0';
elsif (rising_edge(n_ioWR)) then
if cpuAddress(7 downto 0) = "00111000" then -- $38
n_RomActive <= '1';
end if;
end if;
end process;
-- ____________________________________________________________________________________
-- CPU CHOICE GOES HERE
cpu1 : entity work.t80s
generic map(mode => 1, t2write => 1, iowait => 0)
port map(
reset_n => n_reset,
clk_n => cpuClock,
wait_n => '1',
int_n => '1',
nmi_n => '1',
busrq_n => '1',
mreq_n => n_MREQ,
iorq_n => n_IORQ,
rd_n => n_RD,
wr_n => n_WR,
a => cpuAddress,
di => cpuDataIn,
do => cpuDataOut);
-- ____________________________________________________________________________________
-- ROM GOES HERE
rom1 : entity work.Z80_CPM_BASIC_ROM -- 8KB BASIC
port map(
address => cpuAddress(12 downto 0),
clock => clk,
q => basRomData
);
-- ____________________________________________________________________________________
-- RAM GOES HERE

MemoryManagement : entity work.MMU
port map(
n_wr => n_mmuCS or n_ioWR, -- group of 8 ports to write data to the MMU
n_rd => n_mmuCS or n_ioRD, -- future use, for reading data back from the MMU
n_reset => n_reset, -- reset line
dataIn => cpuDataOut, -- data lines to send commands to the MMU
cpuAddress => cpuAddress, -- cpu address lines to the MMU
mmuAddressOut => sramAddress -- modified address lines from the MMU
);

-- original 5 lines of ram code below - with an MMU, comment out the first line
-- and instead send the cpu address to the mmu code, which returns 19 address lines

-- sramAddress(15 downto 0) <= cpuAddress(15 downto 0);
sramData <= cpuDataOut when n_memWR='0' else (others => 'Z');
n_sRamWE <= n_memWR or n_externalRamCS;
n_sRamOE <= n_memRD or n_externalRamCS;
n_sRamCS <= n_externalRamCS;
-- ____________________________________________________________________________________
-- INPUT/OUTPUT DEVICES GO HERE
io1 : entity work.bufferedUART
port map(
clk => clk,
n_wr => n_interface1CS or n_ioWR,
n_rd => n_interface1CS or n_ioRD,
n_int => n_int1,
regSel => cpuAddress(0),
dataIn => cpuDataOut,
dataOut => interface1DataOut,
rxClock => serialClock,
txClock => serialClock,
rxd => rxd1,
txd => txd1,
n_cts => '0',
n_dcd => '0',
n_rts => rts1
);

io2 : entity work.SBCTextDisplayRGB
port map (
n_reset => n_reset,
clk => clk,

-- RGB video signals
hSync => hSync,
vSync => vSync,
videoR0 => videoR0,
videoR1 => videoR1,
videoG0 => videoG0,
videoG1 => videoG1,
videoB0 => videoB0,
videoB1 => videoB1,

-- Monochrome video signals (when using TV timings only)
sync => videoSync,
video => video,

n_wr => n_interface2CS or n_ioWR,
n_rd => n_interface2CS or n_ioRD,
n_int => n_int2,
regSel => cpuAddress(0),
dataIn => cpuDataOut,
dataOut => interface2DataOut,
ps2Clk => ps2Clk,
ps2Data => ps2Data
);


sd1 : entity work.sd_controller
port map(
sdCS => sdCS,
sdMOSI => sdMOSI,
sdMISO => sdMISO,
sdSCLK => sdSCLK,
n_wr => n_sdCardCS or n_ioWR,
n_rd => n_sdCardCS or n_ioRD,
n_reset => n_reset,
dataIn => cpuDataOut,
dataOut => sdCardDataOut,
regAddr => cpuAddress(2 downto 0),
driveLED => driveLED,
clk => sdClock -- twice the spi clk
);

portio : entity work.realworld
port map(
n_wr => n_realworldCS or n_ioWR,
n_rd => n_realworldCS or n_ioRD,
n_reset => n_reset,
dataIn => cpuDataOut,
dataOut => realworldDataOut,
realworldPinsA => realworldPinsA,
realworldPinsB => realworldPinsB,
realworldPinsC => realworldPinsC,
regAddr => cpuAddress(2 downto 0) -- pass group of 8 ports
);
-- ____________________________________________________________________________________
-- MEMORY READ/WRITE LOGIC GOES HERE
n_ioWR <= n_WR or n_IORQ;
n_memWR <= n_WR or n_MREQ;
n_ioRD <= n_RD or n_IORQ;
n_memRD <= n_RD or n_MREQ;
-- ____________________________________________________________________________________
-- CHIP SELECTS GO HERE
n_basRomCS <= '0' when cpuAddress(15 downto 13) = "000" and n_RomActive = '0'
    else '1'; -- 8K at bottom of memory
n_interface1CS <= '0' when cpuAddress(7 downto 1) = "1000000" and (n_ioWR='0' or n_ioRD = '0')
    else '1'; -- 2 bytes $80-$81
n_interface2CS <= '0' when cpuAddress(7 downto 1) = "1000001" and (n_ioWR='0' or n_ioRD = '0')
    else '1'; -- 2 bytes $82-$83
n_sdCardCS <= '0' when cpuAddress(7 downto 3) = "10001" and (n_ioWR='0' or n_ioRD = '0')
    else '1'; -- 8 bytes $88-$8F
n_realworldCS <= '0' when cpuAddress(7 downto 3) = "01110" and (n_ioWR='0' or n_ioRD = '0')
    else '1'; -- 8 bytes $70-$77
n_mmuCS <= '0' when cpuAddress(7 downto 3) = "11111" and (n_ioWR='0' or n_ioRD = '0')
    else '1'; -- 8 bytes $F8-$FF
n_externalRamCS <= not n_basRomCS;
-- ____________________________________________________________________________________
-- BUS ISOLATION GOES HERE
cpuDataIn <=
interface1DataOut when n_interface1CS = '0' else
interface2DataOut when n_interface2CS = '0' else
sdCardDataOut when n_sdCardCS = '0' else
realworldDataOut when n_realworldCS = '0' else
basRomData when n_basRomCS = '0' else
internalRam1DataOut when n_internalRam1CS= '0' else
sramData when n_externalRamCS= '0' else
x"FF";
-- ____________________________________________________________________________________
-- SUB-CIRCUIT CLOCK SIGNALS
serialClock <= serialClkCount(15);
process (clk)
begin
if rising_edge(clk) then

if cpuClkCount < 1 then -- 4 = 10MHz, 3 = 12.5MHz, 2=16.6MHz, 1=25MHz
cpuClkCount <= cpuClkCount + 1;
else
cpuClkCount <= (others=>'0');
end if;
if cpuClkCount < 1 then -- 2 when 10MHz, 2 when 12.5MHz, 2 when 16.6MHz, 1 when 25MHz
cpuClock <= '0';
else
cpuClock <= '1';
end if;

if sdClkCount < 49 then -- 1MHz = 49
sdClkCount <= sdClkCount + 1;
else
sdClkCount <= (others=>'0');
end if;
if sdClkCount < 25 then -- 25 = 1Mhz
sdClock <= '0';
else
sdClock <= '1';
end if;

-- Serial clock DDS
-- 50MHz master input clock:
-- Baud Increment
-- 115200 2416
-- 38400 805
-- 19200 403
-- 9600 201
-- 4800 101
-- 2400 50
serialClkCount <= serialClkCount + 2416;
end if;
end process;


end;
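
The increments in the "Serial clock DDS" table above look like they follow the usual phase-accumulator relation, increment = f_out * 2^16 / f_clk, with f_out taken as 16 times the baud rate. The 16x factor is inferred from the numbers (115200 * 16 * 65536 / 50 MHz is about 2415.9, which rounds to 2416); the listing does not state it. A minimal sketch that lets the tools work out the constant is below. The entity name BaudDDS and its generics are made up for illustration, and it assumes the synthesis tool will evaluate ieee.math_real functions in constant expressions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;

-- Hypothetical stand-alone baud generator; not part of Grant Searle's files.
entity BaudDDS is
    generic (
        CLK_HZ     : real    := 50.0e6;  -- master clock, as in the table above
        BAUD       : natural := 115200;
        OVERSAMPLE : natural := 16       -- assumption: UART wants a 16x clock
    );
    port (
        clk         : in  std_logic;
        serialClock : out std_logic
    );
end BaudDDS;

architecture rtl of BaudDDS is
    -- increment = round(BAUD * OVERSAMPLE * 2^16 / CLK_HZ)
    constant INCREMENT : natural :=
        natural(round(real(BAUD * OVERSAMPLE) * 65536.0 / CLK_HZ));
    signal acc : unsigned(15 downto 0) := (others => '0');
begin
    -- The top bit of the accumulator toggles at the requested rate on average.
    serialClock <= acc(15);

    process (clk) begin
        if rising_edge(clk) then
            acc <= acc + INCREMENT;
        end if;
    end process;
end rtl;
```

With the default generics INCREMENT works out to 2416, and setting BAUD to 9600 gives 201, matching the table.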
William R Sowerbutts
2014-04-27 19:09:19 UTC
Hi James

Really pleased you have it working.

I'm far from an HDL or FPGA expert but I'd like to suggest that you should
make the parts of your circuit that update registers synchronous to a clock.
The clock should be the same clock that the CPU uses.

As I understand it the problem with using rising_edge(n_wr) is that the n_wr
signal in the FPGA is only really guaranteed to be stable on the rising edges
of the CPU clock. Using "rising_edge()" tells the synthesis tool that n_wr is a
clock signal and it should be used to clock the registers. However it's not a
nice square wave with 50% duty cycle. Far from it -- in between edges of the
CPU clock it may switch freely and frequently between states as the
various logic and LUTs that feed the signal receive new inputs and settle into
their new state. Some of these glitches may be long enough to trigger the
register to store state, even though the CPU isn't trying to do so.
Additionally I understand it makes it very hard for the synthesis tools to do
static timing analysis.

If you make everything synchronous to the CPU clock these problems just go
away; the register is clocked by the CPU clock directly, and some logic which
inspects n_wr and whatever else is required then generates an enable signal
to the register. Static timing analysis is done to ensure that all the
signals will propagate to the relevant inputs on the registers in good time
to meet their setup and hold requirements.

Essentially the synthesis tools and indeed the FPGA hardware itself are
optimised for the entire system to be synchronously clocked. I know
combinatorially generated clocks are common practice with discrete logic, in
part because it's difficult to distribute a clock across large boards and
backplanes without introducing significant skew, but this style of design
doesn't fit well with FPGAs.

So:
- you can read from registers in a combinatorial statement
- you should only ever write to a register from inside a synchronous process
- use as few clocks as possible (ideally exactly 1)

You want something like:

begin
    -- decode address
    mmuAddressOut <= mmu_entry(to_integer(unsigned(page_number))).frame & page_offset;

    process (clk) begin
        if rising_edge(clk) then
            if n_reset = '0' then
                -- ... reset state ...
            else
                if n_wr = '0' and my_chipselect = '1' then
                    -- ... write data from CPU to MMU registers
                end if;
            end if;
        end if;
    end process;
end;
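
Filled in for the MMU posted earlier in the thread, that outline might look like the sketch below. It is untested and rests on assumptions rather than being code from either system. The entity name MMU_sync and the clk port are invented here. clk is assumed to be the clock the CPU runs from. n_wr is assumed to arrive already gated with the MMU chip select (as in "n_mmuCS or n_ioWR"), so no separate chip-select input is shown. Because the write is level-sensitive, the register may be rewritten on several consecutive clock edges while n_wr is low; that is harmless as long as address and data stay stable for the whole strobe.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical synchronous variant of the MMU; names are illustrative only.
entity MMU_sync is
    port (
        clk           : in  std_logic;  -- assumed: same clock as the CPU
        n_reset       : in  std_logic;
        n_wr          : in  std_logic;  -- assumed: already gated with chip select
        dataIn        : in  std_logic_vector(7 downto 0);
        cpuAddress    : in  std_logic_vector(15 downto 0);
        mmuAddressOut : out std_logic_vector(18 downto 0)
    );
end MMU_sync;

architecture rtl of MMU_sync is
    type frame_array is array (0 to 15) of std_logic_vector(6 downto 0);
    signal frame        : frame_array;
    signal entry_select : unsigned(3 downto 0) := (others => '0');
begin
    -- Reading the table stays combinatorial.
    mmuAddressOut <= frame(to_integer(unsigned(cpuAddress(15 downto 12))))
                     & cpuAddress(11 downto 0);

    -- Every register update happens on a clk edge; n_wr is only an enable.
    process (clk) begin
        if rising_edge(clk) then
            if n_reset = '0' then
                entry_select <= (others => '0');
                for i in 0 to 15 loop
                    frame(i) <= std_logic_vector(to_unsigned(i, 7)); -- identity map
                end loop;
            elsif n_wr = '0' then
                case cpuAddress(2 downto 0) is
                    when "000"  => entry_select <= unsigned(dataIn(3 downto 0)); -- 0xF8
                    when "101"  => frame(to_integer(entry_select)) <= dataIn(6 downto 0); -- 0xFD
                    when others => null;
                end case;
            end if;
        end if;
    end process;
end rtl;
```

Note that the reset here is synchronous, so clk has to be running while n_reset is held low for the table to be initialised.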

Hope this helps.

Will
_________________________________________________________________________
William R Sowerbutts will-***@public.gmane.org
"Carpe post meridiem" http://sowerbutts.com
main(){char*s=">#=0> ^#X@#@^7=",c=0,m;for(;c<15;c++)for
(m=-1;m<7;putchar(m++/6&c%3/2?10:s[c]-31&1<<m?42:32));}
James Moxham (Dr_Acula)
2014-04-28 14:03:01 UTC
Hi Will,

I tried the code you suggested with the clock, but CP/M crashes. Looking at
other entities Grant uses, eg the uarts, they seem to all work with a
common set of code latching on the rising edge of n_wr. So this may be a
feature of this computer? I understand what you are saying about
synchronous clocks but not sure how to implement them in this design.

This is a little bit of code from within the uart, along with a comment
that maybe suggests this has been considered by the author?

process( n_wr )
begin
    if rising_edge(n_wr) then -- Standard CPU - capture data on trailing edge of wr
        if regSel='1' then
            if txByteWritten=txByteSent then
                txByteWritten <= not txByteWritten;
            end if;
            txByteLatch <= dataIn;
        else
            controlReg <= dataIn;
        end if;
    end if;
end process;
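
One common way to keep the "capture on the trailing edge of wr" behaviour while clocking everything from a single clock is to sample n_wr on each clock edge and look for its 0-to-1 transition. The sketch below is a generic pattern, not Grant's UART code. The entity WrEdgeCapture and its port names are hypothetical. It assumes clk is fast enough to see n_wr low on at least one rising edge, and that n_wr changes synchronously with clk (otherwise a synchroniser would be needed in front of it).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: synchronous trailing-edge capture of a write strobe.
entity WrEdgeCapture is
    port (
        clk       : in  std_logic;
        n_wr      : in  std_logic;                     -- active-low write strobe
        dataIn    : in  std_logic_vector(7 downto 0);
        dataLatch : out std_logic_vector(7 downto 0);  -- last byte written
        written   : out std_logic                      -- one-clk pulse per write
    );
end WrEdgeCapture;

architecture rtl of WrEdgeCapture is
    signal n_wr_prev : std_logic := '1';
    signal data_hold : std_logic_vector(7 downto 0) := (others => '0');
begin
    process (clk) begin
        if rising_edge(clk) then
            n_wr_prev <= n_wr;
            written   <= '0';
            if n_wr = '0' then
                data_hold <= dataIn;     -- track the bus while the strobe is low
            elsif n_wr_prev = '0' then
                dataLatch <= data_hold;  -- n_wr has just gone high: trailing edge
                written   <= '1';
            end if;
        end if;
    end process;
end rtl;
```

The data is taken from data_hold rather than straight from dataIn because the bus may no longer be valid by the first clock edge after n_wr has risen.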

Also I've got one rather odd bug in my MBASIC test code - it is not
replacing the block on the first try. The first time I run it I get 195 and
the second and subsequent times it all works fine. Printout below. I don't
know if there is something special about the first time a block is used.

Ok
LIST
10 FOR A=0 TO 10
20 POKE &H9000,A
30 PRINT PEEK (&H9000);
40 OUT &HF8,9
50 OUT &HFD,127
60 POKE &H9000,A+30
70 PRINT PEEK (&H9000);
80 OUT &HF8,9
90 OUT &HFD,0
100 PRINT PEEK (&H9000)
110 NEXT A
Ok
RUN
0 30 195
1 31 1
2 32 2
3 33 3
4 34 4
5 35 5
6 36 6
7 37 7
8 38 8
9 39 9
10 40 10
Ok

RUN
0 30 0
1 31 1
2 32 2
3 33 3
4 34 4
5 35 5
6 36 6
7 37 7
8 38 8
9 39 9
10 40 10
Ok
William R Sowerbutts
2014-04-28 15:00:15 UTC
James

The symptom you report is the type of marginal behaviour I'd expect.

FPGAs give the appearance of massive flexibility (after all, you can write a
description of any digital circuit in VHDL) but in fact the hardware and
software tools are highly optimised based on certain assumptions that your
design must meet (such as this clocking requirement). It's a bit dull but
ultimately a fair trade-off I think, since it lets you achieve higher speed
with those types of circuits you can reliably construct inside the FPGA.

Post by James Moxham (Dr_Acula)
I tried the code you suggested with the clock, but CP/M crashes.

Hmmmmm. Have you run it in the simulator to see if the circuit behaves as you
expect there?

When I was developing my system I found it really useful to drop some test
Z80 machine code into the VHDL ROM. The test code would exercise the hardware
I was working on as soon as the CPU came out of reset. I'd then run the whole
machine in the simulator and observe to check that the various signals inside
were doing what I expected. The simulator is much slower than real time so
you don't want to simulate booting the whole system up.

It's much easier to test in the simulator, fix all the bugs you can find, and
then do a (hopefully final) test on the FPGA device itself. Much less fun
than playing with the hardware but also less frustrating because you have
hugely better insight into what it's doing and you don't need to try and
infer why it's gone bonkers, you can just go look at what the signal states
were immediately before it went wrong and the VHDL code that determined the
next state.
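
One way to get a small test program into a VHDL ROM is to generate the initialiser text with a script. The following is only a sketch: the function name, the 16-byte ROM depth and the test program are hypothetical, and the port numbers 0xF8 and 0xFD are taken from James's MBASIC listing. It turns a few bytes of Z80 machine code into a VHDL array aggregate that could be pasted into a ROM constant.

```python
def vhdl_rom_literal(code: bytes, depth: int = 16) -> str:
    """Return a VHDL aggregate that initialises a ROM of `depth` bytes."""
    if len(code) > depth:
        raise ValueError("test program does not fit in the ROM")
    padded = code + bytes(depth - len(code))  # pad with 0x00, which is NOP on a Z80
    entries = [f'{addr} => x"{byte:02X}"' for addr, byte in enumerate(padded)]
    return "(\n    " + ",\n    ".join(entries) + "\n);"


# Hypothetical test: select MMU entry 9, point it at block 127, then halt.
test_program = bytes([
    0x3E, 0x09,  # LD A,9
    0xD3, 0xF8,  # OUT (0xF8),A
    0x3E, 0x7F,  # LD A,127
    0xD3, 0xFD,  # OUT (0xFD),A
    0x76,        # HALT
])

rom_text = vhdl_rom_literal(test_program)
print(rom_text)
```

Because the program halts after a handful of instructions, the simulation only needs to run for a few microseconds of simulated time.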

Will
James Moxham (Dr_Acula)
2014-04-29 01:29:30 UTC
Permalink
Hi Will,

I think it might have been the memory block I used - possibly overwriting
part of mbasic. The demo program below works - replacing block &HA which I
think is below CPM, above MBASIC and somewhere in the high part of the code
space for MBASIC.

I did some experiments along the way. Decreased the clock speed. Tried
different blocks to replace. I need to look into your idea of running it in
an emulation - I have no idea how to do that but it sounds clever. Is that
one of the menus in Quartus?

I am still at the very steep part of the vhdl learning curve. Your idea is
to clock everything on the clock pulse. But Grant's computer has a divider
circuit, so the clock goes to this

process (clk)
begin
    if rising_edge(clk) then
        if cpuClkCount < 1 then -- 4 = 10MHz, 3 = 12.5MHz, 2=16.6MHz, 1=25MHz
            cpuClkCount <= cpuClkCount + 1;
        else
            cpuClkCount <= (others=>'0');
        end if;
        if cpuClkCount < 1 then -- 2 when 10MHz, 2 when 12.5MHz, 2 when 16.6MHz, 1 when 25MHz
            cpuClock <= '0';
        else
            cpuClock <= '1';
        end if;

and then cpuClock is what is fed into the Z80

clk_n => cpuClock,

and out comes the Z80 pins

mreq_n => n_MREQ,
iorq_n => n_IORQ,
rd_n => n_RD,
wr_n => n_WR,

and these get decoded

n_ioWR <= n_WR or n_IORQ;
n_memWR <= n_WR or n_MREQ;
n_ioRD <= n_RD or n_IORQ;
n_memRD <= n_RD or n_MREQ;

and then decoded again eg

n_sdCardCS <= '0' when cpuAddress(7 downto 3) = "10001" and (n_ioWR='0' or
n_ioRD = '0') else '1';
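
As a worked example of the second decode stage, the comparison of address bits 7..3 with "10001" can be checked in a few lines. This is a sketch with hypothetical function names; it only restates the two expressions above (active-low signals are asserted when 0).

```python
def n_io_wr(n_wr: int, n_iorq: int) -> int:
    """Active-low I/O write strobe: low only when both n_WR and n_IORQ are low."""
    return n_wr | n_iorq


def sd_card_selected(port: int) -> bool:
    """True when address bits 7..3 equal "10001"; the low three bits are ignored."""
    return ((port >> 3) & 0b11111) == 0b10001


matching = [port for port in range(256) if sd_card_selected(port)]
print([hex(port) for port in matching])
```

With those five bits fixed, the select covers the eight I/O ports 0x88 to 0x8F.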

I am trying to understand how to clock things on a clock edge when there is
a divider circuit in there to slow down the CPU clock. If, say, you ran the
cpu at 1/10th the speed, and you clocked things 10x on a read or write
cycle? But would it matter even if you did?

The logic I tried was slightly different. Grant's computer samples on the
rising edge of the decoded n_wr line and the code I tried with the clock
rising edge was sampling on the clock rising edge and when the n_wr line
was low. It shouldn't make much difference though.
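
The difference between the two approaches can be sketched with a small model. This is an illustration under assumptions, not the behaviour of any particular core: the trace below is a made-up write cycle in which n_wr is seen low on 10 consecutive edges of the sampling clock. Acting on the level fires once per clock edge, while remembering the previous sample and acting on a 0-to-1 change fires once per write.

```python
def count_actions(n_wr_samples, mode):
    """Count how often a write action fires for a sampled n_wr trace."""
    fired = 0
    previous = 1  # n_wr idles high (inactive)
    for current in n_wr_samples:
        if mode == "level" and current == 0:
            fired += 1  # fires on every clock edge while n_wr is low
        elif mode == "edge" and previous == 0 and current == 1:
            fired += 1  # fires once, on the trailing edge of the write
        previous = current
    return fired


# One write cycle: n_wr is low for 10 samples of the faster clock.
trace = [1, 1] + [0] * 10 + [1, 1]
print(count_actions(trace, "level"))  # 10
print(count_actions(trace, "edge"))   # 1
```

For a plain register, storing the same byte 10 times is harmless provided the data bus is stable on every one of those edges. For logic that toggles or counts on each write it does matter, and the edge-detect form keeps the "once per write" behaviour while still using a single clock.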

It is all very confusing I have to confess.

Anyway, for me the next task is to get MP/M running :) This will really
test the MMU!

A>MBASIC MMU
BASIC-80 Rev. 5.21
[CP/M Version]
Copyright 1977-1981 (C) by Microsoft
Created: 28-Jul-81
29752 Bytes free
0 10 0
1 11 1
2 12 2
3 13 3
4 14 4
5 15 5
6 16 6
7 17 7
8 18 8
9 19 9
10 20 10
Ok
LIST
10 ADDRESS = &HA000
20 FOR A=0 TO 10
30 POKE ADDRESS,A
40 PRINT PEEK (ADDRESS);
50 FRAME = &HA:NEWBLOCK = 16:GOSUB 120: REM spare blocks are 16 to 127
60 POKE ADDRESS,A+10
70 PRINT PEEK(ADDRESS);
80 FRAME=&HA:NEWBLOCK=&HA:GOSUB 120
90 PRINT PEEK(ADDRESS)
100 NEXT A
110 END
120 REM output frame,newblock frame =0 to 15, newblock =0 to 127
130 OUT &HF8,FRAME
140 OUT &HFD,NEWBLOCK
150 RETURN
Ok
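
The expected behaviour of this test can be sketched with a small software model. Everything here is an assumption based on the listing: 16 frames of 4K, a select register at port 0xF8, a block register at port 0xFD, 128 physical blocks, and an identity mapping after reset. The class and function names are hypothetical.

```python
class MMUModel:
    """Behavioural model: 16 frames of 4K, each mapped to one of 128 blocks."""

    def __init__(self):
        self.frames = list(range(16))  # assumed reset state: identity map
        self.select = 0
        self.ram = {}                  # sparse physical RAM, unwritten bytes read 0

    def out(self, port, value):
        if port == 0xF8:               # choose which frame to remap
            self.select = value & 0x0F
        elif port == 0xFD:             # set the block for the selected frame
            self.frames[self.select] = value & 0x7F

    def translate(self, addr):
        return (self.frames[addr >> 12] << 12) | (addr & 0x0FFF)

    def poke(self, addr, value):
        self.ram[self.translate(addr)] = value & 0xFF

    def peek(self, addr):
        return self.ram.get(self.translate(addr), 0)


def run_demo():
    """Mirror the MBASIC loop: write, swap block 16 in, write, swap back."""
    mmu = MMUModel()
    address = 0xA000
    rows = []
    for a in range(11):
        mmu.poke(address, a)
        first = mmu.peek(address)
        mmu.out(0xF8, 0xA)
        mmu.out(0xFD, 16)
        mmu.poke(address, a + 10)
        second = mmu.peek(address)
        mmu.out(0xF8, 0xA)
        mmu.out(0xFD, 0xA)
        rows.append((first, second, mmu.peek(address)))
    return rows


rows = run_demo()
for row in rows:
    print(*row)
```

The model prints the same three columns as the run above: the original value, the value written into the swapped-in block, and the original value again once block &HA is restored.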
Max Scane
2014-04-29 03:08:03 UTC
Permalink
I would suggest setting up a RAM disk on CP/M 2 as a simpler test for the
MMU. At least you will have a working OS to troubleshoot with. Both CP/M
3 and MP/M rely on a working memory banking scheme before they will work
properly.

Cheers!

Max


yoda
2014-04-29 03:30:36 UTC
Permalink
You would clock on clk_n, as that is the CPU clock and that is what you need to synchronize to. It is irrelevant that it was divided n times before that. The CPU clock is the master clock. At least that is the way I read your snippet of code.

Dave
James Moxham
2014-04-29 03:41:16 UTC
Permalink
Ah, thanks Dave. That makes sense. I was clocking on the clock *before* it
was divided. I'll test it using the CPU clock. Cheers, James
James Moxham (Dr_Acula)
2014-04-29 13:56:11 UTC
Permalink
Getting closer...

Ok, old MMU code post #7 works fine. But it doesn't work on the clock
edges so need to modify slightly.

So..

Yoda's correction to my code works - now have this

port map(
clk => cpuClock, -- clock based on divided cpu clock

and the MMU I have this

entity MMU is
    port (
        clk           : in std_logic;
        n_wr          : in std_logic;
        n_rd          : in std_logic;
        n_reset       : in std_logic;
        dataIn        : in std_logic_vector(7 downto 0);   -- data from the cpu
        cpuAddress    : in std_logic_vector(15 downto 0);  -- address from the cpu
        mmuAddressOut : out std_logic_vector(18 downto 0)  -- new address sent back to the chip
    );
end MMU;

architecture rtl of MMU is
    type mmu_entry_type is
        record
            frame: std_logic_vector(6 downto 0); -- 128 blocks of 4k in a 512k ram
        end record;
    type mmu_entry_array is array(natural range <>) of mmu_entry_type;
    signal mmu_entry : mmu_entry_array(0 to 15);
    signal cpu_entry_select : std_logic_vector(7 downto 0); -- MMU_SELECT .equ 0xF8
    -- break up the incoming virtual address
    alias page_number : std_logic_vector( 3 downto 0) is cpuAddress(15 downto 12);
    alias page_offset : std_logic_vector(11 downto 0) is cpuAddress(11 downto 0);

begin
    mmuAddressOut <= mmu_entry(to_integer(unsigned(page_number))).frame & page_offset; -- decode address
    --process (n_wr, n_reset) begin -- reset
    process (clk) begin
        if rising_edge(clk) then
            if (n_reset = '0') then
                mmu_entry( 0).frame <= "0000000"; -- set first 64k to startup values
                mmu_entry( 1).frame <= "0000001";
                mmu_entry( 2).frame <= "0000010";
                mmu_entry( 3).frame <= "0000011";
                mmu_entry( 4).frame <= "0000100";
                mmu_entry( 5).frame <= "0000101";
                mmu_entry( 6).frame <= "0000110";
                mmu_entry( 7).frame <= "0000111";
                mmu_entry( 8).frame <= "0001000";
                mmu_entry( 9).frame <= "0001001";
                mmu_entry(10).frame <= "0001010";
                mmu_entry(11).frame <= "0001011";
                mmu_entry(12).frame <= "0001100";
                mmu_entry(13).frame <= "0001101";
                mmu_entry(14).frame <= "0001110";
                mmu_entry(15).frame <= "0001111";
            -- elsif (rising_edge(n_wr)) then -- write to ports
            elsif n_wr = '0' then -- write to ports
                case cpuAddress(2 downto 0) is
                    when "000" =>
                        cpu_entry_select <= dataIn; -- MMU_SELECT .equ 0xF8
                    when "101" => --MMU_FRAMELO .equ 0xFD
                        mmu_entry(to_integer(unsigned(cpu_entry_select(3 downto 0)))).frame(6 downto 0) <= dataIn(6 downto 0);
                    when others =>
                end case;
            end if;
        end if;
    end process;

When this starts up, it crashes CP/M.

BUT - if I push the reset button, then restart, then it all works properly.
It is as if the n_reset signal is not registering on powerup, but it does
register when the button is pushed. It may be there is something subtle in
the code. The prototype code I am copying is Grant's CP/M addition which is
this

-- Disable ROM if out 38. Re-enable when (asynchronous) reset pressed
process (n_ioWR, n_reset) begin
if (n_reset = '0') then
n_RomActive <= '0';
elsif (rising_edge(n_ioWR)) then
if cpuAddress(7 downto 0) = "00111000" then -- $38
n_RomActive <= '1';
end if;
end if;
end process;

Which I presume has to work on both a powerup reset and a pushbutton
reset, otherwise the ROM would never be loaded properly.

So - I need a reset that works on both powerup and when the reset button is
pushed.

I feel a solution is very close now!
William R Sowerbutts
2014-04-29 14:33:36 UTC
Permalink
James

Two possible solutions;

1. Initialise the contents of the mmu_entry array; currently when the circuit
starts in the FPGA I expect it contains all zeroes ie every page of memory
points at the lower 4KB of RAM.

2. Ensure the n_reset line is asserted when the circuit starts in the FPGA.
In my top-level design I have:

architecture Behavioral of top_level is
    ...
    signal reset_button_clk1 : std_logic;
    signal reset_button_clk2 : std_logic;
    signal reset_sr : std_logic_vector(7 downto 0) := (others => '1');
    ...
begin
    ...
    process(clk)
    begin
        if rising_edge(clk) then
            -- Xilinx advises using two flip-flops to bring external
            -- signals which feed control logic into our clock domain.
            reset_button_clk1 <= reset_button;
            reset_button_clk2 <= reset_button_clk1;

            -- Reset logic: Hold reset for 8 cycles
            if (reset_button_clk2 = '1') then
                reset_sr <= (others => '1');
            else
                reset_sr <= reset_sr(6 downto 0) & '0';
            end if;

            -- During reset, latch the console select jumper which
            -- is used to optionally swap over the first two UARTs.
            if reset_sr(7) = '1' then
                swap_uart01 <= console_select_clk2;
            else
                swap_uart01 <= swap_uart01;
            end if;
        end if;
    end process;

    cpu_reset <= reset_sr(7);
    ...
end Behavioral;

"clk" should be your CPU clock of course.

"reset_button" is the input pin on the FPGA to which the physical reset button
is connected.

I've used an 8-bit shift register here but you could equally well use a small
counter. It may even be the case that a single register is fine. Not sure now
why I decided to hold it for 8 cycles!

What happens is that on startup or whenever the physical button is pushed the
"reset_sr" shift register is filled with 1s. Every cycle the entire shift
register moves one bit to the left and 0s are fed in at the least significant
bit. The most significant bit is fed to the reset line of the other circuits --
CPU, MMU etc.
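
The shift-register behaviour described above can be checked with a small cycle-by-cycle model. This is a sketch, not a replacement for running the VHDL in a simulator: the function name is hypothetical and the two synchroniser flip-flops are assumed to start at 0, which the VHDL does not specify.

```python
def simulate_reset(button_samples, width=8):
    """Return the value of cpu_reset after each rising clock edge."""
    all_ones = (1 << width) - 1
    sr = all_ones          # reset_sr starts filled with 1s, as in its declaration
    clk1 = clk2 = 0        # synchroniser flip-flops, assumed to start at 0
    trace = []
    for button in button_samples:
        # Every register updates on the same edge, using the old values.
        next_sr = all_ones if clk2 == 1 else (sr << 1) & all_ones
        clk1, clk2, sr = button, clk1, next_sr
        trace.append((sr >> (width - 1)) & 1)  # cpu_reset <= reset_sr(7)
    return trace


# Start-up with the button never pressed.
power_up = simulate_reset([0] * 10)
print(power_up)

# Button sampled high for one cycle at index 10.
pressed = simulate_reset([0] * 10 + [1] + [0] * 12)
print(pressed)
```

In this model cpu_reset is high from start-up until the eighth clock edge, and a button press reaches the shift register two edges later because of the synchroniser. Note that cpu_reset here is active high, whereas the n_reset input of the MMU posted earlier is active low, so one of them would need inverting.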

Will
p***@public.gmane.org
2014-04-29 15:54:09 UTC
Permalink
Hi Will,
Post by William R Sowerbutts
Two possible solutions;
1. Initialise the contents of the mmu_entry array; currently when the circuit
starts in the FPGA I expect it contains all zeroes ie every page of memory
points at the lower 4KB of RAM.
2. Ensure the n_reset line is asserted when the circuit starts in the FPGA.
I think that using both of these is an even better approach: you get properly
initialised state machines and an automatically generated reset signal for the
components (other than GSR). The only thing left to add to your solution is to
make sure that the time defined by these 8 clock cycles is enough to bring all
required systems to a fully defined state (hopefully the initial one).

If you allow me, I would like to share my favourite reading for FPGA resets:
http://forums.xilinx.com/t5/PLD-Blog/That-Dangerous-Asynchronous-Reset/ba-p/12856

There are 2 good white papers mentioned in the beginning of this article, that
deserve additional attention.

Regards,
Nikolay


William R Sowerbutts
2014-04-29 16:22:29 UTC
Permalink
Hi Nikolay

I think if the reset is synchronous to the clock then it need be active for
only one cycle to have the required effect.

Will
Nikolay Dimitrov
2014-04-29 21:22:45 UTC
Permalink
Hi Will,

I think that this is partially true. Yes, but only for the cases where the
propagation time through the logic being reset is less than the clock
period. And no, if it exceeds this timing, which can easily happen when the
clock is too fast or the combinatorial path for the reset logic in some
blocks is too long.

In any case, I liked your proposals for reset.

Kind regards,
Nikolay
James Moxham
2014-04-30 00:31:39 UTC
Permalink
Thanks for all this information - very useful.
Looking at Grant's page http://searle.hostei.com/grant/Multicomp/index.html
and then looking at the schematic, there are two things on the board that
can be used. There is the pushbutton reset on pin 144, and there is also
an RC network on pin 73. Grant's comments are
"Pin 144 - pushbutton to ground, no external pullup. Set internal pullup
on FPGA configuration if used. I will use for UK101 "reset"
"Pin 73 - 10uF capacitor to ground, 10K resistor to Vcc, for power up
reset if needed?"

I need to check the pin planner to see how they are connected.

Given there is an RC on the board, this should do the job, right?

James Moxham
James Moxham
2014-04-30 10:51:04 UTC
Permalink
Hi Will,

I've tried a lot of different things with your clock code but I really
can't seem to get it to work. The code that works is

process (n_wr, mmu_reset) begin
if (mmu_reset = '0') then
...
elsif (rising_edge(n_wr)) then -- write to ports

and the code that does not is

process (clk) begin
if rising_edge(clk) then
if (mmu_reset = '0') then
...
elsif n_wr = '0' then -- write to ports

I went off on a tangent exploring the reset signal. I had an idea that it
needed a different reset signal so I tried the RC network on pin 73 but it
didn't help. I then connected one of the diagnostic leds to that RC
network in the code and it is low for 1 second on startup then goes high,
as expected. I then tried connecting it with an AND gate to the reset
button, and using this as the MMU reset signal but no luck. I then put the
diagnostic led on pin144 only, and this appears to generate a 1 second low
signal on cold boot as well. I am not entirely sure how that delay is
happening, but in essence it is what is needed anyway, so I then commented
out all the RC network experiments as the reset signal is what is expected
- cold or warm boot, it is low for a bit then goes high.

However, where it gets strange is that the reset input to the MMU is based
on pin 144. Using the clock edge code, it works fine on a warm boot
pushing the button, but not on a cold powerup boot. Yet the led behaviour
does not seem to distinguish any difference between these two and the code
says these are the same signal.

The other thing different in the code above is the write to ports - one
triggers on the rising edge and the other when the write is low (but when
the clock is rising). This code is only run though when a block address is
changed so it shouldn't matter.

The behaviour we have is that it does not seem to reset if the rising edge
clock code is used, and without a reset working the blocks are not
initialised, so it crashes as expected.

So... I know this is not 'proper' code as it is not using the clock, but
the code I have is working fine and the MMU is working correctly even at
full speed. Maybe later down the track when a few more boards get built
someone who understands vhdl a bit better than me can revisit this?

The next challenge is getting a ram disk working. To do this, CP/M needs
to be modified. I'm not entirely sure where to start with this, but one
thing about Grant's computer is that it is very fast, and that means that
it may well be easier to self compile programs rather than cross compiling
on a PC. So that involves writing assembly in Wordstar, then compiling it
using batch files. Good Old-School stuff! :)

What I am hoping is that there is a way to recompile CP/M, then run that
compiled program from CP/M and it replaces high memory with the new image.
There probably is a standard way to do this. I am trying to recall how it
was done - I think you compiled CP/M with a .org at the correct high
memory location, but loaded it off disk into 0100, (where all the jumps
etc would be incorrect) and then a little bootloader copied the code to
high memory.

Need to get the compile/test/debug cycle down so it is quick. At least it
should be possible to get it quicker than the first N8VEM I ever built
that needed hundreds of eprom erases.

And once there is a system for recompiling CP/M quickly, then need to try
to add the ramdisk code.

Back to coding...

James
William R Sowerbutts
2014-04-30 22:35:09 UTC
Permalink
Hi James

Not sure about your reset problem, I'm afraid. Can you send me your source
code off-list? I'll see if I can get it to build on my platform. I strongly
suggest you figure out how to get the simulator to work, you really need it
to see inside your design and understand what's going on when things don't go
as you planned.
Post by James Moxham
What I am hoping is that there is a way to recompile CP/M, then run
that compiled program from CP/M and it replaces high memory with the
new image.
The original DRI documentation is very good and if you want the machine to be
totally self-hosting (ie you build the OS on the machine itself) you can
follow their system generation process:

http://www.cpm.z80.de/randyfiles/DRI/CPM_2_0_System_Alteration_Guide.pdf

In my socz80 project I separately assemble CP/M and my BIOS from source on my
workstation, then I have a short Python script which combines these two
binaries with a bootstrap sector into a system track. I can write the system
track onto one of my RAM disks through the monitor ROM, and then I just boot
from the RAM disk. I have a build script that does all of that automatically.
From saving the BIOS source code file in my editor to watching it crash
(again) on the FPGA took just a few seconds.
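
(For readers who want to try the same approach, a rough sketch of what such a combining script might look like. This is not the actual socz80 script; the BIOS size, the zero padding and the stand-in data are all assumptions.)

```python
# Concatenate bootstrap sector + CP/M (CCP and BDOS) + BIOS into one
# system-track image. Sizes marked "assumed" are for illustration only.

SECTOR_SIZE = 128    # classic CP/M logical sector size
CPM_SIZE = 0x1600    # CCP (0x800) + BDOS (0xE00) in CP/M 2.2
BIOS_SIZE = 0x0600   # assumed space reserved for the BIOS (hypothetical)


def pad(data: bytes, size: int, name: str) -> bytes:
    """Pad data with zero bytes to exactly size; refuse oversized input."""
    if len(data) > size:
        raise ValueError(f"{name} is {len(data)} bytes, limit is {size}")
    return data + bytes(size - len(data))


def build_system_track(boot: bytes, cpm: bytes, bios: bytes) -> bytes:
    """Lay out the bootstrap sector, then CP/M, then the BIOS, back to back."""
    return (pad(boot, SECTOR_SIZE, "bootstrap")
            + pad(cpm, CPM_SIZE, "CP/M")
            + pad(bios, BIOS_SIZE, "BIOS"))


# Stand-in data; a real build would read the assembler output files instead.
boot = b"\xF3" + b"\x00" * 10
cpm = b"\xC3" * CPM_SIZE
bios = b"\xC9" * 100
track = build_system_track(boot, cpm, bios)
print(len(track), "bytes,", len(track) // SECTOR_SIZE, "sectors")
```

Padding each part to a fixed size keeps the BIOS at a known offset in the track, so the bootstrap can load everything with one simple loop.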
Post by James Moxham
Need to get the compile/test/debug cycle down so it is quick. At
least it should be possible to get it quicker than the first N8VEM I
ever built that needed hundreds of eprom erases.
Do you have a monitor program that lives in ROM, ideally at the top of
memory? Can you talk to it over your serial port? My approach would be to
write a script which generates "write memory" commands to transfer a binary
file from my workstation to a given address in memory. I'd assemble the
CP/M system on my workstation, for 48KB RAM or whatever is left with the
monitor present, ie so CP/M lives at A400--BFFF. Transfer it to memory
through the monitor, then just jump to the entry vector. Once you have the
BIOS working you just adjust for a full 64KB RAM system and boot it normally.
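
(A sketch of such a script, under stated assumptions. The "W addr bytes" command syntax is invented for illustration, so substitute whatever your monitor actually accepts. The layout function uses the standard CP/M 2.2 base addresses for a 20KB system plus a bias for the real memory size.)

```python
# Work out where CP/M 2.2 lives for a given memory size, then turn a binary
# image into monitor "write memory" commands. The command format is
# hypothetical; adapt it to the monitor in use.


def cpm22_layout(mem_kb: int) -> dict:
    """Return CCP/BDOS/BIOS base addresses for a mem_kb CP/M 2.2 system."""
    if not 20 <= mem_kb <= 64:
        raise ValueError("CP/M 2.2 system size must be 20..64 KB")
    bias = (mem_kb - 20) * 1024
    return {"ccp": 0x3400 + bias, "bdos": 0x3C00 + bias, "bios": 0x4A00 + bias}


def write_commands(data: bytes, base: int, per_line: int = 16):
    """Yield one 'W addr byte byte ...' line per chunk of the image."""
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        hex_bytes = " ".join("%02X" % b for b in chunk)
        yield "W %04X %s" % (base + offset, hex_bytes)


layout = cpm22_layout(48)
image = bytes([0xC3, 0x00, 0xBA])   # stand-in for the assembled system image
for line in write_commands(image, layout["ccp"]):
    print(line)                     # in practice, send each line to the UART
```

For a 48KB system this puts the CCP at A400 and the BIOS at BA00, which matches the A400--BFFF range above.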

Will

_________________________________________________________________________
William R Sowerbutts will-***@public.gmane.org
"Carpe post meridiem" http://sowerbutts.com
main(){char*s=">#=0> ^#X@#@^7=",c=0,m;for(;c<15;c++)for
(m=-1;m<7;putchar(m++/6&c%3/2?10:s[c]-31&1<<m?42:32));}
James Moxham
2014-05-02 08:30:01 UTC
Permalink
Hi Will,

My build essentially follows Grant's
http://searle.hostei.com/grant/Multicomp/index.html then the CP/M
extension project. I can send source files but I'd need to package it up
and it might be easier to rebuild from Grant's page. The files have sort
of ended up all over the place in subdirectories. But before we do all
that... your computer is using different ram with very different driver
code, and by the time you modify it to run with the sram chip Grant is
using (128K) or the one I am using (512K) it is going to get to be such a
different project that I wonder if another solution might be that we just
wait another week for some new boards to arrive, I'll solder one up and
send it to you?

There are some huge synergies between our projects. I guess my vibe on the
thing is that yours is the Acme of the emulations - it does everything you
want to do, but it uses a board (I hope this is ok to say?..) that is
custom and the cost may be a factor to some builders, and so I'm drawn to
Grant's board because the fpga board is available for $16 and they are
being mass produced in China and they are very easy to solder.

I guess my little challenge to myself is to do all the things you have
done, but on a cheaper board. Plus I want to bring in all the work done on
the Propeller emulation using full colour touchscreens as I think this
could show CP/M in a whole new light - not just a boring A> command
prompt. (I have built a lot of boards, but the only ones my teenage kids
are interested in are the colour touchscreen ones).

So...
I can send you a freebie board. No problem there.
Stuff I want to do:
1) Get self compilation working for CP/M
2) Get a ram disk running using banked memory
3) Get MP/M running
4) Get a touchscreen running
5) Get FAT32 running
6) Rebuild CP/M and/or MP/M so the program exists in banked memory, thus
freeing up more code space and at the same time allowing for much larger
BIOS code - up to 64K or more. Then start adding functions to CP/M to
handle events like fingerdown, fingerup, mouseposition etc.
7) Rummage around in the Opencores library and see what else we can add to
this project - eg ethernet, DSP code like fast multiplies, I2C bus, SPI
bus, analog output etc.
8) Crazy things, like if there is spare space on the fpga, add a second
core (probably 8080 as smaller than Z80) with a tiny bit of internal ram,
and use this to do things that are easy with the 8 core propeller chip.

All of this I believe is entirely doable. Self compilation of CP/M has sent
me off on a tangent and is not quite as simple as I thought. Everyone seems to
be using cross compilers. I think it might be a matter of taking code
written for TASM and changing the syntax slightly so it is compatible with
the M80 assembler.

And yes... I have a program to cross compile and download quickly. It is
written in VB5, then ported to .net 2008, and just like CP/M, those
programs are now becoming retro in themselves. I *could* port that code to
the latest version of .net, but in a way, I'm kind of tired of the
constant need to port code every few years to a new language. Hence I'm
drawn towards self compilation. I haven't actually quite found an answer
yet. There is code around that will self compile but it is not very well
commented, and there is code that won't self compile but which is
extremely well commented. I'm sure there is an answer to this one.

Ram disk - well that involves rewriting the cbios. I think you have done
this already?

MP/M. I've had a lot of fun inside MP/M and I'd love to see it running on
this fpga.

Touchscreen. I am hoping I can port over the propeller code. Touchscreens
change the whole vibe of the computer - add a battery and now it can fit
in your pocket.

FAT32. Well the entire FAT32 source code exists in propeller Spin code and
there are programs to translate Spin to C, and C can be compiled in CP/M,
so in theory, FAT32 could be part of CP/M. Then you can move files easily
using SD cards.

Moving the operating system to banked memory. I'm sure this can be done.
Then one can add routines without worrying about it taking away program
space.

Opencores is a treasure trove of goodies. It would be so cool to have a
TCP/IP stack running and using CP/M as a web server.

Multiple cores? I did some quick searches on this. There are tiny 8080
cores. There are other custom RISC cores. These would suit jobs that are
a bit too complex to do directly in VHDL, but where you don't want to tie
up the main CPU.
Plus it would be cool to have a hybrid computer and use BASIC to POKE a
few values into a different CPU, then set it going and see the output. DSP
stuff would be fun - eg FFTs and digital filters.


Sorry about the long ramble... So many exciting things to think about!

Cheers, James Moxham


yoda
2014-05-02 12:56:04 UTC
Permalink
James,

I would not say the Papilio is a custom board. It is a quite popular open source board, made in China I believe, that you can get from multiple sources like Gadget Factory, SparkFun and Seeed. It has a lot of advantages as well: a more modern FPGA, no need for a JTAG module to program it, a built-in USB connection, and a good hunk of SDRAM on the board. It uses the Arduino shield format for boards so there are quite a few plugin boards available. Granted it is more expensive, but when you throw in the JTAG programmer, memory, and serial adapter you have to acquire for your board, I would say it starts to become price competitive. I would not rule this board out. I prefer another board though, from numato.com, called Saturn, which you can get with different sizes of FPGA. The entry model is cheaper than the Papilio, and you can order it without the headers installed so you can put male headers on the bottom of the board and plug it into a motherboard properly.

Dave
--
You received this message because you are subscribed to the Google Groups "N8VEM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to n8vem+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To post to this group, send email to n8vem-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
Visit this group at http://groups.google.com/group/n8vem.
For more options, visit https://groups.google.com/d/optout.
James Moxham
2014-05-02 13:07:43 UTC
Permalink
Good points there Dave.

Maybe we can start to think about some sort of common code format that can
be ported to different hardware?

Some things might be very similar - eg uart code and SD code.

The memory might be the most complex, but perhaps it is possible to
abstract the SDRAM into a vhdl object where you have a large address bus
and you read and write to addresses and the vhdl sorts out the row/column
stuff? That might already exist now anyway?
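
(To make the row/column idea concrete, a small Python model of the address split such a controller would do in hardware. The bit widths and the bank:row:column ordering are example assumptions, not taken from any particular chip's datasheet or from an existing VHDL core.)

```python
# Split a flat word address into the bank, row and column fields an SDRAM
# controller would drive. Widths below are example values only.

from typing import NamedTuple


class SdramAddress(NamedTuple):
    bank: int
    row: int
    column: int


def split_address(addr: int, col_bits: int = 8, row_bits: int = 12,
                  bank_bits: int = 2) -> SdramAddress:
    """Decode a linear address laid out as bank : row : column."""
    total_bits = col_bits + row_bits + bank_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError("address out of range")
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    bank = addr >> (col_bits + row_bits)
    return SdramAddress(bank, row, column)


print(split_address(0x12345))
```

Putting the column in the low bits means sequential accesses stay within one open row for as long as possible; some controllers place the bank bits lower instead, so that consecutive rows land in different banks.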

Cheers, James
yoda
2014-05-02 14:08:04 UTC
Permalink
Most SDRAM code abstracts the interface into a memory controller which has
address, data, r/w and busy signals, so that should be easy to do. Probably
the hardest thing will be that some people are going to write in VHDL and
some in Verilog. They should be pluggable but people will have to get used
to that. I prefer Verilog over VHDL, I guess Verilog is more C like and
VHDL is more Pascal/Ada like - more comfortable with the former than the
latter.

I will say I have not gone down the CPU emulation route; instead I use the
FPGA as a Super I/O and memory controller. For example the project I am playing
with now is to use a 68SEC000 and the Saturn board as a 68K sbc. The
68SEC000 is a 3.3V version of the 68K so it makes the interfacing easy. I
plan to do all the I/O from the Saturn board. I hope to be able to split
the 32MB of sdram into 16MB for system (max address space for 68K) and use
the other half as graphics display buffer to do relatively hi res graphics.
It should be a pretty simple board, and I have just begun breadboarding
it.
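
(A quick sketch of that memory split, in Python for illustration. The 640x480 one-byte-per-pixel display and the function names are assumptions; a real design would also carve ROM and I/O out of the CPU's 16MB.)

```python
# Model a 32MB SDRAM split into a 16MB CPU region (lower half) and a 16MB
# graphics buffer (upper half). Resolution and pixel depth are hypothetical.

SDRAM_SIZE = 32 * 1024 * 1024
CPU_SPACE = 1 << 24                        # 24 address bits = 16MB
FRAMEBUFFER_BASE = CPU_SPACE               # graphics buffer takes the upper half
FRAMEBUFFER_SIZE = SDRAM_SIZE - CPU_SPACE


def cpu_to_sdram(addr: int) -> int:
    """CPU addresses map straight onto the lower half of the SDRAM."""
    if not 0 <= addr < CPU_SPACE:
        raise ValueError("outside the 24-bit CPU address space")
    return addr


def pixel_to_sdram(x: int, y: int, width: int = 640, height: int = 480,
                   bytes_per_pixel: int = 1) -> int:
    """SDRAM address of pixel (x, y) in a linear framebuffer."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel outside the display")
    offset = (y * width + x) * bytes_per_pixel
    if offset >= FRAMEBUFFER_SIZE:
        raise ValueError("framebuffer too large for the graphics region")
    return FRAMEBUFFER_BASE + offset


print(hex(cpu_to_sdram(0x00FFFFFF)), hex(pixel_to_sdram(0, 0)))
```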

Dave