excamera, what I'm hacking on, daily.

jamesb@excamera.com


       
Sat, 11 Sep 2004

Graphs!
I've set up permanently updating graphs here.

*

Sat, 29 May 2004

Bandwidth graph:

*

Sun, 23 May 2004

Using ghostscript and ImageMagick instead of MRTG
I tried to set up MRTG again this morning. What does MRTG do, exactly? As far as I can tell, it collects samples, stores them in a text file, and draws horrible-looking graphs from them. So I'm looking at rolling my own graphs using (1) Perl to collect samples and turn them into PostScript, (2) GhostScript to produce a high resolution image, and (3) ImageMagick to rescale and compress it.

Something like this:

*
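The first stage of that pipeline, turning samples into PostScript, might look roughly like this. It is a Python sketch rather than the Perl mentioned above, and the function name, page size and sample values are all made up for illustration; rendering and rescaling are left to GhostScript and ImageMagick.

```python
def samples_to_postscript(samples, width=400.0, height=150.0):
    """Return a minimal PostScript page that plots samples as one polyline."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0          # avoid dividing by zero on a flat line
    step = width / (len(samples) - 1)
    lines = ["%!PS", "0.5 setlinewidth", "newpath"]
    for i, value in enumerate(samples):
        x = i * step                          # samples spread evenly along x
        y = (value - lo) / span * height      # scaled to fill the plot height
        op = "moveto" if i == 0 else "lineto"
        lines.append(f"{x:.2f} {y:.2f} {op}")
    lines += ["stroke", "showpage"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical bandwidth samples, just to produce some output.
    print(samples_to_postscript([12, 40, 33, 90, 71, 65]), end="")
```

Drawing in PostScript keeps the plot resolution-independent, so the raster size is decided only at the GhostScript step.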

Sat, 22 May 2004

Running on 12V down speaker wire
I'm running it directly off DC on 100 feet of 16AWG speaker wire. At first the board wouldn't power up, and a quick check with a voltmeter showed that the droop was about 1.5V, from 12.0V to 10.5V. Fortunately the AC-DC supply has a trim adjustment; so winding this up until the DC-DC PSU was getting 12.0V did the trick.
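As a rough sanity check on that droop, Ohm's law gives the load current it implies. The sketch below assumes a nominal 4.016 ohms per 1000 feet for 16AWG copper and assumes the run is 100 feet each way, so the current sees 200 feet of conductor; neither figure comes from a measurement.

```python
# Assumption: nominal resistance of 16 AWG copper at room temperature.
OHMS_PER_1000FT_16AWG = 4.016

def loop_resistance(run_feet, ohms_per_1000ft=OHMS_PER_1000FT_16AWG):
    """Resistance of the out-and-back path: current flows through both conductors."""
    return 2 * run_feet * ohms_per_1000ft / 1000.0

def implied_current(droop_volts, resistance_ohms):
    """Ohm's law: I = V / R."""
    return droop_volts / resistance_ohms

r = loop_resistance(100)                 # assumed 100 ft run, two conductors
i = implied_current(12.0 - 10.5, r)      # droop quoted above
print(f"loop resistance {r:.3f} ohm, implied load current {i:.2f} A")
```

If the 100 feet is instead the total conductor length, the resistance halves and the implied current doubles.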

*

Next issue: temp monitoring
I notice that the motherboard is getting quite hot. A quick check shows that the heatsinks are getting as warm as 120 degrees F. The temperature monitoring tools return 255, hmm...

*

Sun, 16 May 2004

wi hostap fix, kind of
I spent a few hours tinkering (tampering?) with this driver, and have a fix. I've verified that it works: my download speed went up from 170 kbytes/sec to 470 kbytes/sec, which is what I'd expect on an 11Mbps network. If anyone's interested, here it is: The Intersil manual on page 4-28 says that transmit packets need to have their TxRate field filled in with 10, 20, 55 or 110, depending on the transmission speed. The 4.X driver sets the field tx_frame.wi_tx_rate in function wihap_check_tx(), called from wi_start(). The 5.X driver calls this field frmhdr.wi_tx_rate, and leaves it zero. If you want to run at 11Mbps, add a line

                  frmhdr.wi_tx_rate = 110;
in wi_start(), just before the call to wi_write_bap(). (Obviously, this is just a cheesy hack. The real code would have to find the actual tx rate. I just want to point out where the problem lies).
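The four TxRate values look like the 802.11b rates expressed in units of 100 kbit/s. That reading is an inference from the numbers, not something quoted from the manual, and the helper below is purely illustrative (Python, not driver code):

```python
# 802.11b rates in Mbit/s mapped to the TxRate values quoted from the manual.
TX_RATE_FIELD = {1.0: 10, 2.0: 20, 5.5: 55, 11.0: 110}

def tx_rate_field(mbps):
    """Return the TxRate value for an 802.11b rate, or raise for anything else."""
    try:
        return TX_RATE_FIELD[float(mbps)]
    except KeyError:
        raise ValueError(f"not an 802.11b rate: {mbps}") from None

for rate in (1, 2, 5.5, 11):
    print(rate, "Mbit/s ->", tx_rate_field(rate))
```

A real fix would pick the value per destination station rather than hard-coding 110.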

*

Sat, 15 May 2004

wi hostap rate trouble
I'm trying to get my new FreeBSD 5.2-CURRENT box to be a good AP with an Intersil 2.5 802.11b card, but its rate is limited to 2Mbps. This is affecting other people too, so I'm having a look at the driver. Some context:

  • FreeBSD 4.X apparently great at 5.5 and 11
  • I found the Intersil manual, my copy here
  • Incremental kernel build:
    cd /usr/obj/usr/src/sys/GENERIC
    make && make install
    

Some notes on the driver:

  1. The 4.X code does not set WI_RID_BASIC_RATE or WI_RID_SUPPORT_RATE. The 5.X code sets them thus:
        wi_write_val(sc, WI_RID_BASIC_RATE, 0x03);   /* 1, 2 */
        wi_write_val(sc, WI_RID_SUPPORT_RATE, 0x0f); /* 1, 2, 5.5, 11 */
    
  2. On page 3-2 of the manual ("Basic Initialization in Firmware-based AP Mode") it specifies that the driver should set the data rate for MAC port 0 by writing to TxRateControl0 (0xFC9E). Neither the 4.X nor the 5.X code does this.
    #define WI_RID_TX_RATE0		0xFC9E
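Going by the comments in the 5.X snippet above, the rate values are bitmasks with one bit per 802.11b rate. A small Python decoder, assuming exactly that bit order (bit 0 = 1 Mbit/s up to bit 3 = 11 Mbit/s) and nothing more from the manual:

```python
# Bit positions inferred from the driver comments: 0x03 -> 1, 2 and 0x0f -> 1, 2, 5.5, 11.
RATE_BITS = [1.0, 2.0, 5.5, 11.0]

def decode_rates(mask):
    """List the rates (Mbit/s) whose bits are set in a rate bitmask."""
    return [rate for bit, rate in enumerate(RATE_BITS) if mask & (1 << bit)]

print("basic  :", decode_rates(0x03))
print("support:", decode_rates(0x0f))
```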
    

*

Sun, 18 Apr 2004

Running Xilinx WebPack under Wine on FreeBSD
Well, I got this working last weekend. It's such a relief not having to worry about Samba mounts any more, and emulation under Wine is, err, as good as you'd expect. While Xilinx now supports WebPack under Wine, they don't seem to give you any help as far as running the tools. For that, I had to run the ISE tools under Windows and look in the command logs that it leaves around to determine what to run! Here's the script I ended up with:

#!/usr/local/bin/bash

export XILINX=c:/Xilinx
export DISPLAY=192.168.0.111:1

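# Run one Xilinx tool under Wine, dropping Wine's "fixme" noise and
# keeping a copy of the output in ./log for the error checks below.
function rw()
{
   EXE=$1
   shift

   wine -- ~/c/Xilinx/bin/nt/"$EXE" -intstyle ise "$@" 2>&1 |
   grep -v '^fixme' | tee log
}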

ENTITY=c2a

echo "vhdl work $ENTITY.vhd" > xc2s200.prj

sed "s/ENTITY/$ENTITY/" < proto.xst > xc2s200.xst

rw xst -ifn xc2s200.xst -ofn xc2s200.syr
grep -i '^error' log && exit 1
rw ngdbuild -dd _ngo -uc $ENTITY.ucf -p xc2s200-pq208-5 xc2s200.ngc xc2s200.ngd
grep -i '^error' log && exit 1
rw map -p xc2s200-pq208-5 -cm area -pr b -k 4 -c 100 -tx off -o xc2s200_map.ncd xc2s200.ngd xc2s200.pcf
rw par -w -ol std -t 1 xc2s200_map.ncd xc2s200.ncd xc2s200.pcf
rw trce -e 3 -l 3 -xml xc2s200 xc2s200.ncd -o xc2s200.twr xc2s200.pcf
rw bitgen -f xc2s200.ut xc2s200.ncd
cp xc2s200.rbt $ENTITY.rbt

Setting up a build directory for a particular chip was similarly painful. Here's what I ended up with for the xc2s200: /files/camera/xc2s200/.

*

Sun, 11 Apr 2004

Back at it
Dusted off the FPGA board today. Starting up again: immediate goal is color NTSC. Some useful links:

  1. http://www.rickard.gunee.com/projects/video/sx/howto.php

*

Mon, 19 Jan 2004

CCD picture!
I found that running the LM9627 at 10MHz was causing clock glitches. Running at 5MHz (10MHz input with a divide by 2) is much safer - the signal looks pretty solid coming back.

In the default setup, the LM9627 gives you 504 lines of 780 clocks each, so each frame is 393120 clocks long. Each 780 clock line consists of 116 clocks of horizontal sync, followed by 664 clocks of actual pixel data.

The current setup samples a 32x32 pixel window into FPGA internal RAM, then scans out this window onto the monitor. I can see the 8 pixel black border around the frame, as well as the Bayer pattern in the color samples themselves.

A full frame is about 600Kbytes, so the next step is to sample the whole frame into the 8Mbyte DRAM, then scan from DRAM onto the monitor.
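The timing and size figures above can be checked with a few lines of arithmetic. The frame size assumes 640x480 pixels with each 12 bit sample stored in a 16 bit word, which is a guess at the storage layout rather than something stated here.

```python
LINES_PER_FRAME = 504
CLOCKS_PER_LINE = 780
HSYNC_CLOCKS = 116
PIXEL_CLOCK_HZ = 5_000_000          # 10 MHz input divided by 2

clocks_per_frame = LINES_PER_FRAME * CLOCKS_PER_LINE
active_clocks = CLOCKS_PER_LINE - HSYNC_CLOCKS      # pixel data per line
frames_per_second = PIXEL_CLOCK_HZ / clocks_per_frame

# Assumption: 640x480 pixels, one 16-bit word per 12-bit sample.
frame_bytes = 640 * 480 * 2

print(f"{clocks_per_frame} clocks/frame, {active_clocks} data clocks/line")
print(f"{frames_per_second:.2f} frames/s at 5 MHz")
print(f"{frame_bytes} bytes/frame = {frame_bytes // 1024} Kbytes")
```

At 5MHz that works out to a little under 13 frames per second.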

*

Sat, 13 Dec 2003

CCD update
Just got the LM9627's registers read out via i2c. Steps went like this:

  1. Reading the ID register via i2c
  2. Running the LM9627's mclk
  3. Applying resetb to set all registers to their default state
Now I just need to figure out how to get a picture out of the thing.

*

Sat, 23 Aug 2003

32 bits
Made the CPU's word size a parameter, so you can have a 16 or 32 bit CPU. The 32 bit version is bigger but still fits in the XC2S15: 190 slices out of 192.

*

Thu, 21 Aug 2003

80 MIPS
The Xilinx timing tools claim that the maximum speed is 55 MHz. Using a 66 MHz oscillator seemed to work fine. And I left it running overnight with an 80 MHz oscillator.

*

Wed, 20 Aug 2003

Why is the program counter a counter?
A critical timing path in this CPU is the next instruction computation (it's complicated because it does a call in a single cycle, and a return in zero cycles). Part of the path is the increment of the 10-bit program counter (PC). So why do program counters increment? As long as they visit memory in a deterministic order, who cares about the actual function? As an experiment, I switched to a 10-bit LFSR, and the CPU clock runs 3 ns faster.
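To illustrate the idea, here is a 10-bit LFSR modelled in Python. The tap positions (bits 10 and 7) are an assumption, since the entry does not say which taps were used; they are one of the standard maximal-length choices. The next state needs a single XOR instead of a 10-bit carry chain, which is where a speedup of this kind would come from.

```python
WIDTH = 10
MASK = (1 << WIDTH) - 1

def lfsr_next(state):
    """Advance a 10-bit Fibonacci LFSR (assumed taps at bits 10 and 7) one step."""
    feedback = ((state >> 9) ^ (state >> 6)) & 1
    return ((state << 1) | feedback) & MASK

def sequence_length(seed=1):
    """Count steps until the LFSR returns to its seed."""
    state, steps = lfsr_next(seed), 1
    while state != seed:
        state = lfsr_next(state)
        steps += 1
    return steps

print(sequence_length())
```

A maximal-length 10-bit LFSR visits 1023 of the 1024 addresses: the all-zero state maps to itself, so it has to be avoided as a starting point, and the assembler presumably has to lay instructions out in LFSR order rather than numeric order.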

*

Tue, 19 Aug 2003

The XC2S200 board
Here's the board that I use for FPGA experiments. It has a Xilinx Spartan II - the XC2S200 - on it. 40 I/Os on one header, 34 on another, and a slot for 72 pin SIMM memory. Eagle schematic and board here.

*

Sun, 17 Aug 2003

Faster...
I spent a few hours reworking the CPU to have a single phase clock: the old one had a three phase clock. The old CPU ran at 22 MIPS with a 66 MHz clock. The new version runs at 40 MIPS with a 40 MHz clock.
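The MIPS figures in these entries are just clock rate divided by clocks per instruction. A quick check, assuming the three phase design spends one oscillator cycle per phase (the entry does not spell this out):

```python
def mips(clock_mhz, clocks_per_instruction):
    """Instructions per microsecond for a fixed clocks-per-instruction CPU."""
    return clock_mhz / clocks_per_instruction

print("4 cycles/instruction at 16 MHz:", mips(16, 4))
print("three phase clock at 66 MHz   :", mips(66, 3))
print("single phase clock at 40 MHz  :", mips(40, 1))
```

The single phase design gets more work done at a lower clock because every cycle retires an instruction.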

*

Thu, 07 Aug 2003

Meanwhile, back at the network stack
I spent some time going over the networking code, cleaning it up, making it smaller. It's now down to 707 instructions with the host monitor, 627 without. Source here. (I've also finished - kind of - my write up of the CPU itself).

*

Wed, 06 Aug 2003

pcb: it's a Good Thing
Well, I've been trying out pcb and it works pretty well. The documentation is out of date, and it really needs a tutorial, but once you've figured it out it works great. Here's the board for the CCD, planning on etching it this week.

*

Thu, 31 Jul 2003

New chip
Some new chips came last week from National - LM9627. They're 640x480 color CMOS CCDs. What's new and exciting about this part (I should write advertising copy, don't you think?) is that all the timing and analog circuitry is on-chip. You talk to it with I2C and get a stream of 12 bit samples out the back.

Trouble is that the package is an LCC48 - sockets are hundreds of dollars - so I'm having to think about hand soldering wires to the chip.

*

Wed, 30 Jul 2003

Goodbye Eagle?
I've been trying to make a new board (for a new chip, more on that later) and am getting incredibly frustrated with Eagle. The library editor is a big weakness, and its whole schematic entry vs. PCB layout paradigm doesn't fit my workflow. (I always create a schematic that's just the same as the PCB, effectively laying out the board twice.) I'm looking at pcb. It's certainly hard to get started - but I remember struggling with Eagle for several weeks at first too.

*

Sat, 26 Jul 2003

Using Spartan II's JTAG pins as general purpose I/O
According to these notes, it's possible to use the JTAG pins as general purpose I/O. This would be nice - I wouldn't have to put a second connector on the board for the host interface.

OK, I get this error message:

ERROR:NgdBuild:423 - The tdi symbol "u1" is not supported in the spartan2
   architecture. Of the currently installed architectures, the following support
   tdi symbols: SPARTAN, SPARTANXL, XC4000E, XC4000EX, XC4000L, XC4000XL,
   XC4000XLA, XC4000XV, XC5200.

It's clear that this isn't supported.

*

Fri, 25 Jul 2003

echo2: missed the bus
Looks like I'm not going to get this board ready in time for the Olimex August vacation. Oh well, gives me time to check it over, try to make it smaller and possibly add an IR input.

*

Thu, 24 Jul 2003

echo2: first cut
The PCB layout for the 100x52mm board is kind of done. I'm missing a few discretes, but they should fit right in. I picked up the DRAM from Jameco today. It's a j-lead package - the chip leads curl underneath the package. Wonder how hard it will be to hand solder. I also got a 2.1mm coax PCB for the board's main 5V supply.

*

LF1S022 library part
I made an Eagle part for the LF1S022 10-BaseT filter. It's in the same library as the other parts, my-smd-ic.lbr.

*

Tue, 22 Jul 2003

Layout
I'm having trouble fitting everything on a 79x49mm board. Part of the difficulty is the huge vias: they're 40 mils across, which makes them hard to route around. I'm trying a larger board: 100x52mm.

*

Sat, 19 Jul 2003

Shopping
Yesterday I ordered a bunch of stuff:

  • RTL8019AS and LF1S022 from Rabbit
  • XC2S30 from Insight Electronics

*

Eagle library design for RTL8019AS

Couldn't find any designs out there, so I did one myself. It's in my-smd-ic.lbr. The MT4C16257 DRAM is in there too.

*

Thu, 17 Jul 2003

Board time
OK, the experiment seems to have been a success. I suspect some of its problems are caused by the very long traces (6") between the FPGA and the RTL8019. (I get TCP checksum errors unless I insert extra NOPs in the I/O subroutines).

It's time to design a board. Olimex always gives you 160 mm x 100 mm, so a 79x49 board gets you 4. What to make? Well, I plan to re-do something I made last year. It's a TCP/IP -> S/PDIF converter: it just sits on the network and forwards a byte stream to a digital audio interface. So a simple process on a server can decode MP3s, OGGs or whatever and forward it to the tiny device.

The one I made last year uses a Rabbit module for TCP/IP and the necessarily large FIFO. It also has an FPGA to drive the S/PDIF TOSLINK interface. It worked OK - it's still running - but skips about once every 10 minutes, can't do more than 44.1kHz at 16bit, sometimes just dies, and because of the $55 Rabbit module, cost me about $80 in parts. The biggest headache was that the TCP/IP stack couldn't handle the sustained bandwidth: 1.41Mbit/s. I had to do a bunch of weird hacks to get it to accept packets fast enough.

So this time will be different. Parts:

  • For the FPGA, the Xilinx XC2S30. This is the CPU and controller.
  • For Ethernet, RTL8019AS connected with a 16-bit interface
  • Buffering for 2 seconds of audio: a cheap 4 Mbit DRAM
  • Bootstrap the FPGA with a cheap PIC and serial EEPROM

This should be able to handle 48kHz at 24 bit. For MP3 decoding, MAD produces 24 bit output.
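The bandwidth and buffering numbers can be worked out directly. This sketch assumes two channels, raw PCM with no framing overhead, and a 4 Mbit DRAM of exactly 256k x 16 bits; the buffer times are raw bit counts and ignore how 24 bit samples would be packed into 16 bit words.

```python
def stream_bits_per_second(sample_rate_hz, bits_per_sample, channels=2):
    """Raw PCM bit rate, ignoring any transport or S/PDIF framing overhead."""
    return sample_rate_hz * bits_per_sample * channels

DRAM_BITS = 256 * 1024 * 16   # assumed 256k x 16 organisation: 4 Mbit

for rate, bits in ((44_100, 16), (48_000, 24)):
    bps = stream_bits_per_second(rate, bits)
    print(f"{rate} Hz / {bits} bit: {bps / 1e6:.3f} Mbit/s, "
          f"{DRAM_BITS / bps:.2f} s of audio in 4 Mbit")
```

By this count the DRAM holds roughly 3 seconds of 44.1kHz/16 bit audio and a little under 2 seconds at 48kHz/24 bit.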

*

DRAM
Jameco has a cheap ($3) 256kx16 70ns DRAM, part number 128688CP. It's the same as this DRAM from Micron.

*

Wed, 16 Jul 2003

Ring buffer
OK, the dropped packet is not being received by the RTL8019. The RTL8019 receive buffer has two pointers, a read pointer and a write pointer. The RTL8019 writes packets at the CURR pointer, but never writes packets past the BNRY pointer, which the packet driver is meant to use as a read pointer. Some drivers set BNRY to the value of the read pointer, some set it to (readptr-1).
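A toy model of the two BNRY conventions, in Python. This is not taken from the datasheet: it only assumes that the chip fills pages from CURR and stops when it would reach BNRY, and the page numbers are illustrative (256-byte pages covering 0x4600 to 0x5fff).

```python
PSTART, PSTOP = 0x46, 0x60   # assumption: ring of 256-byte pages, PSTOP exclusive
RING_PAGES = PSTOP - PSTART

def next_page(page):
    """Page after this one, wrapping at the end of the ring."""
    page += 1
    return PSTART if page == PSTOP else page

def prev_page(page):
    """Page before this one, wrapping at the start of the ring."""
    return PSTOP - 1 if page == PSTART else page - 1

def bnry_for(read_page, minus_one):
    """BNRY value a driver would write once it has consumed up to read_page."""
    return prev_page(read_page) if minus_one else read_page

def pages_writable(curr, bnry):
    """Pages the chip may fill from CURR before it would reach BNRY (toy model)."""
    return (bnry - curr) % RING_PAGES

read_page = curr = PSTART    # empty ring: nothing received, nothing pending
print("BNRY = readptr  :", pages_writable(curr, bnry_for(read_page, False)))
print("BNRY = readptr-1:", pages_writable(curr, bnry_for(read_page, True)))
```

In this model an empty ring with BNRY equal to the read pointer is indistinguishable from a full one, which is one plausible reason some drivers use (readptr-1).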

*

Tue, 15 Jul 2003

Losing packets
During a long transfer I see 1 second pauses, followed by a retransmit of the packet. From tcpdump it looks like the RTL8019 is either not getting the data, or it's not sending the ACK. Adding some trace to the driver to find out which.

*

Mon, 14 Jul 2003

TCP bandwidth check
For bulk data transfer - just streaming data into the CPU via TCP - it's running at 877 kbyte/s. Doing a software TCP checksum slows that down to 430 kbyte/s.

*

Sun, 13 Jul 2003

TCP data sent
Phew, I just successfully opened a TCP connection, sent a short segment, and closed the connection. I wasn't getting my own sequence numbers right, which upset the other end's state machine. Using netstat I checked that the other end goes into TIME_WAIT, as it should.

Next task: dealing with big packets. When I do something like:

ping -s 1000 192.168.0.199
I get missing packets when the RTL8019 receive pointer wraps. The machine thinks it's sending a reply, but it doesn't reach the other end.

OK, I fixed DMA reads so that reading off the end (i.e. above 0x6000) wraps back to 0x4600. Big pings now return.
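The wrap rule can be modelled in a few lines of Python. The two addresses come from the entry above; the function names and the flat `memory` buffer are stand-ins for the real remote DMA reads.

```python
RING_START = 0x4600
RING_END = 0x6000   # first address past the end of the receive ring

def wrap(addr):
    """Map an address that has run off the end back to the start of the ring."""
    if addr >= RING_END:
        addr = RING_START + (addr - RING_END)
    return addr

def read_ring(memory, addr, length):
    """Read length bytes from addr, wrapping at RING_END.

    Assumes length is smaller than the ring, so one wrap is enough.
    """
    return bytes(memory[wrap(addr + i)] for i in range(length))

# A packet that straddles the end of the ring.
memory = bytearray(RING_END)
memory[0x5FFE:0x6000] = b"\x01\x02"
memory[0x4600:0x4602] = b"\x03\x04"
print(read_ring(memory, 0x5FFE, 4).hex())
```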

But there are some very frustrating issues with the RTL8019AS. One is the disastrous documentation. I mean, nowhere does it even describe the memory map for DMA transfers: you apparently just know this by copying other people's drivers. For example, it's meant to have 16kbytes of SRAM on board, but everyone seems to set the buffers up for 8k - 0x4000 to 0x5fff - I can't see anywhere how to get a 16k buffer. I see that Rabbit Semiconductor is selling the AX88796L for $11.20 each. This part does 100Base-T, and the documentation looks really clear.

*

Thu, 10 Jul 2003

Debugging TCP
One obvious idea that had escaped me until this morning: things are a lot easier to debug if you watch events as they happen:

tcpdump -v -i xl0 'host 192.168.0.199'

I had the TCP checksum calculation wrong: the checksum includes the TCP pseudo-header. It's fiddly. But I'm now sending a single TCP reset packet correctly:

$ telnet 192.168.0.199 20000
Trying 192.168.0.199...
telnet: connect to address 192.168.0.199: Connection refused
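For reference, the pseudo-header checksum mentioned above works like this: the one's complement sum covers the source and destination IP addresses, a zero byte, the protocol number and the TCP length, followed by the TCP segment itself. A Python sketch; the addresses, ports and header values in the example are illustrative, not captured from the board.

```python
import socket
import struct

def ones_complement_sum(data):
    """16-bit one's complement sum of data, padded with a zero byte if odd."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def tcp_checksum(src_ip, dst_ip, segment):
    """TCP checksum over the pseudo-header plus the whole TCP segment."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF

# Illustrative 20-byte RST|ACK header with the checksum field left at zero.
header = struct.pack("!HHIIBBHHH", 20000, 1025, 0, 1, 5 << 4, 0x14, 0, 0, 0)
csum = tcp_checksum("192.168.0.199", "192.168.0.111", header)
print(f"checksum 0x{csum:04x}")
```

A receiver can verify a segment by running the same sum with the checksum field filled in; the result should come out as zero.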

*

Mon, 07 Jul 2003

About clocks
I wrote down some ideas about deriving an arbitrary frequency clock in VHDL. This must be well known to hardware people, but wasn't completely obvious to me until I thought about it.

*

Housecleaning
Fixed a few issues with the CPU: made it reset cleanly, made sure it still fits in the smallest Spartan II FPGA, the XC2S15. Shrank the IP code - now down to 612 instructions with debug, 450 without.

I started looking seriously at TCP last night. I think that a safe first step is to send a TCP reset.

*

Sun, 06 Jul 2003

22 MIPS
The Xilinx "Timing Analyzer" found the slow logic in the CPU. With that fixed, it now runs with a 66MHz oscillator, giving a comfortable 22 MIPS. The ping numbers are now more respectable:

64 bytes from 192.168.0.199: icmp_seq=630 ttl=64 time=1.933 ms

*

Sat, 05 Jul 2003

CPU improved
OK, I made some improvements to the CPU and it's now running a 3-stage clock - ping is better:

64 bytes from 192.168.0.199: icmp_seq=292 ttl=64 time=7.907 ms

I moved the call-return stack from main memory into a dedicated 16x10 bit RAM. This used 10 additional slices, but then I saved about 10 slices by having a simpler memory interface and sequencer, so it's a wash. Also, this frees up 16 program words.

*

Faster?
OK, I can ping things, but it's a little slow: about 10ms.

64 bytes from 192.168.0.199: icmp_seq=3260 ttl=64 time=10.468 ms
64 bytes from 192.168.0.199: icmp_seq=3261 ttl=64 time=10.469 ms
64 bytes from 192.168.0.199: icmp_seq=3262 ttl=64 time=10.486 ms

The Rabbit on the network is much faster, about 2ms. I think that the current 4 MIPS VHDL CPU needs improvement. (I'm using a 16MHz oscillator, and a very simple 4 cycles/instruction scheme.)

*

Fri, 04 Jul 2003

Read a packet: ARP time
Set up the registers and - magic magic - in came a packet. It was an ARP request. I can make an ARP request go out on the network by doing this:

arp -d 192.168.0.199 ; ping 192.168.0.199
and the board picks it up. Now I just need to make it reply. OK, done that:
# arp 192.168.0.199
? (192.168.0.199) at 12:34:56:78:9a:bc on xl0 [ethernet]

... that wasn't too hard. Currently using 477 program words out of 1000. Hmm, wonder how big IP and ICMP will be.

IP and ICMP took about another 130 words:

64 bytes from 192.168.0.199: icmp_seq=146 ttl=64 time=141.437 ms
64 bytes from 192.168.0.199: icmp_seq=147 ttl=64 time=141.139 ms
64 bytes from 192.168.0.199: icmp_seq=148 ttl=64 time=141.061 ms
64 bytes from 192.168.0.199: icmp_seq=149 ttl=64 time=140.902 ms

*