I've finally finished (and debugged) my Fast Discrete Hartley Transform for the Propeller. It takes 2^N signed 32-bit input samples and computes the transform in place, leaving the output in the same array. I've measured the following:
16 samples in 12,496 clock cycles
64 samples in 87,648 clock cycles
256 samples in 534,656 clock cycles
1024 samples in 2,899,504 clock cycles
4096 samples in 14,668,112 clock cycles
At 80 MHz that's about 27.6 1024-sample DHTs per second!
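
For anyone who wants to check the arithmetic or work out the rate for the other sizes, here is a small Python sketch. It only uses the cycle counts listed above and the 80 MHz clock; the names `CLOCK_HZ`, `cycles`, `transforms_per_second` and `cycles_per_sample` are made up for this illustration and have nothing to do with the Propeller code itself.

```python
# Throughput arithmetic for the measured cycle counts.
# Assumption: the clock runs at exactly 80 MHz, as quoted in the post.
CLOCK_HZ = 80_000_000

# Measured cycle counts from the post, keyed by transform size.
cycles = {
    16: 12_496,
    64: 87_648,
    256: 534_656,
    1024: 2_899_504,
    4096: 14_668_112,
}


def transforms_per_second(n, clock_hz=CLOCK_HZ):
    """How many n-sample transforms fit into one second of clock cycles."""
    return clock_hz / cycles[n]


def cycles_per_sample(n):
    """Average cost of one input sample for an n-sample transform."""
    return cycles[n] / n


if __name__ == "__main__":
    for n in sorted(cycles):
        print(f"{n:5d} samples: {transforms_per_second(n):10.1f} DHTs/s, "
              f"{cycles_per_sample(n):8.1f} cycles/sample")
```

For the 1024-sample case this works out to 80,000,000 / 2,899,504, or roughly 27.6 transforms per second. The cycles-per-sample column is just there to make the different sizes easier to compare.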