Lead, Never Follow
MISSION
- Product: Low Cost, High Performance, Digital Signal Processing
- Approach: Software Reconfigurable Hardware
SOFTWARE ARCHITECTURE
- Ported X-Midas to the Linux PC for low-cost system-in-a-box and laptop applications (1994).
- Developed Java-based MIDAS technology for the Web. NeXtMidas = Networked XMidas.
- Java environment for graphics, portability, robustness, and industry interface (JAVABEANS).
- Java primitives support local host or browser based processing and/or real-time GUIs.
- Native code (C/C++/Fortran) processing primitives support local host or DSP nodes.
- Extended Midas macro language for application and GUI development at the engineering level.
- Two-tier application model. Remote Midas InterFace (RMIF) with network bandwidth reduction hooks, embedded WebServer.
- The goal is to provide a framework that allows Midas engineers to program hardware.
- Open source model – runs on Unix, Windows, or VMS.
HARDWARE ARCHITECTURE
- Developed ICE-PIC, high speed PCI data acquisition / playback (1996)
- Dual channel modular I/O.
- High-level programmable DSP.
- Run-time programmable Gate Array.
- Special purpose digital tuner chips.
- Multi-site modular processing (2002).
- Flexible data-flow and module interconnects.
- Drivers for Linux, OpenVMS, Digital Unix, Solaris, SGI, Windows-9X/NT, MacG3.
- Growing line of products – keeping in step with current technologies.
HARDWARE ENGINES
- DSPs – general purpose, good at FFTs, data movement, sequencing, terrible with bits. Easy to program.
- TigerSharc TS201 = 3.6GFlop, 14.4GOp, 5Gby/sec IO, 2.5Watts.
- FPGA – good at bits, serial correlation, data packing, terrible at floating point.
- Fixed point 16×16 multiply takes 100ns, 600 cells. Typical size 2000-5000 cells. E321 algorithm in 1000 cells = one Alpha 4100.
- DIGITAL TUNERS – great for tuning, filtering. Low power. No programming.
- Graychip GC4016 = 4 chan, 100MHz, 18GOp.
- PLATFORM FPGA – Field Programmable Gate Arrays with RISC core, dedicated FP multipliers and 3GIO.
- With 44 MAC units at 250MHz = 22GOp. 3.2Gby/s full duplex serial IO bandwidth.
- At 2Watts per chip (Xilinx Virtex-II Pro or Altera Stratix), it is possible to bring 22GOp to a laptop, or 220GOp to a PCI slot.
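The platform FPGA figures above follow from simple arithmetic, sketched below. The sketch assumes that one multiply-accumulate (MAC) counts as two operations, and that the PCI-slot figure corresponds to ten chips per card; both are inferred from the quoted numbers rather than stated in them. The function names are illustrative only.

```python
# Peak-throughput arithmetic for a bank of MAC units.
# Assumption: one MAC = one multiply + one add = 2 operations.
OPS_PER_MAC = 2


def peak_gops(mac_units: int, clock_mhz: float) -> float:
    """Peak giga-operations per second for mac_units running at clock_mhz."""
    return mac_units * clock_mhz * 1e6 * OPS_PER_MAC / 1e9


def gops_per_watt(gops: float, watts: float) -> float:
    """Processing density in GOp per watt."""
    return gops / watts


if __name__ == "__main__":
    chip = peak_gops(44, 250)          # 44 MAC units at 250MHz
    density = gops_per_watt(chip, 2)   # 2 watts per chip
    card = 10 * chip                   # assumed: ten chips in one PCI slot
    print(f"per chip: {chip:.0f} GOp, {density:.0f} GOp/W, per card: {card:.0f} GOp")
```

Under these assumptions one chip reaches 22 GOp at about 11 GOp/W, which is consistent with the roughly 10GOp/W quoted for the PFPGA.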
FIXED POINT vs FLOATING POINT
- Floating point is easy to code; fixed point code must handle scaling issues explicitly.
- Fixed point (usually 16bit) typically requires only half the IO bandwidth of 32bit floating point.
- Fixed point performance/power: 10GOp/W (PFPGA), 2GOp/W (TI6414).
- Floating point performance/power: 1GFlop/W (TMS), 0.5GFlop/W (TigerSharc), 0.2GFlop/W (Pentium).
- Telecommunications algorithms are usually fixed point code for portability/power reasons.
- Max processing in a PCI slot: 200GOp Fixed (PFPGA), 4GFlop Float (TMS).
- Max processing in a LapTop slot: 20GOp Fixed, 0.6GFlop Float (SHARC).
- There will always be more data to process than we have resources for. Maximize processing density.
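The "scaling issues" of fixed point can be illustrated with a minimal sketch of 16-bit Q15 arithmetic, a common fixed-point convention in which values in [-1, 1) are stored as integers scaled by 2^15. The Q15 format and every helper name below are assumptions chosen for illustration; the text above does not say which format a given product uses.

```python
# Q15 fixed point: a 16-bit signed integer q represents the value q / 2**15.
Q = 15
SCALE = 1 << Q                      # 32768
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(x: int) -> int:
    """Clamp to the 16-bit signed range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to Q15, saturating out-of-range input."""
    return saturate(int(round(x * SCALE)))


def from_q15(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values.

    The 16x16 product has 30 fractional bits, so it must be rounded and
    shifted right by 15 to return to Q15. Floating point does this
    rescaling automatically; fixed point code does it by hand.
    """
    product = a * b
    return saturate((product + (1 << (Q - 1))) >> Q)


def q15_add(a: int, b: int) -> int:
    """Add two Q15 values, saturating on overflow."""
    return saturate(a + b)


if __name__ == "__main__":
    half = to_q15(0.5)
    print(from_q15(q15_mul(half, half)))                 # 0.25
    print(from_q15(q15_add(to_q15(0.75), to_q15(0.75))))  # saturates just below 1.0
```

Each Q15 sample occupies two bytes, against four for a 32-bit float, which is where the IO bandwidth saving comes from.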
THE NEAR FUTURE
- Extreme fixed/floating-point processing, running standard software primitives.
- 200GOp processing in a PCI slot, 20GOp in a laptop.
- 10 Gbit/sec I/O architecture, with capture to disk.
- Hanging hardware off the network with RMIF (ICEBOX, ICE-NIC).
