
  4 Gflops (2 Nodes)
  Four TM-66 Chips
  60 Instructions per Clock
  32-bit FPUs
  VLIW Architecture
  64 MB Static RAM
  1.6 GB/sec DMA

S4 Vector Processor
The S4 vector processor is designed to execute computationally intensive scientific algorithms quickly and efficiently. The S4 processor board is the basic scientific processing node in the SAM system. Each S4 combines two vector processor nodes on one board, and each node delivers GFLOPS of actual sustained processing power. No other vector processor on the market delivers a higher percentage of its peak theoretical GFLOPS on real applications. The node architecture is well suited to scientific and DSP applications, particularly FFT, vector, and matrix operations. User application programs are written as sequences of function calls that execute on the S4 vector processor. Hundreds of optimized scientific and DSP routines are available in the DSP Math library.
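To illustrate what "a sequence of function calls" can look like, here is a minimal sketch in plain Python. It runs on an ordinary host, not on the S4, and the routine names (`vmul`, `fft`, `vmagsq`, `power_spectrum`) are hypothetical stand-ins chosen for this example. They are not taken from the DSP Math library, whose actual interface is not described on this page.

```python
import cmath

def vmul(a, b):
    """Elementwise vector multiply (hypothetical routine name)."""
    return [x * y for x, y in zip(a, b)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def vmagsq(x):
    """Squared magnitude of each element (hypothetical routine name)."""
    return [abs(v) ** 2 for v in x]

def power_spectrum(samples, window):
    """The 'application': three library-style calls chained together."""
    return vmagsq(fft(vmul(samples, window)))

# Eight samples of a cosine that completes two cycles in the block
samples = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
window = [1.0] * len(samples)  # rectangular window
spectrum = power_spectrum(samples, window)
```

The point of the sketch is the shape of `power_spectrum`: the application itself contains no loops, only calls to vector routines, which is the programming style the paragraph above describes.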
S4 Architecture
Each S4 node features a very long instruction word (VLIW) architecture, which lets multiple hardware resources work in parallel on a continuous flow of data. TM-66 DSP chips, together with a very high bandwidth memory, supply the number-crunching power of the S4. Two 800 Mbytes/sec DMA ports (1.6 GB/sec combined) keep data flowing between the S4 nodes and SAM system memory. This combination of processing power and memory bandwidth makes the S4 well suited to repetitive DSP processing on extremely large data sets.
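The DMA figures can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted on this page. It assumes that both ports transfer concurrently at their full rated speed, that "MB" and "Mbytes" both mean decimal megabytes, and that the 64 MB of static RAM is filled in a single pass; none of these assumptions is confirmed by the page, and the function names are illustrative.

```python
# Figures quoted on this page
DMA_PORTS = 2
MBYTES_PER_SEC_PER_PORT = 800
STATIC_RAM_MBYTES = 64

def aggregate_dma_mbytes_per_sec(ports=DMA_PORTS,
                                 per_port=MBYTES_PER_SEC_PER_PORT):
    """Combined rate, assuming both ports run concurrently at full speed."""
    return ports * per_port

def seconds_to_move(mbytes, rate_mbytes_per_sec):
    """Idealized transfer time with no setup or arbitration overhead."""
    return mbytes / rate_mbytes_per_sec

total_rate = aggregate_dma_mbytes_per_sec()   # Mbytes/sec
fill_ms = seconds_to_move(STATIC_RAM_MBYTES, total_rate) * 1000

print(f"Aggregate DMA rate: {total_rate} Mbytes/sec")
print(f"Idealized time to move {STATIC_RAM_MBYTES} MB: {fill_ms:.0f} ms")
```

Under those assumptions the aggregate rate matches the 1.6 GB/sec DMA figure in the specification list, and the static RAM could be refilled in roughly 40 ms. Real transfers would take longer once overheads are included.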
TM-66 DSP Chip
The TM-66 chip was designed by Texas Memory Systems to provide the number-crunching power for the SAM system. Each TM-66 chip has 20 floating-point execution units, six fast data I/O ports, and a separate instruction port. Internally, the twelve adders and eight multipliers are arranged into two parallel pipelines. Although the chip has many execution units, its architecture is simple. Because all data management and processing are deterministic, efficient algorithms are straightforward to implement.
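The unit counts above can be tallied against the board-level figures in the specification list. This sketch uses only numbers from this page. Two things are assumptions rather than stated facts: that the four chips and the 4 Gflops rating are split evenly between the two nodes, and the variable names themselves.

```python
# Figures quoted on this page
ADDERS_PER_CHIP = 12
MULTIPLIERS_PER_CHIP = 8
CHIPS_PER_BOARD = 4
NODES_PER_BOARD = 2
BOARD_GFLOPS = 4.0

# Per-chip total should match the 20 floating-point units quoted above
units_per_chip = ADDERS_PER_CHIP + MULTIPLIERS_PER_CHIP

# Board-level total of floating-point execution units
units_per_board = units_per_chip * CHIPS_PER_BOARD

# Assumption: chips and Gflops are divided evenly between the nodes
chips_per_node = CHIPS_PER_BOARD // NODES_PER_BOARD
units_per_node = units_per_chip * chips_per_node
gflops_per_node = BOARD_GFLOPS / NODES_PER_BOARD

print(f"{units_per_chip} units per chip, {units_per_board} per board")
print(f"{units_per_node} units and {gflops_per_node} Gflops per node (even split assumed)")
```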
