Real-time Compression for Acoustic Array Time-Domain Data

From Navy 18.1 SBIR Solicitation

N181-067 TITLE: Real-time Compression for Acoustic Array Time-Domain Data

TECHNOLOGY AREA(S): Battlespace, Electronics, Sensors

ACQUISITION PROGRAM: PMS 485, Maritime Surveillance Systems Program Office

The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with section 5.4.c.(8) of the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.

OBJECTIVE: Create innovative algorithms and software for commercial off-the-shelf (COTS) general-purpose computers and Digital Signal Processing (DSP) equipment capable of converting time-domain data from a passive acoustic array into a compressed data stream that can be transmitted via satellite communications (SATCOM) and rebuilt into a replica of the original data.

DESCRIPTION: The Navy is seeking solutions to enable data from acoustic arrays to be transmitted in real-time without degradation to shore facilities for processing by specialists. Such transmission solutions would reduce the need for installation of expensive data processing and display systems on ships and the requisite specially trained crewmembers on-board to perform real-time data analysis. Since the capability sought is expected to be fielded as software that can be integrated into the shipboard array processor system, the recurring cost to field the capability would be minimal.

Current surface Anti-Submarine Warfare (ASW) practice for Surveillance Towed Array Sensor System (SURTASS) ships requires installation of an expensive, custom-built data processing system on each ASW ship, because the data from the arrays is too large to be transmitted to shore in real-time via Navy satellite communications (SATCOM). The quantity of data expected to be created by next-generation arrays is even greater and available satellite data bandwidth is not expected to grow from its present size. Therefore, current and future ASW platforms need a lossless data compression capability that enables raw time-domain data from each element of the array (up to 256 channels simultaneously) to be transmitted to shore and reconstructed as an exact replica to enable accurate data processing and precise target localization.

A unique solution to acoustic data compression is required for this application. Unlike consumer audio applications in which psychoacoustic phenomena are leveraged and much of the inaudible data is selectively removed, the compression scheme must preserve the time-domain waveform precisely in both amplitude and time across all sensor channels in order for the array beamforming performance to be fully exploited by the DSP system. Additionally, unlike commercial audio applications, all sensor channels are receiving data from a common real-world physical source (e.g., there is not a guitar on one channel and a vocal on another); therefore, each channel is processing the same acoustic data, but with variations in amplitude and time among the channels. It may be assumed that the configuration of the sensors, including their spacing and bandwidth, is provided to the compression and decompression algorithm. Sampling rates may vary among sensor channels. It should be noted that ambient noise needs to be preserved in the compression, and electronically induced sensor noise can be assumed less than ambient noise (and therefore inconsequential).

The product for this effort is software source code that can be integrated into the Navy’s common processor system, which is based upon the Intel x86-64 platform and the Linux operating system. (Previously other SBIR-sponsored software projects have been integrated into this common processor system, and appropriate safeguards to protect the contributors’ intellectual property have been put in place.) A COTS DSP device may be used if required, but it will need to be integrated into the processor system. It is acceptable to leverage available open-source code. The objective ratio of compression is 80% compared to the raw array data stream with an input/output latency of less than one minute.

In summary, the capability includes the following characteristics that are not available in today’s lossless compression/de-coding schemes:

  1. Input has user-variable number of channels up to 256
  2. Output incorporates forward error correction (FEC) with automatic negotiation capable of supporting radio/satellite data links with a bit error rate of 0.01, end-to-end latency of two seconds, and data block outages of up to two seconds
  3. Including FEC, output is single data stream composed of 80% of the data volume compared to raw data
  4. Sample rates can be selected by the user in 1Hz intervals up to at least 96kHz
  5. Sample rates can vary among channels, but will be in integer multiples of each other
  6. After coding and de-coding, the time domain waveform shall be visually identical to the original with objective performance of true effective 16-bit resolution (96dB dynamic range), < 0.05% Total Harmonic Distortion (THD), frequency response accuracy +/- 0.1dB, and phase accuracy between channels of +/- 1 degree at 0.9*(Nyquist frequency/2)
  7. Coding and de-coding processing latency not to exceed one minute

Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. Owned and Operated with no Foreign Influence as defined by DOD 5220.22-M, National Industrial Security Program Operating Manual, unless acceptable mitigating procedures can and have been implemented and approved by the Defense Security Service (DSS). The selected contractor and/or subcontractor must be able to acquire and maintain a secret level facility and Personnel Security Clearances, in order to perform on advanced phases of this contract as set forth by DSS and NAVSEA in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material IAW DoD 5220.22-M during the advance phases of this contract.

PHASE I: Develop a concept for real-time lossless compression for acoustic array time-domain data implementation and perform analysis, modeling, and/or a demonstration to support the technical recommendation. The Phase I Option, if awarded, will include the initial design specifications and capabilities description to build a prototype solution in Phase II. Develop a Phase II plan.

PHASE II: Using the requirements and concept of Phase I and the Phase II Statement of Work (SOW), develop and deliver a prototype for a complete implementation of the data compression and decoding capability. Demonstrate the prototype’s performance in a lab using real-life array data that will be supplied by the Navy. (Since this data is classified, this demonstration can be performed on accredited classified equipment in the performer’s facility or at a Navy facility.) Support a temporary installation of the compression capability aboard a Navy ship to demonstrate the performance of its design in an operational environment; support the installation of the decoding capability at a Navy shore site; and provide operational testing technical support and performance analysis. In preparation for a potential Phase III, provide an estimate of schedule, non-recurring cost, and product cost to support integration of the capability into the Navy processor system.

It is probable that the work under this effort will be classified under Phase II (see Description section for details).

PHASE III DUAL USE APPLICATIONS: Support the Navy in transitioning the technology to Navy use. Work with the Navy’s integrator to support the integration and testing of the capability into the Navy processing equipment on-board an operational SURTASS ship. Tasks may include software development, software quality assurance, cybersecurity support, development of documentation, and test support on shore and at-sea. Deliver future software/hardware builds of the processor system with the SBIR-developed integrated compression and de-coding capability.

Passive acoustic arrays are used in the oil industry and this data compression capability would have direct application for data storage or transmission via radio. As an example of a business case, this data compression capability could be used to decrease deployment costs significantly by enabling the use of a relatively inexpensive unmanned vessel to collect acoustic data rather than a ship with crew ($50k-100k/day).

REFERENCES:

1. Johnson, M., Partan, J., and Hurst, T. “Low Complexity Lossless Compression of Underwater Sound Recordings.” J. Acoust. Soc. Am., March 2013, Vol 133, No. 3, pp. 1387-1398. https://soundtags.st-andrews.ac.uk/files/2012/05/Johnson_etal_JASA2013.pdf

2. Liebchen, Tilman. “MPEG-4 ALS – The Standard for Lossless Audio Coding.” The Journal of the Acoustical Society of Korea, October 2009, vol. 28, no. 7. http://elvera.nue.tu-berlin.de/files/1216Liebchen2009.pdf

KEYWORDS: Audio; Lossless Compression; Acoustic Data Compression; Real-time Data Compression; Lossless Compression/De-coding Schemes; Data Compression; Communications in a Battlespace Environment

TPOC-1: Mandeep Nehra

Phone: 401-832-9174

Email: mandeep.nehra@navy.mil

TPOC-2: Mary Johnson

Phone: 401-832-3840

Email: mary.h.johnson@navy.mil

Submitted Proposal: A181-067-0655

Note: Formatting has been changed to suit the web page layout; page title blocks, disclosure restriction blocks, and the trade secret page have been removed from the text.

Volume Two: Technical Proposal for Phase I

1 Identification and Significance of the Problem or Opportunity

An acoustic array sensor simultaneously outputs a parallel set of data streams, one from each detector, containing both the sounds of interest and the ambient sounds being detected. Because of the close detector spacing, the similarity in detector operation, the similarity in digitization, and the post-calibration processing (whether performed in the detector readout electronics or in the compression processing), the sounds of interest and the ambient sounds appear nearly identical on every channel, offset only by time shifts that depend on the physical placement of the detectors and the arrival angles of the detected sounds. Even when several sounds from different sources are detected simultaneously, the signals remain highly correlated across the sensor array, which makes the detected data highly compressible.

There is a need to compress the high volume of data from acoustic array sensors, with little to no signal degradation, so that it can be sent in real-time over limited-bandwidth SATCOM communications to a data processing center for detailed signal processing analysis and expert evaluation by specialists. This could greatly reduce the cost of operating a deployed sensor system compared to carrying the sensor processing equipment and sensor specialists on board to perform the analysis. Operations could then use lower-cost unmanned vessels, smaller and less costly crews, or vessels whose primary mission is other work, for which towing an acoustic array sensor system is a minor inconvenience with real value.

Note: Through conversation with the topic authors, it has been acknowledged that non-acoustic noise is a moderate data component that cannot be mathematically compressed, and that the target compression system is therefore not expected to achieve the target compression range through lossless binary data compression alone. The data compression is not expected to be "Lossless Data Compression" in the strict sense of that term as stated in the Request for Proposal.

2 Phase I Technical Objectives.

The Phase I objectives are:

  1. Create a base compression program architecture for an IA x86 COTS Linux server that performs real-time compression of variable-length blocks of samples from the parallel acoustic array sensor channels, supports multiple types of compression and decompression algorithms, and allows new algorithms to be added easily. Some code components, such as post-compression data management, will be simplified preliminary components in Phase I. The overall compression system, software components, compression algorithms, and compression performance are expected to evolve over the three project phases.
  2. Create a decompression program, matched to each version of the compression program, for an IA x86 COTS Linux server that decompresses the data blocks. The compressor internally runs a matched decompressor for each compression algorithm so it can maintain a running difference against the input data and determine when compression is complete; this program simply breaks that decompressor portion out of the compressor into a separate decompression program.
  3. Create an acoustic sensor array data simulator, including the ability to mix in or feed real acoustic sensor array data to the stream that feeds the compression program. Real unclassified acoustic sensor array data, ideally from the target sensor but acceptably from a similar array, is desired as soon as possible for compressor noise analysis, algorithm development, and ambient noise analysis, modeling, and compression. The simulator will also be used to feed precisely known data into the various compression algorithms for algorithm development and testing.
  4. Create a document with detailed descriptions of the software architecture, the software components (or modules), and the analysis and compression algorithms. It covers the preliminary compression algorithms and some performance measurement metrics. Invent, research, and document other possible compression algorithms and methods.
  5. There is no question that a full compression framework supporting multiple methods can be created and that some compression can be achieved. The only open question is whether the desired compression performance can be achieved by the end of the Phase II work.

The overall Phase I objective is to create a preliminary real-time acoustic array sensor compression and decompression program for a Linux server, with a preliminary compression capability in which multiple, different compression operations can be applied iteratively to the data blocks. It is expected that the algorithms used in the initial Phase I coding will not meet the compression rate goals for all possible data types, and that the algorithms and real-time code will be improved iteratively, with new compression and decompression algorithm sets added and refined as work progresses through the Phase I Option and Phase II, and possibly in future Phase III work.

3 Phase I Statement of Work

Phase I and the Phase I Option are each 6 months in length, and the work will be conducted at the Lightning Fast Data Technology Inc. headquarters (3419 NE 166th Ave, Vancouver, WA 98682). However, some of the non-classified work may be performed in The Dalles, OR.

3.1 Overview of Realtime Acoustics Array Compression System

Figure 1: Real-Time Acoustics Array Compression System shows the preliminary, simplified overall diagram from the acoustic sensor array to the SATCOM communications system. The compression program will execute on one or more IA x86-64 COTS systems running a recent Linux. The boxes external to the server are included for the overall operational discussion.

Acoustics Sensor Array: A two-dimensional array of nearly identical acoustic detectors with preset, known physical spacing. The array is controlled externally; the data order, rates, and associated settings are known and are provided to the compression program.

Real-Time Acoustics Sensor Array Compression Process: The real-time acoustic sensor array data compression process, discussed in more detail later in this document. It receives the sensor data and creates compressed data blocks along with difference data blocks, where a difference data block is mostly non-signal noise: the difference between the decompressed result and the input data (or the post-calibrated data). If needed, a sensor equalization process component may be included, but the equalization or calibration data will not change often; it will be communicated as a separate, very low-rate block type and probably stored between runs. Both the compressed data block and the difference data block are passed as a set to the Data Storage and Management Process.


Figure 1: Real-Time Acoustics Array Compression System

Data Storage and Management Process: Compressed and difference data blocks are received from the real-time compression process. Both block types are saved on high-speed solid-state drive(s), with the compressed data scheduled and sent to the SATCOM Data Transmission Formatter. This process keeps track of the data sent and can retransmit blocks if requested by feedback through SATCOM. Based on available bandwidth, the block importance classification assigned by the compressor, and data center feedback, it may also send some of the difference blocks, and continue sending them when the sensor is idle. Feedback also allows tracking of the data block ranges that have been successfully received and backed up at the data center. This process additionally manages and provides data to a locally connected data copy system; for example, a drone comes within range and data is transferred to it for high-speed retrieval. It deletes data when the data is no longer needed or has been fully saved at the data center via SATCOM, drone, and/or an external data collection system.

Data Storage System: High-performance solid-state storage for sensor data. Very large capacity, possibly 3D XPoint, to allow the high rewrite counts needed to handle large amounts of continuous data from the sensor while simultaneously supporting high-speed output. Early on there can be a storage area for the input acoustic array sensor data; however, the decompression and difference blocks should allow the input data to be recovered as long as there are no errors in the compression and decompression algorithm set. A second data store might also be useful for temporary data if system memory is not large enough, or compression is not fast enough, to hold all needed data at the same time.

Data Transmit Control: This handles feedback from the data center through the SATCOM. It is assumed that the SATCOM will provide control and operation management, which is outside the scope of this compression work; however, the SATCOM subsystem might be able to carry some communication blocks of feedback information from the data center. This is not automatic retransmission as in TCP/IP operation, but direct requests for retransmission of data blocks covering specific time periods. Specific difference data could also be requested.

SATCOM Data Transmission Formatter: Internally, the sensor data blocks do not carry forward error correction (FEC) data; this process adds the FEC codes to the data set. With a worst-case bit error rate of 0.01 (discussed during pre-release technical conversations), roughly one bit in every 100 (= 1/0.01) will be in error, so the protected word should be the largest power-of-two size smaller than 100 bits. That size is 64 bits, which requires 8 check bits of FEC (the same single-error-correct arrangement used in an ECC memory module). This process feeds the SATCOM and works with the Data Storage and Management Process to keep data available for continuous SATCOM communications. It sends difference data when bandwidth permits.
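
To make the stated overhead concrete, the sketch below computes the 8 check bits for one 64-bit data word using a (72,64) Hamming SECDED arrangement, the same scheme used by ECC memory modules. This is an illustrative assumption only; the actual formatter would also define interleaving and framing on the SATCOM link, and a stronger block code could be substituted if a 0.01 bit error rate proves too harsh for SECDED alone.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative (72,64) Hamming SECDED check-bit generator: 64 data bits are
 * protected by 7 Hamming check bits plus 1 overall parity bit, giving 8 FEC
 * bits per 64-bit word (12.5% overhead), as in ECC memory. Sketch only. */
static uint8_t secded72_check_bits(uint64_t data)
{
    uint8_t bit[72] = {0};                /* codeword positions 1..71 used   */
    int pos = 1;

    /* Place the 64 data bits in positions 1..71, skipping power-of-two
     * positions (1,2,4,...,64), which are reserved for the check bits.      */
    for (int i = 0; i < 64; i++) {
        while ((pos & (pos - 1)) == 0)
            pos++;
        bit[pos++] = (uint8_t)((data >> i) & 1);
    }

    /* Hamming check bit p covers every position whose index has bit p set.  */
    uint8_t check = 0;
    for (int p = 0; p < 7; p++) {
        int parity = 0;
        for (int j = 1; j <= 71; j++)
            if (j & (1 << p))
                parity ^= bit[j];
        check |= (uint8_t)(parity << p);
    }

    /* Overall parity over data and check bits adds double-error detection.  */
    int overall = 0;
    for (int j = 1; j <= 71; j++)
        overall ^= bit[j];
    for (int p = 0; p < 7; p++)
        overall ^= (check >> p) & 1;
    return (uint8_t)(check | (uint8_t)(overall << 7));
}

int main(void)
{
    printf("check byte: 0x%02x\n", secded72_check_bits(0x0123456789abcdefULL));
    return 0;
}
```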

High Speed Localized Communications Formatter: Provides data copying to an external system. The external system could be a drone that, ideally, lands on the boat, waits for the transfer to complete, and takes off, since copying large amounts of data can take considerable time even at a high data rate. Alternatively, it could be a copy system on the boat that continuously copies data which someone picks up every few days. Once data has been received at the data center, a SATCOM message can inform the data storage manager so the old data can be deleted and the storage released. This is a future component and will initially have only basic operation at moderate data rates over a LAN connection to a Data Copy System.

Data Copy System: A system to remove compressed data from the compression system and also to offload all of the difference data to free up storage for new data. Many solutions are possible, including direct reading of the data disks by a separate general-purpose Linux process, but that process would need to understand the data file organization. The decompression process will need to be executed to convert the data back to sensor data.

SATCOM Radio: Outside the scope of this project, other than providing a way to feed data to the SATCOM radio; transfer to the SATCOM is Phase II or Phase III work. It is also assumed that this radio is part of the control system and probably has bidirectional communications. The system can still operate and pass data to the SATCOM even without feedback from the data center; blocks lost in communications will simply not be resent.

SATCOM Control System: Assumed to exist (probably under a different name) and able to provide input to the Data Transmit Control; however, it is outside the scope of this project.

Sensor Array Control: Assumed to exist (probably under a different name) and outside the scope of this project; however, the compressor needs to know the readout rate and sequence settings.

3.2 High Level Compressed Data Block Format

The preliminary compressed data block format is shown in Figure 2: Compressed Data Block Format. This format allows the beginning of a valid block to be located and verified even when it is intermixed with corrupted blocks, which can occur in SATCOM communications at such high error rates. All data is in little-endian format, the native format of the x86 processor.


Figure 2: Compressed Data Block Format

Data Header: Each data block has a header used for framing, sizing, and locating data in a communications stream or a data file. The byte-packed header contains:

Framing Byte: The first byte of the header, which allows a scan-and-validation process to find the next start of a frame. A valid framing byte plus the compressed block size gives the hash location, allowing a full frame to be found and validated.

Flag Word: An 8-bit flag field holding the header type, data type, and calibration ID:

Large Header: 1 bit: Indicates an extended header for compressed block sizes larger than 64K bytes (the large data header will be similar, to be defined later if needed).

Data Type: 3 bits: Compressed data, difference data, compression calibration, information about the data, etc.

Calibration ID: 2 bits: If the calibration ID changes, the system knows to look for a new calibration block for data decoding. Sensor calibration changes will, however, be avoided in the middle of compression processing, since the data blocks might not decode correctly without the correct calibration data.

Reserved: 2 bits.

Compressed Block Size: 16-bit word: Size in bytes, the sum of the Header, Compressed Data Block, and Data Hash sizes.

Data Set Tag: A 16-bit word used to match blocks to their data sets. The data set ID is probably an incrementing counter; within a short time frame the tags will be distinct, but over a long period the count will roll over and other sets will reuse the same tag. The tag is there to sort out data and difference data in the near term; over the long term, the data must be organized into data sets and managed by some other means.

Start Sample: A 32-bit word indicating the starting sample of this block, counted from the start of the data set for the fastest-rate channel. At the fastest possible rate this will roll over in about half a day; if it overflows, a new data set is created and this becomes sample 0 of the new data set.

Sample Frame Count: A 16-bit word giving the number of sample frames compressed into this block, at the fastest sample rate of the compressed data.

Compressed Data Block: The payload that the system or the acoustic sensor array compressor provides to support decompression. It can be a compressed data block, difference data block, calibration data block, etc., as indicated by the Data Type above.

Data Hash: 32-bit word: Since a simple sum hash can cancel out when two bit errors occur at the same bit position in two words, SHA-1 will be used. The 160-bit hash is reduced to 32 bits by exclusive-ORing its five 32-bit words together. The hash covers the Data Header and the Compressed Data Block; if the recalculated hash differs, there is an error in the data. This gives roughly a 1-in-4-billion chance that a matching hash conceals an internal data error; a 16-bit hash could be used instead, giving about a 1-in-65-thousand chance (16 or 32 bits to be determined).
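
A minimal C sketch of the byte-packed header and the hash fold described above. The field names and the bit positions within the flag word are illustrative placeholders; the field sizes follow the descriptions above (12 header bytes in total).

```c
#include <stdint.h>
#include <stdio.h>

/* Byte-packed, little-endian data header (illustrative layout; field sizes
 * follow the descriptions above: 1 + 1 + 2 + 2 + 4 + 2 = 12 bytes).         */
#pragma pack(push, 1)
typedef struct {
    uint8_t  framing;        /* fixed framing byte for stream resynchronization */
    uint8_t  flags;          /* bit 0: large header, bits 1-3: data type,
                                bits 4-5: calibration id, bits 6-7: reserved */
    uint16_t block_size;     /* bytes: header + compressed data + data hash   */
    uint16_t data_set_tag;   /* incrementing data set identifier              */
    uint32_t start_sample;   /* first sample, counted at the fastest rate     */
    uint16_t sample_frames;  /* sample frames compressed into this block      */
} block_header_t;
#pragma pack(pop)

/* Fold a 160-bit SHA-1 digest (20 bytes) into the 32-bit hash word by
 * exclusive-ORing its five 32-bit words, as described above.                */
static uint32_t fold_sha1_to_32(const uint8_t digest[20])
{
    uint32_t h = 0;
    for (int w = 0; w < 5; w++) {
        h ^= (uint32_t)digest[4 * w]
           | (uint32_t)digest[4 * w + 1] << 8
           | (uint32_t)digest[4 * w + 2] << 16
           | (uint32_t)digest[4 * w + 3] << 24;
    }
    return h;
}

int main(void)
{
    uint8_t dummy_digest[20] = {0x01, 0x02, 0x03};   /* stand-in for SHA-1 output */
    printf("header size: %zu bytes, folded hash: 0x%08x\n",
           sizeof(block_header_t), fold_sha1_to_32(dummy_digest));
    return 0;
}
```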

In addition to the compressed data blocks, some data-set-related blocks, such as general information about the data, compression calibration data, channel information, etc., will be sent at the beginning of the data. Ultimately the system will know which blocks a data set requires and can request them through SATCOM feedback if they are lost to transmission errors.

3.3 Data Center Receive Processing

Figure 3: Data Center Compressed Acoustics Array Receive System shows a preliminary, simplified diagram of an IA x86-64 COTS-based system that receives, manages, decompresses, and outputs the acoustic data into the data processing system at real-time rates.

SATCOM Radio: Outside the scope of this project, other than providing a way to feed its output data into the compressed data receive system. It is expected that the data center has communications back to the sensor system so that some information about received data blocks can be fed back. The system can work without feedback, but it will then not be able to directly recover lost data blocks.


Figure 3: Data Center Compressed Acoustics Array Receive System

SATCOM Data FEC Process: Performs single-bit error detection and correction, checks for and corrects data byte alignment when bytes drop out of the FEC word groups, parses the data stream for compressed data packets, and validates the data blocks using the hash. It passes the data blocks to the Data Storage and Management Process.

Data Storage and Management Process: The received data is saved in high-speed data storage and all data block information is kept organized. Some packets will be received out of order, since some data blocks may compress faster than others depending on content and be sent before an earlier block is queued for transmission. Data blocks are put in order and fed into the real-time decompression process. If feedback is enabled, this process coordinates with the sensor side. Data blocks can be migrated to the data center as a background process, and when real-time receive processing has completed, the remaining data set can be offloaded into the data center and the receive system storage cleared for future receive processing.

Data Receive & Loss Feedback: Dropout management, data management, and recovery may be possible with some feedback. (The actual feedback may be routed through a different path.)

Data Storage System: Similar to that discussed for the compression system.

Real-time Multi-Threaded Decompression Process: The high-speed decompression process matched to the compression process running on the sensor side. The data information will include the version of the compression process, and the decompression process supporting that version will be executed. Later, a full static data set and decompression might also be run on a data system as a general process for post-processing analysis. The output can be decompressed as fast as possible or paced at the sensor's output data rate, as needed.

Data Output Control: The Data Storage and Management Process publishes what data is available, the data center indicates what it wants to process, and the Data Storage and Management Process then provides a real-time feed of the requested data.

Data Center and Acoustics Array Sensor Processing System: The data center that performs the acoustic array processing and supports the data specialists.

3.4 Real-Time Acoustic Array Compression Process

The preliminary diagram in Figure 4: Real-Time Acoustics Sensor Array Compression Process expands the Real-Time Acoustics Sensor Array Compression Process box shown in Figure 1: Real-Time Acoustics Array Compression System. This is a multi-threaded process that performs the multi-channel data compression and is the primary focus of this contract.


Figure 4: Real-Time Acoustics Sensor Array Compression Process

This design allows multiple subtractive encoding steps to be applied to a data block. As each step is applied, the decoder for that algorithm is executed, and the difference from the original data is determined and evaluated. That new difference block is used by the compressor for the next round, until the overall difference from the original data is small enough that the last step produces only a small residual above the readout noise, which is compressed and saved as the final compression block. A difference block representing the readout noise floor is also generated. An occasional poorly compressed block may be acceptable if most other blocks compress well and bandwidth is available; how well the data blocks are compressing can be tracked as part of the process. If bandwidth becomes an issue, a small amount of the residual difference data might be shifted into the readout-noise difference block, but this will be avoided if possible.

To decompress the data later, the decompression algorithm for each compression block is executed and its result is added to the previously decompressed blocks, yielding the near-original data block minus the uncompressible detector readout noise. Decompressing the difference block and adding it to the decompressed data restores the original binary data, or the post-calibrated sensor data, depending on what was saved in the difference data block.
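
The control flow of that subtract-and-refine loop is sketched below in runnable form. The "algorithm" used here is deliberately trivial, each round just re-quantizes the current residual with a finer step, but the structure (encode, immediately decode with the matched decoder, subtract, repeat until only readout-level noise remains) is the point being illustrated; the real encoder would plug waveform-model and bit-level algorithms into the same loop.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Runnable toy of the subtract-and-refine loop: each round encodes the
 * current residual (here: coarse quantization), immediately decodes it with
 * the matched decoder, subtracts the decoded result, and repeats until only
 * readout-level noise remains.                                              */

static void encode_round(const int32_t *residual, int32_t *coded, int n, int32_t step)
{
    for (int i = 0; i < n; i++)
        coded[i] = residual[i] / step;          /* coarse coded representation */
}

static void decode_round(const int32_t *coded, int32_t *decoded, int n, int32_t step)
{
    for (int i = 0; i < n; i++)
        decoded[i] = coded[i] * step;           /* matched dequantizer         */
}

int main(void)
{
    enum { N = 8 };
    const int32_t original[N] = { 1200, -950, 431, 77, -1833, 512, -64, 9 };
    int32_t residual[N], coded[N], decoded[N], rebuilt[N] = {0};
    const int32_t noise_floor = 4;              /* assumed readout-noise level */

    memcpy(residual, original, sizeof residual);

    /* Compression side: successively finer rounds until only noise remains.  */
    for (int32_t step = 256; step >= noise_floor; step /= 4) {
        encode_round(residual, coded, N, step);     /* coded block -> output stream */
        decode_round(coded, decoded, N, step);      /* internal matched decode      */
        for (int i = 0; i < N; i++) {
            residual[i] -= decoded[i];              /* what is still unexplained    */
            rebuilt[i]  += decoded[i];              /* what a receiver reconstructs */
        }
    }

    /* residual[] now holds the difference (noise) block; sending it as well
     * restores the exact original samples.                                   */
    for (int i = 0; i < N; i++)
        printf("orig %6d  rebuilt %6d  remaining diff %3d\n",
               original[i], rebuilt[i], original[i] - rebuilt[i]);
    return 0;
}
```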

Compression and decompression algorithms can evolve incrementally, and new compression and decompression algorithm sets can be added as analysis and algorithm refinement work proceeds.

Data Input and Reorder: The 50-megabyte-per-second input data must be received and probably reordered for efficient calibration handling. This could be a physical Ethernet interface owned directly by the program for fast, low-latency, high-efficiency handling, or communications through the slower Linux protocol stack. There are some restrictions on usable methods at this data rate, but it is assumed that input is already a solved problem.

Sensor Calibration & Equalization: If the acoustic sensors are not calibrated in the readout circuitry, calibration is performed here, since the more similar the channel data is, the better the system can compress it. This component may also perform general runtime data monitoring: minimum, maximum, mean, runtime calibration checks, DC offset determination, environmental noise estimation, combined readout noise floor estimation, running signal power estimation, etc. Some of these items may be executed in other blocks, or not at all if they turn out not to be needed for the compression or monitoring operations.

Data Grouping and Separation: The post-calibrated data is ordered into buffers in a format suited to efficient compression processing. Channels at different data rates are organized as convenient, and the data is passed to the Data Buffer List Manager. Five channels of data, one near the center and four at the outermost corners of the sensor array, are copied into buffers convenient for data waveform and frequency analysis and passed to the analysis block.

Data Buffer List Manager: Temporarily holds data while waveform and frequency analysis is occurring. When a compression block length is determined, the data is separated and queued as an Original Data Block to begin compression processing.

Per Channel Waveform and Frequency Analysis: Each channel is analyzed for its current frequency and waveform components. Impulses, waveforms, and center frequencies are estimated, and the results are passed to the Data Waveform Analysis block.

Data Waveform Analysis with Past and Future Data: Data for each channel is kept for a second or two to allow additional frequency analysis when needed, and the analysis information is retained until compression of that data region is complete. Some trade secret algorithms may be used to calculate accurate signal and phase-angle relationships for the start of the next block, using the five analysis data sets, for some of the detected frequency components. This block provides data to the Data Block Compress Length Evaluator and to the Compression Encoder to help determine which compression algorithms to apply.

Data Block Compress Length Evaluator: Examines the data to be sectioned and determines how many samples in time to group into a block for compression. It may consider past decisions and the data following the current section (the future) to identify areas of similarity, quiet areas, etc., to help divide the data. It is desirable to create blocks big enough for good compression, so the per-byte packet and block overhead is minimized, but small enough to allow easy detection, recovery, and retransmission of corrupted blocks, and to keep the compression of any single large block from taking too long. Shorter blocks also have the advantage that block errors are averaged across many blocks, which can reduce compression biases internal to a block.

Multi Channel Compression Encoder: The primary compression process. It will run on a hyper-threaded CPU core at 100% (with correct Linux process management), working on the compression of one block until it is complete; other data blocks will simultaneously be compressed on other hyper-threaded CPU cores to achieve a high aggregate compression rate in a single system. A multitude of compression and decompression algorithms can evolve incrementally, and new algorithm sets can be added as analysis and refinement work proceeds. The input is either the original data block or the current difference block, i.e., the decompressed sum of the compressed blocks so far subtracted from the original data block.

Multi Channel Compression Decoder: This decompresses the current compressed data block and adds the data to the last decompressed data block to get the current decompressed block.

Data Difference Calculator: The current decompressed data block is subtracted from the original data block to get the new difference data block.

Data Difference Evaluator: The new difference data is evaluated to determine the quality of the compression. If the compression was ineffective, the compression can step back and try a different compression algorithm. If the difference is small, the compression process is done. If compression was acceptable, the compression data is kept and the compression encoder does another compression round with the new difference data.

Compressed Data Block Formatter: Composes the data header and appends the compressed data blocks to form the Compressed Data Block, without the hash.

Noise Data Difference Data Block Formatter: Creates the difference compressed data block; applying the difference data to the decompressed data restores the original data block. If no sensor Calibration & Equalization is present, the residual data forms the block directly. If Calibration & Equalization has been applied, the difference block can be computed against the data either after or before that step. If it is computed against the data before Calibration & Equalization, the decompressed data has the Calibration & Equalization reversed before the difference is calculated; this yields a larger difference data block but restores the exact binary input. The difference block data is stored in a bit-compressed format.

Hash Calculator & Append: The hash is calculated for the Compressed Data Block and for the Difference Data Block and appended to each; the header information ties the two blocks together. The pair of blocks is output to the Data Storage and Management Process, completing the compression of the data block.

3.5 Noise Considerations

Sample data components not generated by detected sound, such as internal detector noise, electronic circuit noise, crosstalk, A/D converter electronic and quantization noise, noise and rounding from any preprocessing, and any other non-acoustic noise contained in the data, are grouped together here as Detector Readout Noise. Detector Readout Noise is predominantly random, probably with some coloration from averaging effects. Truly random data is not compressible, which makes Detector Readout Noise incompressible. This noise forms a noise floor that limits the accuracy of the data, and it does not need to be sent if the SATCOM communications bandwidth cannot support it.

3.6 Basis of Initial Compression Algorithm

Frequency analysis of the five data sets (the channel closest to the center and the four outermost corners) will be used to find the primary signals, their incoming directions, and their phase relationships. For frequencies whose wavelength is shorter than the spacing between the five sensor positions, the number of whole cycles between the positions must be measured to determine accurate incoming angle(s). A signal's frequency and phase angle give a much more accurate waveform position than a mechanical sample-by-sample curve match, even for impulse matching; however, some algorithms might still use curve-overlay matching, and impulse data might be represented as a curve. Careful evaluation against real data sets will determine the best methods.

Working from the largest component to the smallest, a model of the data segment (frequency, amplitude, direction, starting phase, and some additional symbols categorizing how the waveform changes across the block) will be used to build the compressed blocks. Each compressed block removes the data explained by the waveform it encodes. After the series of accurately modeled components has been removed, the residual data should be fairly small. The residual is then split: the estimated Detector Readout Noise bits are sliced off the bottom, and the remaining residual, representing detected data, is compressed by a bit-level algorithm and becomes the last compressed block. The series of bulk waveform blocks plus the residual block makes up the compressed data.
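
As a hedged illustration of the "model the dominant waveform, then subtract it" step (standard quadrature correlation, not the proprietary analysis mentioned elsewhere in this proposal), the sketch below estimates the amplitude and phase of one detected frequency over a block, removes that component, and leaves the residual for later stages:

```c
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Estimate amplitude and phase of one known frequency component by
 * correlating the block against quadrature references, then subtract the
 * fitted waveform, leaving only the residual for later compression stages.  */
static void remove_component(double *x, int n, double f, double *amp, double *phase)
{
    double c = 0.0, s = 0.0;
    for (int i = 0; i < n; i++) {
        c += x[i] * cos(2.0 * M_PI * f * i);
        s += x[i] * sin(2.0 * M_PI * f * i);
    }
    c *= 2.0 / n;                        /* in-phase component               */
    s *= 2.0 / n;                        /* quadrature component             */
    *amp   = sqrt(c * c + s * s);
    *phase = atan2(-s, c);               /* x[i] ~ amp*cos(2*pi*f*i + phase) */

    for (int i = 0; i < n; i++)          /* subtract the modeled waveform    */
        x[i] -= *amp * cos(2.0 * M_PI * f * i + *phase);
}

int main(void)
{
    enum { N = 256 };
    double block[N];
    const double f = 10.0 / N;           /* an exact bin keeps the demo tidy */

    for (int i = 0; i < N; i++)          /* synthetic block: one tone + ramp */
        block[i] = 3.0 * cos(2.0 * M_PI * f * i + 0.7) + 0.01 * i;

    double amp, phase;
    remove_component(block, N, f, &amp, &phase);
    printf("estimated amplitude %.3f, phase %.3f rad, residual[0] %.3f\n",
           amp, phase, block[0]);
    return 0;
}
```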

Is this cheating, since these are the very quantities the end system is trying to measure, and the measurements could bias the results? Not at all: the data blocks are short, and the measurement parameters are recalculated and restarted at the beginning of each compressed block. The residual data removes the remaining compression error within the block. Full measurements at the data processing center will span a larger region formed from a series of data blocks, so estimation errors become less significant. This is simply a way to isolate and compress the bulk data characteristics as quickly as possible.

The theoretically shortest block would be one sample frame thick. For example, 20 bits could represent one of a million possible symbols (data patterns) describing the characteristics of the data across the block; with scale and rotation this could encode data very efficiently. A frame might be made of one or more symbols, and the symbol set would be static and known to both compressor and decompressor. This could work because there is always some relationship in the detected data across the sensor array, but no relationship in the non-signal noise. Symbols can also span multiple samples; however, matching against a large number of seemingly random symbols can be very difficult, and an algorithmically generated and matched symbol set might be needed. Symbol compression is expected to be difficult and might be explored as an additional method after bulk signal compression has been developed.

If there is a pattern to the residual signal error across the data set, a basic symbol could represent that pattern. For example, if the residual starts small and increases toward the end of the data block, a cone-shaped symbol could compress the residual with one bit at the start, two bits in the center, and three bits at the end; another pattern might be an hourglass shape where the residual is smallest at the center.

Overall, there is a multitude of possible ways to compress this data. More methods can be found through thought experiments on the characteristics of actual data in hand (which is why processing and analysis of real data is important) and through document research, digging through the many papers and books to find better existing methods, or applying an existing method differently to create a new one.
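
A small runnable sketch of the shaped-residual idea from the cone example above: pack residual samples with a per-region bit width (1 bit at the start, 2 in the middle, 3 at the end). The bit writer and the width profile are illustrative assumptions; a real coder would choose widths so that no wanted signal is clamped away.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy bit packer for a shaped residual: each region of the block gets its
 * own bit width (here 1 bit at the start, 2 in the middle, 3 at the end for
 * the cone-shaped example).  Values outside a region's signed range are
 * clamped; a real coder would choose widths so clamping never loses signal. */
typedef struct { uint8_t buf[256]; unsigned bitpos; } bitwriter_t;

static void put_bits(bitwriter_t *w, uint32_t value, unsigned width)
{
    for (unsigned b = 0; b < width; b++) {
        if ((value >> b) & 1u)
            w->buf[w->bitpos >> 3] |= (uint8_t)(1u << (w->bitpos & 7u));
        w->bitpos++;
    }
}

static unsigned pack_residual(const int32_t *res, int n,
                              const unsigned *region_width, int regions,
                              bitwriter_t *w)
{
    int per_region = n / regions;
    for (int i = 0; i < n; i++) {
        unsigned width = region_width[i / per_region];
        int32_t lo = -(1 << (width - 1)), hi = (1 << (width - 1)) - 1;
        int32_t v  = res[i] < lo ? lo : (res[i] > hi ? hi : res[i]);
        put_bits(w, (uint32_t)(v - lo), width);   /* store as unsigned offset */
    }
    return w->bitpos;                             /* total bits consumed      */
}

int main(void)
{
    const int32_t residual[12] = { 0, -1, 0, 1, -2, 1, 0, -1, 3, -4, 2, -3 };
    const unsigned cone[3] = { 1, 2, 3 };         /* bits per third of block  */
    bitwriter_t w = { {0}, 0 };

    unsigned bits = pack_residual(residual, 12, cone, 3, &w);
    printf("packed 12 residual samples into %u bits (vs %d at 16 bits each)\n",
           bits, 12 * 16);
    return 0;
}
```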

3.7 DSP vs. COTS IA x86

The IA x86 Xeon has DSP-style instructions, a vector instruction set (multiple operations per instruction), large cache memories, support for 1-gigabyte memory pages (reducing address translation overhead in the virtual memory system) for efficient data access, and very fast arithmetic; the main problem is keeping the data available, since memory reads are slow. Plucking data from scattered memory locations causes a full cache-line read for each single data point and would be slow. Ordering the data and the processing so that most data is already in cache when it is needed can make a big difference. DSPs have the same problems, often with fewer resources, and the data ordering that makes DSPs fast also works on an x86. At roughly 0.3 ns per complex instruction or math operation, it is difficult to find a cost-effective DSP system that can outperform a Xeon with its multiple, parallel execution units per core. Many dedicated IA x86 CPU cores are expected to be available for processing, and no DSP system is planned for this project.
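
The cache argument above is easy to demonstrate: with 256 interleaved 16-bit channels, visiting one channel strides 512 bytes between samples, so each sample touches a fresh cache line, while a channel-major (planar) copy of the same data is read sequentially and auto-vectorizes. The sketch below shows the two access patterns (the array sizes and the summing operation are arbitrary illustrations):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

enum { CHANNELS = 256 };

/* Interleaved layout: one channel's samples are 512 bytes apart, so each
 * 16-bit read pulls in a whole 64-byte cache line for a single value.       */
static int64_t channel_sum_interleaved(const int16_t *frames, size_t nframes, int ch)
{
    int64_t sum = 0;
    for (size_t i = 0; i < nframes; i++)
        sum += frames[i * CHANNELS + ch];
    return sum;
}

/* Planar (channel-major) layout: the same work is a sequential read that the
 * compiler can vectorize and the hardware prefetcher can stream.            */
static int64_t channel_sum_planar(const int16_t *channel, size_t nsamples)
{
    int64_t sum = 0;
    for (size_t i = 0; i < nsamples; i++)
        sum += channel[i];
    return sum;
}

int main(void)
{
    const size_t nframes = 4096;
    int16_t *frames  = calloc(nframes * CHANNELS, sizeof *frames);
    int16_t *channel = calloc(nframes, sizeof *channel);
    if (!frames || !channel)
        return 1;
    printf("interleaved sum %lld, planar sum %lld\n",
           (long long)channel_sum_interleaved(frames, nframes, 7),
           (long long)channel_sum_planar(channel, nframes));
    free(frames);
    free(channel);
    return 0;
}
```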

3.8 Phase I Tasks

3.8.1 Task 1: Research Acoustics Sensor Array Data Sequence Variability

The array can have up to 256 sensor channels running simultaneously, with sample rates selectable in 1 Hz intervals up to 96 kHz. Sample rates can differ between channels but will be integer multiples of one another. A thorough understanding of the possible data sequences and data rates, and a plan for how a compression system could handle all of these variations, must be developed. It is assumed that only a small number of sample rates are in use simultaneously, and that the case where every channel has a different rate will never occur.
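
Because all channel rates are integer multiples of one another, every slower rate divides the fastest rate, and the small worked example below checks that a block sized in fastest-rate sample frames then contains a whole number of samples from each channel. The specific rates and block length are arbitrary examples.

```c
#include <stdio.h>

/* Worked example of the integer-multiple rate constraint: when every channel
 * rate divides the fastest rate, a block sized in fastest-rate sample frames
 * holds a whole number of samples for each slower channel as long as the
 * block length is a multiple of every rate divisor.                          */
int main(void)
{
    const unsigned fastest_hz = 96000;
    const unsigned divisor[4] = { 1, 2, 4, 12 };   /* example channel groups  */
    const unsigned block_frames = 4800;            /* 50 ms at 96 kHz         */

    for (int ch = 0; ch < 4; ch++) {
        unsigned rate = fastest_hz / divisor[ch];
        if (block_frames % divisor[ch] != 0)
            printf("group %d (%u Hz): block length is not aligned\n", ch, rate);
        else
            printf("group %d: %5u Hz -> %4u samples per block\n",
                   ch, rate, block_frames / divisor[ch]);
    }
    return 0;
}
```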

3.8.2 Task 2: Create the Base Compression Program

Using the preliminary block diagram of the COTS system portion shown in Figure 1: Real-Time Acoustics Array Compression System and the preliminary compression architecture in Figure 4: Real-Time Acoustics Sensor Array Compression Process, create the base program to run on a Linux system. This program will have very basic operation and will be refined during development as needed; items such as the SATCOM formatter and the more complex Data Storage and Management will be left for Phase II. All coding work will follow an evolutionary path, iteratively improved as time is expended rather than built to an exact definition with an exact delivery date; a much higher quality product can be evolved than can be guaranteed at the start, because of the learning and reorganization that occurs during the process.

3.8.3 Task 3: Create a Decompression Program

The compression and decompression algorithms are a matched set; the compression program contains both, and this program breaks the decompression portion out. This Linux program will decompress a data file containing a series of Compressed Data Blocks. If a difference data block follows a Compressed Data Block, both blocks will be used to reconstruct the original data block back to the original binary data. Tests will be run by passing a sensor data set through the compression and decompression programs, comparing against the original data for errors, and generating additional statistics such as compression efficiencies and average block sizes.

3.8.4 Task 4: Create an Acoustics Sensor Array Data Simulator

This program generates simulated acoustic sensor array data to feed the compression system. It will be used to generate specific data sets that allow testing and evaluation of the mathematical operations in the compression algorithms, and it may also generate simulated environmental sound data and detector readout noise for general signal processing tests. This supports modeling, operational demonstration, and compression performance analysis. The program will have limitations, however, since it is a stimulation tool for the data it generates and will not fully model an acoustic sensor array.

In real-time mode, the simulator will generate data at rates consistent with the acoustic arrays, simulating the 16-bit multi-channel (up to 256 channels) digital sample feed at selected rates up to 96K samples per second, with the start of each set accurately timed against the CPU's internal 64-bit running clock counter as a reference. This data can then be used to show that the compression algorithms can keep up with the input data rates. Data is generated with modeled frequency content consistent with the selected output sample rate; the sample rate is just a parameter of the waveform generation and can be changed as needed.
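
A pacing sketch for the simulator's real-time mode is shown below. The proposal's intent is to reference the CPU's 64-bit running counter; clock_gettime(CLOCK_MONOTONIC) is used here as a portable stand-in, since direct TSC use requires a per-machine calibration step, and the block generation call is a hypothetical placeholder.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Real-time pacing sketch: release one block of sample frames per period,
 * scheduled against a monotonic reference clock.                            */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

int main(void)
{
    const uint64_t sample_rate  = 96000;   /* fastest channel rate, in Hz    */
    const uint64_t block_frames = 4800;    /* 50 ms of sample frames         */
    const uint64_t block_ns = block_frames * 1000000000ull / sample_rate;

    uint64_t start = now_ns();
    uint64_t next  = start;
    for (int block = 0; block < 20; block++) {      /* one-second demo run    */
        /* generate_and_send_block(block) would be called here (hypothetical) */
        next += block_ns;
        while (now_ns() < next)                     /* simple busy-wait pacing; */
            ;                                       /* nanosleep() in practice  */
        printf("block %2d released at %3llu ms\n", block,
               (unsigned long long)((now_ns() - start) / 1000000ull));
    }
    return 0;
}
```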

This program will also be able to feed stored sensor data at either real-time or non-real-time rates. For real-time operation it may require fast disk(s), as in the Data Storage System discussed earlier.

3.8.5 Task 5: Research & Development of Compression Algorithms

Search for and study articles and books related to audio compression methods, acoustic sensor array data processing, and general data compression methods. Perform signal processing and compression on real acoustic sensor array data (if sensor data is available), look for compression issues, determine what the problems are, consider ways to solve them, and create algorithms to improve processing.

3.8.6 Task 6: Detailed Project and Algorithm Document

Document the software architecture, components, and libraries. Document the sensor data organization and the buffering that supports it, since data organization and how it is used directly can greatly affect signal processing performance. Document the waveform analysis methods that feed the compression system, along with their strengths and weaknesses. Document each compression algorithm used in the compression engine, its test results, its estimated compression performance (which varies with the data), and its strengths and weaknesses.

3.8.7 Task 7: Test and Data Display Program

A Windows utility, testing, and GUI display program that connects to the compression program over TCP/IP to monitor internal operations, plot data, and help develop the code and algorithms. The programs are coded to build and run on Linux, but because the debugger in the Windows development environment is far superior, most of the code will probably be built as a library and some debugging done with that library linked into a Windows program, where it is easy to set breakpoints, single-step, and closely examine the code.

Windows display code can easily be written to display data however it is needed. One possibility is a three-dimensional plot showing the progression from sensor or difference data to the next difference data, visually showing how the compression performed and how the residual is spread across the data block; this may reveal compression mismatches that would be harder to see by looking at numbers alone. This is a development convenience tool to help develop and evaluate the algorithms, and it will be extended as needed rather than built to a predefined specification.

3.8.8 Task 8: Project Management

General project support and management: work scheduling, planning, progress tracking and review, and SBIR and government contracting compliance review and planning.

3.8.9 Task 9: Phase I Monthly Status and Progress Reports

The Status and Progress Reports document the overall project status, the project's objectives for the month, the progress of each task, results obtained, and any concerns. Each report is provided within 15 days after the end of the month, excluding the last month, which is covered in the Final Report.

3.8.10 Task 10: Preparation of Material for Meetings

Compile algorithm reports, data, and status into documents, PowerPoint presentations, and laptop demonstrations to present at the sponsoring SYSCOM facility.

3.8.11 Task 11: Meetings at SYSCOM Facility

For the kickoff, mid-contract, and end-of-contract meetings, travel to the facility selected by the Contracting Officer to discuss the presentations, program-related information, and laptop demonstrations.

3.8.12 Task 12: Phase I Final Report

Contains detailed information on the project objectives, work performed, results obtained, and estimates of technical feasibility. Provided within 30 days of Phase I completion.

3.9 Phase I Option Tasks

3.9.1 Option Task 1: Continue Improving the Compression Program

The base program will be created quickly in Phase I; however, the data analysis and compression program will be an ongoing effort, and improving the algorithms, creating new compression algorithms, and improving the pre-compression data analysis will take considerable time. Improvement will continue across the Phase I Option and Phase II programs.

3.9.2 Option Task 2: Continue Improving the Decompression Program

Polish the decompression program for use as the real-time decompression process shown in Figure 3: Data Center Compressed Acoustics Array Receive System. The decompression software is also a linkable library component used to create a simple command-line, file-in/file-out decompression program that scripts can use to decompress compressed files on a general-purpose Linux IA x86 computer. Improvement will be an ongoing effort across the Phase I Option and Phase II programs.

3.9.3 Option Task 3: Continue Improving the Acoustics Sensor Array Data Simulator

As algorithms are refined, the test and simulated input data types and sets will need to be refined and extended to support more algorithm types and better sensor data simulation. This will continue to be an ongoing effort across the Phase I Option and Phase II programs.

3.9.4 Option Task 4: Continue Research & Development of Compression Algorithms

Algorithm and methodology searches and study will continue. Analyzing signal processing results, studying the related data issues, and forming new solutions to improve existing algorithms and create new ones will be an ongoing effort across the Phase I Option and Phase II programs.

3.9.5 Option Task 5: Continue Adding to the Project and Algorithm Document

As the code evolves, the software architecture, component, and code library documents will be updated. Continue documenting new compression algorithms and improvements to existing ones as the project progresses.

3.9.6 Option Task 6: Continue Improving the Test and Data Display Program

The Test and Data Display program will continue evolving as needed in support of the compression system development.

3.9.7 Option Task 7: Project Management

General project support and management: work scheduling, planning, progress tracking and review, and SBIR and government contracting compliance review and planning.

3.9.8 Option Task 8: Initial Design Specification and Capabilities Description

Information from this proposal, the diagrams, the research and development, data, and initial test results will be used to create the Initial Design Specification and Capabilities Description document called out in the request for proposal, which describes the prototype solution for Phase II.

3.9.9 Option Task 9: Create Phase II Proposal

Write the Phase II proposal for "Real-time Compression for Acoustic Array Time-Domain Data".

3.9.10 Option Task 10: Phase I Option Monthly Status and Progress Reports

Same as described in Phase I, but for the Phase I Option time period.

3.9.11 Option Task 11: Preparation of Material for Phase I Option Meeting

Same as described in Phase I, but for the Phase I Option material.

3.9.12 Option Task 12: Phase I Option Meeting at Sponsoring SYSCOM Facility

The Phase I Option kickoff will be held at the same time as the Phase I end-of-contract meeting. For the Phase I Option mid-contract and end-of-contract meetings, travel to the facility selected by the Contracting Officer to discuss the presentations, program-related information, and demonstrations.

3.9.13 Option Task 13: Phase I Option Final Report

Contains detailed information for project objectives, work performed, results obtained, and estimates of technical feasibility. Provided within 30 days of Phase I Option completion.

3.10 Schedule of Major Events

ww (work week) 2: Phase I kick-off meeting at SYSCOM facility.

ww6, ww11, ww15, ww20, ww24: Phase I Monthly Status and Progress Reports.

ww15: Phase I mid-project meeting at SYSCOM facility.

ww25: Phase I end of contract and Phase I Option kick-off meeting at SYSCOM facility.

ww26: Phase I deliverables: Compression Program, Decompression Program, Acoustics Sensor Array Data Simulator, Test and Display Program, Detailed Project and Algorithm Document, and Program User's Manuals.

ww29: Phase I Final Report.

ww32, ww37, ww41, ww46, ww50: Phase I Option Monthly Status and Progress Reports.

ww35: Phase II Proposal, Initial Design Specification and Capabilities Description document.

ww40: Phase I Option mid-project meeting at SYSCOM facility.

ww51: Phase I Option end of contract meeting at SYSCOM facility.

ww52: Phase I Option deliverables: Compression Program, Decompression Program, Acoustics Sensor Array Data Simulator, Test and Display Program, Detailed Project and Algorithm Document, and Program User's Manuals.

ww58: Phase I Option Final Report.

3.11 Deliverables

The following applies to Phase I, with a separate set of deliverables for the Phase I Option if it is exercised.

  1. Monthly Status and Progress Reports: May contain proprietary information; if requested, two versions will be provided, with and without the proprietary information.
  2. Final Report(s): May contain proprietary information; if requested in advance by the Contracting Officer, two versions will be provided, with and without the proprietary information.
  3. Compression Program: Buildable program for a Linux system, delivered as Git repositories.
  4. Decompression Program: Buildable program for a Linux system, delivered as Git repositories.
  5. Acoustics Sensor Array Data Simulator: Buildable program for a Linux system, delivered as Git repositories.
  6. Test and Display Program: Buildable program for a Windows system; also requires building code libraries for Windows from the Linux repositories. Delivered as Git repositories.
  7. Detailed Project and Algorithm Document: R&D document for the project.
  8. Program User's Manuals: Documentation on how to build the programs and their general use.
  9. Technical Meeting Presentation Documents: For meetings discussing technical algorithms and data, status, progress, and ongoing contract plans. May contain proprietary information; versions with the proprietary information removed could also be delivered.
  10. Phase II Proposal: The proposal for Phase II, if the Phase I Option is exercised.
  11. Initial Design Specification and Capabilities Description Document, if the Phase I Option is exercised.

3.12 Technical Data Rights Assertions

Technical Data for Restrictions | Basis of Assertion | Asserted Rights | Name of Person
Signal Power Measurement Algorithms | Contains proprietary trade secret algorithms developed at private expense; a patent application is in process. | Restricted (License) | Mike Polehn
Linux, Libs, DPDK, Components | Open Source Software | GPL, BSD Licenses | Open Source Community

The high-quality trade secret method (discussed later) that will be used in this project also has use in general array sensor signal processing. These algorithms are licensed only for this compression project through Phase III, including the deployed system. This gives no rights to disclose them elsewhere or to use them elsewhere; other agreements and/or contracts are required for other uses. Standard SBIR data rights apply to all other parts of the contract.

4 Related Work.

Note: Acoustic Signal Processing has the same exact meaning as Audio Signal Processing. Different audio processing algorithms are used for different types of end results, but that does not separate the meanings of acoustic and audio. Digital Acoustic Signal Processing is a sub-category of Digital Signal Processing, where Signal Processing is much more encompassing and represents the mathematics and methods of processing real-world data, which usually includes noise. (This is being provided since the evaluators of a different SBIR didn't understand these relations.)

4.1 Sensor Signal Processing and Algorithm Development

Sensor Signal Processing R&D for 7 years at Boeing Aerospace. This included sensor signal processing algorithm development, sensor simulations, noise analysis, data measurement, sensor signal processing R&D reports, and presentations, and I designed, manufactured, tested, and delivered high-performance pipeline digital signal processor hardware prototypes, along with a sensor emulator, to the missile defense center in Huntsville, AL. Lead Engineer for creation of an improved sensor digital signal processing hardware prototype. The R&D work included an algorithm using correlation methods to do a 6 to 1 data compression while maintaining high data content integrity.

Worked as an expert electro-optical engineering consultant (drawing on the Boeing experience) for Northrop Grumman, L-3 Communications, and Sensors Unlimited in support of the infrared camera in the U2 “military asset”. Created a signal processing analysis program to analyze the camera data, improved the deployed sensor performance, and improved the camera testing program. Lead development engineer for creation of a new sensor chip test program, which made testing easier, did better data measurements, and generated high-quality test reports. The camera system used correlation techniques to achieve a 12 to 1 data compression to reduce data rates for radio-based communications to the ground-based data analysis system. John Ueng-McHale, 609-649-5400. Mike Caro, 609-524-0341. 2003 to 2009, part time to 2014.

4.2 High Performance Real-Time Processing using COTS x86-64 Xeon

To help increase markets for Intel x86 Xeon processors, this work used IA x86 Linux servers to run real-time packet processing, which is directly applicable to doing real-time signal processing. On a single server, simultaneously achieved input/output data rates > 380 gigabit/sec across 20 10 GbE interfaces and basic packet processing of > 180 million packets/sec, while utilizing 20 CPU cores executing real-time code in 10 VMs (Virtual Machines). These were 60-second, zero-loss tests, but the operation and performance level could have been sustained indefinitely. Work included expert code and algorithm optimization. Intel, Trevor Cooper, (503)-696-2096, 2015.

Additional information can be found at http://www.lightningfastdata.com/info/flow_log.html.

4.3 Acoustic Data Signal Processing for Signal Frequency Measurement

A trade secret proprietary frequency analysis algorithm has been developed that will be very useful for signal analysis in this project (and possibly future Navy acoustic data processing projects). This algorithm allows high-quality frequency analysis at any chosen frequencies between 0 and the Nyquist frequency. It has some very important qualities that are not available with the standard frequency analysis methods.

This secret proprietary frequency analysis algorithm will be used in the detailed processing of the 5 data channels to find the primary content, and in some of the compression algorithms to find the location and starting phase of these primary content sounds in the remaining channels. These algorithms will be used in the initial target compression algorithms discussed above and in some of the future compression algorithms, though some compression algorithms might not use them. The compression algorithms to try first, and the order in which they will be tried, will be based on the initial detailed processing of the 5 data channels.

An acoustic processing program is under development based on this algorithm. Due to expert knowledge of optimization, signal processing, and utilizing x86 CPU to do signal processing: On a laptop (8th gen i7, 4 core, 8 threads, 15 watt) the program can sustain a real-time processing rate of around 465 million 64 bit floating point operations per second on a single computer core. In addition to the math, the single core is also converting 16 bit data to 64 bit floats for each input data access, doing loop processing, getting and saving intermediate and result data, processing data buffers, sending data buffers to the sound system, getting the done buffers from the sound system, comparing minimum and maximum data results, clearing and scaling and plotting data up to 40 times/sec (the plotting also uses considerable CPU resources), and doing interactive Windows processing for the program, all while maintaining real time operation (the sound does not breakup), and the Windows program remains fully interactive. Altogether this is a very substantial amount of processing being executed on 1 CPU core. The preliminary acoustics program will be used to evaluate real sensor data and the plots will help visualize the frequency content, but it is not part of this project. It is available for your evaluation as a prototype.

Additional information can be found at http://www.lightningfastdata.com/freq/freq_anal.html.

A patent application is currently being prepared for the frequency analysis algorithm. Convolution, the FFT, and the Goertzel algorithm are the existing standard methods for frequency analysis.


Figure 5: Comparison of Convolution and Goertzel Response

Plots created for the patent's discussion of current methods in Figure 5: Comparison of Convolution and Goertzel Response show that a point-by-point sine and cosine wave convolution, given the same input sine wave data, produces exactly the same response as the Goertzel Algorithm, even though the base mathematics are completely different. Note that the FFT is based on the sine and cosine convolution, so it is covered as well. A 500 Hz center-frequency analysis with a 600 Hz input is used. These 200 points match to 13 digits, and the same frequency analysis with input data at the center frequency matches to 14 digits. These responses, by going up and back down to 0, show that the frequency response has a zero frequency width, which is undesirable when measuring real data that usually has frequencies that are not at the measurement center frequencies. This off-frequency measurement response is non-linear as you move across the curve in time, and the particular stopping point on the curve for data further off frequency is just noise, making accurate, high-quality data measurements difficult.
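For readers who want to reproduce the equivalence described above, the following is a minimal sketch (not the trade secret method, and not code from the patent) that computes the point-by-point magnitude progression of a sine/cosine correlation and of the Goertzel recursion at a 500 Hz analysis frequency with a 600 Hz input tone; the sample rate and point count are illustrative assumptions.

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double PI = 3.14159265358979323846;
    const double fs = 44100.0;               /* sample rate (assumed)           */
    const double f_meas = 500.0;             /* analysis center frequency       */
    const double f_in = 600.0;               /* off-frequency input tone        */
    const int N = 200;                       /* points, as in the Figure 5 text */
    const double w = 2.0 * PI * f_meas / fs;
    const double coeff = 2.0 * cos(w);

    double re = 0.0, im = 0.0;               /* running correlation sums        */
    double s1 = 0.0, s2 = 0.0;               /* Goertzel state                  */

    for (int n = 0; n < N; n++) {
        double x = sin(2.0 * PI * f_in * n / fs);

        /* Point-by-point sine/cosine correlation ("convolution"). */
        re += x * cos(w * n);
        im += x * sin(w * n);
        double mag_corr = sqrt(re * re + im * im);

        /* Goertzel recursion; the magnitude of its complex output tracks
         * the same partial-sum response. */
        double s0 = x + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
        double g_re = s1 - s2 * cos(w);
        double g_im = s2 * sin(w);
        double mag_goe = sqrt(g_re * g_re + g_im * g_im);

        printf("%3d  corr=%.13e  goertzel=%.13e\n", n, mag_corr, mag_goe);
    }
    return 0;
}
```

The two columns should agree to roughly machine precision, which is consistent with the 13-to-14-digit match described above.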

[Note: 1 page of Trade Secret information removed]

5 Relationship with Future Research or Research and Development.

(a) Anticipated Phase I results:

  1. A Sensor Array Simulator that can produce simulated sensor array outputs for well-defined, uniform, high-signal-amplitude inputs at various angles and mono frequencies, plus readout-noise simulation of uniform random and/or white noise (see the simulator sketch following this list).
  2. An acoustics sensor array compression system, as outlined in the discussion, that can analyze and measure the input data waveforms and noise floor and produce compression and difference blocks for various sample-length block sizes. High compression of simulated signals is expected.
  3. R&D of algorithms, environmental noise, and (hopefully) real array sensor data for analysis, which will provide data for determining more complex data and noise environments for compression.

Overall, this development and R&D work will provide very good data for assessing the technical feasibility of the proposed compression architecture and the initial proposed compression methods.
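As a rough illustration of item 1 (and only an illustration: the channel count, element spacing, sound speed, and sample rate below are assumptions, not values from the solicitation), a simulator of this kind can generate a single monochromatic plane wave arriving at a chosen angle across a line array, with per-channel time delays and a small amount of uniform readout noise:

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CHANNELS   8            /* assumed; target arrays run up to 256 channels */
#define SAMPLES    1024         /* samples generated per channel                 */
#define FS         8000.0       /* sample rate, Hz (assumed)                     */
#define SPACING    1.5          /* element spacing, meters (assumed)             */
#define SOUND_SPD  1500.0       /* nominal speed of sound in water, m/s          */

int main(void)
{
    const double PI = 3.14159265358979323846;
    double f = 200.0;                   /* mono source frequency, Hz           */
    double theta = 30.0 * PI / 180.0;   /* arrival angle from broadside        */
    double amp = 12000.0;               /* high-amplitude signal, 16-bit range */
    double noise_amp = 8.0;             /* small uniform readout noise         */

    for (int ch = 0; ch < CHANNELS; ch++) {
        /* Plane-wave time delay at this element along the line array. */
        double delay = ch * SPACING * sin(theta) / SOUND_SPD;
        for (int n = 0; n < SAMPLES; n++) {
            double t = n / FS;
            double s = amp * sin(2.0 * PI * f * (t - delay));
            double noise = noise_amp * (2.0 * rand() / RAND_MAX - 1.0);
            int16_t sample = (int16_t)lrint(s + noise);
            /* A real simulator would stream this into the compression
             * program; here the first few samples are just printed.    */
            if (n < 4) printf("ch %d  n %d  %d\n", ch, n, (int)sample);
        }
    }
    return 0;
}
```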

Anticipated Phase I Option results: 1) real-time multi-threaded operation running (most development work will be done in non-real-time mode); 2) improved simulation models; 3) evaluation of real data (if available) through the compression process; 4) additional and improved compression algorithms for environmental sounds and more complex sound environments; 5) additional R&D learning and documentation; and 6) preparation for Phase II.

(b) The significance of the Phase I objective is to provide a Phase II foundation for evaluating real sensor data and for creating and improving compression algorithms. It is anticipated that as the data becomes more complex, compression will correspondingly become more difficult, and some complex data blocks may have poor compressibility. Poor compressibility of some blocks is acceptable if other data blocks have good compressibility, since it is the overall average compressibility that is important. As work proceeds, it is anticipated that there will be diminishing returns on the work required to improve the algorithms and compression capability.

(c) Regarding clearances, certifications, and approvals: This is a fully US-owned business. The principal investigator and owner, Mike Polehn, has held a secret clearance in the past and anticipates no issues obtaining a new clearance. He has received a "Military Critical Technical Data Agreement" before and should be able to get one for this project, and has held ITAR DDTC registration in the past, when it was needed, and should be able to obtain it again when needed for this project. The business is located in a private home; however, review of the facility clearance documents gives the impression that getting a secret facility clearance should not be an issue. The project will start out with 1 person, the principal investigator Mike Polehn. Coordination with the contracting officer will be done to get clearances, certifications, and approvals as needed, with appropriate adjustments to the project cost structure to support these.

6 Commercialization Strategy.

The compression of large-channel-count acoustic sensor array data for communications over a SATCOM link is a high-value component for the Navy. The sale of Phase II and Phase III contracts to the Navy to accomplish the development of the compression and decompression software through deployment is the primary focus and revenue for this work. There is also some value in accomplishment to help obtain future Navy and DoD contracts.

After Phase III deployment, the acoustic array compression will be marketed to other businesses, such as the oil industry, which also use acoustic sensor arrays. This highly specialized product has no use outside acoustic array sensor data compression. Overall, the non-DoD delivery count is expected to be very small.

7 Key Personnel.

PRINCIPAL INVESTIGATOR: Mike Polehn

Oregon State University, BS in Computer and Electrical Engineering, 1983

RELEVANT EXPERIENCE

Many years of R&D experience with digital signal processing systems: sensor evaluations, signal processing algorithm development, extensive simulations and processing performance evaluations, and design and delivery of high-performance pipeline signal processors to internal and DoD facilities. Extensive test and evaluation of an infrared sensor camera, including performance and data evaluation, test program improvement, extensive analysis computer programs, operational and test issue resolution, algorithm development, and operational performance improvement. Also did audio/acoustics signal processing, optimization, and compression for Intel and modem companies.

Information and resume can be found at http://www.lightningfastdata.com/mike/mike_info.html.

8 Foreign Citizens.

No foreign nationals, foreign citizens, or individuals holding dual citizenship working as a direct employee, contractor, or consultant will be working on or have access to the Phase I or Phase I Option projects.

9 Facilities/Equipment.

The physical facilities needed to carry out Phase I are only office space, since this will be programming, documentation, and compression analysis work. We currently have several fast i7 development systems, one of which is a very recent fast i7 running CentOS 7 with 32 gigabytes of DDR4 that runs as fast as a server, plus some i5 Linux systems. Overall, no new systems are needed for Phase I work. A new unconnected network can be created for classified data processing.

The COTS equipment needed to do the work is low cost and is provided by Lightning Fast Data Technology Inc., and a local area network island (not connected to any other networks) is easy to create. All classified data will be kept on encrypted disks and only decrypted during dynamic use, with all result files being encrypted. The decryption keys will be password protected and kept on a separate flash drive, which will be kept in a safe when not being used. Except when running overnight, long-running, or weekend compression tests, equipment will be shut down during non-working hours to clear any residual unencrypted data from memory. When classified data is no longer needed, it will be destroyed or returned to the contracting officer as requested.

The facility meets all environmental laws and regulations of Federal, Washington State, and local governments for, but not limited to, the following groupings: airborne emissions, waterborne effluents, external radiation levels, outdoor noise, solid and bulk waste disposal practices, and handling and storage of toxic and hazardous materials.

10 Subcontractors/Consultants.

No Subcontractor or Consultants are required for Phase I and Phase I Option.

11 Prior, Current or Pending Support of Similar Proposals or Awards.

No prior, current, or pending support for proposed work.

12 Discretionary Technical Assistance.

No Discretionary Technical Assistance (DTA) required for Phase I and Phase I Option.

Proposal Evaluation Criteria

From section 4.1 of the 18.1 DoD SBIR Program BAA:

a. The technical approach has a reasonable chance of meeting the topic objective,

b. This approach is innovative, not routine, with potential for commercialization and

c. The proposing firm has the capability to implement the technical approach, i.e., has or can obtain people and equipment suitable to the task.

Evaluation comments

It is unknown whether these are the exact criteria used in the debrief, since the exact criteria are not listed in the debrief; criteria b and c might also be swapped relative to the list in the debrief.

Note: The debrief text was copied and pasted; however, the formatting was changed to adapt to the HTML-type document.

Proposal Evaluation Debrief

Proposal Evaluation Debrief

Thursday, May 10, 2018

Proposal Evaluation Debrief N181-067-0655


Proposal Number: N181-067-0655 Topic Number: N181-067
Title: Real-time Compression for Acoustic Array Time-Domain Data
Firm: Lightning Fast Data Technology Inc.

Evaluator 1

  Criteria A:    
 
  Strength: The vendor has proposed a framework in which each block of data is evaluated for frequency/waveform components iteratively, the content being removed at each step, until it has sufficiently low energy that all content has been recovered and all that remains is noise. The algorithm expects this noise to be small and will compress it as a final step in the block compression.

The proposal outlines not only the compression approach, but all of the framework surrounding it from array to SATCOM ready data.

At the heart of the approach is the vendor’s trade secret Signal Power Measurement Algorithms.

This company understands how to configure a computer system for fast processing and operation, which would support the real-time aspects of this task.

  Weakness: The ocean acoustic noise is fairly random and the contacts of interest are very low in SNR, making this approach challenging. While identifying and peeling off the signal components as waveforms is intellectually appealing and attractive for eventual use in classification, it is likely that the final compression step for the block will overwhelm the data compression budget. If, in Phase I, the vendor focuses solely on his own data simulation, he is likely to misjudge the importance of the ocean background noise, flow noise, etc.

This proposal focuses on the data manipulation and methods for gathering the data from the array and transmitting it to the shore station more than on the actual compression. It merely says that “a multitude of compression and decompression algorithms” can be used and evolved.

Their emphasis is on the engineering of the data storage and processing hardware architecture.

They mention that a significant portion of their development work will involve building a sensor array simulator and then evaluating their compression approach using simulated data. It is likely that their simulated data will not be completely representative of actual data, and thus provide false characterization of their compression scheme, especially without having significant prior underwater acoustic array experience. A significant amount of work would be required to essentially develop all of the compression components from scratch; in fact, the proposal includes time dedicated to researching existing compression techniques.

  Criteria B:    
 
  Strength: The PI has many years of experience developing and implementing high performance digital signal processing systems, including audio/acoustics signal processing and compression.

The PI has held a clearance in the past.

PI has performed work for the DoD in the past. He has held a secret clearance previously, and expects no issues in obtaining one.

  Weakness: The PI has no experience with underwater acoustics and has made some incorrect assumptions.

This is a one-employee company with very limited acoustic data experience. The PI has an emphasis on computing and processing capabilities, and there is no one else at the company to assist with the acoustics aspects of the project. The PI does not seem to have experience specifically with compression techniques either. A lot of effort would be spent learning the relevant background information. The required clearances for working with classified data are not currently in place.

  Criteria C:    
 
  Strength: The vendor expects to market to the Navy and to the oil and gas industry.
  Weakness: There is little information in the commercialization statement and no detailed plan.

Lightning does not perceive any market for the developed technology outside of the applications specifically involving acoustic array data compression and therefore does not emphasize marketing to non-DoD businesses. They do point to the oil industry as a possible market, however.

Post Proposal Comments

I attempted an Award Protest, for which I had a week. Putting together a discussion of the weaknesses and inaccuracies in the evaluation resulted in a paper that was longer (30 pages of discussion about the debrief information) than the proposal. Realizing that this was probably a big waste of time and that the time would be better used for other things, I stopped writing, did some formatting, and added a section from a DARPA proposal that discussed my very successful IR sensor and IR FPA chip test programs, which totaled 183,480 lines of code and would require 5 1/4 500-page reams of paper just to print (70 lines/page, single sided), to show that I could write a very substantial program by myself if I need to. It could have stood polishing, but I needed to put it behind me and forget about adding all the information that came to mind. The program manager never responded to the Award Protest.

Award Protest Document

To: jilion.kohler@navy.mil, michelle.prinstone@navy.mil

Dear Jilion, Michelle,

These are comments in protest of the Award Evaluation Debrief N181-067-0655.

Contents

Criteria A
  Comments on Strength Evaluations
    Invalid Assumption About How Much Processing Is Being Done
  Comments on Weaknesses Evaluations
    Comment A
    Comment B
    Comment C
    Comment D
      Processing Software Architecture
      Data Storage
    Comment E
    Comment F
      Task 5: Research & Development Compression Algorithms
Criteria B
  Comment A
  Comment B
    "limited acoustic data experience"
    Limited Experience Angle
    One-Employee Company
  Comment C
  Comment D
    The Trivial Compression Paper
    Arbitrary Competing Example
    Problems with the Trivial Compression Example
    The G728 Compression
    Overall Audio Digital Compression
  Comment E
    Sensor Readout
    Satellite Communications
  Comment F
Criteria C
  Comment A
Capabilities Copied From Larger DARPA Proposal
Conclusion

Criteria A

a. The soundness, technical merit, and innovation of the proposed approach and its incremental progress toward topic or subtopic solution.

Comments on Strength Evaluations

The vendor has proposed a framework in which each block of data is evaluated for frequency/waveform components iteratively, the content being removed at each step, until it has sufficiently low energy that all content has been recovered and all that remains is noise. The algorithm expects this noise to be small and will compress it as a final step in the block compression.

Invalid Assumption About How Much Processing Is Being Done

"sufficiently low energy that all content has been recovered"

The primary analysis is to remove the highest-level signals to make the rest of the data more compressible. The highest general compression for a strong signal is to define a data component as a component waveform model that covers signal strength and change over time, which means I am looking at one of the highest possible compression methods first. A further compression of this already-compressed data could be a symbol for, say, signal model type X, at a starting amplitude, phase, and frequency, with amplitude modulation model symbol AM45 and frequency modulation model symbol FM22, where all the symbols are predefined and understood at both ends. The possible compression methods are endless; this is what was meant by "symbol" compression as stated in the proposal.
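To make the "symbol" idea concrete, here is a minimal sketch of the kind of record such a scheme could emit for one strong component; the field names, sizes, and model symbols (AM45, FM22, etc.) are illustrative assumptions, not the proposal's actual format:

```c
#include <stdint.h>

/* One strong signal component encoded as a predefined waveform model plus
 * its starting parameters; both the compressor and the shore-side
 * decompressor share the symbol definitions. */
typedef struct {
    uint16_t model_type;       /* e.g. "signal model type X"        */
    uint16_t am_model;         /* amplitude-modulation model symbol */
    uint16_t fm_model;         /* frequency-modulation model symbol */
    float    start_amplitude;  /* amplitude at the start of the block */
    float    start_phase;      /* starting phase, radians           */
    float    frequency_hz;     /* component center frequency        */
} waveform_component_symbol;
```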

A: Some of the signals will be compressed by using a signal model. The signals may be whale calls, rocks falling off underwater cliffs, etc. Some of the compressed signals might be the target sonar signal, though this may not always be the case.

B: The non-compressible noise data will never be able to be compressed, since truly random data is not compressible. I have worked with data that had about 3 bits of non-data-related noise, which can change based on the hardware design and the analog gains set for the signal being fed into the A/D converters. This readout/system noise is measurable and can be determined over a moderate amount of data. This noise is removed from the bottom but kept in separate data blocks for exact bit recovery if desired. In fact, this known issue of a non-compressible portion of the data can make lossless compression at the desired compression rates impossible, and it made this something I was not going to write a proposal for, until the discussion indicated that lossless compression of this portion was not expected, which is why, at the beginning of the proposal, I stated:

Note: Through conversation with the topic authors, it has been acknowledged that non-acoustic noise is a moderate component that cannot be mathematically compressed, and that the target compression system is not expected to have lossless binary data compression to achieve the target compression range. The data compression is not expected to be a "Lossless Data Compression" (which has specific meaning), as stated in the Request for Proposal.

However, by hardware and analog signal gain design, a sensor readout system could be built where the readout noise is lower than the least significant bit of the A/D converter, and there would be no readout/system noise component in the data. It all depends on the overall signal dynamic range and the converter binary word size. However, to get the best sensitivity for low signals, the signal gain before the A/D needs to be set so that some of the readout noise is present, but then there is a limit to the maximum signal that can be read out. These tradeoffs should already have been done. I suspect that low-signal sensitivity is highly desired and that some readout noise will be present in the data.

C: The signal data in between A and B is compressed with some sort of binary signal compression. This is still content and is not removed by the primary compression algorithm. As stated in the proposal:

After the series of accurately compressed data is removed, the residual data should be fairly small. The residual is separated where the estimated Detector Readout Noise number of bits is sliced off the bottom, and the remaining residual representing detected data is compressed by a bit algorithm and becomes the last block compressed.
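A minimal sketch of the residual split quoted above follows; the noise_bits parameter and the separate output blocks are illustrative assumptions about how the slice could be implemented, not the proposal's actual code:

```c
#include <stddef.h>
#include <stdint.h>

/* Slice the estimated detector-readout-noise bits off the bottom of each
 * residual sample.  The upper remainder (the detected data) goes to the
 * final bit-compression step; the bottom bits are kept as a separate raw
 * block so the original sample can be rebuilt exactly if desired. */
void split_residual(const int16_t *residual, size_t n, unsigned noise_bits,
                    uint16_t *upper_out, uint16_t *noise_out)
{
    uint16_t mask = (uint16_t)((1u << noise_bits) - 1u);
    for (size_t i = 0; i < n; i++) {
        uint16_t raw = (uint16_t)residual[i];          /* reinterpret bits   */
        noise_out[i] = raw & mask;                     /* readout-noise bits */
        upper_out[i] = (uint16_t)(raw >> noise_bits);  /* detected residual  */
    }
}
/* Exact recovery: (int16_t)((upper_out[i] << noise_bits) | noise_out[i]). */
```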

D: The A compression processing is stopped when the C processing appears to have reached a good enough compression level. This is much different from "all content has been recovered," which is an assumption that was made by the evaluator and not by me. If all content were recovered, there would not be a block for C.

The highest general compression for a strong signal is to define a data component as a component waveform model that covers strength and change over time, which means I am looking at the highest-potential compression method as the first level of compression! Of course, the characteristics of component waveform models can be represented as "symbols".

Comments on Weaknesses Evaluations

Comment A

The ocean acoustic noise is fairly random and the contacts of interest are very low in SNR, making this approach challenging. While identifying and peeling off the signal components as waveforms is intellectually appealing and attractive for eventual use in classification, it is likely that the final compression step for the block will overwhelm the data compression budget.

A: " data compression budget": During the prerelease interview I asked about the targets processing system, I was told "Assume Processing Power is not an issue". I can easily visualize a stack of COTS servers each having 2 28 core CPUs with ample memory, etc. Stacking up 10 system for a total of 530 CPU cores. The 500 watts per system for 5000 watts will only use a few horse power of the 1000s of horse power needed for a moderately small boat at sea. This is not an unreasonable system considering today's technology. This is probably tremendously overkill, but it fits within the "Assume Processing Power is not an issue" case. The compression budget cannot be considered as a weakness. In reality, I pride myself on writing quality, high efficiently programs, and expect more moderate CPU needs.

B: The data processing is to characterize the data for compression, not for classification of the data. The evaluator is right that my algorithms have good potential for data classification, but that would be a different project. This is a strength that gives good use for future sonar-related projects. This is why the following was stated in the Proposal:

There is also some value in accomplishment to help obtain future Navy and DoD contracts.

The high-quality trade secret method (discussed later) that will be used in this project also has use in general array sensor signal processing. These algorithms are licensed only for this compression project through Phase III, including the deployed system. This gives no rights to disclose them elsewhere or to use them elsewhere; other agreements and/or contracts are required for other uses. Standard SBIR data rights apply to all other parts of the contract.

C: The final compression step is the binary compression of the remaining signal data. The algorithms I have in mind are simple, so they do not require much processing power.

D: The reviewer's assumption that the processing power needed to do this processing is some huge, unreasonable value is invalid. I have seen invalid assumptions like this before, when a person cannot see the full picture and so concludes there must be some enormous requirement, with no basis in reality. The primary work will be on the base 5 data sets, as per the proposal:

Frequency analysis in the 5 data sets (closest to center, 4 outermost edges) will be used to find the primary data signals, incoming directions, and phase relationships.

This will be used to generate the base data waveform model that will be applied across all of the data sets. Some checks of the waveform for each sensor's data may be done and adjustments may be applied, but the data focus is narrow processing associated with a particular waveform, which is less costly than measuring the original unknown waveform set across the full frequency range for each sensor data set. This is not some pie-in-the-sky unlimited processing function.

E: Compressing real-world 16-bit data by 80%, to 3.2 bits per sample, while "maintaining high quality analysis integrity", is non-trivial, which makes any compression method challenging. The comment "The ocean acoustic noise is fairly random and the contacts of interest are very low in SNR, making this approach challenging." is just stating the obvious.

However, this also hints that this particular method is somehow un-doable or highly undesirable. I suspect that no proven compression method for this type of data currently exists. It is also possible that all the other, simpler methods being developed might not be able to meet the needs and may never meet them. Since this is a unique method that is substantially different from the other methods, but is a valid approach with a possibility of meeting the needs, although complex, and since the desire is to get the best technology and capabilities for the United States and the Navy, this uniqueness is a strength, not a weakness.

This is an innovation contract, not a basic, mindless effort.

Comment B

If, in Phase I, the vendor focuses solely on his own data simulation, he is likely to misjudge the importance of the ocean background noise, flow noise, etc.

A: Creating compression code without validating and characterizing the algorithms and testing their operation is a very poor way to do algorithm development. Compression algorithm validation is the primary function of the data simulator. It will also help provide for checking the other algorithms, such as data management and handling. The following is in the objective statement:

The Acoustics Sensor Array Data Simulator will also be used to feed precise data into various compression algorithms for algorithm development and testing.

Under the task section:

This program generates simulated acoustics sensor array data to feed the compression system. This will be used to generate some specific data which allows testing and evaluation of the mathematical operations in the compression algorithms. May also generate some simulated environmental sound data and detector readout noise data for general signal processing tests. This will allow some modeling, demonstrate operation, and allow compression performance analysis. However, this program will have limitations, since it is just a stimulation tool for the data generated, and will not be generating data of a fully modeled acoustics sensor array.

It is a weakness of the evaluator not to understand the requirement for such a test tool when designing and developing a substantial system. It is a strength of the proposal to recognize the need for known and controlled data content to do accurate testing and algorithm math evaluation.

Noise generation is needed to verify that some of the less obvious algorithms, such as those that measure DC offset, baseline noise, and noise characteristics, are working as expected. Evaluating these with unknown noise is possible; however, checks and balances provide a much better basis for operational verification. By the time this project has completed Phase II, the operational characteristics will be known, and if the general data processing does not produce the expected statistical or other relevant characteristics, this will indicate a problem that can be traced down and solved.

The work is not really focused on its own data simulation. It takes time to create the real-time compression software, and simple, known data works better than raw data of unknown content. It was unclear when I could get some real, representative sensor data, which would not be needed for a while; however, the following statement is in the Proposal:

Would prefer having real unclassified Acoustics Sensor Array data, ideally from the target sensor, however, can be from a similar sensor array, as soon as possible, for compressor noise analysis, algorithm development, and ambient noise analysis, modeling, and compression.

The statement indicates that I understand the importance of having real data; the idea that I place no importance on ocean background noise, etc., is entirely made up by the evaluator, since the quoted paragraph directly states otherwise.

Comment C

This proposal focuses on the data manipulation and methods for gathering the data from the array and transmitting it to the shore station more than on the actual compression. It merely says that “a multitude of compression and decompression algorithms” can be used and evolved.

The baseline compression method was presented, and as indicated in Comment A, this baseline compression was understood, with the exception that not all audio items are to be extracted before the residual non-system-noise data is compressed as the final step. In reality, this residual probably needs to be no more than 2 bits, with a maximum amplitude of 4 for positive-only data (an amplitude of 3 for signed data, which is why the unsigned representation would be used), out of the 16 × 0.2 = 3.2-bit data budget per data sample.
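As a quick check of the budget arithmetic above (the 2-bit residual figure is the author's estimate; the leftover split is simply what remains of the stated budget):

```c
#include <stdio.h>

int main(void)
{
    double sample_bits = 16.0;                   /* raw A/D word size         */
    double keep_fraction = 0.2;                  /* 80% compression keeps 20% */
    double budget = sample_bits * keep_fraction; /* 3.2 bits per sample       */
    double residual_bits = 2.0;                  /* estimated final bit block */
    printf("budget %.1f bits/sample, %.1f bits/sample left for the\n"
           "waveform-model symbols and block overhead\n",
           budget, budget - residual_bits);
    return 0;
}
```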

From another comment on the strengths, this was recognized as a processing framework. It is highly desirable that the framework support not just one compression method, but multiple compression methods depending on the data content.

I have run across algorithms that analyzed data, didn't take edge conditions into account, and trashed the results by producing a result non-representative of the data. In fixing these algorithms to make them better and produce reasonable measurement results, you can say that they were evolved.

Signal data, such as a sound series, has certain characteristics, and if the integrity of the data is to be maintained, certain constraints must be maintained. Most audio compression algorithms either discard data or modify the data in a way that makes it more compressible. The human hearing system is very adaptable: a small shift in time, or a change in the relationship of the signal between the left and right sides such as phase, is not going to make much difference to a human, but it would probably make a difference to a high-quality data measurement system like a sonar, which requires high integrity in content and phase relationships. Every once in a while you hear some of the more extreme cases when talking on a cell phone, but the algorithms have been slowly improved and these effects are becoming less evident.

Any off-the-shelf audio compression algorithm is suspect of not maintaining the high data integrity needed for this project. For any algorithm used, it must be understood how it will affect the data that has gone through the compression and decompression process. If this is not understood, some data might compress OK, but some data will probably not compress and decompress properly. Starting with complex data processing of unknown exact characteristics will be much more difficult to fully evaluate than starting from the side of designing algorithms with known characteristics for processing the data. The bottom line is that almost any available sound compression algorithm will probably not meet the data integrity needs, and it would be very costly (in time) to determine its exact characteristics well enough to decide whether it meets the requirements. However, it will be even more expensive to build and fully deploy a system with a complex algorithm and find out afterward that it trashes portions of the important data content, which would put you back at the beginning.

Changing the characteristics of the data could randomize an object's data, say from a submarine, and make it look like a cliff face, which has a much more random structure, and the change may even go unnoticed. It is very important that the data-changing effects of all algorithms used be understood, which is why no off-the-shelf algorithms were called out.

How does one go about creating compression algorithms for such data?

  1. Look at the data in a data block of interest and determine what compression methods could be applied to get the desired compression results for that type of data while maintaining integrity.
  2. Then try the methods on the data and evaluate the data to verify that the assumptions were correct.
  3. Determine the range of data that would fit the criteria and come up with a way to analyze the raw data to apply compression for that data range.
  4. Validate the range checking and validate application of the compression algorithm on a lot of real data and log and analyze the weaknesses and improve.

This is a process of algorithm development for such data. Only an expert would know of this method and know that it can be effective when effective algorithms are unknown, which is assumed for this project.

The compression framework was designed to support this method of algorithm development. It provides checks of each applied algorithm's effectiveness: it can determine when an algorithm is not effective by applying the compression algorithm, then applying the decompression algorithm, determining how well it works, and reporting a measurement of effectiveness. Poor effectiveness just means that the data-range criteria for applying the algorithm are incorrect, or that the algorithm was not correctly understood and needs changing or elimination.
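A minimal sketch of that per-block check is shown below; the algorithm interface (function pointers, return conventions) is an assumption made for illustration, not the framework's actual API:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A candidate block compression algorithm plugged into the framework. */
typedef struct {
    const char *name;
    size_t (*compress)(const int16_t *in, size_t n, uint8_t *out, size_t cap);
    size_t (*decompress)(const uint8_t *in, size_t nbytes, int16_t *out, size_t n);
} block_algorithm;

/* Compress one block, decompress it, verify the round trip, and report
 * the achieved bits per sample.  Returns a negative value if the block
 * did not fit or the rebuilt data does not match bit for bit, so the
 * algorithm can be changed or eliminated for this class of data. */
double evaluate_block(const block_algorithm *alg,
                      const int16_t *block, size_t n,
                      uint8_t *scratch, size_t cap, int16_t *rebuilt)
{
    size_t nbytes = alg->compress(block, n, scratch, cap);
    if (nbytes == 0)
        return -1.0;                                   /* did not fit       */
    size_t got = alg->decompress(scratch, nbytes, rebuilt, n);
    if (got != n || memcmp(block, rebuilt, n * sizeof(int16_t)) != 0)
        return -2.0;                                   /* integrity failure */
    return (8.0 * (double)nbytes) / (double)n;         /* bits per sample   */
}
```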

This process is repeated for each data block that does not compress well. For some types of data there may not be an effective compression method, so those blocks would not meet the needed compression range. However, it is the overall average that counts, and a few blocks that compress poorly or not at all are acceptable when other blocks compress well.

This framework can support other compression methods, such as sending only the data blocks that contain valid sonar data during real-time operation. All other data would be low priority and sent only if bandwidth is available. Many tradeoffs can be made with this type of framework.

This method is an evolutionary process that allows the compression performance to improve as effective methods are determined for the various types of data blocks that occur in the dataset. It is an iterative method that takes considerable time. How can one expect a fully effective, high-quality compression system to be completed in one tiny SBIR Phase I contract?

This should be on the Strength side because:

  1. A fundamental base compression algorithm, using a compression method that has the potential for the highest level of compression possible (sending a waveform model of the data content rather than the raw data), has been presented, with possible limitations for some data blocks of complex data combinations.
  2. A real-time processing framework, with a way of managing data, processing, and results, has been presented that can also support the development and operation of additional algorithms that can be added and improved (evolved) over time. If, at a later time (after Phase I and Phase II), very effective algorithms are developed, they can be added; even a 3rd-party algorithm could possibly be added. Without such a framework, a new solution would require a full development cycle of the full compression system to make changes.
  3. A framework that supports the development and analysis needed for expert development of algorithms is already in place, and recognizing this need in the first place demonstrates expert knowledge and abilities.
  4. The "proposed approach" implements "incremental progress toward topic solution" as defined in Criteria A.

Comment D

Their emphasis is on the engineering of the data storage and processing hardware architecture.

The hardware architecture is called out as COTS servers/systems with COTS data storage components, so there is very little hardware engineering occurring. This project is primarily a software project, and the standard COTS system hardware architecture is the hardware framework that it will execute on.

Data management, a software function, is very important since the satellite band cannot handle the full data throughput: the data must be sorted into important data and less important data, where the less important data can be saved on a data storage system and either sent, collected, or deleted later as determined by the needs of the user. The important data can also be saved until it is known that it was successfully received at the data center, and for cases of sensor system or power failure, data up to just before the failure can be recovered.

The other issue is recovery of the data to the original data content. What if something goes wrong? A small data corruption could snowball into a major corruption. What about the loss of satellite communications for 10 minutes or an hour? Just discarding the data means all information during that period is lost. With data management, real-time communications may be lost, but the data would still be recoverable if it is important.
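One way to picture this data management (a sketch only; the priorities, states, and fields below are illustrative assumptions, not the proposal's design) is per-block metadata that drives transmit ordering and local retention until the shore station acknowledges receipt:

```c
#include <stdint.h>
#include <time.h>

typedef enum { PRIO_SONAR_CONTACT, PRIO_ROUTINE, PRIO_LOW } block_priority;
typedef enum { BLK_STORED, BLK_SENT, BLK_ACKED } block_state;

/* Metadata kept for every compressed block written to local storage.
 * Higher-priority blocks are transmitted first; lower-priority blocks are
 * held and sent only when SATCOM bandwidth is available; nothing is
 * purged until it has been acknowledged by the shore station. */
typedef struct {
    uint64_t       block_id;         /* monotonically increasing          */
    time_t         captured_at;      /* capture time, for replay/recovery */
    block_priority priority;         /* drives transmit ordering          */
    block_state    state;            /* STORED -> SENT -> ACKED           */
    uint32_t       compressed_bytes; /* size on disk and over the link    */
    uint64_t       storage_offset;   /* location on the local data store  */
} managed_block;
```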

Processing Software Architecture

Maybe instead of "processing hardware architecture" the person meant to say "processing software architecture". Writing high-quality, high-performance, high-efficiency real-time code requires expert coding abilities. Writing and testing algorithms with no concern for real-time performance, as done on a general-purpose system, is moderately easier than writing code designed for high-efficiency real-time signal processing. However, the consequence of that is either much lower-efficiency code requiring considerably more processing resources to execute, or a total rewrite of the code to fit the real-time execution needs. Establishing a real-time framework up front and writing the processing code to operate efficiently with the characteristics of the processing environment and the managed data buffers is much less costly than a code rewrite at a later time.

Knowing how to write high-efficiency real-time code before you write any lines can be quite valuable for creating such a system. The proposal discussed my audio signal processing program:

An acoustic processing program is under development based on this algorithm. Due to expert knowledge of optimization, signal processing, and utilizing x86 CPU to do signal processing: On a laptop (8th gen i7, 4 core, 8 threads, 15 watt) the program can sustain a real-time processing rate of around 465 million 64 bit floating point operations per second on a single computer core. In addition to the math, the single core is also converting 16 bit data to 64 bit floats for each input data access, doing loop processing, getting and saving intermediate and result data, processing data buffers, sending data buffers to the sound system, getting the done buffers from the sound system, comparing minimum and maximum data results, clearing and scaling and plotting data up to 40 times/sec (the plotting also uses considerable CPU resources), and doing interactive Windows processing for the program, all while maintaining real time operation (the sound does not breakup), and the Windows program remains fully interactive. Altogether this is a very substantial amount of processing being executed on 1 CPU core

However what I didn't state was that I wrote the code once and did not make any modifications (other than basic debugging) or have to do any optimization of the original code to get that performance level. This means that my abilities are already at that level.

Also, I am using a very old compiler (1997) to generate the code (on both my development system and on the 8th-generation i7 mentioned above), and the code is being generated in debug mode, which does not apply all of the optimizations, so recent CPU advances are not being used. (I use this old compiler because its code editor is far superior to the editor in the newer Microsoft Visual Studio versions when coding at an expert level.)

[Image: MSVC program mostly used]

My development system is an 8-year-old first-generation i7 (works well):

[Images: Development System Information; CPU Information (CPU released in June 2009)]

I recently added Goertzel processing to directly compare FFT results with Goertzel results at the FFT frequencies. This runs 512 1024-point Goertzel algorithms on each of the 2 channels, for a total of 1024 simultaneous 64-bit Goertzel algorithms, while doing the full processing, including the plotting described above, on 1 single core, with the processor running in hyper-threaded mode (8 logical cores), and with only rare audio breakups (Windows is not a real-time operating environment).

Within a document comparing the FFT, Goertzel, and convolution analysis relationships:

http://www.lfdt.biz/info/fmeas/freq_meas.html

is the following Goertzel data plot for the FFT and Goertzel processing result comparison (results equal to 8 digits of accuracy):

[Plot: 1024-point right-channel Goertzel progression]

The Goertzel Algorithm is well defined and is also defined on the web page. You can calculate the processing power required for continuous execution of 1024 Goertzel algorithms in 64-bit math at a 44100 Hz sample rate. Of course, you need to be expert enough to understand what it means to achieve this continuous mathematics rate on 1024 independent results in a real code implementation; otherwise it is just numbers.
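For a rough sense of scale (a back-of-the-envelope estimate, not a figure from the web page), the Goertzel update s0 = x + coeff*s1 - s2 costs about one multiply and two adds per sample per bin once coeff is precomputed, so:

```c
#include <stdio.h>

int main(void)
{
    double bins = 1024.0;            /* simultaneous Goertzel algorithms   */
    double fs = 44100.0;             /* sample rate, Hz                    */
    double flops_per_update = 3.0;   /* ~1 multiply + 2 adds (approximate) */
    double mflops = bins * fs * flops_per_update / 1e6;
    printf("~%.0f Mflop/s sustained, before the per-block magnitude math\n",
           mflops);                  /* prints roughly 135 Mflop/s         */
    return 0;
}
```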

As stated on the web page, the measurement outcome for a signal at the center frequency of the Goertzel 1024-point block (and, since the Goertzel is equivalent to the FFT, this also means for the FFT) is higher than if two 512-point blocks were processed. This makes the frequency-bin response for non-center-frequency measurements, where most real frequencies fall, very non-linear (inaccurate results). (This is a big plus for my trade secret frequency measurement algorithm discussed in the proposal.)

The bottom line is that I can write very high-efficiency code with little need to optimize it, which is not a common programming ability. What this really means is that it will take much less time, which is directly related to project cost, to write the target code than it would take another group that does not have these abilities.

I also have the ability to know when something will work before I do it. It did not take me very long to come up with the processing framework, and I know, without having written it, that I can make this real-time software work on a set of IA COTS Linux servers, and, depending on the algorithms and considerations, it might be able to run on one server. Having a valid framework in place saves the expensive cost of rewriting the code to fit a viable real-time framework at a later time, which means lower overall project costs.

Emphasis on using a valid real-time framework is a strength.

Putting this comment in the weakness category may imply that there is some negative aspect, perhaps that it somehow affects cost. The software architecture design was included to show that I had my act together by having a viable system; there is almost no additional cost to the project to use the architecture, and no additional cost to make design changes for adaptation, so I fail to see how this is a weakness.

The Data management framework is highly desirable. Leaving it out would mean a non-robust system or the much higher cost of changing the framework to support the needs at a later time.

Data Storage

Maybe the emphasis refers to the calling out of the hard disk structure. Below is my new Windows development system with an 8th-generation, 6-core i7 (4 GHz, 9 MB cache), 32 GB of DDR4, and a 3D XPoint storage system. The new system's per-core CPU performance is well over twice that of my current i7-950. The system also boots very fast and can get data off disk very fast. The 3D XPoint is plugged in at the lower right-hand side in the picture below.

[Photo: PC with the 3D XPoint SSD plugged in at the lower right]

Shown in the picture below is the 3D XPoint storage device in its shipping box. It is moderately expensive at $360 for 280 GB (bigger sizes are available); however, rotational hard disks are easily damaged and cannot keep up. Flash disks can end up needing to clear flash data blocks and perform other overhead work, which can create issues in high-speed real-time operation since the flash disk is not accessible for periods of time. Flash disks also have a limited number of times they can be written before they go bad, so with continuous high-speed data a flash SSD will have a short life span. The 3D XPoint has about a 4 GByte/sec access rate, does not wear out like flash disks, can be continuously and randomly accessed, and can be used while the real-time compression system is running at full rate while, at the same time, old data is offloaded and cleared from the disk.

[Photo: 3D XPoint SSD]

Calling out this COTS disk is just including it in the COTS system architecture. It is accessible by standard methods and requires nothing special and no additional engineering costs, so it should not be considered a weakness.

There is some programming work to write code to manage data and use the storage system, but Task 2 states:

Some things like the SATCOM formatter and complex Data Storage and Management will be left to Phase II work.

It is a strength that I am knowledgeable about the real needs of a real-time system, so a correct design can be done up front, and knowledgeable about the weaknesses of common storage systems, so the right solution can be specified.

Comment E

They mention that a significant portion of their development work will involve building a sensor array simulator and then evaluating their compression approach using simulated data. It is likely that their simulated data will not be completely representative of actual data, and thus provide false characterization of their compression scheme, especially without having significant prior underwater acoustic array experience.

This is basically a repeat of Comment B, with some additional comments that belong under Criteria B. Repeating comments just inflates the size of the weakness comments, which creates a negative evaluation impression without adding content.

Again the simulator is for signal processing algorithm mathematics tests and making the system functional.

Contained in Task 4: " However, this program will have limitations, since it is just a stimulation tool for the data generated, and will not be generating data of a fully modeled acoustics sensor array."

When the data is in the form of digital acoustics data (another name for digital sound data), underwater physical acoustics properties really have low relevance! This compression program is not trying to determine any characteristics of the matter that exists below the sensor. However, some basic signal properties of the data across the sensors might be useful for high-data-integrity compression. Only correlated data can be used for cross-channel compression; uncorrelated data cannot be used for compression across the data channels, just as random data is uncompressible.

Comment F

A significant amount of work would be required to essentially develop all of the compression components from scratch; in fact, the proposal includes time dedicated to researching existing compression techniques

Code not written from the perspective of real-time operation, performance, and high-quality, efficient processing can be a tangled mess and more expensive to convert than just writing correct code in the first place.

The existing audio data compression algorithms do not address the need for the high-quality, high-integrity data that very high-quality data analysis requires. Data loss and data modification that humans cannot much notice before and after are part of the process. Some types of data can be changed, data can be delayed, and signal phase relationships can be altered with little difference to a human, but this would be detrimental to the high-accuracy data needed for measurement in sensor array data. In all probability, any audio data compression program (audio and acoustic data are interchangeable terms since they have exactly the same digital data characteristics) with good compression performance will not be a good candidate for this compression, unless it was specifically designed for high quality and high integrity. Most very high-quality data is kept uncompressed, since there is no real reason to compress it.

Trying to convert a low-integrity algorithm into a high-integrity algorithm has a high probability of being an extremely difficult to impossible task. The right way is to look only at algorithms that use methods which will result in high-integrity processing, and to work from a solid foundation as the methods are applied.

Internally, the signal processing framework has feedback that could do some correction for signal errors introduced by an algorithm. But depending on that built-in mechanism to correct an arbitrary, poor-integrity compression algorithm probably will not yield substantial overall compression, since the correction data may be substantial and cancel the compression gains of the algorithm.

In SBIR, the "I" stands for Innovation. It is not very innovative to copy algorithms from the internet or out of a book without full knowledge of their exact characteristics. Using algorithms without knowing their side effects is foolish if the target is high-quality, high-integrity data.

I have no problem writing code as complex as the real-time compression framework code that was described in the proposal. Most audio compression deals with a small number of channels compared to the target array system, so no such framework would be readily available. It requires expert abilities that I possess, and most programmers without real-time experience would probably do it poorly. Not having a framework that can robustly handle a large number of channels means that adapting something not designed for that type of operation would be backwards and expensive. This being a very short contract with a limited number of hours will limit the amount of effort that can be applied to the initial waveform analysis and the compression data measurement processing, but basic operation for compression of simple waveforms will probably be working.

SBIR Phase I is primarily a research phase; assuming that all research, including researching algorithms, was done before Phase I invalidates the definition of Phase I. In fact, I might spend several hundred dollars of my own money buying professional books on compression and sonar processing and thumb through them looking for things I don't know that might be useful. That will be a small effort with a good chance of not having great value, and doing it when there is no need (no contract) is just a waste of resources. The R&D work as stated in the proposal:

Task 5: Research & Development Compression Algorithms

Search for and study articles and books related to audio compression methods, acoustics sensor array data processing, and general data compression methods. Do signal processing and compression of real acoustics sensor array data (if sensor data is available), look for compression issues, determine what the problems are, ponder ways to solve the problems and create algorithms to improve processing.

This has two parts. The first part, book research, is the smaller portion. The second part, working with data blocks being processed by the compression framework, is the basic outline of the iterative method used to create and refine algorithms. Using the Task 7 display system, the data content can be studied, the waveform signal model algorithms can be improved, and new content-specific algorithms can start to be created. However, the Phase I base and option periods are very limited in time, and the primary time-consuming work will be done in Phase II.

The fact that I can develop and write all of these items from "scratch" is a strength.

The fact that I recognize that most available compression algorithms have a high probability of undesirable side effects is a strength. Recognizing that determining those side effects is itself very costly is a strength. Recognizing that I may not know all of the possible ways to do high-integrity compression, and including some document research as part of the effort, is a strength.

Note: Anyone who believes they know everything probably knows much less than they should, since they see no need to learn more, and probably has too low an IQ to realize they don't know everything; that is a weakness. I do not claim to know everything, which is a strength.

Criteria B

b. The qualifications of the proposed principal/key investigators, supporting staff, and consultants. Qualifications include not only the ability to perform the research and development but also the ability to commercialize the results.

Comment A

The PI has no experience with underwater acoustics and has made some incorrect assumptions.

Saying that I have made incorrect assumptions, without listing what those assumptions are, has no value. It may be that the assumptions about my assumptions are themselves invalid. For example, the statement "The vendor has proposed a framework in which each block of data is evaluated for frequency/waveform components iteratively, the content being removed at each step, until it has sufficiently low energy that all content has been recovered and all that remains is noise" is an invalid assumption: the iterative step only targets the stronger signals, and the small remaining signals above the system noise baseline are then compressed as a low-amplitude binary bit block.

Once the signal has been detected and digitized, the data is in the digital domain, and underwater acoustics and the sound propagation physics of water are of little relevance to this compression and decompression project. Their only use is in helping to relate strong, similar signals across the sensor set, where they can be correlated and the majority of the high-amplitude signal removed from the data blocks, leaving only lower-amplitude signals to be compressed a different way.

The compression code is only compressing data; it is not trying to use the data to determine the characteristics of the sources and reflections as sonar processing would do. The physics of sound propagation in air and water, the addition and cancellation of signals, water temperature layers that affect the signals, the speed of the ship and of the currents that produce Doppler effects, and the complex mathematics used to sort all of that out and create a sonar picture are largely irrelevant to the need to compress, manage, and decompress data. There may be some value in gross characteristics to help correlate data across channels, but the actual sample data plays a much bigger role.
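To make the cross-channel point concrete, here is a small sketch (my own illustration, with made-up signals, not the proposal's algorithm): one channel is kept as a reference, the delay and gain of the shared signal on a second channel are estimated by cross-correlation, the scaled and shifted reference is subtracted, and only a much smaller residual is left to code at full rate.

    # Sketch: exploit cross-channel redundancy by removing the shared signal.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 8192
    src = rng.normal(0.0, 3.0, n)                 # stand-in for a shared broadband source
    ch0 = src + rng.normal(0.0, 0.1, n)           # reference channel
    true_delay, true_gain = 37, 0.8
    ch1 = true_gain * np.roll(src, true_delay) + rng.normal(0.0, 0.1, n)

    # estimate the integer delay by brute-force cross-correlation
    max_lag = 64
    lags = list(range(-max_lag, max_lag + 1))
    scores = [np.dot(ch1[max_lag:-max_lag], np.roll(ch0, k)[max_lag:-max_lag]) for k in lags]
    best_lag = lags[int(np.argmax(scores))]

    aligned = np.roll(ch0, best_lag)
    gain = np.dot(ch1, aligned) / np.dot(aligned, aligned)    # least-squares gain
    residual = ch1 - gain * aligned                           # what still needs full coding

    print("estimated delay/gain:", best_lag, round(float(gain), 3))
    print("rms before:", round(float(np.std(ch1)), 3), "rms residual:", round(float(np.std(residual)), 3))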

The underwater acoustics deity has no importance in the high-integrity compression, transmission, and decompression of the sensor data. From my perspective, it is the signal processing and algorithm development deities that have value for this project. I call it a deity because that is a good description of the overrated importance assigned to it by the biased and convergent thinking being applied. The expertise needed for this type of work is really signal processing and algorithm development; overestimating the importance of underwater acoustics shows the weakness of this comment and its incorrect assumptions.

Comment B

This is a one-employee company with very limited acoustic data experience.

"limited acoustic data experience"

This implies that putting the term "acoustic" in front of the data somehow gives it a special, unknown property, or some kind of magical property that must be blessed by an all-knowing deity before the data can be understood. I have run across this same "acoustic" places-us-above-other-people nonsense before, which is why the proposal contains:

Note: Acoustic Signal Processing has the same exact meaning as Audio Signal Processing. Different audio processing algorithms are used for different types of end results, but that does not separate the meanings of acoustic and audio. Digital Acoustic Signal Processing is a sub-category of Digital Signal Processing, where Signal Processing is much more encompassing and represents the mathematics and methods of processing real-world data, which usually includes noise. (This is provided since the evaluators of a different SBIR didn't understand these relationships.)

I think some people need to look into the mirror to discover where this nonsense comes from.

Evaluating data content as needed for sonar mapping requires substantially different knowledge than this project needs. This is like telling an expert mechanic that they cannot possibly work on a car because they have not driven it. There may be some very minor relationship, but the argument is just plain invalid nonsense.

Limited Experience Angle

Under the "Related Work" section there are over three pages related to experience.

The majority is discussion of signal processing, noise analysis, data measurement, etc. There are also discussions of audio data processing, audio data frequency analysis, and related programs. There is only so much information you can put in a limited-size proposal.

Again, adding the magic word "acoustic" somehow nullifies the real needs of such a project, such as:

  • Signal Processing: knowledge of how data is processed, of processing effects, and of which data processing methods will work and which will not work or will create negative side effects
  • Algorithm Development: the skill to create a compression algorithm that does not yet exist
  • Data compression
  • High efficiency programming
  • High-performance, high-throughput programming on the target COTS system
  • Real-time programming
  • High Accuracy Data Frequency Analysis
  • Noise analysis
  • Data measurements
  • Audio processing programs (for digital data, acoustic is just another term for audio)
  • Previous deployment of very advanced military equipment that used advanced skills

As you can see, a great many important skills relevant to such a project were shown. As shown earlier, the acoustics reference is very weak compared to the combined abilities that can be used to complete a successful project. What is limited is the evaluators' knowledge of which skills are really important for such a project.

Due to the limited space, many valuable skills were left out. For example, I know how to work with virtual huge pages using 2 MB and 1 GB page sizes. With 4 KB memory pages, the CPU is constantly juggling virtual page descriptors during normal execution. Besides juggling the descriptors, many valuable CPU cache lines are occupied with page table data to support that constant juggling. If the descriptor data is not in the cache, the CPU generally stalls while it waits to locate and load the descriptor information into the cache so that the page descriptor can be loaded into the CPU core. When huge pages are not available, the system still builds the memory area with 4 KB pages, which makes it easy to test the performance of the exact same code with and without huge pages. In a packet processing program, the processing performance went from 9 MPPS (million packets per second) to 15 MPPS, a factor of 15/9 ≈ 1.67, roughly a 67% performance gain. That program had a somewhat limited number of memory blocks (sub-allocated from the huge page) being accessed; a program that accesses a much larger number of memory blocks spread out more randomly, such as this compression system, will be improved by an even higher percentage by using 1 GB pages for the data memory. This also means that a CPU can be idle for longer periods (saving energy) or that fewer CPU cores are required to get the job done (a smaller compression system), while getting the work done faster. Using 1 GB pages takes some minor work and would only be looked at in Phase II. This is just one of many important professional skills not listed.

One-Employee Company

Most companies start as a one-person company. It is only through success that companies grow and hire other employees. The SBIR is an entry mechanism, and there is nothing that indicates a small company is somehow not allowed to participate in an SBIR.

The only question is whether the company has the skills necessary for such a project. As listed in the previous section, the single employee has a multitude of highly relevant skills and all of the skills necessary to do the work.

For a Phase II contract, the company will add employees to support a bigger contract. However, supporting the PI's employment must be the primary goal, since without that support there would be no reason for the PI to work on the project.

Comment C

The PI has an emphasis on computing and processing capabilities, and there is no one else at the company to assist with the acoustics aspect of the project.

The acoustics deity presents itself again. I have not personally written equations or code to analyze sensor array data for sonar results. That is highly specialized work, and there is a very high probability that none of the evaluators have either; otherwise, how could they miss the fact that underwater acoustics is not really a true requirement for the project? I therefore do not need any assistance with the "acoustics aspects" of the project, which makes the comment invalid. Some people can only see their own abilities, which do not meet the needs, and then project them onto other people, assuming that others do not have the abilities that they themselves do not possess.

The proposal is a small document and does not contain all of the information about every aspect of a person. I invest a moderate amount of time each year learning new things, and that knowledge adds up over many years. There are a great many skills, experiences, and pieces of knowledge that are present but, being part of the general process, are not listed. For example, software debugging is not directly listed, but given the complex projects that are listed, general engineering development skills are implied.

Throughout this debrief, the largely irrelevant label "acoustics" is being used for purposeful exclusion, since, as explained earlier, the digital data is in the realm of signal data and does not have much to do with physical acoustics. There is no honor in using a label for exclusion, just as not being a Marine can be used as a label for exclusion. Using the term "acoustics" in a comment to somehow give the comment extra credibility is poor practice given the facts of the real needs.

I have all of the abilities and skills I need to do the data compression project; otherwise I would not have wasted my time writing a proposal.

Comment D

The PI does not seem to have experience specifically with compression techniques either.

From the proposal: "The R&D work included an algorithm using correlation methods to do a 6 to 1 data compression while maintaining high data content integrity." In addition: "included sensor signal processing algorithm development, sensor simulations, noise analysis, data measurement," which are the skills needed for the project. The base skill of signal processing was left out of that particular list, but it appears elsewhere.

The Trivial Compression Paper

The compression algorithm in the referenced document "Low complexity lossless compression of underwater sound recordings" is a trivial algorithm. It obviously worked because it was designed around low-frequency, slowly changing signal data. It is obvious to me that for a quickly changing (higher-frequency) signal, the algorithm would fail to produce good compression.
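This is easy to demonstrate with a sketch (mine, using a simple first-difference predictor as a stand-in for that class of low-complexity coder): the prediction residual, and therefore the achievable compression, degrades steadily as the signal frequency rises toward Nyquist.

    # Sketch: first-difference residual size versus signal frequency.
    import numpy as np

    def residual_bits(freq_fraction, n=65536, amp=2**14):
        t = np.arange(n)
        x = np.round(amp * np.sin(2 * np.pi * (0.5 * freq_fraction) * t)).astype(np.int64)
        r = np.diff(x)                        # first-difference prediction residual
        return np.ceil(np.log2(np.ptp(r) + 1))

    for f in (0.01, 0.1, 0.5, 0.9):           # signal frequency as a fraction of Nyquist
        print(f"{f:4.2f} x Nyquist -> ~{residual_bits(f):.0f} residual bits/sample (16 raw)")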

Arbitrary Competing Example

Data containing only low-frequency content can be compressed in another way. Say the data's frequency range is less than 25% of the Nyquist frequency. In that case, repeatedly saving the first sample and discarding the next three samples, called downsampling, reduces the data by 75%, i.e., to 25% of its original size. Say the maximum signal amplitude of the block is +/- 1020 and the readout noise occupies 4 bits. The 10-bit amplitude plus 1 sign bit, minus the 4 bits of chopped-off readout noise, requires a 7-bit word, which can be binary packed at 7 bits for another factor of 7/16 = 0.4375, for a total of 0.25 * 0.4375 = 0.109375, or about 11% of the original size. This is better than the 20% target and so passes. The data also retains its integrity, since signal phase and amplitude as well as the background environmental noise are maintained.
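The arithmetic of that example, worked as a quick sketch (the amplitude and noise figures are the assumed values above, not measured data):

    # Sketch: downsample by 4 and pack the kept samples into 7-bit words.
    downsample_keep = 1 / 4                  # keep 1 of every 4 samples
    sign_bit = 1
    amplitude_bits = 10                      # covers +/- 1020
    readout_noise_bits = 4                   # bottom bits carry only readout noise
    packed_bits = sign_bit + amplitude_bits - readout_noise_bits   # = 7
    ratio = downsample_keep * packed_bits / 16
    print(f"{packed_bits}-bit words, overall size = {ratio:.4f} (~{100 * ratio:.1f}% of original)")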

Frequency analysis of the data can determine the data's frequency range. The standard FFT and Goertzel algorithms give questionable results in some cases, while the trade-secret frequency analysis can set the frequency width, effectively cover wide frequency ranges with much less processing power, and avoid the problem of questionable results. Whether this method is desirable for the compression project depends on the project and on the data center's ability to handle different portions of the data set at different sample rates during post-processing.
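For reference, the standard Goertzel algorithm mentioned above looks like the following sketch (the trade-secret analysis is not shown here); it measures the power near one chosen frequency without computing a full FFT, which is one cheap way to check whether a block really is band-limited before downsampling it.

    # Sketch: standard Goertzel single-bin power measurement.
    import numpy as np

    def goertzel_power(x, freq, fs):
        # power of x near `freq` (Hz) for sample rate `fs` (Hz)
        n = len(x)
        k = int(round(freq * n / fs))         # nearest analysis bin
        w = 2.0 * np.pi * k / n
        coeff = 2.0 * np.cos(w)
        s_prev = s_prev2 = 0.0
        for sample in x:
            s = sample + coeff * s_prev - s_prev2
            s_prev2, s_prev = s_prev, s
        return s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2

    fs = 8000.0
    t = np.arange(1024) / fs
    x = np.sin(2 * np.pi * 437.5 * t)         # test tone placed on an exact bin
    print(goertzel_power(x, 437.5, fs), goertzel_power(x, 1700.0, fs))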

Problems with the Trivial Compression Example

From the "Low complexity lossless compression of underwater sound recordings" document:

Thus from the viewpoint of compression, underwater sound generally comprises two components: (i) occasional transient signals that represent a small fraction of the average power and (ii) a slowly varying noise floor with a low-pass characteristic up to about 10 kHz and a flat spectrum above this.

In quiet areas, the ambient noise floor at 30 kHz is so low that it is challenging to build sound recorders capable of recording it, at least with low operating power. The system noise of the recorder may then dominate the background noise level above about 10 kHz.

Even though animal calls or discrete boat traffic are frequent in some areas, the duty-cycle of these noise sources, i.e., the proportion of time that they are detectable above the ambient noise, is generally low.

According to the datasheet (Cirrus Logic Inc., 2006), the ADCs have a minimum dynamic range of 93 dB implying an effective number of bits of <16 (Orfanidis, 2010, also see Sec. VI) and so only the most significant 16 bits of the 24-bit samples were stored.

Putting these together: the data signal strength is mostly very low, often barely above the readout noise (low-signal data is easily compressed); the data is low frequency (also easily compressed); and there is very little content that does not compress well. And instead of analyzing the signal to determine the level of the readout noise, the authors simply discard the lower 8 bits of the A/D data.

The document also contains a table of compression results for several data sets, where the compression performance varied across the data sets. It also includes tests with the common FLAC compressor, which show a compression ratio range of 3.2 to 7.2, i.e., a resulting size of 31% down to 13.9%.

In the second document, "MPEG-4 ALS – The Standard for Lossless Audio Coding", where compression of standard music recordings was performed on a moderate amount of data, 16-bit FLAC only compressed to 48.6% on average, well above the needed 20% and far from the 13.8% presented for the trivial algorithm.

How do you account for this huge difference? It is the data content. If the system's sonar were active, the signal content would increase and be present most of the time. A comment was made that the sonar signal level is low: "contacts of interest are very low in SNR". However, what if the water depth were 100 feet? Would that remain true, or would the sonar signals be well above the background? A robust system that gracefully handles all possible conditions is the desired target.

The high compression values in Table V were due to the data content. Changing to different data content, such as the high-content data used in the ALS testing, will cause the compression performance to drop dramatically across all compression types listed, including the X3 program that is the paper's main focus.

Just because some data content compresses well under certain circumstances does not mean that all data content will compress well. Just because someone compressed a couple of blocks of data to better than 20% does not mean that all data can be compressed to better than 20%. Just because a low-power, low-quality A/D system was used and works acceptably under some conditions does not mean higher-quality military-grade systems should be handled the same way. Simply feeding data to a trivial compression algorithm and hoping it will always compress well is not a professional way to do this. The data needs to be pre-analyzed, for example by a processing block such as "Data Waveform Analysis with Past and Future Data" ("future" just means that more data is looked at and delayed before grouping the data to be compressed), to determine what content is present and which compression algorithms can be most effectively applied.

The G.728 Compression

I previously wrote a G.728 audio encoder and decoder set to spec, based on a scale factor and a series of symbols for each short audio block, where each symbol represents a very short portion of a waveform. Only a limited number of predefined symbols are available, but the audio compression and quality for voice were acceptable, and the compression brought voice data down to 2 bits per sample. However, the symbol set was optimized for speech and would introduce noise that might not be acceptable for data measurement. A system could be designed with a number of symbol tables, each optimized for a particular content type, and this method could be used to compress some of the sonar data blocks.

The inexact signal-shape symbols and the block-to-block amplitude mismatches add processing noise to the decompressed signals. However, some of its characteristics, such as not changing the frequency content or the phase of the signals, make this particular algorithm one that might be usable for high-integrity data. It might be useful for compressing the final low-amplitude data block, since the processing errors for low-amplitude signals will be much smaller than for full-scale signals. Only careful noise analysis at the signal level, checked against real data, would determine whether this is acceptable.
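A much-simplified gain/shape sketch of that symbol-table idea (my illustration with a random codebook, not the actual G.728 standard): each short block is replaced by one gain and the index of the best-matching shape from a small predefined table, and the decoder rebuilds an approximation from those two numbers. The reconstruction error it prints is exactly the kind of processing noise discussed above.

    # Sketch: gain/shape coding of short blocks against a fixed symbol table.
    import numpy as np

    rng = np.random.default_rng(2)
    block_len = 8
    codebook = rng.normal(0.0, 1.0, (64, block_len))                 # 64 predefined shapes
    codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)      # unit-energy shapes

    def encode_block(x):
        gains = codebook @ x                      # best gain per shape (shapes are unit norm)
        errs = np.sum((gains[:, None] * codebook - x) ** 2, axis=1)
        i = int(np.argmin(errs))
        return i, float(gains[i])                 # one index + one gain per block

    def decode_block(i, gain):
        return gain * codebook[i]

    x = rng.normal(0.0, 100.0, block_len)
    i, g = encode_block(x)
    print("index", i, "gain", round(g, 1),
          "rms reconstruction error", round(float(np.std(x - decode_block(i, g))), 1))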

Overall Audio Digital Compression

The desire for lossless audio data compression has been around for many, many years; lossless audio compression is not a trivial problem. The supplied MPEG-4 ALS reference document only compressed 16-bit data to 44.6%, which is still more than twice the desired target of 20%. The 24-bit, 192 kHz compression was better at 37.5%. However, that difference probably comes from the fact that audio does not need frequencies beyond human hearing of around 20 kHz, and given more bits, the recorder probably left more binary headroom at the top of the signal, giving a higher percentage of zero-value bits in the data set. That makes the compression difference basically just a data content difference, just as the trivial compression algorithm's range of 3.1 to 7.2 (about 32% down to 13.9%) was just different compression for different data content, not different compression algorithms, and all of that data had very low content.

The sensor data input is expected to have its gain set for a reasonable signal amplitude, which carries the most potential data content and gives the best output signal. A large percentage of the data will also contain active sonar returns, which are not present in the data set discussed in the trivial paper, recorded mostly in an environment where "Even though animal calls or discrete boat traffic are frequent in some areas, the duty-cycle of these noise sources, i.e., the proportion of time that they are detectable above the ambient noise, is generally low." That means very low data content, which is easily compressed. If the target signal really were that quiet an environment, a simple algorithm of finding the noise floor, chopping from the bottom, taking the next 3 bits as the signal (since all signals are at low amplitude levels), and sending that data is all that would be needed to get to 3/16 = 18.75% of the original size. I suspect that in reality it is not nearly that easy.
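The back-of-the-envelope version of that scheme, sketched with assumed numbers (the noise level and the 3-bit headroom are illustrative, not measured):

    # Sketch: estimate the noise floor of a quiet block, drop the bits below it,
    # and keep a few bits above it; roughly 3/16 = 18.75% of the original size.
    import numpy as np

    rng = np.random.default_rng(3)
    samples = np.round(rng.normal(0.0, 6.0, 4096)).astype(np.int16)   # quiet block

    noise_rms = float(np.std(samples))
    floor_bits = max(0, int(np.floor(np.log2(max(noise_rms, 1.0)))))  # noise-dominated low bits
    keep_bits = 3                                                     # headroom kept above the floor
    reduced = samples >> floor_bits                                   # discard the noise bits

    ratio = keep_bits / 16
    print(f"noise floor ~2^{floor_bits}, reduced range {int(reduced.min())}..{int(reduced.max())}, "
          f"keeping {keep_bits} bits -> {100 * ratio:.2f}% of original")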

Any claim of high compression without a very thorough evaluation of the data is suspect. This means that anybody who believes they can apply a single specific algorithm to all of the data, or use off-the-shelf programs or algorithms to get a high compression rate, is probably going to fail to deliver, which could include all of the proposals selected at this time.

My approach is to analyze the data content, remove the high-amplitude signals by measuring them, and use various compression algorithms to get the best compression for the particular data content. The system should be able to tell which data areas do and do not contain sonar data, tell which blocks have more important data, allow more compression algorithms to be added, and evaluate compression performance on the fly. True lossless compression is tough, and I cannot guarantee that the 5:1 (20%) target will always be met, but the system will be able to make controlled compromises to get the best tradeoffs.
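A minimal sketch of that per-block idea (mine, with stand-in coders, not the proposal's framework): analyze each block, try more than one reversible coding, keep whichever is smallest, and record the achieved ratio so the system always knows how well it is doing instead of hoping one fixed algorithm fits all content.

    # Sketch: per-block selection among candidate reversible codings.
    import zlib
    import numpy as np

    def encode_raw(block):
        return block.astype(np.int16).tobytes()

    def encode_diff(block):
        # reversible: first value kept as-is, then successive differences
        return np.diff(block, prepend=0.0).astype(np.int16).tobytes()

    def compress_block(block):
        candidates = {
            "raw+zlib":  zlib.compress(encode_raw(block), 6),
            "diff+zlib": zlib.compress(encode_diff(block), 6),
        }
        name, payload = min(candidates.items(), key=lambda kv: len(kv[1]))
        return name, len(payload) / (2 * len(block))      # vs 2 raw bytes per sample

    rng = np.random.default_rng(4)
    t = np.arange(16384)
    tonal = np.round(5000 * np.sin(2 * np.pi * 0.001 * t) + rng.normal(0, 4, t.size))
    noisy = np.round(rng.normal(0, 3000, t.size))
    for label, block in (("tonal", tonal), ("noisy", noisy)):
        name, ratio = compress_block(block)
        print(f"{label}: best = {name}, {100 * ratio:.1f}% of original")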

Comment E

There would be a lot of effort spent learning the relevant background information.

The real needs of this project are understanding signal processing, doing algorithm development, writing code with all that it entails, and understanding the detailed characteristics of a digital data series such as sound data.

When analyzing sensor array data to determine source characteristics, there are many complex models, including sound propagation, waveform addition and cancellation, Doppler effects, beamforming, thermoclines, propagation losses, noise levels, reverberation levels, etc., that must be taken into account at the data processing center. That may be relevant for the data center's processing, but for this project, sonar-related characteristics are only useful for finding signal data that can be correlated together for compression.

I know about chirps, constant frequencies, and stepped frequencies; they can all be detected by correlation, in which the original transmitted signal is convolved with the input data samples to produce a correlation sum, where a bigger result means a better match. Even a signal with a radically modulated frequency pattern can serve as the source sonar signal. However, some sonar signal designs will be more effective than others, and some signals will be affected more by Doppler effects than others.
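As a sketch of that correlation detection (my illustration with a made-up linear chirp, not an actual Navy waveform), the known transmitted signal is correlated against the received samples, and the correlation peak marks the echo even when it sits well below the background noise:

    # Sketch: matched-filter (correlation) detection of a known chirp.
    import numpy as np

    fs = 4000.0
    t = np.arange(0, 0.25, 1 / fs)
    chirp = np.sin(2 * np.pi * (200.0 * t + 0.5 * 2000.0 * t**2))   # 200 Hz -> 700 Hz sweep

    rng = np.random.default_rng(5)
    received = rng.normal(0.0, 1.0, 8000)                  # background noise
    echo_at = 3000
    received[echo_at:echo_at + chirp.size] += 0.3 * chirp  # weak echo, well below the noise

    score = np.correlate(received, chirp, mode="valid")    # full sliding correlation
    print("detected echo near sample", int(np.argmax(np.abs(score))), "(true:", echo_at, ")")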

The primary reason for analyzing the data for waveforms is to find and subtract out the high-amplitude environmental sounds, which can have a frequency-varying signature, but probably a much simpler one than would be used in an active sonar waveform. For frequency-varying signals, the following was in the proposal:

"even dynamically adjust the center frequency to track with the input data frequency"

"However, this dynamic frequency adjustment is not without additional processing costs.

A moderate-strength complex sonar signature could be tracked across frequency and analyzed. Convolution of the normalized sonar signal against the data is simple, but the cost of doing a full convolution, shifting by one data sample, and repeating, i.e., a complete convolution for every data sample across all data channels, is computationally expensive, even though such an algorithm would achieve a very high ratio of math instructions to total CPU instructions and a high overall instruction rate per CPU core.

From a different perspective, for moderate and stronger signals, since my algorithm is continuous, the sonar signal would produce a ripple signature across the frequency-plane data. For example, a chirp sweeping at a constant rate would produce an angled straight line of ripples across the frequency plane over time, which could be easy to detect and is much less costly than full convolution processing. However, for very weak signals, the full convolution method would probably give the best detection.

But this comes back to the compression method: find the larger signals, and compress the very small residual signal data as the last compression step, i.e., without trying to find every small signal.

When it comes to professional signal processing, I am very knowledgeable and very capable, although, as with most things, there is always something new to learn.

Sensor Readout

Not all information about all aspects of the system is made available. For example, the sensor readout, an important aspect, is assumed to be of high engineering quality. Each sensor is somewhat independent of the others. It makes sense to do the A/D conversion at the sensor, since the best signal-to-noise conversion can be done there. However, the A/D conversion and readout must be synchronized. It is conceivable that a less professional design exists in which there is no synchronization, each sensor has its own clock running at slightly different rates, some kind of asynchronous communication is used to move the data, the readout software is not available, and a messy data input system would need to be written.

Instead, I have assumed that the synchronization and sensor readout were professionally done and tested, and that the data input is readily available; that the sensor system's control and management are in place; and that using the sensor data is simple and straightforward.

Satellite Communications

The satellite communications architecture assumed is a generic model. Specific information needs to be learned about the system actually in place.

Comment F

The required clearance for working with classified data are not currently in place.

SBIR is an entry-level contract program. Clearances cannot be obtained without a need for a clearance, and they must be requested by the sponsoring DoD employee.

It is the basic chicken-and-egg problem: you cannot get a security clearance without a project that needs one, and you cannot get the project without a security clearance. Ethically, it is an invalid requirement.

Criteria C

c. The potential for commercial (Government or private sector) application and the benefits expected to accrue from this commercialization.

Comment A

There is little information in the commercialization statement and no detailed plan. Lightning does not perceive any market for the developed technology outside of the applications specifically involving acoustic array data compression and therefore does not emphasize marketing to non-DoD businesses. They do point to the oil industry as a possible market, however.

The private sector does not need a military-grade, high-integrity audio data compression system, since lower-quality systems that are acceptable to human perception are already in place, and if high-grade data is desired, the cost of data storage is very low, making compression unnecessary. This makes this project very specialized, and spending money to market a system that has no market is a waste of money.

The few cases that might use it, such as the oil industry, may not really need to process the data in real time; they will just save the data and process it later, which also makes them ineffective to market to. The few who really want satellite transmission would simply ask the people already doing it, i.e., the Navy, and would find out who can provide it. Once it is done, the makers of sonar systems would become ambitious when they realize it can be done and would create their own version or contact the developer, again probably through the Navy. Active marketing is just a waste of time, since passive marketing, due to the product's existence and the additional needs of the Navy, will be the primary return.

Pretending there is a market that does not exist, and creating marketing plans for that non-existent market, is just fantasy and a waste of effort. The current market is that the Navy is willing to spend some money on the development of a compression/decompression system; there is no good reason to create this product otherwise, since no reasonable market to justify development currently exists. If, during the process, an item such as a general compression algorithm is created that can readily be used in the commercial market, then marketing can begin. However, a very considerable amount of work has been done on general compression in the past, so that is very improbable. Even creating a new general compression algorithm does not mean it will be readily accepted or profitable to market.

Perhaps doing this project would make future evaluators believe that I am blessed by the "acoustics" deity and make it easier to get future "acoustics"-related projects, regardless of the relevance. A successful SBIR project might help in getting future DoD projects, which has some value.

When the government attaches a classified designation to the work, it excludes or greatly reduces private-sector salability. Spending money or effort (which also equates to money) making plans while salability is in question is bad judgment. Until it is determined how the "classified" project requirements directly affect the marketability of the project in the private sector, making private-sector marketing plans is not practical.

It is a strength to know when not to market something. The only time this is a weakness is when an obvious, very lucrative market can be readily described. Pretending that a possible private-sector market exists, to add to the rhetoric disqualifying this SBIR proposal from approval, is a weakness of the evaluator.

There is one important aspect of the project that will have indirect commercial benefits: the practice of making high-precision waveform measurements. The other Navy SBIR I submitted was for the analysis of sound to monitor the surrounding environment. It used my trade-secret algorithms to measure all of the input sounds and convert them into a set of frequency, frequency-change, and power-versus-time waveforms that could be converted into a general sound model. The difference is that compression requires detailed, precise measurement of the exact characteristics of the waveform for re-creation, while the other project needs only general-quality frequency measurements, since it looks for changes across the waveform, which are smoothed somewhat to produce an easily definable pattern for classification. The acoustics deity showed its colors over true knowledge and abilities, and the Navy missed the boat on that one. Doing high-precision waveform measurement processing will exercise the skills of using my algorithms (standard frequency measurement methods have problems; see my web site www.lfdt.biz), which will help in determining additional refined aspects of using those algorithms in future work. However, it is only because the sonar data will be used for measurement purposes that the data must maintain very high integrity for this compression work; such high integrity is unnecessary in most other work, so it cannot be directly applied to other commercial products at this time. Lossless music data compression is one possibility, but its high computational cost and system complexity are outweighed by the very low cost of simply saving the original data. With no apparent other use, this makes this a very specialized product.

The simple commercialization statement provided in the proposal is as detailed as it needs to be. As shown in the Strengths section, "The vendor expects to market to the Navy and to the oil and gas industry", which shows that the message came through clearly, so the statement is not incomplete.

Capabilities Copied From Larger DARPA Proposal

Note: This section is left out for now and may be added later.

Conclusion

I have not covered everything I could put in this document; it is already bigger than the proposal.

Proposals are very limited in length, and not every aspect of everything can reasonably be covered in such a short document.

Rather than focusing on the strengths of the technology, the ability to arrive at a viable solution framework, and the methodology for arriving at a solution, in order to judge the PI's abilities and the viability of the project, the evaluation focused on aspects that were absent because everything cannot be covered in such a small document, and on irrelevant things, because the evaluator(s) did not fully understand what abilities and effort such a project requires. Was it because the information was beyond the evaluators' knowledge, because of the evaluators' disinterest in doing a careful evaluation, or because of a policy of spending minimal resources on an evaluation? After writing several proposals, there appear to be major shortcomings in how valuable innovations are identified.

In reality, it is expensive to write proposals, and no matter what my technical abilities are, I have not succeeded in getting a government contract after two years of writing proposals. I have to consider no longer investing in writing more government proposals.

Mike Polehn

Lightning Fast Data Technology Inc.

Post Protest Comments

The short one-week protest time frame makes writing a good protest hard. I looked for government instructions for a protest but was unable to find any. I did find some minor comments about protesting on the internet, but you cannot be sure whether those are accurate.

This protest is strictly based on the comments provided in the evaluation debrief. The goal was to invalidate those comments so that a new, more thoughtful evaluation (with fewer invalid comments) could be done for comparison for award.

There was brief internet text about arguing or protesting against companies that have received an award. For example (a made-up example), Boeing and Wal-Mart could each put in a $30,000,000,000 proposal to provide military fighter jets. If Wal-Mart won, Boeing could protest that Wal-Mart does not have the specialized company resources and experience to effectively create and provide military fighter jets in the short time frame needed. That makes perfect sense for that type of case.

However, SBIR proposals are very small; a protester does not have access to the other proposals, would not know much about the other companies, and putting money or time into researching those businesses to find weaknesses does not make financial sense for such small contracts. Protesting other companies' receipt of small SBIR awards is simply not practical, and so a waste of time.

After I spent the time on the invited protest, the government did not respond, not even with an acknowledgement of receiving or reviewing the protest document.