Design of a Dual Core Processing System for Ultrasound Digital Signal Based on FPGA and STM32

Chao Peng¹, Chang Liu¹, Yue Song¹, Sha Zou²

¹School of Electrical Engineering and Intelligentization, Dongguan University of Technology, Dongguan, 523808, China
²HengNan County Seventh Middle School, Hengyang, 421000, China

*E-mail: pengchao@dgut.edu.cn

Keywords: Non-metallic ultrasonic flaw detection; Dual-core digital processing system; STM32; FPGA core design

Abstract: This paper presents a dual-core digital system for receiving and processing ultrasonic digital signals in ultrasonic non-metal flaw detection. The paper first analyzes traditional ultrasonic digital signal receiving systems and compares their advantages and disadvantages. This analysis shows that the FPGA and the MCU are strongly complementary. Because an ordinary MCU lacks sufficient computing power, an FPGA+STM32 dual-core architecture is adopted for the whole digital processing system. A FIFO memory unit, a frequency divider unit, a serial transmission unit and a serial control unit are designed in the FPGA module. The system as a whole implements the control, communication and data processing functions of the key modules of the ultrasonic non-metal flaw detection system. Finally, the correctness and rationality of the design are verified by building a circuit model of the digital signal processing system and a physical circuit. The dual-core system avoids excessive hardware connections and makes full use of the stability and efficiency of the FPGA. It also lays a foundation for future expansion of the system and has good engineering application value in ultrasonic flaw detection.

1. Introduction

Ultrasound is a sound wave with a frequency above 20,000 Hz. It offers good directivity, high power, strong penetration and easily concentrated sound energy, and it does not damage the inspected material or the human body. These advantages have led to the wide use of ultrasound in medical, military, agricultural and almost all industrial flaw detection fields[1-4]. Ultrasonic detection has also been studied and applied in depth in thickness measurement, ranging and medical ultrasound imaging[5-7]. With the development of modern signal processing technology, the application field of ultrasonic flaw detection keeps expanding. At the same time, the processing speed and accuracy expected of ultrasonic non-destructive testing systems keep increasing[8-10]. This places new requirements on the digital signal processing part of ultrasonic flaw detection.

Digital signal processing systems for ultrasonic flaw detection have traditionally followed one of two schemes:

(1) Pure microcontroller type. This scheme has a lower cost. Its main disadvantage is that the MCU has a limited number of pins, which must supply the data and control lines for the peripheral devices, and there are often not enough of them. External decoding and latch circuits must therefore be added to expand the I/O ports, which makes the system bloated and increases the size of the device. The many discrete components added for I/O expansion also complicate the hardware connections, increase interference between signals, lower reliability and raise system power consumption[11-13].
(2) Pure CPLD/FPGA type. A CPLD/FPGA is built from basic macrocells or logic blocks. A chip contains hundreds of macrocells and abundant port resources, so digital circuits for a wide range of logic functions can be implemented quickly. It also works in parallel: signals propagate through multiple logic blocks simultaneously, which improves system efficiency. However, the pure CPLD/FPGA scheme also has shortcomings[14, 15]. First, its sequential control is weaker than that of a single-chip microcomputer. Parallel execution makes the device efficient, but it also makes timing difficult to reason about, and designers must fully consider the effect of timing on every signal during execution. Second, a CPLD/FPGA design must map onto real circuits, so building arithmetic units is troublesome. It requires a large number of hardware arithmetic circuits, which consumes internal resources and makes data processing algorithms harder to implement.

2. The overall design of the ultrasonic digital processing system

The analysis above shows that the FPGA and the MCU are strongly complementary, so we adopt a dual-core FPGA+MCU digital processing architecture. To address the limited computing power of an ordinary MCU, we use an STM32 instead of a conventional 51-series MCU. The structure of the entire system is shown in Figure 1, where the thick lines are data signal lines and the thin lines are control signal lines.

![Figure 1 Overall Block Diagram of System](image)

The received ultrasonic signal passes through the pre-processing circuit and enters the A/D conversion chip. The timing generation unit in the FPGA provides the conversion clock for the A/D chip, and the output of the A/D chip is fed to the input of the high-speed FIFO memory. The FIFO write clock is synchronized with the A/D conversion clock. When the controller detects that the FIFO memory is full, it prohibits further writes to the FIFO, and the serial generator supplies the FIFO with a read clock matched to the serial baud rate. The serial transmitter is then enabled to read the contents of the FIFO memory and send all the data to the STM32 chip over the serial link. After the transmission is completed, the controller in the FPGA re-enables writing to the FIFO memory, disables reading, and allows the timing generation unit to provide the clock to the A/D chip again.
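The fill-then-drain sequence described above can be sketched as a two-state controller. This is a minimal sketch under stated assumptions, not the authors' implementation: the module name `acq_ctrl` and the signals `fifo_full`, `tx_done` and `adc_clk_en` are hypothetical names introduced for illustration, and the read clock itself is assumed to be generated elsewhere.

```verilog
// Hypothetical sketch of the FPGA-side sequencing: fill the FIFO from the A/D
// converter, then drain it over the serial link. All names are assumptions.
module acq_ctrl (
    input  wire clk,         // controller clock
    input  wire rst,         // active-high synchronous reset
    input  wire fifo_full,   // assumed flag: FIFO has no room left
    input  wire tx_done,     // assumed flag: all FIFO data sent to the STM32
    output reg  wren,        // FIFO write enable
    output reg  rden,        // FIFO read enable
    output reg  adc_clk_en   // gate for the A/D conversion clock
);
    localparam ACQUIRE = 1'b0, DRAIN = 1'b1;
    reg state;

    always @(posedge clk) begin
        if (rst) begin
            state      <= ACQUIRE;
            wren       <= 1'b1;
            rden       <= 1'b0;
            adc_clk_en <= 1'b1;
        end else begin
            case (state)
                ACQUIRE: if (fifo_full) begin
                    // FIFO full: stop writing, gate the A/D clock, start reading
                    state      <= DRAIN;
                    wren       <= 1'b0;
                    rden       <= 1'b1;
                    adc_clk_en <= 1'b0;
                end
                DRAIN: if (tx_done) begin
                    // Transfer finished: re-enable writing and the A/D clock
                    state      <= ACQUIRE;
                    wren       <= 1'b1;
                    rden       <= 1'b0;
                    adc_clk_en <= 1'b1;
                end
            endcase
        end
    end
endmodule
```

Keeping the write and read phases mutually exclusive means the FIFO is never written and read at the same time, which simplifies the flag logic at the cost of pausing acquisition during each transfer.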

After receiving all the data, the STM32 processor performs digital filtering and waveform analysis on the data and displays the result on a liquid crystal display. The waveform data can also be uploaded to the host computer for further processing. The function keyboard is used to set the ultrasonic detection parameters, such as the transmission pulse width and interval.

3. Design of FIFO Memory

The FIFO memory is a first-in, first-out data buffer: the data that enters the FIFO first is read out first. It differs from ordinary memory in that it has no external read/write address lines, which makes it very simple to use. In the traditional pure MCU solution, the FIFO is an externally connected chip. In this design, some control signals are added to or removed from the standard structure of the FIFO unit. The pins of the modified function module are shown in Figure 2, and their functions are described in Table 1. The clocks required by the functional units of the design are generated by instantiating a PLL (phase-locked loop) in ISE. As shown in Figure 3, CLK_OUT2 is the A/D chip acquisition clock and the FIFO write clock, and CLK_OUT1 is the clock required by the serial port transmission module.

![Diagram](image.png)

Figure 2 Designed FIFO Memory Pin Schematic

Table 1 Pin Function Description

<table>
<thead>
<tr>
<th>Pin Name</th>
<th>Function Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Din[7,0]</td>
<td>Parallel Input of A/D Data</td>
</tr>
<tr>
<td>CLK</td>
<td>Data write clock (same frequency as the A/D sampling clock)</td>
</tr>
<tr>
<td>RST</td>
<td>Reset signal (active high). The FIFO cannot be written for 3 cycles after reset.</td>
</tr>
<tr>
<td>WREN</td>
<td>Buffer write enable signal (active high)</td>
</tr>
<tr>
<td>Dout[7,0]</td>
<td>Data Output Pin</td>
</tr>
<tr>
<td>RDCLK</td>
<td>Data Readout Clock</td>
</tr>
<tr>
<td>RDEN</td>
<td>Data read enable (reading starts on the clock cycle after it goes high)</td>
</tr>
<tr>
<td>EMPTY</td>
<td>Buffer readable status flag. Low level means data is readable. High level means no data is readable.</td>
</tr>
</tbody>
</table>
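To make the pin list in Table 1 concrete, the following is a simulation-only behavioral sketch of a FIFO with those pins. It is not the FIFO used in the design. The module name `fifo_model`, the `DEPTH_LOG2` parameter and the `FULL` output are hypothetical additions, the 3-cycle write lockout after reset is not modeled, and the pointers cross clock domains without synchronizers, so the model should not be synthesized as written.

```verilog
// Simulation-only behavioral model of the FIFO interface in Table 1.
// DEPTH_LOG2 and FULL are hypothetical additions for illustration.
module fifo_model #(parameter DEPTH_LOG2 = 10) (
    input  wire [7:0] Din,    // parallel A/D data
    input  wire       CLK,    // write clock (A/D sampling clock)
    input  wire       RST,    // active-high reset
    input  wire       WREN,   // write enable, active high
    output reg  [7:0] Dout,   // data output
    input  wire       RDCLK,  // read clock
    input  wire       RDEN,   // read enable, active high
    output wire       EMPTY,  // high: no data is readable
    output wire       FULL    // high: no room to write (assumed flag)
);
    reg [7:0] mem [0:(1 << DEPTH_LOG2) - 1];
    // One extra pointer bit distinguishes "full" from "empty"
    reg [DEPTH_LOG2:0] wr_ptr, rd_ptr;

    assign EMPTY = (wr_ptr == rd_ptr);
    assign FULL  = (wr_ptr[DEPTH_LOG2] != rd_ptr[DEPTH_LOG2]) &&
                   (wr_ptr[DEPTH_LOG2-1:0] == rd_ptr[DEPTH_LOG2-1:0]);

    // Write side: store one sample per CLK edge while enabled and not full
    always @(posedge CLK or posedge RST) begin
        if (RST)
            wr_ptr <= 0;
        else if (WREN && !FULL) begin
            mem[wr_ptr[DEPTH_LOG2-1:0]] <= Din;
            wr_ptr <= wr_ptr + 1'b1;
        end
    end

    // Read side: present the oldest sample on Dout per RDCLK edge
    always @(posedge RDCLK or posedge RST) begin
        if (RST) begin
            rd_ptr <= 0;
            Dout   <= 8'd0;
        end else if (RDEN && !EMPTY) begin
            Dout   <= mem[rd_ptr[DEPTH_LOG2-1:0]];
            rd_ptr <= rd_ptr + 1'b1;
        end
    end
endmodule
```

A vendor FIFO core handles the clock-domain crossing that this sketch ignores, which is one reason to generate the memory with the toolchain rather than write it by hand.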

![Diagram](image.png)

Figure 3 Clock Setting

4. Design of Communication Module

This module handles the communication between the FPGA and the ARM (STM32) module. The communication module is divided into three units: a frequency divider unit for generating the baud rate clock, a data transmitting unit for transmitting data, and a serial port control unit for global control. The overall framework is shown in Figure 4.

The workflow starts after the transmitting circuit has emitted the pulse and the received data has been stored in the FIFO, and proceeds as follows:

(1) The request terminal (REQ) of the serial port control unit sends a transmission request to the ARM. If the ARM is idle and can receive data, it issues the permission signal (ALLOW) to allow the transmission.

(2) The serial port control unit fetches one unit of data from the FIFO through Txdata[7,0], passes it to the serial port sending unit through Outdata[7,0], and produces a rising edge on WRSIG (transmission enable).

(3) On the rising edge of WRSIG, the serial port transmitting unit starts shifting the received frame out of the TX terminal one bit at a time at the baud rate. At the same time, it sets the line-busy signal (RI) to 1 to indicate that the line is busy.

(4) After the transmission is completed, the line-busy signal (RI) returns to idle, and the ARM receives the data and verifies it. The verification result is sent to the serial controller through ERROR. Based on the RI and ERROR signals, the serial controller either advances to the next data through RDCLK or commands the serial sending unit to retransmit the data.
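Steps (1) to (4) can be sketched as a small state machine. This is an illustrative sketch, not the authors' controller. The port names REQ, ALLOW, WRSIG, RI, ERROR, RDCLK, Txdata and Outdata follow the text, while the module name `serial_ctrl`, the `start` and `EMPTY` inputs, and the state encoding are assumptions. The sketch also assumes that the first FIFO byte is already present on Txdata, that ERROR is valid when RI falls, and that inputs from other clock domains have already been synchronized.

```verilog
// Hypothetical sketch of the handshake in steps (1)-(4); see assumptions above.
module serial_ctrl (
    input  wire       clk,      // controller clock
    input  wire       rst,      // active-high synchronous reset
    input  wire       start,    // assumed: FIFO is filled and ready to drain
    input  wire       ALLOW,    // STM32 permits transmission
    input  wire       RI,       // line-busy flag from the transmit unit (1 = busy)
    input  wire       ERROR,    // STM32 reports a failed check
    input  wire       EMPTY,    // FIFO has no more data
    input  wire [7:0] Txdata,   // current FIFO output byte
    output reg        REQ,      // transmission request to the STM32
    output reg        WRSIG,    // rising edge starts one frame
    output reg        RDCLK,    // rising edge advances the FIFO output
    output reg  [7:0] Outdata   // byte handed to the transmit unit
);
    localparam IDLE = 3'd0, REQUEST = 3'd1, LOAD = 3'd2,
               BUSY = 3'd3, DONE = 3'd4, NEXT = 3'd5;
    reg [2:0] state;

    always @(posedge clk) begin
        if (rst) begin
            state   <= IDLE;
            REQ     <= 1'b0;
            WRSIG   <= 1'b0;
            RDCLK   <= 1'b0;
            Outdata <= 8'd0;
        end else begin
            case (state)
                IDLE: if (start) begin        // step (1): raise the request
                    REQ   <= 1'b1;
                    state <= REQUEST;
                end
                REQUEST: if (ALLOW)           // wait for permission
                    state <= LOAD;
                LOAD: begin                   // step (2): hand one byte over
                    RDCLK <= 1'b0;
                    if (EMPTY) begin
                        REQ   <= 1'b0;        // nothing left: end the session
                        state <= IDLE;
                    end else begin
                        Outdata <= Txdata;
                        WRSIG   <= 1'b1;
                        state   <= BUSY;
                    end
                end
                BUSY: if (RI) begin           // step (3): frame in progress
                    WRSIG <= 1'b0;
                    state <= DONE;
                end
                DONE: if (!RI)                // step (4): resend or advance
                    state <= ERROR ? LOAD : NEXT;
                NEXT: begin                   // pulse RDCLK to fetch the next byte
                    RDCLK <= 1'b1;
                    state <= LOAD;
                end
                default: state <= IDLE;
            endcase
        end
    end
endmodule
```

On an error the machine returns to LOAD without pulsing RDCLK, so the same FIFO byte is sent again, which matches the retransmission behavior in step (4).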

The specific unit design is as follows:

### 4.1. Design of frequency divider unit

The baud rate is the reciprocal of the duration of one transmitted data bit. Considering that a serial port receiving unit may be added in the future, the frequency divider is intended to divide the 50 MHz crystal oscillator clock down to 16 times the required baud rate. For a baud rate of 9600, the division coefficient is 50 MHz / (16 × 9600) ≈ 325. Note that the module listed below counts 434 input clocks per output period, which gives about 115.2 kHz and corresponds to the 115200 baud send clock reported in Section 5. The divider module program is as follows:

```verilog
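// Divide the 50 MHz input clock by 434 to produce boclk (about 115.2 kHz).
module boclk(clk,boclk);
    input clk;
    output boclk;
    reg boclk;
    reg [15:0]cnt=0;
    initial boclk = 1'b0;       // defined level before the first toggle
    always @(posedge clk)
    begin
        if(cnt==16'd216)        // half period reached: drive the output high
            begin
                boclk <= 1'b1;
                cnt <= cnt + 16'd1;
            end
        else if(cnt==16'd433)   // full period reached: drive low and restart
            begin
                boclk <= 1'b0;
                cnt <= 16'd0;
            end
        else
            begin
                cnt <= cnt + 16'd1;
            end
    end
endmodule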
```
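The division coefficient can also be derived from parameters instead of being hard-coded. The sketch below is a hypothetical variant of the divider, not the authors' module: `baud_div`, `CLK_HZ` and `OUT_HZ` are assumed names. With the defaults shown, the coefficient evaluates to 434, the same period as the listing above. Setting `OUT_HZ` to 16 × 9600 = 153600 gives 326, because this sketch rounds 325.5 to the nearest integer, whereas truncating gives 325.

```verilog
// Hypothetical parameterized clock divider; names and defaults are assumptions.
module baud_div #(
    parameter CLK_HZ = 50000000,   // input clock frequency in Hz
    parameter OUT_HZ = 115200      // desired output clock frequency in Hz
) (
    input  wire clk,
    output reg  boclk = 1'b0
);
    // Input clocks per output period, rounded to the nearest integer
    localparam integer DIV  = (CLK_HZ + OUT_HZ / 2) / OUT_HZ;
    localparam integer HALF = DIV / 2;

    reg [31:0] cnt = 32'd0;

    always @(posedge clk) begin
        if (cnt == DIV - 1) begin
            cnt   <= 32'd0;        // end of period: restart and drive low
            boclk <= 1'b0;
        end else begin
            cnt <= cnt + 32'd1;
            if (cnt == HALF - 1)
                boclk <= 1'b1;     // midpoint: drive high
        end
    end
endmodule
```

Because the coefficient must be an integer, the generated clock deviates slightly from the target. For 50 MHz and 115200 Hz the actual output is 50,000,000 / 434 ≈ 115,207 Hz, an error well below 0.1%.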

![Figure 4 Communication Module Design Block Diagram](image)

### 4.2. Design of serial transmission unit

When the line is idle and the unit detects a valid transmit command, it loads the data to be transferred into the internal shift register. A counter is started at the same time, and the shift register shifts one bit to the TX port every 16 clock cycles, including the generated start, stop and parity bits. In the listing below, the counter advances by one bit position on every boclk cycle, so boclk is assumed to run at the baud rate itself. The serial port transmission module program is as follows:

```verilog
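// The source shows only the start bit and data bits 0 and 1. The module name,
// the port list, and the handling of data bits 2-7 and the stop bit are
// assumptions, based on the 10-bit frame (start, 8 data, stop) of Section 5.
module uarttx(
    input boclk,              // baud-rate clock from the divider unit
    input send,               // assumed: held high while frames are sent
    input [7:0] datain,       // byte to transmit
    output reg tx = 1'b1,     // serial output, idles high
    output reg idle = 1'b0    // line-busy flag: 1 = busy
);
    reg [3:0] cnt = 4'd0;     // bit position within the frame

    always @(posedge boclk)
    begin
        if (send == 1'b1) begin
            case (cnt)
                4'd0: begin               // Generate start bit
                    tx <= 1'b0;
                    idle <= 1'b1;
                    cnt <= cnt + 4'd1;
                end
                4'd1, 4'd2, 4'd3, 4'd4, 4'd5, 4'd6, 4'd7, 4'd8: begin
                    tx <= datain[cnt - 4'd1];   // Send data bits 0 to 7, LSB first
                    idle <= 1'b1;
                    cnt <= cnt + 4'd1;
                end
                default: begin            // Stop bit (assumed), then release the line
                    tx <= 1'b1;
                    idle <= 1'b0;
                    cnt <= 4'd0;
                end
            endcase
        end
        else begin                        // Not sending: keep the line idle
            tx <= 1'b1;
            idle <= 1'b0;
            cnt <= 4'd0;
        end
    end
endmodule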
```

### 4.3. Design of Serial Port Control Unit

The control unit is divided into three parallel parts, as shown in Figure 5. The STM32-related part sends the REQ signal and evaluates the ALLOW signal. The data caching part updates the FIFO output data and fetches the data from the FIFO into the register of the serial port control unit. The serial-port-related part receives the RI and ERROR signals, tells the data caching part whether to update the FIFO data, and issues the transmission start signal (WRSIG) to the serial port sending unit.

![Figure 5 Block Diagram of Serial Port Control Unit](image)

5. Experimental test

After completing the theoretical design, we used PROTEL to lay out the whole circuit as a compact printed circuit board and then fabricated it, as shown in Figures 6 and 7. We then tested the circuit. The test results are shown in Figure 8, where the third row from the bottom is the data read from the FIFO and the second row from the bottom is the send clock (baud rate 115200). The TX terminal is the serial port output. Transmission begins two cycles after the FIFO data arrives. One frame is 10 bits, and the FIFO output data is updated after each frame is sent. The test results show that the overall design of the system is reasonable, that the communication between the ARM and the FPGA is simple and efficient, and that communication errors can be detected and corrected in time.
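As a rough check on these figures, the time per frame and the time to drain the FIFO follow directly from the frame length and the baud rate. The FIFO depth $N$ is not given in the text, so $N = 1024$ below is purely an assumed value for illustration, and the estimate ignores handshake and retransmission overhead.

$$
T_{\text{frame}} = \frac{10\ \text{bits}}{115200\ \text{bit/s}} \approx 86.8\ \mu\text{s},
\qquad
T_{\text{drain}} = N \cdot T_{\text{frame}} \approx 1024 \times 86.8\ \mu\text{s} \approx 88.9\ \text{ms}
$$

This corresponds to a payload rate of 11,520 bytes per second, since each 8-bit sample costs 10 bit periods on the line.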

Figure 6 System design schematic diagram

Figure 7 Physical Chart of Dual Core System

6. Conclusion

This paper presents a new dual-core processing system for ultrasonic digital signals. First, the shortcomings of traditional ultrasonic digital signal processing systems are analyzed, and the overall design of the ultrasonic digital processing system is developed to address them. Next, the core FIFO memory module and the communication module of the system are designed in detail. Finally, the designed system is built as a physical board and tested. The test results show that, in the new system, the FPGA and the STM32 complement each other. The system takes advantage of the high efficiency and programmability of the FPGA to avoid excessive hardware connections, and it uses the STM32 to process data and to operate peripherals through existing libraries, which makes the whole system convenient to use. This work can provide technical support and theoretical guidance for ultrasonic non-destructive testing and has practical value.

Acknowledgments

This work was supported by Dongguan Municipal Science and Technology Bureau under grant No. 2016508140.

References

[1] SASAKI, K., R. KOMATA and K. SUYAMA, Ultrasound probe for ultrasonic flaw detector, has probe part arranged with welding part in width direction in position farther than oscillators, where second oscillator receives ultrasonic wave reflected in defect part.


[8] CHO, E., E. YANG, S. KIM, and W. LEE, Ultrasound apparatus for displaying ultrasound image, has controller to adjust brightness of ultrasound image displayed on touch screen at corresponding depth levels based on second time gain compensation (TGC) value set.

[9] GONG, P., S. CHEN and P. SONG, Method for producing contrast-enhanced ultrasound (CEUS) image using ultrasound system, involves generating difference data by computing difference between first and second decoded data, and producing image from difference data.

[10] SAKAI, T., Ultrasound diagnostic device for obtaining heart beat or movement of fetus of patient as ultrasound image, has hardware processor for controlling transmitter and receiver to generate ultrasound image data corresponding to input depth.


