Real-Time Scrolling Spectrogram
2016-08-15 | By Varun Hedge
License: General Public License
Ever wondered what your voice looks like? Or perhaps you want to see it in a stylish, interactive way? This project shows the audio frequencies of your voice in real time as a scrolling (sliding) spectrogram on a 4-bit grayscale display.
The project is low cost and user friendly. You just need the following items to make it work:
- ATMega1284p
- STK500 Programmer
- 6” Prototype PCB
- Power Supply
- MCU PCB
- LCD TV
- Header Pins
- DIP Sockets
- MAX295 8th-order LPF
- 2N3904 NPN Transistor
- Pushbutton Switch
- Audio Jack
- Sliding Switch
- Microphone
- Assorted 5% Resistors
- Assorted Capacitors
The central idea of the project is to display audio as a real-time spectrogram, and then determine the similarity of two audio inputs by analyzing the spectrograms.
Figure 1: Block Diagram of the Project
As shown in the block diagram above, two ATMega1284p microcontrollers were used: one dedicated to frequency analysis and the other to video generation. The operations need to run in real time, so the input audio must be sampled at precisely timed intervals. At the same time, data transfer and processing on the video MCU must be done during the “blank lines” of the TV signal.
Since the project deals with frequencies, two pieces of mathematical background matter: how to determine the sampling frequency, and how to convert a time-domain signal into its frequency representation. We used the Nyquist-Shannon sampling theorem to determine our sampling frequency, and the Discrete Fourier Transform (DFT) for the frequency representation because the project deals with finite digital systems.
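For reference, the two results named above can be written out as follows. The symbols are assumptions for illustration and do not come from the original project report: f_s is the sampling frequency, f_max is the highest frequency left in the signal after low-pass filtering, x[n] is the sampled signal, and N is the number of samples per transform.

```latex
% Nyquist-Shannon criterion: the sampling frequency must exceed twice
% the highest frequency present in the (filtered) input signal.
f_s > 2 f_{\max}

% N-point DFT of the sampled signal x[n].
X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j 2 \pi k n / N}, \qquad k = 0, 1, \dots, N-1

% Frequency represented by bin k (bins are spaced f_s / N apart).
f_k = \frac{k \, f_s}{N}
```

For a real-valued input such as audio, only bins 0 through N/2 carry unique information, so a spectrogram column typically shows |X[k]| for that range.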
Design Implementation
The audio input uses a standard 3.5 mm stereo audio jack socket and a microphone. The audio amplifier circuit accepts the stereo line-in signal, the microphone audio, and a 150 kHz input clock signal. The line-in and microphone inputs have separate gain stages. Using Texas Instruments’ LM358 op-amps and a mechanical switch, the circuit selects one of the two signals to feed to the MAX295 8th-order low-pass Butterworth filter. The filtering is done before the analog-to-digital conversion to prevent aliasing while preserving as much of the signal as possible.
A toggle switch is also included in the design to provide play/pause, speed control, and log/linear conversion functions for the video and audio microcontroller units. For the 4-bit (16-level) grayscale generation, we borrowed a circuit built by Francisco Woodland and Jeff Yuen for their Gray-Scale Graphics: Dueling Ships final project. The borrowed circuit can display a 128 x 96 pixel image in 4-bit grayscale (16 intensities) by using memory-map compression schemes.
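To make the memory cost of a 128 x 96, 4-bit image concrete, here is a minimal sketch of a packed frame buffer that stores two pixels per byte (6144 bytes in total, which fits in the ATmega1284P's 16 KB of SRAM). The nibble layout and the names `frame`, `set_pixel`, and `get_pixel` are assumptions for illustration; the article does not describe the actual memory map used by the borrowed circuit.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W 128u
#define SCREEN_H 96u
/* Two 4-bit pixels per byte: 128 * 96 / 2 = 6144 bytes. */
#define FRAME_BYTES (SCREEN_W * SCREEN_H / 2u)

/* Hypothetical packed frame buffer (not the original project's layout). */
static uint8_t frame[FRAME_BYTES];

/* Byte index holding pixel (x, y). */
static uint16_t pixel_index(uint8_t x, uint8_t y)
{
    return (uint16_t)(((uint16_t)y * SCREEN_W + x) / 2u);
}

/* Assumed layout: even x in the high nibble, odd x in the low nibble. */
void set_pixel(uint8_t x, uint8_t y, uint8_t gray)
{
    assert(x < SCREEN_W && y < SCREEN_H && gray < 16u);
    uint16_t idx = pixel_index(x, y);
    if (x & 1u)
        frame[idx] = (uint8_t)((frame[idx] & 0xF0u) | gray);
    else
        frame[idx] = (uint8_t)((frame[idx] & 0x0Fu) | (uint8_t)(gray << 4));
}

/* Read back the 4-bit intensity (0 = black, 15 = white) at (x, y). */
uint8_t get_pixel(uint8_t x, uint8_t y)
{
    assert(x < SCREEN_W && y < SCREEN_H);
    uint16_t idx = pixel_index(x, y);
    return (x & 1u) ? (uint8_t)(frame[idx] & 0x0Fu)
                    : (uint8_t)(frame[idx] >> 4);
}
```

Calling `set_pixel(0, 0, 15)` and then `set_pixel(1, 0, 3)` writes both values into the same byte without disturbing each other, which is the property any packed 4-bit scheme has to preserve.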
To integrate the hardware, we started by mounting the two MCUs on their own target boards, as shown in the figure below.
The target boards and push buttons are mounted on a single breadboard to allow easy connection between the two microcontrollers. The two MCUs share a common ground but use separate 9 V or 12 V power supplies. The audio amplification and filtering circuits are soldered on a separate board since they have many more components, with pin sockets soldered on for easy wiring between the MCUs and the circuit inputs and outputs.
Figure 2: Final Circuit Wiring
The software implementation was divided between the code for the audio microcontroller and the code for the video microcontroller. The audio MCU is responsible for data acquisition, analog-to-digital conversion, FFT conversion of the digital time-domain signal into the frequency domain, and transmission of the processed data to the video MCU through USART. The video MCU is then responsible for receiving the data from the audio MCU, performing the visualization processing in 4-bit grayscale, implementing a circular buffer to get a scrolling display, and transferring the data to the NTSC television.
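The circular buffer mentioned above can be sketched roughly as follows. This is an illustrative assumption rather than the project's code: the names `push_column` and `pixel_at`, the 128-column by 96-bin dimensions, and the one-byte-per-pixel storage are chosen for clarity (a real 4-bit frame buffer would more likely pack two pixels per byte).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COLS 128u  /* assumed: one spectrogram column per horizontal pixel */
#define NUM_BINS 96u   /* assumed: one frequency bin per vertical pixel */

/* Hypothetical ring of columns; each entry is a 4-bit gray level. */
static uint8_t columns[NUM_COLS][NUM_BINS];
static uint8_t head = 0;  /* slot that the next incoming column overwrites */

/* Store a newly received column of gray levels, replacing the oldest one. */
void push_column(const uint8_t *gray)
{
    assert(gray != 0);
    for (uint8_t i = 0; i < NUM_BINS; i++)
        columns[head][i] = (uint8_t)(gray[i] & 0x0Fu);
    head = (uint8_t)((head + 1u) % NUM_COLS);
}

/* Gray level at a screen position: screen_x = 0 is the oldest column
   (left edge) and screen_x = NUM_COLS - 1 is the newest (right edge). */
uint8_t pixel_at(uint8_t screen_x, uint8_t bin)
{
    assert(screen_x < NUM_COLS && bin < NUM_BINS);
    uint8_t col = (uint8_t)((head + screen_x) % NUM_COLS);
    return columns[col][bin];
}
```

The point of the design is that scrolling costs nothing: each new column overwrites only the oldest slot and advances `head`, so no pixel data is ever shifted in memory, which matters when the video MCU has only the blank lines to do its work.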
The FFT part in the audio MCU and the video display part in the video MCU were based on the example code shared by Bruce on the public course website. We used AVR Studio 4 v4.15 with the WinAVR GCC Compiler version 20080610 to build the code and program the microcontrollers. The crystal frequency was set to 16 MHz and the compiler optimization was set to -Os, which optimizes for code size.
In testing, the project performed very well: it delivered a fully functional, accurate spectrogram visualization of the input audio and its frequency content. Interested in trying this yourself? Read more about the project by clicking the source link below. It might spark an interest in audio signals, or be a good start on learning a new skill!