Deep Dive: Sony Spresense with Edge Impulse (Part 2)

Embedded Devices, TinyML

This article takes a closer look at the Sony Spresense's CXD5602 SoC and CXD5247 power management/audio codec IC, and what they mean for tinyML.

Peter Ing

July 11, 2021

In part one of this blog series, we covered the Sony Spresense ecosystem at a high level. You will recall I mentioned that at the heart of the platform are the CXD5602 SoC and CXD5247 power management/audio codec IC.

Sony has provided extensive documentation, including the Spresense Hardware Documents. My intention is to orient you and help you make sense of things while starting out. It is not possible to go into every detail here, so I encourage you to explore the documentation on your own to find out more about the features specific to your application.

In the world of tinyML, different applications make use of different combinations of features. You will need to first work out which features you want to use, and then take the time to understand each relevant subsystem in more detail.

CXD5602 SoC

The SoC is structured as a set of domains, each dedicated to specific functional requirements, which communicate with each other within the SoC using a multi-layer implementation of Arm's AHB bus. This means multiple concurrent controller/agent communication channels exist, allowing cores to communicate simultaneously with different peripherals. This translates to improved concurrency and low I/O latency, which is what you want if you intend to push inference to the limits of the performance spectrum. As you will see, there are eight CPU cores in total: six high-performance Cortex-M4F cores dedicated to application code and two other cores dedicated to peripheral control.

CXD5602 architecture (Image: Technical Manual)

Diving a little deeper into the functional domains we find the following:

Application Domain: Here we find the six main application CPU cores, which are dedicated to user applications. They are the highest-performing cores on the chip, being Cortex-M4F (floating point and DSP capability). There are also functional subdomains, starting with the Audio Codec Subdomain, which offers hardware support for dual I2S channels, handles stereo microphones at up to 24 bits for high-quality audio, and interfaces directly with the CXD5247.

The Image Display and Processing Subdomain comprises the hardware interface for the camera as well as a hardware implementation of the basic 2D raster image processing functions (blit, rotate, scale and alpha). This makes it possible to offload some of the basic preprocessing from the main CPU cores so that they can be used more efficiently for inference.
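
To make the offloaded work more concrete, here is a minimal software sketch of one of those raster operations, alpha blending, over 8-bit grayscale pixels. It only illustrates the per-pixel arithmetic that a CPU core would otherwise have to perform. The function name `alpha_blend` and the list-of-rows image layout are assumptions made for illustration and are not part of the Spresense SDK or the hardware's actual interface.

```python
def alpha_blend(src, dst, alpha):
    """Blend two equally sized 8-bit grayscale images (lists of rows).

    alpha is 0..255: 255 keeps only src, 0 keeps only dst.
    """
    if len(src) != len(dst) or any(len(a) != len(b) for a, b in zip(src, dst)):
        raise ValueError("images must have the same dimensions")
    if not 0 <= alpha <= 255:
        raise ValueError("alpha must be in 0..255")
    out = []
    for src_row, dst_row in zip(src, dst):
        # Integer-only weighted average with rounding (+127 before dividing).
        out.append([(s * alpha + d * (255 - alpha) + 127) // 255
                    for s, d in zip(src_row, dst_row)])
    return out


# Hypothetical 2x2 test images.
foreground = [[255, 255], [0, 0]]
background = [[0, 100], [200, 50]]
blended = alpha_blend(foreground, background, 128)
print(blended)
```

Even this simple operation costs two multiplications and a division per pixel, which adds up quickly at camera resolutions and is the kind of work worth keeping off the cores that run inference.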

The Connectivity and Storage Subdomain provides a USB 2.0 device interface supporting all the main classes, an SD card controller, eMMC for bigger storage, and all the usual communication interfaces you typically expect to find on a 32-bit MCU (SPI, UART and I2C), along with DMA.

Finally, the Clock, Reset and Power Subdomain provides the internal clocking control, a real-time clock that supports full date and time formats, and power management that allows the power to each core to be controlled independently.

System and I/O Processor Domain: This domain is centered around an additional Cortex-M0+ core that can host I/O control applications so that the main CPU cores can be powered down to save power. It provides hardware mechanisms for inter-processor communication and the standard features you would expect, such as programmable interrupts and control over whether the cores boot from external SPI flash or USB/serial storage. There are also additional SPI, UART and I2C serial interfaces, which can be operated independently of the ones in the application domain under the control of the Cortex-M0+.

Sensor Domain: This provides yet another degree of parallelism through further serial interfaces, and also some key functionality for physical interfacing in the form of analog interfaces (ADC and PWM) that let you build real-world interfaces to physical processes. The sensor engine includes some nifty DSP functions for pre- or post-processing that help implement sensor fusion in hardware for really demanding real-time applications. There's even a hardware timestamping feature linked to the RTC that can be useful for logging, diagnostics, and IoT applications.

GNSS Domain: Finally, what could arguably be a somewhat unexpected bonus feature is the incorporation of GPS and GLONASS for positioning data, which could prove very useful for applications involving self-driving vehicles, robots, and drones. This domain has its own dedicated Cortex-M4 core, which can be used to preprocess all of the GPS data without impacting the application code doing inference.

As is now evident, there are not just the six M4F application cores but also the additional M0+ and M4 cores for the system I/O and GNSS respectively. The memory mapping used by these cores is quite interesting, as it is separated into three “Views” with overlapping windows to allow for sharing data between cores.

Memory maps (Image: User Guide)

The six main M4F cores all make use of the APP view and therefore share that address space, whereas the other two processors have dedicated address spaces. All of this once again ensures that a high degree of parallelism is possible.

CXD5247 Power Management and Audio

The CXD5247's functionality is divided into two blocks: the power block and the audio block.

Power Block: This contains all the power management functionality. The platform is designed to be powered from multiple power sources, and this is catered for by the power path control system as well as the built-in DC/DC converters and LDO regulators that supply the functional areas of the CXD5602. Battery charging functionality is also available, including a battery fuel gauge and USB charging, the use of which is optional.

Power block architecture (Image: CXD5247 Technical Manual)

A standard 32.768 kHz clock source is integrated to drive the RTC in the CXD5602, which avoids the need to add it as discrete components on the main board.
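
The 32.768 kHz figure is standard for real-time clocks because 32,768 is 2^15, so a chain of 15 divide-by-two stages reduces it to a 1 Hz tick. The short sketch below checks that arithmetic. It is a generic illustration only, with a hypothetical helper name `divider_stages`, and makes no claim about how the divider inside the CXD5247 or CXD5602 is actually built.

```python
CRYSTAL_HZ = 32_768  # standard watch-crystal frequency


def divider_stages(freq_hz):
    """Return how many divide-by-two stages reduce freq_hz to exactly 1 Hz."""
    stages = 0
    while freq_hz > 1:
        if freq_hz % 2:
            raise ValueError("frequency is not a power of two")
        freq_hz //= 2
        stages += 1
    return stages


stages = divider_stages(CRYSTAL_HZ)
print(stages, CRYSTAL_HZ >> stages)  # 15 1
```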

Audio Block: This includes support for both analog and digital microphones. A microphone bias generator provides the necessary bias voltages to analog microphones. Each analog channel has a complete analog front end, including a preamplifier and a delta-sigma analog-to-digital converter, which outputs a pulse density modulation (PDM) format that has been specifically adapted by Sony to enable optimal interfacing to the CXD5602 via the control logic block.
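
For readers unfamiliar with PDM, the sketch below shows a textbook first-order delta-sigma modulator turning an analog-style value into a 1-bit stream whose density of ones tracks the input level. It is a conceptual illustration only: the converter order, oversampling ratio, and Sony's adapted PDM format in the CXD5247 are not modeled, and the function names are hypothetical.

```python
def delta_sigma_pdm(samples):
    """First-order delta-sigma modulator.

    samples: floats in [-1.0, 1.0]. Returns a list of 0/1 bits whose
    local density of ones tracks the input amplitude.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # integrate input minus fed-back output
        bit = 1 if integrator >= 0.0 else 0  # 1-bit quantizer
        feedback = 1.0 if bit else -1.0      # 1-bit DAC in the feedback loop
        bits.append(bit)
    return bits


def density(bits):
    """Fraction of ones in the bit stream."""
    return sum(bits) / len(bits)


bits = delta_sigma_pdm([0.5] * 1000)
print(density(bits))  # close to 0.75: an input of +0.5 maps to about 75% ones
```

In this model an input x in the range -1 to 1 produces a ones density of roughly (x + 1) / 2, so a low-pass filter over the bit stream recovers the original signal.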

Audio block architecture (Image: CXD5247 Technical Manual)

The CXD5247 is also able to receive data from the CXD5602 in the same PDM format and provides a two-channel PDM to 1-bit PWM output via a push-pull 400 mW Class D driver (8 ohm load) for each channel, which can either be directly interfaced to speakers for lower-power applications or be interfaced to an external power amplifier. Note that the CXD5602PWBMAIN1 main board cannot directly drive speakers without some modifications to the board, as its Class D amplifier does not have a 3.3 V supply connected. The CXD5602PWBEXT1 extension board has the correct power supply connected to the Class D amplifiers via the board-to-board connector. The CXD5247 supports up to 24-bit/192 kHz high-quality audio for both recording and playback.
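
To put those audio figures in perspective, the following back-of-the-envelope sketch works out the RMS voltage that 400 mW into 8 ohms implies (from P = V²/R) and the raw data rate of uncompressed 24-bit/192 kHz PCM. The helper names are hypothetical, the load is treated as purely resistive, and the two-channel count used for the data rate is an assumption.

```python
import math


def rms_voltage(power_w, load_ohm):
    """RMS voltage across a resistive load dissipating power_w (P = V^2 / R)."""
    return math.sqrt(power_w * load_ohm)


def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw, uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


v_rms = rms_voltage(0.4, 8.0)                # 400 mW into 8 ohm
rate = pcm_bytes_per_second(192_000, 24, 2)  # two channels is an assumption
print(f"{v_rms:.2f} V RMS, {rate} bytes/s")  # 1.79 V RMS, 1152000 bytes/s
```

At over a megabyte per second, the highest-quality audio setting is worth weighing against the buffering and preprocessing budget of a tinyML application, which may be served just as well by a lower sample rate.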

The architecture of the CXD5602 lends itself to a high degree of parallelism, low power consumption, and powerful audio and video processing capabilities. It's evident that Sony has thought hard about how to balance performance, power efficiency and features in the design of the Spresense platform.

Edge Impulse has made tinyML accessible to the masses, and so far a lot of amazing things have been done with existing MCU platforms, which have found a new purpose running ML models. Many of these were not originally intended for inference and can do so thanks to optimizations and innovations such as the EON Compiler.

Imagine what can be built using hardware that is also optimized for real-time AI. An inspiring example is the shape-changing robot that the Japan Aerospace Exploration Agency (JAXA) is developing based on the Spresense platform.

Once you are ready to deploy your impulses, take a look at the Spresense firmware to get started building your next amazing thing.
