Sunday, 28 February 2016

i.MX6SX - Video processing via VADC & PXP (UDOO Neo)

Unlike the rest of the i.mx6 family, the i.mx6sx has limited hardware video decoding support; for example, there is no H.264 or MPEG decoder. Instead the i.mx6sx hosts a simpler video analogue to digital converter (VADC) which converts an NTSC or PAL signal to a YUV444 format. The YUV444 output is slightly unusual in that it's encoded as a 32 bit value, which equates to YUV32 in V4L2 terms. Given the unusual YUV32 output, decoding becomes more interesting because it relies heavily on the PXP (Pixel Pipeline) engine for colour space conversion and onward rendering to a display. The PXP can be considered a smaller brother of the IPU (Image Processing Unit) found on the rest of the i.mx6 family. In addition to colour space conversion the PXP can flip (horizontally/vertically), scale and blend/overlay images from multiple sources. The one caveat is that the PXP on the i.mx6sx accepts only a limited set of input and output formats.
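
As a rough software illustration of the colour space conversion that the PXP performs in hardware, the sketch below converts a single packed 32 bit YUV444 pixel to RGB. The byte order (padding, Y, U, V from most to least significant byte), the full-range BT.601 coefficients and the function name are assumptions made for this example; they are not taken from the VADC or PXP documentation.

```python
def yuv32_to_rgb(pixel):
    """Convert one packed 32 bit YUV444 pixel to an (R, G, B) tuple.

    Assumed layout (hypothetical): bits 31-24 padding, 23-16 Y, 15-8 U, 7-0 V.
    Assumed maths: full-range BT.601 conversion.
    """
    y = (pixel >> 16) & 0xFF
    u = ((pixel >> 8) & 0xFF) - 128   # centre chroma around zero
    v = (pixel & 0xFF) - 128

    def clamp(value):
        # Round to the nearest integer and keep the result in 0..255
        return max(0, min(255, int(round(value))))

    r = clamp(y + 1.402 * v)
    g = clamp(y - 0.344136 * u - 0.714136 * v)
    b = clamp(y + 1.772 * u)
    return r, g, b


if __name__ == "__main__":
    # Mid grey: Y=128 with neutral chroma
    print(yuv32_to_rgb(0x00808080))
```

Doing this per pixel on the CPU for a 720x480 stream is expensive, which is why offloading it to the PXP matters.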

Engine check sensor image
To give you an idea of what the PXP can do, the video above shows the output of an NTSC camera blended with a graphics overlay. The camera is actually a parking (reversing) camera (courtesy of the UDOO team); this is highlighted by the red/yellow/green lines and the blinking 'STOP' text which form part of the camera output. Given the automotive theme, the graphics overlay represents an engine check sensor dash. The output video (720x480) fits nicely on the UDOO 7" LVDS display. As the UDOO Neo is targeted towards sensor technology (IOT), we use the tilt angle from the gyroscope to determine the amount of alpha blending to apply (the graphics fade away or become fully visible). Furthermore, depending on the axis of the tilt, the combined video output is flipped horizontally or vertically (in real time), noticeable by the 'STOP' text and graphics appearing reversed or upside down.
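
The tilt-to-alpha idea can be sketched as below. This is only an illustration: the 45 degree limit, the choice that a level board shows the overlay fully, and both function names are assumptions, and in the demo the blending itself is done by the PXP rather than in software.

```python
def tilt_to_alpha(tilt_deg, max_tilt_deg=45.0):
    """Map a tilt angle in degrees to a global alpha value (0..255).

    Assumption: a level board gives a fully visible overlay (255) and the
    overlay fades out linearly until max_tilt_deg is reached (0).
    """
    ratio = min(abs(tilt_deg), max_tilt_deg) / max_tilt_deg
    return round(255 * (1.0 - ratio))


def blend_channel(overlay, video, alpha):
    """Alpha blend one 8 bit colour channel of the overlay over the video."""
    return (overlay * alpha + video * (255 - alpha)) // 255


if __name__ == "__main__":
    for angle in (0, 10, 30, 60):
        alpha = tilt_to_alpha(angle)
        print(angle, alpha, blend_channel(200, 100, alpha))
```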

Given the above, I have put together a simple example that demonstrates how to read the camera output and pass it to the PXP for onward processing. The example is partially based on existing Freescale code snippets but modified to use the PXP and then simply render to a frame buffer. The example will only run against a frame buffer and not on X11. Unlike the above video, the example does not use GPU hardware acceleration, therefore the CPU usage will be high. In order to build the example you require the PXP header/library to be built and installed; these are part of the imx-lib package.

To build:

make clean

To run:


Optional parameters

-t Time in seconds to run
-h flip image horizontally
-v flip image vertically
-i invert colours

Sunday, 31 May 2015

i.MX6SX - UDOO NEO Early hardware prototype

A few weeks back I received an early hardware prototype of the UDOO NEO. It hosts one of the newer members of the i.mx6 family, the i.MX 6SoloX, which integrates a Cortex A9 with a Cortex M4. This seems to be Freescale's first venture with a heterogeneous SOC; interestingly there may be more down the road with the introduction of the i.mx7 (Cortex A7 + Cortex M0/M4).

What is striking about the i.mx6sx architecture is that the primary processor is the Cortex A9. Therefore to boot, the i.mx6sx requires uboot (or another i.mx6 bootloader) to be present. Once the A9 is active the secondary processor (Cortex M4) can be brought on-line, either through uboot or after the A9 starts running Linux. In either case, application code (and/or a bootloader) has to be made available to the M4 via on board flash or loaded into RAM. The M4 is brought on-line via a reset, following the standard Cortex M4 start up process of booting from a vector table at address zero (which in the A9 world is mapped to 0x007f8000). From what I understand address zero seems to be 32K of TCM (Tightly-coupled memory). The M4 seems to be capable of interacting with most of the on-board peripherals provided by the i.mx6sx. Other areas of interest on the i.mx6sx are:
  • A semaphore mechanism (SEMA4) to help manage contention between the processors
  • A messaging mechanism (MU) to enable the two processors to communicate and coordinate by passing messages
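
To make the address mapping described above concrete, here is a small sketch that translates an M4 address into the address at which the A9 would see it, and reads the first two entries of an M4 image's vector table. The 0x007f8000 base and 32K size come from the text; the helper names are hypothetical. The layout of the first two vector table words (initial stack pointer, then reset handler) is the standard Cortex-M arrangement.

```python
import struct

TCM_A9_BASE = 0x007F8000   # where the A9 sees M4 address zero (from the text)
TCM_SIZE = 32 * 1024       # 32K of TCM (as understood in the text)


def m4_to_a9(m4_addr):
    """Translate an M4 TCM address into the A9's view of the same memory."""
    if not 0 <= m4_addr < TCM_SIZE:
        raise ValueError("address is outside the 32K TCM window")
    return TCM_A9_BASE + m4_addr


def read_vectors(image):
    """Return (initial stack pointer, reset handler) from an M4 binary image.

    Cortex-M cores load the stack pointer from offset 0 and the reset
    handler address from offset 4, both little-endian 32 bit words.
    """
    return struct.unpack_from("<II", image, 0)


if __name__ == "__main__":
    print(hex(m4_to_a9(0x0)))   # the vector table as seen from the A9
```

A loader running on the A9 would copy the M4 image to the translated address before releasing the M4 from reset.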
The i.mx6sx isn't intended as a multimedia processor and this is highlighted by the lack of a VPU and the inclusion of a low spec GPU, the Vivante GC400T, which claims up to 720p@60 fps and is limited to OpenGL ES 2.0.

What is unusual is the inclusion of the video analogue to digital converter (PAL/NTSC). Given that Freescale's target market is automotive, I suspect this is for applications like reverse parking.

In its current form the NEO has a small footprint (85mm x 60mm) and the notable on board peripherals are:
  • Ethernet (10/100) - KSZ8091RNA
  • Wifi/bluetooth - WL1831MOD
  • HDMI  - TDA19988
  • Gyroscope - FXAS21002C
  • Accelerometer + Magnetometer - FXOS8700CQ
  • 2 user defined LEDs
  • 1 USB 2.0 type A female connector
The FXAS21002C and FXOS8700CQ are currently interfaced through I2C and the datasheets indicate a maximum clock speed of 400kHz (in Fast Mode).
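
To put the 400kHz figure in perspective, the back-of-envelope sketch below estimates how long one register burst read takes on the bus. It assumes a typical I2C register read (address + write, register number, repeated start, address + read, then the data bytes), counts 9 clocks per byte (8 bits plus ACK/NACK) and ignores start/stop conditions and any clock stretching. The 6 byte read size (three 16 bit axes) is an assumption for illustration.

```python
def i2c_burst_read_time_us(data_bytes, clock_hz=400_000):
    """Estimate the bus time in microseconds for one register burst read.

    Three overhead bytes (address+W, register, address+R) plus the data
    bytes, each costing 9 clock cycles on the bus.
    """
    clocks = 9 * (3 + data_bytes)
    return clocks * 1_000_000 / clock_hz


if __name__ == "__main__":
    per_read = i2c_burst_read_time_us(6)
    print(f"{per_read} us per 6 byte read")
```

Under these assumptions a 6 byte read occupies the bus for about 0.2ms, so polling two sensors at high sample rates starts to consume a noticeable share of the bus.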


Software for the NEO is still in the early stages of development, so it's taken me a couple of weeks of development effort to get uboot and a kernel functioning. Above is a short video demonstrating the fruits of my efforts. On power up we launch a small application on the M4 from uboot which toggles (flashes) the red LED and continually outputs a message on UART 2 (shown later in the video). We then boot the kernel to an Ubuntu rootfs and launch an OpenGL ES application which outputs a rotating square to HDMI (720p). The speed of rotation increases the more the board is tilted to the left or right (using the accelerometer). Note the red LED continually toggles, highlighting the fact that the Cortex M4 is uninterrupted by the A9.

Having started to develop code for the TDA19988 interface, I found that one useful feature of this controller is its ability to support DVI or HDMI displays. The downside is that the GPU struggles past 720p in my tests.

Another point to note is that there is no on board flash for the M4. Therefore code for the M4 has to be loaded into RAM, which means reserving a small portion away from the kernel. This introduces another layer of complexity as the reserved area needs to be protected from accidental access.

Developing for the board definitely gives you a good insight into its usability. This leads to my wish list, if the board were to be improved:

1. 2 x USB 2.0 type A connectors, for keyboard & mouse useful when used with HDMI.
2. 2 x Ethernet (definitely make it stand out from the competing SBCs).
3. Use SPI for FXAS21002C & FXOS8700CQ or ideally replace with 9 axis IMU.
4. I'd prefer the sensors to be on a separate pcb that can be connected via a jump cable to the main board. The main reason being that manoeuvring the board to test the sensors with the cables attached (ethernet, serial, power, usb, hdmi) isn't that practical.
5. On board JTAG support, it is very difficult to debug the M4 code without it!
6. There are two UART outputs on the NEO, one for the A9 serial console and the other for the M4. Similar to the UDOO it would be very useful if these were exposed via a single USB connector. In the current design you need to hook up 2 serial-to-USB adapters.

Wednesday, 15 April 2015

IOT - ESP8266 (ESP-201) + CC1110 (XRF)

Over the past few months there has been a huge amount of interest in the ESP8266, which offers a low cost Serial to WIFI solution. The introduction of the ESP8266 by Espressif coincides with the increased hype of IOT (Internet of Things) as the next major emerging technology. Realistically IOT has been around for decades in different guises so the concept isn't new. I'm guessing the renewed interest is partly because sensors, processors and networks are dirt cheap and wireless technology is ubiquitous. Coupled with the rise of the maker culture it now means the cost of entry is low for connecting devices.

There are numerous modules available that host an ESP8266 and these can be found for a few dollars, however most have limited pins broken out: UART plus some GPIO. In my opinion a better option is the ESP-201 which has a larger number of pins broken out and an external U.FL antenna connector (IPX antenna). With a slight modification the module can be made breadboard compatible.

The main drawbacks with the ESP-201 are:

1. Lack of CE/FCC compliance (the ESP8266 is not the module).
2. The pin markings are on the underside of the PCB and therefore not visible when sitting in a breadboard.
3. The UART pins protrude on the underside of the PCB which means it can't be plugged into a breadboard without bending these pins at a right angle (as shown) or ideally de-soldering the pins and re-attaching the correct way round.

To evaluate the ESP8266 I chose to integrate it with an existing wireless (868MHz) set up which is used for monitoring a number of sensors as well as providing periodic temperature readings. The existing radio set up uses Ciseco's XRF which hosts TI's low power CC1110 transceiver. It runs custom developed firmware that is controlled through one of the two CC1110 UART ports. The plan was to extend the firmware so that the CC1110 could be controlled over TCP by running a socket server on the ESP8266. Compared to the ESP8266 the CC1110 has a wealth of reference documentation, which increases the options for interfacing with other devices in addition to easing the programming burden. The main drawback with the CC1110 is the cost of the development tools, although it is possible to use SDCC (Small Device C Compiler) as an alternative.

Initially I was hoping to use I2C/SPI as the interface between the CC1110 and the ESP8266. However due to the CC1110 not supporting hardware I2C and coupled with the fact that I had just a few free I/O pins remaining I was left with one option that was to use the second UART port.

Espressif provide an SDK that can be flashed to the ESP8266 which provides a simple AT command style interface to program the device. Note, the SDK is continually being updated so check for later releases. One quirk with the ESP-201 is that IO15 has to be grounded for the device to function. To flash the device IO0 has to be grounded. Instructions for SDK set up on Linux can be found here. To flash the AT firmware for SDK 1.0 on Linux we can issue the following command:

write_flash 0x00000 boot_v1.2.bin 0x01000 at/ 0x3e000 blank.bin 0x7e000 blank.bin

After flashing the ESP-201 the bulk of the coding was done on the CC1110, which mainly entailed sending AT commands to the ESP-201 to initiate a connection to the access point, launch a socket server and send updates from the sensors. The sequence of AT commands was similar to this:

AT+CWJAP="SSID","Your Password"

After coding up the AT commands on the CC1110 I could test by launching a telnet session on port 80 to the ip address allocated via DHCP from the AP. Output sensor data from both the CC1110 UART port and ESP-201 is shown below.

Coding the above highlighted a number of pitfalls with the AT firmware and hardware.

  • It can be tedious to parse the verbose and inconsistent responses returned by the ESP8266 to AT commands. To tone down the verbose responses I used ATE0, however it's not permanent so it needs to be sent again after a reset.
  • Resetting (AT+RST) or joining an access point (AT+CWJAP) can be slow therefore you need to carefully select relevant time out values.
  • STA mode (AT+CWMODE=1) can silently disconnect after a random time.
  • The ESP8266 isn't particularly well suited as a battery powered device because it can consume up to 300mA.
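
The first two pitfalls (inconsistent responses and slow commands) can be handled by collecting lines until a final result code arrives or a per-command time out expires. The sketch below is written in Python for readability rather than as CC1110 firmware, and everything in it is illustrative: read_line is a hypothetical callable returning the next received line (or None if nothing has arrived), the time out values are guesses to be tuned, and treating "OK", "ERROR" and "FAIL" as the final result codes is an assumption about the AT firmware.

```python
import time

# Assumed final result codes; anything else is treated as response body.
FINAL_CODES = ("OK", "ERROR", "FAIL")

# Illustrative per-command time outs in seconds (guesses, tune on hardware).
TIMEOUTS_S = {"AT+RST": 5.0, "AT+CWJAP": 20.0}
DEFAULT_TIMEOUT_S = 2.0


def timeout_for(command):
    """Pick a time out based on the command name before any '=' or '?'."""
    name = command.split("=")[0].split("?")[0]
    return TIMEOUTS_S.get(name, DEFAULT_TIMEOUT_S)


def read_at_response(read_line, timeout_s, clock=time.monotonic):
    """Collect response lines until a final result code or the time out.

    Returns (success, lines), where lines excludes blank lines and the
    final result code itself. A time out is reported as a failure.
    """
    deadline = clock() + timeout_s
    lines = []
    while clock() < deadline:
        line = read_line()
        if line is None:          # nothing received yet, keep waiting
            continue
        line = line.strip()
        if not line:              # skip the blank lines between responses
            continue
        if line in FINAL_CODES:
            return line == "OK", lines
        lines.append(line)
    return False, lines
```

In use, the caller would send the command, then call read_at_response(read_line, timeout_for(command)) and only inspect the body lines once success is reported.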

It is possible to write your own firmware instead of using the pre-built AT firmware, which in my opinion is a better option. Espressif provide a set of closed source C libraries which offer a finer level of control compared to the AT firmware. Having spent a considerable amount of time writing custom firmware to interface to the CC1110, here are my findings:

  • Although there is a second UART available on the ESP8266, in most circumstances only the TX pin is available (primary use is for debugging) because the RX pin is used as part of the SPI interface for the flash memory.
  • There is no in-circuit debugging option, so you're reliant on sending tracing output to the UART port or somewhere else.
  • Although SSL support is provided, it seems to be a hit and miss affair between SDK versions.
  • The API is closed source, so you're reliant on Espressif providing regular updates for new features or bug fixes.
  • No hardware encryption.
  • Not all I/O features are available eg RTC or I2C.

Given the amount of attention the ESP8266 has received it is fair to say it does offer a low cost and rapid approach to prototyping a WIFI solution for existing hardware or for a new application. However you could argue that most of the attention has come from hobbyists and not commercial ventures. In my opinion it is worth exploring other WIFI SOCs that are coming to the market this year such as:

Furthermore it is still not clear whether WIFI (2.4GHz or 5GHz) is the ideal medium for wireless IOT as the wake up and connect times aren't particularly quick. The other point to make is that the cost of some of the above SOCs can make them overlap with traditional networking SOCs which are used in low cost router boards. One example is the AR9331 which supports a full Linux stack and can be used for video streaming or complex routing, something the WIFI SOCs may find hard to achieve.

Sunday, 25 January 2015

RK3288 - Firefly development board

I received this board just over a month ago from the Firefly team and have been keen to assess its development capabilities given it hosts a quad core Cortex A17 (or technically a Cortex A12) processor.

On board is a Rockchip RK3288 processor which on initial glance has a pretty decent specification:

  1. 4Kx2K H.264/H.265(10-bit) video decoder
  2. 1080p H.264 video encoder
  3. Up to 3840X2160 display resolution
  4. 4Kx2K@60fps HDMI 2.0
  5. eDP 4Kx2K@30fps
  6. According to Rockchip the GPU core is listed as a Mali-T764 GPU although it's reported as a Mali-T760 by the Mali drivers.
  7. Ethernet Controller Supporting 10/100/1000-Mbps 

Given the above I think it is important to clarify what 4Kx2K actually means, and the clue is in point 3. Having spent many hours debugging the kernel display driver code, it turns out the RK3288 has 2 display controllers known as VOP_BIG and VOP_LIT (I presume for little). VOP_BIG supports a maximum resolution of 3840x2160, which equates to 4K UHD (Ultra high definition television), and for VOP_LIT it's 2560x1600 (WQXGA). Each controller can be bound to a display interface eg HDMI, LVDS or eDP (sadly missing on the Firefly). If you define 4Kx2K as 4096x2160, also known as DCI 4K, then the definitions can be misleading. The numbers also align with H264/VP8/MVC decoding which max out at 2160p@24fps (3840x2160), although the HEVC (H265) decoder contradicts this by supporting 4k@60FPS (4096x2304). What is also interesting is that the image processing engine can up scale to 3x the input, which would imply 720p can be up scaled to 4K UHD.
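
The resolution arithmetic above is easy to check. The short sketch below (helper names are hypothetical) confirms that 720p scaled by 3x lands exactly on 4K UHD, and that a DCI 4K frame does not fit within the 3840x2160 limit quoted for VOP_BIG.

```python
RESOLUTIONS = {
    "720p": (1280, 720),
    "WQXGA": (2560, 1600),
    "4K UHD": (3840, 2160),
    "DCI 4K": (4096, 2160),
}


def pixel_count(name):
    """Total number of pixels in a named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def upscaled(name, factor):
    """Resolution obtained by scaling both dimensions by factor."""
    width, height = RESOLUTIONS[name]
    return width * factor, height * factor


def fits_within(name, limit_name):
    """True if the named resolution fits inside the limit in both dimensions."""
    width, height = RESOLUTIONS[name]
    max_width, max_height = RESOLUTIONS[limit_name]
    return width <= max_width and height <= max_height


if __name__ == "__main__":
    print(upscaled("720p", 3) == RESOLUTIONS["4K UHD"])
    print(fits_within("DCI 4K", "4K UHD"))
    print(pixel_count("DCI 4K") - pixel_count("4K UHD"))
```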

The Firefly seems to be based on an Android TV box reference design and it could be argued that it's targeted as an Android centric development board. The noticeable peripherals are:

1. VGA support + HDMI
2. On board microphone
3. Wifi Connector (RF Connector SMA Female)
4. IR Sensor
5. eMMC 16GB

Firefly supply a pre-built dual boot image (on eMMC) that offers Android 4.4 and Ubuntu 14.04.

Android 4.4

Firefly supply the Android SDK sources so that customised images can be built from source. What is nice is that Android Studio integrates well with the Firefly, which eases development of Android Apps especially for newcomers. Furthermore the App can be remote debugged while running on the Firefly. I suggest that you sign your App with the platform key from the SDK to ease integration when remote debugging. One pitfall to bear in mind is that Android 4.4 implements SELinux so you may find accessing I/O via sysfs (eg GPIO) from your Android App is severely restricted.

Ubuntu 14.04

The Ubuntu image uses a patched version of the Android kernel and unfortunately has no support for GPU/VPU acceleration.

Linux Support

Historically many ARM SOC vendors have side stepped any request for providing meaningful Linux support and instead rely on the developer community to progress this as far as they can. Unfortunately the situation is normally exacerbated by the lack of co-operation from the SOC vendor on GPU/VPU support. What's clearly not recognised by ARM is that this fragmentation is benefiting Intel, with their latest lower power SOCs having far superior out of box Linux support.

As of today I would argue the RK3288 falls midway between no and full Linux support. The reason for this is Rockchip's effort to develop a Chromebook device; if you examine the Chromium OS source tree you will find numerous patches submitted by Rockchip.

So the question becomes can we make use of that source tree? Well my initial aim was to get a minimal kernel booting and ideally test if GPU support was possible. The video below shows the progress made after numerous weeks of attempting to bring up a workable kernel. In the video I launch Openbox under X and run es2gear, glmark-es2 and some WebGL samples in Chromium.

Although the video may seem impressive the only GPU acceleration available is through EGL/GLES, hence the WebGL examples are accelerated. What is important to bear in mind is that the xf86-video-armsoc driver lacks 2D support for Rockchip, therefore there is still a fair amount of software rendering in X11, plus I implemented workarounds for broken code. Furthermore performance isn't particularly spectacular; es2gear & glmark-es2 numbers are here. Unfortunately I haven't had the time to investigate the cause(s), however further progress may be hindered by the lack of newer Mali drivers from Rockchip/ARM.

For those of you expecting a future Rockchip Chromium OS image to work on an existing RK3288 device, you may be disappointed; unfortunately the hardware differences make this a slim possibility.

Friday, 7 November 2014

Intel BayTrail - J1900

One of the main challengers to the current generation of ARM SOCs is the Intel BayTrail range. For me the interesting parts of the family are the E38XX (Atom) and J1X00 (Celeron) processors which boast a 7-10W TDP. In this article I will cover some initial performance metrics for the J1900 with the Intel Linux software stack. My test device was a low profile MX1900J industrial mini-ITX board produced by BCM Advanced Research.

What's nice about the MX1900J :

1. Low profile with a heat sink that is approximately 15mm high.

2. On board DC power jack (12v) hence no need for a separate DC to DC converter board.

3. 4 x USB 3.0

4. Inclusion of LVDS and GPIO support.

5. Dual Ethernet NICs

What's different about this board is the inclusion of a Display Port connector instead of HDMI along with VGA output. The BIOS has UEFI 64 and legacy support.

From an application developer's viewpoint there are quite a few advantages with the x86 platform. Firstly there is the vast amount of existing software that can run 'out of the box' or with minimal changes. Another is the shorter ramp up time between set up/configuration of the BSP/kernel/rootfs and actual application development. Lastly I would also argue that Intel do seem to devote a fair amount of resources to open source development, therefore the underlying BSPs have the potential to keep up with the latest trends (eg Chromium-ozone, Wayland, Tizen).

The J1900 GPU core supports Intel HD 4000 graphics and there are two Linux graphics drivers available for the J1900. The lesser known of the two is the EMGD driver (Embedded Media and Graphics Driver), which consists of closed source binaries that are accessible through user space libraries eg libdrm, mesa and libva. The EMGD documentation targets these drivers at Fedora 18 with kernel 3.8 and xorg 13.1. Having previously used the EMGD drivers, I know they can be ported to other Linux distros, however problems may arise when upgrading or moving to newer distro versions, where ABI breakages prevent this happening or cause stability issues. Intel prefer EMGD as they claim better 3D performance due to the Unified 3D (UFO) Driver.

The alternative to EMGD is the open source (Intel Linux Graphics) driver which can have better support for later kernels and hence is usable on a later Linux distro. The downside may be a slight drop in overall performance and possibly stability. I chose to deploy the open source drivers against a very lightweight Ubuntu 14.04 image. The drivers provide OpenGL 3.3 and OpenGL ES 3.0.

The J1900 GPU core has 4 EU (Execution Units) combined with a maximum GPU frequency of 854MHz. To give you an idea of where the J1900 fits in the 'food chain' let's compare the FPS (frames per second) rates of running the WebGL Aquarium Demo on an i.mx6q, a J1900 and an older 1037u Celeron.

Frames per second by number of fish

Platform  Screen Resolution  1 fish  50 fish  100 fish  500 fish  1000 fish
i.mx6q    1280x720           8       7        7         5         5
J1900     1680x1050          48      48       47        40        33
1037u     1680x1050          60      60       60        60        60

First let's be clear: the above results are to be interpreted as a relative comparison. They should not be used as a primary marker for judging one platform to be superior to the other. Each platform has its merits based on a number of factors. The i.mx6q as expected struggles (even at the lower resolution the rendering was not smooth), mainly due to the lower spec CPU/GPU core, an inefficient X driver and possibly some inefficient code paths in Chromium. The older 1037u dual core Celeron performs well due to the higher GPU frequency (1GHz) and 6 EU; the trade off is a higher thermal output at 17W. What I couldn't easily account for was the drop off in the FPS rate at 1000 fish for the J1900. Below are the results from the BabylonJS train demo, which is an intensive WebGL application supporting multiple camera angles, CAM_TRAIN being my favourite. What's interesting is that the FPS rate did not deviate when forcing Chromium to use EGL/GLES instead of OpenGL for the J1900.

Platform Screen Resolution Frames per second
J1900 1680x1050 13
1037u 1680x1050 24
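
As a very crude sanity check on the train demo numbers, the sketch below compares the ratio of measured frame rates with the ratio of "execution units x maximum GPU clock" for the two Intel parts. Treating EU count times clock as a proxy for GPU throughput is purely an assumption for illustration; it ignores memory bandwidth, CPU speed and driver differences.

```python
def eu_clock_product(execution_units, max_clock_mhz):
    """Crude throughput proxy: execution units multiplied by GPU clock."""
    return execution_units * max_clock_mhz


# Figures quoted in the text
j1900 = eu_clock_product(4, 854)     # 4 EU at 854MHz
c1037u = eu_clock_product(6, 1000)   # 6 EU at 1GHz

proxy_ratio = j1900 / c1037u         # J1900 relative to 1037u (proxy)
measured_ratio = 13 / 24             # BabylonJS train demo FPS ratio

if __name__ == "__main__":
    print(f"proxy {proxy_ratio:.2f} vs measured {measured_ratio:.2f}")
```

Under these assumptions the proxy gives roughly 0.57 against a measured 0.54, so the gap between the two platforms in this demo is about what the raw GPU resources would suggest.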

Video playback is available through VAAPI plus libav with h.264/mpeg-2  hardware accelerated encoding/decoding. mplayer and gstreamer 1.0 support is readily available. CPU usage for decoding Big Buck Bunny at 1080p (H.264) was around 13% both in mplayer and using a simple gstreamer 1.0 pipeline. Decoding a 720p usb webcam at 30fps (YUY2) with the output encoded (H.264) to file and displayed to screen using a simple gstreamer pipeline resulted in 15% CPU usage.

Given the recent interest in HTML5 development for embedded platforms I deployed a development build of Chromium (build 39). Chromium is fast becoming the web container of choice given the recent adoption of its engine in Qt (QtWebEngine). HTML5test reported a healthy score of 512 out of 555 against Chromium. A test HTML5 page with two embedded video files (playing concurrently) along with a bunch of images (png) and static text ran without hiccup, consuming 20% CPU. I briefly ran some demo HTML5 widgets from zebra, webix and Kendo UI; again these ran smoothly. What should be possible with this platform is the ability to create a HTML5 GUI interface that could drive the rest of the application hosted on the same machine.

On the whole the results look very encouraging and the J1900 seems to offer a good trade off for a fanless solution with decent performance. Furthermore it should provide a relatively smooth route for application development. The main consideration is form factor, and it is possible to find the J1900 in 3.5" SBC or Q7 form.

Saturday, 27 September 2014

i.MX6 Efficient font rendering and smooth scrolling

Here is a short video demonstrating efficient font rendering and smooth scrolling using OpenGL ES.

You may be forced to adopt this approach if you're finding alternative routes such as Qt5, HTML 5 or GTK+/Cairo aren't giving you satisfactory results. The downside is that since you are starting from a low base (this is a very low level implementation) it requires a fair amount of development effort. On the positive side the code can be made as efficient as possible (to reduce power consumption/heat) and can be highly customised; in the above demo we control the screen from a remote PC.

Sunday, 27 July 2014

i.MX6 Developing with WebGL

One of the new features in BSP 3.10.17 was support for WebGL. In a nutshell WebGL is a JavaScript API supporting 3D and 2D graphics rendering; more about WebGL can be found here. WebGL is based on OpenGL ES 2.0.

WebGL support opens up the possibility of developing and running graphical web based applications on the i.mx6 within a browser. Furthermore there's also the possibility of deploying a LAMP stack to serve the application from the i.mx6. This opens up the possibility of developing simple games, kiosk applications, interactive user manuals/instructions, signage displays, etc.

To give you a taste of what is possible with the current WebGL implementation, I've put together a number of short videos that run existing WebGL demos/applications. These were run on a lightweight Debian rootfs with the Chromium browser (under X) on an i.mx6q board as part of a prototyping exercise. Chromium has been tweaked to maximize performance and the screen resolution was 720p (1280x720). Beware that not all WebGL applications will run; some will fail because the Vivante libraries currently lack the 'Float Textures' extensions, others because the GPU is not powerful enough to give a decent FPS rate. Apologies for the quality of the videos. I'd advise that an i.mx6 quad processor is used to run WebGL as the examples consume 25-35% CPU under load.

The first video which also sums up what can be done on the i.mx6 with WebGL is the 'Lego build Chrome' demo. This application allows the creation of a Lego structure.

three.js is a javascript library which makes WebGL easier, here are some WebGL/CSS 3D examples which play nicely.

babylon.js is a 3D engine based on WebGL and javascript. The 'Create Your Own Shader' demo allows online editing of shaders.

CopperCube is an editor for creating 3D applications; this is the 'Backyard Demo'.

Finally the "undulating-monkey" from aerotwist.