Part 1. Retrospective

The summer of my second year of university was my Summer of Discovery. As my “day job” wasn’t very engaging, I had much creative energy and curiosity left over to spend throughout the long British summer evenings. I also had access to the university labs so I had a place to do any soldering or ‘scope measurements necessary. That summer I learned a great deal thanks to various projects that I attempted for the first time, from analog electronics to web development in PHP, including the subject of this blog post.

I’m not sure when the WS2812 LED strip (NeoPixel) craze started, but that’s when I became aware of them. Understandably, the bright flashing colours had a magnetic effect on me, so I bought myself a strip and the necessary power supply. I had just the project in mind for them - my very own Ambilight™ clone. This is something that had been at the back of my mind for a while, but I had lacked the knowledge and the context to approach it. I was semi-aware commercial solutions already existed, but I wanted to do it myself. I called it a learning exercise - you may call it Not-Invented-Here syndrome.

I worked on both parts - the PC software and the controller - more or less in parallel. I had several ATtiny2313 microcontrollers lying around and the university’s oscilloscope - nothing could stop me! Then I hit the limits of the hardware: the chip didn’t have enough RAM, and the LED timing was too sensitive for a naïve data streaming scheme, even at the highest baud rate. So I caved in and flashed the Adafruit library onto an Arduino Uno. However, I was determined to press on with the desktop app.

In all honesty, this part of the blog post should have been written years ago, as a sort of lab book; I promise that had been my intention back then. I recall having a lot of trouble setting up the required libraries for debugging - it would just crash with no indication of the issue. Documentation was scarce, so I hoped my efforts would save others the pain. Alas, I did not write this post then, and the library became unsupported in the meantime, both of which I’ll use as excuses for not including that information. Eventually, just as my internship was finishing, I got it all working pretty nicely. It lacked polish and any serious optimisations, but images were displayed and LEDs flashed - I was pleased. I didn’t have my own TV at the time so I flew it back home and stuck it to my parents’. They were happy with it and reportedly used it for several months when watching movies - a great success in my mind. Eventually TVs got moved, Arduinos were recovered, and project files archived, so the project was shelved for a few years.

Part 2. Revival

I’m not sure what prompted me to revive this project. I had had my own TV for a while and I had attached LEDs to other surfaces in my house, but the thought of restoring this didn’t really cross my mind until one day I decided to order the parts and give it a go. I opened up the project files, clicked “Run”, and… it still worked! Thankfully I didn’t have to fiddle with the project setup and I could just dive in and fix up the code.

Autopsy

The spartan UI was built in Windows Forms. I used SharpDX as a wrapper around DXGI[1] to capture the frames. I then looped over all the pixels and, if one fell into the area corresponding to an LED, I would add its value to a running sum. This was then averaged and sent off for displaying (after some gamma correction). I could spot some obvious areas for improvement, but it was fairly reasonable for experimental code.
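To make the averaging step concrete, here is a minimal sketch in Python rather than the C# the app is written in. The frame layout (a list of rows of `(r, g, b)` tuples), the function names and the gamma value of 2.2 are all assumptions for illustration; the post does not say which values or data structures were actually used.

```python
GAMMA = 2.2  # assumed value; the post does not state the gamma that was used

def average_region(frame, x0, y0, x1, y1):
    """Average the colours of pixels with x0 <= x < x1 and y0 <= y < y1.

    Mirrors the original approach: visit every pixel in the frame and
    only accumulate the ones that fall inside the LED's area.
    """
    sums = [0, 0, 0]
    count = 0
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            if x0 <= x < x1 and y0 <= y < y1:
                sums[0] += r
                sums[1] += g
                sums[2] += b
                count += 1
    if count == 0:
        return (0.0, 0.0, 0.0)
    return tuple(s / count for s in sums)

def gamma_correct(colour, gamma=GAMMA):
    """Map 0..255 channel values through a power curve before sending to the LEDs."""
    return tuple(round(255 * (c / 255) ** gamma) for c in colour)
```

Gamma correction leaves pure black and full brightness untouched and darkens everything in between, which is why mid-tones look less washed out on the strip.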

Enhancements

The first thing I did was rewrite the big loop so that it only looped over the pixels of interest instead of breaking out when a pixel was out of bounds. Speed up: 2x.
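The idea can be sketched like this, again in Python with a hypothetical frame layout (rows of `(r, g, b)` tuples) and hypothetical names: clamp the LED's rectangle to the frame once, then iterate only over that rectangle, so there is no bounds test per pixel.

```python
def average_region_bounded(frame, x0, y0, x1, y1):
    """Average a rectangle of pixels, visiting only the pixels inside it."""
    height = len(frame)
    width = len(frame[0]) if frame else 0
    # Clamp the rectangle to the frame once, up front.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    sums = [0, 0, 0]
    for y in range(y0, y1):
        row = frame[y]
        for x in range(x0, x1):
            r, g, b = row[x]
            sums[0] += r
            sums[1] += g
            sums[2] += b
    # The pixel count follows from the rectangle, so no counter is needed.
    count = max(0, x1 - x0) * max(0, y1 - y0)
    if count == 0:
        return (0.0, 0.0, 0.0)
    return tuple(s / count for s in sums)
```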

I then split the 3 sides (left, top, right) into parallel tasks, which was pleasantly easy to write in C#. Speed up: 2.5x.

I was getting very quick times at this point, about 6ms/frame. I then rearranged the nested loops to remove a division operation, which sped it up by another 0.3x. To exploit some more parallelism, I used Parallel.For to compute each LED’s colour sum in parallel.
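The post does not say which division was removed, so the sketch below shows one plausible case only: instead of working out the LED index for every pixel (for example `x // led_width`), the outer loop runs over LEDs and the pixel range is derived by multiplication. It also fans the per-LED sums out to a thread pool as a stand-in for C#'s `Parallel.For`. All names are hypothetical, and in CPython the pool shows the structure rather than a real speed-up, because of the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def led_sum(frame, led_width, strip_height, led):
    """Sum the colours of the block of pixels feeding one top-edge LED."""
    x_start = led * led_width  # one multiplication per LED, no per-pixel division
    total = [0, 0, 0]
    for y in range(strip_height):
        row = frame[y]
        for x in range(x_start, x_start + led_width):
            r, g, b = row[x]
            total[0] += r
            total[1] += g
            total[2] += b
    return tuple(total)

def top_edge_sums(frame, led_count, strip_height):
    """Compute every top-edge LED's colour sum, one task per LED."""
    led_width = len(frame[0]) // led_count  # the only division, done once per frame
    work = partial(led_sum, frame, led_width, strip_height)
    with ThreadPoolExecutor() as pool:
        # map() preserves LED order even though tasks may finish out of order.
        return list(pool.map(work, range(led_count)))
```

Each LED writes to its own accumulator, so the tasks share no mutable state and need no locking.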

When I first worked on this, I attempted to have the GPU do the averaging via mipmaps. This would eliminate the need to copy the texture data from GPU to CPU memory. I didn’t quite get it to work back then, but it’s something I’ll come back to in the future.
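For readers unfamiliar with the trick: each mipmap level halves the image by averaging 2x2 blocks, so a small enough level already holds per-region averages. The sketch below does the same thing on the CPU in Python purely to illustrate the arithmetic. It is not GPU code, the names are hypothetical, and it assumes power-of-two dimensions.

```python
def downsample_2x(image):
    """Build the next mip level: average every 2x2 block into one pixel."""
    height, width = len(image), len(image[0])
    return [
        [
            tuple(
                (image[2 * y][2 * x][c] + image[2 * y][2 * x + 1][c]
                 + image[2 * y + 1][2 * x][c] + image[2 * y + 1][2 * x + 1][c]) / 4
                for c in range(3)
            )
            for x in range(width // 2)
        ]
        for y in range(height // 2)
    ]

def mip_chain(image):
    """Return all levels, from the full image down to the smallest one."""
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample_2x(levels[-1]))
    return levels
```

With one LED per texel at a suitable level, only that tiny level would need to be copied back to CPU memory instead of the whole frame.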

Quality of life improvements

Having an Arduino hanging off the PC wasn’t very elegant. In addition, serial writes had become a bottleneck for the optimised logic. To get around this, I wrote a UDP server running on a Raspberry Pi and some quick logic to stream the LED data via UDP instead of serial. Doing this practically removed any time spent sending data from the main loop. This also affords me the freedom to easily stream colour data based on any other source I want (audio?).
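A rough sketch of what such a link could look like, in Python. The wire format (three raw bytes per LED, one frame per datagram), the buffer size and every name here are assumptions; the post does not describe the actual protocol.

```python
import socket

def encode_frame(colours):
    """Pack [(r, g, b), ...] into raw bytes, three bytes per LED."""
    return bytes(channel for colour in colours for channel in colour)

def decode_frame(payload):
    """Unpack raw bytes back into a list of (r, g, b) tuples."""
    if len(payload) % 3 != 0:
        raise ValueError("payload length must be a multiple of 3")
    return [tuple(payload[i:i + 3]) for i in range(0, len(payload), 3)]

def send_frame(sock, address, colours):
    """PC side: fire one datagram per frame and move on without waiting."""
    sock.sendto(encode_frame(colours), address)

def serve(port, handle_frame):
    """Pi side: block on the socket and hand each decoded frame to a callback."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        while True:
            payload, _sender = sock.recvfrom(2048)
            handle_frame(decode_frame(payload))
```

UDP suits this kind of data because a dropped frame is simply replaced by the next one a few milliseconds later, so there is nothing worth retransmitting.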

I added a Zigbee switch on the PSU’s mains input so I could turn it on and off without physically touching anything. I also wrote an automation that would turn it on when data started coming in via UDP, and turn it off after a timeout if nothing was received. The UDP server kicks (or pets, if you prefer) the watchdog every second - no need for 60FPS updates everywhere.
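The watchdog logic could be sketched as follows. This is a Python illustration with hypothetical names: `switch` stands in for whatever actually toggles the Zigbee plug, and the timeout value is an assumption, since the post only gives the one-second kick interval.

```python
class PowerWatchdog:
    """Turn the PSU on when frames arrive and off after a quiet period."""

    def __init__(self, switch, timeout=10.0, kick_interval=1.0):
        self.switch = switch            # callable taking True (on) or False (off)
        self.timeout = timeout          # assumed value, in seconds
        self.kick_interval = kick_interval
        self.powered = False
        self.last_kick = None

    def on_frame(self, now):
        """Call for every received frame; registers at most one kick per interval."""
        if self.last_kick is None or now - self.last_kick >= self.kick_interval:
            self.last_kick = now
            if not self.powered:
                self.powered = True
                self.switch(True)

    def tick(self, now):
        """Call periodically; switches off once the timeout has elapsed."""
        if self.powered and now - self.last_kick > self.timeout:
            self.powered = False
            self.switch(False)
```

Passing the current time in as an argument (for example from `time.monotonic()`) keeps the class easy to test and means frames arriving at 60FPS cost only a comparison each.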

Future plans

  • GPU-accelerated computation, though I’m not sure how well this will work with video games trying to use it at the same time
  • Smooth colour transitions - right now quick changes in the image are quite jarring and produce too much flashing
  • Fire Stick 4K screen capture, as most of my movie watching is not done through my PC
  • Clean up the DXGI code to allow you to select the desired monitor - this currently crashes for some reason
  • High-DPI support for the Forms app (it looks awful when I drag the window over to the TV)
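For the smooth colour transitions item, one common approach (not necessarily the one that will end up in the app) is exponential smoothing: each frame moves the displayed colour only part of the way towards the newly measured one. A minimal Python sketch, with a hypothetical function name and an assumed default for `alpha`:

```python
def smooth(previous, target, alpha=0.25):
    """Move each channel a fraction `alpha` of the way from previous to target.

    alpha=1.0 disables smoothing; smaller values give slower, calmer fades.
    """
    return tuple(p + alpha * (t - p) for p, t in zip(previous, target))
```

Applied once per frame to each LED, a sudden cut from black to white becomes a short fade instead of a flash.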

  1. Microsoft DirectX Graphics Infrastructure: https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/d3d10-graphics-programming-guide-dxgi