Posts from the “Uncategorized” Category

Debouncing Method Dispatch in Objective-C

I’ve been doing a lot of Node.js programming recently, and JavaScript / CoffeeScript programming style is starting to boil over into the way I think about Objective-C. One of my favorite practices in JavaScript is the concept of deferring, throttling, or debouncing method calls. There are tons of uses for this. For example, let’s say your app is updating a model object, and you want to persist that model object when the updates are complete. Unfortunately, the model is touched in several bits of code one after another, and your model gets saved to disk twice. Or three times. Or four times. All in one pass through the run loop.

It’s pretty easy to defer or delay the execution of an Objective-C method using performSelector:withObject:afterDelay: or any number of other approaches. Debouncing—running a method just once in the next pass through the run loop after it’s been called multiple times—is a bit trickier. However, in the case described above it’s perfect. Touch the model all you want, call “save” a dozen times, and in the next pass through the run loop, it gets saved just once.

Here’s how I implemented a new debounce method, performSelectorOnMainThreadOnce:

#import <objc/runtime.h>

@implementation NSObject (AssociationsAndDispatch)

- (void)associateValue:(id)value withKey:(void *)key
{
    objc_setAssociatedObject(self, key, value, OBJC_ASSOCIATION_RETAIN);
}

- (void)weaklyAssociateValue:(id)value withKey:(void *)key
{
    objc_setAssociatedObject(self, key, value, OBJC_ASSOCIATION_ASSIGN);
}

- (id)associatedValueForKey:(void *)key
{
    return objc_getAssociatedObject(self, key);
}

- (void)performSelectorOnMainThreadOnce:(SEL)selector
{
    // Flag this selector as "pending," using the selector itself as the association key.
    [self associateValue:[NSNumber numberWithBool:YES] withKey:(void *)selector];

    // Every call enqueues a block, but only the first block to run finds the flag still set.
    // It performs the selector once and clears the flag, so the remaining blocks are no-ops.
    dispatch_async(dispatch_get_main_queue(), ^{
        if ([self associatedValueForKey:(void *)selector]) {
            [self performSelector:selector];
            [self associateValue:nil withKey:(void *)selector];
        }
    });
}

@end

A more advanced version of this method would allow you to say “perform this selector just once in the next 100 msec” rather than performing it in the next iteration through the run loop. Anybody want to take a stab at that?
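
Here’s a rough, untested sketch of one way to do it, using Foundation’s cancelPreviousPerformRequestsWithTarget:selector:object: instead of associated objects (the method name is my own). Each new call cancels the pending request and reschedules it, so the selector fires a fixed delay after the last call rather than the first:

// A rough, untested sketch: debounce with a time window instead of a single run loop pass.
// Each call cancels any pending perform request for this selector and schedules a new one,
// so only the last call within the window actually fires.
- (void)performSelectorOnce:(SEL)selector afterDelay:(NSTimeInterval)delay
{
    [NSObject cancelPreviousPerformRequestsWithTarget:self
                                             selector:selector
                                               object:nil];
    [self performSelector:selector withObject:nil afterDelay:delay];
}

With that in place, calling [model performSelectorOnce:@selector(save) afterDelay:0.1] a dozen times in a row should result in a single save roughly 100 msec after the last call.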

Teaching kids to code: It’s not about fame and fortune.

I’ve been programming since I was seven. I started with Cocoa, the children’s language developed by Apple in 1996 (they recycled the Cocoa brand when they bought NeXT). Then I learned AppleScript, REALbasic, and eventually Objective-C and Java. I have code so old that it’s full of spelling mistakes: I learned to program before I could spell trivial words like “building.” My personal journey as a programmer was propelled by my dad, who has been developing Mac shareware for more than 25 years, but it mirrors what might happen if programming is taught in schools. For the most part, that’s really exciting.

This week’s code.org campaign put the power and celebrity of today’s most successful developers behind an effort to teach the world to code. But the reasons they champion for learning to program—namely, fame, fortune, and a great job with free food—seem oddly genderless. When I excitedly played through their latest ad, I was left thinking to myself: “They pulled together our industry’s best and brightest, and that’s all they came up with?”

Over the last few years, I’ve been in more than 30 classrooms—in elite private schools with Macbooks for every student, and in inner city schools on the edge of collapse. My research in educational technology has brought me in touch with the kids code.org is trying to reach, and I think all of them would benefit from learning to code. But when I look at the impact programming has had on my life—especially on my childhood, marked by school transitions, my parents’ divorce, and a fair bit of high school drama—it has done far more for me than provide a steady job.

For kids, programming is about zen. It’s about learning patience. It’s about feeling accomplished when the world around you seems to be rooting for your failure. It’s about being able to sit down at a $250 Chromebook and envelop yourself in a world where known rules apply, where the same input always produces the same output. A world you can learn to understand and use to create whatever your heart desires. For kids for whom most of the world makes no goddamn sense, it’s an amazing gift.

I understand why code.org is marketing to students the way it is—you don’t tell inner city kids to play basketball because it’ll keep them in shape and doesn’t require grass; you tell them they can be like Michael Jordan. What upsets me is that the individuals who tell their personal stories in their latest video—people who have a deep, personal affiliation with the art—present it in such an objective way. If I were featured in that video, my cameo would be dramatically different. Programming taught me patience and process, and it’s given me balance when I’ve needed it most.

Discuss this post on Hacker News.

targetContentOffsetForProposedContentOffset: not called

Just a quick tip. If you’re subclassing UICollectionViewLayout or UICollectionViewFlowLayout and targetContentOffsetForProposedContentOffset:withScrollingVelocity: is not being called, check to make sure your UICollectionView does not have pagingEnabled set to YES. If paging is enabled on the collection view, that setting takes precedence over the target content offset returned by your layout.
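
For reference, here’s a minimal sketch of the two pieces involved. The collectionView variable and the pass-through snap logic are placeholders, not code from a real project:

// Paging must be off, or UIScrollView's paging behavior wins out over the layout's answer.
collectionView.pagingEnabled = NO;

// In your UICollectionViewFlowLayout (or UICollectionViewLayout) subclass:
- (CGPoint)targetContentOffsetForProposedContentOffset:(CGPoint)proposedContentOffset
                                 withScrollingVelocity:(CGPoint)velocity
{
    // Placeholder: return the proposed offset unchanged. Real code would snap to an item here.
    return proposedContentOffset;
}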

stringWithFormat: is slow. Really slow.

I’m working on a project that makes extensive use of NSDictionaries. Buried deep in the model layer, there are dozens of calls to stringWithFormat: used to create dictionary keys. Here’s a quick example:

- (CGRect)rect:(NSString*)name inDict:(NSDictionary*)dict
{
    float x = [[dict objectForKey:[NSString stringWithFormat: @"%@@0", name]] floatValue];
    float y = [[dict objectForKey:[NSString stringWithFormat: @"%@@1", name]] floatValue];
    float w = [[dict objectForKey:[NSString stringWithFormat: @"%@@2", name]] floatValue];
    float h = [[dict objectForKey:[NSString stringWithFormat: @"%@@3", name]] floatValue];
    return CGRectMake(x, y, w, h);
}

In this example, I’m using stringWithFormat: in a simple way. To read the four CGRect values for the rect ‘frame’ from the dictionary, it creates the keys frame@0, frame@1, frame@2, and frame@3. Because of the way my app works, I call stringWithFormat: to create strings like this a LOT. In complex situations, to the tune of 20,000 times a second.

I was using Instruments to identify bottlenecks in my code and quickly discovered that stringWithFormat: was responsible for more than 40% of the time spent in the run loop. In an attempt to optimize, I switched to sprintf instead of stringWithFormat. The result was incredible. The code below is nearly 10x faster, and made key creation a negligible task:

- (NSString*)keyForValueAtIndex:(int)index inPropertySet:(NSString*)name
{
    // The following code just creates %@@%d - but it's faster than stringWithFormat: by a factor of 10x.
    char cString[25];
    sprintf(cString, "@%d", index);
    NSString* s = [[[NSString alloc] initWithUTF8String:cString] autorelease];
    return [name stringByAppendingString:s];
}

It’s worth mentioning that I’ve refactored the app even more—to completely avoid saving structs this way (since NSValue is uh… an obvious solution), but I felt like it was worth posting this anyway, since you might not be able to refactor in the way I did.
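
For completeness, here’s a rough sketch of the NSValue approach mentioned above. The key name is arbitrary; valueWithCGRect: and CGRectValue come from UIKit’s NSValue additions:

// Store the whole CGRect under a single key instead of four formatted keys.
NSMutableDictionary* dict = [NSMutableDictionary dictionary];
[dict setObject:[NSValue valueWithCGRect:CGRectMake(10, 20, 100, 50)] forKey:@"frame"];

// ...and read it back with no string formatting at all.
CGRect frame = [[dict objectForKey:@"frame"] CGRectValue];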

Photoshop v. Fireworks

If you build apps, or work with people that build apps, I recommend you start using Fireworks instead of Photoshop. Like right now. Fireworks is an integral part of my design / development workflow, and I wouldn’t have it any other way. Why? There are several reasons Fireworks trumps Photoshop for designing apps and websites:

1. Fireworks deals with objects, not layers. In Fireworks, you create interface elements using primitive shapes and then add gradients, drop shadows, etc. Fireworks has great tools for selecting objects, and it’s easy to grab them and move them around. You could use Photoshop shape layers, but once you’ve tried the Fireworks approach I think you’ll find that shape layers are an awkward extension of the layer concept and not a true solution for interface design.

2. The properties of objects in Fireworks mirror what is possible in CSS3 and also in iOS / Android while remaining flexible enough to make anything. I’ve gotten too many Photoshop designs where beautiful assets were static images. The Fireworks object / properties model encourages designers to create things out of object primitives and downplays images wonderfully.

3. PNG is the native file format of Fireworks. PNG is also the native image format of iOS and Android, meaning you don’t have to export flattened copies of your images as separate assets. You just use the actual source images in the app. When something needs to be tweaked, you right-click it in the sidebar of Xcode, choose “Open in External Editor” and bam. Fixed.

4. Fireworks is vector based, but it is intended for bitmap output. This combination means it’s easy to @2x things and create two sets of images, because the objects in the design are vector based anyway.

Just my 2c! I use Fireworks for everything and I’ve always been impressed when people go through all the effort of creating mockups in Photoshop. You might think I’m crazy, but just check out what happened when Adobe asked about improving Photoshop for iPhone designers on their official blog:
John Nack on Adobe: How could we improve Photoshop for iPhone Developers?

Connecting to a Socket.io server from Node.js Unit Tests

If you’re looking for a more expansive tutorial, check out Liam Kaufman’s intro to Socket.io, Mocha, and Node.js. It’s by far the best tutorial I’ve found so far.

I just spent the last three hours trying to create unit tests for my Node.js + Socket.io server with Mocha. I was following all the instructions, but creating a new Socket.io client connection with socket.io-client did nothing. A connection was never established, and the node script would finish executing in spite of my best efforts to attach listeners to the socket. The problem, it turns out, was that I had forgotten to include two critical configuration options when creating the Socket.io client connection. In case anyone else is trying to set up unit tests for a Socket.io server, here’s a boilerplate CoffeeScript test file that works:

io = require('socket.io-client')
assert = require("assert")
should = require('should')
 
socketURL = 'http://localhost:9292'
socketOptions =
  transports: ['websocket']
  'force new connection': true

describe 'Array', () ->
 
  beforeEach (done) ->
    this.socket = io.connect(socketURL, socketOptions)
    this.socket.on 'connect', ()->
      console.log('connected')
      done()
 
  describe '#indexOf()', () ->
    it 'should return -1 when the value is not present', () ->
      assert.equal(-1, [1,2,3].indexOf(5))
      assert.equal(-1, [1,2,3].indexOf(0))

To run this test, place it in a new folder called ‘test’ in your Node.js project. Run the following commands to get the modules you need:

npm install socket.io-client
npm install should
sudo npm install -g mocha

Then, on the command line run:

mocha --compilers coffee:coffee-script

Kinect Fun House Mirror


This Kinect hack performs body detection in real time, cuts an individual person out of the Kinect video feed, distorts them using GLSL shaders, and pastes them back into the image using OpenGL multitexturing, blending them seamlessly with the other people in the scene.

It’s a straightforward concept, but the possibilities are endless. Pixelate your naked body and taunt your boyfriend over video chat. Turn yourself into a “hologram” and tell the people around you that you’ve come from the future and demand beer. Using only your Kinect and a pile of GLSL shaders, you can create a wide array of effects.

This hack relies on the PrimeSense framework, which provides the scene analysis and body detection algorithms used in the Xbox. I initially wrote my own blob-detection code for use in this project, but it was slow and placed constraints on the visualization. It required that people’s bodies intersect the bottom of the frame, and it could only detect the front-most person. It assumed that the user could be differentiated from the background in the depth image, and it barely pulled 30 fps. After creating implementations in both Processing (for early tests) and OpenFrameworks (for better performance), I stumbled across this video online: it shows the PrimeSense framework tracking several people in real time, providing just the kind of blob identification I was looking for. Though PrimeSense was originally licensed to Microsoft for a hefty fee, it has since become open source, and I was able to download and compile the library from the PrimeSense website. Their examples worked as expected, and I was able to get the visualization up and running on top of their high-speed scene analysis algorithm in no time.

However, once things were working in PrimeSense, there was still a major hurdle. I wanted to use the depth image data as a mask for the color image and “cut” a person from the scene, but the depth and color cameras on the Kinect aren’t perfectly calibrated and the images don’t overlap. The depth camera is to the right of the color camera, and they have different lens properties, so it’s impossible to assume that pixel (10, 10) in the color image represents the same point in space as pixel (10, 10) in the depth image. Luckily, Max Hawkins let me know that OpenNI can perform the corrective distortions needed to align the image from the Kinect’s color camera with the image from the depth camera and adjust for the lens properties of the device, so one image can be overlaid perfectly on the other. I struggled for days to get it to work, but Max was a tremendous help and pointed me toward these five lines of code, buried deep inside one of the sample projects (and commented out!)

     // Align depth and image generators
     printf("Trying to set alt. viewpoint");
     if( g_DepthGenerator.IsCapabilitySupported(XN_CAPABILITY_ALTERNATIVE_VIEW_POINT) )
     {
         printf("Setting alt. viewpoint");         g_DepthGenerator.GetAlternativeViewPointCap().ResetViewPoint();
         if( g_ImageGenerator ) g_DepthGenerator.GetAlternativeViewPointCap().SetViewPoint( g_ImageGenerator );
     }

Alignment problem, solved. After specifying an alternative view point, I was able to mask the color image with a blob from the depth image and get the color pixels for the user’s body. Next step, distortion! Luckily, I started this project with a fair amount of OpenGL experience. I’d never worked with shaders, but I found them pretty easy to pick up and pretty fun (since they can be compiled at run-time, it was easy to write and test the shaders iteratively!) I wrote shaders that performed pixel averaging and used sine functions to re-map texcoords in the cut-out image, producing interesting wave-like effects and blockiness. I’m no expert, and I think these shaders could be improved quite a bit by using multiple passes and optimizing the order of operations.

Since many distortions and image effects turn the user transparent or move their body parts, I found that it was important to fill in the pixels behind the user in the image. I accomplished this using a “deepest-pixels” buffer that keeps track of the color of the most distant sample seen at each pixel in the image. These pixels are substituted in where the image is cut out, and updated any time deeper pixels are found.
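
In case the idea is unclear, here’s a simplified sketch of the update step in plain C. The buffer names are made up, and the real code works on the Kinect’s buffers directly, but the logic is the same:

// For each pixel, remember the color of the deepest (most distant) sample ever seen there.
// These cached colors are what get pasted in behind a person when their pixels are cut out.
void updateDeepestPixels(const unsigned short* depth,   // current depth frame
                         const unsigned char*  rgb,     // current color frame, 3 bytes per pixel
                         unsigned short* deepestDepth,  // running maximum depth per pixel
                         unsigned char*  deepestRGB,    // color seen at that maximum depth
                         int width, int height)
{
    for (int i = 0; i < width * height; i++) {
        unsigned short d = depth[i];
        if (d != 0 && d > deepestDepth[i]) {   // a depth of 0 means "no reading"
            deepestDepth[i] = d;
            deepestRGB[3 * i + 0] = rgb[3 * i + 0];
            deepestRGB[3 * i + 1] = rgb[3 * i + 1];
            deepestRGB[3 * i + 2] = rgb[3 * i + 2];
        }
    }
}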

Here’s a complete breakdown of the image analysis process:

1. The color and depth images are read off the Kinect. OpenNI is used to align the depth and color images, accounting for the slight difference in the lenses and placement that would otherwise cause the pixels in the depth image to be misaligned with pixels in the color image.
2. The depth image is run through the PrimeSense Scene Analyzer, which provides an additional channel of data for each pixel in the depth buffer, identifying it as a member of one or more unique bodies in the scene. In the picture at left, these are rendered in red and blue.
3. One of the bodies is selected, and its pixels are cut from the primary color buffer into a separate texture buffer.
4. The depth of each pixel in the remaining image is compared to the furthest known depth, and deeper pixels are copied into a special “most-distant” buffer. This buffer contains the RGB color of the furthest pixel at each point in the scene, effectively keeping a running copy of the scene background.
5. The pixels in the body are replaced using pixels from the “most-distant” buffer to effectively erase the individual from the scene.
6. A texture is created from the cut-out pixels and passed into a GLSL shader along with the previous image.
7. The GLSL shader performs distortions and other effects on the cut-out image before recompositing it onto the background image to produce the final result.
8. Final result!

Here’s a video of the Kinect Fun House Mirror at the IACD 2011 Showcase:

GLSL & The Kinect – Part 2

For the last couple weeks, I’ve been working on a Kinect hack that performs body detection and extracts individuals from the scene, distorts them using GLSL shaders, and pastes them back into the scene using OpenGL multitexturing. The concept is relatively straightforward. Blob detection on the depth image determines the pixels that are part of each individual. The color pixels within the body are copied into a texture, and the non-interesting parts of the image are copied into a second background texture. Since distortions are applied to the bodies in the scene, the holes in the background image need to be filled. To accomplish this, the most distant pixel at each point is cached from frame to frame and substituted in when body blobs are cut out.

It’s proved difficult to pull out the bodies in color. Because the depth camera and the color camera in the Kinect do not align perfectly, using a depth image blob as a mask for the color image does not work. On my Kinect, the mask region was off by more than 15 pixels, and color pixels flagged as belonging to a blob might actually be part of the background.

To fix this, Max Hawkins pointed me in the direction of a Cinder project which used OpenNI to correct the perspective of the color image to match the depth image. Somehow, that impressive feat of computer imaging is accomplished with these five lines of code:

    // Align depth and image generators
    printf("Trying to set alt. viewpoint");
    if( g_DepthGenerator.IsCapabilitySupported(XN_CAPABILITY_ALTERNATIVE_VIEW_POINT) )
    {
        printf("Setting alt. viewpoint");
        g_DepthGenerator.GetAlternativeViewPointCap().ResetViewPoint();
        if( g_ImageGenerator ) g_DepthGenerator.GetAlternativeViewPointCap().SetViewPoint( g_ImageGenerator );
    }

I hadn’t used Cinder before, and I decided to migrate the project to Cinder since it seemed to be a much more natural environment for working with GLSL shaders. Unfortunately, the Kinect OpenNI drivers in Cinder seemed to be crap compared to the ones in OpenFrameworks et al. The console often reported that the “depth buffer size was incorrect” and that the “depth frame is invalid”. Onscreen, the image from the camera flashed, and occasionally frames appeared misaligned or half missing.

I continued fighting with Cinder until last night, when at 10PM I found this video in an online forum:

This video is intriguing, because it shows the real-time detection and unique identification of multiple people with no configuration. AKA it’s hot shit. It turns out the video was made with PrimeSense, the technology used for hand / gesture / person detection on the Xbox.

I downloaded PrimeSense and compiled the samples. Behavior in the above video: achieved. The scene analysis code is incredibly fast and highly robust. Performance-wise, it kills the blob detection code I wrote, and it doesn’t require that people’s legs intersect the bottom of the frame (the technique I was using assumed the nearest blob intersecting the bottom of the frame was the user).

I re-implemented the project on top of the PrimeSense sample in C++. I migrated the depth + color alignment code over from Cinder, built a background cache, and rebuilt the display on top of a GLSL shader. Since I was just using Cinder to wrap OpenGL shaders, I decided it wasn’t worth linking it into the sample code. It’s 8 source files and it compiles on the command line. It was ungodly fast. I was in love.

Rather than apply an effect to all the individuals in the scene, I decided it was more interesting to distort one. Since the PrimeSense library assigns each blob a unique identifier, this was an easy task. The video below shows the progress so far. Unfortunately, it doesn’t show off the frame rate, which is a cool 30 or 40fps.

My next step is to try to improve the edge of the extracted blob and create more interesting shaders that blur someone in the scene or convert them to “8-bit”. Stay tuned!

Generative Art in Processing

I threw around a lot of ideas for this assignment. I wanted to create a generative art piece that was static and large–something that could be printed on canvas and placed on a wall. I also wanted to revisit the SMS dataset I used in my first assignment, because I felt I hadn’t sufficiently explored it. I eventually settled on modeling something after this “Triangles” piece on OpenProcessing. It seemed relatively simple and it was very abstract.

I combined the concept from the Triangles piece with code that scored characters in a conversation based on the likelihood that they would follow the previous characters. This was accomplished by generating a Markov chain and a character frequency table using combinations of two characters pulled from the full text of 2,500 text messages. The triangles generated to represent the conversation were colorized so that more likely characters were shown inside brighter triangles.

Process:

I started by printing out part of an SMS conversation, with each character drawn within a triangle. The triangles were colorized based on whether the message was sent or received, and the individual letter brightnesses were modulated based on the likelihood that the characters would be adjacent to each other in a typical text message.

In the next few revisions, I decided to move away from simple triangles and make each word in the conversation a single unit. I also added some code that seeds the colors used in the visualization based on properties of the conversation, such as its length.

Final output – click to enlarge!

Kinect & GLSL Shaders = Fun!

I’m revisiting the Kinect for my final project. I’m separating the background of an image from the foreground and using OpenGL GLSL Multitexturing Shaders to apply effects to the foreground.

GLSL shaders work in OpenFrameworks, which is cool. However, there’s a trick that took me about three days to find. By default, ofTextures are allocated using an OpenGL extension that allows for non-power-of-two (rectangle) textures. Even if you use a power-of-two texture, the extension is still enabled, and the textures it allocates can’t be referenced from a GLSL sampler2D the way a normal texture can. FML.
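
For anyone hitting the same wall: assuming the rectangle-texture default is the culprit, one workaround is ofDisableArbTex(), called before any textures are allocated, so OpenFrameworks falls back to plain GL_TEXTURE_2D textures that a GLSL sampler2D can sample. The texture variable below is just an example:

// Sketch: disable ARB rectangle textures before allocating, so ofTexture uses GL_TEXTURE_2D
// and the result can be bound to a regular sampler2D uniform in a GLSL shader.
ofTexture foregroundTex;                    // example texture; use your own
ofDisableArbTex();                          // must be called before textures are allocated
foregroundTex.allocate(640, 480, GL_RGB);   // now a plain GL_TEXTURE_2D, usable from sampler2D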

The first GLSL shader I wrote distorted the foreground texture layer on the Y axis using a sine wave to adjust the image fragments that are mapped onto a textured quad.
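
As a rough illustration of the idea, in plain C rather than actual GLSL and with arbitrary constants, the remapping boils down to offsetting each fragment’s y texture coordinate by a sine of its x position:

// CPU-side sketch of the wave distortion: shift the y texture coordinate by a sine of x.
// In the real shader this runs per-fragment; the amplitude and frequency here are arbitrary.
#include <math.h>

float wavyTexcoordY(float x, float y)
{
    const float amplitude = 0.02f;   // how far fragments are pushed (in texture-coordinate units)
    const float frequency = 40.0f;   // how many waves span the texture horizontally
    return y + amplitude * sinf(x * frequency);
}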

I wrote another shader that blurs the foreground texture using texel averaging. You can see that the background is unaffected by the filter!