EvoLisa in C++ and OpenGL

Last December Roger Alsing posted an interesting blog entry about EvoLisa, a program that creates an approximation of the Mona Lisa with genetic programming. I'm not familiar with genetic programming techniques, but given the application to graphics, I was interested.

His code was written in C#, and approximates a target image by randomly tweaking the number of polygons, their vertex counts and locations, and each polygon's color and opacity. As the polygon collection is tweaked, a new image is rendered and compared to the target. If the difference from the target decreases, the new polygon collection is kept and tweaked for the next generation. It is a slick little system and received a fair amount of attention.

I ported his released code to C++, creating a slightly more generally named program, evoimagegl, and have tried a few versions to accelerate convergence, with decent results. My target image is of Delicate Arch in Arches National Park, Utah:

Convergence

Convergence is measured by per-pixel color difference. For each pixel, the difference is sqrt( (r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2 ). Summing the differences over all corresponding pixels gives a single, somewhat large, value that measures how well the images match. The particular value isn't important; the point is to track it with the goal of minimizing the sum. As the images more closely match, this value decreases.
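
As a concrete sketch, assuming 8-bit RGB pixels in a flat, row-major buffer, the measure might look like the following (the function name and pixel layout are illustrative, not the actual evoimagegl code):

    // Minimal sketch of the difference measure, assuming 8-bit RGB pixels
    // in a flat, row-major buffer (illustrative, not the actual port's code).
    #include <cmath>

    double imageDifference(const unsigned char* a, const unsigned char* b,
                           int width, int height)
    {
        double sum = 0.0;
        for (int i = 0; i < width * height * 3; i += 3) {
            const double dr = double(a[i])     - b[i];
            const double dg = double(a[i + 1]) - b[i + 1];
            const double db = double(a[i + 2]) - b[i + 2];
            sum += std::sqrt(dr * dr + dg * dg + db * db);  // per-pixel distance
        }
        return sum;
    }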

The very first generation differs greatly from the target, and often has difference values of 7.0e7 or 8.0e7. A final, well-matched image will be close to 2.2e7.

Version 1

This is a straight port, with a basic generation algorithm to test things out. Each generation was rendered in software by the GD library, so this step was slow. It worked, and tended to converge to a vaguely recognizable image at about 80,000 generations. Each generation created one mutant, and if the mutant was a better fit to the environment, it became the "breeder" for the next generation. Convergence to 3.17281e7 took 500,000 generations. A full run of 1,000,000 generations took 3:10 h:m.
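
A minimal sketch of this selection loop, with hypothetical stand-ins (PolygonSet, mutate(), renderAndDiff()) for the port's actual types and routines:

    // Sketch of the Version 1 selection loop: one mutant per generation,
    // kept only if it is a better fit. PolygonSet, mutate(), and
    // renderAndDiff() are hypothetical stand-ins for the port's actual code.
    struct PolygonSet { /* polygon list: vertices, color, opacity */ };
    PolygonSet mutate(const PolygonSet& parent);  // one random tweak (assumed)
    double renderAndDiff(const PolygonSet& s);    // render and compare to target

    void evolve(PolygonSet& best, long generations)
    {
        double bestDiff = renderAndDiff(best);
        for (long g = 0; g < generations; ++g) {
            PolygonSet mutant = mutate(best);
            double diff = renderAndDiff(mutant);
            if (diff < bestDiff) {                // improvement: next "breeder"
                best = mutant;
                bestDiff = diff;
            }
        }
    }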

Version 2

The performance of the GD library is something of an unknown, and code must access pixel data through (x,y) coordinates. Depending on the implementation, this may not be cache-considerate; the original code used two nested loops (x outer, y inner) to read pixel data. Swapping the loop order to (y outer, x inner) alone yielded a 6% speedup.
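
In sketch form, using GD's real pixel accessors but illustrative surrounding code:

    #include <gd.h>
    #include <cmath>

    // The same measure through GD's accessors, with the swapped loop order:
    // y outer, x inner, so successive reads walk along a scanline instead of
    // jumping between rows. gdImageGetTrueColorPixel and the gdTrueColorGet*
    // macros are GD's API; the surrounding function is a sketch.
    double diffImages(gdImagePtr a, gdImagePtr b, int w, int h)
    {
        double sum = 0.0;
        for (int y = 0; y < h; ++y) {        // was the inner loop originally
            for (int x = 0; x < w; ++x) {    // was the outer loop
                const int ca = gdImageGetTrueColorPixel(a, x, y);
                const int cb = gdImageGetTrueColorPixel(b, x, y);
                const double dr = gdTrueColorGetRed(ca)   - gdTrueColorGetRed(cb);
                const double dg = gdTrueColorGetGreen(ca) - gdTrueColorGetGreen(cb);
                const double db = gdTrueColorGetBlue(ca)  - gdTrueColorGetBlue(cb);
                sum += std::sqrt(dr * dr + dg * dg + db * db);
            }
        }
        return sum;
    }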

Next was to take advantage of multiple CPU cores by splitting the differencing work across threads. Using just one worker thread in addition to the main thread yielded an extra 10% speedup (run on a dual-core MacBook). Larger images should see a better speedup; I don't think the 200x200 images provide enough compute work to amortize the cost of creating and joining the second thread.
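
A sketch of the split, shown here with std::thread (the actual division of work in evoimagegl may differ); diffRows() is an assumed row-range version of the differencing above:

    #include <gd.h>
    #include <thread>

    // Sum of per-pixel differences over scanlines [y0, y1) -- assumed to be
    // a row-range version of diffImages() above.
    double diffRows(gdImagePtr a, gdImagePtr b, int w, int y0, int y1);

    // Split the differencing across two threads by halving the rows.
    double diffThreaded(gdImagePtr a, gdImagePtr b, int w, int h)
    {
        double topSum = 0.0;
        std::thread worker([&] { topSum = diffRows(a, b, w, 0, h / 2); });
        double bottomSum = diffRows(a, b, w, h / 2, h);  // main thread's half
        worker.join();  // create/join overhead is what small images can't amortize
        return topSum + bottomSum;
    }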

Generation selection was the same as in Version 1. A full run of 1,000,000 generations took 2:40 h:m, a 16% speedup.

Version 3

This modification was an optimization for quicker convergence. Instead of producing a single child and keeping it only if it is better, each parent is mutated N times, and the child with the greatest improvement in fitness is selected. The search is broader and can converge more quickly.
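
A sketch of the best-of-N selection, reusing the hypothetical names from the Version 1 sketch:

    // Version 3 selection sketch: N children per generation; the best child
    // replaces the parent only if it improves fitness. Same hypothetical
    // names as the Version 1 sketch.
    void evolveBestOfN(PolygonSet& best, long generations, int n)
    {
        double bestDiff = renderAndDiff(best);
        for (long g = 0; g < generations; ++g) {
            PolygonSet genBest = best;
            double genDiff = bestDiff;
            for (int c = 0; c < n; ++c) {         // n mutations of the parent
                PolygonSet child = mutate(best);
                double d = renderAndDiff(child);
                if (d < genDiff) {
                    genBest = child;
                    genDiff = d;
                }
            }
            best = genBest;                       // best child, or the parent
            bestDiff = genDiff;
        }
    }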

The old code ran 1,000,000 mutations. Version 3 also creates 1,000,000 mutations, but as 5 children in each of 200,000 generations.

Convergence to <3.17281e7 took 40,962 generations (204,810 mutations) with the Version 3 code, compared to 500,000 linear mutations in Version 1: a 60% decrease in run time to a given difference value (a 2.5x speedup). A broader search is beneficial to convergence.

Convergence to less than 2.86891e7 took 800,000 iterations with Version 2. Version 3, with 5 children per generation, reached it in 53,403 generations, or 267,015 mutations. Since 800,000 iterations is 80% of Version 2's total trial, Version 2's effective time is 128 minutes (80% of 2:40). Version 3 reached this same fitness level in 32 minutes, a 4x speed improvement (and nearly 5x the speed of Version 1).

Letting it run a full 5/200,000 run took 175 minutes, but the image fitness was good (though very slow to converge below 2.40e7). Final convergence was 2.39418e7.

Version 4

Instead of using software rendering via the GD library, Version 4 uses OpenGL to draw the polygons. This version also supports multiple children (mutations) per generation, and renders each mutant on the left side of the window, with the current "best" match individual on the right side.
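
A sketch of drawing one translucent polygon with immediate-mode OpenGL (the struct fields are hypothetical; glColor4ub, glBegin/glEnd, and the blending calls are standard GL):

    #include <vector>
    #ifdef __APPLE__
    #include <OpenGL/gl.h>
    #else
    #include <GL/gl.h>
    #endif

    // Drawing one translucent polygon with immediate-mode OpenGL. The struct
    // fields are hypothetical; the GL calls are standard. Blending must be
    // enabled once per context:
    //   glEnable(GL_BLEND);
    //   glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    struct Vertex { float x, y; };
    struct Poly   { unsigned char r, g, b, a; std::vector<Vertex> verts; };

    void drawPoly(const Poly& p)
    {
        glColor4ub(p.r, p.g, p.b, p.a);  // RGB plus opacity
        glBegin(GL_POLYGON);
        for (const Vertex& v : p.verts)
            glVertex2f(v.x, v.y);
        glEnd();
    }

Reading the rendered mutant back for the difference computation would typically be done with glReadPixels, so the rasterization moves to the GPU while the comparison stays on the CPU.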

A full 5/200,000 run takes 47.5 minutes: 4x faster than Version 1 (190/47.5) and 3.7x faster than Version 3 (175/47.5).

Convergence to <3.17281e7 took 65,000 generations (reaching 3.11688e7 at 325k mutations).

Convergence to <2.86e7 took under 75,000 generations (2.74e7 at 375k mutations), or approximately 75k/200k * 47.5 = 17.8 minutes, almost twice as fast as Version 3's time to this convergence level.

Final convergence was 1.94149e7 at 200,000 generations (1e6 mutations).

Results

It is interesting how quickly basic form emerges from the mutations. After just a few thousand generations, the basic colors of the sky, arch, and ground are in place. One thing I didn't track is the maximum number of polygons in the final images, but it is bounded at 255.

2402 generations:

4215 generations:

10202 generations:

50119 generations:

199824 generations:

target: