Screenshots to Code

An experiment to (re)learn Machine Learning

What am I trying to accomplish?

This blog post has a handful of goals:

  • Get up to speed on Machine Learning
  • Help others do the same
  • Get writing again
  • (stretch) Provide a tool that helps developers move code forward

While this last bullet feels ambitious, it will act as the framing to make sure we explore and learn things of real use (not just theoretical).  The project we’ll tackle is to create a tool that takes a screenshot of a UI and generates the code to create that UI.  This blog post will serve as the directory of the explorations that go into tackling this project.  It’s broken into a few sections:

  • Project Structure – A description of the various problems and sub-projects that need to be completed.  This will be updated as progress is made, as well as when there are setbacks and lessons to be learned.
  • Resources / Topics
    • Topics – As I explore certain topics that feel gnarly to those of us who aren’t data scientists, I will write up a guide along with links to resources to get started.
    • Resources – For topics that are already well covered and manageable, I will simply include a brief description and a set of links.
    • Upcoming dives – I fully expect to hit a number of topics that require exploration.  This will be a running backlog of topics to explore.

Project Structure

Having not written any AI/ML algorithms in over a decade, my first stab at project structure is likely naïve.  Current thinking is that it is composed of the following problems/sub-projects:

(Neural Network) Screenshot to GUI Code

  • Description: Input a screenshot and output a semantically correct snippet of GUI code that creates it.
  • Input: A set of screenshots
  • Components:
    • Primary Recurrent Neural Network – this will take the input screenshot and output code.
    • (sub-project) GUI Code to Bitmap – takes the output of the RNN and creates a bitmap.
    • (sub-project) Bitmap Comparer – takes two bitmaps and returns a float for how similar/different they are.
  • Training loop: 
    • Run the RNN on a sample, feed its output through GUI Code to Bitmap, and score the rendered result against the input
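The loop above can be sketched end-to-end.  Everything here is a hypothetical stub — the real RNN, renderer, and comparer are the sub-projects described in this section — but it shows how the three pieces connect:

```python
# Minimal sketch of the training loop, with hypothetical stubs standing in
# for the three sub-projects (RNN, GUI Code to Bitmap, Bitmap Comparer).
import numpy as np

def gui_code_to_bitmap(code: str) -> np.ndarray:
    """Stub renderer: derive a tiny 2x2 bitmap from the code's length."""
    return np.full((2, 2), len(code) % 256, dtype=np.uint8)

def bitmap_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Stub comparer: 1.0 for identical bitmaps, smaller as they diverge."""
    return 1.0 / (1.0 + np.abs(a.astype(float) - b.astype(float)).mean())

def train_step(generate_code, screenshot: np.ndarray) -> float:
    """One iteration: generate code, render it, score the render."""
    code = generate_code(screenshot)                # stand-in for the RNN
    rendered = gui_code_to_bitmap(code)             # sub-project: code -> bitmap
    return bitmap_similarity(screenshot, rendered)  # sub-project: comparer

# A model that reproduces the target code exactly scores 1.0.
target = gui_code_to_bitmap("<Grid/>")
print(train_step(lambda s: "<Grid/>", target))  # 1.0
```

The real loop would use the similarity score as a reward/loss to update the RNN’s weights; that update step is deliberately left out since the model doesn’t exist yet.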


GUI Code to Bitmap

  • Description: Create a component that can take semantically correct GUI code and generate the expected visual result.
  • Components:
    • Data Set: (sub-project) Collect Code Samples via GitHub Crawler
      • Description: Crawl GitHub (taking license into account) and get UI/XAML files, and generate UI snapshot
      • Goal: Create the training set.
    • Code to Bitmap: Current thinking is to use Windows.UI.Xaml.Markup.XamlReader to parse the file and RenderTargetBitmap to create the image.  Note, for performance reasons, my guess is that we will need to build a generative NN to replace these two calls.

Bitmap Comparer

  • Description: Take in two bitmaps and calculate their similarity.
  • Challenge: No idea yet how to create a good cost function.  Adding it to the list of coming-soon topics…
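As a placeholder until that exploration happens, pixel-wise mean-squared error is the most naive baseline — trivial to compute but a poor perceptual measure (a one-pixel shift of an edge can score worse than a uniform brightness change), which is exactly why a better cost function is on the list:

```python
# Naive baseline comparer: MSE mapped into (0, 1], where 1.0 = identical.
import numpy as np

def mse_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Mean-squared pixel error, inverted so higher means more similar."""
    if a.shape != b.shape:
        raise ValueError("bitmaps must have the same shape")
    err = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 1.0 / (1.0 + err)

white = np.full((8, 8), 255, dtype=np.uint8)
black = np.zeros((8, 8), dtype=np.uint8)
print(mse_similarity(white, white))          # 1.0
print(mse_similarity(white, black) < 0.001)  # True -- maximally different
```

Structure-aware measures (e.g. SSIM-style comparisons) are the obvious next step to investigate.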

Resources / Topics

If you have any suggestions on content you hope I will explore, pointers to great resources/material, or just comments and questions, please don’t hesitate to leave a note in the comments below.

Subjects to explore:

  • How to create a good cost function for comparing two images?


<Coming Soon(ish)/>



Off the Electronic Grid: Into the Grand Canyon

Check out some of the great shots we got of friends and family in the canyon.  Then head past them to read a description of the trip and some thoughts on why trips like these are important.

Quick stop at Trail View Overlook.
Left to Right - Lindsey, Nathan, Nick, Xheni

A view out from Trail View Overlook.

Starting down New Hance
Front to back: Nick, Dennis, Xheni, and Nathan

A view of the trail hugging the rock formation on the right with the red rock to step around.

Good view of the canyon a few hours into our hike.

A break on the way down with the
rain spitting on us.
Left to Right - Nick, Xheni, Dad, Nathan

Xheni and my Dad getting close to Red Canyon

Some ominous clouds just north of us in the canyon.

A shot of the cactus flowers with a cloudy canyon backdrop.

Red Canyon and what looks like heavy rain.

Dad traversing a tricky section of the trail while I watch.

Looking down the canyon at our tents (Dad's is the grey one on the left; Xheni's and mine is the yellow one).

Looking back up trail at Nathan and Lindsey's tent.

The last mile of the hike to the river was in a dry creek bed.

Big boulders lining the creek bed on our way down to the river.

Bringing up the rear, my wife and I make it to the river banks.

A shot of the Colorado river.

Xheni and I getting back into camp, ready for dinner.

Lindsey getting started on over an hour of water filtering while Nathan cooks, Dad enjoys the best Ramen he's ever had, and Xheni stares off into space.

A fabulous morning as we exited Red Canyon.

First real break on the climb.
Back to Front - Nathan, Nick, Dad, and Xheni

Beautiful shot of the clouds, sky, and the canyon.

A view up the canyon and of a false summit.

Nathan stopped to get a shot of the flowers
against the Juniper tree as we hiked past.

Nathan and I relaxing on a rock shelf that was almost the perfect bench.

The view off to the left from our rock shelf.

Nathan graciously stopped to get a picture of the
wild flower for Xheni.

Nathan was being a mountain goat as he tried to capture the steepness of the trail.
Left to Right: Dad, Xheni, Nick

One of our last looks at the trail.

Relaxing after the hike by Duck on a Rock.

Read More

The Challenge of Managing the Whole Product as a Feature Owner

I recently had a conversation with someone who co-managed a healthy-sized organization at Microsoft (~50 program managers and engineers).  The goal of the chat was to explore some customer development techniques her team had been using and to learn how they could be applied to the work my team was doing.  But right off the bat, the conversation took a turn I hadn’t expected.  Some of the work she had been doing was pulled in the 11th hour, and she seemed disenchanted with her lack of control and autonomy over the product.  Her argument was that, even as a PM Lead, you are basically only the owner of a feature and not of a product.  Therefore, you don’t have enough span of control to effectively employ customer development.  This disempowered line of reasoning runs completely counter to my personal philosophy.  As such, I have spent the last couple of weeks mulling over the conversation and pondering the relationship between span of control, influence, customer advocacy, and sense of responsibility.

Read More

Understand what’s possible with the Windows UI Animation Engine


In November of 2015, the Visual Layer was introduced as a series of new APIs in the Windows.UI.Composition namespace.  These APIs marked the first opportunity for app developers to get direct access to many of the capabilities that have underpinned the UI frameworks since Windows 8 (e.g., IE/Edge, XAML, and the Windows Shell).  One key aspect of the new Visual Layer is its animation engine.  But after spending a bunch of time talking to developers at //build this year, it became clear that many are still unsure how the pieces of the animation system fit together.  To shed some light on what you can do with it, let’s walk through two questions:

  • Who’s responsible for starting animations?
  • What drives the animation to change values?

Read More