How to take a screenshot of a virtual display server

Let us run and record arbitrary GUI programs via xvfb-run (which itself is a script to run stuff in Xvfb)! I am going to record the session via avconv and will take a screenshot with xwd each second. But first things first.

Xvfb is a virtual display server that implements the X11 display protocol. That basically means the rendering only exists in memory and the server won't send output to your screen. How cool is that? xvfb-run is a script (please install it as a package) which will run your command in Xvfb.
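To make "run your command in Xvfb" a bit more concrete, here is a minimal Python sketch that only builds the xvfb-run command line. The helper name build_xvfb_run_cmd and the 640x480x24 geometry are my assumptions for illustration; --server-num and --server-args are the options documented in the xvfb-run manpage.

```python
def build_xvfb_run_cmd(command, server_num=99, screen="640x480x24"):
    """Return the argv list that runs `command` inside Xvfb via xvfb-run.

    build_xvfb_run_cmd is a hypothetical helper, not part of xvfb-run.
    """
    return [
        "xvfb-run",
        "--server-num=%d" % server_num,
        # Everything in --server-args is handed on to the Xvfb server itself.
        "--server-args=-screen 0 %s" % screen,
    ] + list(command)


# Usage (assumes xvfb-run and xterm are installed):
#   import subprocess
#   subprocess.call(build_xvfb_run_cmd(["xterm", "-e", "sleep", "5"]))
```

Building the list separately from running it keeps the sketch easy to inspect: you can print the command before handing it to subprocess.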

You need xvfb, avconv or ffmpeg, xwd, vorbis (libvorbis0a or something like that) and GTK for Python (python-gtk2, maybe?). xdotool, used for testing, is optional.
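If you want to check up front whether the tools are there, something like the following sketch could do it. The binary names in the lists are assumptions on my side (package names and binary names differ between distributions), and both helper names are made up for this example.

```python
import shutil


def first_available(candidates, which=shutil.which):
    """Return the first name in `candidates` found on PATH, else None."""
    for name in candidates:
        if which(name):
            return name
    return None


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


# Assumed binary names; adjust them for your distribution.
REQUIRED = ["xvfb-run", "Xvfb", "xwd"]
RECORDERS = ["avconv", "ffmpeg"]

print("missing tools:", missing_tools(REQUIRED) or "none")
print("recorder:", first_available(RECORDERS) or "none found")
```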

Caveats (xvfb-run in combination with avconv)

Or: stuff I did wrong so you don't have to.

  1. .Xauthority != ~/.Xauthority: some users stumble into permission problems because they have their own ~/.Xauthority file. So when you want to access the display server from another program, setting the -f parameter is not optional, it is mandatory. The manpage says about -f:

    -f file, --auth-file=file
          Store X authentication data in file. By default, a temporary directory called xvfb-run.PID (where PID is the process ID of xvfb-run itself) is created in the directory specified by the environment variable TMPDIR (or /tmp if that variable is null or unset), and the tempfile(1) command is used to create a file in that temporary directory called Xauthority.
    

    The tricky thing is that --help says:

    -f FILE   --auth-file=FILE          file used to store auth cookie (default: ./.Xauthority)
    

    That gives you the impression that ~/.Xauthority is used, because how would you know that xvfb-run creates a temp dir? Right! Read the manpage, read the source. I would like to propose a new acronym: after you have RTFMed, you should RTFC (Read The Fucking Code).

  2. The --auto-servernum option won't tell you which server number it picked, so either take the simple route (the default servernum value of 99, that's what I did) or find a free display number yourself. The latter means replicating the --auto-servernum code that is already written; not DRY, but it gives you the ability to run multiple instances at once. Once you have a display number, pass it to avconv. The display number is part of the variable $DISPLAY, which has the format hostname:D.S:

    • hostname is the host the X server runs on (an empty hostname, as in :99, means the local machine)
    • D is the display number (every X server on a host gets its own number; with xvfb-run's default that is 99)
    • S is the screen number (a display itself can have multiple screens, for example separate monitors; if you leave it out, screen 0 is used)
  3. Find some good avconv settings. For example, the formats .mpg and .mov fail horribly, because avconv ends up recording longer than the server is running. Ideally, I would like to stop the recording before Xvfb stops. But Xvfb stops as soon as xvfb-run exits. The problem here is similar to the halting problem: there is no way to predict when the executed GUI program is going to terminate. Things shut down in this order: the program you ran terminates, then xvfb-run terminates Xvfb and exits. So, obviously, avconv tries to record a source that is no longer available and will produce a malformed file. Some formats choke on that behaviour. Theora didn't.

  4. Some display sizes are not compatible with certain depths.
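Caveats 1 to 3 can be sketched in a few lines of Python. This is not the code from my wrapper, just an illustration under some assumptions: all four helper names are invented, the lock-file check mirrors what I understand --auto-servernum to do (look for /tmp/.XN-lock), and the recorder arguments assume the x11grab input of avconv/ffmpeg with -s for the capture size. The parser ignores exotic hostnames that contain colons.

```python
import os
import re

DISPLAY_RE = re.compile(r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$")


def parse_display(value):
    """Split 'hostname:D.S' into (hostname, display, screen)."""
    match = DISPLAY_RE.match(value)
    if match is None:
        raise ValueError("not a DISPLAY string: %r" % value)
    # A missing screen number means screen 0.
    screen = match.group("screen")
    return match.group("host"), int(match.group("display")), int(screen or 0)


def find_free_display(start=99, lock_dir="/tmp"):
    """Return the first display number >= start that has no X lock file."""
    number = start
    while os.path.exists(os.path.join(lock_dir, ".X%d-lock" % number)):
        number += 1
    return number


def client_env(display, auth_file, base=None):
    """Environment for X clients that should talk to our Xvfb instance.

    auth_file is the same path you passed to xvfb-run with -f.
    """
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = ":%d" % display
    env["XAUTHORITY"] = auth_file
    return env


def build_record_cmd(display, size, outfile, recorder="avconv"):
    """argv for grabbing screen 0 of a local display via x11grab."""
    return [recorder, "-f", "x11grab", "-s", size,
            "-i", ":%d.0" % display, outfile]
```

Keep in mind that find_free_display is racy: another X server could grab the number between the check and the start of Xvfb. The capture size passed to the recorder should match the screen size you gave Xvfb.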

Script it!

I made a wrapper (xvfb-screenshooter) around the wrapper (xvfb-run) around the wrapper (xvfb) that wraps the output in a virtual display. What could possibly go wrong?

  • xvfb-screenshooter
  • I included a simple python-gtk hello-world program.
  • There are three ways this wrapper can leave its main loop:
    • run a specified amount of iterations
    • kill the application with the name "Hello.*"
    • wait until the application exits
  • Maybe I will add some CLI switches so you can choose a behaviour without changing the code.
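To give an idea of how the three exit behaviours could fit into one loop, here is a rough Python sketch. It is not the actual xvfb-screenshooter code: the function names, the callback parameters and the file naming are all assumptions made for the example. The xwd options used (-root, -display, -out) are the standard ones.

```python
import re
import time


def xwd_cmd(display, index):
    """argv that dumps the root window of a local display into a numbered file."""
    return ["xwd", "-root", "-display", ":%d" % display,
            "-out", "shot-%04d.xwd" % index]


def main_loop(take_screenshot, app_running, window_titles, kill_app,
              max_iterations=None, kill_pattern=None, interval=1.0,
              sleep=time.sleep):
    """Take a screenshot every `interval` seconds until an exit condition hits.

    All callbacks are hypothetical hooks supplied by the caller.
    Returns why the loop ended: 'exited', 'iterations' or 'killed'.
    """
    count = 0
    while True:
        # Behaviour 3: wait until the application exits on its own.
        if not app_running():
            return "exited"
        take_screenshot(count)
        count += 1
        # Behaviour 1: run a specified number of iterations.
        if max_iterations is not None and count >= max_iterations:
            return "iterations"
        # Behaviour 2: kill the application whose name matches the pattern.
        if kill_pattern and any(re.match(kill_pattern, title)
                                for title in window_titles()):
            kill_app()
            return "killed"
        sleep(interval)
```

In a real run, take_screenshot would call something like subprocess.call(xwd_cmd(99, i)), and window_titles could be backed by xdotool. Passing them in as callbacks keeps the loop itself testable without an X server.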

What are some fun things to do with this?

  • Selenium tests?
  • A lot of screenshots?
  • Really huge screenshots?
  • Emulate a lot of different displays?