Structure and Practices of the Video Relay Service Program
The YouTube Video You Don’t See
Shop with confidence across the web
Helicopter view of your driving directions on Google Maps
Google CIO and others talk DevOps and "Disaster Porn" at Surge
Burning Man 2011 - Yes we were there.
Getting Started on the Google API
CACertMan app to address DigiNotar & other bad CA’s
Custom Class Loading in Dalvik
TWO REPORTS OF ADVISORY COMMITTEES ON DISABILITIES ISSUES RELEASED
Join the White House Disability Group Monthly Call on July 27
Multiple APK Support in Android Market
Forever alone involuntary flashmob
PS3 root key released - sign and run anything
Don't have a front-facing camera?
Mobile phone product testing: Models
How Can the LHC withstand 1 Petabyte of Data a Second?
Linus Torvalds is now officially a US Citizen
Portland bike lanes get mario symbols
Skype RC4 claimed reverse-engineered
Measurement Lab - Google IO BigQuery session is live querying 60 billion rows instantly
All you need is a little egotism, and $6
Convert IDN punycode to/from native characters
Sparkfun free day tomorrow: 1/7
Need a recursive DNS server? Use 8.8.8.8 and 8.8.4.4
JIQL - Java JDBC wrapper for Google DataStore
Unicorn == Mongrel delayed_job
Remus - Transparent HA for Xen
Crossbow Virtual Wire Demo Tool
Eucalyptus MySQL SOLR RabbitMQ Varnish == Nebula.nasa.gov
Apple drops ZFS due to legal concerns
Peering disputes between Cogent and Hurricane Electric
Equinix to acquire Switch and Data for $689 million
Project kxen renamed project HXEN
Lessconf Jacksonville - followed the next day by Barcamp
Stick-figure guide to advanced AES crypto
Why you should pay attention to Google Wave
rails-primer - how to easily host rails projects on appengine
AppEngine-JRuby on google code
Ruby on Google AppEngine: appengine-jruby video
Detecting Spammers with SNARE: Spatio-temporal Network-level Automatic Reputation Engine
Proxmox VE - OpenVZ KVM Cluster appliance management
Sun/Oracle kill of SXCE: Sysadmins everywhere cry in horror.
making water drinkable through nano-filtration
Pigin 2.6.1 adds Xmpp voice and video support
Setting up a Layer-3 tunnel VPN using ssh 4.3 and -w option tun devices
shadowserver.org - botnet hunting resources
OpenBSC - a Siemens BS-11 microBTS or a ip.access nanoBTS == your own GSM tower
Karesansui Project - a Xen management harness from Japan
Pygowave Server - Run your own Google Wave server
Xen clocksource0 time went backwards
Internet vs World Population stats
Apple pulls Google Voice app from iPhone - AT&T's fault
live-android boot ISO - very neat
How to update your GeoIP information in addition to SWIPping
Google Wave hackathon on 20th/21st, if you happen to be in Mountainview
Did I mention OTOY here before?
STuPiD - STUN/TURN using PHP in Dispair
Browser based Server-side 3D gaming from OTOY
Cisco's replacement for the WRT54GL is the WRT160NL
Spinn3r.com - Index the blogosphere
Parts of galaxy Messier 87 are missing
DRAEGER ALCOTEST 7110 MKIII-C Evaluation of Breathalizer Source Code
How Michael Osinski Helped Build the Bomb That Blew Up Wallstreet
Bruce Perens - A Cyber-Attach on an American City
How Google and Facebook are using R
adito - the new gpl fork of the old sslexplorer project
IP Address geolocation for free
Shapeways - $50 "3-D poem rings" until the end of the month
GrandCentral to become Google Voice
TurboVNC VirtualGL == FAST network GL
Ben Rockwood's presentation at the OpenSolaris Storage Summit: ZFS in the trenches
The Crisis of Credit Visualized on Vimeo
10gen - a java based app hosting infrastructure
Engineyard Vertebra - another cloud infrastructure management harness
Eucalyptus - an opensource EC2 compatible hosting infrastructure
railsbrain.com <-- ajaxified rdoc
AP IMPACT: SWAT Teams Deployed in 911 fraud
Lessons learned by people who have quit Google
Makwana indicted for Fanny Mae malware
Zentific svn repo: alpha available
DACS - Distribution and Configuration System - version 2.0
Video of Cisco IOS attack talk at Chaos Computer Conference
Cosmic radio background noise 6 times higher than expected
Grow your own bioluminescent algae
Quartz Composer and Cruise Control status
Sunay Tripathi's Solaris Networking Blog
Merry Christmas from Chiron Beta Prime
Google's Native Client... the next ActiveX?
kenai.com - xVM Server Project site
58% Spam Drop from one colo shutdown
Xenomips - a Xen friendly domU version of Dynamips - Emulate a Cisco 7200
Debian and Android dual-boot on the G1
Sipper (SIPr) - a SIP testing framework in ruby
DBslayer - a SQL abstraction layer using JSON
Fingerworks keyboard in a MacBookPro
The Phoenix BIOS hypervisor is Xen
Do you live in a Constitution-Free zone?
Puppet presentation at NYCOSUG this month
XenSmartIO - Infiniband IO for Xen
Starting with b100, OpenSolaris has virtual consoles
OpenSolaris testfarm build server interface now available
Firefox M9 Fenric - Maemo alpha
SystemZ - aka Sirius - a port of OpenSolaris to IBM System Z mainframe OS running in z/VM mode
Solaris and ZFS on a Dell 2950, tweaking notes
Early Access Windows PV drivers for xVM
Economics: The Theory of Interstellar Trade
The Financial Crisis: What Happened and What's Next?
Cisco to run Windows 2008 on their appliance virtually for services
Packetfence: an OpenSource Network Access Control system
persist.js - an alternative to gears
Chinese building "impossible" EM drive
COMSTAR SMTF - solaris FC, SAS, and iSCSI targets
Flexiscale - yet another control panel?
RightScale - cloud control panels?
Criticial ESXi remote vulnerability in openwsman
Picking the right virtualization technology requires a basic understanding of what is available out there today.
Rik Van Riel has put up the virt.kernelnewbies.org page that shows a number of the existing virtualization methods. You might want to peruse this first to get a feel.
"Bare Metal" or "Raw Iron"
Basic computing today typically occurs on "Bare Metal". This would be where your Operating Systems is installed directly on a given hardware platform. This "Raw Iron" role is how most people treat computing platforms today.
Some higher end hardware platforms offer "Hardware Partitioning". This is where the hardware platform is divvied up between multiple parallel operating systems at the same time. The hardware platform offers up CPUs, memory, and disk to independent operating systems that then run on the resources allocated to them. This isn't as much virtualization as it is resource partitioning. An example of this would be higher end Unix hardware like Sun T1 processor based servers: each hardware platform can be broken up into 32 "LDoms", each with its own install of Solaris.
VPS "Containers" - Security/Role based Virtualization
If your userspace applications don't require unique kernel services to operate, you get far more density with a VPS "Container" solution than with any other virtualization method. Simply put, all of your userspace applications share one kernel and are separated from each other via role based security mechanisms.
There are a number of different VPS technologies out there, each with its own benefits and limitations:
OpenVZ/Vserver
Linux-Vserver
Solaris Zones
BSD Jails
Solaris Zones is the only VPS platform that supports running other flavors of Unix under its "BrandZ" containers. With it, you can run a number of 32bit Linux guest flavors alongside various Solaris/OpenSolaris versions.
OpenVZ has relatively new support for IPTables as well as IPSEC independent to guests, as well as live migration.
Simply put, you should really spend some time verifying that a VPS solution won't solve your virtualization problems first. They are the best method of virtualizing with the least amount of overhead and the highest virtualization density.
User-Mode-Linux
If you need a unique kernel for each virtual machine, and don't mind a bit of overhead, User-Mode-Linux provides a secure jail with a Linux kernel, running entirely in userspace.
Using "skas0", a User-Mode-Linux kernel can boot and run under and Linux kernel without much host kernel support (usually only tuntap networking). The I/O performance of User-Mode-Linux does suffer somewhat, however, and RAM allocation per virtual image isn't as ideal as a VPS solution.
The obvious benefit is the ability to run an manage a User-Mode-Linux virtual server as userspace processes on any "standard" Linux kernel.
If you're going to use User-Mode-Linux, I strongly suggest trying Xen paravirtualization instead. The only thing that User-Mode-Linux buys you is the ability to oversubscribe memory based on host kernel virtual memory paging. Xen doesn't let you overcommit RAM as associated with guests (though it does let you change the running memory footprint on the fly, unlike User-Mode-Linux which pre-allocates it from tmpfs).
User-Mode-Linux suffers from low I/O throughput however, and tends to fall apart under load.
Paravirtualization
Paravirtualization uses a technique of "cooperative virtualization" between guests and a hypervisor. Simply put, a paravirtualized guest virtual machine is aware that it is running under a virtual environment, and adapts to this environment as appropriate.
Xen's hypercall API is well documented, and has been available to the community longer than VMWare's VMI interface. As such, there are a number of Xen "PV" ports including FreeBSD, OpenBSD, and OpenSolaris, as well as the native Linux port that Xen embraces as part of the current opensource Xen platform.
Xen is slowly being ported into the Linux kernel proper, but there is much developer pushback to each stage of the import effort. Instead, the Linux Kernel Maintainers are gung-ho about Rusty's l-guest (previously known as "l-hype") as a paravirtualization platform for future Linux kernels. At this time, l-guest is very immature and quite slow, not nearly ready enough to consider for a production deployment.
VMWare opened up their VMI specification for everyone to use, to entice systems developers to standardize on a paravirtualization API. Providing this VMI interface would allow VMI aware guests to run under VMI aware hypervisors. Unfortunately, the device interface doesn't appear to have made the cut, so guests still need to be aware of paravirtualized devices as well.
Xen PV "backend"" devices appear on a XenBus, and are accessed using a PV "frontend" device driver. Natively, the opensource Xen 3.0 only has Linux 2.6 PV drivers. The various Xen ports of FreeBSD, OpenBSD, and OpenSolaris each have their own PV "frontend" driver implementation.
VMWare ESX uses their LSI SCSI device driver and VMX networking driver to optimally talk to virtual devices. These are available for a number of operating systems and are far more mature than Xen.
Some of the benefits of a paravirtualized guest include the ability to reallocate resources on the fly from the hypervisor (changing memory footprint, hotplugging CPUs) and more integrated lifecycle management (reboot, suspend, migrate).
Both Xen and VMWare ESX are hypervisor approaches with the ability to run paravirtualized guests on intel class hardware.
Xen 2.0 was initially offered only a paravirtulized "PV" mode of operation. Xen 3.0 offers it as well, alongside Hardware Virtualized "HVM" that we will over in the next section.
System Virtualization - Virtual Bare Metal
If VPS, User-Mode-Linux, and Paravirtualization aren't adequate to the task you have at hand, it might be time to consider full system virtualization.
This mode of operation is normally much more resource intensive, and is far less scalable than the earlier virtualization methods. However, for some Operating Systems (like Microsoft Windows), there really are no better choices at the moment.
Full System Virtualization is done in a number of ways.
The entire virtual system memory address space is pre-allocated, and appears to the virtual machine to be a linear address space regardless of how it is actually mapped from the physical hardware address space.
A system BIOS boots inside this address space, much like a full PC's BIOS would boot, providing a real-mode int13 interface to emulated chipsets inside the virtual machine. The Operating System boots and loads devices drivers to interface with the emulated chipsets. As far as the Operating System is concerned, it is running "Bare Metal".
There are a few methods of full system virtualization: software emulation only, software code-scanning and emulation, hardware only, hybrid software with hardware assistance. The difference is really in how each uses Intel VT (vmx) or AMD V (svm) CPU virtualization.
A CPU software emulation only approach is slow. QEMU (without kqemu), BOCHS, older versions of SoftPC for Mac, etc, are prime examples of this. The benefits are that a non-intel hardware platform can run emulated intel software, and that the emulation can be run entirely (if not inefficiently) in userspace.
A CPU software code-scanning and emulation approach is much faster than software emulation only. Guest code pages are scanned for illegal instructions, and illegal code is "trapped" to handle opcodes and operations that would endanger other virtual machines outside of a given virtual machine sandbox. This method only works on like architectures (intel code scanning on intel hardware) and doesn't require any special CPU support for hardware emulation. QEMU (with kqemu), Win4Lin, Virtuozzo, and a number of other "pre-VT" system virtualization technologies used this approach.
A CPU hardware assisted only solution is really limited to two implementations at present. The Linux kvm project allows full system guests to run under a linux host kernel using a modified QEMU to present the virtual emulated chipsets and other system features. Likewise, Xen's Hardware Virtual Machine (HVM) does the same, only running natively under the Xen hypervisor instead of as under a Linux kernel.
A hybrid software with CPU hardware assistance approach can be a bit faster than hardware assisted virtualization alone. VirtualBox is the only opensource project of note at the moment that does this. Commercially, VMWare and Parallels both use this hybrid approach to accelerate system virtualization.
Of the full system virtualization technologies, VMWare is by far the most mature and fully featured. It is, however, commercially licensed. While you can get "Free" versions of VMWare Player and VMWare Server, there are real limitations as to how scalable either are, and what you can do with them.
VMWare Workstation is the "bleeding edge" version of VMWare. All innovations happen on that platform first. The stripped down player is based on VMWare Workstation. Eventually, many of these innovations make their way back into the server grade versions of VMWare.
IBM's power hypervisor is the oddball here, but it's important to mention. iSeries/pSeries have collapsed onto the Power5 hardware architecture with the hypervisor based i5/OS. Using Transitive's x86 emulation, this platform will (soon? already?) run "hundreds of virtual PCs" as well as AS/400, AIX5L, and native Linux on a single hardware platform. Heck, with Fundamental's FLEX-ES, UMX's Virtual Mainframe Facility, or even hercules, you can even emulate a zSeries mainframe.
Unfortunately, power5 hardware isn't commodity PC hosting gear. And that's probably the kind of hardware you're looking at, isn't it?
So, you really really want to use Xen?
First, lets consider the "flavors of Xen".
There are three primary "flavors" of Xen: Opensource Xen, XenSource Enterprise/Express, and Virtual Iron's Xen.
As we're still talking about full system virtualization rather than paravirtualization from this point on, it's important to realize the speed impact of using emulated chipset devices and generic device drivers rather than PV device drivers to access disk and network resources.
Xen uses QEMU to emulate a Intel PIIX3 IDE chipset (with some PIIX4 features), and a Realtek 8139 network card. While the IDE chipset emulation is bearable, it does incur a bit of CPU overhead in dom0 as QEMU emulates the chipset. The network emulation, on the other hand, is abysmal. Upload rates are "ok" at 6mbit+, but download rates are below 1mbit in speed, running on standard commodity PC hardware. While it could be a mere IRQ issue, it is important that you realize that running with the IDE drivers and RTL8139 drivers inside your guest are going to significantly impact your virtual system's performance.
This is where PV drivers come in.
OpenSource Xen and XenSource both have a XenBus upon which "PV devices" appear. Virtual Iron reworked their XenBus into NexBus, largely to support live migration of HVM guests, and likewise have their own unique "PV devices".
Each "flavor" of Xen needs a different set of PV device drivers.
OpenSource Xen 3.0 has been incorporated into a number of Linux Distributions: SuSE 10.1, RedHat Enterprise Linux 5, Fedora Core 6, Debian Etch, Ubuntu Edgy, and Gentoo are just a few.
The Xen project includes "unmodified_kernel" drivers for Linux 2.6. This means, if you want to run full system virtualization using Xen HVM, you only have the option of building Linux 2.6 PV drivers for your guest.
Only Novell's SuSE 10.2 commercial "Xen pilot" will have Windows PV drivers. There are no other OpenSource Xen device drivers for Windows at this time.
XenSource Enterprise/Express, on the other hand, have their own PV device drivers. While you can "almost" use the XenSource PV device drivers with the OpenSource Xen, there is much talk of data corruption and general "that just shouldn't work" messages on the IRC channel from XenSource developers. Simply put, if you run the commercial XenSource product, you should use the XenSource drivers.
Likewise, Virtual Iron has their own device drivers that are unique to their hosting platform. Their "vstools" support one version of SuSE 9 and one version of RedHat Enterprise Linux 4 (U2) in addition to their Windows drivers. While you can download the domu sources from their website, good luck trying to get them running on a linux kernel newer than around 2.6.9. I know. I've tried. If you want to run a Linux guest in Virtual Iron, you're pretty much limited to RHEL4U2. Good luck with anything else.
What if I just want to run Windows under OpenSource virtualization?
OpenSource Xen doesn't have the PV drivers yet. It will be too slow for you to really use in a production capacity.
VirtualBox.org would be my suggestion to you. It includes device drivers that seriously speed up the Windows experience and make it a viable full system virtualized environment for opensource based windows hosting.
If you don't mind forking out the coin, Virtual Iron has a good Windows virtualization platform that is much cheaper than VMWare, and is licensed per socket. With it, you get live migration and vendor support.
If you seriously have no qualms about the cost of the virtualization and want a mature top notch platform, fork out the cash for VMWare ESX.
If none of these solutions seem good to you, look at the "free" VMWare Server. It is based on mature VMWare GSX tech (though features have been whittled down in places) It doesn't scale as well as VMWare ESX, but the cost point is much easier to swallow (free as in beer).
Use the best tool for the job. Move on to the larger business problems. How is that SOA deployment going, anyway? ;)
What exactly is the difference between Paravirtualization (PV) and Hardware Virtualization (HVM) with regard to Xen?
This question continues to come up again and again. Rather than answer it in a private email or rather useless IRC chat room, it seems best to summarize it in a blog post.
Paravirtualization means that guests "cooperate" with the virtualization they run under. This means that paravirtualized virtual machines are aware they are running in a virtual environment, and have special drivers or awareness of that environment as they run.
In the case of Xen, all guests run under the guidance of a tiny Xen "hypervisor".
Think of a hypervisor as a microkernel (remember when those were big?), that is responsible for allocating RAM, acting as an intermediary for IO, routing hardware interrupts, and scheduling a fair share of CPU time to each virtual machine.
By default in Xen, one virtual machine talks to the PCI hardware, doing all IO for the others. In Xen parlance, this is "domain 0" (or "dom0"), which is a master OS that talks to the hardware on the box and provides IO resources like networking and disk space to the other domains on the physical machine. There are Linux, OpenSolaris, and FreeBSD dom0s now, it isn't just linux.
With Paravirtualization, your guest kernels need to be compiled to be aware of the Xen hypervisor, with a special Xen patch set. These kernels cannot run without the Xen hypervisor. They require Xen to operate.
The "new" thing out there is something that Xen coins Hardware Virtualized Machines (HVM). Normally, x86 based Operating Systems run with a kernel running in "ring0". Historically, only one Operating System can run as ring0 on a x86 based PC. Now, both Intel/AMD have added special VT/SVM CPU extensions that allow a special "privileged mode" of operation where a hypervisor can run multiple Operating Systems in ring0 at the same time.
Historically, without these VT/SVM instructions, you have to scan every code page for illegal instructions and/or trap instructions to emulate dangerous ones. This is how VMWare initially worked (today VMWare is a software/hardware hybrid that is aware of VT/SVM instructions). This is how QEMU, Parallels, Microsoft Virtual Server, and other virtualized PC platforms initially functioned.
With Hardware virtualization, you install an Operating System from CDROM just as if it is a physical machine. There is a BIOS, there is a VGA display (VNC/SDL), there are emulated IDE and RTL8139 network chipsets. The hardware is actually borrowed from the QEMU project, but thanks to the VT/SVM instructions, there is no need to scan the code or trap illegal CPU instructions the same way as the previous generation of PC virtualization had to.
The mainframe has had Hardware Virtualization since at least the OS/360 days. This is only something new to the PC platform.
While this is a fun essay, and I'd love to go on at length, I think this answers the initial question adequately.
If anyone else has any questions, please feel free to join us on ##xen on freenode, or drop me an email. Please don't be suprised if I post the answers here.
Xen documentation is sorely lacking. Lets try and change that, shall we?