Mon, 26 Jun 2006

Starting down the ActiveSalesforce path, my first goal was to do a simple dump of a class of objects to yaml using the API.


$ gem install activesalesforce
$ cat - <<EOF > dump_accounts.rb
#!/usr/bin/ruby

require 'rubygems'
require_gem 'activerecord'
require_gem 'activesalesforce'

ActiveRecord::Base.logger = Logger.new(STDERR)

ActiveRecord::Base.establish_connection(
  :adapter => "activesalesforce",
  :url => "https://www.salesforce.com/services/Soap/u/7.0",
  :username => "yourlogin@yourdomain.com",
  :password => "yourpassword"
)

class Account < ActiveRecord::Base
end

puts Account.find(:all).to_yaml
EOF
$ chmod u+rx dump_accounts.rb
$ ./dump_accounts.rb > accounts.yml
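
For anyone who wants to see what the to_yaml step produces without a Salesforce login, here is a minimal sketch using plain hashes. The record fields ("id", "name") are made-up stand-ins, not the real Account schema.

```ruby
require 'yaml'

# Hypothetical stand-ins for Account records; the field names are
# assumptions for illustration, not the actual Salesforce schema.
accounts = [
  { 'id' => 'acct-1', 'name' => 'Acme Corp' },
  { 'id' => 'acct-2', 'name' => 'Globex' }
]

# to_yaml serializes the whole array into a single YAML document.
dump = accounts.to_yaml
puts dump

# Loading the dump back should give equal data.
restored = YAML.load(dump)
```

The same round trip is what makes a YAML dump handy as a quick backup or fixture format.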

Next step: figure out how to handle user authentication for Account Contacts...

Wed, 21 Jun 2006

The following is a post I've just made to the pgcluster-general mailing list. As it is blog worthy, it seemed appropriate to post here.

I've been testing a pgcluster running 1.5.0rc7 with pgbench 8.0.2.

I have 6 servers in a cluster:

  • 2 pglb servers (2.6 kernel debian, amd sempron 2800, 1G RAM, 2 IDE drives software RAID1)
  • 2 pgreplicate servers (2.6 kernel debian, amd64x2, 4G RAM, 2 IDE drives software RAID1)
  • 2 postgres database servers (2.6 kernel debian, amd64x2, 4G RAM, 4 IDE drives software RAID10)

The pgbench page is:

http://www.sitening.com/tools/postgresql-benchmark/

It's a simple build:

$ wget http://www.sitening.com/pgbench.8.0.2.c
$ gcc -I/usr/include/postgresql -o pgbench pgbench.8.0.2.c -lpq -lm

After a bit of postgresql tweaking, I'm finally getting some good numbers (see below).

Things to remember when installing pgcluster:

  1. Your fully qualified hostnames must resolve and match the config.
    • add entries to /etc/hosts if you must, but make sure everything uses actual resolvable hostnames.
  2. Watch your user process limit (ulimit -u unlimited).
    • on the pglb master: pglb will spawn a thread for each pooled connection.
    • on the pgreplicate master: pgreplicate goes absolutely insane with threads
    • on db nodes: postgres forks a backend process for each incoming connection
  3. Don't forget about the cluster.conf buried in the postgres server configuration directory on the db nodes.
  4. When you run things with "-v", expect a huge slowdown.
    • pglb drops from 12k tps (using "pgbench -S" for select() only) to only 6 tps. (over 3 orders of magnitude)
    • pgreplicate -v drops to below 1 tps. (2 orders of magnitude)
  5. Set up ssh key trust between servers using the userid that postgres runs as (usually "postgres")
  6. Remember to start slaves with pg_ctl -o "-i -R" the first time to pull down the rsync of the master.
    • this killed most of my "weird" deadlocks with select() only pgbench right away.
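
The hostname item in the list above is easy to check up front. A rough sketch follows; the hostnames and the stub resolver are assumptions for illustration, and in real use the default resolver (Ruby's Resolv, which consults /etc/hosts and DNS) would be used instead.

```ruby
require 'resolv'

# Returns the hostnames that fail to resolve. The resolver is injectable
# so the check can be exercised without touching real DNS.
def unresolvable_hosts(hosts, resolver = ->(name) { Resolv.getaddress(name) })
  hosts.reject do |name|
    begin
      resolver.call(name)
      true
    rescue Resolv::ResolvError
      false
    end
  end
end

# Hypothetical hostnames and a stub resolver standing in for /etc/hosts.
known = { 'pglb1.example.com' => '10.0.0.1' }
stub = lambda do |name|
  known.fetch(name) { raise Resolv::ResolvError, "no address for #{name}" }
end

missing = unresolvable_hosts(['pglb1.example.com', 'db1.example.com'], stub)
puts missing.inspect
```

Running it with the real resolver on every node, against every hostname in the configs, should come back with an empty list before pgcluster is started.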

Back to the pgbench numbers.

The fastest mode of operation is select() only (pgbench -S):


$ ./pgbench -S -n -v -c 10 -t 1000 -m 10
  transaction type: SELECT only
  scaling factor: 1
  number of clients: 10
  number of transactions per client: 1000

  number of transactions actually processed: 10000/10000
  tps = 8394.268729 (including connections establishing)
  tps = 12815.846538 (excluding connections establishing)

  mean tps = 8399.209915 (including connections establishing)
  standard deviation = 202.800265

  mean tps = 12574.325714 (excluding connections establishing)
  standard deviation = 428.241079
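
When comparing many runs it helps to pull the numbers out of the output rather than eyeball them. A small sketch, assuming the output format shown above; parse_tps is a made-up helper name, and the "mean tps" lines are deliberately skipped.

```ruby
# Extracts the "tps = N (including|excluding connections establishing)"
# figures from pgbench output. Lines starting with "mean tps" do not match.
def parse_tps(output)
  pattern = /^\s*tps = ([\d.]+) \((including|excluding) connections establishing\)/
  output.scan(pattern).each_with_object({}) do |(value, kind), acc|
    acc[kind.to_sym] = value.to_f
  end
end

sample = <<~OUT
  number of transactions actually processed: 10000/10000
  tps = 8394.268729 (including connections establishing)
  tps = 12815.846538 (excluding connections establishing)
  mean tps = 8399.209915 (including connections establishing)
OUT

result = parse_tps(sample)
puts result.inspect
```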

Running this in parallel with an insert()/select() mix doesn't seem to impact it much. Meaning, a select() running in parallel with an insert()/select() run only seems to drop the numbers by 1k-2k tps or so.

To run an insert()/select() mix, run pgbench with the -N flag:


$ ./pgbench -N -n -v -c 10 -t 1000

  transaction type: Update only accounts
  scaling factor: 1
  number of clients: 10
  number of transactions per client: 1000

  number of transactions actually processed: 10000/10000
  tps = 115.539752 (including connections establishing)
  tps = 116.069260 (excluding connections establishing)

These numbers are to be expected with a synchronous replication system like pgcluster. As long as the select() to insert()/update() ratio is at least 9:1 things should be usable.
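
To get a feel for what a 9:1 mix might mean, here is a back-of-the-envelope estimate. It assumes transactions are served one after another at the measured rates, which is a simplification of my own and not something the benchmark above establishes (the parallel runs suggest reads and writes overlap better than this).

```ruby
# Estimated throughput for a mixed workload, assuming each transaction
# costs 1/tps seconds and transactions run back to back (an assumption).
def blended_tps(read_tps, write_tps, write_fraction)
  1.0 / ((1.0 - write_fraction) / read_tps + write_fraction / write_tps)
end

# Measured figures from the runs above (excluding connection setup).
estimate = blended_tps(12815.846538, 116.069260, 0.1)
puts format('estimated tps at a 9:1 mix: %.0f', estimate)
```

Under that model a 9:1 mix lands at roughly a thousand tps, and the write rate dominates the result long before writes dominate the mix.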

In the full "TPC-B (sort of)" mode, pgbench also throws update() statements against the tellers and branches tables into the mix.

This is where pgbench starts to deadlock for me.

You can add the "-d" flag to pgbench to debug things if it seems frozen.

The first pgcluster "bug":

It looks like I deadlock almost immediately after spawning pgbench with the following arguments:


$ ./pgbench -n -v -c 1 -t 1000 -m 1 -d
  pghost:  pgport: (null) nclients: 1 nxacts: 1000 dbName:
  message type 0x43 arrived from server while idle
  message type 0x5a arrived from server while idle
  client 0 sending begin
  client 0 receiving
  client 0 sending update accounts set abalance = abalance + 216 where aid = 52606

  client 0 receiving
  client 0 sending select abalance from accounts where aid = 52606
  client 0 receiving
  client 0 sending update tellers set tbalance = tbalance + 216 where tid = 7

  client 0 receiving
  client 0 sending update branches set bbalance = bbalance + 216 where bid = 1

  *deadlock*

The odd part is that only that pgbench seems hung. I can spawn any number of "pgbench -S" and "pgbench -N" sessions I want while that one is stuck, and things seem to continue running.

My second pgcluster "bug":

While doing this testing, I've found that pglb chokes if you request more client connections than it can handle.

In my testbed, I upped the max connections to 300 per server (each server tuned to allow 500 client connections), leaving me with 600 pooled connection threads running on my pglb server.

If I hit pglb with 600 available pooled connections with, say, 1000 pgbench connection attempts, pglb goes into a dead state refusing to accept more connections, even after pgbench is killed.


$ ./pgbench -S -n -v -c 10 -t 10000 -m 1
  transaction type: SELECT only
  scaling factor: 1
  number of clients: 10
  number of transactions per client: 10000

  number of transactions actually processed: 100000/100000
  tps = 11486.270576 (including connections establishing)
  tps = 11991.783709 (excluding connections establishing)

$ ./pgbench -S -n -v -c 1000 -t 10 -m 1
  Connection to database '' failed.
  pglb could not connect to server: no cluster available.
$ ./pgbench -S -n -v -c 10 -t 10 -m 1
  Connection to database '' failed.
  pglb could not connect to server: no cluster available.
or
  Sorry, backend connection is full

After this state occurs, I need to kill off pglb and restart it, and sometimes this doesn't fix it (and I need to go through and restart the replication servers and the database servers).
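
Until pglb handles this more gracefully, a test harness can refuse to launch more clients than the pool holds. A trivial sketch using the numbers from my testbed (2 database servers, 300 connections each); the helper names are made up.

```ruby
# Total pooled connections pglb can hand out across all db servers.
def pool_capacity(db_servers, max_connections_per_server)
  db_servers * max_connections_per_server
end

# True when the requested client count stays within the pool.
def clients_fit?(requested, db_servers, max_connections_per_server)
  requested <= pool_capacity(db_servers, max_connections_per_server)
end

puts pool_capacity(2, 300)
puts clients_fit?(10, 2, 300)
puts clients_fit?(1000, 2, 300)
```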

In conclusion:

I actually have 3 distinct pgclusters going here, each containing 6 of the aforementioned servers, for a total of 18 servers:

  • one dev cluster
  • one qa cluster
  • one production cluster

My question to the pgcluster list is: what version of pgcluster is "stable" enough to be used in a production environment?

I'd rather not need to go the direction of Slony-I with something like pgpool or dbbalance (to shunt writes to the master), first due to the complexity of managing these layers, and secondly due to the data coherency lost between the master and slaves (I want atomic synchronous replication).

Then again, this is all for hosting a Ruby on Rails application. We can make application changes as needed.

I hope this helps.

Tue, 06 Jun 2006

Subversion is a wonderful revision control system, until it breaks.

Historically, the BDB store was often corrupted, and a "svnadmin recover" was required to rebuild the Berkeley databases.

With newer versions of Subversion, FSFS has taken over as the preferred repository store. Faster and much more reliable than BDB, it has made Subversion far more stable than it has been in the past.

Unfortunately, FSFS appears to continue to have rare random corruption issues. Subversion developers are actively working on tracking down the cause, but it remains elusive at the moment.

Though much rarer than its BDB counterpart, repairing a corrupt FSFS store isn't as simple as running "svnadmin recover" - recover only works with BDB.

Recently, we stumbled upon such a case. The symptoms look much like the problem described in the FAQ's apr 0.9.6 entry, with one difference: local file:// checkouts errored out just the same. As the FAQ's solution is an Apache bug fix for poll(), and a local checkout really shouldn't be using poll() at all, it did not look like the same problem.

Sadly, my users were continuing to commit changes to their subtrees in the repository. As long as they didn't try checking out the corruption affected tree, they could continue doing their work.

Trying to do a "svnadmin verify" results in the same error as the checkout, and an "svnadmin dump" fails just the same (as dump and verify appear to be very similar in nature). Without the ability to dump, it is nigh impossible to back up the revision history to restore to a repository elsewhere. The only daunting alternative was to check out the various subtrees and hand-commit each revision checkout to another repository... I was surprised to find that an automated script to do such a thing hadn't been written yet.

Step 1 was to visit #svn on irc.openprojects.net and voice my concern at potentially finding an unanswered bug.

The folks on #svn immediately replied "go to #svn-dev and be prepared to back your claims".

Step 2 then was to visit #svn-dev and lay out the above facts once again.

In the end, the developers suggested I send an email with all of the facts to the subversion user mailing list.

So I put the IRC chat together with the facts and posted "Repository corruption? Problem similar to FAQ#tiger-apr-0.9.6" to the subversion user list.

John Szakmeister soon replied with a suggestion to try:

http://www.szakmeister.net/fsfsverify/

This python script verifies the transactions in a given fsfs revision, and potentially repairs some of the more common problems found.

When run, the following error presented itself:


$ fsfsverify.py db/revs/653
...
NodeRev Id: 1st.g.r653/17936924
 type: file
 pred: 1st.g.r611/34703561
 text: DELTA 653 1668558 16257066 24194048 c0bd2a8b7ee4db1ee816ea607392755d
 prop: UNKNOWN 405 9727810 53 0 113136892f2137aa0116093a524ade0b
 cpath: /cpp/Client/IE/Project/src/Observer.ncb
 copyroot: 178 /cpp/Client/IE
starting length: 16257062
offset: 1668583
Decoded too many bytes
total: 14384890
remaining: 1872172
Traceback (most recent call last):
  File "/root/fsfsverify.py", line 699, in ?
    process(noderev, rev_file, options.dump_instructions,
options.dump_windows)
  File "/root/fsfsverify.py", line 652, in verify
    dump_windows)
  File "/root/fsfsverify.py", line 289, in verify
    digest = parse_svndiff(f, self.length, dump_instructions, dump_windows)
  File "/root/fsfsverify.py", line 188, in parse_svndiff
    raise 'svndiff error'
svndiff error

John mentioned that Andrew MacKenzie had also reported a similar issue, but he would need a copy of the revision to verify that it was the same problem.

Attempting to use fsfsverify.py's "-f" or "--fix-read-length-line-error" option didn't seem to change anything at all.

I posted the revision somewhere John could access it.

After looking at it, John thinks this may be a new kind of corruption, and he'll work something into fsfsverify.py to fix it when he can find the time.

In the interim, John suggested truncating the file in the node revision using the following command:


fsfsverify --truncate=1st.g.r653/17936924 653

"This command will basically truncate the file to 0 length in that revision." - John.

Hopefully others will find this blog post along their search for an FSFS corruption fix and consider this potential "fix".

I'm not entirely sure what the data loss will entail, but it may very well solve your immediate problems dumping and restoring the repository elsewhere.

Note: always do an "svnadmin hotcopy" to make a test repository to test on, and immediately take a backup if you can.
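
A sketch of that hotcopy-then-verify routine, for scripting it. It only builds and prints the svnadmin commands; the repository paths are placeholders, and actually running them is left as a commented-out line.

```ruby
require 'shellwords'

# Builds (but does not run) the svnadmin commands that copy a repository
# and then verify the copy, so experiments never touch the original.
def hotcopy_commands(repo_path, copy_path)
  [
    ['svnadmin', 'hotcopy', repo_path, copy_path],
    ['svnadmin', 'verify', copy_path]
  ]
end

# Placeholder paths; substitute your own repository and scratch location.
commands = hotcopy_commands('/srv/svn/project', '/tmp/project-test')
commands.each { |cmd| puts Shellwords.join(cmd) }

# To execute for real, stopping at the first failure:
# commands.each { |cmd| system(*cmd) or abort("#{cmd.first(2).join(' ')} failed") }
```

On a corrupt repository the verify step is expected to fail on the copy just as it does on the original; the point is that any repair attempt then happens on the copy.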
