An Open Access Peon

26 October 2011

Ubuntu 11.10 on Sun X4540 ("Thumper")

I installed Ubuntu 11.10 Server (x64) using the Java ILOM client - note that you must use a 32-bit Java client to attach a CD image. I set up an MD RAID1 across the bootable devices of controllers 0 and 1 (the first disk on each).

Post-install I encountered a blank, black screen at the GRUB 2 stage (i.e. after the kernel selection screen). To fix this, edit the boot parameters on the kernel selection screen:

set gfxpayload=text
...
linux [...] rootdelay=90

To make these changes permanent after booting:

sudo vi /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="rootdelay=90"
GRUB_GFXMODE=text

sudo update-grub

(reboot to check everything worked correctly)

Installing native ZFS and setting up a ZFS pool

sudo apt-get install python-software-properties
sudo add-apt-repository ppa:zfs-native/stable
sudo apt-get update
sudo apt-get install ubuntu-zfs

I created a small Perl script to set up a zpool over the remaining 46 disks. The scheme I use is 4 hot spares (the first disk of each of the other 4 controllers) plus 7 raidz groups, each built from one disk per controller.

Usage:

Warning! This will destroy any data you have on the disks in your system. Only use this if you *really* know what you're doing.

sudo perl [script.pl] --create

The source code for the script:

#!/usr/bin/perl

use Getopt::Long;

use strict;
use warnings;

my $usage = "$0 [--dry-run --create --destroy --dump]";

GetOptions(
	'dry-run' => \(my $dry_run),
	'create' => \(my $opt_create),
	'destroy' => \(my $opt_destroy),
	'dump' => \(my $opt_dump),
	'help' => \(my $opt_help),
) or die "$usage\n";

die "$usage\n" if $opt_help;

# map [controller][target] to a device name using the kernel boot log
my @DISKS;

open(my $fh, "<", "/var/log/dmesg") or die "Error opening dmesg: $!";
while(<$fh>) {
	next if $_ !~ /Attached SCSI disk/;
	s/^\[[^\]]+\]\s*//;
	die if $_ !~ /sd\s+(\d+):0:(\d+):0:\s+\[(\w+)\]/;
	my( $c, $t, $dev ) = ($1, $2, $3);
	$DISKS[$c][$t] = "/dev/$dev";
}
close($fh);

# drop leading controller slots that have no disks
shift @DISKS while @DISKS && !defined $DISKS[0];
die "No disks found in dmesg\n" if !@DISKS;

if( $opt_dump ) {
	foreach my $i (0..$#DISKS) {
		next if !defined $DISKS[$i];
		print "Controller $i:\n";
		foreach my $j (0..$#{$DISKS[$i]}) {
			next if !defined $DISKS[$i][$j];
			print "\t$j\t$DISKS[$i][$j]\n";
		}
	}
}

# system disks
$DISKS[0][0] = undef;
$DISKS[1][0] = undef;

# hot spares: the first disk of every other controller
my @spares;
foreach my $j (2..$#DISKS) {
	my $disk = $DISKS[$j][0] or die "Missing disk at $j:0\n";
	push @spares, $disk;
}

# one raidz per target, taking one disk from every controller
my @pools;
foreach my $i (1..$#{$DISKS[0]}) {
	foreach my $j (0..$#DISKS) {
		$pools[$i - 1][$j] = $DISKS[$j][$i] or die "Missing disk at $j:$i\n";
	}
}

for(@pools) {
	$_ = "raidz @$_";
}

if( $opt_destroy ) {
	cmd("zpool destroy zdata");
}
if( $opt_create ) {
	cmd("zpool create -f zdata @pools spare @spares");
}

sub cmd {
	my( $cmd ) = @_;

	print "$cmd\n";
	system($cmd) if !$dry_run;
}
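
As a cross-check of the layout described above, here is a minimal Python sketch (separate from the Perl script, and touching no disks) that enumerates the same scheme. It assumes 6 controllers with 8 disks each, which is what the numbers above imply; the function name and the (controller, target) tuples are only illustrative.

```python
# Assumed geometry: 6 controllers x 8 targets = 48 disks.
CONTROLLERS = 6
TARGETS = 8


def plan_layout(controllers=CONTROLLERS, targets=TARGETS, system_controllers=(0, 1)):
    """Return (spares, raidz_groups) as lists of (controller, target) tuples."""
    # Target 0 on the system controllers holds the MD RAID1 boot mirror,
    # so only the other controllers contribute their first disk as a spare.
    spares = [(c, 0) for c in range(controllers) if c not in system_controllers]
    # One raidz group per remaining target, taking one disk from each controller.
    groups = [[(c, t) for c in range(controllers)] for t in range(1, targets)]
    return spares, groups


spares, groups = plan_layout()
print(f"{len(spares)} spares, {len(groups)} raidz groups of {len(groups[0])} disks")
# prints "4 spares, 7 raidz groups of 6 disks"
```

Spreading each raidz group across all controllers means losing one controller costs every group a single disk, which raidz can tolerate.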

05 July 2011

IPv6-Only Ubuntu 10.04 LTS

To bring up eth0 at start-up without a configured IPv4 address, edit /etc/network/interfaces:


auto eth0
iface eth0 inet manual
up ifconfig eth0 up
# iface eth0 inet dhcp


To use Google's public IPv6 DNS servers, add to /etc/resolv.conf:


domain localdomain
search google.com
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
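
On an IPv6-only host it is worth confirming that every nameserver entry really is an IPv6 address. The following is a small Python sketch of such a check; it parses a copy of the text above rather than the live file, and the function name is just illustrative.

```python
import ipaddress

# A copy of the resolv.conf content shown above.
RESOLV_CONF = """\
domain localdomain
search google.com
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
"""


def nameservers(text):
    """Return the nameserver addresses found in resolv.conf-style text."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            # ip_address() raises ValueError if the address is malformed.
            servers.append(ipaddress.ip_address(fields[1]))
    return servers


servers = nameservers(RESOLV_CONF)
print(all(server.version == 6 for server in servers))
# prints "True"
```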

11 February 2011

libxml2 and libxslt supported XPath functions

The following functions are supported by the libxml2 library for use in XPath statements and hence supported in libxslt for use in transforms. For more information see xpath.c in the libxml2 source.

Note: string length is 'string-length' and not just 'length'!

last()
position()
count()
id()
local-name()
namespace-uri()
string()
string-length()
concat()
contains()
starts-with()
substring()
substring-before()
substring-after()
normalize-space()
translate()
not()
true()
false()
lang()
number()
sum()
floor()
ceiling()
round()
boolean()

19 January 2011

CMIS vs Google Documents API vs SWORD

The Atom protocol is a very simple mechanism for publishing news feeds - that is, date-ordered small bits of information. An Atom feed is a collection of Atom entries. Each entry contains some basic metadata (title, id) and may have links to other resources. Links of particular interest are 'edit' and 'edit-media' which, respectively, refer to the entry's metadata and media file.

<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<title>The Beach</title>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2005-10-07T17:17:08Z</updated>
<author><name>Daffy</name></author>
<summary type="text" />
<content type="image/png"
src="http://media.example.org/the_beach.png"/>
<link rel="edit-media"
href="http://media.example.org/edit/the_beach.png" />
<link rel="edit"
href="http://example.org/media/edit/the_beach.atom" />
</entry>


The Atom Publishing Protocol (or AtomPub) defines how to add new entries (i.e. to publish to feeds). AtomPub uses the HTTP POST, PUT and DELETE methods to, respectively, create, update or delete entries.

To create an entry the client POSTs an Atom entry to the feed's URL. To update an entry the client PUTs an Atom entry to the entry's URL (replacing anything already there). And lastly, DELETEing an Atom entry URL destroys that entry. The protocol specification (RFC 5023) is quite readable, so I suggest reading it if you're lost!
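
The three operations can be sketched with Python's standard library. This is only an illustration: the example.org URLs and the function names are made up, and the requests are built but never sent.

```python
from urllib.request import Request

# A small Atom entry, modelled on the example above.
ATOM_ENTRY = b"""<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>The Beach</title>
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <updated>2005-10-07T17:17:08Z</updated>
  <author><name>Daffy</name></author>
  <content type="text">Sand and sea</content>
</entry>"""

HEADERS = {"Content-Type": "application/atom+xml;type=entry"}


def create_entry(feed_url, entry_xml):
    """POST an entry to the feed URL to create it."""
    return Request(feed_url, data=entry_xml, headers=HEADERS, method="POST")


def update_entry(entry_url, entry_xml):
    """PUT an entry to its own URL, replacing what is there."""
    return Request(entry_url, data=entry_xml, headers=HEADERS, method="PUT")


def delete_entry(entry_url):
    """DELETE the entry URL to destroy the entry."""
    return Request(entry_url, method="DELETE")


# Hypothetical URLs, for illustration only.
prepared = [
    create_entry("http://example.org/feed", ATOM_ENTRY),
    update_entry("http://example.org/feed/1.atom", ATOM_ENTRY),
    delete_entry("http://example.org/feed/1.atom"),
]
for request in prepared:
    print(request.get_method(), request.full_url)
```

Passing each request to urllib.request.urlopen() would actually send it. In practice the URL to PUT or DELETE is taken from the entry's 'edit' link.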

AtomPub is sufficient if you just want to post small entries in XML but often the client wants to e.g. publish a photo, which the new Atom entry will then refer to. There are several approaches to this but the simplest is to use the Atom Multipart Media Resource Creation mechanism, which bundles together the Atom entry and the media file into a single POST.
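
To give a feel for what such a bundled POST body looks like, here is a rough Python sketch. It assumes the bundle is a multipart/related message with the Atom entry as the first part and the media as the second; the function name is hypothetical, and the part encoding is simply whatever Python's email library produces by default, which may not be what a particular server expects.

```python
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart


def build_media_post(entry_xml, image_bytes, image_subtype):
    """Bundle an Atom entry and an image into one multipart/related message."""
    package = MIMEMultipart("related", type="application/atom+xml")
    # First part: the Atom entry describing the media.
    package.attach(MIMEApplication(entry_xml, "atom+xml"))
    # Second part: the media file itself.
    package.attach(MIMEImage(image_bytes, _subtype=image_subtype))
    return package


entry = b'<entry xmlns="http://www.w3.org/2005/Atom"><title>The Beach</title></entry>'
# Placeholder bytes standing in for a real PNG file.
package = build_media_post(entry, b"\x89PNG placeholder", "png")
for part in package.get_payload():
    print(part.get_content_type())
```

The serialised message would then be sent as the body of a single POST to the feed's URL.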

Atom/AtomPub provides us with a fairly simple tool to publish items onto a Web site. As Institutional Repository (IR) developers we, unfortunately, require a more complex model than just a feed of entries containing one file each. We have more complex metadata and multiple files making up an object. An editorial workflow means items uploaded by users must first be checked by editors before they can be published. There are various other aspects to consider that I won't go into here.

So we like the simplicity of Atom/AtomPub but it doesn't fulfil all of our requirements. Fortunately it is easy to extend AtomPub by injecting additional links and metadata into entries. These links can connect to other URLs that allow complex manipulations to be made on the underlying data structure (hence also to create more complex data structures). OASIS CMIS, SWORD and the Google Documents API are all extensions of AtomPub better known as "AtomPub Profiles". (I'm sure there are others but these are the obvious candidates for IR use.)

OASIS Content Management Interoperability Services (CMIS) is over 200 pages long but, in part, describes an AtomPub profile. I concur with the sentiment here that being asked to implement CMIS won't make your developers happy. The model underpinning CMIS has a hierarchical folder structure. By supplying a special tag in a POST to a feed an Atom entry is created that points to another Atom feed (or 'folder'). In this way Atom entries are effectively typed to be either a 'document' or a 'folder'. Atom entries can be moved to other folders by POSTing them to that folder's feed. There is lots more that CMIS adds in, to the extent that I forget what's at the beginning before I get to the end!

root feed
 |
 |-- document entry
 |
 |-- document entry
 |
 |-- folder entry
      |
      |-- folder feed
           |
           |-- document entry
           |
           |-- folder entry
                ...
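
The folder model can be pictured with a toy in-memory structure. This Python sketch is not CMIS itself: the class and function names are invented for illustration, and "posting" here just appends to a list rather than making an HTTP request.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Feed:
    title: str
    entries: List["Entry"] = field(default_factory=list)

    def post(self, entry):
        """Model a POST to this feed: the entry becomes a member of it."""
        self.entries.append(entry)
        return entry


@dataclass
class Entry:
    title: str
    # A folder entry points at a child feed; a document entry has none.
    children: Optional[Feed] = None

    @property
    def kind(self):
        return "folder" if self.children is not None else "document"


def move(entry, source, target):
    """Model moving an entry by posting it to another folder's feed."""
    source.entries.remove(entry)
    target.post(entry)


def render(feed, depth=0):
    """Return the feed hierarchy as indented lines of text."""
    lines = []
    for entry in feed.entries:
        lines.append("  " * depth + f"{entry.kind}: {entry.title}")
        if entry.children is not None:
            lines.extend(render(entry.children, depth + 1))
    return lines


root = Feed("root")
report = root.post(Entry("report"))
archive = root.post(Entry("archive", children=Feed("archive")))
move(report, root, archive.children)
print("\n".join(render(root)))
```

After the move, the root feed holds only the 'archive' folder entry, and 'report' is listed inside the folder's feed.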


The Google Documents API is in a different sphere to CMIS and SWORD. It is specific to Google's services, so would need tweaking to be used elsewhere. As with CMIS, special syntax passed during a POST to a feed can create a folder-type Atom entry. This entry then points to a new feed which can in turn contain a mix of folder entries and normal entries. Google support a number of parameters to modify the default behaviour of a URL, for instance downloading a document in a different format.

SWORD is an AtomPub profile developed to support deposit in IRs. SWORD v1 adds several HTTP headers to support more complex publishing behaviour. Often in the repository world one user will be depositing on behalf of another (doctoral student depositing her supervisor's old papers ...). To support this SWORD added the X-On-Behalf-Of header, which supplies the username of the user to deposit as - assuming the current user has permission to do that. Another part of SWORD v1 was to support more complex objects (i.e. multiple files) by defining 'packages'. A package is a collection of files and metadata archived together into a single file; the client publishes the package and the server unpacks it to create the complex object. SWORD v2 (at time of writing) will look similar to the previous version but will define means for clients to interact with the packages after upload. OAI-ORE is used to describe the unpacked complex object, while content negotiation will allow clients to retrieve the complex object in an agreed format.

repository feed
|
|-- document entry
|
|-- document entry
| |
| |-- OAI-ORE/RDF
|
|-- document entry
...
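
A deposit on behalf of another user, as described above, might be prepared like this in Python. Only the X-On-Behalf-Of header comes from the description above; the URL, the username, the function name and the placeholder package bytes are all made up, and the request is built but not sent.

```python
from urllib.request import Request


def deposit_on_behalf_of(collection_url, package_bytes, owner):
    """Prepare a POST of a zipped package, deposited as another user."""
    headers = {
        "Content-Type": "application/zip",
        # The username to deposit as; the server decides whether the
        # authenticated user is allowed to do this.
        "X-On-Behalf-Of": owner,
    }
    return Request(collection_url, data=package_bytes, headers=headers, method="POST")


# Hypothetical collection URL and username; the bytes stand in for a real .zip.
deposit = deposit_on_behalf_of(
    "http://example.org/sword/collection",
    b"PK placeholder package",
    "supervisor",
)
# urllib normalises header names, so look the header up in that form.
print(deposit.get_method(), deposit.get_header("X-on-behalf-of"))
```

The student would authenticate as herself in the usual way; the extra header only names the account the deposit should belong to.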


All three AtomPub Profiles probably work with a client speaking just AtomPub. The question that is left is which extension of AtomPub is best adopted to achieve our goals. I don't think any of these protocols is entirely satisfactory: SWORD feels like it is working around AtomPub rather than building on it (publishing .zip files?). It isn't clear what IPR Google's profile has nor whether they will take it in a different (incompatible) direction to what we need. CMIS, given its industrial backing, will likely be essential in the corporate environment but is daunting in its complexity (and that normally means difficult to get right in practice).

Regrettably my influence over any of these profiles is small - as developers we tend to be pushed more by political requirements ("you must support X") than technical merit. I just hope that, given the narrow range these profiles exist in, they adopt the best bits of each other! (NB I would be interested to hear of any other potential AtomPub profiles.)

03 December 2010

Ubuntu Maverick 10.10 on Acer Revo 3700

These are some rough notes on what I needed to get Ubuntu Maverick 32-bit working on the Acer r3700.

wireless

While the kernel rt2860pci driver runs the wireless OK, it causes a hard lock when it is unloaded (e.g. during shutdown). Thanks to Wolfgang Kufner and Marcus Tisoft for providing a solution - replace the kernel driver with a patched driver from Ralink for the rt3090 (comment #9): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/662288.

sound over hdmi (stereo only tested)

Unmute all digital outputs in Alsa (use right cursor + 'M' until they are all green):
alsamixer -c 1
sudo alsactl store


Testing alsa:
speaker-test -D plughw:1,7


Setting pulseaudio to output via alsa:
sudo gedit /etc/pulse/default.pa


Uncomment and modify the line containing module-alsa-sink:
  load-module module-alsa-sink device=hw:1,7


suspend

Fails to suspend ... haven't found a solution yet.

09 June 2010

Enabling support for HTTPS in wkhtmltopdf

If you get the following error when using wkhtmltopdf to take a thumbshot of an HTTPS site on Fedora Core 13:

QSslSocket: cannot call unresolved function SSLv3_client_method
QSslSocket: cannot call unresolved function SSL_CTX_new
QSslSocket: cannot call unresolved function SSL_library_init
QSslSocket: cannot call unresolved function ERR_get_error
QSslSocket: cannot call unresolved function ERR_error_string


You need to add some additional library links. As root do:

cd /usr/lib64
ln -s libssl.so.10 libssl.so
cd /lib64
ln -s libcrypto.so.1.0.0 libcrypto.so


Hopefully that will fix the problem!

NB I resolved this by using strace to find out where wkhtmltopdf was attempting to find libcrypto and libssl (the problem being the static wkhtmltopdf build was looking for different versions than are installed on FC13):

strace wkhtmltopdf https://mail.google.com/ gmail.pdf 2>&1 | less
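
Rather than paging through the whole trace, the failed library lookups can be filtered out. The Python sketch below does that for strace-style lines; the sample lines and paths are made up for illustration (they are not captured output), and the function name is hypothetical.

```python
import re

# Matches failed open()/openat() calls as strace prints them, e.g.
#   open("/some/path", O_RDONLY) = -1 ENOENT (No such file or directory)
FAILED_OPEN = re.compile(r'open(?:at)?\((?:[^"]*, )?"([^"]+)".*= -1 ENOENT')


def missing_libraries(strace_lines, names=("libssl", "libcrypto")):
    """Return the paths of the named libraries that could not be opened."""
    missing = []
    for line in strace_lines:
        match = FAILED_OPEN.search(line)
        if match and any(name in match.group(1) for name in names):
            missing.append(match.group(1))
    return missing


# Illustrative sample lines, not real output from wkhtmltopdf.
SAMPLE = [
    'open("/usr/lib64/libssl.so", O_RDONLY) = -1 ENOENT (No such file or directory)',
    'open("/lib64/libc.so.6", O_RDONLY) = 3',
    'open("/lib64/libcrypto.so", O_RDONLY) = -1 ENOENT (No such file or directory)',
]
print(missing_libraries(SAMPLE))
# prints "['/usr/lib64/libssl.so', '/lib64/libcrypto.so']"
```

To use it on a real trace, save the strace output to a file and pass the file's lines to missing_libraries().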

20 May 2010

Ubuntu 10.04 vnc-based login server

This recipe is for setting up a VNC login server. This allows you to use a VNC client to access a full GUI on a remote server. If instead you want to get VNC access to your desktop (or share with other users) you need to enable remote desktop.

VNC connections are not encrypted so if you connect directly to the VNC server any login details will be sent in the clear.

Install the required packages:

sudo apt-get install vnc4server xinetd gdm


Restrict GDM to only listening to localhost by adding the following to /etc/hosts.allow:

gdm: ip6-localhost


Enable XDMCP in GDM by setting up /etc/gdm/custom.conf as:

# GDM configuration storage

[daemon]

[security]

[xdmcp]
Enable=true
HonorIndirect=false
# following line fixes a problem with login/logout
DisplaysPerHost=2

[greeter]

[chooser]

[debug]


Create a new xinetd service /etc/xinetd.d/Xvnc (adjust geometry to get different screen sizes):

service Xvnc
{
type = UNLISTED
disable = no
socket_type = stream
protocol = tcp
wait = no
user = nobody
server = /usr/bin/Xvnc
server_args = -inetd -query ip6-localhost -geometry 1280x800 -depth 16 -cc 3 -once -SecurityTypes=none
port = 5901
}


Restart gdm (which will close any current logins!) and xinetd:

sudo service gdm restart
sudo /etc/init.d/xinetd restart


You can then connect to the VNC server using:

vncviewer localhost:5901