
Django OpenID: Invalid openid.mode: u'i'

From: andrew cooke <andrew@...>

Date: Tue, 31 Aug 2010 07:23:08 -0400

If you unpack the Django OpenID library you can run a demonstration in Django
by doing:

  cd examples/djopenid
  PYTHONPATH=../.. python manage.py runserver

The PYTHONPATH includes the OpenID library itself, so you don't need that if
you've actually installed the library (using setup.py or easy_install).

However, if you do that, and then authenticate with an ID, you'll see the
error:

  OpenID authentication failed.
  Invalid openid.mode: u'i'

This is an error in the demo, not the library.  It's very easy to fix.  Just
edit the file

  examples/djopenid/util.py

and on line 136 change

  return dict((k, v[0]) for k, v in request_data.iteritems())

to

  return dict((k, v) for k, v in request_data.iteritems())

I assume that MultiDict.iteritems() changed at some point, and now returns a
single value when before it returned a list.
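
To see why the message shows u'i', here is a minimal sketch of the two cases.
It assumes the request values used to be lists and are now plain strings (my
guess above, not something I have checked in the library), and it uses
items() so that it runs on Python 3; the function names are made up for the
illustration:

```python
def flatten_broken(request_data):
    # The demo's original expression: takes element 0 of every value.
    return dict((k, v[0]) for k, v in request_data.items())


def flatten_fixed(request_data):
    # The corrected expression: keeps each value as it is.
    return dict((k, v) for k, v in request_data.items())


# Assumed shapes of the request data, before and after the change.
list_values = {'openid.mode': ['id_res']}
single_values = {'openid.mode': 'id_res'}

old_result = flatten_broken(list_values)
broken_result = flatten_broken(single_values)
fixed_result = flatten_fixed(single_values)
```

With a list, v[0] picks out the whole value; with a string it picks out only
the first character, so the mode "id_res" turns into "i".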

Andrew

Permalink

Previous Entries

For comments, see relevant pages (permalinks).

Good Intro to LVM

From: andrew cooke <andrew@...>

Date: Sat, 28 Aug 2010 03:55:03 -0400

http://www.ntlug.org/Articles/LVM

Andrew

Permalink

A Chilean Day

From: andrew cooke <andrew@...>

Date: Fri, 27 Aug 2010 18:23:24 -0400

[This article is awesome, BTW -
http://www.nytimes.com/2010/08/29/magazine/29language-t.html ]


Recently I have been spending more time than usual thinking about how
"Chilean" I have become.  Today was a particularly Chilean day, and I enjoyed
it.

First, I needed to go to the centre of town, to visit the Foreign Office, to
get a document (my PhD certificate) officially certified (so that I can apply
for a driving licence, which requires evidence that you have attended primary
school).

So I visited the Foreign Office, queued, waited, was called to talk to a lady,
apologised for not having a clue what I was doing, and explained what I
needed.  She carefully listed the steps involved, gave me a piece of paper
with the address of the Ministry of Justice (I am using the closest English
translations here), and sent me on my way.  I found a Notary (something that
doesn't really exist in the UK - they are a kind of witness for simpler legal
paperwork) and got a copy of my document made.  While I was there a younger
man, in a suit, kind-of jumped in (politely) with the woman serving me - it
turned out that he had just qualified as a lawyer and needed a copy of his
(very fancy) signed certificate.  So the notary's assistant and I
congratulated him and, when he went to pay, we discussed how it was expensive
to become a lawyer, but how now he could also earn more money.  Then, after a
short while, the assistant took copies of our documents into the office of
the notary, who signed them, and we left.

Next, to the Ministry of Justice, which was almost deserted, but which had
Mapuche (and English!) signs.  There, I apologised again for not understanding
what was happening, and someone else signed the copy.  After that, I returned
to the Foreign Office, queued again, and a man quickly checked the
signatures and stamped my piece of paper.

In all this, no-one checked my identity, or translated the text.  And the only
place I had to pay was at the notary, which cost a pound.  But I now have a
"legalised" copy of my certificate, which is the correct piece of paper to
apply for a driving licence.


I got back on the metro and headed for home, but got off a few stops early
because I needed to buy some paint.  A few days ago I had visited a paint
shop, which I found almost by chance, near the driving school where I have been
taking lessons (I can drive, but need the practice before taking my test and,
as far as I can work out, practising in a private car is illegal - there is
no "learner's licence" here, for example).

The first time I visited this shop the friendly attendant had carefully mixed
some paint to match a fragment that had been chipped from our wall by the
workman ("maestro" in Spanish, which is much nicer) who was repairing our
earthquake damage.  It had taken him three or four attempts, adding
progressively more dye to a pot of white paint, until he had the perfect
match.  But at the time I had only bought a litre; now I needed more.

The second time I visited the shop, someone else was attending, and he was no
help at all (I had a different problem, which I finally solved by chatting to
the owner and a customer at the hardware store near our house).

Luckily, on my third visit, it was the helpful guy (I was planning to leave if
not).  So I greeted him and explained I needed another pot of paint.  Since
he hadn't recorded the mix, he started again, adding colours to white.  At one
point he had a problem with a dirty dye container, so had to start again with
a new pot.

While he was mixing the paint a policeman stopped in the middle of the
crossroads outside, surrounded by barking stray dogs.  At first I thought he
was angry with the dogs, but it turned out that a march of striking students
was coming down the road and he needed to stop the traffic.  The students
"marched" past, with various banners, chanting the same chants I have heard
before, without ever thinking what they mean (perhaps the rhythm stays the
same and only the words change?).  A girl pushed a leaflet into my hand.
They were striking against the forced "curing" of homosexuals.

I went back into the paint shop.  I wanted to ask the attendant if it was
really still common to force "treatment" on homosexuals, but I suspected he
would think it a good thing.  And so I said nothing while he complained about
the students (I think) and continued to mix paint.

Some "tough" looking guys came in and shouted at the attendant for attention.
He ignored them.  They tried again, in a more polite tone.  He ignored them a
bit more, drying a sample of my paint with a hair-dryer, and then asked them
what they wanted.  It turned out that they needed a specific kind of paint
roller that he didn't have, so he sent them to a second shop, round the
corner.

Then he came to the window and showed me the sample, which was almost the
same colour as my fragment, but not quite.  So he went back and added a
little more dye.

When the paint was ready I bought some extra plaster I needed, to help cover
the price of the failed can, went back to the metro, and went home.


When I arrived home the maestro was on the scaffold outside.  He was surprised
at the extra paint, and had "stretched" what he had to paint the wall.  But
with the extra he could also paint an adjacent surface, which would look
better (and, indeed, the final result is excellent - the wall looks almost
like new, and the "patch" looks more like it has been washed clean than
repainted).


Back inside, I started work.

When the maestro came in I went out to the hallway and we discussed the
plastering there - he had tried to texture it to match the existing work.  We
agreed it wasn't perfect, but that it would do, and that he'd start painting
after lunch.

Since he'd finished with the scaffolding I called Paulina to ask her to call
the company to take it away.  She soon replied to say that they couldn't
come until Monday.  That was a problem, as the scaffold is where her car is
normally parked, so I discussed it with the maestro, who commented that it was
just what he'd expected (he had argued with the person who delivered the
scaffolding and we'd eventually got more planks from the company).  But he
could disassemble it now, so Paulina could still bring the car home.

Back to work, with the maestro having a late lunch and writing a quote to do
the same work he had done for me for the rest of the building.  Much later he
appeared, asking if I could use my computer to print out what he'd written.  I
agreed and typed it in.  I added a few extra phrases from a quote I had
received for some earlier work (from this man's grandson - that work was how
I got to know him).  After installing a Spanish dictionary I got most of the
text OK (I am not sure whose spelling was worse), printed it out, and gave him
an envelope so he could deliver it to my neighbour.



After writing all that, I'm not sure if my point is clear or not.  Doing
things here involves "people" much more than in the UK.  And I am getting
better at doing that - to the point where it's as much a pleasure as a pain.
Although we address each other as "usted" the maestro is my friend, and I know
he will do a decent job.  Similarly, I "connected" with everyone I dealt with
today.

A slower, but more humane, world.

Andrew

Permalink

A Python Logging Service

From: andrew cooke <andrew@...>

Date: Sun, 22 Aug 2010 17:01:28 -0400

I've been looking at Twisted, which is a framework for cooperative
multi-tasking in Python.  I don't find that a very useful description, so here
are two alternatives:

1 - It's a way of structuring concurrent programs that's a lot more like
    Javascript or GUI toolkits.

2 - It's a way of writing network servers that work efficiently without using
    multiple threads.


There's a fair amount of documentation at
http://twistedmatrix.com/documents/current/core/ (and most importantly at
http://twistedmatrix.com/documents/current/core/howto/index.html ) - I suggest
reading through that until it sticks.  It took me a while, and writing the
code below, but now it makes a lot of sense (and it seems like a very nicely
engineered system).


I structured the example below as a set of different files, which was probably
excessive, but I wanted the different components to be as clear as possible.

Python logging can be serialised over a socket.  This code is a server that
receives serialised messages and writes them to a log.


First, the protocol:

  from cPickle import loads
  from logging import makeLogRecord, getLogger
  from struct import unpack
  from twisted.internet.protocol import Protocol, connectionDone

  '''
  The protocol for a Twisted server that receives log messages.

  See http://docs.python.org/library/logging.html#socket-handler 
  '''

  class LoggingProtocol(Protocol):

      def dataReceived(self, data):
          self.__data += data
          while True:
              if not self.__message_len and len(self.__data) >= 4:
                  # unpack length prefix
                  self.__message_len = unpack(">L", self.__data[:4])[0]
                  self.__data = self.__data[4:]
              if self.__message_len and len(self.__data) >= self.__message_len:
                  # unpack message
                  record = makeLogRecord(
                      loads(self.__data[0:self.__message_len]))
                  self.__data = self.__data[self.__message_len:]
                  self.__message_len = 0
                  logger = getLogger(record.name)
                  logger.handle(record)
              else:
                  break

      def connectionMade(self):
          self.__data = ''
          self.__message_len = 0

      def connectionLost(self, reason=connectionDone):
          self.__data = None
          self.__message_len = None
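
The framing logic can be tried without Twisted.  The following is a minimal
sketch for Python 3 using only the standard library (the protocol above is
Python 2).  FrameBuffer and decode_record are names invented for this
illustration; SocketHandler.makePickle is the standard library method that
builds the bytes a SocketHandler would send (a 4-byte big-endian length
followed by a pickled dictionary), so no socket is opened here:

```python
import logging
import logging.handlers
import pickle
import struct


class FrameBuffer:
    """Accumulates bytes and returns complete length-prefixed payloads."""

    def __init__(self):
        self._data = b''

    def feed(self, chunk):
        """Add a chunk; return the payloads that are now complete."""
        self._data += chunk
        payloads = []
        while len(self._data) >= 4:
            (length,) = struct.unpack('>L', self._data[:4])
            if len(self._data) < 4 + length:
                break  # wait for the rest of the message
            payloads.append(self._data[4:4 + length])
            self._data = self._data[4 + length:]
        return payloads


def decode_record(payload):
    """Turn one pickled payload back into a LogRecord."""
    return logging.makeLogRecord(pickle.loads(payload))


# Build the bytes that would go over the wire (nothing is connected).
handler = logging.handlers.SocketHandler('localhost', 2000)
source = logging.LogRecord('test', logging.WARNING, 'example.py', 1,
                           'a warning', (), None)
wire = handler.makePickle(source)

# Deliver the bytes in two arbitrary pieces, as TCP might.
frames = FrameBuffer()
first = frames.feed(wire[:7])
second = frames.feed(wire[7:])
records = [decode_record(payload) for payload in first + second]
```

The first call returns nothing, because only part of the message has arrived;
the second returns the complete payload.  This is the same reason
dataReceived above has to keep a buffer between calls.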


Next, the factory (ie a protocol factory):

  from logging.config import dictConfig
  from twisted.internet.protocol import Factory

  from log.protocol import LoggingProtocol

  '''
  A factory for the remote Python logger.

  This seems to be the best location to store configuration information because
  it is accessible both in tests (using a reactor) and to an application.
  '''

  class LoggingFactory(Factory):

      protocol = LoggingProtocol

      DEFAULT_PORT = 2000
      DEFAULT_CONFIG = {'version': 1,
                        'handlers':
                          {'file':
                            {'class': 'logging.FileHandler',
                             'filename': 'logging-service.log',
                             'level': 'DEBUG'}},
                        'root':
                          {'level': 'DEBUG',
                           'handlers': ['file']}}

      def __init__(self, config_dict=None):
          if not config_dict:
              config_dict = self.DEFAULT_CONFIG
          dictConfig(config_dict)


And the service:

  from twisted.application.internet import TCPServer

  from log.factory import LoggingFactory

  '''
  A service for the remote Python logger.

  This is used by the application.
  '''

  class LoggingService(TCPServer):

      def __init__(self, port=None, config_dict=None, interface='0.0.0.0'):
          if not port:
              port = LoggingFactory.DEFAULT_PORT
          # old-style classes in twisted
          TCPServer.__init__(self, port, LoggingFactory(config_dict),
                             interface=interface)


This can then be made into an application (a daemon) that's run from the
command-line using a tool called "twistd":

  # You can run this .tac file directly with:
  #    twistd -ny service.tac

  from log.service import LoggingService
  from twisted.application import service

  application = service.Application("Logging application")
  LoggingService().setServiceParent(application)


Alternatively, for testing, the Factory can be used directly.  This test code
also gives a glimpse of how the reactor is used to schedule events (there's
also an abstraction for chaining callbacks called "Deferred"):

  from logging.config import dictConfig
  from logging import getLogger
  from multiprocessing.process import Process
  from tempfile import mkstemp
  from twisted.internet import reactor
  from unittest import TestCase

  from log.factory import LoggingFactory


  class LoggingTest(TestCase):
      '''
      Test the logging service by starting an instance, then firing up a 
      separate process that logs to the service.
      '''

      def test_logging(self):
          tick = Tick()
          (_fd, self.tmp) = mkstemp()
          process = Process(target=self.logging_process)
          factory = LoggingFactory({'version': 1,
                                    'handlers': {'file': {'class': 'logging.FileHandler',
                                                          'filename': self.tmp,
                                                          'level': 'DEBUG'}},
                                    'root': {'level': 'DEBUG',
                                             'handlers': ['file']}})
          reactor.listenTCP(factory.DEFAULT_PORT, factory)
          reactor.callLater(tick(), process.start)
          reactor.callLater(tick(), reactor.stop)
          reactor.run()
          fd = open(self.tmp)
          contents = fd.readlines()
          assert contents == ['a warning\n'], contents
          fd.close()

      def logging_process(self):
          dictConfig({'version': 1,
                      'handlers':
                        {'socket':
                          {'class': 'logging.handlers.SocketHandler',
                           'level': 'INFO',
                           'host': 'localhost',
                           'port': LoggingFactory.DEFAULT_PORT}},
                      'root':
                        {'level': 'INFO',
                         'handlers': ['socket']}})
          logger = getLogger('test')
          logger.debug('a debug') # discarded by "level: INFO" above
          logger.warn('a warning')


  class Tick(object):

      def __init__(self, increment=0.1):
          self.__increment = increment
          self.__time = 0

      def __call__(self, step=1):
          self.__time += step * self.__increment
          return self.__time
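
The way Tick spaces out the scheduled calls can be seen in isolation.  This
is only an analogy, written for Python 3: it uses the standard library's
sched module as a stand-in for reactor.callLater, so it shows the ordering
by delay but none of Twisted's networking.  The event names are made up:

```python
import sched
import time


class Tick(object):
    """Return steadily increasing delays, one increment apart."""

    def __init__(self, increment=0.1):
        self._increment = increment
        self._time = 0

    def __call__(self, step=1):
        self._time += step * self._increment
        return self._time


events = []
tick = Tick(increment=0.01)
first, second = tick(), tick()

scheduler = sched.scheduler(time.time, time.sleep)
# Registered out of order on purpose; the delays decide the running order.
scheduler.enter(second, 1, events.append, ('stop',))
scheduler.enter(first, 1, events.append, ('start process',))
scheduler.run()
```

Each call to tick() returns a slightly later time, so the events run in the
order the delays were generated in, whatever order they were registered in.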


Andrew

Permalink

Selenium Tests of Multiple Browser and OS Combinations

From: andrew cooke <andrew@...>

Date: Wed, 11 Aug 2010 16:07:57 -0400

This is a follow-up to my earlier post on Selenium at
http://www.acooke.org/cute/SeleniumWe0.html - there I gave a summary of
Selenium and the basic tools needed to run simple, single tests.

However, what is interesting me at the moment (because we need it at work) is
how to run the same test in several browsers, on different operating systems.
This requires two things:

1 - A way to distribute jobs across machines
2 - A way to make tests general (the tests in the first article specified
    the target browser).

Selenium provides these things via something called "Selenium Grid", although
much of the documentation for that focuses on running tests in parallel (for
speed) rather than on exploiting different environments.

A warning: the Selenium Grid documentation sucks.  I have spent a frustrating
couple of days getting this working.  Even the demo they provide to test the
system doesn't work.  And the system itself seems a bit limited and
unreliable.  But, unfortunately, I don't see anything better.


OK, so what is Selenium Grid?  It's three things:

A - A central hub that manages a collection of servers
B - A collection of servers (possibly on remote machines)
C - Library support for writing tests

(A+B) address (1) above and (C) addresses (2).  This is all packaged as a Java
distribution and run using ant-based scripts.


First, I want to describe how the distribution of tasks works, because I found
this far from intuitive.  It's important to understand that all the servers
(called "remote controls") are equivalent and dumb.  They don't "know" that
they are running on Windows, or can access the Opera browser, for example.
The only way that such information is made available to the system is through
the *environment*.

When you start a server you specify its environment.  This is a string, and
the standard form is something like "IE on Windows" or "Firefox on Linux".
That information is passed by the server to the central hub which uses that
(and only that) to route tests.

The hub is also contacted by the test.  The test requests a particular
environment (I'll address 2/C below) and the hub then routes the test to the
corresponding server.
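
As a toy model of that routing (this is not Selenium Grid code; the Hub class,
its methods and the port numbers are invented for the illustration), the hub
behaves like a lookup table keyed on the environment string:

```python
from collections import defaultdict


class Hub:
    """Toy model of routing by environment string."""

    def __init__(self):
        self._servers = defaultdict(list)

    def register(self, environment, address):
        # A server announces itself with an opaque environment string.
        self._servers[environment].append(address)

    def route(self, environment):
        # The string is matched exactly; nothing else is known about a server.
        servers = self._servers.get(environment)
        if not servers:
            raise LookupError('no remote control for %r' % environment)
        return servers[0]


hub = Hub()
hub.register('Firefox on Linux', ('10.2.0.0', 5555))
hub.register('IE on Windows', ('10.1.0.28', 5555))
target = hub.route('Firefox on Linux')
```

The point is that nothing checks the string against reality: a server
registered as "IE on Windows" will be sent IE tests whether or not it can run
them.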


At this point it's probably worth describing exactly how these things run.
This is how I start the hub and a local server on Linux (the first script uses
konsole because both log to stdout - although on Windows you don't see
anything, and I don't understand why):

  > cat startup-selenium.sh
  #!/bin/bash
  konsole --hold -e startup-hub.sh &
  sleep 5
  konsole --hold -e startup-rc.sh &

  > cat startup-hub.sh
  #!/bin/bash
  cd .../selenium-grid-1.0.6/
  ant launch-hub

  > cat startup-rc.sh
  #!/bin/bash
  cd .../selenium-grid-1.0.6/
  ant \
    -Dport=5555 \
    -Dhost=10.2.0.0 \
    -DhubURL=http://10.2.0.0:4444 \
    -Denvironment='Firefox on Linux' \
    launch-remote-control

You can see how the server's environment is defined as 'Firefox on Linux'.
Note also that the server is told the location of the hub so that it can
register itself.  You can see the hub's status by pointing a browser at
/console on the same URL (so http://10.2.0.0:4444/console in my example above).


I can then run a test:

  > ant -Dbrowser="Firefox on Linux" run-in-sequence
  Buildfile: build.xml

  run-in-sequence:
       [java] [Parser] Running:
       [java]   Selenium Grid Tests In Sequence
       [java]
       [java] 11-Aug-2010 15:45:52 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage startSeleniumSession
       [java] INFO: Contacting Selenium RC at 10.2.0.0:4444
       [java] 11-Aug-2010 15:45:59 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage startSeleniumSession
       [java] INFO: Got Selenese session:com.thoughtworks.selenium.DefaultSelenium@...
       [java] 11-Aug-2010 15:46:16 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage closeSeleniumSession
       [java] INFO: Closing Selenese session: com.thoughtworks.selenium.DefaultSelenium@...
       [java]
       [java] ===============================================
       [java] Selenium Grid Tests In Sequence
       [java] Total tests run: 1, Failures: 0, Skips: 0
       [java] ===============================================
       [java]

  BUILD SUCCESSFUL
  Total time: 23 seconds

I will explain how tests are written and structured, but first I need to add
one extra point.  There's an additional piece of information needed, which is
the "driver" used to run the tests.  This depends on the browser, and so we
need to map from environment to driver.  This is done in the file
grid_configuration.yaml in the main selenium-grid directory and displayed in
the hub console display.


So far I have addressed my point (1), but haven't really explained how (2) is
solved.  And in truth, I don't completely know - I am simply copying some code
that works.

However, I do know that the documentation is *particularly* poor on this, so
this information is critical.

First, the ant scripts use TestNG to run Java tests.  But the tests generated
by the IDE (export as Java TestNG) seem to be using an old, unsupported
library.  Do *not* try hunting down the appropriate class and jar
(SeleneseTestNgHelper) because it does not work with the rest of the
(ant-based) environment, as far as I can see.

Instead, copy the supplied code used in the (broken!) tests in the Selenium
Grid package.  In particular, copy GoogleImageTestBase.java and compare it to
the code generated by the IDE - it's pretty obvious how to do the translation.

Note that the libraries involved do the necessary work of somehow adapting the
tests to use the parameters supplied to ant when invoking with a particular
environment.


If you do all that then you will need to link to the following jars:
  commons-logging-1.0.4.jar
  selenium-java-client-driver.jar
  testng-5.7-jdk15.jar
  selenium-server-1.0.3-standalone.jar
  selenium-grid-tools-standalone-1.0.6.jar

You will also need a build.xml that can compile that and run the test.  I
used:

  <project name="..." basedir=".">

    <property name="src.dir"     value="${basedir}/src"/>
    <property name="classes.dir" value="${basedir}/classes"/>
    <property name="lib.dir"     value="${basedir}/lib"/>

    <path id="classpath">
      <fileset dir="${lib.dir}" includes="**/*.jar"/>
      <pathelement path="${classes.dir}"/>
    </path>

    <target name="compile">
      <mkdir dir="${classes.dir}"/>
      <javac srcdir="${src.dir}" destdir="${classes.dir}" classpathref="classpath"/>
    </target>

    <property name="webSite" value="http://10.2.0.0:8080/" />
    <property name="seleniumHost" value="10.2.0.0" />
    <property name="seleniumPort" value="4444" />
    <property name="browser" value="*firefox" />

    <target name="run-in-sequence" description="Run Selenium tests one by one">
      <java classpathref="classpath"
            classname="org.testng.TestNG"
            failonerror="true">
        <sysproperty key="java.security.policy" file="${basedir}/lib/testng.policy"/>
        <sysproperty key="webSite" value="${webSite}" />
        <sysproperty key="seleniumHost" value="${seleniumHost}" />
        <sysproperty key="seleniumPort" value="${seleniumPort}" />
        <sysproperty key="browser" value="${browser}" />

        <arg value="-suitename" />
        <arg value="Selenium Grid Tests In Sequence" />
        <arg value="-d" />
        <arg value="${basedir}/target/reports" />
        <arg value="-testclass"/>
        <arg value="MyClass"/>
      </java>
    </target>

  </project>

which is just a simple copy of the one supplied in the grid package.


I hope that helps!

Andrew

Permalink

Resizing Cryptmount File System

From: andrew cooke <andrew@...>

Date: Wed, 11 Aug 2010 09:36:23 -0400

If you have an encrypted loopback file system that uses cryptmount you can
resize (enlarge) it by doing the following:

- Add extra padding by:
    dd if=/dev/zero bs=... count=... >> file
  where file is the file that contains the file system (unmounted)

- Mount the file system (this also creates loopback device etc):
    cryptmount name

- Unmount the file system:
    sudo su
    umount /dev/mapper/....

- Check and resize the file system:
    e2fsck -f /dev/mapper/...
    resize2fs /dev/mapper/...

- And remount:
    mount /dev/mapper/... path
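
The padding step is just "append zeros to the end of the file".  If dd isn't
to hand, a minimal Python sketch of the same thing would be the following
(append_zeros is a name made up here; as with dd, the file system must be
unmounted first):

```python
def append_zeros(path, block_size, count):
    """Append count blocks of block_size zero bytes to the file at path."""
    block = b'\0' * block_size
    with open(path, 'ab') as f:
        for _ in range(count):
            f.write(block)
```

The file grows by block_size * count bytes, so, for example, a block size of
1024 * 1024 and a count of 2048 adds 2 GiB.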

This is based on the information at
http://wiki.archlinux.org/index.php/System_Encryption_with_LUKS_for_dm-crypt#Resizing_the_loopback_filesystem
but that assumes a LUKS file system.

Andrew

Permalink

Selenium Web Testing

From: andrew cooke <andrew@...>

Date: Sat, 7 Aug 2010 19:31:39 -0400

I've been looking at Selenium, which is a system for testing web sites.  It's
very easy to use, for simple sites, and works impressively well.  Using a
Firefox plugin you "record" browsing a site and can then repeat that in
"playback mode" as a test.  You can also export the test as Python or Java
unit tests and run those against all the popular browsers on all the popular
operating systems.

That's pretty neat - you could have a bunch of VMs running different operating
systems and browsers, all running against a test web site.  I tested it by
running a Python unit test from Linux that tested Internet Explorer in a
Windows 7 VM.

However, for the particular case I had in mind there are some problems.  In
particular, it's not clear how to handle mixed namespaces (eg SVG is not in
the HTML namespace in the DOM) and, given how the system is implemented (as
Javascript accessing the DOM within the browser being tested) it will not work
with Adobe's SVG plugin on Windows.


Notes
-----

The "Selenium IDE" is a Firefox plugin that:
 * Lets you define tests  (and group them in suites)
 * Lets you *run* tests (against Firefox)
 * Lets you save tests for future use, or to rerun from other languages

So for simple, Firefox only tests, this is a very quick, simple solution.  For
more complex testing it is also a good way of generating tests.

Tests can follow links and do things like verify text (when defining a test,
highlight the text, right click and select the correct option).

A test when saved in the "internal" format is an HTML table of commands.  When
saved as a Python file it looks like:

  from selenium import selenium
  import unittest, time, re

  class test(unittest.TestCase):

      def setUp(self):
          self.verificationErrors = []
          self.selenium = selenium("localhost", 4444, "*firefox", 
                                   "http://www.acooke.org")
          self.selenium.start()

      def test_selenium(self):
          sel = self.selenium
          sel.open("/")
          sel.click("link=Lepl")
          sel.wait_for_page_to_load("30000")
          sel.click("link=Support")
          sel.wait_for_page_to_load("30000")
          sel.click("link=Show Source")
          sel.wait_for_page_to_load("30000")
          try: self.failUnless(sel.is_text_present("Google Code"))
          except AssertionError, e: self.verificationErrors.append(str(e))

      def tearDown(self):
          self.selenium.stop()
          self.assertEqual([], self.verificationErrors)

  if __name__ == "__main__":
      unittest.main()


To run that, however, and to test other browsers, you need to install "Selenium
RC".  This is a bundle of packages that includes the server along with support
for various languages.

The server is a Java program that you run as:
  java -jar selenium-server.jar

Then, when you run the Python test...
  python test.py
...the server automatically opens firefox and runs the tests.  It's just like
someone is using your machine!

The only extra detail needed in practice is that "firefox" must be the binary,
not the usual script at /usr/bin/firefox (on Linux).  Otherwise you will see
the error

  Caution: '/usr/bin/firefox': file is a script file, not a real
  executable.  The browser environment is no longer fully under RC control

Since selenium checks for "firefox-bin" before "firefox" the easiest solution
is to:

1 - link firefox-bin to firefox in /usr/lib64/firefox
2 - include /usr/lib64/firefox on the path

Then run the server as:

  PATH=$PATH:/usr/lib64/firefox java -jar selenium-server.jar

Note that you need firefox-bin to be in the same directory as firefox or you
will see the error:

  Could not read application.ini
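
The lookup order can be pictured with a short sketch.  This is my model of
the behaviour described above, not Selenium's actual code; find_browser and
its arguments are invented for the illustration, and shutil.which is the
standard library function that searches a path for an executable:

```python
import shutil


def find_browser(candidates=('firefox-bin', 'firefox'), path=None):
    """Return the first candidate found on the search path, or None."""
    for name in candidates:
        found = shutil.which(name, path=path)
        if found:
            return found
    return None
```

With /usr/lib64/firefox on the path and the firefox-bin link in place, the
first name matches, so the wrapper script at /usr/bin/firefox is never
reached.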

With that fix, the server starts and kills firefox, and tests run repeatedly
(without it, only the first test worked - rerunning it somehow treated the
website as a binary file).


So, next question - can we do the same on Windows?  I started a VirtualBox VM with
Windows 7, containing Java 6 and IE 8.  Downloading Selenium RC (the server)
and starting it, then running the same test (on Linux) but changed to call the
windows server, I got the error:

  Couldn't open app window; is the pop-up blocker enabled?

Which is described at
http://stackoverflow.com/questions/1517623/internet-explorer-8-64bit-and-selenium-not-working

The simplest solution is to change the browser type to iexploreproxy.  The
following code, run on Linux, invokes IE on Windows and gives a successful
test:

  from selenium import selenium
  import unittest, time, re

  class test(unittest.TestCase):

      def setUp(self):
          self.verificationErrors = []
          self.selenium = selenium("10.1.0.28", 4444, "*iexploreproxy", 
                                   "http://www.acooke.org")
          self.selenium.start()

      def test_selenium(self):
          sel = self.selenium
          .... as before

Where 10.1.0.28 is the VM address.


This can be automated further with Selenium Grid, which automatically runs
tests on different machines.


OK, so now to test a web page with SVG.

Hmmm.  Suddenly (well, hours later....) this is not so great.  First, the site
opens new windows on clicks.  These have a "_blank" target.  Selenium has a
command to identify windows by title, but it doesn't work in the IDE (the
Firefox plugin).  After much frustration I found that it *does* work when used
via the server.

Also, because the link is loaded into a new window (ie not the one that
Selenium is watching) the test does not wait for completion.  So a fixed
period wait (30s) needs to be added.

And when I click on a link in the SVG, the selenium IDE records nothing.  I
can add this explicitly to a Python test, but I can't work out what to add.

I have experimented with specifying a namespace in an xpath expression and
also by using javascript.  Javascript can be tested in the browser address box
(running selenium each time is slow because of waiting for pages) and
unfortunately the code below (and related attempts) does not work.

javascript:alert(function(){var a = document.getElementsByTagName('a');var result = [];for (var i=0; i<a.length; i++) {if (a[i].href == 'some-url-here'){return a[i];}}return null;}())

javascript:alert(function(){var a = document.getElementsByTagNameNS('http://www.w3.org/2000/svg', 'svg'); return a[0];}())

(Javascript can be specified inside selenium to identify the target - see the
"dom=" protocol in the documentation for the Python library).

Worse, on IE I suspect this would also be broken by the plugin.

So I stalled at this point.  Selenium works surprisingly well (better than I
expected), but does not handle SVG well - there appear to be problems with
namespaces and, likely, IE's plugin.

Andrew

Permalink

Auto-Scaling Date Axes in Python

From: andrew cooke <andrew@...>

Date: Wed, 28 Jul 2010 10:16:53 -0400

There's a nice algorithm for auto-scaling axes, called the "nice number
algorithm", written by Paul Heckbert and published in "Graphics Gems" -
http://books.google.com/books?id=fvA7zLEFWZgC&pg=PA61&lpg=PA61&dq=nice+numbers+graphics+gems&source=bl&ots=7LdCq3nI-j&sig=L8qoZ8l_a95KAtHmMjagJ8cC0U0&hl=en&ei=KDhQTKLwGcT48AbTsvnEAQ&sa=X&oi=book_result&ct=result&resnum=2&ved=0CBYQ6AEwAQ#v=onepage&q&f=false

The routines below implement this, but are parameterised over the number base
used, so can also be used for axes based on units that repeat over multiples
of 12, 60, or any other value.


from calendar import timegm
from math import floor, log, log10, ceil
from time import gmtime

# These allow the use with base 10, 12 and 60:
LIM10 = (10, [(1.5, 1), (3, 2), (7, 5)], [1, 2, 5])
LIM12 = (12, [(1.5, 1), (3, 2), (8, 6)], [1, 2, 6])
LIM60 = (60, [(1.5, 1), (20, 15), (40, 30)], [1, 15, 30])

def heckbert_d(lo, hi, ntick=5, limits=None):
    '''
    Calculate the step size.
    '''
    if limits is None:
        limits = LIM10
    (base, rfs, fs) = limits
    def nicenum(x, round):
        step = base ** floor(log(x)/log(base))
        f = float(x) / step
        nf = base
        if round:
            for (a, b) in rfs:
                if f < a:
                    nf = b
                    break
        else:
            for a in fs:
                if f <= a:
                    nf = a
                    break
        return nf * step
    delta = nicenum(hi-lo, False)
    return nicenum(delta / (ntick-1), True)

def heckbert(lo, hi, ntick=5, limits=None):
    '''
    Calculate the axis labels.
    '''
    def _heckbert():
        d = heckbert_d(lo, hi, ntick=ntick, limits=limits)
        graphlo = floor(lo / d) * d
        graphhi = ceil(hi / d) * d
        fmt = '%' + '.%df' %  max(-floor(log10(d)), 0)
        value = graphlo
        while value < graphhi + 0.5*d:
            yield fmt % value
            value += d
    return list(_heckbert())
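
As a quick check of the base-10 case, here is a condensed restatement of the
two routines above that runs on its own under Python 3.  The names nicenum,
tick_step and tick_labels, and the 0 to 97 range, are made up for this
illustration; the thresholds are the LIM10 values.  Treat it as a sketch
rather than a replacement for the code above.

```python
from math import ceil, floor, log, log10

# Same thresholds as LIM10 above.
BASE, RFS, FS = (10, [(1.5, 1), (3, 2), (7, 5)], [1, 2, 5])

def nicenum(x, rounding):
    # split x into a power of the base (step) and a multiple of it (f)
    step = BASE ** floor(log(x) / log(BASE))
    f = float(x) / step
    nf = BASE
    if rounding:
        for (limit, nice) in RFS:
            if f < limit:
                nf = nice
                break
    else:
        for nice in FS:
            if f <= nice:
                nf = nice
                break
    return nf * step

def tick_step(lo, hi, ntick=5):
    # a nice overall range first, then a nice spacing within it
    return nicenum(nicenum(hi - lo, False) / (ntick - 1), True)

def tick_labels(lo, hi, ntick=5):
    d = tick_step(lo, hi, ntick)
    graphlo = floor(lo / d) * d
    graphhi = ceil(hi / d) * d
    fmt = '%%.%df' % max(-floor(log10(d)), 0)
    labels = []
    value = graphlo
    while value < graphhi + 0.5 * d:
        labels.append(fmt % value)
        value += d
    return labels

print(tick_step(0, 97))    # 20
print(tick_labels(0, 97))  # ['0', '20', '40', '60', '80', '100']
```

The range 97 is first rounded up to 100, the spacing 100 / 4 = 25 is then
rounded to the nearby "nice" value 20, and the axis is extended to cover
0 to 100.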


This can then be used with a range of seconds as follows:


def autoscale_time(start, end):
    '''
    Yields a sequence of epochs that are nicely spaced.

    start and end are Unix epochs.
    '''
    time_chunks = [('days', 3 * 24 * 60 * 60, 24 * 60 * 60, 2, None),
                   ('hours', 3 * 60 * 60, 60 * 60, 3, LIM12),
                   ('minutes', 3 * 60, 60, 4, LIM60),
                   ('seconds', 0, 1, 5, LIM60)]
    for (name, limit, secs, sindex, limits) in time_chunks:
        if (end - start) > limit:
            break
    d = heckbert_d(start / secs, end / secs, limits=limits)

    # zero out lower steps, so that we get a starting date that's an
    # integral number of units
    stime = list(gmtime(start))
    for i in range(sindex, 9):
        stime[i] = 0

    # generate a sequence of epochs (cannot use the usual heckbert routine 
    # because formatting will be different)
    value = timegm(stime)
    while value <= end:
        if value >= start:
            yield value
        value += d * secs


This could be extended further by:

- having different formats in the time_chunks parameter, so that different
  intervals are formatted differently

- adding months etc.  This would require changing the "secs" increment to be a
  timedelta and working with datetime instances rather than epochs (because
  months are not all equally sized).
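
A minimal sketch of the month case, assuming datetime instances for start and
end.  The function name month_ticks and its step argument are hypothetical
(they are not part of the code above).  It steps a month counter rather than
adding a timedelta, since timedelta has no month unit.

```python
from datetime import datetime

def month_ticks(start, end, step=1):
    '''
    Yield datetimes on month boundaries between start and end
    (inclusive), spaced step months apart.
    '''
    # count months from year 0 so that ticks fall on multiples of step
    index = start.year * 12 + (start.month - 1)
    index -= index % step
    while True:
        (year, month) = divmod(index, 12)
        tick = datetime(year, month + 1, 1)
        if tick > end:
            break
        if tick >= start:
            yield tick
        index += step

for tick in month_ticks(datetime(2010, 1, 15), datetime(2010, 7, 28), step=3):
    print(tick.strftime('%Y-%m-%d'))
# 2010-04-01
# 2010-07-01
```

The step itself (1, 2, 3 or 6 months) could come from heckbert_d with the
LIM12 limits, in the same way as for hours.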


NOTE: The code above is cut + pasted from some working code and is not tested
in its existing form; I may have introduced a bug somewhere, but hopefully
this illustrates the idea.

Andrew

Permalink

Setting File Permissions in Subversion

From: andrew cooke <andrew@...>

Date: Tue, 27 Jul 2010 11:59:16 -0400

Subversion drives me crazy in how it sets file permissions (keeping whatever
permission the file originally had, and overwriting local changes on update).

Turns out that for the executable bit, you can set a special property that
will get the correct behaviour.  For example:

  find . -name "*.sh" -exec svn propset svn:executable ON \{} \;

Peace.... :o)

Andrew

Permalink

Easy Slide-in Menus using YUI 3

From: andrew cooke <andrew@...>

Date: Mon, 26 Jul 2010 00:40:07 -0400

YUI 3 is amazing.  It looks terrifyingly complex, but once you get into it,
you can do complex things trivially.  I use jQuery at work, and in comparison,
YUI 3 feels like it was written by software engineers rather than people
hacking web pages.

For example, here's how to make a menu that shows as a tab on the right-hand
side of the page that "zooms in" (to the left) when you mouse-over.


First, place the menu in a div in the HTML:

  <div class="menu" id="menu-1">
  <img src="menu-icon-small.png"/>
  <ul>
  <li>Foo</li>
  <li>Bar</li>
  <li>Foo</li>
  <li>Bar</li>
  </ul>
  </div>


Next, add some CSS (probably in a separate stylesheet) so that the icon is on
the left, and will be visible, but the rest of the div is off the page to the
right:

ul {
  margin: 10px;
  margin-left: 40px;
}
li {
  list-style-type: none;
  list-style-position: outside;
  margin-top: 4px;
  margin-bottom: 4px;
}
div.menu {
  position: fixed;
  right: -278px;
  width: 300px;
  border: 1px solid #c0c0c0;
  border-right: none;
  -moz-border-radius: 4px;
  background: white;
}
div#menu-1 {
  top: 50px;
}
div.menu img {
  margin: 1px;
  float: left;
}


Import the YUI magic (in the HTML header).

  <script type="text/javascript"
  src="http://yui.yahooapis.com/combo?3.1.1/build/yui/yui-min.js"></script>

and define the animation:

  <script type="text/javascript">
  YUI().use('event-mouseenter', 'console', 'anim', function(Y) {
      /* new Y.Console().render(); */
      var open = new Y.Anim({node: '#menu-1', to: {right: 0},
                             easing: Y.Easing.easeOut});
      var close = new Y.Anim({node: '#menu-1', to: {right: -278},
                              easing: Y.Easing.easeIn});
      Y.on("mouseenter", function (e) {open.run();}, "#menu-1");
      Y.on("mouseleave", function (e) {close.run();}, "#menu-1");
  });
  </script>

and that's it!


The script above sets up the events and defines animations for the div so that
it moves into and out of sight.  The "Easing" part even makes it move in a
"natural" manner (it slows down as it extends fully).

Andrew

Permalink

Generating SVG in Python 2.4

From: andrew cooke <andrew@...>

Date: Fri, 23 Jul 2010 10:07:27 -0400

There's not much love for Python 2.4 and SVG (which, I admit, is something of
an odd combination - who would be so conservative they would use Python 2.4
and then require SVG?), particularly if you need a non-GPL solution.  So I
ended up writing a simple wrapper around DOM that helps generate the XML.
Here's a summary of the approach (there's nothing really hard here, just
fiddly DOM details):

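# assumed imports (not shown in the original snippet)
from xml.dom import Node, getDOMImplementation


class SvgBase(object):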
    '''
    Support class.  Contains a reference to the DOM document and
    the element corresponding to this node.  Note that we allow any
    node to be extended with any child node; we also allow any 
    attribute to be added.
    '''
    
    SVG = 'http://www.w3.org/2000/svg'
    
    def __init__(self, doc, element, **attrs):
        self._doc = doc
        self._element = element
        for name in attrs:
            value = attrs[name]
            if value is not None:
                self._element.setAttributeNS(self.SVG, name, str(value))
        
    def line(self, (x1, y1), (x2, y2), **attrs):
        '''
        Add a child line
        '''
        return Line(self, (x1, y1), (x2, y2), **attrs)
       
    # more child methods here.....


class Svg(SvgBase):
    '''
    The root svg element.  This creates the document, allows a 
    stylesheet to be used, etc.
    '''

    def __init__(self, version='1.1', width=None, height=None):
        implementation = getDOMImplementation('')
        doctype = implementation.createDocumentType('svg', 
            '-//W3C//DTD SVG 1.1//EN', 
            'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd')
        document = implementation.createDocument(self.SVG, 'svg', doctype)
        for node in document.childNodes:
            if node.nodeType == Node.ELEMENT_NODE:
                element = node
                break
        # xlink lets "a" element work correctly (use xlink:href)
        ns = {'xmlns': self.SVG,
              'xmlns:xlink': 'http://www.w3.org/1999/xlink'}
        super(Svg, self).__init__(document, element, 
                                  version=version, width=width, height=height,
                                  **ns)
        
    def add_stylesheet(self, url):
        style = self._doc.createProcessingInstruction('xml-stylesheet',
                        'href="%s" type="text/css"' % url)
        for node in self._doc.childNodes:
            if node.nodeType == Node.DOCUMENT_TYPE_NODE:
                self._doc.insertBefore(style, node)
                break

    def __str__(self):
        '''
        The XML of the entire document.
        '''
        return self._doc.toxml()
    

class Line(SvgBase):
    '''
    An example child node.
    '''

    def __init__(self, parent, (x1, y1), (x2, y2), **attrs):
        element = parent._doc.createElementNS(self.SVG, 'line')
        super(Line, self).__init__(parent._doc, element,
                                   x1=self._1dp(x1), y1=self._1dp(y1), 
                                   x2=self._1dp(x2), y2=self._1dp(y2), 
                                   **attrs)
        parent._element.appendChild(self._element)


With that, it's pretty easy to do things like:

    >>> svg = Svg()
    >>> svg.line((1,2), (3,4), stroke='blue')
    >>> svg.add_stylesheet('http://example.com/foo')
    >>> str(svg)
    '''<?xml version="1.0" ?>
       <?xml-stylesheet href="http://example.com/foo" type="text/css"?>
       <!DOCTYPE svg  PUBLIC '-//W3C//DTD SVG 1.1//EN'  
                 'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd'>
       <svg version="1.1" xmlns="http://www.w3.org/2000/svg"
            xmlns:xlink="http://www.w3.org/1999/xlink">
         <line x1="1.0" x2="3.0" y1="2.0" y2="4.0" stroke="blue"/>
       </svg>'''
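
The classes above use tuple parameters in method signatures, which only work
in Python 2.  For anyone on a newer Python, here is a flat sketch of the same
DOM calls, without the wrapper classes.  It uses xml.dom.minidom directly and
plain setAttribute for brevity; the variable names are just for this example.

```python
from xml.dom.minidom import getDOMImplementation

SVG = 'http://www.w3.org/2000/svg'

# document with an SVG 1.1 doctype and an <svg> root element
implementation = getDOMImplementation()
doctype = implementation.createDocumentType(
    'svg', '-//W3C//DTD SVG 1.1//EN',
    'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd')
document = implementation.createDocument(SVG, 'svg', doctype)

root = document.documentElement
root.setAttribute('xmlns', SVG)
root.setAttribute('version', '1.1')

# a child line, with coordinates formatted to one decimal place
line = document.createElementNS(SVG, 'line')
for (name, value) in [('x1', 1), ('y1', 2), ('x2', 3), ('y2', 4)]:
    line.setAttribute(name, '%.1f' % value)
line.setAttribute('stroke', 'blue')
root.appendChild(line)

# the stylesheet processing instruction must come before the doctype
style = document.createProcessingInstruction(
    'xml-stylesheet', 'href="http://example.com/foo" type="text/css"')
document.insertBefore(style, document.doctype)

xml = document.toxml()
print(xml)
```

The output is on a single line (toprettyxml() would add indentation), and the
order of the attributes may differ between Python versions.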

Andrew

Permalink

RXPY Benchmarks

From: andrew cooke <andrew@...>

Date: Sat, 17 Jul 2010 16:48:30 -0400

Much of this is just duplicating the re2 work, and the rest is RXPY-specific,
but even so, these are quite interesting (for me, at least).

Note that the times are relative to the Python re package (log10 in brackets).
Also, the plots are both linear ### and logarithmic [ ].


First, some basic matching.  This is just showing how slow my code is
(hundreds of times slower than the re package).

Match a(.)c against abc
             Backtracking [##########                   ]              95 [2.0]
            Parallel wide [#######################            ]       206 [2.3]
    Parallel wide, hashed [#################################### ]     315 [2.5]
             Parallel seq [#######################            ]       205 [2.3]
     Parallel seq, hashed [###################################  ]     304 [2.5]
            Parallel beam [############################        ]      245 [2.4]
    Parallel beam, hashed [######################################]    339 [2.5]

Match (a)b(?<=(?(1)b|x))(c) against abc
             Backtracking [############                   ]           210 [2.3]
            Parallel wide [#########################          ]       398 [2.6]
    Parallel wide, hashed [#################################    ]     531 [2.7]
             Parallel seq [#########################          ]       400 [2.6]
     Parallel seq, hashed [##################################   ]     546 [2.7]
            Parallel beam [##############################      ]      480 [2.7]
    Parallel beam, hashed [######################################]    622 [2.8]

Match a*b against a^100b
             Backtracking [####################################  ]   1902 [3.3]
            Parallel wide [#################################    ]    1737 [3.2]
    Parallel wide, hashed [####################################  ]   1919 [3.3]
             Parallel seq [################################     ]    1733 [3.2]
     Parallel seq, hashed [####################################  ]   1919 [3.3]
            Parallel beam [###################################  ]    1884 [3.3]
    Parallel beam, hashed [######################################]   2073 [3.3]

Match .*b against a^100b
             Backtracking [#################                 ]       1902 [3.3]
            Parallel wide [###############                   ]       1769 [3.2]
    Parallel wide, hashed [#################                 ]       1927 [3.3]
             Parallel seq [###############                   ]       1757 [3.2]
     Parallel seq, hashed [##################                ]       2015 [3.3]
            Parallel beam [###################################  ]    3825 [3.6]
    Parallel beam, hashed [######################################]   4300 [3.6]

The above is quite interesting because the beam search is suddenly twice as
slow, just by changing from "a*b" to ".*b".  That's because the second
expression has a failure ("." matches the final "b") and so needs to
backtrack.  Because the initial beam width is 1, that means that the entire
search must be repeated with doubled width.
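
The retry-with-doubled-width idea can be sketched as a toy loop.  This is not
RXPY's code: run_with_beam, needs_two and the max_width cap are hypothetical
names invented for the illustration, and "attempt" stands in for a complete
matching pass at a given beam width.

```python
def run_with_beam(attempt, width=1, max_width=1024):
    '''
    Call attempt(width) with width 1, 2, 4, ... until it returns
    something other than None.  Returns the result and the widths tried.
    '''
    widths = []
    while width <= max_width:
        widths.append(width)
        result = attempt(width)
        if result is not None:
            return result, widths
        width *= 2
    return None, widths

# a stand-in matcher that only succeeds if two alternatives are kept alive
def needs_two(width):
    if width >= 2:
        return 'match'
    return None

print(run_with_beam(lambda width: 'match'))  # ('match', [1])
print(run_with_beam(needs_two))              # ('match', [1, 2])
```

The second case makes two complete passes, which is roughly the factor of two
seen in the ".*b" timings above.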


The next test is a search.

Search .*b against a^100b
             Backtracking [                         ]                1172 [3.1]
            Parallel wide [######################################]  41841 [4.6]
    Parallel wide, hashed [                         ]                1459 [3.2]
             Parallel seq [                         ]                1212 [3.1]
     Parallel seq, hashed [                         ]                1338 [3.1]
            Parallel beam [###                          ]            3764 [3.6]
    Parallel beam, hashed [###                          ]            3688 [3.6]

The above shows the problem with a naive wide parallel approach - there are an
awful lot of different states to store.  Of course, most are duplicates so the
hashing removes the problem (as does sequential or beam search).
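
A toy model shows why the duplicates matter.  This is only an illustration,
not the RXPY engine: it assumes a search can be pictured as two states
("prefix", still looking for somewhere to start, and "star", inside the
leading ".*") and that threads in the same state can be merged.  The name
thread_work is made up.

```python
def thread_work(text, dedupe):
    '''
    Count how many thread-steps are needed to scan text, with or
    without merging threads that are in the same state.
    '''
    threads = ['prefix']
    work = 0
    for _ in text:
        following = []
        for state in threads:
            work += 1
            if state == 'prefix':
                # keep scanning, or start a match at this character
                following.extend(['prefix', 'star'])
            else:
                # ".*" accepts any character and stays where it is
                following.append('star')
        if dedupe:
            # keep a single thread per state
            following = sorted(set(following))
        threads = following
    return work

print(thread_work('a' * 100, dedupe=False))  # 5050
print(thread_work('a' * 100, dedupe=True))   # 199
```

Without merging, a new "star" thread is added at every character, so the work
grows with the square of the input length; with merging there are never more
than two threads.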


The next tests show the exponential explosion problem in Python's matcher.
This is a little more difficult to trigger than in the re2 paper because
Python has a "shortcut" (hack!) that avoids the problem in very simple cases
(but, as we will see below, does nothing to address the underlying issues).

Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=4
             Backtracking [#############                   ]          573 [2.8]
            Parallel wide [#########                     ]            392 [2.6]
    Parallel wide, hashed [##                      ]                  125 [2.1]
             Parallel seq [#########                     ]            398 [2.6]
     Parallel seq, hashed [##                      ]                  125 [2.1]
            Parallel beam [######################################]   1606 [3.2]
    Parallel beam, hashed [########                      ]            360 [2.6]

Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=6
             Backtracking [######################################]    637 [2.8]
            Parallel wide [#############################       ]      476 [2.7]
    Parallel wide, hashed [#                 ]                         27 [1.4]
             Parallel seq [#############################       ]      475 [2.7]
     Parallel seq, hashed [#                 ]                         27 [1.4]
    Parallel beam, hashed [##                    ]                     51 [1.7]

Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=8
    Parallel wide, hashed [##############       ]                       4 [0.6]
     Parallel seq, hashed [##############       ]                       4 [0.6]
    Parallel beam, hashed [######################################]     10 [1.0]

As the number of terms increases, only the hashed approaches are efficient (due
to the explosion in the number of states).  This is as expected.  What is nice
to see is that by n=8 my Python code is only 4x slower than the re package!
This is because the re backtracker is "exploding", while my code is linear
(but slow).
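
For reference, here is how the test pattern can be built and timed against
the standard re package.  This assumes that "X^k" in the headings above means
X repeated k times (as in the re2 paper); the helper names explosive and
time_match are made up, and this is not the benchmark code itself.

```python
import re
import time

def explosive(n):
    '''
    Build (?:a|b)? repeated 2n times followed by (?:ab) repeated n
    times, together with the text (ab) repeated n times.
    '''
    pattern = '(?:a|b)?' * (2 * n) + '(?:ab)' * n
    text = 'ab' * n
    return pattern, text

def time_match(n, repeat=10):
    (pattern, text) = explosive(n)
    regexp = re.compile(pattern)
    start = time.time()
    for _ in range(repeat):
        match = regexp.match(text)
    return match, (time.time() - start) / repeat

for n in (2, 4, 6):
    (match, seconds) = time_match(n)
    print(n, match.group(0) == 'ab' * n, seconds)
```

The only way to match is for every optional term to match nothing, which a
backtracking engine discovers last; that is why the time grows so quickly
with n.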


This is only possible, though, when we can discard duplicate state.  Which
isn't so easy when matching groups:

Match (a|b)?^2n(?:ab)^n against (ab)^n for n=4
             Backtracking [#######                       ]            481 [2.7]
            Parallel wide [#####                       ]              378 [2.6]
    Parallel wide, hashed [#########                      ]           627 [2.8]
             Parallel seq [######                       ]             414 [2.6]
     Parallel seq, hashed [##########                     ]           667 [2.8]
            Parallel beam [##########################          ]     1612 [3.2]
    Parallel beam, hashed [######################################]   2415 [3.4]

Match (a|b)?^2n(?:ab)^n against (ab)^n for n=6
             Backtracking [###########################         ]      512 [2.7]
            Parallel wide [######################             ]       415 [2.6]
    Parallel wide, hashed [######################################]    723 [2.9]
             Parallel seq [######################             ]       418 [2.6]
     Parallel seq, hashed [######################################]    714 [2.9]


And here we do no better than normal backtracking.


The main conclusions, then, are:

1 - Python's re package does suffer from the exponential "explosion" issue,
    while my code (with duplicate state elimination) does not.

2 - My code can also handle groups, but then performance degrades, as
    expected.

So RXPY is both general and, when possible, efficient.  Even if it is terribly
slow.

A secondary conclusion is that the beam search approach does not seem to be
worth the trouble.

Andrew

Permalink

This is my blog. It used to be a mailing list called C[omp]ute. It is still generated by email. You can reply to comments via the appropriate link. Edit the mail address to remove the anti-spam measure. However, given the very low volume of replies, and the high rate of spam, it can be months before I moderate a post. Sorry. © 2006-2009 Andrew Cooke (site) / post authors (content).

I am always interested in offers/projects/new ideas. Eclectic experience in fields like: numerical computing; Java web/enterprise; functional languages; Python client GUI/web/database; etc. Based in Santiago, Chile; telecommute worldwide. CV; email.
