From: andrew cooke <andrew@...>
Date: Tue, 31 Aug 2010 07:23:08 -0400
If you unpack the Django OpenID library you can run a demonstration in Django by doing:

    cd examples/djopenid
    PYTHONPATH=../.. python manage.py runserver

The PYTHONPATH includes the OpenID library itself, so you don't need that if you've actually installed the library (using setup.py or easy_install).

However, if you do that, and then authenticate with an ID, you'll see the error:

    OpenID authentication failed. Invalid openid.mode: u'i'

This is an error in the demo, not the library. It's very easy to fix. Just edit the file examples/djopenid/util.py and on line 136 change

    return dict((k, v[0]) for k, v in request_data.iteritems())

to

    return dict((k, v) for k, v in request_data.iteritems())

I assume that MultiDict.iteritems() changed at some point, and now returns a single value when before it returned a list.

Andrew
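The odd-looking u'i' in the error is consistent with the explanation above: indexing a string with [0] yields its first character. A minimal sketch, using a plain dict in place of the request data and "id_res" as an assumed value for openid.mode (items() is used so that it also runs on Python 3):

```python
# Assumption: each key now maps to a single string rather than a list.
request_data = {"openid.mode": "id_res"}

# The demo's original expression takes element 0 of each value...
broken = dict((k, v[0]) for k, v in request_data.items())
# ...while the corrected expression keeps the value whole.
fixed = dict((k, v) for k, v in request_data.items())

# If the values were lists, as the demo apparently expected, [0] would be right.
list_data = {"openid.mode": ["id_res"]}
from_lists = dict((k, v[0]) for k, v in list_data.items())

print(broken["openid.mode"], fixed["openid.mode"], from_lists["openid.mode"])
```

With a string value the original expression truncates the mode to its first letter, which matches the message reported by the demo.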
From: andrew cooke <andrew@...>
Date: Sat, 28 Aug 2010 03:55:03 -0400
http://www.ntlug.org/Articles/LVM Andrew
From: andrew cooke <andrew@...>
Date: Fri, 27 Aug 2010 18:23:24 -0400
[This article is awesome, BTW - http://www.nytimes.com/2010/08/29/magazine/29language-t.html ]

Recently I have been spending more time than usual thinking about how "Chilean" I have become. Today was a particularly Chilean day, and I enjoyed it.

First, I needed to go to the centre of town, to visit the Foreign Office, to get a document (my PhD certificate) officially certified (so that I can apply for a driving licence, which requires evidence that you have attended primary school).

So I visited the Foreign Office, queued, waited, was called to talk to a lady, apologised for not having a clue what I was doing, and explained what I needed. She carefully listed the steps involved, gave me a piece of paper with the address of the Ministry of Justice (I am using the closest English translations here), and sent me on my way.

I found a Notary (something that doesn't really exist in the UK - they are a kind of witness for simpler legal paperwork) and got a copy of my document made. While I was there a younger man, in a suit, kind-of jumped in (politely) with the woman serving me - it turns out that he had just qualified as a lawyer and needed a copy of his (very fancy) signed certificate. So I and the notary's assistant congratulated him and, when he went to pay, we discussed how it was expensive to become a lawyer, but how now he could also earn more money. Then, after a short while, the assistant took copies of our documents into the notary's office, where they were signed, and we left.

Next, to the Ministry of Justice, which was almost deserted, but which had Mapuche (and English!) signs. There, I apologised again for not understanding what was happening, and someone else signed the copy.

After that, I returned to the Foreign Office, queued again, and a man quickly checked the signatures and stamped my piece of paper.

In all this, no-one checked my identity, or translated the text. And the only place I had to pay was at the notary, which cost a pound.
But I now have a "legalised" copy of my certificate, which is the correct piece of paper to apply for a driving licence.

I got back on the metro and headed for home, but got off a few stops early because I needed to buy some paint. A few days ago I had visited a paint shop, which I found almost by chance, near the driving school where I have been taking lessons (I can drive, but need the practice before taking my test and, as far as I can work out, practicing in a private car is illegal - there is no "learners licence" here, for example).

The first time I visited this shop the friendly attendant had carefully mixed some paint to match a fragment that had been chipped from our wall by the workman ("maestro" in Spanish, which is much nicer) who was repairing our earthquake damage. It had taken him three or four attempts, adding progressively more dye to a pot of blank paint, until he had the perfect match. But at the time I had only bought a litre; now I needed more.

The second time I visited the shop, someone else was attending, and he was no help at all (I had a different problem, which I finally solved by chatting to the owner and a customer at the hardware store near our house).

Luckily, on my third visit, it was the helpful guy (I was planning to leave if not). So I greeted him and explained I needed another pot of paint. Since he hadn't recorded the mix, he started again, adding colours to white. At one point he had a problem with a dirty dye container, so had to start again with a new pot.

While he was mixing the paint a policeman stopped in the middle of the crossroads outside, surrounded by barking stray dogs. At first I thought he was angry with the dogs, but it turned out that a march of striking students was coming down the road and he needed to stop the traffic. The students "marched" past, with various banners, chanting the same chants I have heard before, without ever thinking what they mean (perhaps only the same rhythm and the words change?).
A girl pushed a leaflet into my hand. They were striking against the forced "curing" of homosexuals.

I went back into the paint shop. I wanted to ask the attendant if it was really still common to force "treatment" on homosexuals, but I suspected he would think it a good thing. And so I said nothing while he complained about the students (I think) and continued to mix paint.

Some "tough" looking guys came in and shouted at the attendant for attention. He ignored them. They tried again, in a more polite tone. He ignored them a bit more, drying a sample of my paint with a hair-dryer, and then asked them what they wanted. It turned out that they needed a specific kind of paint roller that he didn't have, so he sent them to a second shop, round the corner.

Then he came to the window and showed me the sample, which was almost the same colour as my fragment, but not quite. So he went back and added a little more dye. When the paint was ready I bought some extra plaster I needed, to help cover the price of the failed can, went back to the metro, and went home.

When I arrived home the maestro was on the scaffold outside. He was surprised at the extra paint, and had "stretched" what he had to paint the wall. But with the extra he could also paint an adjacent surface, which would look better (and, indeed, the final result is excellent - the wall looks almost like new, and the "patch" looks more like it has been washed clean than repainted).

Back inside, I started work. When the maestro came in I went out to the hallway and we discussed the plastering there - he had tried to texture it to match the existing work. We agreed it wasn't perfect, but that it would do, and that he'd start painting after lunch.

Since he'd finished with the scaffolding I called Paulina to ask her to call the company to take it away. She soon replied to say that they couldn't come until Monday.
That was a problem, as the scaffold is where her car is normally parked, so I discussed it with the maestro, who commented that it was just what he'd expected (he had argued with the person who delivered the scaffolding and we'd eventually got more planks from the company). But he could disassemble it now, so Paulina could still bring the car home.

Back to work, with the maestro having a late lunch and writing a quote to do, for the rest of the building, the same work he had done for me. Much time later he appears, asking if I can use my computer to print out what he's written. I agree and type it in. I add a few extra phrases from a quote I had received for some earlier work (from this man's grandson - that work was how I got to know him). After installing a Spanish dictionary I got most of the text OK (I am not sure whose spelling was worse), printed it out, and gave him an envelope so he could deliver it to my neighbour.

After writing all that, I'm not sure if my point is clear or not. Doing things here involves "people" much more than in the UK. And I am getting better at doing that - to the point where it's as much a pleasure as a pain. Although we address each other as "usted" the maestro is my friend, and I know he will do a decent job. Similarly, I "connected" with everyone I dealt with today.

A slower, but more humane, world.

Andrew
From: andrew cooke <andrew@...>
Date: Sun, 22 Aug 2010 17:01:28 -0400
I've been looking at Twisted, which is a framework for cooperative
multi-tasking in Python. I don't find that a very useful description, so here
are two alternatives:
1 - It's a way of structuring multi-threaded programs that's a lot more like
Javascript or GUI toolkits.
2 - It's a way of writing network servers that work efficiently without using
multiple threads.
There's a fair amount of documentation at
http://twistedmatrix.com/documents/current/core/ (and most importantly at
http://twistedmatrix.com/documents/current/core/howto/index.html ) - I suggest
reading through that until it sticks. It took me a while, and writing the
code below, but now it makes a lot of sense (and it seems like a very nicely
engineered system).
I structured the example below as a set of different files, which was probably
excessive, but I wanted the different components to be as clear as possible.
Python logging can be serialised over a socket. This code is a server that
receives serialised messages and writes them to a log.
First, the protocol:
from cPickle import loads
from logging import makeLogRecord, getLogger
from struct import unpack
from twisted.internet.protocol import Protocol, connectionDone

'''
The protocol for a Twisted server that receives log messages.
See http://docs.python.org/library/logging.html#socket-handler
'''

class LoggingProtocol(Protocol):

    def dataReceived(self, data):
        self.__data += data
        while True:
            if not self.__message_len and len(self.__data) >= 4:
                # unpack length prefix
                self.__message_len = unpack(">L", self.__data[:4])[0]
                self.__data = self.__data[4:]
            if self.__message_len and len(self.__data) >= self.__message_len:
                # unpack message
                record = makeLogRecord(
                    loads(self.__data[0:self.__message_len]))
                self.__data = self.__data[self.__message_len:]
                self.__message_len = 0
                logger = getLogger(record.name)
                logger.handle(record)
            else:
                break

    def connectionMade(self):
        self.__data = ''
        self.__message_len = 0

    def connectionLost(self, reason=connectionDone):
        self.__data = None
        self.__message_len = None
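The framing that the protocol decodes can be tried without Twisted. The following is a sketch only, written for Python 3 and the standard library: split_frames is a hypothetical helper (not part of the code above) that mirrors the buffering logic in dataReceived, and the wire bytes are produced by SocketHandler.makePickle, which prefixes each pickled record with a 4-byte big-endian length.

```python
import logging
import logging.handlers
import pickle
import struct


def split_frames(buffer):
    """Return (complete pickled payloads, leftover bytes) for a byte buffer."""
    payloads = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">L", buffer[:4])  # 4-byte length prefix
        if len(buffer) < 4 + length:
            break  # the rest of this message has not arrived yet
        payloads.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return payloads, buffer


# Creating the handler does not open a connection; that happens on emit.
handler = logging.handlers.SocketHandler("localhost", 2000)
wire = b"".join(
    handler.makePickle(logging.makeLogRecord({"name": "demo", "msg": text}))
    for text in ("first", "second"))

# Deliver the bytes in two arbitrary chunks, as a TCP stream might.
pending = b""
received = []
for chunk in (wire[:7], wire[7:]):
    payloads, pending = split_frames(pending + chunk)
    for payload in payloads:
        # Only unpickle data from sources you trust.
        record = logging.makeLogRecord(pickle.loads(payload))
        received.append(record.getMessage())

print(received)
```

The point of the exercise is that a single dataReceived call may carry part of a message, or several messages, so the buffer has to be carried over between calls.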
Next, the factory (ie a protocol factory):
from logging.config import dictConfig
from twisted.internet.protocol import Factory
from log.protocol import LoggingProtocol

'''
A factory for the remote Python logger.
This seems to be the best location to store configuration information because
it is accessible both in tests (using a reactor) and to an application.
'''

class LoggingFactory(Factory):

    protocol = LoggingProtocol

    DEFAULT_PORT = 2000
    DEFAULT_CONFIG = {'version': 1,
                      'handlers':
                          {'file':
                               {'class': 'logging.FileHandler',
                                'filename': 'logging-service.log',
                                'level': 'DEBUG',
                                },},
                      'root':
                          {'level': 'DEBUG',
                           'handlers': ['file']},}

    def __init__(self, config_dict=None):
        if not config_dict:
            config_dict = self.DEFAULT_CONFIG
        dictConfig(config_dict)
And the service:
from twisted.application.internet import TCPServer
from log.factory import LoggingFactory

'''
A service for the remote Python logger.
This is used by the application.
'''

class LoggingService(TCPServer):

    def __init__(self, port=None, config_dict=None, interface='0.0.0.0'):
        if not port:
            port = LoggingFactory.DEFAULT_PORT
        # old style classes in twisted
        TCPServer.__init__(self, port, LoggingFactory(config_dict),
                           interface=interface)
This can then be made into an application (a daemon) that's run from the
command-line using a tool called "twistd":
# You can run this .tac file directly with:
# twistd -ny service.tac
from log.service import LoggingService
from twisted.application import service
application = service.Application("Logging application")
LoggingService().setServiceParent(application)
Alternatively, for testing, the Factory can be used directly. This test code
also gives a glimpse of how the reactor is used to schedule events (there's
also an abstraction for chaining callbacks called "Deferred"):
from logging.config import dictConfig
from logging import getLogger
from multiprocessing.process import Process
from tempfile import mkstemp
from twisted.internet import reactor
from unittest import TestCase
from log.factory import LoggingFactory

class LoggingTest(TestCase):
    '''
    Test the logging service by starting an instance, then firing up a
    separate process that logs to the service.
    '''

    def test_logging(self):
        tick = Tick()
        (_fd, self.tmp) = mkstemp()
        process = Process(target=self.logging_process)
        factory = LoggingFactory({'version': 1,
            'handlers': {'file': {'class': 'logging.FileHandler',
                                  'filename': self.tmp,
                                  'level': 'DEBUG'}},
            'root': {'level': 'DEBUG',
                     'handlers': ['file']}})
        reactor.listenTCP(factory.DEFAULT_PORT, factory)
        reactor.callLater(tick(), process.start)
        reactor.callLater(tick(), reactor.stop)
        reactor.run()
        fd = open(self.tmp)
        contents = fd.readlines()
        assert contents == ['a warning\n'], contents
        fd.close()

    def logging_process(self):
        dictConfig({'version': 1,
                    'handlers':
                        {'socket':
                             {'class': 'logging.handlers.SocketHandler',
                              'level': 'INFO',
                              'host': 'localhost',
                              'port': LoggingFactory.DEFAULT_PORT
                              },},
                    'root':
                        {'level': 'INFO',
                         'handlers': ['socket']},
                    })
        logger = getLogger('test')
        logger.debug('a debug') # discarded by "level: INFO" above
        logger.warn('a warning')

class Tick(object):

    def __init__(self, increment=0.1):
        self.__increment = increment
        self.__time = 0

    def __call__(self, step=1):
        self.__time += step * self.__increment
        return self.__time
Andrew
From: andrew cooke <andrew@...>
Date: Wed, 11 Aug 2010 16:07:57 -0400
This is a follow-up to my earlier post on Selenium at http://www.acooke.org/cute/SeleniumWe0.html - there I gave a summary of Selenium and the basic tools needed to run simple, single tests.

However, what is interesting me at the moment (because we need it at work) is how to run the same test in several browsers, on different operating systems. This requires two things:

1 - A way to distribute jobs across machines
2 - A way to make tests general (the tests in the first article specified the target browser).

Selenium provides these things via something called "Selenium Grid", although much of the documentation for that focuses on running tests in parallel (for speed) rather than on exploiting different environments.

A warning: the Selenium Grid documentation sucks. I have spent a frustrating couple of days getting this working. Even the demo they provide to test the system doesn't work. And the system itself seems a bit limited and unreliable. But, unfortunately, I don't see anything better.

OK, so what is Selenium Grid? It's three things:

A - A central hub that manages a collection of servers
B - A collection of servers (possibly on remote machines)
C - Library support for writing tests

(A+B) address (1) above and (C) addresses (2). This is all packaged in a Java deploy and run using ant-based scripts.

First, I want to describe how the distribution of tasks works, because I found this far from intuitive. It's important to understand that all the servers (called "remote controls") are equivalent and dumb. They don't "know" that they are running on Windows, or can access the Opera browser, for example. The only way that such information is made available to the system is through the *environment*.

When you start a server you specify its environment. This is a string, and the standard form is something like "IE on Windows" or "Firefox on Linux". That information is passed by the server to the central hub which uses that (and only that) to route tests.
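The routing idea can be modelled in a few lines of Python. This is only a toy illustration of "route on the environment string and nothing else": ToyHub, register and route are hypothetical names, the addresses are made up, and the real hub is a Java program that presumably does rather more (such as tracking which servers are busy).

```python
class ToyHub(object):
    """Hypothetical stand-in for the hub: it knows only environment strings."""

    def __init__(self):
        self._servers = {}  # environment string -> list of server addresses

    def register(self, environment, address):
        # A remote control announces itself with whatever string it was given.
        self._servers.setdefault(environment, []).append(address)

    def route(self, environment):
        # A test asks for an environment; the match is on the exact string.
        servers = self._servers.get(environment)
        if not servers:
            raise LookupError("no remote control for %r" % environment)
        return servers[0]


hub = ToyHub()
hub.register("Firefox on Linux", "10.2.0.0:5555")   # made-up addresses
hub.register("IE on Windows", "10.1.0.28:5555")
target = hub.route("Firefox on Linux")
print(target)
```

In this model nothing checks that a server labelled "IE on Windows" really runs Windows, which is the sense in which the servers are "dumb".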
The hub is also contacted by the test. The test requests a particular environment (I'll address 2/C below) and the hub then routes the test to the corresponding server.

At this point it's probably worth describing exactly how these things run. This is how I start the hub and a local server on Linux (first script uses konsole as they log to stdout - although on Windows you don't see anything, and I don't understand why):

    > cat startup-selenium.sh
    #!/bin/bash
    konsole --hold -e startup-hub.sh &
    sleep 5
    konsole --hold -e startup-rc.sh &

    > cat startup-hub.sh
    #!/bin/bash
    cd .../selenium-grid-1.0.6/
    ant launch-hub

    > cat startup-rc.sh
    #!/bin/bash
    cd .../selenium-grid-1.0.6/
    ant \
      -Dport=5555 \
      -Dhost=10.2.0.0 \
      -DhubURL=http://10.2.0.0:4444 \
      -Denvironment='Firefox on Linux' \
      launch-remote-control

You can see how the server's environment is defined as 'Firefox on Linux'. Note also that the server is told the location of the hub so that it can register itself. You can see the hub's status by pointing a browser at /console on the same URL (so http://10.2.0.0:4444/console in my example above).

I can then run a test:

    > ant -Dbrowser="Firefox on Linux" run-in-sequence
    Buildfile: build.xml

    run-in-sequence:
         [java] [Parser] Running:
         [java]   Selenium Grid Tests In Sequence
         [java]
         [java] 11-Aug-2010 15:45:52 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage startSeleniumSession
         [java] INFO: Contacting Selenium RC at 10.2.0.0:4444
         [java] 11-Aug-2010 15:45:59 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage startSeleniumSession
         [java] INFO: Got Selenese session:com.thoughtworks.selenium.DefaultSelenium@...
         [java] 11-Aug-2010 15:46:16 com.thoughtworks.selenium.grid.tools.ThreadSafeSeleniumSessionStorage closeSeleniumSession
         [java] INFO: Closing Selenese session: com.thoughtworks.selenium.DefaultSelenium@...
         [java]
         [java] ===============================================
         [java] Selenium Grid Tests In Sequence
         [java] Total tests run: 1, Failures: 0, Skips: 0
         [java] ===============================================
         [java]

    BUILD SUCCESSFUL
    Total time: 23 seconds

I will explain how tests are written and structured, but first I need to add one extra point. There's an additional piece of information needed, which is the "driver" used to run the tests. This depends on the browser, and so we need to map from environment to driver. This is done in the file grid_configuration.yaml in the main selenium-grid directory and displayed in the hub console display.

So far I have addressed my point (1), but haven't really explained how (2) is solved. And in truth, I don't completely know - I am simply copying some code that works. However, I do know that the documentation is *particularly* poor on this, so this information is critical.

First, the ant scripts use TestNG to run Java tests. But the tests generated by the IDE (export as Java TestNG) seem to be using an old, unsupported library. Do *not* try hunting down the appropriate class and jar (SeleneseTestNgHelper) because it does not work with the rest of the (ant-based) environment, as far as I can see.

Instead, copy the supplied code used in the (broken!) tests in the Selenium Grid package. In particular, copy GoogleImageTestBase.java and compare it to the code generated by the IDE - it's pretty obvious how to do the translation. Note that the libraries involved do the necessary work of somehow adapting the tests to use the parameters supplied to ant when invoking with a particular environment.

If you do all that then you will need to link to the following jars:

    commons-logging-1.0.4.jar
    selenium-java-client-driver.jar
    testng-5.7-jdk15.jar
    selenium-server-1.0.3-standalone.jar
    selenium-grid-tools-standalone-1.0.6.jar

You will also need a build.xml that can compile that and run the test. I used:

    <project name="..." basedir=".">

      <property name="src.dir" value="${basedir}/src"/>
      <property name="classes.dir" value="${basedir}/classes"/>
      <property name="lib.dir" value="${basedir}/lib"/>

      <path id="classpath">
        <fileset dir="${lib.dir}" includes="**/*.jar"/>
        <pathelement path="${classes.dir}"/>
      </path>

      <target name="compile">
        <mkdir dir="${classes.dir}"/>
        <javac srcdir="${src.dir}" destdir="${classes.dir}"
               classpathref="classpath"/>
      </target>

      <property name="webSite" value="http://10.2.0.0:8080/" />
      <property name="seleniumHost" value="10.2.0.0" />
      <property name="seleniumPort" value="4444" />
      <property name="browser" value="*firefox" />

      <target name="run-in-sequence"
              description="Run Selenium tests one by one">
        <java classpathref="classpath" classname="org.testng.TestNG"
              failonerror="true">
          <sysproperty key="java.security.policy"
                       file="${basedir}/lib/testng.policy"/>
          <sysproperty key="webSite" value="${webSite}" />
          <sysproperty key="seleniumHost" value="${seleniumHost}" />
          <sysproperty key="seleniumPort" value="${seleniumPort}" />
          <sysproperty key="browser" value="${browser}" />
          <arg value="-suitename" />
          <arg value="Selenium Grid Tests In Sequence" />
          <arg value="-d" />
          <arg value="${basedir}/target/reports" />
          <arg value="-testclass"/>
          <arg value="MyClass"/>
        </java>
      </target>

    </project>

which is just a simple copy of the one supplied in the grid package.

I hope that helps!

Andrew
From: andrew cooke <andrew@...>
Date: Wed, 11 Aug 2010 09:36:23 -0400
If you have an encrypted loopback file system that uses cryptmount you can
resize (enlarge) it by doing the following:
- Add extra padding by:
dd if=/dev/zero bs=... count=... >> file
where file is the file that contains the file system (unmounted)
- Mount the file system (this also creates loopback device etc):
cryptmount name
- Unmount the file system:
sudo su
umount /dev/mapper/....
- Check and resize the file system:
e2fsck -f /dev/mapper/...
resize2fs /dev/mapper/...
- And remount
mount /dev/mapper/... path
This is based on the information at
http://wiki.archlinux.org/index.php/System_Encryption_with_LUKS_for_dm-crypt#Resizing_the_loopback_filesystem
but that assumes a LUKS file system.
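For the padding step, the amount added is simply bs multiplied by count. As a sketch of that arithmetic only (append_zeros is a hypothetical helper, run here against a throwaway temporary file rather than a real encrypted image), the same append can be written in Python:

```python
import os
import tempfile


def append_zeros(path, block_size, count):
    """Append block_size * count zero bytes to path and return the new size."""
    block = b"\0" * block_size
    with open(path, "ab") as f:  # append mode, like ">> file" in the shell
        for _ in range(count):
            f.write(block)
    return os.path.getsize(path)


# Demonstrate on a scratch file standing in for the file system image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
new_size = append_zeros(tmp.name, block_size=1024, count=4)
os.remove(tmp.name)
print(new_size)
```

Appending (rather than truncating or rewriting) matters because the existing contents of the file must stay exactly where they are.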
Andrew
From: andrew cooke <andrew@...>
Date: Sat, 7 Aug 2010 19:31:39 -0400
I've been looking at Selenium, which is a system for testing web sites. It's
very easy to use, for simple sites, and works impressively well. Using a
Firefox plugin you "record" browsing a site and can then repeat that in
"playback mode" as a test. You can also export the test as Python or Java
unit tests and run those against all the popular browsers on all the popular
operating systems.
That's pretty neat - you could have a bunch of VMs running different operating
systems and browsers, all running against a test web site. I tested it by
running a Python unit test from Linux that tested Internet Explorer in a
Windows 7 VM.
However, for the particular case I had in mind there are some problems. In
particular, it's not clear how to handle mixed namespaces (eg SVG is not in
the HTML namespace in the DOM) and, given how the system is implemented (as
Javascript accessing the DOM within the browser being tested) it will not work
with Adobe's SVG plugin on Windows.
Notes
-----
The "Selenium IDE" is a Firefox plugin that:
* Lets you define tests (and group them in suites)
* Lets you *run* tests (against Firefox)
* Lets you save tests for future use, or to rerun from other languages
So for simple, Firefox only tests, this is a very quick, simple solution. For
more complex testing it is also a good way of generating tests.
Tests can follow links and do things like verify text (when defining a test,
highlight the text, right click and select the correct option).
A test when saved in the "internal" format is a HTML table of commands. When
saved as Python file it looks like:
from selenium import selenium
import unittest, time, re

class test(unittest.TestCase):
    def setUp(self):
        self.verificationErrors = []
        self.selenium = selenium("localhost", 4444, "*firefox",
                                 "http://www.acooke.org")
        self.selenium.start()

    def test_selenium(self):
        sel = self.selenium
        sel.open("/")
        sel.click("link=Lepl")
        sel.wait_for_page_to_load("30000")
        sel.click("link=Support")
        sel.wait_for_page_to_load("30000")
        sel.click("link=Show Source")
        sel.wait_for_page_to_load("30000")
        try: self.failUnless(sel.is_text_present("Google Code"))
        except AssertionError, e: self.verificationErrors.append(str(e))

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()
To run that, however, and to test other browsers, you need to install "Selenium
RC". This is a bundle of packages that includes the server along with support
for various languages.
The server is a Java program that you run as:
java -jar selenium-server.jar
Then, when you run the Python test...
python test.py
..the server automatically opens firefox and runs the tests. It's just like
someone is using your machine!
The only extra detail needed in practice is that "firefox" must be the binary,
not the usual script at /usr/bin/firefox (on Linux). Otherwise you will see
the error
Caution: '/usr/bin/firefox': file is a script file, not a real
executable. The browser environment is no longer fully under RC control
Since selenium checks for "firefox-bin" before "firefox" the easiest solution
is to:
1 - link firefox-bin to firefox in /usr/lib64/firefox
2 - include /usr/lib64/firefox on the path
Then run the server as:
PATH=$PATH:/usr/lib64/firefox java -jar selenium-server.jar
Note that you need firefox-bin to be in the same directory as firefox or you
will see the error:
Could not read application.ini
With that fix, the server starts and kills firefox, and tests run repeatedly
(without it, only the first test worked - rerunning it somehow treated the
website as a binary file).
So, next question - can we do the same on Windows? I started a VB VM with
Windows 7, containing Java 6 and IE 8. Downloading Selenium RC (the server)
and starting it, then running the same test (on Linux) but changed to call the
windows server, I got the error:
Couldn't open app window; is the pop-up blocker enabled?
Which is described at
http://stackoverflow.com/questions/1517623/internet-explorer-8-64bit-and-selenium-not-working
The simplest solution is to change the browser type to iexploreproxy. The
following code, run on Linux, invokes IE on Windows and gives a successful
test:
from selenium import selenium
import unittest, time, re

class test(unittest.TestCase):
    def setUp(self):
        self.verificationErrors = []
        self.selenium = selenium("10.1.0.28", 4444, "*iexploreproxy",
                                 "http://www.acooke.org")
        self.selenium.start()

    def test_selenium(self):
        sel = self.selenium
        .... as before
Where 10.1.0.28 is the VM address.
This can be automated further with Selenium Grid, which automatically runs
tests on different machines.
OK, so now to test a web page with SVG.
Hmmm. Suddenly (well, hours later....) this is not so great. First, the site
opens new windows on clicks. These have a "_blank" target. Selenium has a
command to identify windows by title, but it doesn't work in the IDE (the
Firefox plugin). After much frustration I found that it *does* work when used
via the server.
Also, because the link is loaded into a new window (ie not the one that
Selenium is watching) the test does not wait for completion. So a fixed
period wait (30s) needs to be added.
And when I click on a link in the SVG, the selenium IDE records nothing. I
can add this explicitly to a Python test, but I can't work out what to add.
I have experimented with specifying a namespace in an XPath expression and
also by using javascript. Javascript can be tested in the browser address box
(running selenium each time is slow because of waiting for pages) and
unfortunately the code below (and related attempts) does not work.
javascript:alert(function(){var a = document.getElementsByTagName('a');var result = [];for (var i=0; i<a.length; i++) {if (a[i].href == 'some-url-here'){return a[i];}}return null;}())
javascript:alert(function(){var a = document.getElementsByTagNameNS('http://www.w3.org/2000/svg', 'svg'); return a[0];}())
(Javascript can be specified inside selenium to identify the target - see the
"dom=" protocol in the documentation for the Python library).
Worse, on IE I suspect this would also be broken by the plugin.
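The namespace issue itself can be seen outside the browser. This sketch uses Python's standard xml.etree.ElementTree on a small, made-up XHTML page with inline SVG; it says nothing about how Selenium or any browser resolves locators, and only shows that an unqualified name does not match namespaced elements, and that (in this invented page) the SVG link keeps its target in an xlink:href attribute rather than a plain href.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# A made-up page: one ordinary HTML link and one link inside inline SVG.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a href="/plain">HTML link</a>
    <svg xmlns="http://www.w3.org/2000/svg"
         xmlns:xlink="http://www.w3.org/1999/xlink">
      <a xlink:href="/shape"><circle r="10"/></a>
    </svg>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# An unqualified name matches neither link, because both are namespaced.
unqualified = root.findall(".//a")

# Qualified names pick out the HTML and SVG links separately.
html_links = root.findall(".//{%s}a" % XHTML_NS)
svg_links = root.findall(".//{%s}a" % SVG_NS)

# The SVG link's target is a namespaced attribute, not a plain "href".
svg_hrefs = [a.get("{%s}href" % XLINK_NS) for a in svg_links]
plain_hrefs = [a.get("href") for a in svg_links]

print(len(unqualified), len(html_links), svg_hrefs, plain_hrefs)
```

Whether this is the reason the Javascript snippets above fail is a guess, but it is the kind of mismatch to look for.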
So I stalled at this point. Selenium works surprisingly well (better than I
expected), but does not handle SVG well - there appear to be problems with
namespaces and, likely, IE's plugin.
Andrew
From: andrew cooke <andrew@...>
Date: Wed, 28 Jul 2010 10:16:53 -0400
There's a nice algorithm for auto-scaling axes, called the "nice number algorithm", written by Paul Heckbert and published in "Graphics Gems" -
http://books.google.com/books?id=fvA7zLEFWZgC&pg=PA61&lpg=PA61&dq=nice+numbers+graphics+gems&source=bl&ots=7LdCq3nI-j&sig=L8qoZ8l_a95KAtHmMjagJ8cC0U0&hl=en&ei=KDhQTKLwGcT48AbTsvnEAQ&sa=X&oi=book_result&ct=result&resnum=2&ved=0CBYQ6AEwAQ#v=onepage&q&f=false

The routines below implement this, but are parameterised over the number base used, so can also be used for axes based on units that repeat over multiples of 12, 60, or any other value.

    from calendar import timegm
    from math import floor, log, log10, ceil
    from time import gmtime

    # These allow the use with base 10, 12 and 60:
    LIM10 = (10, [(1.5, 1), (3, 2), (7, 5)], [1, 2, 5])
    LIM12 = (12, [(1.5, 1), (3, 2), (8, 6)], [1, 2, 6])
    LIM60 = (60, [(1.5, 1), (20, 15), (40, 30)], [1, 15, 40])

    def heckbert_d(lo, hi, ntick=5, limits=None):
        '''
        Calculate the step size.
        '''
        if limits is None:
            limits = LIM10
        (base, rfs, fs) = limits
        def nicenum(x, round):
            step = base ** floor(log(x)/log(base))
            f = float(x) / step
            nf = base
            if round:
                for (a, b) in rfs:
                    if f < a:
                        nf = b
                        break
            else:
                for a in fs:
                    if f <= a:
                        nf = a
                        break
            return nf * step
        delta = nicenum(hi-lo, False)
        return nicenum(delta / (ntick-1), True)

    def heckbert(lo, hi, ntick=5, limits=None):
        '''
        Calculate the axes labels.
        '''
        def _heckbert():
            d = heckbert_d(lo, hi, ntick=ntick, limits=limits)
            graphlo = floor(lo / d) * d
            graphhi = ceil(hi / d) * d
            fmt = '%' + '.%df' % max(-floor(log10(d)), 0)
            value = graphlo
            while value < graphhi + 0.5*d:
                yield fmt % value
                value += d
        return list(_heckbert())

This can then be used with a range of seconds as follows:

    def autoscale_time(start, end):
        '''
        Yields a sequence of epochs that are nicely spaced.
        start and end are Unix epochs.
        '''
        time_chunks = [('days', 3 * 24 * 60 * 60, 24 * 60 * 60, 2, None),
                       ('hours', 3 * 60 * 60, 60 * 60, 3, LIM12),
                       ('minutes', 3 * 60, 60, 4, LIM60),
                       ('seconds', 0, 1, 5, LIM60)]
        for (name, limit, secs, sindex, limits) in time_chunks:
            if (end - start) > limit:
                break
        d = heckbert_d(start / secs, end / secs, limits=limits)
        # zero out lower steps, so that we get a starting date that's an
        # integral number of units
        stime = list(gmtime(start))
        for i in range(sindex, 9):
            stime[i] = 0
        # generate a sequence of epochs (cannot use the usual heckbert routine
        # because formatting will be different)
        value = timegm(stime)
        while value <= end:
            if value >= start:
                yield value
            value += d * secs

This could be extended further by:

- having different formats in the time_chunks parameter, so that different intervals are formatted differently
- adding months etc. This would require changing the "secs" increment to be a timedelta and working with datetime instances rather than epochs (because months are not all equally sized).

NOTE: The code above is cut + pasted from some working code and is not tested in its existing form; I may have introduced a bug somewhere, but hopefully this illustrates the idea.

Andrew
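As a worked example of the base 10 case, here is a compact Python 3 restatement. It is a simplified sketch, not the parameterised routines above: nice_number and nice_ticks are names chosen for this illustration, and the rounding thresholds are the ones in LIM10.

```python
from math import ceil, floor, log10


def nice_number(x, round_to_nearest):
    """Return a 'nice' value (1, 2, 5 or 10 times a power of ten) near x."""
    exponent = floor(log10(x))
    fraction = x / 10 ** exponent  # lies between 1 and 10
    if round_to_nearest:
        if fraction < 1.5:
            nice = 1
        elif fraction < 3:
            nice = 2
        elif fraction < 7:
            nice = 5
        else:
            nice = 10
    else:  # round up to the next nice value
        if fraction <= 1:
            nice = 1
        elif fraction <= 2:
            nice = 2
        elif fraction <= 5:
            nice = 5
        else:
            nice = 10
    return nice * 10 ** exponent


def nice_ticks(lo, hi, ntick=5):
    """Return tick positions that cover [lo, hi] with a nice step."""
    span = nice_number(hi - lo, False)
    step = nice_number(span / (ntick - 1), True)
    first = floor(lo / step) * step
    last = ceil(hi / step) * step
    count = int(round((last - first) / step)) + 1
    return [first + i * step for i in range(count)]


ticks = nice_ticks(0, 97)
print(ticks)
```

For the range 0 to 97 the span is rounded up to 100, a quarter of that (25) is rounded to a step of 20, and the ticks run from 0 to 100.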
From: andrew cooke <andrew@...>
Date: Tue, 27 Jul 2010 11:59:16 -0400
Subversion drives me crazy in how it sets file permissions (keeping whatever
permission the file originally had, and overwriting local changes on update).
Turns out that for the executable bit, you can set a special property that
will get the correct behaviour. For example:
find . -name "*.sh" -exec svn propset svn:executable ON \{} \;
Peace.... :o)
Andrew
From: andrew cooke <andrew@...>
Date: Mon, 26 Jul 2010 00:40:07 -0400
YUI 3 is amazing. It looks terrifyingly complex, but once you get into it,
you can do complex things trivially. I use jQuery at work, and in comparison,
YUI 3 feels like it was written by software engineers rather than people
hacking web pages.
For example, here's how to make a menu that shows as a tab on the right-hand
side of the page that "zooms in" (to the left) when you mouse-over.
First, place the menu in a div in the HTML:
<div class="menu" id="menu-1">
<img src="menu-icon-small.png"/>
<ul>
<li>Foo</li>
<li>Bar</li>
<li>Foo</li>
<li>Bar</li>
</ul>
</div>
Next, add some CSS (probably in a separate stylesheet) so that the icon is on
the left, and will be visible, but the rest of the div is off the page to the
right:
ul {
margin: 10px;
margin-left: 40px;
}
li {
list-style-type: none;
list-style-position: outside;
margin-top: 4px;
margin-bottom: 4px;
}
div.menu {
position: fixed;
right: -278px;
width: 300px;
border: 1px solid #c0c0c0;
border-right: none;
-moz-border-radius: 4px;
background: white;
}
div#menu-1 {
top: 50px;
}
div.menu img {
margin: 1px;
float: left;
}
Import the YUI magic (in the HTML header).
<script type="text/javascript"
src="http://yui.yahooapis.com/combo?3.1.1/build/yui/yui-min.js"></script>
and define the animation:
<script type="text/javascript">
YUI().use('event-mouseenter', 'console', 'anim', function(Y) {
/* new Y.Console().render(); */
var open = new Y.Anim({node: '#menu-1', to: {right: 0},
easing: Y.Easing.easeOut});
var close = new Y.Anim({node: '#menu-1', to: {right: -278},
easing: Y.Easing.easeIn});
Y.on("mouseenter", function (e) {open.run();}, "#menu-1");
Y.on("mouseleave", function (e) {close.run();}, "#menu-1");
});
</script>
and that's it!
The script above sets up the events and defines animations for the div so that
it moves into and out of sight. The "Easing" part even makes it move in a
"natural" manner (it slows down as it extends fully).
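To see what the easing does to the numbers, here is a small sketch in Python (not YUI code). It assumes simple quadratic curves for the two easings; the function names and the number of steps are made up for illustration. It prints nothing, it just computes the "right" offsets an animation would pass through.

```python
def ease_out(t):
    # Fast at first, slowing as t approaches 1 (assumed quadratic curve).
    return 1.0 - (1.0 - t) ** 2

def ease_in(t):
    # Slow at first, speeding up as t approaches 1 (assumed quadratic curve).
    return t ** 2

def positions(start, end, easing, steps=5):
    """Offsets visited when animating from start to end in `steps` frames."""
    return [start + (end - start) * easing(i / float(steps))
            for i in range(steps + 1)]

# Opening the menu: "right" goes from -278 to 0, in ever smaller jumps.
opening = positions(-278, 0, ease_out)
```

With ease_out each jump is smaller than the one before, which is the "slowing down as it extends" effect; with ease_in the jumps grow instead.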
Andrew
From: andrew cooke <andrew@...>
Date: Fri, 23 Jul 2010 10:07:27 -0400
There's not much love for Python 2.4 and SVG (which, I admit, is something of
an odd combination - who would be so conservative they would use Python 2.4
and then require SVG?), particularly if you need a non-GPL solution. So I
ended up writing a simple wrapper around DOM that helps generate the XML.
Here's a summary of the approach (there's nothing really hard here, just
fiddly DOM details):
    # assumed imports (omitted from the summary)
    from xml.dom import Node, getDOMImplementation

    class SvgBase(object):
        '''
        Support class. Contains a reference to the DOM document and
        the element corresponding to this node. Note that we allow any
        node to be extended with any child node; we also allow any
        attribute to be added.
        '''
        SVG = 'http://www.w3.org/2000/svg'
        def __init__(self, doc, element, **attrs):
            self._doc = doc
            self._element = element
            for name in attrs:
                value = attrs[name]
                if value is not None:
                    self._element.setAttributeNS(self.SVG, name, str(value))
        def line(self, (x1, y1), (x2, y2), **attrs):
            '''
            Add a child line
            '''
            return Line(self, (x1, y1), (x2, y2), **attrs)
        # more child methods (and helpers such as _1dp) here.....

    class Svg(SvgBase):
        '''
        The root svg element. This creates the document, allows a
        stylesheet to be used, etc.
        '''
        def __init__(self, version='1.1', width=None, height=None):
            implementation = getDOMImplementation('')
            doctype = implementation.createDocumentType('svg',
                '-//W3C//DTD SVG 1.1//EN',
                'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd')
            document = implementation.createDocument(self.SVG, 'svg', doctype)
            for node in document.childNodes:
                if node.nodeType == Node.ELEMENT_NODE:
                    element = node
                    break
            # xlink lets "a" element work correctly (use xlink:href)
            ns = {'xmlns': self.SVG,
                  'xmlns:xlink': 'http://www.w3.org/1999/xlink'}
            super(Svg, self).__init__(document, element,
                                      version=version, width=width, height=height,
                                      **ns)
        def add_stylesheet(self, url):
            style = self._doc.createProcessingInstruction('xml-stylesheet',
                'href="%s" type="text/css"' % url)
            for node in self._doc.childNodes:
                if node.nodeType == Node.DOCUMENT_TYPE_NODE:
                    self._doc.insertBefore(style, node)
                    break
        def __str__(self):
            '''
            The XML of the entire document.
            '''
            return self._doc.toxml()

    class Line(SvgBase):
        '''
        An example child node.
        '''
        def __init__(self, parent, (x1, y1), (x2, y2), **attrs):
            element = parent._doc.createElementNS(self.SVG, 'line')
            super(Line, self).__init__(parent._doc, element,
                                       x1=self._1dp(x1), y1=self._1dp(y1),
                                       x2=self._1dp(x2), y2=self._1dp(y2),
                                       **attrs)
            parent._element.appendChild(self._element)
With that, it's pretty easy to do things like:
>>> svg = Svg()
>>> svg.line((1,2), (3,4), stroke='blue')
>>> svg.add_stylesheet('http://example.com/foo')
>>> str(svg)
'''<?xml version="1.0" ?>
<?xml-stylesheet href="http://example.com/foo" type="text/css"?>
<!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.1//EN'
'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd'>
<svg version="1.1" xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink">
<line x1="1.0" x2="3.0" y1="2.0" y2="4.0" stroke="blue"/>
</svg>'''
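For anyone trying the same idea on a current Python, here is a stripped-down sketch of the DOM calls involved. It is not the wrapper above: new_svg and add_line are names invented for this example, it uses plain functions rather than classes, and it takes documentElement directly instead of searching childNodes.

```python
from xml.dom.minidom import getDOMImplementation

SVG_NS = 'http://www.w3.org/2000/svg'

def new_svg(width=None, height=None):
    """Create an SVG 1.1 document with doctype and namespace declaration."""
    impl = getDOMImplementation()
    doctype = impl.createDocumentType(
        'svg', '-//W3C//DTD SVG 1.1//EN',
        'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd')
    doc = impl.createDocument(SVG_NS, 'svg', doctype)
    root = doc.documentElement
    # minidom does not write the namespace declaration for us
    root.setAttribute('xmlns', SVG_NS)
    root.setAttribute('version', '1.1')
    for name, value in (('width', width), ('height', height)):
        if value is not None:
            root.setAttribute(name, str(value))
    return doc

def add_line(doc, start, end, **attrs):
    """Append a line element to the root; coordinates get one decimal place."""
    line = doc.createElementNS(SVG_NS, 'line')
    coords = {'x1': start[0], 'y1': start[1], 'x2': end[0], 'y2': end[1]}
    for name in sorted(coords):
        line.setAttribute(name, '%.1f' % coords[name])
    for name in sorted(attrs):
        line.setAttribute(name, str(attrs[name]))
    doc.documentElement.appendChild(line)
    return line

doc = new_svg()
add_line(doc, (1, 2), (3, 4), stroke='blue')
xml = doc.toxml()
```

The tuple parameters in the original methods are Python 2 only, which is why the sketch takes start and end as ordinary arguments and indexes into them.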
Andrew
From: andrew cooke <andrew@...>
Date: Sat, 17 Jul 2010 16:48:30 -0400
Much of this is just duplicating the re2 work, and the rest is RXPY-specific,
but even so, these are quite interesting (for me, at least).
Note that the times are relative to the Python re package (log10 in brackets).
Also, the plots are both linear ### and logarithmic [ ].
First, some basic matching. This is just showing how slow my code is
(hundreds of times slower than the re package).
Match a(.)c against abc
Backtracking [########## ] 95 [2.0]
Parallel wide [####################### ] 206 [2.3]
Parallel wide, hashed [#################################### ] 315 [2.5]
Parallel seq [####################### ] 205 [2.3]
Parallel seq, hashed [################################### ] 304 [2.5]
Parallel beam [############################ ] 245 [2.4]
Parallel beam, hashed [######################################] 339 [2.5]
Match (a)b(?<=(?(1)b|x))(c) against abc
Backtracking [############ ] 210 [2.3]
Parallel wide [######################### ] 398 [2.6]
Parallel wide, hashed [################################# ] 531 [2.7]
Parallel seq [######################### ] 400 [2.6]
Parallel seq, hashed [################################## ] 546 [2.7]
Parallel beam [############################## ] 480 [2.7]
Parallel beam, hashed [######################################] 622 [2.8]
Match a*b against a^100b
Backtracking [#################################### ] 1902 [3.3]
Parallel wide [################################# ] 1737 [3.2]
Parallel wide, hashed [#################################### ] 1919 [3.3]
Parallel seq [################################ ] 1733 [3.2]
Parallel seq, hashed [#################################### ] 1919 [3.3]
Parallel beam [################################### ] 1884 [3.3]
Parallel beam, hashed [######################################] 2073 [3.3]
Match .*b against a^100b
Backtracking [################# ] 1902 [3.3]
Parallel wide [############### ] 1769 [3.2]
Parallel wide, hashed [################# ] 1927 [3.3]
Parallel seq [############### ] 1757 [3.2]
Parallel seq, hashed [################## ] 2015 [3.3]
Parallel beam [################################### ] 3825 [3.6]
Parallel beam, hashed [######################################] 4300 [3.6]
Above is quite interesting because the beam search is suddenly twice as slow,
just by changing from "a*b" to ".*b". That's because the second expression
has a failure ("." matches the final "b") and so needs to backtrack. Because
the initial beam width is 1, that means that the entire search must be repeated
with doubled width.
The next test is a search.
Search .*b against a^100b
Backtracking [ ] 1172 [3.1]
Parallel wide [######################################] 41841 [4.6]
Parallel wide, hashed [ ] 1459 [3.2]
Parallel seq [ ] 1212 [3.1]
Parallel seq, hashed [ ] 1338 [3.1]
Parallel beam [### ] 3764 [3.6]
Parallel beam, hashed [### ] 3688 [3.6]
Above shows the problem with a naive wide parallel approach - there are an
awful lot of different states to store. Of course, most are duplicates so the
hashing removes the problem (as does sequential or beam search).
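A toy illustration of why discarding duplicates matters. This is not RXPY's implementation: the item representation and the names closure and match_items are invented for the sketch, and it only handles patterns that are a sequence of single-character items, each either optional or required. The live states are kept in a set, so two paths that reach the same position collapse into one.

```python
def closure(items, positions):
    """Add every position reachable by skipping optional items."""
    result = set()
    for pos in positions:
        result.add(pos)
        while pos < len(items) and items[pos][1]:
            pos += 1
            result.add(pos)
    return result

def match_items(items, text):
    """True if the whole of text matches the sequence of items.

    items is a list of (allowed_characters, is_optional) pairs.
    """
    states = closure(items, set([0]))
    for ch in text:
        advanced = set()
        for pos in states:
            if pos < len(items) and ch in items[pos][0]:
                advanced.add(pos + 1)
        # the set guarantees at most len(items) + 1 states per character
        states = closure(items, advanced)
        if not states:
            return False
    return len(items) in states

def optional_then_required(n):
    """Items for 2n optional [ab] followed by n copies of "ab"."""
    return [('ab', True)] * (2 * n) + [('a', False), ('b', False)] * n
```

Because the set never holds more than one entry per pattern position, the work per input character is bounded by the pattern length, however many ways there are of reaching each position.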
The next tests show the exponential explosion problem in Python's matcher.
This is a little more difficult to trigger than in the re2 paper because
Python has a "shortcut" (hack!) that avoids the problem in very simple cases
(but, as we will see below, does nothing to address the underlying issues).
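In the test names that follow, "^n" means "repeated n times", so the expression is 2n optional terms followed by n copies of "ab". A short sketch of how such a pattern and its input could be built for Python's re package (explosive_pattern and subject are names made up here; this is not the benchmark code itself):

```python
import re

def explosive_pattern(n):
    # 2n optional single-character terms, then n required "ab" pairs
    return '(?:a|b)?' * (2 * n) + '(?:ab)' * n

def subject(n):
    # the input is exactly long enough for the required part only
    return 'ab' * n

# Small n is quick; the cost for a backtracking matcher grows rapidly with n.
found = re.match(explosive_pattern(2), subject(2))
```

A backtracking matcher first lets the optional terms consume input and then has to give characters back, one combination at a time, until the required part fits.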
Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=4
Backtracking [############# ] 573 [2.8]
Parallel wide [######### ] 392 [2.6]
Parallel wide, hashed [## ] 125 [2.1]
Parallel seq [######### ] 398 [2.6]
Parallel seq, hashed [## ] 125 [2.1]
Parallel beam [######################################] 1606 [3.2]
Parallel beam, hashed [######## ] 360 [2.6]
Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=6
Backtracking [######################################] 637 [2.8]
Parallel wide [############################# ] 476 [2.7]
Parallel wide, hashed [# ] 27 [1.4]
Parallel seq [############################# ] 475 [2.7]
Parallel seq, hashed [# ] 27 [1.4]
Parallel beam, hashed [## ] 51 [1.7]
Match (?:a|b)?^2n(?:ab)^n against (ab)^n for n=8
Parallel wide, hashed [############## ] 4 [0.6]
Parallel seq, hashed [############## ] 4 [0.6]
Parallel beam, hashed [######################################] 10 [1.0]
As the number of terms increases, only the hashed approaches are efficient (due
to the explosion in the number of states). This is as expected. What is nice
to see is that by n=8 my Python code is only 4x slower than the re package!
This is because the re backtracker is "exploding", while my code is linear
(but slow).
This is only possible, though, when we can discard duplicate state. Which
isn't so easy when matching groups:
Match (a|b)?^2n(?:ab)^n against (ab)^n for n=4
Backtracking [####### ] 481 [2.7]
Parallel wide [##### ] 378 [2.6]
Parallel wide, hashed [######### ] 627 [2.8]
Parallel seq [###### ] 414 [2.6]
Parallel seq, hashed [########## ] 667 [2.8]
Parallel beam [########################## ] 1612 [3.2]
Parallel beam, hashed [######################################] 2415 [3.4]
Match (a|b)?^2n(?:ab)^n against (ab)^n for n=6
Backtracking [########################### ] 512 [2.7]
Parallel wide [###################### ] 415 [2.6]
Parallel wide, hashed [######################################] 723 [2.9]
Parallel seq [###################### ] 418 [2.6]
Parallel seq, hashed [######################################] 714 [2.9]
And here we do no better than normal backtracking.
The main conclusions, then, are:
1 - Python's re package does suffer from the exponential "explosion" issue,
while my code (with duplicate state elimination) does not.
2 - My code can also handle groups, but then degrades in performance, as
expected.
So RXPY is both general and, when possible, efficient. Even if it is terribly
slow.
A secondary conclusion is that the beam search approach does not seem to be
worth the trouble.
Andrew
This is my blog. It used to be a mailing list called C[omp]ute. It is still generated by email. You can reply to comments via the appropriate link. Edit the mail address to remove the anti-spam measure. However, given the very low volume of replies, and the high rate of spam, it can be months before I moderate a post. Sorry. © 2006-2009 Andrew Cooke (site) / post authors (content).