
C[omp]ute

© 2006-2013 Andrew Cooke (site) / post authors (content).

iBatis ORM and Caching Strategy - a Use Case

From: "andrew cooke" <andrew@...>

Date: Sat, 27 Sep 2008 04:34:37 -0400 (CLT)

The Application

I am writing a Java application that presents information, stored in a
database, via a web server.  The presentation does not modify the data,
but the database does evolve, slowly, on disk (modified by other
processes).

The data themselves can be divided into two groups: first a set of related
objects that describe the system being measured (in this context they
could be considered "metadata"); second a set of measurements.  There are
many more measurements than there are metadata.

Some iBatis Details

iBatis provides very simple caching at the SQL statement level.  So if a
query is repeated with the same parameters then the result may be
retrieved from the cache.  The advantage of this approach is that it is
simple to implement and understand.  The disadvantage is that it is not
"intelligent", in that it does not recognise object identities.

For example, if two different queries both return the object with a given
key, iBatis will cache two separate instances of that object - one for
each query.
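
A minimal sketch of that behaviour, using a toy cache rather than iBatis
itself.  The names (Station, StatementCache, the statement ids) are
invented for illustration; the only assumption taken from the text is that
results are keyed by statement plus parameters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical domain class, not taken from the application described here.
final class Station {
    final int id;
    final String name;
    Station(int id, String name) { this.id = id; this.name = name; }
}

// Toy statement-level cache: results are keyed by statement id plus parameter,
// so the cache knows nothing about the identity of the objects it stores.
final class StatementCache {
    private final Map<String, Object> results = new HashMap<>();

    @SuppressWarnings("unchecked")
    <P, R> R query(String statementId, P param, Function<P, R> loader) {
        String cacheKey = statementId + "|" + param;
        // The loader stands in for running the SQL and mapping the row.
        return (R) results.computeIfAbsent(cacheKey, k -> loader.apply(param));
    }
}
```

Repeating "stationById" with the same parameter returns the cached
instance, but a "stationByName" query that maps the same row builds and
caches a second, separate Station.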

iBatis is also limited in how it will map graphs - it does not provide
transparent retrieval of related objects.  So if one class references
another class then either both must be mapped in a single query to iBatis,
or a second call must be made explicitly (in which case the first instance
presumably contains a key that will be used to find the correct value).

This limitation can be removed by using a second layer - for example
Spring AOP.

My Solution

I have decided to use a very simple mapping approach with two levels of
cache.  The first level of cache is the iBatis statement cache; the second
is a set of maps from keys to instances for the "metadata" objects.  This
second level holds all metadata instances in-memory.  The caches are in a
singleton "database interface" class.

The connections between related objects are not explicit in the objects
themselves.  Instead, the "database interface" provides methods that
retrieve related objects (for example "List<Foo> getFoosForBaz(Baz)").

One drawback to this approach is that it is not very "OO".  The instances
do not form a graph that can be traversed directly.  For this particular
application that is not an issue - there is very little "analysis" of the
data (it is just "dumb presentation").

Another drawback is that for each metadata class the ORM manages two
different types - instances and keys.  iBatis must return instances when
asked for "all instances" to populate the cache, but must return only keys
when asked about relations between objects (the keys are then used to
retrieve instances from the second level cache).  In practice, with some
basic engineering (return factory objects rather than keys; these take the
"database interface" and extract the cached value), the extra work is
minimal.
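
A rough sketch of the shape this takes, under stated assumptions: Foo,
FooRef and MetadataDb are illustrative names, the maps stand in for the
results of iBatis queries, and getFoosForBaz takes a Baz key rather than a
Baz to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical metadata class; the name is illustrative only.
final class Foo {
    final int key;
    final String label;
    Foo(int key, String label) { this.key = key; this.label = label; }
}

// What a relation query returns in place of a bare key: a small factory that
// resolves to the shared instance once it is handed the database interface.
final class FooRef {
    private final int key;
    FooRef(int key) { this.key = key; }
    Foo resolve(MetadataDb db) { return db.getFoo(key); }
}

// Stand-in for the singleton "database interface".
final class MetadataDb {
    // Second level cache: every Foo, by key.
    private final Map<Integer, Foo> foosByKey = new HashMap<>();
    // Stand-in for cached relation queries, which yield only references.
    private final Map<Integer, List<FooRef>> fooRefsByBaz = new HashMap<>();

    // Populate the second level cache, as the "all instances" query would.
    void loadFoos(List<Foo> all) {
        for (Foo foo : all) foosByKey.put(foo.key, foo);
    }

    // Record the result of a relation query for one Baz key.
    void loadRelation(int bazKey, List<FooRef> refs) {
        fooRefsByBaz.put(bazKey, refs);
    }

    Foo getFoo(int key) { return foosByKey.get(key); }

    // Resolve each reference against the second level cache.
    List<Foo> getFoosForBaz(int bazKey) {
        List<Foo> result = new ArrayList<>();
        List<FooRef> refs =
            fooRefsByBaz.getOrDefault(bazKey, Collections.<FooRef>emptyList());
        for (FooRef ref : refs) result.add(ref.resolve(this));
        return result;
    }
}
```

Two relations that mention the same key resolve to the same Foo instance,
so nothing is duplicated.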

The advantages are that the scheme is fast, compact, and relatively easy
to understand.  It is fast because almost all information is cached:
metadata instances are cached in the second level caches; relationships
are cached (as lists of keys) in the iBatis cache.  It is compact because
no duplicate instances are created (except when a cache expires).

The simplicity is not so obvious until you start to consider details like
cache expiry.  Because there are no explicit relations each pool of
instances (caches are grouped by class) can expire independently.  There
is no worry about instances being "trapped" in complex graphs.  There
should be no memory leaks.
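
To illustrate why expiry stays simple, here is a sketch of one such pool.
ExpiringPool, its loader and the injected clock are assumptions made for
the example, not the code used in the application.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// One pool per metadata class. Other pools refer to its entries only by key,
// so the whole map can be replaced without walking any object graph.
final class ExpiringPool<K, V> {
    private final Supplier<Map<K, V>> loader;  // e.g. the "all instances" query
    private final long ttlMillis;
    private final LongSupplier clock;          // injectable so expiry can be tested
    private Map<K, V> instances = new HashMap<>();
    private long loadedAt;
    private boolean loaded = false;

    ExpiringPool(Supplier<Map<K, V>> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    synchronized V get(K key) {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            instances = loader.get();  // replace the pool wholesale
            loadedAt = now;
            loaded = true;
        }
        return instances.get(key);
    }
}
```

In normal use the clock would be System::currentTimeMillis.  Once a pool
is replaced the old instances are unreachable from the caches and can be
collected.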

This approach is very different to how I used SQLAlchemy in Python.  The
approach I used there was much more "sophisticated", with transparent,
lazy retrieval of related objects.  In that case I was writing a client
application that did not need to be long-lived, reliable, or efficient
(within reason, of course).

It's possible that I am being too conservative in this case.  Spring AOP
(or Hibernate) might have made the code more transparent at no extra cost
in practice.  But I think this was a reasonable approach to use here,
given the circumstances (both the application - particularly the "read
only" nature - and my limited knowledge).

Andrew

More iBatis Comments

From: "andrew cooke" <andrew@...>

Date: Sat, 27 Sep 2008 09:01:17 -0400 (CLT)

When I was first deciding which ORM solution to use for this project I
read, somewhere, a comment suggesting that Hibernate and iBatis were both
good products, and that the choice of which to use depended on how you
wanted to think about the database: Hibernate was best for those who saw
the database as a transparent store for Java code; iBatis was better for
those who wanted to explicitly manage an interface between Java and SQL.

I don't know if that is true, but one more advantage of the approach
described here is that there's no real obligation to follow a "data
model".  Last night I was working through some use cases for the user
interface and realised that, for "usability", I needed to present a
relationship between the metadata objects that was completely unintuitive
- it would not have been present in a traditional graph of objects and
will require some custom SQL to generate.  That's no problem here.

(You could argue that I simply have a bad design.  I would say that the
intuitive / straw-man data model above is actually a good design and that
this relationship is really an alternative view of the system that is
particularly useful in one context.  So in a more "OO" approach it would
be better considered the result of some analysis by the system (and still
not directly expressed in the data model).

The difference then comes down to how to manage that analysis in an
efficient manner; with the approach outlined here I can place the analysis
in the database and rely on the existing caches.  If I had been working
with transparently mapped objects I would have been tempted to do the
analysis in Java and add an explicit cache.  From my point of view (ie
preferring to use SQL to do a complex query rather than write Java code)
the approach here is simpler.)


Another possible advantage is the clear separation of concerns.  Quite
naturally my implementation has a single, well-defined interface that
describes all the information I retrieve from the database.  A requirement
that I didn't mention earlier is that the system be easy to move to an
in-house (client) data storage system that is not based on SQL.  Again: no
problem here.


Finally, given the somewhat "crude" (or "hands on") approach I have
described, you may ask whether iBatis is "worth it".  It most certainly
is.  First, it does most of the work of constructing objects for me
(although there is often a final pass to explicitly retrieve objects from
the second level cache).  Second, it allows easy separation of SQL and
Java (in separate files).  Third, it provides a mechanism for adapting the
SQL to different databases (I pass the engine name as a parameter in
queries and that can be used by iBatis to construct the correct query). 
Fourth, it provides the useful first level cache.

Andrew

Simplified Caching; Problem with iBatis, Spring and OSCache

From: "andrew cooke" <andrew@...>

Date: Sat, 27 Sep 2008 20:36:36 -0400 (CLT)

After writing the above and sleeping on it, I realised that I had made
things unnecessarily complicated.  A single cache (provided by iBatis) is
sufficient.  The only modifications needed are:

- Instead of caching all instances in a second level cache, we
  retrieve instances explicitly via their keys, one by one.  This
  places each instance in the iBatis cache.

- Any query that references an instance returns the key, wrapped in
  a factory as before.  The factory is then passed the database
  interface and requests the instance from the cache.

- The factories are themselves cached by iBatis.  So they can be
  made mutable, storing a weak reference to the instance that was
  retrieved.  If they are then re-used they do not need to repeat
  the lookup (the factory "short circuits" via the stored value).
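
A sketch of such a factory, with invented names (Sensor, SensorFactory).
The lookup function is a stand-in for the keyed query on the database
interface; how the real factories receive that interface is not shown
here.

```java
import java.lang.ref.WeakReference;
import java.util.function.IntFunction;

// Hypothetical metadata class, for illustration only.
final class Sensor {
    final int key;
    Sensor(int key) { this.key = key; }
}

// Wraps a key and remembers, weakly, the instance it last resolved to.
final class SensorFactory {
    private final int key;
    private WeakReference<Sensor> cached = new WeakReference<Sensor>(null);

    SensorFactory(int key) { this.key = key; }

    // 'lookup' stands in for the keyed query on the database interface.
    synchronized Sensor get(IntFunction<Sensor> lookup) {
        Sensor sensor = cached.get();
        if (sensor == null) {
            // First use, or the instance has been collected: look it up again.
            sensor = lookup.apply(key);
            cached = new WeakReference<Sensor>(sensor);
        }
        return sensor;
    }
}
```

The weak reference means a cached factory does not keep an instance alive
after the instance has dropped out of the cache; the cost is a repeated
lookup once the instance has been collected.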

That required a couple of hours to implement and works fine.


I then decided to investigate how the caching was working.  To do this I
switched to OSCache and monitored the cache administrator via JMX - see
http://www.opensymphony.com/oscache/wiki/JMX Monitoring.html

That would have worked perfectly, except that iBatis will not use the
OSCache created in Spring.  It always creates its own instance.  After
looking in detail at the iBatis docs (the dev API at
http://ibatis.apache.org/docs/java/dev/index.html ) I realised that this
is a consequence of the caching architecture - the CacheModel creates a
CacheController instance via the constructor.  There is no way to inject a
value!

This is a serious design flaw for iBatis.  It means that caching cannot be
integrated with Spring (there's not even any way to programmatically
retrieve the cache that I can see).

Andrew

More Info on IBatis-Based Project

From: "andrew cooke" <andrew@...>

Date: Thu, 16 Oct 2008 09:14:37 -0300 (CLST)

I wrote a paper summarising the architecture of the project, including
iBatis.  It's here - http://www.acooke.org/kpi.pdf - and might be useful
as background for the comments above (or as more detail about how I used
iBatis within a J2EE web application).

Andrew

iBATIS Caching

From: "andrew cooke" <andrew@...>

Date: Sun, 16 Nov 2008 16:00:15 -0300 (CLST)

I was going to add a post summarizing this, but it's simpler to just
include the whole email.  Thanks, Clinton.

---------------------------- Original Message ----------------------------
Subject: iBATIS Caching
From:    "Clinton Begin" <clinton.begin@...>
Date:    Tue, November 4, 2008 12:34 am
To:      andrew@...
--------------------------------------------------------------------------

Hi Andrew,

A friend of mine forwarded a post on to me and I thought I might offer some
advice...

If you'd like iBATIS to use the Spring instance of OSCache, you can create
your own implementation of CacheController.  The interface is quite simple
and self explanatory (but is documented in the javadocs a little more, as
well as the user guide, wiki and our book).

Your cache controller implementation can request the instance of OSCache
from Spring's app context.  It should be quite simple.  To use your
implementation, you can configure it like any other cache.

  <cacheModel id="someCache" type="org.acooke.some.app.cache.CustomOSCache"
              readOnly="true" serialize="false">
    <flushInterval hours="24"/>
    <flushOnExecute statement="updateAccountViaInlineParameters"/>
    <property name="size" value="1"/>
  </cacheModel>

Any <property> elements will be passed to setProperties after
instantiation.  You can use a typeAlias to shorten the type attribute if
you're going to use it more than once.

public interface CacheController {
  public void flush(CacheModel cacheModel);
  public Object getObject(CacheModel cacheModel, Object key);
  public Object removeObject(CacheModel cacheModel, Object key);
  public void putObject(CacheModel cacheModel, Object key, Object object);
  public void setProperties(Properties props);
}

Hope that helps,

Clinton
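
A hedged sketch of the idea in the email above, written so that it
compiles without iBATIS, Spring or OSCache on the classpath.  Everything
here is a stand-in: SketchCacheController mirrors the quoted interface but
takes a String id instead of a CacheModel, SharedCacheHolder is a
hypothetical replacement for asking Spring's app context for the shared
cache, and a plain Map plays the part of OSCache.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Simplified stand-in for the interface quoted in the email: the CacheModel
// argument is reduced to a String id so the sketch compiles on its own.
interface SketchCacheController {
    void flush(String cacheModelId);
    Object getObject(String cacheModelId, Object key);
    Object removeObject(String cacheModelId, Object key);
    void putObject(String cacheModelId, Object key, Object value);
    void setProperties(Properties props);
}

// Hypothetical holder that the container populates at start-up. The
// controller is instantiated by the framework, so it pulls the shared
// store from here instead of having it injected.
final class SharedCacheHolder {
    private static volatile Map<Object, Object> store;
    private SharedCacheHolder() {}
    static void set(Map<Object, Object> shared) { store = shared; }
    static Map<Object, Object> get() {
        Map<Object, Object> current = store;
        if (current == null) throw new IllegalStateException("shared cache not set");
        return current;
    }
}

final class DelegatingCacheController implements SketchCacheController {
    // Scope each key by cache model id so several models can share one store.
    private static Object scoped(String cacheModelId, Object key) {
        return Arrays.asList(cacheModelId, key);
    }

    public void flush(String cacheModelId) {
        // Remove only the entries that belong to this cache model.
        SharedCacheHolder.get().keySet().removeIf(k ->
            k instanceof List
                && ((List<?>) k).size() == 2
                && cacheModelId.equals(((List<?>) k).get(0)));
    }

    public Object getObject(String cacheModelId, Object key) {
        return SharedCacheHolder.get().get(scoped(cacheModelId, key));
    }

    public Object removeObject(String cacheModelId, Object key) {
        return SharedCacheHolder.get().remove(scoped(cacheModelId, key));
    }

    public void putObject(String cacheModelId, Object key, Object value) {
        SharedCacheHolder.get().put(scoped(cacheModelId, key), value);
    }

    public void setProperties(Properties props) {
        // Nothing to configure here; the shared store is configured elsewhere.
    }
}
```

Because the store is created and owned outside the controller, the same
instance can also be exposed for monitoring.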
