APS Updates

Consuming XCRI-CAP II: XCRI eXchange Platform (XXP)

XXP experiences

Since I helped to specify the XCRI eXchange Platform, and I’m currently seeking more institutions to use it, I do have an interest. However, I don’t do the very techie database or systems development work on it, so I’m more of a very experienced user and part designer.

[Figure: XXP overview]

The purpose of XXP is to provide an XCRI-CAP service platform, so it has facilities for loading XCRI-CAP data, though not yet fully automatic ones. The platform has been designed specifically for XCRI-CAP, so its main functions are to provide input and output services that are likely to be relevant to the community. For example, it has CPD and Part Time course data entry facilities, enabling providers to key and maintain these types of course very easily, with vocabularies optimised for the purpose. There is also a CSV loader for those who can output CSV but not XCRI-CAP – this effectively provides a conversion from CSV to XCRI-CAP 1.2 because, as with all the XXP services, loading the data in enables the creation of output XCRI-CAP feeds (both SOAP and RESTful).

Importantly, XXP has a feed register (discovered by our Cottage Labs colleagues for their Course Data Programme demonstrator project), so that you can find out where a feed is, who’s responsible for it, what it covers and so on.

XXP is defined by the input and output requirements that APS and Ingenius Solutions have currently provided in response to their perception of market demand. These requirements necessarily change as more institutions get their data sorted out. While the focus in XXP is on acting as an agent for a provider (a university or college), XXP is effectively an interface between the provider and other aggregating organisations. It enables the creation of ‘value-added’ feeds enhanced by extra data (such as the addition of vocabularies, like those for course type or subject) and by transformation of data (typically concatenating or splitting text fields, or mapping from one classification system or vocabulary to another).

Getting XCRI-CAP data into XXP is at the moment not completely automatic. The main routines are through a manual load – which is fairly time consuming – or through an automatic CSV load (data2XCRI service), requiring a CSV file. In fact (and somewhat bizarrely) it’s not difficult to produce the CSV file from an existing XCRI-CAP file, then load it in. This is a stopgap measure till XXP has a fully functioning XCRI-CAP loader.
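To give a feel for why producing the CSV from an existing XCRI-CAP file is not difficult, here is a minimal Python sketch that flattens course and presentation pairs into rows. It is not the data2XCRI code, and the column layout is made up for illustration rather than taken from the real CSV template. The namespace URIs are the ones I would expect in an XCRI-CAP 1.2 feed, so check them against your own file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace URIs assumed for an XCRI-CAP 1.2 feed; verify against your own data.
NS = {
    "xcri": "http://xcri.org/profiles/1.2/catalog",
    "dc": "http://purl.org/dc/elements/1.1/",
    "mlo": "http://purl.org/net/mlo",
}


def xcri_to_csv(xml_text: str) -> str:
    """Flatten each course > presentation pair into one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical column layout, not the real data2XCRI template.
    writer.writerow(["course_id", "course_title", "presentation_id", "start"])
    for course in root.iterfind(".//xcri:course", NS):
        course_id = course.findtext("dc:identifier", default="", namespaces=NS)
        title = course.findtext("dc:title", default="", namespaces=NS)
        for pres in course.iterfind("xcri:presentation", NS):
            pres_id = pres.findtext("dc:identifier", default="", namespaces=NS)
            start = pres.findtext("mlo:start", default="", namespaces=NS)
            writer.writerow([course_id, title, pres_id, start])
    return out.getvalue()
```

Calling `xcri_to_csv(open("feed.xml", encoding="utf-8").read())` on a feed (the file name is just an example) would give CSV text ready to save and load.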

So far I have loaded XCRI-CAP into XXP using a push method – I stay in control of the operation and can make sure it all works as expected. XXP has a straightforward read-only View function so you can see the data in the system once it is loaded. If changes need to be made, then you make them at source (upstream); if there were an edit function for the XXP-loaded data, you would wipe out those edits the next time you loaded the data in.

As the data content going into XXP is controlled directly by the provider, XXP imports whole data sets, not updates. This simplifies the process considerably on both sides, which can focus entirely on live complete data sets. Maybe this needs a bit more explanation. I figure that if the provider controls the data, then the current data in XXP won’t have been ‘enhanced’ by manual edits or upgraded data. Therefore, it’s safe to completely overwrite all the data for the provider – that won’t wipe out anything useful that we’re not going to add back in. This is in contrast to ‘delta update’ methods that compare old and new data sets and just pump in the changed material. It’s much simpler, which has some merit.
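The whole-data-set approach can be shown in a few lines of Python. This is only a sketch of the idea, using a hypothetical in-memory dictionary in place of the XXP database: each load replaces everything held for that provider and leaves other providers alone, with no comparison against the old data.

```python
from typing import Dict, List

# Hypothetical in-memory store: provider id -> list of course records.
Store = Dict[str, List[dict]]


def load_whole_dataset(store: Store, provider_id: str, courses: List[dict]) -> None:
    """Overwrite everything held for one provider with the new data set."""
    # No diff against the old data: the provider's latest load is the truth.
    store[provider_id] = list(courses)


store: Store = {
    "uni-a": [
        {"id": "c1", "title": "Old title"},
        {"id": "c2", "title": "Course since withdrawn"},
    ],
    "uni-b": [{"id": "x1", "title": "Untouched"}],
}

# uni-a sends its current live data set; c2 disappears without special handling.
load_whole_dataset(store, "uni-a", [{"id": "c1", "title": "New title"}])
```

A delta update would need logic to detect that c1 changed and c2 was withdrawn; here both simply fall out of the overwrite.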

Some of the difficulties that had to be overcome in the XXP aggregation:

  • Use of URLs as internal identifiers (i.e. inside XXP) for linking courses and presentations – this is overcome either by using a newly minted internal identifier or by reconstructing the identifier (keeping the unique right-hand part).
  • On-the-fly refinements using xsi:type – this is a technical problem as many tools don’t like (read: tolerate) xsi:type constructions, or indeed any kind of redefinition, extension or restriction. This requires workarounds for, or at least careful handling of, extended <description> types.
  • Non-normalised material in XCRI-CAP structures. For example, <venue> is nested in presentations, therefore repeated. As the XCRI-CAP is parsed, you may find new venues or repeated venues that need to be processed. Ideally all venues should be processed prior to course>presentation structures, so it may be best to pass once through the file to discover all the venues, then a second time to populate the rest.
  • Incomplete bits. For example, the venues referred to in the previous bullet may simply have a title and postcode. XXP has a facility for adding missing data to venues, so that the output XCRI-CAP feed can be more complete.
  • Matching of vocabularies. Some feeds may use JACS, others may use LDCS, others simply keywords, and yet all the data goes into a subject field – this requires a method to store the name of the classification and its version number (JACS 1.7, 2 and 3 are substantially different).
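Two of these points, the URL identifiers and the repeated venues, can be sketched together in Python. This is an illustration and not how XXP itself is built. The namespace URIs and the position of title and postcode inside <venue> are assumptions about an XCRI-CAP 1.2 feed, and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace URIs assumed for an XCRI-CAP 1.2 feed; verify against your own data.
NS = {
    "xcri": "http://xcri.org/profiles/1.2/catalog",
    "dc": "http://purl.org/dc/elements/1.1/",
    "mlo": "http://purl.org/net/mlo",
}


def local_id(identifier: str) -> str:
    """Keep the unique right-hand part of a URL-style identifier."""
    path = urlparse(identifier).path.rstrip("/")
    return path.rsplit("/", 1)[-1] if path else identifier


def venue_key(venue: ET.Element) -> tuple:
    """Identify a venue by title and postcode (an assumed matching rule)."""
    title = venue.findtext(".//dc:title", default="", namespaces=NS).strip()
    postcode = venue.findtext(".//mlo:postcode", default="", namespaces=NS).strip()
    return (title, postcode)


def parse_two_pass(xml_text: str):
    root = ET.fromstring(xml_text)

    # Pass 1: discover every distinct venue once, wherever it is nested.
    venues = {}
    for venue in root.iterfind(".//xcri:venue", NS):
        key = venue_key(venue)
        venues.setdefault(key, {"title": key[0], "postcode": key[1]})

    # Pass 2: build courses and presentations that point at known venues.
    courses = []
    for course in root.iterfind(".//xcri:course", NS):
        presentations = []
        for pres in course.iterfind("xcri:presentation", NS):
            pres_id = pres.findtext("dc:identifier", default="", namespaces=NS)
            presentations.append({
                "id": local_id(pres_id),
                "venues": [venue_key(v) for v in pres.iterfind("xcri:venue", NS)],
            })
        course_id = course.findtext("dc:identifier", default="", namespaces=NS)
        courses.append({"id": local_id(course_id), "presentations": presentations})
    return venues, courses
```

Because venues are collected first, a venue repeated across many presentations is stored once, and any missing details can be added to that single record before the feed is output again.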

A substantial advantage of XXP is that once you’ve put the data in (in whatever format), you can get it out very easily – currently as XCRI-CAP SOAP and RESTful getCourses, but there’s no reason why other APIs couldn’t be added for JSON, HTML, RDF and so on. This effectively means that XXP can have mapping and transformation services into and out of XCRI-CAP, adding value for particular ‘flavours’ or for new versions.


XCRI-CAP: turn 12 days of keying into 3 hours of checking.

Written by benthamfish

February 25, 2013 at 5:30 pm

Posted in Uncategorized
