APS Updates

Archive for the ‘XCRI’ Category

UCAS Postgraduate Launch


The new UCAS Postgraduate system is up and running, replacing the old Data Collection system (at least for PG).

We’ve been looking forward to having a look at this new system for a number of reasons, not least that UCAS put substantial effort into keeping institutions apprised of progress during development, with regular webinars. Better still, they seem to have taken on board a lot of what people have been asking for, so in theory this should be a more straightforward system to use. I’ve already spotted (and used) the feedback button on the course editing side – we can only hope that people will give feedback and that it will be taken into account. If so, this system could end up being one of the better ones to use.

What’s really got us excited, though, is that UCAS confirmed throughout the process that this system will be taking XCRI-CAP feeds! Anyone who’s worked with us will know that we’ve been involved with XCRI-CAP right from helping to write the specification, through many projects and the Course Data Programme, and now in working with Prospects to get it rolled out for PG courses. It’s fantastic to have another aggregator on board. As the British Standard for communicating course marketing information, XCRI-CAP is just what is needed to get consistency and accuracy across all aggregators.

We were warned that XCRI-CAP functionality would not be up and running straight away, but we have hopes that in the next couple of months there will be an update to include it.  Looking at the course editing area it certainly looks well structured for XCRI-CAP and I can’t wait to try setting up a feed.  As course marketing distributors for Birkbeck, University of London and The OU this will save us, and them, a lot of time once it’s ready as no keying will be needed for Prospects or UCAS PG.

To find out more about XCRI-CAP take a look at the XCRI website.  For free support creating your own XCRI-CAP feed contact me at jennifer@alanpaull.co.uk.


Written by jennifermdenton

June 17, 2016 at 2:12 pm

Posted in coursedata-tech, UCAS, XCRI

Collectible Course Information: A Quick Look at UCAS’ Course Collect


We’ve been working recently with a couple of our university clients on the brand spanking new UCAS Course Collect system. This is a data entry service, or if you prefer, a part of the UCAS website where you can key in courses information. It captures information for course marketing purposes and relevant stuff for the UCAS admissions system. It replaces the old netupdate / web-link for courses.

Course Collect Screen Shot


Like all new systems, Course Collect has had a few teething troubles from a university or college perspective. Getting used to a new system for keying is always a bit of a trial, and Course Collect gathers more data within a more structured information model, so it’s almost bound to be complex. We now have Programme > Subject Option > Course > Stage as the structure, instead of the very flat one in netupdate. So there’s more flexibility in how the data is represented, but also a greater demand for data from universities and colleges.
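To picture the shift from netupdate’s flat structure, the new hierarchy can be sketched as nested records. This is a sketch only: the type and field names mirror the description above, not UCAS’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative names only: these mirror the Programme > Subject Option >
# Course > Stage hierarchy described above, not UCAS's real data model.

@dataclass
class Stage:
    name: str                      # e.g. "Year 1"

@dataclass
class Course:
    title: str
    stages: List[Stage] = field(default_factory=list)

@dataclass
class SubjectOption:
    subject: str
    courses: List[Course] = field(default_factory=list)

@dataclass
class Programme:
    title: str
    options: List[SubjectOption] = field(default_factory=list)

programme = Programme(
    title="History",
    options=[SubjectOption(
        subject="History of Art",
        courses=[Course(title="BA History of Art",
                        stages=[Stage("Year 1")])],
    )],
)
```

Each level can repeat, which is where the extra flexibility (and the extra keying) comes from.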

Data was migrated in May from the old netupdate service, so our early summer has been taken up with checking the data, amending errors on migration, and adding in new courses to be ready for Clearing and then the new season. And of course, we’re managing both 2013 and 2014 entry data.

Particularly problematic areas were:

  • Some slight glitches in approving migrated data, especially where the migrated data was too large for the new field size. This took a few weeks to resolve.
  • Paging of lists limited to 15 items
  • Establishing how to get entry requirements information to appear in the right place in the new course finder tool on the UCAS website, which uses Course Collect data.
  • Complications around showing admissions tests and esoteric prerequisites
  • An annoying lack of Help in the Help system
  • A rather messy situation in the Entry Profiles area, which won’t be settled until early September
  • And at the moment it doesn’t want to work on my Chrome setup.

As we’re really XCRI-CAP people at heart, we continue to encourage UCAS to dispense with this old-fashioned ‘key everything in’ method of data collection and to adopt the XCRI-CAP information standard for bulk updating. To that end I’ve [ed: Alan that is] been doing some mapping of XCRI-CAP to UCAS Course Collect, and also having some thoughts about how a process of getting XCRI-CAP data into the UCAS system might be made to work.


Draft Course Collect Bulk Update Process

Our conclusion on Course Collect is ‘the jury’s still out’. Now that we’re down to maintaining the data and only adding in new courses occasionally, it might represent an improvement on the old services. However, my personal view is that we need some good quality management and reporting facilities, and a better workflow sub-system to bring this service up to ‘good’.

Written by benthamfish

July 31, 2013 at 4:21 pm

A slice of Salami: integrating course and job profile searching


The Salami Layer

We’ve been developing a prototype of the ‘Salami Layer’ idea first mooted a while back as a result of the University of Nottingham’s Salami project. This is all about linking data together to make useful services for people, and to provide more nodes in a growing network of interoperable data.

Salami focused on labour market information. We’ve been taking it forward in the MUSAPI (MUSKET-SALAMI Pilots) project with a view to producing a hybrid service (or services) that use both the MUSKET text description comparison technology and the SALAMI layer material to link together courses and job profiles.

Salami HTML Demo

Thanks to the skill of our newest member of staff at APS (Jennifer Denton), we now have a demonstrator. It uses recently published XCRI-CAP feeds from The Open University, the Courtauld Institute and the University of Leicester as the source of its courses information (noting that these are not necessarily comprehensive feeds). Job profile information has come from Graduate Prospects, the National Careers Service and Target Jobs.

The purpose of the demonstrator is to show how we can link together subject concepts that are used to find courses with occupation concepts used to find job profiles. It relies on classifying courses with appropriate terms, in this case JACS3, for the discovery of relevant courses, mapping subject concepts to occupation concepts and then linking in the job profiles. This last task was done by attaching them to the occupation terms (in this case CRCI – Connexions Resource Centre Index – terms), rather than by searching – that will come later. All of these bits were wrapped up in a thesaurus. We then made it all go via a MySQL database, some Java code and a web page. There are some sharp edges still as we haven’t finished cleaning up the thesaurus, but I think it shows the principles.

We haven’t used random keywords, but well-known classification systems instead, so that we can develop a discovery service that produces relevant and ranked results (eventually), not just a Google-style million-hits listing.

The way the demonstrator works is as follows:

  • Select a term from the drop-down list at the top. This list consists of our thesaurus terms: a mixture of academic subjects for searching for courses and occupation terms for searching for job profiles. You can start typing, and it will jump to that place in the list. For example, try “History of Art”.
  • Then click Select. This will bring up a list of Related Terms (broader, narrower and related terms with respect to your selection), Subject/Occupation Terms (if you’ve picked a subject, it will show related Occupation Terms; if you picked an occupation, it will show related Subject Terms); and Links to Further Information.
Salami Demo 1
Salami Demo 2
  • You can navigate around the search terms we use by clicking on the Refine button next to the entries in the Related Terms and Subject/Occupation Terms lists. For example, if you click on Refine ‘history by topic’, this changes your focus to the ‘history by topic’ subject, and you can then navigate the subject hierarchy from there. If you click on Refine ‘heritage manager’, this changes your focus to that occupation and you can further navigate around jobs about information services or various subjects.
Salami Demo 3
  • At the bottom of the page we have a list of links to further information. These will be either links to relevant courses or to job profiles. The former are drawn from XCRI-CAP feeds, the latter are currently hard-wired into our thesaurus – we’re currently developing a method of using live searches for both types of link. For example, for “heritage manager” we have links to Graduate Prospects and Target Jobs profiles for Heritage Manager.

The upshot of the demonstrator is that we can show how to integrate the discovery of both courses and job profiles (and later on, job opportunities) using a single search term.

Oh-So Thesaurus

The technological underpinning of this is our thesaurus, which has the following broad components.

  • A ‘master’ table of thesaurus terms with attached classifications (in particular JACS3 for subjects and CRCI for job profiles).
  • A table of occupation-subject term links (O>S)
  • A table of subject-occupation term links (S>O)
  • A table of occupation-profile links, currently for implementation of the job profile URLs.
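As a concrete sketch of those components, the tables can be mocked up in SQLite (standing in for our MySQL database). The table and column names, the example CRCI code and the profile URL are illustrative assumptions; only the overall shape follows the list above.

```python
import sqlite3

# Sketch of the thesaurus structure: a master term table plus link tables.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE term (
    id    INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    kind  TEXT NOT NULL,   -- 'subject' or 'occupation'
    code  TEXT             -- JACS3 for subjects, CRCI for occupations
);
CREATE TABLE subject_occupation (subject_id INTEGER, occupation_id INTEGER);
CREATE TABLE occupation_subject (occupation_id INTEGER, subject_id INTEGER);
CREATE TABLE occupation_profile (occupation_id INTEGER, url TEXT);
""")
db.executemany("INSERT INTO term VALUES (?, ?, ?, ?)", [
    (1, "history of art", "subject", "V350"),
    (2, "heritage manager", "occupation", "CRCI-X1"),   # illustrative code
])
db.execute("INSERT INTO subject_occupation VALUES (1, 2)")
db.execute("INSERT INTO occupation_profile VALUES "
           "(2, 'https://example.org/heritage-manager')")

# From a subject term, find linked occupations and their profile links.
rows = db.execute("""
    SELECT t.label, p.url
    FROM subject_occupation so
    JOIN term t ON t.id = so.occupation_id
    JOIN occupation_profile p ON p.occupation_id = t.id
    WHERE so.subject_id = ?
""", (1,)).fetchall()
```

Keeping both the S>O and O>S link tables means a lookup in either direction is a single join.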

Inclusion of JACS3 codes on the course records and occupation codes on the job profiles is key to the discovery process, so that we can focus on concepts, not string searching. This means, for example, that a search for ‘history of art’ will find courses such as ‘MA in Conservation of Wall Painting’ or ‘MA in Art History’ (Courtauld Institute and Open University respectively), even though neither the records nor the web pages for these courses contain the string ‘history of art’.

Perhaps more importantly we can find out that, if we’re interested in the history of art, there are several job areas that might well be relevant, not simply work in museums and galleries, but also heritage manager – and if we browse only one step from there, we can find occupation areas in the whole world of information services, from archaeologist to social researcher, from translator to patent attorney. And all of these possibilities can be discovered without going from this service to any form of separate ‘careers search’ website.

Further extensions

Our Salami demonstrator suggests that this approach could be extensible to other areas. Perhaps we can link in standard information about qualifications, just a short hop from courses. Maybe we can classify competencies or competence frameworks and link these to courses via vocabularies for learning outcomes / competence / curriculum topics.

The other strand in MUSAPI is the textual description comparison work using the MUSKET technology. Even with our Salami demonstrator, the results are bald, undifferentiated lists. If we can capture a range of search concepts from the user – parameters from their current circumstances, past skills, experience, formal and informal education and training, and aspirations – then we could use the MUSKET tools on the Salami results to put them into some form of rank order. The user would then be able to refine this to produce higher quality results in relation to that individual’s needs, and our slice of salami will have stretched a long way.

Written by benthamfish

March 18, 2013 at 3:38 pm

Consuming XCRI-CAP III: Skills Development Scotland


Skills Development Scotland has operated a data collection system called PROMT for many years. PROMT is a client application (not browser-based) that sits on your computer and presents you with a series of screens for each course you want to maintain. Each course may have many ‘opportunities’ (these are the same as XCRI-CAP presentations) with different start dates, visibility windows and other characteristics. Many fields in PROMT have specific requirements for content that make the experience of keying not particularly enjoyable (though it has been improved since first launch).

With OU course marketing information consisting of several hundred courses and over 1,000 opportunities, it was with some relief that we at APS (running 3rd party course marketing information dissemination for The OU) turned to the SDS’ Bulk Update facility, using XCRI-CAP 1.1. We had been nervous of using this facility initially, because PROMT data is used not only for the SDS’ course search service, but also has a direct link to a student registration and tracking service for ILAs (Individual Learning Accounts; for non-Scottish readers, ILAs continued in Scotland even though they were discontinued for a while south of the border). Students can get ILA funding only for specific types of course, so each course/opportunity has to be approved by Skills Development Scotland. Changes to the course marketing information can result in ILA approval being automatically rescinded (albeit temporarily), which can mean the provider losing student tracking details, and therefore being at risk of losing the student entirely. So naturally we decided to do some careful testing in conjunction with both SDS and our colleagues at The OU’s Scottish office.

Fortunately we discovered that when we uploaded opportunities the system added them on to existing records, rather than replacing them, so student tracking was unaffected. In addition, individual fields of course records for existing courses were over-written, but the records remained active and opportunities were unchanged. These features meant that data integrity was maintained for the opportunity records, and we could always revert to the existing version and delete the new, if necessary.

We were able to load new courses with new opportunities, and also existing courses with new opportunities with no significant problems. The potential ILA difficulty was somewhat reduced, because The OU’s information for an individual opportunity does not need to be updated once it has been approved for ILA; our main reason for updating opportunities themselves was to add in fees information, but cost information has to be present before an opportunity can gain ILA approval, so this type of update would not interrupt ILA approval or student tracking.

Owing to requirements for some proprietary data, for example numerical fees information and separate VAT, not everything could be captured through XCRI-CAP. However, using the PROMT interface for checking the data, adding in very small extras and deleting duplicated opportunities was comparatively light work, as the mass of it was handled by the XCRI-CAP import.

Strikingly good parts of our Bulk Update process (apart from the obvious vast reduction in keying time):

  • Use of a vocabulary for qualification type in PROMT. This made it easy to use various rules to map from The OU data to the required qualification grouping. These rules included a close examination of the content of the qualification title in the XCRI-CAP data to make sure we mapped to the correct values.
  • For some elements, use of standardised boilerplate text in specific circumstances, again identified by business rules.
  • Good reporting back from the SDS Bulk Update system on the status (and errors) from the import. This included an online status report showing how many records of each type had been successfully uploaded, with date and time, after a few minutes from the time of loading.
  • The system permits us to download the whole data set (well, technically as much as could be mapped) in XCRI-CAP 1.1 format, so we were able to compare the whole new set of records with what we expected to have.
  • The ability to review the new data in the PROMT client interface within minutes of the Bulk Upload. This gives a great reassurance that nothing’s gone wrong, and it permits rapid checking and small tweaks if necessary.
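The qualification-type mapping mentioned above (deriving the required grouping from the qualification title in the XCRI-CAP data) can be sketched as ordered keyword rules. The vocabulary values and keywords here are illustrative assumptions, not SDS’s actual controlled list.

```python
# Ordered most-specific first, so 'Postgraduate Certificate in X' is not
# swallowed by the plain 'certificate' rule. Values are illustrative only.
RULES = [
    ("postgraduate certificate", "Postgraduate Certificate"),
    ("postgraduate diploma", "Postgraduate Diploma"),
    ("certificate", "Certificate"),
    ("diploma", "Diploma"),
    ("ba ", "Degree"),
    ("bsc", "Degree"),
    ("ma ", "Higher Degree"),
    ("msc", "Higher Degree"),
]

def qualification_group(title: str, default: str = "Other") -> str:
    """Return the first vocabulary value whose keyword appears in the title."""
    t = title.lower()
    for keyword, group in RULES:
        if keyword in t:
            return group
    return default
```

In practice the real rules examined the title content more closely than this, but the principle (map title text onto a controlled vocabulary, most-specific rule first) is the same.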

I see this combination of bulk upload with a client or web-based edit and review interface as an excellent solution to course marketing information collection. This push method of data synchronisation has the advantage of maintaining the provider’s control of the supply, and it still permits fine-tuning, checking and manual editing if that is necessary. In contrast a fully automatic ‘pull’ version might leave the provider out of the loop – not knowing either whether the data has been updated, or whether any mistakes have been made. This is particularly important in cases where the collector is unfamiliar with the provider’s data.

XCRI-CAP: turn 12 days of keying into 3 hours of checking.

Written by benthamfish

March 6, 2013 at 2:50 pm

Consuming XCRI-CAP I


This post and a few later ones will be some musings on my experiences of how XCRI-CAP is or might be consumed by aggregating organisations and services. I’ll not go into the theoretical models of how it could be done, but I’ll touch on the practicalities from my perspective. Which, I admit, is not as a ‘proper’ technical expert: I don’t write programmes other than the occasional simplistic Perl script, nor do I build or manage database systems, other than very simple demonstrators in MS Access, and I dabble in MySQL and SQL Server only through the simplest of front end tools.

My main XCRI-CAP consuming efforts have been with four systems: XXP, Trainagain, Skills Development Scotland’s Bulk Import Facility and K-Int’s Course Data Programme XCRI-CAP Aggregator.

XXP characteristics

  • Collaborative working between APS (my company) and Ingenius Solutions in Bristol
  • Service platform for multiple extra services, including provider and feed register (for discovery of feeds), AX-S subject search facility, CSV to XCRI converter, web form data capture, getCourses feed outputs (SOAP and RESTful)
  • Doesn’t yet have an auto-loader for XCRI-CAP. We can load manually or via our CSV to XCRI facility.

Trainagain characteristics

  • Existing system with its own established table structure, its own reference data and own courses data
  • SQL Server technology
  • I have off-line ‘sandbox’ version for playing around with.

Skills Development Scotland Bulk Import Facility characteristics

  • XCRI-CAP 1.1 not 1.2
  • Existing live XCRI-CAP aggregation service (push architecture)
  • Works in conjunction with the PROMT data entry system

K-Int XCRI-CAP Aggregator characteristics

  • Built on existing Open Data Aggregator, a generalised XML consuming service.
  • Takes a ‘relaxed’ view of validation – data that is not well-formed can be imported.
  • Outputs JSON, XML and HTML. But not XCRI-CAP.

These are early days for data aggregation using XCRI-CAP. There’s been a chicken-and-egg situation for a while: aggregating organisations won’t readily invest in facilities to consume XCRI-CAP feeds until a large number of feeds exist, while HEIs don’t see the need for a feed if no-one is ready to consume it. The Course Data Programme tackles the second of these problems (I guess that’s the egg?) – if we have 63 XCRI-CAP feeds, then we should have a critical mass to provoke aggregating organisations to consume them.

Some of the questions around consumption of XCRI-CAP feeds centre on technical architecture (push or pull?), what type of feed to publish (SOAP, RESTful, or just a file?), how often the feed should be updated and/or consumed (real-time? weekly? quarterly? annually? whenever stuff changes?), and how feed owners know who’s using them (open access versus improper usage, copyright and licensing). Some of these issues are inter-related, and there are other practical issues around consuming feeds for existing services – ensuring that reference data is taken into account, for example.

I’ll try to tease out my impressions of the practicalities of consuming XCRI-CAP in various ways over the next few blog posts.

XCRI-CAP: turn 12 days of keying into 3 hours of checking.

Written by benthamfish

February 21, 2013 at 3:11 pm

What’s the point of XCRI-CAP?


What’s the point of XCRI-CAP? This has been a cry for quite a while, even amongst some of the projects in the JISC funded Course Data Programme. Well, this is a story about how I’ve found it useful.

Many years ago I left ECCTIS 2000, the UK’s largest courses information aggregator and publisher, having been technical lead there for 8 years. Over that period, during which we moved our major platform from CD-ROM (remember them?) to the web, we established a state-of-the-art course search system with integrated data picked up from:

  • course marketing information (keyed, classified and QAed by Hobsons Publishing),
  • text files from professional bodies (keyed by them, but marked up by us),
  • advertising copy and images (also keyed by the supplier and marked up by us),
  • subject-based statistics from HESA,
  • vacancy information (at appropriate times of the year) from UCAS,
  • and so on.

We used a new-fangled technology called Standard Generalised Markup Language (SGML) with our own bespoke markup.

The technology allowed us to produce separately versioned searchable products for three flavours of CD-ROM (Scotland, rest of UK, international), the web and for printed publications, all from the same integrated data set. Our system enabled us to aggregate data received from multiple sources, including huge data sets of well-structured text (from Hobsons), quite large statistical sources (HESA), and smaller ‘freestyle’ text items from advertisers and other organisations that we marked up ourselves. Shades of XCRI-CAP Demonstrator projects, but 20 years ago. ECCTIS 2000 was a major aggregator, and probably *the* major UK courses information aggregator of the time. Our development built on some highly innovative work carried out by The Open University in the 1980s, including seminal developments in CD-ROM technology, but that’s another story.

Much of my career to date had been centred on the development of standard methods for managing course marketing information as an aggregator. Quite a bit of my freelance career was to be on the other side of the fence, helping HEIs to manage courses information as providers, though I’ve always had some involvement in the aggregating organisation field.

APS Ltd, my small company, was fortunate enough to gain a contract from The Open University to act as their agent for disseminating course marketing information to the wider world of the emerging course search sites on the web. The main ones from the OU’s viewpoint at that time were the British Council, Graduate Prospects, and the learndirect services in the countries of the UK. I also set up, for UCAS, its ‘data collection system’ through which UCAS obtained the courses data not used in its application system, but supplied on to third parties (such as learndirect, newspapers, Hotcourses and others).

Most of these small acts of data collection and dissemination were carried out by what are now seen as ‘traditional’ methods: re-keying from prospectuses, keying directly into a supplier’s web form. However, in a few cases (not nearly enough in my view) we were able to obtain electronic files from HEIs – for example, as I was managing the OU dissemination and the UCAS data collection input, it seemed sensible to me to provide the data electronically and to import it automatically. No problem.

At that point, it occurred to me that if I could do this for the OU data, why not for many other HEIs? One reason was lack of standards, the other main one was the chaos in course marketing systems (where they existed) in HEIs – understandable as most were desperately trying to come to terms with new internet technologies, particularly websites, and how these related to their paper prospectuses.

My initial solution was to use SGML (XML being a twinkle in someone’s eye at that time) to create a ‘lowest common denominator’ structure and format for courses information, convert data into that format, then write a suite of programmes to create bespoke outputs for course information aggregating organisations. There ensued a ‘happy time’ of 3 to 4 years during which we would acquire the OU data in a convenient database format, carry out a swathe of well-documented software-driven and mainly automatic processes, produce a range of output files (Access databases, spreadsheets, CSV files) and fling them around the country for up to ten or so aggregating organisations to import. For learndirect Scotland, to take just one example, we would produce a series of CSV files, email them off and they would load them into their database. Time taken: maybe 5 minutes for the automatic processing, 30 minutes for checking.

OU Course Data Converter Suite


I stress here that our supply of OU data to learndirect Scotland before 2007 took us about 35 minutes, 90% of that simply checking the data. We would supply updates five times per year, so our total annual time specifically on the learndirect Scotland update would have been significantly less than half a day. However, in a re-organisation, learndirect Scotland disappeared, and in place of their system that imported the OU data, the replacement organisation implemented a new one called PROMT. Ironically, this new system was anything but, from our perspective. With no import mechanism, we were required to key everything from scratch into their bespoke and somewhat eccentric client software – our task went from 35 minutes to 2 to 3 days (the OU had over 1,200 presentations), and the annual task leapt from less than half a day to about 12 days. A double irony: behind their clunky client software was XML and various other interoperability techniques, completely unavailable to those supplying the data.

This was the situation in 2007, and our ‘happy time’ ended, as everyone rapidly stopped taking bulk updates and moved to the ‘easier’ method of forcing HEIs to re-key their data into bespoke web forms. Our time to update the OU data more than doubled – so much for new technology! There was much grinding of teeth (and not just from APS, but from colleagues across the sector).

By now, you should be able to see where I’m coming from in respect of XCRI-CAP.

So, what’s the point of XCRI-CAP? My final illustration: Skills Development Scotland has now done us proud. In addition to their PROMT software (now somewhat improved), they have set up an excellent bulk import facility for providers to use to supply XCRI-CAP 1.0 or 1.1 data (and I’m sure we can persuade them to use 1.2 soon). APS is now using this facility, coupled with The Open University’s XCRI-CAP 1.1 feed, to get back to our ‘happy time’ again; only better, because now everyone can have ‘happy times’ if more aggregators use XCRI-CAP.

XCRI-CAP: turn 12 days of keying into 3 hours of checking.


APS has also produced a ‘value added’ XCRI-CAP 1.2 feed for any aggregator to use: http://www.alanpaull.co.uk/OpenUniversityXCRI-CAP1-2.xml. As we are able to tweak this feed in response to specific aggregator requirements, please get in contact with alan@alanpaull.co.uk, if you would like to use this feed, or to discuss how APS might help you with your courses information dissemination. We also have a range of services through the XXP Platform.
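For anyone consuming a feed like this, here is a minimal sketch of pulling course titles out of an XCRI-CAP 1.2 document with Python’s standard library. The XML fragment is hand-made for illustration (it is not the OU feed), and the namespace URIs follow our reading of the 1.2 profile, so check them against the specification before relying on them.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as we understand the XCRI-CAP 1.2 profile (assumption:
# verify against the published specification).
NS = {
    "cat": "http://xcri.org/profiles/1.2/catalog",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-made sample fragment for illustration only.
SAMPLE = """<?xml version="1.0"?>
<catalog xmlns="http://xcri.org/profiles/1.2/catalog"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <provider>
    <dc:title>Example University</dc:title>
    <course>
      <dc:title>MA in Art History</dc:title>
      <presentation/>
    </course>
  </provider>
</catalog>"""

root = ET.fromstring(SAMPLE)
# Collect the dc:title of every course element in the catalog.
courses = [c.findtext("dc:title", namespaces=NS)
           for c in root.iterfind(".//cat:course", NS)]
```

A real consumer would fetch the feed over HTTP and walk provider, course and presentation records in the same way.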

Written by benthamfish

February 14, 2013 at 9:44 am

Posted in XCRI

AX-S Widget Demonstrator – Complete!


The demonstrator is now live at: http://igsl.co.uk/xxp/ax-s/ou.html.  This demonstrator provides the AX-S search for Open University XCRI-CAP 1.2 data on a mock-up of the look-and-feel of the Open University website.

As explained in an earlier post, the AX-S search facility provides concept-based subject search functionality that retrieves records matching not only the user’s selected subject search term itself, but also broader and narrower linked concepts. Records were classified with JACS3 codes, which were used to link the courses to a specially constructed thesaurus of terms. When searching, each retrieved record is ranked in the search results list according to how close its JACS3 subject is to the user’s search term within the thesaurus. This functionality can be provided via the AX-S Widget to any institution with an XCRI-CAP 1.2 feed classified with a recognised subject coding scheme (such as JACS3, LDCS, SSA and so on) for use on their website, and it has the potential to be developed further with additional filters taken from the XCRI-CAP data, such as studyMode or attendancePattern.
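The ranking step can be sketched as follows: score each record by how far its coded subject sits from the user’s term in the thesaurus hierarchy, then sort. The mini-hierarchy and the scoring function are illustrative assumptions, not the production AX-S algorithm.

```python
# Tiny broader-term hierarchy for illustration (child -> broader term).
PARENT = {
    "history of art": "history by topic",
    "history by topic": "historical studies",
    "fine art": "creative arts",
}

def path_to_root(term):
    """The term followed by its chain of broader terms."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def distance(a, b):
    """Steps between two terms via their closest shared broader term,
    with a large penalty when they share no ancestor at all."""
    pa, pb = path_to_root(a), path_to_root(b)
    for steps_up, term in enumerate(pa):
        if term in pb:
            return steps_up + pb.index(term)
    return 100

# Records as (title, thesaurus subject derived from the JACS3 code).
records = [("MA in Art History", "history of art"),
           ("BA Fine Art", "fine art"),
           ("MA Historical Studies", "historical studies")]
ranked = sorted(records, key=lambda r: distance("history of art", r[1]))
```

An exact match ranks first, terms a step or two away in the hierarchy follow, and unrelated subjects sink to the bottom.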

There were three main work strands in the project: development of the widget itself, development of back-end functions, such as data loading and search functionality, and construction of our bespoke thesaurus of subject terms, on which the searching is based. Software development by InGenius Solutions was key to the success of the project. It was also dependent on classification of the data with JACS3 codes, handled by APS (who also converted the OU data to XCRI-CAP 1.2), and of course, supply of courses data and the website look-and-feel by The Open University.

The project involved more updating of our original thesaurus of terms than initially expected, but this has now been largely finalised. Some small improvements can still be made by tidying up the detailed formatting of the thesaurus, and these are in progress. The demonstrator has been systematically user-tested and refined, and the code and documentation are available on GitHub.

The AX-S Widget Demonstrator shows how standardised data and small modular software components can be combined to provide a new service that would be very expensive for a single institution to develop, but cost-effective when developed centrally for use across a larger number of institutions. We are pleased to say that there is already interest from several universities in including this widget on their websites, and we hope to see it in live use soon.

Written by jennifermdenton

January 25, 2013 at 1:44 pm