Postcard from Strasbourg

 

I have now spent four weeks at the Centre de Donnees astronomiques de Strasbourg (CDS), in January and February 2004. I have explored their VO contributions at length (Aladin, UCDs, Vizier, GLU, ...) and also attended the AVO 2004 "First Science" demonstration to the IVOA in Munich. During this time I have been able to focus quite carefully on the following questions, and on this page I hope to outline my current thoughts and how I reached them.

  1. What is the VO?
  2. How should the VO be used?
  3. What is Aus-VO's role?

What is the VO?

I am now convinced that the VO is simple: it is no more than the development and adoption of standards for (a) describing astronomy data, and (b) accessing astronomy data. The main concern of the International Virtual Observatory Alliance (IVOA) is to ensure this process is collaborative, inclusive, thorough yet timely, and well-defined.

To this end, the IVOA's working groups (WGs) can be classified as follows:

  • describing astronomy data: "Content Description (UCD)", "Data Modeling", "VOTable", "Resource Registry"
  • accessing astronomy data: "Data Access Layer", "VO Query Language", "Resource Registry", "VOTable"
  • process: "Standards & Processes"

Naturally there are overlaps within and across the first two categories.

In addition, there is one further WG, "Grid & Web Services", and three interest groups (IGs), viz. "VO Architecture", "VO Applications", and "VO Theory". These to my mind are very user-oriented and not tied to the primary function of IVOA. However they suggest an important secondary theme of the IVOA, which is the development of VO user communities and groups with common interests. In some cases these IGs will surely provide valuable feedback to the IVOA and its WGs on standards; in other cases members of the IGs may work together to build shared standards-compliant servers or clients.

How should the VO be used?

Current VO prototypes, comprising both services and consumers of services (clients), are built exclusively within the "web model", for want of a better term. Services are accessible via the hypertext transfer protocol (HTTP), and are built using CGI, JSP and other similar technologies. Clients are typically run in the user's browser, and are broadly either HTML, JavaScript or JSP pages, or Java applets. Examples of the latter include tools such as CDS' Aladin, NVO's specview, VO-India's VOPlot and Starlink's TopCat.

My experience is that while these tools (Aladin, specview, ...) are interesting and, in some cases, quite powerful, in general they simply do not live up to the high standards in software that astronomers expect. Compared to bread-and-butter libraries and packages like PGPlot, FFTW, IDL, IRAF, AIPS++ and miriad (which admittedly each benefit from thousands of person-years of coding, use and refinement), nearly all of the VO clients I have used:

  • are less stable (i.e. crash frequently),
  • are slow (presumably because they are predominantly Java-based),
  • are extremely wasteful of memory (also Java-related, but potentially related to XML parsing),
  • have unrefined and often non-intuitive interfaces, and
  • lack scripting or "macro" capability.

I repeat that in some cases, they are extremely interesting tools which take a different approach to what we might be used to and therefore offer new ways to do things, but the above comparisons are essentially true across the board.

I should also point out that no-one really claims that these present-day clients - which provide users a way to interact with and use (consume) VO services - are what the VO will look like in the future. In many cases, I think that clients have been prototyped as Java- and web-based tools simply because this is very easy to do if the services themselves have been developed against WSDL files. The details are too technical for this document; my point is just that the demonstration clients have grown up the fastest way they can and are not representative of what we might expect in a few years' time.

I am nervous, to say the least, about leading Australia and Aus-VO down the same path. Apart from anything else, the current approach in the VO manages to frequently ignore the rich software heritage of astronomy. At times, I wonder if this is because we now have a substantial body of non-astronomers writing "new age" code for us, and through no fault of their own they are mostly unaware of the wealth of software environments and libraries that have grown up over the last thirty years. Unless this is addressed, I personally view the VO project as relatively "high risk". My thinking is leaning more and more towards adding VO capabilities to existing astro environments rather than building a whole new astronomy environment based on VO capabilities.

What is Aus-VO's role?

  1. Standards. Australia must participate in the standards setting process. This can be accomplished by direct participation of Australian astronomers and computer scientists in the IVOA WGs. Another route to influence the standards is active participation within the IVOA IGs, and yet another way is for Aus-VO itself to form WGs which feed requirements and learned opinions (!) to the various IVOA WGs under the auspices of the Aus-VO.
  2. Data publishing. Australian data producers (observatories, survey teams and theorists) must describe their data products with reference to the IVOA standards, and make them accessible via IVOA defined protocols. I believe this can be done, and indeed is being done, under the umbrella of the Aus-VO project. In particular, Aus-VO can work to provide shared registries, data warehousing hardware, etc.
  3. Using VO data. Rather than develop web-based, Java-based etc. clients, I think Australian astronomers and astronomy software developers should expend effort in adding "methods" to existing, entrenched software packages (miriad, AIPS++, IRAF, ...) to access and use data published "in the VO". The paradigm here is that VO data is effectively just another stream of data to support in software packages, like FITS and ASCII formats. Yes, VO data formats will be more sophisticated in many ways, including links to data models and query services, but they are just another set of formats which our tried and true software packages would do well to support.
  4. External services. There is still room for clients to be developed outside of existing packages. Operations such as "quick look" (using Java applets for example), catalogue cross-match (including streaming of larger-than-memory, HTM-sorted VOTables), and catalogue join are clear contenders, but these seem to me more service than client. Some of this work can be done in the Aus-VO framework, but I feel that with our limited resources it must be carefully chosen to be of scientific merit. This probably means matching work in this category directly with science use cases developed by the Science Working Group.
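To make point 3 a little more concrete, here is a minimal sketch in Python of what treating VO data as "just another format" might look like for the simplest case, a VOTable whose rows are serialised as TABLEDATA. It uses only the standard library XML parser. The function name read_votable_rows and the sample table are my own inventions for illustration, and the sketch assumes a document containing a single TABLE; it is not how any existing package reads VOTables.

```python
import xml.etree.ElementTree as ET


def _local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}FIELD' -> 'FIELD'."""
    return tag.rsplit("}", 1)[-1]


def read_votable_rows(xml_text):
    """Return (field_names, rows) for a VOTable holding one TABLEDATA table.

    Assumption: the document contains a single TABLE, so every FIELD
    element seen belongs to the rows that follow it. Cell values are
    returned as strings; no datatype conversion is attempted.
    """
    root = ET.fromstring(xml_text)
    field_names = []
    rows = []
    # iter() walks the tree in document order, so FIELDs precede the TRs.
    for elem in root.iter():
        name = _local_name(elem.tag)
        if name == "FIELD":
            field_names.append(elem.get("name"))
        elif name == "TR":
            cells = [(td.text or "").strip()
                     for td in elem if _local_name(td.tag) == "TD"]
            rows.append(dict(zip(field_names, cells)))
    return field_names, rows


# Hypothetical two-row catalogue, made up for this example.
SAMPLE = """<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="demo">
      <FIELD name="id" datatype="char" arraysize="*"/>
      <FIELD name="ra" datatype="double" unit="deg"/>
      <FIELD name="dec" datatype="double" unit="deg"/>
      <DATA><TABLEDATA>
        <TR><TD>src1</TD><TD>10.5</TD><TD>-45.0</TD></TR>
        <TR><TD>src2</TD><TD>11.25</TD><TD>-44.5</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

if __name__ == "__main__":
    names, table_rows = read_votable_rows(SAMPLE)
    print(names)
    for row in table_rows:
        print(row)
```

A reader of this size sits comfortably beside an existing FITS or ASCII table reader, which is really the point: the extra sophistication of VO formats (units, UCDs, links to data models) can be picked up incrementally by the host package.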

Examples

AIPS++ access to Conesearch. Very few people know that within AIPS++, it is relatively simple to obtain a section of a catalogue and overlay it on your image. This is probably due to (a) low uptake of AIPS++, (b) even lower uptake of AIPS++ at the scripting level, and (c) a relatively poor community-wide perception of AIPS++. This capability was developed at the NRAO. It queries the NVO conesearch registry to find a list of available data sources. The user selects a catalogue from those available and provides a coordinate, or even just an image, in which case the field centre and radius are calculated from it. The client then returns the results of the IVOA-compliant conesearch. Once a registry is available in Australia, it can be populated with any conesearch services we publish and the data will be available immediately to AIPS++ users.
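The request side of a conesearch is small enough to sketch. A cone search service is queried by appending three parameters to its base URL: RA, DEC and SR (search radius), all in decimal degrees. The Python below is a sketch under assumptions and not the AIPS++ implementation: the function names, the example.org address and the flat-sky estimate of a search radius from image dimensions are all hypothetical.

```python
import math
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Return base_url with RA, DEC and SR (decimal degrees) appended.

    Any query parameters already present in base_url are preserved.
    """
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must lie in [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must lie in [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    parts = urlsplit(base_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("RA", repr(float(ra_deg))),
        ("DEC", repr(float(dec_deg))),
        ("SR", repr(float(radius_deg))),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def radius_from_image(nx, ny, pixel_scale_deg):
    """Half the image diagonal in degrees.

    Assumption: a flat-sky approximation with square pixels, adequate
    only for small fields; a real client would use the image's
    coordinate system instead.
    """
    return 0.5 * math.hypot(nx, ny) * pixel_scale_deg


if __name__ == "__main__":
    # Hypothetical service address; a real one would come from a registry.
    radius = radius_from_image(300, 400, 0.001)
    print(cone_search_url("http://example.org/cone?cat=demo", 180.0, -30.5, radius))
```

The response to such a request is a VOTable, so everything beyond this point is ordinary table handling inside the host package.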

AIPS++ access to Simple Image Access Protocol services. The mechanics of the conesearch client, already present in the vo module of AIPS++, would be relatively straightforward to adapt to provide a SIAP client. AIPS++ users would have immediate access to all images served by SIAP services world-wide.
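To show how close the two protocols are on the request side, here is a hedged Python sketch of building a SIAP query. A SIAP image query carries POS (right ascension and declination in decimal degrees, comma-separated) and SIZE (the angular size of the region in degrees), with FORMAT as an optional restriction on the image type. The function name and the example.org address are hypothetical, and this is not code from the AIPS++ vo module.

```python
from urllib.parse import urlencode


def siap_query_url(base_url, ra_deg, dec_deg, size_deg, image_format=None):
    """Return a SIAP image query URL for a region centred on (ra, dec).

    POS and SIZE are in decimal degrees. image_format, if given, is
    passed through as the FORMAT parameter (for example "image/fits").
    """
    params = [
        ("POS", f"{float(ra_deg)!r},{float(dec_deg)!r}"),
        ("SIZE", repr(float(size_deg))),
    ]
    if image_format is not None:
        params.append(("FORMAT", image_format))
    # Choose the joining character from what the base URL already has.
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    # Leave ',' and '/' unescaped so POS and FORMAT stay readable.
    return base_url + separator + urlencode(params, safe=",/")


if __name__ == "__main__":
    # Hypothetical service address, for illustration only.
    print(siap_query_url("http://example.org/siap", 83.633, 22.0145, 0.2,
                         image_format="image/fits"))
```

The service answers with a VOTable describing the matching images, one row per image, including a URL from which each image can be fetched. A client that already handles conesearch results therefore needs little more than a download step.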

Desktop access to the CDS GLU service dictionary providers. The CDS GLU dictionary, distributed with and used by Aladin Java, is a substantial listing of services supporting defined protocols for querying and retrieving remote astronomy databases, including DSS servers, GOODS servers, the Vizier catalogue service, and so on. Many GLU services are called via simple parameter substitution into the service URL, based on metadata provided in the GLU dictionary - these services could be easily supported in stock-standard astronomy reduction programs such as miriad, IRAF, AIPS, ...
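Parameter substitution of this kind needs very little code. The Python sketch below assumes URL templates with numbered placeholders written $1, $2, and so on; that placeholder syntax, the service names and the example.org addresses are assumptions made up for illustration and should not be read as the actual GLU record format or as entries from the real dictionary.

```python
import re
from urllib.parse import quote_plus

# Assumed placeholder syntax: "$" followed by a 1-based parameter number.
PLACEHOLDER = re.compile(r"\$(\d+)")


def fill_service_url(template, *values):
    """Substitute URL-encoded values for $1, $2, ... in a URL template."""
    def replace(match):
        index = int(match.group(1))
        if not 1 <= index <= len(values):
            raise ValueError(f"no value supplied for ${index}")
        return quote_plus(str(values[index - 1]))
    return PLACEHOLDER.sub(replace, template)


# Hypothetical dictionary entries; not taken from the GLU dictionary.
SERVICES = {
    "demo.image": "http://example.org/image?ra=$1&dec=$2&size=$3",
    "demo.catalogue": "http://example.org/cat?target=$1",
}

if __name__ == "__main__":
    print(fill_service_url(SERVICES["demo.catalogue"], "NGC 1365"))
    print(fill_service_url(SERVICES["demo.image"], 53.4, -36.14, 0.5))
```

With a table of templates like this loaded from a dictionary file, a reduction package would only need a lookup by service name and an HTTP fetch to reach every service the dictionary describes.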

Comments

Comment on Strasbourg

At the IVOA interop meeting last year, I think those of us from Aus-VO who were there were seriously unhappy with the thought of a lot of software already in existence being re-created for the VO. I was also rather confused as to what environment such software would be written in. Who would be implementing all of the infrastructure that would be needed to generate applications for the VO? I was further alarmed at one of the working group meetings, where people were apparently starting to build Coordinates class designs (where would this be implemented, I wondered).

So David, your appraisal makes some sense to me, i.e. the idea that the VO is fundamentally providing a data stream, but not the analysis software. This has the advantage of being tractable.

For Aus-VO's role, you have suggested 1) standards, 2) data publishing, 3) using VO data, and 4) external services.

Related to 4) are services that are provided as an external web service but in fact depend upon the infrastructure of an existing package. A good example of this is the Quanta and Measures web service that ATNF will provide in the coming months. This will use aips++ infrastructure to deliver services that are useful to astronomers.

One of the two VO positions we have just advertised (see http://recruitment.csiro.au/asp/index.asp) has a component to start implementing current VO data access protocols, so we can explore the concept of consuming things like SIA in aips++ as well as providing image archives that can respond to SIA queries.

Another example of a service related to 4) is an application like the Remote Visualization Server (which depends upon aips++ Visualization infrastructure). In the coming year we will start deploying RVS to existing image archives here at the ATNF. However, a longer term goal is for other people to install an RVS server at their own image archive so we will have to learn how to package it for end-users, including all of the dependency horror...

So I continue to see a good niche for us in providing useful end-user services packaged around existing infrastructure.
