The Catalogue Files & Directories

2.1  The Directory Tree

Each catalogue available at CDS is made of several files stored in a directory of a Unix-like file system.

The directory tree naming conventions follow exactly the standards adopted at CDS in the mid-1970s: astronomical catalogues are assigned a chronological number within categories numbered I to IX (see the tree below) reflecting the main scientific interest of the catalogue. This numbering system is shared by the CDS and the participating data centers, mainly NSSDC-ADC (Astronomical Data Center at NASA Space Science Data Center).

Directory Tree of Catalogues at CDS

I/number Astrometric Catalogues
II/number Photometric Catalogues (except Radio)
III/number Spectroscopic Catalogues
IV/number Cross-Identifications
V/number Combined Data
VI/number Miscellaneous Catalogues
VII/number Non-stellar Objects
VIII/number Radio Catalogues
IX/number High Energy Catalogues
J/abbr/Volume/first_page Publications ordered by Journals, with abbr:
 
A+A A&A
A+AS A&A Suppl.
AJ Astron. J.
ApJ Astrophys. J.
ApJS Astrophys. J., Suppl.
MNRAS Mon. Not. R. Astron. Soc.
PASP Publ. Astron. Soc. Pacific
AZh Astron. Zhurnal (Russia)
PAZh Pis'ma Astron. Zhurnal (Russia)
AN Astronomische Nachrichten
AcA Acta Astronomica
BaltA Baltic Astronomy
other Form J/other/abbr/Volume.first_page
  for other journals, abbr being written as the bibcode

The explosion of incoming catalogues from the beginning of 1993, due to electronic publication (see Chapter 1), led us to introduce the J category: within this category, the catalogue designation maps the reference of the published paper, e.g. J/A+AS/97/729 for the article published in A&A Suppl. 97, page 729.

Within this new J section, there is therefore no need for an agreement between data centers on the numbering of catalogues; finding out where a catalogue is stored, knowing its reference, is straightforward. But catalogues do not have to stay in this J section forever: later, more "consistent" catalogues could be generated from one or several publications. Typically, such a catalogue is created by merging the results published in a set of several papers.
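The mapping from a paper reference to a J-category directory can be sketched in a few lines of Python. This is only an illustration of the naming rule described above, not a CDS tool: the function name journal_designation and the sample abbreviation Xyz are hypothetical, and the set of journals is simply copied from the table.

```python
# Journal abbreviations listed in the directory-tree table above.
KNOWN_JOURNALS = {
    "A+A", "A+AS", "AJ", "ApJ", "ApJS", "MNRAS",
    "PASP", "AZh", "PAZh", "AN", "AcA", "BaltA",
}


def journal_designation(abbr, volume, first_page):
    """Return the J-category designation for a published paper.

    Hypothetical helper: it only reproduces the two forms given in the
    table (J/abbr/Volume/first_page and J/other/abbr/Volume.first_page).
    """
    if abbr in KNOWN_JOURNALS:
        return f"J/{abbr}/{volume}/{first_page}"
    # Journals without their own entry go under J/other.
    return f"J/other/{abbr}/{volume}.{first_page}"


print(journal_designation("A+AS", 97, 729))
print(journal_designation("Xyz", 12, 34))  # "Xyz" is a made-up abbreviation
```

The first call reproduces the example from the text, J/A+AS/97/729.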

2.2  File Naming Conventions

All files making up the catalogue or publication are stored in a directory named according to the conventions described above. A description file, which contains the information needed to understand the origin and the contents of a catalogue, is named ReadMe. The contents of this important file are described in std.

A file named =obsolete=, if present, indicates that the catalogue is obsolete, typically because it is an outdated version. The contents of this file indicate which catalogue can be used instead of the obsolete version.

Besides these two "special" files (ReadMe, always present, and =obsolete=, present only for outdated catalogues), the data files are named according to the following rules:

  1. filenames should be compatible with MS-DOS limitations: filename is written name.extension, with at most 8 characters for name and 3 characters for extension; only alphanumeric characters, plus the minus sign and the underscore, are allowed; and case is not significant — filenames are normally displayed in lowercase letters only.

  2. for files corresponding to published material, the names are consistent with the published paper, and we use tablen.extension to refer to the table numbered n in the published paper, fign.extension for the figure numbered n, etc.
  3. if the rule above cannot be applied, we use mnemonic names like main, catalog or data for the main part of the catalogue, refs for the references, notes for the notes, etc.
  4. the extension is related to the format of the file, with the following conventions:

Files may also be compressed with Unix compress or GNU gzip: a .Z suffix is appended to the filenames described above in case of Unix compression (the Unix uncompress program has to be used), and a .gz or .z suffix in case of gzip compression (the gunzip public-domain program has to be applied).

Large files may also be cut into pieces, generally not larger than 10 megabytes. In this case, a numeric suffix of 2 or 3 digits can be added; an example can be found in the Tycho-2 Catalogue, where the data file was split into 20 parts named tyc2.dat.00, tyc2.dat.01, ..., tyc2.dat.19.
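The naming rules above can be checked mechanically. The following Python sketch is an illustration only, with several assumptions that the text does not state: the function name parse_filename is hypothetical, the compression suffix is assumed to come last (after any split suffix), and a trailing 2- or 3-digit suffix is treated as a split number only when the name already has an extension. The special file =obsolete= is outside these rules and is rejected by the check.

```python
import re

# Rule 1: name of at most 8 characters, optional extension of at most 3,
# using only alphanumerics, the minus sign and the underscore.
BASE_RE = re.compile(r"^[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?$")
# Suffix -> compression method, as listed in the text.
COMPRESS_SUFFIXES = {".Z": "compress", ".gz": "gzip", ".z": "gzip"}
# A split suffix is only recognised after a name that has an extension
# (assumption made here to avoid confusing it with the extension itself).
SPLIT_RE = re.compile(r"^(.*\.[^.]+)\.(\d{2,3})$")


def parse_filename(filename):
    """Return (base name in lowercase, compression method, part number)."""
    compression = None
    for suffix, method in COMPRESS_SUFFIXES.items():
        if filename.endswith(suffix):
            compression = method
            filename = filename[: -len(suffix)]
            break
    part = None
    match = SPLIT_RE.match(filename)
    if match:
        filename, part = match.group(1), int(match.group(2))
    if not BASE_RE.match(filename):
        raise ValueError(f"not a valid 8.3 data file name: {filename!r}")
    # Case is not significant; names are normally shown in lowercase.
    return filename.lower(), compression, part


print(parse_filename("tyc2.dat.00"))
print(parse_filename("table1.dat.gz"))
```

Names that exceed the 8-character limit, such as a hypothetical averylongname.dat, raise ValueError.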

2.3   Catalogue Subdirectories

It may happen that some catalogues contain a large number of files, as in Catalogue III/166 which contains about 80 stellar spectra corresponding to some standard spectral types. These data files made of just 2-column tables were saved in a subdirectory named sp, and the characteristics of each of these 80 files containing spectra are summarized in a table named spectra.dat which is described in the ReadMe file. In other words, it is possible to describe files with a level of indirection, as a table which details characteristics of files stored in one or several subdirectories.

2.4   Data files

Data files in principle contain only the data, without titles, headers, comments, etc. However, since introductory comments stored at the beginning of data files are handy, a way of specifying this feature has been added to the Byte-by-byte Description. There are two ways to declare introductory comments in data files:
  1. by specifying a number of introductory lines, e.g. the first 20 lines are comments.
  2. by specifying a character used for introductory comments, e.g. the first lines having a # as their leftmost character represent introductory comments.
Data files may also contain empty lines, which are ignored wherever they occur in the file.
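A reader honouring both conventions might look like the Python sketch below. It is a minimal illustration, not part of any CDS software: the function name data_lines and its parameters n_intro and comment_char are hypothetical, and it assumes that the count of introductory lines refers to physical lines (empty ones included) and that the comment character is only significant before the first data record.

```python
def data_lines(lines, n_intro=0, comment_char=None):
    """Yield data records, skipping introductory comments and empty lines.

    n_intro:      number of leading lines to treat as comments (way 1).
    comment_char: character marking introductory comment lines (way 2).
    """
    in_intro = True
    for index, line in enumerate(lines):
        text = line.rstrip("\r\n")
        if index < n_intro:
            continue  # way 1: a fixed number of introductory lines
        if text.strip() == "":
            continue  # empty lines are ignored anywhere in the file
        if in_intro and comment_char is not None and text.startswith(comment_char):
            continue  # way 2: leading lines starting with the comment character
        in_intro = False  # first data record ends the introduction
        yield text


sample = ["# title\n", "# units\n", "\n", "1 2\n", "\n", "3 4\n"]
print(list(data_lines(sample, comment_char="#")))
```

Because the function accepts any iterable of lines, an open file object can be passed directly in place of the sample list.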

2.5   Index Files

A set of files summarizing the catalogues currently available at CDS is updated regularly (normally on a weekly basis):

Note that a facility exists to query this index remotely: the findcat program, which is part of the cdsclient package described in Chapter 4.