The Catalogue Files & Directories
2.1 The Directory Tree
Each catalogue available at CDS is made of several files stored
in a directory of a Unix-like file system.
The directory tree naming conventions exactly follow the standards adopted
at CDS in the mid-1970s: astronomical catalogues have been
assigned a chronological number in
categories numbered I to IX (see tree)
reflecting the main scientific interest of the catalogue;
this numbering system is shared by
the CDS and the participating data centers,
mainly NSSDC-ADC
(Astronomical Data Center
at NASA's National Space Science Data Center).
Directory Tree of Catalogues at CDS
I/number | Astrometric Catalogues |
II/number | Photometric Catalogues (except Radio) |
III/number | Spectroscopic Catalogues |
IV/number | Cross-Identifications |
V/number | Combined Data |
VI/number | Miscellaneous Catalogues |
VII/number | Non-stellar Objects |
VIII/number | Radio Catalogues |
IX/number | High Energy Catalogues |
J/abbr/Volume/first_page |
Publications ordered by Journals, with abbr: |
A+A | A&A |
A+AS | A&A Suppl. |
AJ | Astron. J. |
ApJ | Astrophys. J. |
ApJS | Astrophys. J., Suppl. |
MNRAS | Mon. Not. R. Astron. Soc. |
PASP | Publ. Astron. Soc. Pacific |
AZh | Astron. Zhurnal (Russia) |
PAZh | Pis'ma Astron. Zhurnal (Russia) |
AN | Astronomische Nachrichten |
AcA | Acta Astronomica |
BaltA | Baltic Astronomy |
other | Form J/other/abbr/Volume.first_page for other journals, abbr being written as in the bibcode |
The surge of incoming catalogues since the beginning of
1993, driven by electronic publication
(see Chapter 1),
led us to introduce the J category:
within this category, the catalogue designation
mirrors the reference of the published paper, e.g.
J/A+AS/97/729 for the article published in A&A Suppl.
97, page 729.
Within this new J section, there is therefore no need for an agreement
for the numbering of catalogues between data centers; finding out
where a catalogue is stored, knowing its reference, is straightforward.
But catalogues do not have to stay in this J section forever:
later, more "consistent" catalogues could be generated from
one or several publications; typically a catalogue is created by
merging the results
published in a set of several papers.
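The mapping from a bibliographic reference to a J designation can be sketched as follows. This is only an illustration of the convention in the table above: the function name j_designation and the set KNOWN_JOURNALS are hypothetical, and the abbreviation "XYZ" in the usage note stands for any journal outside the listed ones.

```python
# Journal abbreviations listed in the directory tree table above.
KNOWN_JOURNALS = {
    "A+A", "A+AS", "AJ", "ApJ", "ApJS", "MNRAS",
    "PASP", "AZh", "PAZh", "AN", "AcA", "BaltA",
}


def j_designation(abbr, volume, first_page):
    """Build the J-category designation for a published paper.

    Listed journals use J/abbr/Volume/first_page; any other journal
    uses the form J/other/abbr/Volume.first_page.
    """
    if abbr in KNOWN_JOURNALS:
        return f"J/{abbr}/{volume}/{first_page}"
    return f"J/other/{abbr}/{volume}.{first_page}"
```

For instance, j_designation("A+AS", 97, 729) gives "J/A+AS/97/729", and a paper in a hypothetical journal abbreviated "XYZ", volume 12, page 34, would give "J/other/XYZ/12.34".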
2.2 File Naming Conventions
All files making up the catalogue or publication are stored
in a directory named according to the conventions described above.
A description file, which contains the information
needed to understand the origin and the contents of a catalogue,
is named ReadMe.
The contents of this important file are described in std.
A file named =obsolete=, if present, means that the catalogue
is obsolete, typically an outdated version.
The contents of this file indicate which catalogue
can be used instead of the obsolete version.
Besides these 2 "special" files (ReadMe, always present,
and =obsolete=, existing only for outdated catalogues),
the data files are named according to the following rules:
- filenames should be compatible with
MS-DOS limitations:
filename is written name.extension,
with at most 8 characters for name and 3 characters for
extension; only alphanumeric characters, plus the
minus sign and the underscore, are allowed;
and case is not significant — filenames are normally
displayed in lowercase letters only.
- for files corresponding to published material,
the names are consistent with the published paper, and we use
tablen.extension
to refer to the table numbered n in the published paper,
fign.extension for the figure numbered n, etc.
- if the rule above cannot be applied, we use mnemonic names
like main, catalog or data for the main
part of the catalogue,
refs for the references, notes for the notes, etc...
- the extension is related to the format of the file, with
the following conventions:
- [.csv] for files containing tabular data as
character-separated values, i.e.
columns separated by a special character,
generally the semicolon (;)
(see also the .tsv extension)
- [.dat] for files containing the data in plain
ascii form. The exact structure of such
files — the column layout — is normally
described in the ReadMe file.
- [.fit] for FITS files
- [.fih] for FITS headers, i.e. the top part of
FITS files containing the keywords
with embedded newlines.
- [.gif] for data files containing images in GIF
format
- [.jpg] for data files containing images in JPEG
format
- [.mpg] for data files containing video sequences in
MPEG format
- [.ori] for the original files, when modifications
had to be performed and the original files
have to be available
- [.pdf] for Adobe's PDF format
- [.ps] for PostScript files
- [.sam] for sample files, when the whole
catalogue can't be stored in the FTP
directories. The total number of records is
then indicated as the first number in the
explanation. See e.g. the USNO Catalogue
(I/252)
- [.sty] for style files related to TeX or
LaTeX definitions.
- [.tar] for files in Tape ARchive format (Unix),
allowing many files to be archived as a
single file.
- [.tgz] for Gnu-zipped Tape ARchive format (Unix),
a short-hand of the .tar.gz suffix.
- [.tex] for files in plain TeX or in LaTeX.
- [.txt] for files containing text in plain
ascii form.
- [.tsv] for files containing tabular data as
tab-separated values, i.e.
columns separated by the TAB character
(see also the .csv extension).
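The MS-DOS compatibility rule at the top of this list can be checked mechanically. The sketch below is one possible reading of that rule, not an official validator: the name is_valid_data_filename is hypothetical, it assumes an extension is always present, and it deliberately ignores the special files ReadMe and =obsolete= as well as any compression or split suffix.

```python
import re

# name: 1 to 8 characters, extension: 1 to 3 characters; only
# alphanumerics, the minus sign and the underscore are accepted.
_DATA_NAME = re.compile(r"[a-z0-9_-]{1,8}\.[a-z0-9_-]{1,3}")


def is_valid_data_filename(filename):
    """Return True if filename fits the name.extension (8.3) rule.

    Case is not significant, so the name is lowercased before matching.
    """
    return _DATA_NAME.fullmatch(filename.lower()) is not None
```

With this check, table1.dat and TABLE1.DAT are accepted, while a name longer than 8 characters or one containing a space is rejected.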
Files may also be Unix-compressed or gzip-compressed:
a .Z suffix is appended to the filenames described above
in case of Unix compression (the Unix uncompress program has
to be used), and a .gz or .z suffix in case of gzip compression
(the freely available gunzip program has to be applied).
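A reader of such files might select the decompression method from the suffix, as in the following sketch. The function names compression_of and open_data_file are hypothetical. The sketch assumes plain ASCII content; since the Python standard library has no decoder for Unix compress (.Z) files, that case is left to the external uncompress program.

```python
import gzip


def compression_of(filename):
    """Return "compress", "gzip" or None according to the suffix.

    The comparison is case-sensitive here, because .Z and .z
    designate two different compression methods.
    """
    if filename.endswith(".Z"):
        return "compress"
    if filename.endswith((".gz", ".z")):
        return "gzip"
    return None


def open_data_file(path):
    """Open a data file for reading as text, decompressing gzip on the fly."""
    kind = compression_of(path)
    if kind == "gzip":
        return gzip.open(path, "rt", encoding="ascii")
    if kind == "compress":
        # No standard-library support for .Z files.
        raise NotImplementedError("run the Unix uncompress program first")
    return open(path, "r", encoding="ascii")
```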
Large files may also be cut into pieces, generally not larger
than 10 Megabytes. In this case, a numeric suffix of 2 or 3
digits can be added; an example can be found for the
Tycho-2 Catalogue,
where the data file was split into 20 parts
named tyc2.dat.00, tyc2.dat.01, ..., tyc2.dat.19.
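The naming of the pieces, and their reassembly, can be sketched as below. The helper names split_part_names and join_parts are hypothetical. The sketch assumes that numbering starts at 0, as in the Tycho-2 example, and that the pieces are plain byte-wise cuts of the original file which can simply be concatenated in order.

```python
import shutil


def split_part_names(filename, n_parts, digits=2):
    """List the names of the pieces of a split file, numbered from 0."""
    return [f"{filename}.{i:0{digits}d}" for i in range(n_parts)]


def join_parts(part_paths, out_path):
    """Concatenate the pieces, in the given order, into one file."""
    with open(out_path, "wb") as out:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                shutil.copyfileobj(part, out)
```

For the example above, split_part_names("tyc2.dat", 20) yields the 20 names from tyc2.dat.00 to tyc2.dat.19.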
2.3 Catalogue Subdirectories
It may happen that some catalogues contain a large number of files,
as in Catalogue III/166 which contains about 80 stellar spectra
corresponding to some standard spectral types. These data files made
of just 2-column tables were saved in a subdirectory named sp,
and the characteristics of each of these 80 files containing spectra
are summarized in a table named spectra.dat which is described
in the ReadMe file. In other words, it is possible to describe
files with a level of indirection, as a table which
details characteristics of files stored in one or several subdirectories.
2.4 Data files
Data files in principle contain only the data, without titles,
headers, comments, etc. However, since introductory comments
stored at the beginning of data files can be handy,
the Byte-by-byte Description allows this feature
to be specified.
Introductory comments in data files can be declared in two ways:
- by specifying a number of introductory lines,
e.g. the first 20 lines are comments.
- by specifying a character used for introductory comments,
e.g. the first lines having a # as their leftmost character
represent introductory comments.
Data files may also contain empty lines; these are
ignored wherever they occur in the file.
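A reader honouring these conventions could look like the sketch below. The function name iter_data_lines and its parameters are hypothetical; in practice the number of introductory lines or the comment character would be taken from the Byte-by-byte Description. The sketch assumes that the comment character only marks comments at the beginning of the file, so a line starting with that character after the first data record is returned as data.

```python
def iter_data_lines(lines, skip_lines=0, comment_char=None):
    """Yield the data records of a file, skipping introductory comments.

    skip_lines   -- number of leading lines that are comments
    comment_char -- leftmost character marking introductory comment lines
    Empty lines are ignored wherever they occur.
    """
    in_intro = True
    for lineno, line in enumerate(lines):
        line = line.rstrip("\r\n")
        if lineno < skip_lines:
            continue  # declared introductory lines
        if line.strip() == "":
            continue  # empty lines are ignored everywhere
        if in_intro and comment_char and line.startswith(comment_char):
            continue  # introductory comment line
        in_intro = False
        yield line
```

The function accepts any iterable of lines, so an open file object can be passed directly, e.g. iter_data_lines(f, comment_char="#").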
2.5 Index Files
A set of files
summarizing
the catalogues currently available at CDS
is updated regularly (normally on a weekly basis):
- cats.all: lists all catalogues (flat ascii)
- cats.lis: provides only basic information about each catalogue
- cats.tex: is the LaTeX version used for publication in
the Bulletin d'Information du CDS
- cats.dvi: is the dvi translation of cats.tex
which can be used for remote display, e.g. via XMosaic
- cats.new: contains the same information as cats.all,
for catalogues acquired during the last month.
Note that a facility
exists to query this index remotely: the findcat program,
which is a part of the
cdsclient package, described in Chapter 4.