
This file describes the text variant of the repository index format
used in the Deepsolver project. Future Deepsolver versions may also
support a binary repository index format, but the text one is better
suited for debugging. If any Deepsolver utility does not follow this
document, it is a bug and must be fixed as soon as possible. Everybody
is welcome to comment on anything written below and to suggest
improvements.

1. Repository index reading rules

Every utility that reads a repository index must implement the rules
below to keep the reading process reliable and secure. All index files
must be placed in a single directory (usually i586/base, noarch/base,
etc.). Since index files may be accessed through network protocols
that do not support file list retrieval, nothing in this document
relies on that feature.

First of all, the program must read the 'info' file (see
RepoIndexInfoFile.txt for details). It must be named 'info' and must
be stored directly inside the index directory (for example,
'i586/base/info').

In the next step, the program must get the name of the md5sum file,
ensure that this md5sum file includes a record for the info file
itself, and check all md5sums. This implies that other hash algorithms
may be supported in future versions of Deepsolver, but they are not
considered right now. The md5sum file can contain a GPG signature; if
it is present, it must be checked against known keys.
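
The checksum step could be sketched as follows. This is only a minimal
sketch in Python under assumptions that do not come from this
document: the md5sum file is taken to use the plain md5sum(1) output
format (hex digest, blanks, file name) with no signature wrapper, and
the function name verify_md5sums is hypothetical. GPG verification is
left out entirely.

```python
import hashlib
import os


def verify_md5sums(index_dir, md5sum_name, info_name="info"):
    """Check every record of the md5sum file; fail if 'info' is not covered.

    Assumes md5sum(1)-style lines: '<hex digest>  <file name>'.
    Returns the sorted list of verified file names.
    """
    records = {}
    with open(os.path.join(index_dir, md5sum_name), encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line.strip():
                continue
            digest, _, name = line.partition(" ")
            # md5sum(1) puts ' ' or '*' before the name; drop that marker.
            name = name.lstrip(" *")
            records[name] = digest.lower()
    if info_name not in records:
        raise ValueError("md5sum file has no record for the info file")
    for name, expected in records.items():
        # Refuse names that point outside of the index directory.
        if os.path.basename(name) != name:
            raise ValueError("unexpected path in md5sum record: " + name)
        with open(os.path.join(index_dir, name), "rb") as f:
            actual = hashlib.md5(f.read()).hexdigest()
        if actual != expected:
            raise ValueError("md5sum mismatch for " + name)
    return sorted(records)
```

Rejecting records with directory components is a design choice of this
sketch; it follows from the rule that all index files live in one
directory.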

The main repository index data is stored in three files: rpms.data,
srpms.data and provides.data. Their actual names depend on the
compression method used, which is given in the info file. If any
compression is used, a special suffix must be appended to the names of
the files with the main index content. For example, if the GZIP method
is used, the real file names must be rpms.data.gz, srpms.data.gz and
provides.data.gz. The info file itself does not contain the names of
these files; they must be constructed by these rules. All main index
files must be placed in the same directory as the info file. Be sure
that all of them are covered by md5sum entries.
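
The name construction rule above might look like this in Python. Only
the GZIP to '.gz' pair is taken from this document; the key spellings
"gzip" and "none" and the function name index_file_names are
assumptions of this sketch, since the actual values are defined by the
info file format.

```python
# Only the GZIP -> ".gz" pair comes from this document; the key
# spellings and the "none" entry are assumptions for illustration.
COMPRESSION_SUFFIXES = {
    "none": "",
    "gzip": ".gz",
}

BASE_NAMES = ("rpms.data", "srpms.data", "provides.data")


def index_file_names(compression):
    """Return the real names of the three main index files."""
    try:
        suffix = COMPRESSION_SUFFIXES[compression.lower()]
    except KeyError:
        raise ValueError("unknown compression type: " + compression)
    return [base + suffix for base in BASE_NAMES]
```

An unknown compression type is reported as an error rather than
guessed, so a reader never requests a file name that the index does
not define.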

2. Format of the file with binary package list

The index file with information about binary packages, named
rpms.data.*, consists of a set of sections. Each section describes one
binary package. A section consists of one header line and several
lines of required attributes. Empty lines and lines containing only
blank characters are silently skipped. No comments are allowed. No
blank characters are allowed at the beginning of a line.

The section header line has the form '[PACKAGE_FILE_NAME]'. For
example: '[altlinux-release-sisyphus-20081222-alt1.noarch.rpm]'. Any
non-header line before the first header is invalid. The section
header cannot be confused with other types of lines since it has
square brackets at the beginning and at the end. No other line can
start with an opening square bracket. Between the brackets, the file
name of the binary package must be given with its extension but
without any parent directory information.

A section value line consists of a line prefix and a line value. The
usual line prefixes are 'n=', 'v=', 'r=', 'r:', 'p:'. A line prefix
ending with the '=' sign may be used only once in each section, but a
line prefix ending with the ':' sign can be used multiple times. The
rest of the line after the prefix is the value and has no special
processing rules.
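
The header and value line rules above can be sketched as a small
reader. This is an illustrative Python sketch, not the Deepsolver
implementation: the function name parse_sections and the returned
dictionary layout are hypothetical, and the sketch assumes that a line
prefix ends at the first '=' or ':' character of the line.

```python
def parse_sections(lines):
    """Return a list of sections read from rpms.data-style lines.

    Each section is {"file": name, "single": {...}, "multi": {...}}:
    "single" maps '=' prefixes to one value, "multi" maps ':' prefixes
    to lists of values.
    """
    sections = []
    current = None
    for number, raw in enumerate(lines, 1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # empty and blank-only lines are silently skipped
        if line[0] in " \t":
            raise ValueError("line %d: leading blank characters" % number)
        if line.startswith("["):
            if not line.endswith("]"):
                raise ValueError("line %d: malformed header" % number)
            current = {"file": line[1:-1], "single": {}, "multi": {}}
            sections.append(current)
            continue
        if current is None:
            raise ValueError("line %d: value before first header" % number)
        # Assumption: the prefix ends at the first '=' or ':' character.
        ends = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not ends:
            raise ValueError("line %d: no line prefix" % number)
        end = min(ends)
        prefix, value = line[:end + 1], line[end + 1:]
        if prefix.endswith("="):
            if prefix in current["single"]:
                raise ValueError("line %d: repeated %s" % (number, prefix))
            current["single"][prefix] = value
        else:
            current["multi"].setdefault(prefix, []).append(value)
    return sections
```

Values are kept exactly as written after the prefix, since the text
above defines no special processing for them.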

Here is the list of section parameters:

* 'n=' - the package name;

* 'e=' - the package epoch;

* 'v=' - the package version;

* 'r=' - the package release;

* 'arch=' - the package architecture;

* 'URL=' - the URL to the project homepage;

* 'lic=' - the license of the package content;

* 'pckgr=' - the package maintainer;

* 'summ=' - the single-line package summary;

* 'descr=' - the detailed multiline package description;

* 'src=' - the file name of the source package this binary one was built from;

* 'btime=' - the build time stamp (usually the time stamp of the spec file);

* 'pkgsize=' - the size of package file in bytes;

* 'consize=' - the size of package content after installation in bytes;

* 'p:' - one provides entry;

* 'r:' - one requires entry;

* 'o:' - one obsoletes entry;

* 'cl:' - one change log entry.

Some of these parameters need a more detailed description. The build
time is stored as a single integer in time_t format, suitable for use
with gmtime(3). The multiline package description text is encoded and
stored as a single-line value. The encoding replaces every newline
character ('\n') with the literal two-character string "\n" and every
slash character with two slashes. Every change log entry is
represented as multiline text where the first line is the entry date
in ISO 8601 format (for example, 2012-01-31) and the second line is
the information about the maintainer and the package version. All
following lines are the change log entry text. This multiline value is
encoded into one line in the same way as the multiline package
description above.
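
The value encodings described above might be sketched like this in
Python. The sketch assumes that the "slash" being doubled is the
backslash, because otherwise the literal "\n" marker could not be told
apart from a backslash followed by the letter n; this reading is an
assumption, not something this document states. All function names are
hypothetical.

```python
import time


def encode_multiline(text):
    # Double backslashes first so the "\n" markers added next stay
    # unambiguous (assumes the doubled "slash" is the backslash).
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def decode_multiline(value):
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            # "\n" becomes a newline; "\\" becomes one backslash.
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def split_changelog(value):
    """Return (date, maintainer_and_version, text) of a 'cl:' value."""
    lines = decode_multiline(value).split("\n")
    date = lines[0]
    who = lines[1] if len(lines) > 1 else ""
    return date, who, "\n".join(lines[2:])


def parse_btime(value):
    """Convert a 'btime=' value (time_t seconds) to a UTC struct_time."""
    return time.gmtime(int(value))
```

Decoding walks the string once from left to right instead of chaining
replace() calls, so a doubled backslash followed by the letter n is
never misread as a newline marker.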

Every provides, requires, conflicts or obsoletes entry has the same
format. The string from the end of the line prefix up to the first
space not preceded by a backslash is the corresponding package name.
In the package name, every space must be preceded by a backslash and
every backslash must be doubled. Package names usually cannot include
the space character, but some files in the package content can be
interpreted as provides entries and can include spaces in their names.

After the package name, version information can be added if there is
any. Version information consists of a combination of the characters
'<', '>' and '=' surrounded by spaces, followed by the corresponding
version string without any additional formatting. (TODO: this document
currently does not cover which marks must be used to indicate that a
requires entry applies only to package installation or only to package
removal, because the exact approach has not been decided yet.)
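
The entry format described above could be read as in the following
Python sketch. The function name parse_dependency and the returned
tuple layout are hypothetical, and the sketch treats the operator as a
single token made only of '<', '>' and '=' characters, which is one
possible reading of the rule. The installation and removal marks named
in the TODO are not handled.

```python
def parse_dependency(value):
    """Return (name, op, version); op and version are None when absent."""
    name = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            name.append(value[i + 1])  # escaped space or backslash
            i += 2
        elif ch == " ":
            break  # first unescaped space ends the package name
        else:
            name.append(ch)
            i += 1
    rest = value[i:].strip()
    if not rest:
        return "".join(name), None, None
    op, _, version = rest.partition(" ")
    version = version.strip()
    if not op or set(op) - set("<>=") or not version:
        raise ValueError("malformed version information: " + rest)
    return "".join(name), op, version
```

For instance, the value of a line such as 'r:libfoo >= 1.2-alt3' (a
made-up entry) would yield the name, the operator and the version as
three separate items.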

3. Source package section format

The information about available source packages (src.rpm) must be
stored in the srpms.data file. The format is the same as for binary
packages, but not all of the parameters can be used. The information
about a source package cannot include requires, obsoletes, provides
or conflicts entries. The fields for package architecture, content
size and source package name cannot be present either.
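
A reader could enforce these restrictions with a check like the Python
sketch below. The prefix list is derived from the parameter list in
section 2; this document defines no prefix for conflicts entries, so
none is listed. The names used here are hypothetical.

```python
# Prefixes from section 2 that must not appear in srpms.data sections.
# No prefix for conflicts entries is defined in this document, so it
# cannot be listed here.
FORBIDDEN_IN_SOURCE = ("r:", "p:", "o:", "arch=", "consize=", "src=")


def forbidden_source_prefix(line):
    """Return the forbidden prefix a value line starts with, or None."""
    for prefix in FORBIDDEN_IN_SOURCE:
        if line.startswith(prefix):
            return prefix
    return None
```

Note that 'r:' (requires) is rejected while 'r=' (release) stays
valid, because the two prefixes differ in their final character.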

4. Provides file format

The provides information is stored in the file named provides.data. It
contains the reverse of the mapping from package names to their
provides entries in the file rpms.data. The provides mapping is a set
of sections where the section head holds the provide name and the
section body is the list of packages containing this provide. The
section head has the format '[provide_name]'. Since there is no strict
rule that a package name cannot include square brackets, a section
head can appear only on the first line of the file or after a blank
line. In all other cases, such a line must be processed as part of the
section body. The section body is a list of package names, one per
line, without any additional formatting. No comments are allowed.
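
The section head rule above can be sketched as follows in Python. The
function name parse_provides is hypothetical, and the sketch makes one
assumption beyond the text: a non-head line that follows a blank line
is simply added to the current section body rather than rejected.

```python
def parse_provides(lines):
    """Return {provide_name: [package_name, ...]} from provides.data lines."""
    result = {}
    current = None
    at_boundary = True  # true on the first line and right after a blank line
    for number, raw in enumerate(lines, 1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            at_boundary = True
            continue
        if at_boundary and line.startswith("[") and line.endswith("]"):
            # A bracketed line counts as a head only at a boundary.
            current = result.setdefault(line[1:-1], [])
        elif current is None:
            raise ValueError("line %d: package name before first head" % number)
        else:
            # Anywhere else, even a bracketed line is a package name.
            current.append(line)
        at_boundary = False
    return result
```

Tracking the boundary state is what lets a package whose name happens
to be enclosed in square brackets appear inside a section body without
starting a new section.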
