CWB

cwb-decode-nqrfile.c File Reference

See cqp/corpmanag.c for the file format that this utility decodes.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>


Detailed Description

See cqp/corpmanag.c for the file format that this utility decodes.

Note that some of the code is duplicated between CQP's load-file functions and this file. In the long term, the aim is to remove this duplication (TODO).


Define Documentation

#define SUBCORPMAGIC   36193928

Magic number of the subcorpus file format. It is defined in the CQP code (corpmanag.c). TODO: should probably be defined centrally (cl/globals.h?).

Referenced by nqrfile_print_info().


Function Documentation

int file_length ( FILE * fd )

Gets the size of the file.

Parameters:
    fd  File handle.
Returns:
    The size of the file, from stat().
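
The size lookup described above can be sketched as follows. This is a minimal illustration, not the CWB source: the documentation only says the size comes from stat(), so the use of fileno() and fstat() on the stream, the name stream_length, and the -1 error value are all assumptions.

```c
/* Request POSIX declarations (fileno, fstat) under strict ISO C modes */
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (not part of CWB): returns the size in bytes of the
 * file behind an open stdio stream, or -1 if the size cannot be obtained. */
long stream_length(FILE *fp)
{
  struct stat st;

  if (fp == NULL)
    return -1;
  /* fileno() gives the descriptor underlying the stream; fstat() fills st */
  if (fstat(fileno(fp), &st) != 0)
    return -1;
  return (long) st.st_size;
}
```

Buffered output should be flushed with fflush() before such a call, since bytes still held in the stdio buffer are not yet counted in the size reported by fstat().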

int main ( int argc, char ** argv )

Main function for cwb-decode-nqrfile.

Parameters:
    argc  Number of command-line arguments.
    argv  Command-line arguments.

References nqrfile_print_info().

int nqrfile_print_info ( FILE * fd, int print_header )

Reads a subcorpus file and prints information about it to STDOUT.

"Subcorpus file" here means (a) it begins with the subcorpus magic number; (b) then there is a "registry" area terminated by one or more zero bytes; (c) then there may be the size of the subcorpus; (d) then there are a whole load of start-end range integer pairs, to the end of the file.

The registry is printed if and only if print_header is true. The start-end pairs are printed on tab-delimited lines, one pair per line.
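
The layout described above can be illustrated with a small decoder that works on an in-memory copy of the file. This is a sketch under stated assumptions, not the CWB implementation: it assumes native-endian 4-byte ints, a registry area consisting of a single NUL-terminated string padded to a 4-byte boundary, and no optional size field. The function name sketch_print_ranges and its return convention are hypothetical; the authoritative format is in cqp/corpmanag.c.

```c
#include <stdio.h>
#include <string.h>

#define SUBCORPMAGIC 36193928

/* Hypothetical decoder for an in-memory image of a subcorpus file.
 * ASSUMED layout (not confirmed by the documentation): native-endian ints,
 * one NUL-terminated registry string padded to an int boundary, and no
 * optional size field. Returns the number of pairs printed, or -1 if the
 * buffer does not look like a subcorpus file. */
int sketch_print_ranges(const unsigned char *buf, size_t len,
                        int print_header, FILE *out)
{
  int magic, start, end, n = 0;
  size_t pos = sizeof(int);
  const unsigned char *nul;

  /* (a) the file must begin with the magic number */
  if (len < sizeof(int))
    return -1;
  memcpy(&magic, buf, sizeof(int));
  if (magic != SUBCORPMAGIC)
    return -1;

  /* (b) registry area: locate its terminating zero byte */
  nul = memchr(buf + pos, 0, len - pos);
  if (nul == NULL)
    return -1;
  if (print_header)
    fprintf(out, "%s\n", (const char *) (buf + pos));

  /* step past the terminator, then round up to the next int boundary
   * (the alignment rule is an assumption of this sketch) */
  pos = (size_t) (nul - buf) + 1;
  pos = (pos + sizeof(int) - 1) / sizeof(int) * sizeof(int);
  if (pos > len)
    return -1;

  /* (d) start-end pairs up to the end of the buffer, one per line */
  while (pos + 2 * sizeof(int) <= len) {
    memcpy(&start, buf + pos, sizeof(int));
    memcpy(&end, buf + pos + sizeof(int), sizeof(int));
    fprintf(out, "%d\t%d\n", start, end);
    pos += 2 * sizeof(int);
    n++;
  }
  return n;
}
```

Copying each integer with memcpy() rather than casting the buffer pointer avoids unaligned reads. Skipping zero bytes one at a time after the registry would be unsafe here, because a first range starting at position 0 is itself stored as zero bytes, which is why the sketch relies on an alignment rule instead.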

Parameters:
    fd            File pointer.
    print_header  Boolean: controls whether the "registry" header in the subcorpus file is printed.
Returns:
    Boolean: true if all is OK, false if there was a problem.

References file_length(), registry, and SUBCORPMAGIC.

Referenced by main().