CWB

cwb-huffcode.c File Reference

#include "../cl/globals.h"
#include "../cl/cl.h"
#include "../cl/corpus.h"
#include "../cl/attributes.h"
#include "../cl/storage.h"
#include "../cl/bitio.h"
#include "../cl/macros.h"


Function Documentation

void bprintf ( unsigned int  i,
int  width,
FILE *  stream 
)

Prints a binary representation of an integer to a stream.

Parameters:
  i       Integer to print.
  width   Number of bits in the integer.
  stream  Where to print to.

Referenced by compute_code_lengths().
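
To illustrate the behaviour described above, here is a minimal sketch that renders the low `width` bits of an integer as '0'/'1' characters. The names `format_bits` and `print_bits` are hypothetical and do not appear in cwb-huffcode.c. The most-significant-bit-first order is an assumption, since this reference does not state the bit order.

```c
#include <stdio.h>

/* Hypothetical helper (not part of CWB): writes the low `width` bits of
 * `value` into `out` as '0'/'1' characters, most significant bit first,
 * then NUL-terminates. `out` needs room for width + 1 characters, and
 * `width` must not exceed the number of bits in an unsigned int. */
void format_bits(unsigned int value, int width, char *out)
{
    int pos;
    for (pos = 0; pos < width; pos++) {
        int shift = width - 1 - pos;
        out[pos] = ((value >> shift) & 1u) ? '1' : '0';
    }
    out[width] = '\0';
}

/* Hypothetical stream variant with the same parameter order as bprintf().
 * The width is clamped so the shift above stays well defined. */
void print_bits(unsigned int value, int width, FILE *stream)
{
    char buf[sizeof(unsigned int) * 8 + 1];
    int max_width = (int)(sizeof(unsigned int) * 8);

    if (width < 0) width = 0;
    if (width > max_width) width = max_width;
    format_bits(value, width, buf);
    fputs(buf, stream);
}
```

Under these assumptions, `print_bits(5, 4, stdout)` would write `0101`.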

int compute_code_lengths ( Attribute attr,
HCD hc,
char *  fname 
)

void decode_check_huff ( Attribute attr,
char *  fname 
)

Checks a huffcoded attribute for errors by decompressing it.

This function assumes that compute_code_lengths() has been called beforehand and has ensured that the CL access functions use the _uncompressed_ token sequence.

Parameters:
  attr   The attribute to check.
  fname  Base filename to use for the three compressed-attribute files. Can be NULL, in which case the filenames in the attribute are used.

References _Attribute::any, BFclose(), BFflush(), BFopen(), BFposition(), BFread(), CDA_OK, cderrno, cl_cpos2id(), cl_max_cpos(), CL_MAX_LINE_LENGTH, CompCorpus, CompHuffCodes, CompHuffSeq, CompHuffSync, component_full_name(), corpus_id, _huffman_code_descriptor::length, _huffman_code_descriptor::min_code, NreadInt(), ReadHCD(), _huffman_code_descriptor::symbols, _huffman_code_descriptor::symindex, and SYNCHRONIZATION.

Referenced by main().
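
The referenced descriptor members (`min_code`, `symindex`, `symbols`) suggest a canonical Huffman decoder. The sketch below shows how such a decoder can turn a bit sequence back into symbol ids, under stated assumptions: `ToyHCD`, `TOY_MAXCODELEN`, and `toy_decode` are hypothetical names, the struct is a simplification rather than the real `_huffman_code_descriptor`, and the input is a string of '0'/'1' characters instead of a BFread() bit stream. It is not the code of decode_check_huff().

```c
#include <stddef.h>

#define TOY_MAXCODELEN 8  /* hypothetical limit, unrelated to MAXCODELEN */

/* Hypothetical, simplified code descriptor. Field names mirror the
 * members referenced above; the real struct layout is not shown here. */
typedef struct {
    int min_code[TOY_MAXCODELEN + 1]; /* smallest code value per length   */
    int symindex[TOY_MAXCODELEN + 1]; /* index of first symbol per length */
    const int *symbols;               /* symbol ids ordered by code length */
    int max_codelen;                  /* longest code length in use        */
} ToyHCD;

/* Decodes a string of '0'/'1' characters into symbol ids.
 * Assumes a complete canonical code in which longer codes have smaller
 * numeric values. Returns the number of symbols written to `out`, or -1
 * on malformed or truncated input, or if `out` is too small. */
int toy_decode(const ToyHCD *hc, const char *bits, int *out, int max_out)
{
    int count = 0;
    size_t pos = 0;

    while (bits[pos] != '\0') {
        int value = 0;
        int len = 0;

        /* Append bits until the value reaches the smallest code of
         * the current length; at that point the code is complete. */
        do {
            if (bits[pos] != '0' && bits[pos] != '1')
                return -1;            /* bad character or truncated code */
            value = (value << 1) | (bits[pos] - '0');
            pos++;
            len++;
            if (len > hc->max_codelen)
                return -1;            /* no code is this long */
        } while (value < hc->min_code[len]);

        if (count >= max_out)
            return -1;
        out[count++] = hc->symbols[hc->symindex[len] + value - hc->min_code[len]];
    }
    return count;
}
```

For example, with code lengths 1, 2, 3, 3 the codes would be `1`, `01`, `000`, `001`, giving `min_code` values 1, 1, 0 and `symindex` values 0, 1, 2 for lengths 1 to 3.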

void dump_heap ( int *  heap,
int  heap_size,
int  node,
int  indent 
)

Dumps the specified heap, starting at the given node, to the program output stream.

See also:
protocol
Parameters:
  heap       Location of the heap to dump.
  heap_size  Number of nodes in the heap.
  node       Node at which to begin dumping.
  indent     How many tabs to indent the start of each line.

References protocol.

Referenced by print_heap().
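
A recursive, indented dump of an array-based heap could be sketched as follows. The function name `dump_heap_sketch`, the 0-based child layout (children of node i at 2i+1 and 2i+2), the line format, and the explicit `out` parameter are all assumptions made for illustration. The real dump_heap() writes to `protocol`, and its heap layout is not described in this reference.

```c
#include <stdio.h>

/* Hypothetical sketch: prints heap[node] and its whole subtree, one node
 * per line, indented by one tab per tree level on top of `indent`.
 * Returns the number of nodes printed. */
int dump_heap_sketch(const int *heap, int heap_size, int node, int indent,
                     FILE *out)
{
    int i;
    int printed;

    if (node < 0 || node >= heap_size)
        return 0;                       /* past the end of the heap */

    for (i = 0; i < indent; i++)
        fputc('\t', out);
    fprintf(out, "node %d: %d\n", node, heap[node]);

    printed = 1;
    /* Recurse into the left and right children, one level deeper. */
    printed += dump_heap_sketch(heap, heap_size, 2 * node + 1, indent + 1, out);
    printed += dump_heap_sketch(heap, heap_size, 2 * node + 2, indent + 1, out);
    return printed;
}
```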

void huffcode_usage ( char *  msg,
int  error_code 
)

Prints a usage message and exits the program.

Parameters:
  msg         A message about the error.
  error_code  Value to be returned by the program when it exits.

References drop_corpus, progname, and VERSION.

Referenced by main().

int main ( int  argc,
char **  argv 
)

void print_heap ( int *  heap,
int  heap_size,
char *  title 
)

Prints a description of the specified heap to the program output stream.

See also:
protocol
Parameters:
  heap       Location of the heap to print.
  heap_size  Number of nodes in the heap.
  title      Title of the heap to print.

References dump_heap(), node, and protocol.

Referenced by compute_code_lengths().

int ReadHCD ( char *  filename,
HCD hc 
)

Reads a Huffman code descriptor from file.

Parameters:
  filename  Path to the file where the code descriptor is saved.
  hc        Pointer to the location where the descriptor block will be loaded.
Returns:
Boolean: true for all OK, false for error.

References cl_malloc(), _huffman_code_descriptor::lcount, _huffman_code_descriptor::length, _huffman_code_descriptor::max_codelen, MAXCODELEN, _huffman_code_descriptor::min_code, _huffman_code_descriptor::min_codelen, NreadInt(), NreadInts(), _huffman_code_descriptor::size, _huffman_code_descriptor::symbols, and _huffman_code_descriptor::symindex.

Referenced by decode_check_huff().

static int sift ( int *  heap,
int  heap_size,
int  node 
) [static]

Sifts the heap into order.

Parameters:
  heap       Location of the heap to sift.
  heap_size  Number of nodes in the heap.
  node       Node at which to begin sifting.

Referenced by compute_code_lengths().
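
For readers unfamiliar with the operation, the following is a generic sift-down for a binary min-heap, the kind of step commonly used when building Huffman codes. It is a sketch under assumptions, not the body of sift(): the names `sift_down` and `build_min_heap` are hypothetical, the array is 0-based and ordered directly by its values, and the function returns nothing, whereas the real sift() returns an int whose meaning is not documented here.

```c
/* Generic sift-down for a 0-based binary min-heap stored in an array.
 * Moves heap[node] down the tree until neither child is smaller. */
void sift_down(int *heap, int heap_size, int node)
{
    for (;;) {
        int left = 2 * node + 1;
        int right = left + 1;
        int smallest = node;
        int tmp;

        if (left < heap_size && heap[left] < heap[smallest])
            smallest = left;
        if (right < heap_size && heap[right] < heap[smallest])
            smallest = right;
        if (smallest == node)
            return;                     /* heap property holds here */

        tmp = heap[node];
        heap[node] = heap[smallest];
        heap[smallest] = tmp;
        node = smallest;                /* continue one level down */
    }
}

/* Hypothetical helper: turns an arbitrary array into a min-heap by
 * sifting every internal node, starting from the last one. */
void build_min_heap(int *heap, int heap_size)
{
    int node;
    for (node = heap_size / 2 - 1; node >= 0; node--)
        sift_down(heap, heap_size, node);
}
```

After `build_min_heap()`, the smallest value sits at index 0, which is where a Huffman construction would repeatedly take its two least frequent items from.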

int WriteHCD ( char *  filename,
HCD hc 
)

Writes a Huffman code descriptor to file.

Parameters:
  filename  Path to file where descriptor is to be saved.
  hc        Pointer to the descriptor block to save.
Returns:
Boolean: true for all OK, false for error.

References _huffman_code_descriptor::lcount, _huffman_code_descriptor::length, _huffman_code_descriptor::max_codelen, MAXCODELEN, _huffman_code_descriptor::min_code, _huffman_code_descriptor::min_codelen, NwriteInt(), NwriteInts(), _huffman_code_descriptor::size, _huffman_code_descriptor::symbols, and _huffman_code_descriptor::symindex.

Referenced by compute_code_lengths().
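
The N-prefixed helpers referenced by ReadHCD() and WriteHCD() suggest that the descriptor's integers are stored in a fixed (network, big-endian) byte order so that files are portable across machines. That is an assumption here, and this reference does not describe the file layout. The sketch below only shows the byte-order round trip, with the hypothetical names `put_be32`, `get_be32`, and `put_be32_array` working on a memory buffer rather than a file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a 32-bit value in big-endian byte order. */
void put_be32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)((value >> 24) & 0xFFu);
    dst[1] = (unsigned char)((value >> 16) & 0xFFu);
    dst[2] = (unsigned char)((value >> 8) & 0xFFu);
    dst[3] = (unsigned char)(value & 0xFFu);
}

/* Hypothetical helper: reads back a value written by put_be32(). */
uint32_t get_be32(const unsigned char *src)
{
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8) | (uint32_t)src[3];
}

/* Hypothetical array variant: writes `count` values back to back and
 * returns the number of bytes written. */
size_t put_be32_array(unsigned char *dst, const uint32_t *values, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        put_be32(dst + 4 * i, values[i]);
    return 4 * count;
}
```

Because the bytes are written explicitly rather than by copying the in-memory representation, the result is the same on little-endian and big-endian hosts.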


Variable Documentation

char* corpus_id = NULL

int debug = 0

int do_protocol = 0

Level of progress-info message output (including the compression protocol): 0 = none.

Referenced by compute_code_lengths(), and main().

char* progname

FILE* protocol

File handle for this program's progress-info output: always stdout.

Referenced by compute_code_lengths(), dump_heap(), main(), and print_heap().