CWB

cwb-describe-corpus.c File Reference

#include "../cl/globals.h"
#include "../cl/corpus.h"
#include "../cl/attributes.h"
#include "../cl/macros.h"


Function Documentation

void describecorpus_show_attribute_names(Corpus corpus, int type)

Prints the names of attributes in a corpus to STDOUT.

Only one type of attribute is analysed.

Parameters:
corpus  The corpus to analyse.
type    The type of attribute to show. This should be one of the constants in cl.h (ATT_POS etc.).

References _Attribute::any, TCorpus::attributes, print_indented_list_item(), and start_indented_list().

Referenced by describecorpus_show_basic_info().
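
The type filter described above can be illustrated with a minimal, self-contained sketch. It does not use the real CL structures: DemoAttribute, the DEMO_ATT_* constants (with made-up values) and demo_show_attribute_names() are all hypothetical stand-ins, and the assumption that attributes form a singly linked list is made only for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the attribute-type constants (ATT_POS etc.);
 * the values are illustrative and not taken from cl.h. */
enum { DEMO_ATT_POS = 1, DEMO_ATT_STRUC = 2, DEMO_ATT_ALIGN = 4 };

/* Hypothetical, simplified attribute record: one node of a linked list. */
typedef struct DemoAttribute {
  int type;                         /* one of the DEMO_ATT_* constants */
  const char *name;                 /* attribute name, e.g. "word" */
  const struct DemoAttribute *next; /* next attribute, or NULL at the end */
} DemoAttribute;

/* Print the name of every attribute of the requested type, one per line,
 * and return how many names were printed. */
int demo_show_attribute_names(FILE *out, const DemoAttribute *attributes, int type)
{
  int count = 0;
  const DemoAttribute *a;

  assert(out != NULL);
  for (a = attributes; a != NULL; a = a->next) {
    if (a->type == type) {          /* only one type is analysed per call */
      fprintf(out, "  %s\n", a->name);
      count++;
    }
  }
  return count;
}
```

Calling demo_show_attribute_names(stdout, list, DEMO_ATT_POS) on a list that mixes attribute types prints only the entries tagged DEMO_ATT_POS; returning the count is a convenience of the sketch, not something the documented function does.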

void describecorpus_show_basic_info(Corpus corpus, int with_attribute_names)

Prints basic information about a corpus to STDOUT.

Parameters:
corpus                The corpus to report on.
with_attribute_names  Boolean: iff true, the counts of each type of attribute are followed by a list of attribute names.

References _Attribute::any, ATT_ALIGN, ATT_POS, ATT_STRUC, TCorpus::attributes, cl_max_cpos(), cl_new_attribute, describecorpus_show_attribute_names(), TCorpus::info_file, TCorpus::name, TCorpus::path, TCorpus::registry_dir, TCorpus::registry_name, and word.

Referenced by main().

void describecorpus_show_statistics(Corpus corpus)

Prints statistical information about a corpus to STDOUT.

Information is printed for each attribute of the corpus: the number of tokens and types for a P-attribute, the number of regions for an S-attribute, and the number of alignment blocks for an A-attribute.

Parameters:
corpus  The corpus to analyse.

References _Attribute::any, ATT_ALIGN, ATT_POS, ATT_STRUC, TCorpus::attributes, cl_has_extended_alignment(), cl_max_alg(), cl_max_cpos(), cl_max_id(), cl_max_struc(), and cl_struc_values().

Referenced by main().
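
The per-type reporting described above amounts to a dispatch on the attribute kind. The sketch below shows that shape only: DemoKind, DemoAttrStats and demo_format_statistics() are hypothetical names, the counts are supplied by the caller rather than obtained from cl_max_cpos(), cl_max_id(), cl_max_struc() or cl_max_alg(), and the output wording is invented for illustration rather than copied from the program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical attribute kinds, mirroring P-, S- and A-attributes. */
typedef enum { DEMO_KIND_P, DEMO_KIND_S, DEMO_KIND_A } DemoKind;

/* Hypothetical record holding counts that the caller has already gathered. */
typedef struct {
  DemoKind kind;
  const char *name;
  long tokens;   /* P-attribute: number of tokens */
  long types;    /* P-attribute: number of distinct values */
  long regions;  /* S-attribute: number of regions */
  long blocks;   /* A-attribute: number of alignment blocks */
} DemoAttrStats;

/* Format one statistics line into buf; returns the snprintf() result,
 * or -1 if the kind is not recognised. */
int demo_format_statistics(char *buf, size_t size, const DemoAttrStats *a)
{
  assert(buf != NULL && a != NULL);
  switch (a->kind) {
  case DEMO_KIND_P:
    return snprintf(buf, size, "P-attribute %s: %ld tokens, %ld types",
                    a->name, a->tokens, a->types);
  case DEMO_KIND_S:
    return snprintf(buf, size, "S-attribute %s: %ld regions",
                    a->name, a->regions);
  case DEMO_KIND_A:
    return snprintf(buf, size, "A-attribute %s: %ld alignment blocks",
                    a->name, a->blocks);
  }
  return -1;
}
```

Formatting into a caller-supplied buffer instead of writing straight to STDOUT is a choice made for the sketch so that the result can be inspected; the documented function prints directly.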

void describecorpus_usage(void)

Prints a message describing how to use the program to STDERR and then exits.

References progname and VERSION.

Referenced by main().
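
A usage function of this kind typically writes its message to STDERR and then terminates the process. The following is a minimal sketch of that pattern, not the program's actual message: demo_print_usage() and demo_usage() are hypothetical names, the message text is a generic placeholder, and the exit status of 2 is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a placeholder usage line to the given stream; returns the
 * fprintf() result (number of characters written, or negative on error). */
int demo_print_usage(FILE *out, const char *progname)
{
  assert(out != NULL && progname != NULL);
  return fprintf(out, "Usage: %s [options] <corpus>\n", progname);
}

/* Print the usage message to STDERR and terminate the program.
 * The exit status 2 is an assumption made for this sketch. */
void demo_usage(const char *progname)
{
  demo_print_usage(stderr, progname);
  exit(2);
}
```

Splitting the printing from the exit keeps the message testable on its own; a caller such as main() would invoke demo_usage(progname) when argument parsing fails.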

int main(int argc, char **argv)

Main function for cwb-describe-corpus.

Prints information about an indexed corpus to STDOUT.

Parameters:
argc  Number of command-line arguments.
argv  Command-line arguments.

References cl_delete_corpus(), cl_new_corpus(), corpus, describe_corpus(), describecorpus_show_basic_info(), describecorpus_show_statistics(), describecorpus_usage(), progname, and registry.


Variable Documentation

char* progname = NULL

String set to the name of this program.