Statistics::Gap

The Statistics::Gap Perl module is an adaptation of the 'Gap Statistic'.

Statistics::Gap Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Anagha Kulkarni and Ted Pedersen
  • Publisher web site: http://search.cpan.org/~anaghakk/


Statistics::Gap Description

The Statistics::Gap Perl module is an adaptation of the 'Gap Statistic'.

SYNOPSIS

    use Statistics::Gap;
    $predictedk = &gap("prefix", "vec", INPUTMATRIX, "rbr", "h2", 30, 10, "rep", 90, 4);

OR

    use Statistics::Gap;
    $predictedk = &gap("prefix", "vec", INPUTMATRIX, "rbr", "h2", 30, 10, "rep", 90, 4, 7);

INPUTS

1. Prefix: The string to be used as a prefix when naming the intermediate files and the .dat files (plot files).

2. Space: Specifies the space in which the clustering should be performed. Valid parameter values:
     vec - vector space
     sim - similarity space

3. InputMatrix: Path to the input matrix file. (The input file format is described in the module's full documentation.)

4. ClusteringMethod: Specifies the clustering method to be used. (Learn more about this at: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview) Valid parameter values:
     rb - Repeated Bisections
     rbr - Repeated Bisections followed by k-way refinement
     direct - Direct k-way clustering
     agglo - Agglomerative clustering
     bagglo - Partitional biased Agglomerative clustering
   NOTE: bagglo can be used only if space=vec.

5. Crfun: Specifies the criterion function to be used for finding clustering solutions. (Learn more about this at: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview) Valid parameter values:
     i1 - I1 Criterion function
     i2 - I2 Criterion function
     e1 - E1 Criterion function
     h1 - H1 Criterion function
     h2 - H2 Criterion function

6. K: An approximate upper bound for the number of clusters that may be present in the dataset.

7. B: The number of replicates/references to be generated.

8. TypeRef: Specifies whether to generate B replicates from a reference or to generate B references. Valid parameter values:
     rep - replicates
     ref - references

9. Percentage: Specifies the percentage confidence to be reported in the log file. Because Statistics::Gap generates its reference distribution with a parametric bootstrap, it is important to know the interval around the sample mean that could contain the population ("true") mean, and with what certainty.

10. Precision: Specifies the precision to be used while generating the reference distribution.

11. Seed: The seed to be used with the random number generator. (This is an optional parameter. By default no seed is set.)

Requirements:

  • Perl
  • CLUTO
  • Math::BigFloat
  • Algorithm::RandomMatrixGeneration
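
Since several of the inputs above accept only a fixed set of string values, it can help to check them before starting a clustering run. The following is a minimal sketch of such a check. The helper check_gap_args and the file name matrix.txt are hypothetical and are not part of Statistics::Gap; the sketch only encodes the valid values and the bagglo restriction listed above, and it does not call the module or CLUTO.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Statistics::Gap: it checks the string
# arguments against the values listed in the INPUTS section.
my %VALID = (
    Space            => [qw(vec sim)],
    ClusteringMethod => [qw(rb rbr direct agglo bagglo)],
    Crfun            => [qw(i1 i2 e1 h1 h2)],
    TypeRef          => [qw(rep ref)],
);

# Takes the arguments in the same order as gap() and returns a list of
# problems; an empty list means the documented constraints are satisfied.
sub check_gap_args {
    my ($prefix, $space, $matrix, $method, $crfun, $k, $num_refs, $typeref) = @_;
    my %given = (
        Space            => $space,
        ClusteringMethod => $method,
        Crfun            => $crfun,
        TypeRef          => $typeref,
    );
    my @problems;
    for my $name (sort keys %VALID) {
        my $value = $given{$name};
        push @problems, "$name: '$value' is not a documented value"
            unless grep { $_ eq $value } @{ $VALID{$name} };
    }
    # bagglo is documented as usable only in vector space.
    push @problems, "ClusteringMethod: bagglo requires Space=vec"
        if $method eq 'bagglo' && $space ne 'vec';
    return @problems;
}

# "matrix.txt" is a placeholder path used only for illustration.
my @args     = ("prefix", "vec", "matrix.txt", "rbr", "h2", 30, 10, "rep", 90, 4);
my @problems = check_gap_args(@args);
print @problems ? join("\n", @problems) . "\n" : "arguments look valid\n";
```

If the returned list is empty, the same argument list could then be passed on to gap(). The numeric inputs (K, B, Percentage, Precision, Seed) are deliberately left unchecked here, because the description above does not state their allowed ranges.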

