acube.di.unipi.it
A³ » Software
http://acube.di.unipi.it/a-software
Advanced Algorithms and Application Lab @ di.unipi.it. Technology breakthroughs stem from algorithmic theory, but do not underestimate the impact of algorithm engineering! A framework to fairly and fully benchmark text annotators (systems that, given a text document, aim at finding the topics the text is about, identified as Wikipedia pages), running deep testing that focuses on many aspects of a text annotator (currently in alpha stage). Clustering algorithm for TagMySearch. With their ...
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/api.html
Compressed Indexes and their Testbeds. We are particularly interested in self-indexes, namely compressed indexes that encapsulate sufficient information to reproduce any substring of the indexed text, and thus possibly the text itself. If a compressed index is not a self-index, then one must keep the text together with the index and report the text size plus the index size. To use a compressed index over a text, we first have to build it, and then we can query it. Interface, written in the C/C.
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/experiments.html
Compressed Indexes and their Testbeds. In case you are interested in monitoring the performance for increasing text sizes, you will prefer to test increasing prefixes of a text collection. In the utilities there is a program to cut text prefixes of any length. The performance of a compressed index may be evaluated either at construction time or at query time. We are interested in: To build the index, considering user system time. Internal memory working space. Permanent space on disk. Usually one is int...
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/initiative.html
Compressed Indexes and their Testbeds. We strongly suggest following the LGPL License. Software may come with only its executables, but we strongly suggest adding the sources. The submitted indexes must implement the whole API interface we propose. In case some functions are missing, we suggest adding bogus functions. Several people have already contributed to the site in one way or another. University of Chile, Chile. LZ-index implementation. University of Valladolid, Spain. PPMDI compr...
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/texts.html
Compressed Indexes and their Testbeds. The choice of the types of texts to be indexed and experimented followed some basic considerations. First, we wished to cover a representative set of application areas where the problem of full-text indexing might be relevant, and for each of them selected texts freely available. Which allows one to limit the indexed text to any possible length (see below). These are the current collections provided in the Pizza&Chili repository: SOURCES (source program code).
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/biblio.html
Compressed Indexes and their Testbeds. Compressed Full Text Indexes, by V. Makinen and G. Navarro. Technical Report TR/DCC-2005-7, Dept. of Computer Science, University of Chile, June 2005. A comprehensive survey on compressed full-text indexes. Compressed Full Text Indexes, by V. Makinen and G. Navarro. ACM Computing Surveys 39(1), article 2, 2007. For personal use only, download from Compressed Full Text Indexes. A comprehensive survey on compressed full-text indexes.
pizzachili.dcc.uchile.cl
Pizza&Chili Corpus -- Compressed Indexes and their Testbeds
http://pizzachili.dcc.uchile.cl/index.html
Compressed Indexes and their Testbeds. The new millennium has seen the birth of a new class of full-text indexes, which are structurally similar to Suffix Trees and Suffix Arrays, in that they support the powerful substring search operation, but are succinct in space, in that their size is close to the empirical entropy of the indexed data. They are therefore called compressed Suffix Trees and compressed Suffix Arrays, or in general compressed indexes. This site has two mirrors: one in Italy and one in Chile.
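
As a rough illustration of the substring search operation mentioned above, the following is a minimal sketch of a plain (uncompressed) suffix array in Python. It is not the Pizza&Chili API and not a compressed index: the function names `build_suffix_array` and `locate` are hypothetical, and the construction is the naive sort-based one, chosen only for brevity.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction by sorting suffixes directly; fine for small inputs,
    far slower and larger than the compressed indexes the site benchmarks.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def locate(text, sa, pattern):
    """Return all positions where `pattern` occurs in `text`, in increasing order.

    Uses two binary searches over the suffix array `sa`, comparing the first
    len(pattern) characters of each suffix against the pattern.
    """
    m = len(pattern)

    # First search: leftmost suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second search: leftmost suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with the pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "abracadabra"
    sa = build_suffix_array(text)
    print(locate(text, sa, "abra"))
```

Note that this sketch keeps the text alongside the index, so it is not a self-index in the sense used on the API page: its space cost would be the text size plus the index size.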