Journal of Documentation, Letter-to-the-Editor, 28(4), pp. 344-345 (December 1972).
INDEXING TERMINOLOGY AND PERMUTED INDEXES
Eugene Garfield
e-mail: garfield@codex.cis.upenn.edu
home page: eugenegarfield.org
Dear Sir,
I applaud the efforts of Kemp et al.¹ to obtain some agreement and standardization for the terms "permuted", "rotated", etc. When first we conceived the Rotaform Index® for the Index Chemicus®,² it was obvious that the description of KWIC indexes as "permuted" indexes was inaccurate. To us, a KWIC index was in fact a "rotated" or "cycled" index. I presume Kemp would now prefer the latter term. Would a KWOC index be a "rotated" or a "cycled" index?
As another recent example of "rotated" indexing one should cite the Chemical Substructure Index™,³ now in its third year of publication. CSI is based upon the use of the Wiswesser Line Notation. Each of about six "fragments" of the notation is used as a main entry, but the entire notation for each chemical compound is repeated in its "cycled" form. CSI has occasionally been called a permuted index by analogy to KWIC indexes.
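The idea of a rotated (cycled) entry can be illustrated with a minimal Python sketch. It assumes a simplified model in which a title or notation is just a list of terms; the function name `rotations` and the sample terms are illustrative only and are not taken from any ISI program. Real KWIC indexes also apply stop lists and column alignment, which are omitted here.

```python
def rotations(terms):
    """Return every cyclic rotation of `terms`, one per starting position.

    Each rotation keeps the original order of the terms; only the
    starting point changes, so N terms yield exactly N entries.
    """
    return [terms[i:] + terms[:i] for i in range(len(terms))]


# Illustrative terms only (an assumption, not data from the letter).
title = ["INDEXING", "TERMINOLOGY", "PERMUTED", "INDEXES"]

for entry in rotations(title):
    # The first term of each rotation serves as the main entry.
    print(" ".join(entry))
```

Every entry preserves the sequence of the source, which is why "rotated" or "cycled" describes the result better than "permuted".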
In contrast to the above, the Permuterm® Subject Index⁴ is, I believe, well named. I see no reason why "permuted" should imply that all possible permutations should be used. In PSI all possible permuted pairs are generated, producing N(N-1) entries per document indexed.
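The N(N-1) count can be checked with a short Python sketch. It is offered only as an illustration, under the assumption that the N terms of a document are distinct; `permuted_pairs` is a hypothetical name and does not represent the proprietary Permuterm programs.

```python
from itertools import permutations


def permuted_pairs(terms):
    """Return every ordered (primary term, co-term) pair of distinct terms.

    Assumes `terms` contains no duplicates; N terms then give N(N-1) pairs.
    """
    return list(permutations(terms, 2))


# Illustrative terms only (an assumption, not data from the letter).
terms = ["INDEXING", "TERMINOLOGY", "PERMUTED", "INDEXES"]
pairs = permuted_pairs(terms)

print(len(pairs))  # N(N-1) = 4 * 3 = 12
for primary, co_term in pairs[:3]:
    print(primary, "/", co_term)
```

Each pair appears in both orders, so every term serves in turn as a primary term with every other term as its co-term. No longer permutations are produced.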
Incidentally, Permuterm can be and has been used in two different contexts. The first is in the Permuterm Subject Index section of the Science Citation Index®. In the PSI, title words are employed "as is". But permutation is followed by a significant man-machine editing procedure which standardizes many variant spellings, etc.
The second Permuterm method is employed with indexer-assigned terms. This method was used in Current Contents® Chemical Sciences quite successfully. More recently it was used in the compilation of the cumulative indexes to the Journal of the Electrochemical Society. The computer programs for Permuterm indexing, while proprietary information, are made available at nominal cost to interested organizations. Several industrial firms have used them quite successfully for internal files.
There is an interesting "trade-off" made in selecting either Permuterm method. In the latter case, the indexer suppresses any terms which are judged to be superfluous. This markedly reduces the number of permuted pairs. In a published printed index this saves considerable space. It also eliminates certain ambiguities in the use of homonyms, but this is a point which needs considerable analysis, since in PSI some terms may create noise in the context of a primary term and music as a co-term.
More recently, Current Contents/Life Sciences introduced a Weekly Subject Index based upon title words.⁵ These entries are drawn from the same database as PSI. However, the terms are not permuted. Certain machine editing is performed which, e.g., hyphenates terms like BIRTH-CONTROL that might otherwise appear as separate entries instead of pairs. This general topic of machine generation of word-pairs was dealt with extensively in a previous report.⁶ These frequency-based methods can not only reduce space requirements but also create a more useful index. This assumes that it is desirable to further sub-divide any index so that, when a threshold number of entries under any term pair is reached, that pair becomes itself a primary term which is then permuted with the remaining terms in the title. The advent of cheaper disc memories makes the use of such methods realizable in the near future.
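One possible reading of the threshold rule just described is sketched below in Python. It is an interpretation and not a description of any existing ISI program. The function name `promote_frequent_pairs`, the sample titles, and the threshold value are all assumptions made for illustration, and each title is assumed to contain distinct terms.

```python
from collections import Counter
from itertools import permutations


def promote_frequent_pairs(titles, threshold):
    """Promote term pairs that reach `threshold` entries to primary terms.

    `titles` is a list of term lists (distinct terms within each title).
    Returns a dict mapping each promoted (term, term) pair to the list of
    remaining title terms it is paired with, in title order.
    """
    # Count how many entries fall under each ordered term pair.
    counts = Counter(
        pair for terms in titles for pair in permutations(terms, 2)
    )
    promoted = {}
    for terms in titles:
        for pair in permutations(terms, 2):
            if counts[pair] >= threshold:
                # The promoted pair is combined with the remaining terms.
                rest = [t for t in terms if t not in pair]
                promoted.setdefault(pair, []).extend(rest)
    return promoted


# Hypothetical titles, reduced to their significant terms.
titles = [
    ["BIRTH", "CONTROL", "METHODS"],
    ["BIRTH", "CONTROL", "POLICY"],
    ["BIRTH", "RATE"],
]

result = promote_frequent_pairs(titles, threshold=2)
print(result[("BIRTH", "CONTROL")])
```

In this sketch only pairs that occur in at least two titles are promoted, so a heavily posted pair gains its own sub-divided entry while rarer pairs stay as they are.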
EUGENE GARFIELD
Institute for Scientific Information, President
325 Chestnut Street,
Philadelphia, Pennsylvania 19106, USA.
REFERENCES
1. KEMP, D. A., SIMPSON, I. S., and WILSON, T. D. Indexing: Permuted, Rotated, or Cycled, Journal of Documentation, 28(1), pp. 67-8 (1972).
2. SHER, I. H., O'CONNOR, J., and GARFIELD, E. Rotadex: A New Index for Generic Searching of Chemical Compounds, Journal of Chemical Documentation, 4, pp. 49-53 (1964).
3. GRANITO, C. Chemical Substructure Index (CSI). A New Research Tool, Journal of Chemical Documentation, 11(4), pp. 251-6 (1971).
4. GARFIELD, E. Permuterm Subject Index: the Primordial Dictionary of Science, Current Contents, 3 June 1961, p. 4.
5. GARFIELD, E. A Weekly Subject Index for Current Contents/Life Sciences. In Press. Presented at the 71st Annual Meeting of the Medical Library Association, held in San Diego, California, 11-15 June 1972.
6. GARFIELD, E. Citation Indexing, Historio-Bibliography, and the Sociology of Science, Proceedings of the Third International Congress of Medical Librarianship, Amsterdam, May 1969, pp. 187-204, Excerpta Medica (1970).