Keynote Address – Plenary Session 1
Eugene Garfield, Ph.D.
Chairman Emeritus, Thomson Reuters - Health Sciences
Founding Editor & Publisher, The Scientist
Philadelphia, PA, USA
Fifth International Conference on WIS & Tenth COLLNET Meeting
September 13-16, 2009
A brief review of the evolution of the Science Citation Index, designed primarily for information retrieval, into a tool for research evaluation and science policy analysis.
It is gratifying to me as a longtime citation analyst to have been asked to open this 10th COLLNET meeting on Scientometrics. I am honored by the recognition that WISE has bestowed on me. But as I told Prof. Chaomei Chen, who first invited me to be a keynote speaker, it is difficult for me to provide truly original remarks after fifty years of lectures and essays on the subject of citation and collaborative networks. However, I am grateful to him for persisting in inviting me here.
I think there is more than enough evidence to indicate that quantitative studies of science and scholarship have come of age. Forty-five years after the launch of the Science Citation Index in 1964, and more than fifty years after my first paper on citation indexing as a new dimension in information retrieval (Science, 1955),1 it is indeed relevant to ask, as I did ten years ago in Copenhagen, “From Citation Indexes to Informetrics: Is the tail wagging the dog?” (Libri, 1998).2 The tail is scientometrics and the dog is information retrieval.
The new generation of scientists, and even scientometricians, need to be regularly reminded that the Science Citation Index (SCI) was not originally created to conduct quantitative studies, to calculate impact factors, or to facilitate the study of the history of science. As an example, a professor at Cornell University mistakenly made that claim in a recent paper entitled “Reward or Persuasion? The battle to define the meaning of a citation” (Learned Publishing 22(1):5-11, January 2009).3
Subsequently, in an e-mail message dated February 9, 2009, Michael Koenig reminded him that the Science Citation Index was not developed as a tool to study the history of science but rather as a bibliographic search tool or, to put it in more modern terms, as an aid to information retrieval.4
This generational amnesia is not unusual. That same week, in a posting to Steve Harnad’s Open Access listserv, readers had to be reminded that the ASCA personal profiling system, based on citation indexing and keywords, had been in existence at the Institute for Scientific Information since 1965. That system is continued by Thomson Reuters today, although we now call all such systems citation alerts. As I said, in 1965 the Automatic Subject Citation Alert (ASCA) combined the advantages of citation, title-word, and natural-language searching. ASCA also included in its original profiling system a “Chinese menu” scheme, which may sound strange to this audience: in the USA Chinese food is very popular, and in the days when I was growing up every Chinese restaurant permitted you to select dishes from several different categories of food.
So, to expand the historical record, the SCI and citation indexing were meant not only to aid information retrieval but also to facilitate SDI, i.e., Selective Dissemination of Information. The term SDI can easily be confused with President Ronald Reagan’s SDI, the Strategic Defense Initiative, later called “Star Wars”. SDI, or Selective Dissemination of Information, eventually evolved into the personal alert, as in the case of Google citation alerts. And, of course, this is a part of the Web of Science system as well.
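The profile-matching idea behind ASCA and SDI can be illustrated in a few lines of code. The sketch below is purely illustrative, not the original ISI implementation; the function and field names are hypothetical. It captures the essential “Chinese menu” logic of selecting from more than one category of search terms: an alert fires when a new article matches a watched keyword or cites a reference the user is tracking.

```python
# Minimal sketch of an SDI/ASCA-style alert profile (illustrative only;
# the names here are hypothetical, not the original ISI system).
def matches_profile(article, keywords, watched_citations):
    """An article triggers an alert if its title contains any watched
    keyword OR it cites any reference the user is tracking."""
    title = article["title"].lower()
    if any(kw.lower() in title for kw in keywords):
        return True
    return any(ref in watched_citations for ref in article["cited_refs"])

new_article = {
    "title": "Citation networks in information retrieval",
    "cited_refs": {"Garfield 1955 Science"},
}
# Matches on the keyword "citation" even with no watched citations.
print(matches_profile(new_article, ["citation"], set()))  # prints True
```

In a weekly SDI run, every newly indexed article would be tested against each subscriber's profile in this fashion, with matches mailed out as a personal alert.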
We should not chastise the Cornell professor too much for his assumption that the SCI was invented as a tool for the historian of science. I myself may have contributed to that impression because of my early association with, and references to, sociologists and historians of science such as Robert K. Merton, Derek de Solla Price, and Arnold Thackray, among others. Indeed, it was J.D. Bernal, the progenitor of the “Science of Science,” which later became Scientometrics, who laid the foundation for this field. I described all this history at the 11th ISSI conference in Madrid a few years ago (“From the Science of Science to Scientometrics: Visualizing the History of Science with HistCite software,” Proceedings of ISSI 2007, Volume 1, p. 21-26, June 25-27, 2007),5 recently reprinted in the Journal of Informetrics (3 (2009) 173-179).6
The confusion about the original objectives of the SCI was perhaps further compounded when Irving H. Sher and I published our 1964 report on the use of citation analysis for writing the history of science. That project ultimately led to the creation of the HistCite software for creating historiographs as a by-product of searches in the Web of Science. (Garfield E, Sher IH, and Torpie RJ. “The Use of Citation Data in Writing the History of Science.” The Institute for Scientific Information, December 1964.)
In short, I have always believed that tracing the history and evolution of topics is intimately related to the task of documenting not only new scholarly journal articles and books but also the origins of technologies, through patent citation analysis.
Many scientists and editors make the even more outrageous error of assuming that the SCI was created just to produce its by-product database, the Journal Citation Reports (JCR). That database now reports impact factors for over 11,000 journals each year (SCI 7,886; SSCI 2,523; AHCI 1,464). Considering the diverse range of editorial commentaries arguing for and against journal impact factors, it is understandable that science editors and administrators new to the field of bibliometrics assume that the SCI and its successor, the Web of Science, were created to calculate journal impact factors, and even Hirsch indexes, among many other measures.
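For readers new to the JCR, the two-year impact factor it reports is simple arithmetic: citations received in a given year to a journal's items from the previous two years, divided by the number of citable items published in those two years. A minimal sketch, using hypothetical counts for an invented journal:

```python
def impact_factor(citations_to_prev_two_years, citable_items_prev_two_years):
    """Two-year journal impact factor for year Y: citations received in Y
    to items the journal published in Y-1 and Y-2, divided by the number
    of citable items it published in Y-1 and Y-2."""
    return citations_to_prev_two_years / citable_items_prev_two_years

# Hypothetical journal: 150 citations in 2009 to its 2007-2008 papers,
# of which there were 60 citable items.
print(impact_factor(150, 60))  # prints 2.5
```

The simplicity of the formula is part of its appeal, and part of the controversy: a single ratio cannot capture the skewed citation distributions that underlie it.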
The growth of COLLNET and ISSI, and this symposium itself, demonstrate that it is appropriate to say of citationology, or citation analysis, that not only is the tail wagging the dog, but the tail has become a huge animal that is rapidly mutating into a multiplegic monster. Indeed, research evaluation and the creation of research indicators is an industry that is driving the modern R&D enterprise through its influence on administrators and policy makers. The very existence of WISE, one of our sponsors here today, is further evidence of that growth.
Whether it is the NSF’s biennial Science & Engineering Indicators report in the US, the annual JCR from Thomson Reuters, or the Research Assessment Exercise (RAE) in the U.K. and its equivalents in other countries, I think there is no turning back.
Considering, therefore, the impact of publication studies on individual careers and funding decisions, it is ever more critical that the scientific community, and especially journal editors, adopt the “gentle art of citation” itself (Swart P and Carling P. Sedimentology 55:115-116, October 2008).7 We cannot allow the Google generation of researchers to repeat what Ginsburg called “the disregard syndrome” (Ginsburg I. “The Disregard Syndrome: A menace to honest science?” The Scientist 15(24):51, December 10, 2001).8 Each author, like each potential patentee, must ensure that the appropriate historical precedents are cited with precision. When that happens, perhaps we will approach bibliographical nirvana. So it becomes clear that the quality of science publication will drive Scientometrics. Information retrieval is still crucial to the scientific enterprise.
End of Keynote Address
I would like to repeat the sage remarks of Peter Swart and Paul Carling concerning the ethical and critical role of citation, as published in Sedimentology, a journal most of us do not read. Their comments state the case very well, and I will therefore quote directly from their above-mentioned editorial:
“The citations which authors use reveal a significant amount in relation to what the authors understand about a subject, how the paper will fare with reviewers and, ultimately, the number of times the paper itself will be cited. The first problem, which we see repeatedly as editors, relates to the citation of a primary reference for a particular phenomenon. Frequently, authors cite the most recent paper rather than the original work.”
“Some authors do not want to do the hard work of unearthing who first came up with a particular idea, so they use the dreaded ‘e.g.’ to include a collective group of papers. This is a ‘cover your backside’ citation so you can give credit to multiple persons including those who maybe had nothing to do with creating the concept in the first place. This problem is particularly bad in some papers in which every single citation is accompanied by a list of papers, making the citation essentially useless. Finally, we have the problem of the excessive self citation, people who feel that they are not cited enough and insist on squeezing as many of their papers into the reference list as possible. It is natural to cite oneself in a paper, usually for quite valid reasons, but when the self-citation quotient becomes excessive it reflects poorly on the paper, the author and the journal. We have been attempting to correct these problems but it is still up to the authors, who know their fields better than anyone, to consider carefully the insertion of a reference. Find out who was originally responsible for an idea and do not add references indiscriminately. The appropriate use of citations will benefit everyone and delay what is probably inevitable, citation inflation.”
1. Garfield E. “Citation Indexes for Science: A new dimension in documentation through association of ideas.” Science 122(3159):108-111, 1955.
2. Garfield E. “From Citation Indexes to Informetrics: Is the tail wagging the dog?” Libri 48(2):67-80, June 1998.
3. Davis PM. “Reward or persuasion? The battle to define the meaning of a citation.” Learned Publishing 22(1):5-11, January 2009.
4. Koenig M. E-mail message dated February 9, 2009.
5. Garfield E. “From the Science of Science to Scientometrics: Visualizing the History of Science with HistCite software.” Proceedings of ISSI 2007, Volume 1, p. 21-26, 11th ISSI International Conference, Madrid, Spain, June 25-27, 2007.
6. Garfield E. “From the Science of Science to Scientometrics: Visualizing the History of Science with HistCite software.” Reprinted in Journal of Informetrics 3:173-179, 2009.
7. Swart P and Carling P. Editorial. Sedimentology 55:115-116, October 2008.
8. Ginsburg I. “The Disregard Syndrome: A menace to honest science?” The Scientist 15(24):51, December 10, 2001.