An autobiographical account of Eugene Garfield's involvement in chemical information systems. This paper traces his personal evolution from laboratory chemist to information scientist, one who combined a knowledge of structural linguistics and information technology into an algorithmic system for identifying molecular formulas in the literature.

Recognizing the shortcomings of traditional abstracting and indexing systems, like Index Medicus and Chemical Abstracts, he launched Current Contents, Index Chemicus, and Science Citation Index, which were designed to provide timely, weekly, and highly specific retrieval of chemical information.

The experience in locating and coding the steroid literature for the U.S. Patent Office led to a variety of chemically based services dealing with new compounds and intermediates as well as graphical presentation of chemical formulas and reactions.

The Index Chemicus Registry System was the first to use the Wiswesser Line Notation, which became a standard in the pharmaceutical field. This eventually led to the Current Chemical Reactions Database and the Reaction Citation Index.

As a young B.S. chemist in 1949 at Evans R&D, a chemical consulting firm, I spent days in the lab measuring the viscosity of potential shampoo products. One day I was asked to summarize a meeting held with a client investigating substitutes for natural products imported from Asia. Someone had noticed in my resume that in high school I had learned typing and other office skills. In my previous job, since I couldn't find one as a chemist, I had worked as a sales clerk at LaSalle University. This provided free tuition to attend classes in Gregg shorthand and Stenotype.

Then I received a call from my cousin, Sid Bernhard. There was an opening for a lab assistant in the laboratory of Professor Louis P. Hammett where Sid was doing his doctorate1. In Hammett's lab I had the job of synthesizing a long series of esters needed to test theories of acid-base catalysis. One day, after months of seemingly endless lab work, I discovered a closet outside the lab in Havemeyer Hall. Here was where decades of graduate students had deposited thousands of synthesized compounds. When I discovered that some were those I had to produce, I decided to search this collection before doing another synthesis from scratch. And I soon learned the value of searching the literature for the same reason.

Professor Hammett was editor of the McGraw Hill chemistry book series. His excellent personal library included a complete set of Chemical Abstracts. However, by the time of my second laboratory explosion, I decided to pursue a different career path. Using my secretarial skills, I applied for a post as chemical secretary at the Ethyl Corporation. To be interviewed, I attended the 75th anniversary meeting of the American Chemical Society (ACS) in New York City in March 1951. By accident, I stumbled into the sessions of the Division of Chemical Literature where I met James W. Perry. He introduced me to Dr. Sanford Larkey at Johns Hopkins University (1951-53). Within a few months I was employed at the Welch Medical Library. I have often stated that this research experience eventually led to the establishment of the Institute for Scientific Information2. It is significant that when he was asked for a reference, Professor Hammett told Dr. Larkey that "Garfield is not particularly imaginative but a very hard-working fellow," -- just what Larkey wanted3.

At Johns Hopkins University I became steeped in the details of Chemical Abstracts and other abstracting services and served as a volunteer abstractor of Spanish pharmacology papers. As the Welch Medical Project chemist, I kept up with a variety of chemical information activities. It was there that I met most of the information pioneers who were the backbone of the American Documentation Institute4. Having worked on the chemical nomenclature used in Medical Subject Headings (MESH) and understanding the need for new approaches to retrieving chemical information at the Welch Project, I was mentally prepared to meet challenges in that arena. I attended Library School from 1953 to 1954. That is when I wrote my primordial paper on "Citation Indexes for Science" published in 1955 in Science5a5b. Within one year I also published a paper on searching chemical patents using citation indexes6. This first Patent Citation Index was based on a sample file of several thousand patents supplied to me by Marge Courain of Merck in Rahway. She had been a classmate at Columbia University School of Library Science.

In the summer of 1954 I went to work for SK&F Labs in Philadelphia as a consultant. Within a year I began Current Contents of Management publications. Then, in 1956, I began Current Contents of Pharmaco-Medical, Chemical & Life Sciences. Miles Labs was the first company to use this weekly product and was soon followed by others who paid $1500 per year for 25 copies. In 1958 Current Contents was made a subscription service at $100 per year for industrial and government institutions and $50 per year for academics. The impact of Current Contents has essentially been ignored by those who discuss major advances in the history of scientific information processing. But the timing of Current Contents was, in fact, revolutionary. It had been unheard of to deliver indexed information weekly.

In 1958, since I was an independent, private information consultant, Robert A. Harte of Merck, who chaired a committee of the Pharmaceutical Manufacturers Association (PMA), asked me to perform a contract to index all new steroid compounds reported in the literature. The data were to be delivered to the U.S. Patent Office, which was represented by Don Andrews. His group of patent examiners would test the feasibility of conducting patent searches using IBM punched cards. The scanning and coding work on this contract led to my recognition that it would be possible, algorithmically, to locate and identify newly reported compounds. This eventually led to the organization of the Index Chemicus in 1959 and its launch in 1960. The first issue was dedicated to my mentor at SK&F, Ted Herdegen, who died that year7.

There are several fundamental points which need to be emphasized in the new approach ISI took to handling chemical information.

First, the recognition in 1958 that a systematic chemical name could be algorithmically converted to a molecular formula and subsequently to a line notation.

Secondly, the recognition that synthetic chemistry journals almost universally called out the new chemical compounds by providing their molecular formulas. Therefore it was not necessary to convert all compounds to systematic names in order to create a molecular formula index.

Thirdly, the realization that many intermediary compounds not identified by molecular formulas were missed by CAS. Index Chemicus provided this small but important added value. The focus on intermediates was essential for patent purposes. As Stu Kabach reported recently, CAS was very inconsistent in its handling of patents in the early days8. NutraSweet was an intermediate compound indexed in Index Chemicus long before it became famous as a sweetener9.

Fourthly, the recognition that the graphical presentation of organic chemistry was essential to the specialist. Any working organic chemist realized that the graphical display of structural information was the only way to communicate efficiently. In short, all of these elements speak to the recognition of the linguistic nature of chemistry at various levels.

Fifth, and not least, timing. In those days, Chemical Abstracts was quite late in producing its molecular formula indexes and we took the revolutionary step of doing this monthly and later, weekly. Furthermore, we received all foreign journals by airmail and stuck to a rigid production schedule.

Sixth, based on our experience with Current Contents, Science Citation Index, and the steroid coding project, we knew that Bradford's Law applied in chemistry and especially for novum organum. To this day, fewer than 100 journals produce over 95% of the papers containing new chemical compounds, even though hundreds more are scanned for them. Hence, Garfield's Law of Concentration proved to be a significant factor in trying to compete with CA and its coverage of thousands of journals.10
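The concentration effect described in the sixth point can be illustrated with a short calculation. The following is only a minimal sketch, not a reconstruction of any ISI procedure: the function name `journals_needed` and the journal counts are hypothetical, invented purely for illustration. It ranks journals by the number of papers reporting new compounds and returns the smallest top-ranked set that reaches a chosen share of the total.

```python
def journals_needed(paper_counts, target_share):
    """Return the top-ranked journals that together reach target_share of all papers.

    paper_counts: dict mapping journal name -> number of papers with new compounds
                  (hypothetical data, supplied by the caller).
    target_share: fraction between 0 and 1, e.g. 0.95 for 95% coverage.
    """
    total = sum(paper_counts.values())
    if total == 0:
        raise ValueError("no papers counted")

    # Rank journals from most to least productive.
    ranked = sorted(paper_counts.items(), key=lambda kv: kv[1], reverse=True)

    core = []
    covered = 0
    for journal, count in ranked:
        core.append(journal)
        covered += count
        if covered >= target_share * total:
            break
    return core


# Invented counts for five imaginary journals (not real data).
sample = {"A": 500, "B": 300, "C": 160, "D": 25, "E": 15}
print(journals_needed(sample, 0.95))
```

With these invented counts, three of the five journals already account for more than 95% of the papers, which is the kind of skew that makes selective coverage workable.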

It is significant that during the years 1956 to 1960, both with respect to citation indexing and chemical indexing, I implored Chemical Abstracts to take up ideas based upon these six points that were proposed by me and a group chaired by Max Gordon in the Philadelphia section of the ACS. But CAS decided to continue on its traditional path. Nevertheless, I maintained nothing but the friendliest relations with E.J. Crane, Charles Bernier, Dale Baker, Fred Tate, Ralph O'Dette, Gerard Platau, Jim Rush, Michael Lynch, Harry Boyle, and others at CAS. I must say that I felt a great animosity towards the NSF for preferentially supporting CAS work simply because it was a non-profit.

An integral part of this brief history is my work in chemical linguistics which began officially at the University of Pennsylvania when I signed up in 1955 for a Ph.D. program. This culminated in my 1961 dissertation on "An Algorithm for Translating Chemical Names to Molecular Formulas."11 Just this past April Nick Kemp and Michael Lynch referred to this work in the opening sentence of their paper in the Journal of Chemical Information -- "The design of algorithms for translating chemical names into the corresponding molecular formulas was one of the earliest points of departure in computer handling of chemical information--this was the subject of Gene Garfield's doctoral thesis in the late 1950s."12
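The general idea of translating a name into a molecular formula can be shown with a toy example. This is a hedged sketch only and is not the algorithm of the dissertation: the tables `STEMS` and `SUFFIXES` and the function names are hypothetical, and they cover nothing beyond a few unbranched names with a single functional suffix. Each name fragment contributes atom counts, and the totals are written out with carbon first, hydrogen second, and other elements alphabetically.

```python
# Hypothetical stem table: number of carbon atoms in the chain.
STEMS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5, "hex": 6}

# Hypothetical suffix table:
# (change in H relative to the alkane CnH2n+2, oxygen atoms, minimum carbons).
SUFFIXES = {
    "ane": (0, 0, 1),          # alkane
    "ene": (-2, 0, 2),         # one double bond
    "yne": (-4, 0, 2),         # one triple bond
    "anol": (0, 1, 1),         # alcohol
    "anal": (-2, 1, 1),        # aldehyde
    "anone": (-2, 1, 3),       # ketone
    "anoic acid": (-2, 2, 1),  # carboxylic acid
}


def format_formula(counts):
    """Write element counts as a formula: C, then H, then the rest alphabetically."""
    order = ["C", "H"] + sorted(e for e in counts if e not in ("C", "H"))
    parts = []
    for element in order:
        n = counts.get(element, 0)
        if n > 0:
            parts.append(element if n == 1 else f"{element}{n}")
    return "".join(parts)


def name_to_formula(name):
    """Translate a simple unbranched name (e.g. 'ethanol') to a molecular formula."""
    name = name.strip().lower()
    for stem, carbons in STEMS.items():
        if not name.startswith(stem):
            continue
        suffix = name[len(stem):]
        if suffix not in SUFFIXES:
            continue
        delta_h, oxygens, min_carbons = SUFFIXES[suffix]
        if carbons < min_carbons:
            raise ValueError(f"'{name}' needs at least {min_carbons} carbon atoms")
        counts = {"C": carbons, "H": 2 * carbons + 2 + delta_h, "O": oxygens}
        return format_formula(counts)
    raise ValueError(f"cannot parse '{name}' with this toy table")


for example in ("methane", "ethanol", "propanone", "ethanoic acid"):
    print(example, "->", name_to_formula(example))
```

A realistic translator would need a far larger dictionary of name fragments plus rules for locants, substituents, rings, and multiplying prefixes; the sketch only shows how a name can be treated as a sequence of meaningful parts.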

From the launch of Index Chemicus and Current Abstracts of Chemistry to the present, ISI introduced many innovative features and changes, with no support whatsoever from the government. Industry supported us by way of paid subscriptions and moral support. But even from the outset, the $2,000 annual commitment from each of the 12 sponsoring firms was far from sufficient to cover costs. What began as a simple molecular formula index that would have cost us about $25,000 per year to produce quickly evolved into a full-blown graphical abstracting service. I had underestimated how long it would take for other firms and academics to adopt this product.

All of these chemical information activities were happening at the same time we were starting up the National Institutes of Health-supported Genetics Citation Index Project. Index Chemicus developed several innovative features. We pioneered the use of Wiswesser Line Notations (WLN), ROTAFORM, SUBSTRUCTURE, and other types of indexes, which are briefly listed in Table 1. Bill Wiswesser was one of our consultants. Al Smith was also very helpful. The work on Chemical Reactions was undoubtedly inspired by the work of Kurt Theilheimer.

Index Chemicus Registry System (ICRS) enabled structure searching well before CAS launched its structure searching in the early 1980s. I believe that it is important to note that every major pharmaceutical company used WLN for their internal files and subscribed to ICRS. ISI made substructure searching of the open literature a possibility.

Current Chemical Reactions was created by the customers themselves. We established a charter group that basically signed up before the product was created and the customers had input as to what the product would contain. It was, and is, a premier chemical reaction database that was available before CAS created a competitive version.

The Reaction Citation Index, and Citation Indexing overall, was the forerunner of "weblinks" at a very detailed level. Only now is CAS including citations in their electronic offerings. This was announced recently at a meeting in New Orleans.

Table 1 provides a chronological table showing when certain features of Index Chemicus were implemented.

1960 – to date: Index Chemicus
Graphic abstracts of new organic compounds published in the chemical literature.
Contains graphic abstracts, bibliographic data, author’s summary, and alerts to biological activities, labeled compounds and hazards; also cumulated indexes on subjects, authors, journals, corporate addresses, biological activities and labeled compounds. Indexes on molecular formulas were included until 1987.

1968 – 1987: Index Chemicus Registry System (ICRS)
Compilation of new compounds in Wiswesser Line Formula Notation (WLN).
Data years: 1962 – 1987
Products based on ICRS:
  • Chemical Substructure Index (CSI)
  • Index Chemicus Registry of Organic Compounds (ROC)
  • Automatic New Structure Alert (ANSA)

1979 – to date: Current Chemical Reactions (CCR)
Graphic abstracts of new synthetic methods in the chemical literature.
Depicts reaction schemes, experimental data, product yields, bibliographic data and author’s summary; also cumulated indexes on subjects, authors, journal, and corporate address.

1984 – 1986: Index Chemicus Online
Structural and bibliographic information provided in DARC and QUESTEL, respectively.
Data years: 1960 – 1986

1985: Index Chemicus in Microform
Available in microfiche and microfilm; graphic abstracts and cumulated indexes for subjects, journals and authors.
Data years:
  • Graphic abstracts and cumulative indexes: 1960 – 1981
  • Graphic abstracts only: 1982 – 1985

1986 – to date: Current Chemical Reactions Database
Compilation of new synthetic methods in the chemical literature.
Contains REACCS formatted reactions with corresponding bibliographic data, author’s summary, biological activities and related reaction data. Allows text, reaction and substructure searching.
  • 1986: covered the chemical literature for over 100 journals
  • 1988: added U.S.A. patents
  • 1992: increased chemical literature coverage to over 300 journals
  • 1997: patent coverage enhanced with international patents

1986: Index Chemicus Personal Databases
Structural information on selected topics with corresponding bibliographic data provided in ChemBase and ChemSmart.

1986: Current Chemical Reactions Personal Databases
Structural information on selected topics with corresponding bibliographic data provided in ChemBase and ChemSmart.

1993 – to date: Electronic Layout Program
Allows the electronic creation of print issues by merging the REACCS formatted structures and reactions with ISI bibliographic data. Creates compound R tables from individual structures.

1993 – to date: Index Chemicus Database
Compilation of new organic compounds published in the chemical literature.
Contains REACCS formatted structures with corresponding bibliographic data, author’s summary, biological activities and other related data. Allows substructure and text searching.

1995 – to date: Reaction Citation Index
Incorporates the CCR Database and the corresponding ISI citation data.
Data years: 1985 – to date

1996 – to date: Index Chemicus CD-ROM
Allows substructure search of new organic compounds and corresponding text, including bibliographic data and compound related information such as biological activities. In addition to the author’s abstract, graphic abstracts are provided with flow-diagrams for rapid scanning.
Data years: 1995 – to date

1997 – to date: ChemPrep
CD-ROM product.
Allows search by reaction, substructure or text of new synthetic methods. Also included are author’s summary, complete reaction conditions, graphic abstracts and compound data such as biological activities.
Data years: 1985 – to date

1998 – to date: ISI Chemistry Center
  • 1998: Reaction Center
  • end of 1999: Compound Center
Web access to reactions, structures and corresponding text, including bibliographic data, author’s summary, biological activities and corresponding reaction and structural data. Also included are graphic abstracts with flow charts.
Data years: 1985 – to date

To be a pioneer is to struggle and Index Chemicus was no exception, although the struggle to keep this product alive is not well known. In the early 1960s its red ink caused all four of my Vice Presidents to leave and form a competitive company which eventually failed. Any sensible corporate executive in an established company would have agreed with them and would have abandoned Index Chemicus. As a compromise, we created a Chemical Edition of Current Contents but it was not accepted as a substitute for Index Chemicus.

My obsessive attachment to Index Chemicus may be explained as foolish loyalty, commitment, and stubbornness or fear of failure. Thanks to Current Contents / Life Sciences, Index Chemicus and Science Citation Index survived. They are now part of an integrated system that includes citation indexing links both to chemical literature and chemical patents. The latter was made easier by virtue of the corporate merger of ISI and Derwent under one Thomson manager.

Another key point about Index Chemicus concerns the timing of our coverage of Japanese and other foreign journals. CAS was far behind because it relied on volunteers. At one point the Japanese language literature was even much more crucial than it is today. Japanese scientists now largely publish in English. Dr. Dave Jordan is completing his fortieth year as ISI's consulting indexer of Japanese journals. Thanks to him we were able to cover Japanese chemical and pharmaceutical journals in the years when key chemical discoveries were still reported in foreign languages, including Russian, German, and French.

Shelly Rahman is responsible for the painstaking task of compiling the short tabular history of ISI's Chemical Information Products. There is a long list of people at ISI who have been involved in chemical information retrieval. Among them were Gaby Revesz and George Vladutz, both of whom died years ago. Bonnie Lawlor was Executive Vice President of Database Publishing when she left ISI a few years ago. During her 28 years at ISI she was increasingly responsible for the chemical products and pushed for their improvement, including the Reaction Citation Index. Irving H. Sher is not ordinarily mentioned in connection with Index Chemicus but he had a role in all our products and especially the Reaction Citation Index. In particular he should receive credit for the first push-pull system of selective dissemination of information. In the case of new chemistry, we called it the Automatic New Structure Alert (ANSA). Its predecessor was the Automatic Subject Citation Alert (ASCA) which began in 1965.13

Perhaps the best way to end this short history is to say that ISI's chemical information services - including Index Chemicus - are alive and well as are the competitive products that they inspired. As Emerson said, "Invention breeds invention" and perhaps one of the key legacies of ISI's entry into chemical information is that our non-traditional approach accelerated change and prompted the development of a whole new generation of chemical information products and services.

In closing, I'd like to pay tribute to the list of persons who have contributed in one way or another to Index Chemicus over its forty-year history.


