What is....

What is the h-index? (from Wikipedia)

The h-index is an index that attempts to measure both the productivity and impact of the published work of a scientist or scholar.

The index is based on the set of the scientist’s most cited papers and the number of citations that they have received in other publications.

The index can also be applied to the productivity and impact of a group of scientists, such as a department or university or country.

The index was suggested by Jorge E. Hirsch, a physicist at UCSD, as a tool for determining theoretical physicists’ relative quality and is sometimes called the Hirsch index or Hirsch number.

Definition and purpose

 

Figure: h-index from a plot of decreasing citations for numbered papers

The index is based on the distribution of citations received by a given researcher’s publications. Hirsch writes:

A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Np − h) papers have no more than h citations each.

In other words, a scholar with an index of h has published h papers each of which has been cited in other papers at least h times. Thus, the h-index reflects both the number of publications and the number of citations per publication. The index is designed to improve upon simpler measures such as the total number of citations or publications. The index works properly only for comparing scientists working in the same field; citation conventions differ widely among different fields.
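The definition above can be turned into a short calculation: sort the citation counts in decreasing order and find the last rank at which the count is still at least as large as the rank. The following is a minimal sketch; the function name `h_index` and the sample citation counts are made up for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow further
    return h


# Invented citation counts for five papers
papers = [10, 8, 5, 4, 3]
print(h_index(papers))  # 4: four papers have at least 4 citations each
```

In this example the fifth paper has only 3 citations, so the index stops at 4 even though the scholar has published five papers.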

The h-index serves as an alternative to more traditional journal impact factor metrics in the evaluation of the impact of the work of a particular researcher. Because only the most highly cited articles contribute to the h-index, its determination is a relatively simple process. Hirsch has demonstrated that h has high predictive value for whether a scientist has won honors like National Academy membership or the Nobel Prize. In physics, a moderately productive scientist should have an h equal to the number of years of service, while biomedical scientists tend to have higher values. The h-index grows as citations accumulate and thus it depends on the ‘academic age’ of a researcher.

Hirsch suggested that, for physicists, a value for h of about 10–12 might be a useful guideline for tenure decisions at major research universities. A value of about 18 could mean a full professorship, 15–20 could mean a fellowship in the American Physical Society, and 45 or higher could mean membership in the United States National Academy of Sciences. Little systematic investigation has been made on how academic recognition correlates with h-index over different institutions, nations and fields of study.

Calculation

The h-index can be manually determined using citation databases or using automatic tools. Subscription-based databases such as Scopus and the Web of Knowledge provide automated calculators. Harzing’s Publish or Perish program calculates the h-index based on Google Scholar entries. In July 2011 Google trialled a tool which allows a limited number of scholars to keep track of their own citations and also produces an h-index and an i10-index. Each database is likely to produce a different h for the same scholar, because of different coverage: Google Scholar has more citations than Scopus and Web of Science, but the smaller citation collections tend to be more accurate. In addition, specific databases, such as the Stanford Physics Information Retrieval System (SPIRES), can automatically calculate the h-index for researchers working in High Energy Physics.

The topic has been studied in detail by Lokman I. Meho and Kiduk Yang. Web of Knowledge was found to have strong coverage of journal publications, but poor coverage of high impact conferences. Scopus has better coverage of conferences, but poor coverage of publications prior to 1996; Google Scholar has the best coverage of conferences and most journals (though not all), but like Scopus has limited coverage of pre-1990 publications. The exclusion of conference preprints is a problem for scholars in computer science, where conference preprints are considered an important part of the literature, but it reflects common practice in most scientific fields, where conference preprints are unrefereed and are accorded less weight in evaluating academic productivity. The Scopus and Web of Knowledge calculations also fail to count the citations that a publication gathers while ‘in press’ (i.e. after being accepted for publication but before being printed on paper). With electronic pre-publication and very long printing lags for some journals, these ‘in press’ citations can be considerable.

Google Scholar has been criticized for producing “phantom citations,” including gray literature in its citation counts, and failing to follow the rules of Boolean logic when combining search terms. For example, the Meho and Yang study found that Google Scholar identified 53% more citations than Web of Knowledge and Scopus combined, but noted that because most of the additional citations reported by Google Scholar were from low-impact journals or conference proceedings, they did not significantly alter the relative ranking of the individuals. It has been suggested that, to deal with the sometimes wide variation in h for a single academic measured across the possible citation databases, one should assume that false negatives in the databases are more problematic than false positives and take the maximum h measured for an academic.
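The suggestion of taking the maximum h across databases can be sketched as follows. The per-database citation counts here are invented purely for illustration and do not come from any real database; `h_index` is a hypothetical helper name.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return h


# Invented citation counts for the same five papers as seen by each database
counts_by_database = {
    "Web of Knowledge": [12, 9, 4, 2, 0],
    "Scopus": [14, 9, 5, 3, 1],
    "Google Scholar": [20, 15, 7, 5, 5],
}

# h as each database would report it
h_by_database = {name: h_index(counts) for name, counts in counts_by_database.items()}

# Treat missed citations as the bigger problem and keep the largest value
best_h = max(h_by_database.values())

print(h_by_database)  # {'Web of Knowledge': 3, 'Scopus': 3, 'Google Scholar': 5}
print(best_h)         # 5
```

Taking the maximum is only one possible policy; it trades the risk of counting phantom citations for the risk of missing real ones.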

Advantages

Hirsch intended the h-index to address the main disadvantages of other bibliometric indicators, such as total number of papers or total number of citations. Total number of papers does not account for the quality of scientific publications, while total number of citations can be disproportionately affected by participation in a single publication of major influence. The h-index is intended to measure simultaneously the quality and sustainability of scientific output, as well as, to some extent, the diversity of scientific research. The h-index is much less affected by methodological papers proposing successful new techniques, methods or approximations, which can generate a large number of citations.
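The difference between total citations and the h-index can be illustrated with two invented publication records: one with steady citations across many papers, and one dominated by a single highly cited paper. The numbers and the helper name `h_index` are assumptions made for this sketch.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return h


# Invented records with eight papers each
steady = [12, 11, 10, 10, 9, 9, 8, 8]   # consistent output
one_hit = [900, 3, 2, 1, 1, 0, 0, 0]    # one paper of major influence

for name, record in [("steady", steady), ("one_hit", one_hit)]:
    print(name, "papers:", len(record), "total citations:", sum(record),
          "h:", h_index(record))
# steady papers: 8 total citations: 77 h: 8
# one_hit papers: 8 total citations: 907 h: 2
```

Both records have the same number of papers, and the second has far more total citations, yet its h is much lower because a single paper accounts for almost all of them.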

Criticism

There are a number of situations in which h may provide misleading information about a scientist’s output:

  • The h-index does not account for the number of authors of a paper. In the original paper, Hirsch suggested partitioning citations among co-authors. Even in the absence of explicit gaming, the h-index and similar indexes tend to favor fields with larger groups, e.g. experimental over theoretical.
  • The h-index does not account for the typical number of citations in different fields. Different fields, or journals, traditionally use different numbers of citations.
  • The h-index discards the information contained in author placement in the authors’ list, which in some scientific fields (but not in high energy physics, where Hirsch works) is significant.
  • The h-index is bounded by the total number of publications. This means that scientists with a short career are at an inherent disadvantage, regardless of the importance of their discoveries. For example, Évariste Galois’ h-index is 2, and will remain so forever. Had Albert Einstein died after publishing his four groundbreaking Annus Mirabilis papers in 1905, his h-index would be stuck at 4 or 5. This is also a problem for any measure that relies on the number of publications. However, as Hirsch indicated in the original paper, the index is intended as a tool to evaluate researchers in the same stage of their careers. It is not meant as a tool for historical comparisons.
  • The h-index does not consider the context of citations. For example, citations in a paper are often made simply to flesh out an introduction, otherwise having no other significance to the work. h also does not resolve other contextual instances: citations made in a negative context and citations made to fraudulent or retracted work. This is also a problem for regular citation counts.
  • The h-index gives books the same count as articles, making it difficult to compare scholars in fields that are more book-oriented, such as the humanities.
  • The h-index does not account for confounding factors such as “gratuitous authorship”, the so-called Matthew effect, and the favorable citation bias associated with review articles. Again, this is a problem for all other metrics using publications or citations.
  • The h-index has been found to have slightly less predictive accuracy and precision than the simpler measure of mean citations per paper. However, this finding was contradicted by another study.
  • The h-index is a natural number, which reduces its discriminatory power. Ruane and Tol therefore propose a rational h-index that interpolates between h and h+1.
  • The h-index can be manipulated through self-citations, and if based on Google Scholar output, then even computer-generated documents can be used for that purpose, e.g. using SCIgen.

Alternatives and modifications

Various proposals to modify the h-index in order to emphasize different features have been made:

  • An individual h-index normalized by the average number of co-authors in the h-core has been introduced by Batista et al. They also found that the distribution of the h-index, although it depends on the field, can be normalized by a simple rescaling factor. For example, taking the h values for biology as the standard, the distribution of h for mathematics collapses onto it if h is multiplied by three; that is, a mathematician with h = 3 is equivalent to a biologist with h = 9. This method has not been readily adopted, perhaps because of its complexity. It might be simpler to divide citation counts by the number of authors before ordering the papers and obtaining the h-index, as originally suggested by Hirsch.
  • The m-index, also called the m-quotient, is defined as h/n, where n is the number of years since the scientist’s first published paper.
  • A generalization of the h-index and some other indices that gives additional information about the shape of the author’s citation function (heavy-tailed, flat/peaked, etc.) was proposed by Gągolewski and Grzegorzewski.
  • A successive Hirsch-type index was introduced independently by Kosmulski and Prathap. A scientific institution has a successive Hirsch-type index of i when at least i researchers from that institution have an h-index of at least i.
  • K. Dixit and colleagues argue that “For an individual researcher, a measure such as Erdős number captures the structural properties of network whereas the h-index captures the citation impact of the publications. One can be easily convinced that ranking in coauthorship networks should take into account both measures to generate a realistic and acceptable ranking.” Several author ranking systems such as eigenfactor (based on eigenvector centrality) have been proposed already, for instance the Phys Author Rank Algorithm.
  • The c-index accounts not only for the citations but for the quality of the citations in terms of the collaboration distance between citing and cited authors. A scientist has c-index n if n of [his/her] N citations are from authors which are at collaboration distance at least n, and the other (N − n) citations are from authors which are at collaboration distance at most n.
  • Bornmann, Mutz, and Daniel recently proposed three additional metrics, h2lower, h2center, and h2upper, to give a more accurate representation of the distribution shape. The three h2 metrics measure the relative area within a scientist’s citation distribution in the low impact area, h2lower, the area captured by the h-index, h2center, and the area from publications with the highest visibility, h2upper. Scientists with high h2upper percentages are perfectionists, whereas scientists with high h2lower percentages are mass producers. As these metrics are percentages, they are intended to give a qualitative description to supplement the quantitative h-index.
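Three of the simpler variants in the list above (the m-quotient, dividing citations by the number of authors before ranking, and the successive Hirsch-type index for an institution) can be sketched in a few lines. All function names and sample data below are hypothetical; treating a career shorter than one year as one year is an assumption of this sketch, not part of the definition.

```python
def h_index(values):
    """Largest h such that h of the values are at least h."""
    ranked = sorted(values, reverse=True)
    h = 0
    for rank, value in enumerate(ranked, start=1):
        if value < rank:
            break
        h = rank
    return h


def m_quotient(citations, first_paper_year, current_year):
    """h divided by the number of years since the first published paper."""
    years = max(1, current_year - first_paper_year)  # assumption: at least 1 year
    return h_index(citations) / years


def author_normalized_h(papers):
    """h-index after dividing each paper's citations by its number of authors.

    `papers` is a list of (citations, number_of_authors) pairs.
    """
    return h_index([cites / authors for cites, authors in papers])


def successive_h(researcher_h_values):
    """Institution-level index: i researchers with an h-index of at least i."""
    return h_index(researcher_h_values)


# Invented data: (citations, number of authors) for five papers
papers = [(30, 3), (12, 2), (9, 1), (8, 4), (2, 1)]
raw_citations = [cites for cites, _ in papers]

print(h_index(raw_citations))                # 4
print(author_normalized_h(papers))           # 3
print(m_quotient(raw_citations, 2005, 2013)) # 0.5
print(successive_h([12, 9, 7, 4, 4, 3, 1]))  # 4
```

The successive index reuses the same ranking step as the h-index itself, applied to researchers' h values instead of papers' citation counts.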

What is Impact Factor?

  • A citation metric.
  • The impact factor of a journal is the average number of citations received in a given year per paper published in that journal during the two preceding years.
  • The journal must have been published for 2 consecutive years, so the IF can be calculated in the 3rd year.

Example of IF calculation:

  • If a journal has an impact factor of 3.0 in 2008, papers published in 2006 and 2007 received 3 citations each on average during 2008.

Calculation:

For example, the 2008 impact factor = A/B, where

A = the number of times articles published in 2006 and 2007 were cited by indexed journals during 2008

B = the total number of “citable items” published by that journal in 2006 and 2007. (“Citable items” are usually articles, reviews, proceedings, or notes; not editorials or Letters-to-the-Editor.)
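The A/B calculation can be written out as a small function. This is only a sketch: the function name `impact_factor` and the journal figures are invented for illustration.

```python
def impact_factor(citations_in_year, citable_items):
    """Impact factor = A / B.

    A: citations received this year by items from the two preceding years.
    B: number of citable items published in those two years.
    """
    if citable_items <= 0:
        raise ValueError("the journal needs at least one citable item")
    return citations_in_year / citable_items


# Invented journal: 90 citable items in 2006 and 110 in 2007,
# cited 600 times in total by indexed journals during 2008
a = 600
b = 90 + 110
print(impact_factor(a, b))  # 3.0
```

With these made-up numbers the 2008 impact factor is 3.0, matching the example above in which each 2006 and 2007 paper received 3 citations on average.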

Last Updated on Wednesday, 11 September 2013 08:14