Inverse document frequency

The inverse document frequency (IDF) is the logarithm of the ratio of the total number of documents to the number of documents containing a certain term. For example, if a database holds 1,000 documents and 5 of them are about "mathematics", the IDF is

IDF = ld (total number of documents / number of documents about "mathematics") = ld (1,000/5) = ld 200 ≈ 7.64
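
The calculation above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `inverse_document_frequency` and its parameter names are assumptions made for this example, and the sketch assumes the term occurs in at least one document.

```python
import math


def inverse_document_frequency(total_documents: int, documents_with_term: int) -> float:
    """Return the base-2 logarithm of total_documents / documents_with_term.

    Hypothetical helper for illustration; the name and parameters are
    assumptions, not part of the original definition.
    """
    if total_documents <= 0 or documents_with_term <= 0:
        raise ValueError("both counts must be positive")
    if documents_with_term > total_documents:
        raise ValueError("documents_with_term cannot exceed total_documents")
    return math.log2(total_documents / documents_with_term)


# The example from the text: 1,000 documents, 5 of them about "mathematics".
idf = inverse_document_frequency(1000, 5)
print(round(idf, 2))  # 7.64
```

A term that appears in every document gets an IDF of 0, and the value grows as the term becomes rarer.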

Note that "ld" stands for "logarithmus dualis", the logarithm to base 2. It can also be calculated from the common logarithm (lg, base 10) or the natural logarithm (ln):

ld x = (lg x)/(lg 2) = (ln x)/(ln 2)
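
The change-of-base identity can be checked numerically. The short Python sketch below is only an illustration: the variable names are assumptions, and the value of x is the ratio 1,000/5 from the example above.

```python
import math

# Ratio from the example: 1,000 documents, 5 containing the term.
x = 1000 / 5

ld_direct = math.log2(x)                   # ld x, computed directly
ld_via_lg = math.log10(x) / math.log10(2)  # (lg x)/(lg 2)
ld_via_ln = math.log(x) / math.log(2)      # (ln x)/(ln 2)

# All three agree up to floating-point rounding.
print(math.isclose(ld_direct, ld_via_lg), math.isclose(ld_direct, ld_via_ln))
```

The comparison uses `math.isclose` instead of `==` because the three expressions may differ in the last few binary digits.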
