Tag cloud: Creation of a tag cloud

In principle, the font size of a tag in a tag cloud is determined by its frequency of use. In a cloud of weblog categories, for example, the frequency corresponds to the number of weblog entries assigned to a category. For small frequencies it is sufficient to map each count directly to a font size, from one up to a maximum font size. For larger values, the counts should be scaled. In a linear normalization, the weight ti of a descriptor is mapped to a size scale of 1 through fmax, where tmin and tmax specify the range of available weights.

s_i = \begin{cases} \left\lceil \dfrac{f_{\mathrm{max}} \cdot (t_i - t_{\mathrm{min}})}{t_{\mathrm{max}} - t_{\mathrm{min}}} \right\rceil & \text{for } t_i > t_{\mathrm{min}} \\ 1 & \text{otherwise} \end{cases}

    si: displayed font size of tag i
    fmax: maximum font size
    ti: count (weight) of tag i
    tmin: minimum count
    tmax: maximum count
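
The linear normalization above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `linear_font_size`, the sample counts, and the guard for the case tmax = tmin (which the formula leaves undefined) are assumptions added here.

```python
import math


def linear_font_size(count, t_min, t_max, f_max):
    """Map a tag count to a font size step in 1..f_max by linear scaling."""
    # Tags at the minimum count get size 1, as in the formula's "otherwise" branch.
    # Assumption: if every tag has the same count, also fall back to size 1
    # to avoid dividing by zero.
    if count <= t_min or t_max == t_min:
        return 1
    return math.ceil(f_max * (count - t_min) / (t_max - t_min))


# Hypothetical sample data: tag -> number of entries carrying that tag.
counts = {"python": 40, "css": 10, "html": 25, "rust": 4}
t_min, t_max = min(counts.values()), max(counts.values())
sizes = {
    tag: linear_font_size(c, t_min, t_max, f_max=6)
    for tag, c in counts.items()
}
```

With these sample counts the most frequent tag is assigned size 6 and the least frequent tag size 1.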

Because the number of indexed items per descriptor usually follows a power law,[24] a logarithmic scale is more appropriate when the counts span a large range of values.[25]
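
The text does not give a formula for the logarithmic case. One plausible variant, shown here purely as an assumption, replaces each count by its logarithm before scaling it to the range 1 through fmax. The function name `log_font_size` and the requirement that all counts are at least 1 are choices made for this sketch.

```python
import math


def log_font_size(count, t_min, t_max, f_max):
    """Map a tag count to a font size step in 1..f_max on a logarithmic scale.

    Assumes all counts are positive (count >= t_min >= 1).
    """
    # Same fallback as the linear case: minimum-count tags get size 1.
    if count <= t_min or t_max == t_min:
        return 1
    # Position of log(count) between log(t_min) and log(t_max), in (0, 1].
    ratio = (math.log(count) - math.log(t_min)) / (
        math.log(t_max) - math.log(t_min)
    )
    return max(1, math.ceil(f_max * ratio))
```

The design intent is that a few very frequent tags no longer push all other tags down to the smallest size, because equal ratios between counts map to equal steps in font size.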

Implementations of tag clouds also include text parsing and filtering out unhelpful tags such as common words, numbers, and punctuation.
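
The parsing and filtering step can be sketched as below. The stop-word list is a tiny hypothetical example, and the function name `count_tags` is an assumption. Real implementations typically use much longer, language-specific lists.

```python
import re
from collections import Counter

# Hypothetical, deliberately short list of common words to ignore.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def count_tags(text, stop_words=STOP_WORDS):
    """Count words in text, skipping stop words, numbers, and punctuation."""
    # The pattern matches runs of letters only, so digits, underscores,
    # and punctuation never become part of a tag.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(w for w in words if w not in stop_words)
```

The resulting counts can then be passed to whichever scaling function determines the font sizes.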

Some websites also create artificially or randomly weighted tag clouds, for advertising or for humorous effect.

Thursday, 10 November 2011
