In principle, the font size of a tag in a tag cloud is determined by how often the tag occurs. For a cloud of weblog categories, for example, the frequency corresponds to the number of weblog entries assigned to each category. For small frequencies it is sufficient to use the count directly as the font size, from one up to a maximum font size. For larger values, the counts should be scaled. In a linear normalization, the weight ti of a descriptor is mapped to a size scale of 1 through fmax, where tmin and tmax specify the range of available weights.
s_i = \left \lceil \frac{f_{\mathrm{max}}\cdot(t_i - t_{\mathrm{min}})}{t_{\mathrm{max}}-t_{\mathrm{min}}} \right \rceil for ti > tmin; else si = 1
si: display font size of tag i
fmax: maximum font size
ti: count for tag i
tmin: minimum count
tmax: maximum count
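The linear formula can be sketched in a few lines of Python. This is only an illustration: the function name `linear_font_size`, the sample counts, and the guard for the case where all counts are equal (tmax = tmin, which the formula leaves undefined) are assumptions added here, not part of the original description.

```python
import math

def linear_font_size(count, t_min, t_max, f_max):
    """Map a tag count to a font-size step between 1 and f_max."""
    # Tags at the minimum count get size 1, as in the formula.
    # The t_max == t_min check is an added guard against division by zero.
    if count <= t_min or t_max == t_min:
        return 1
    return math.ceil(f_max * (count - t_min) / (t_max - t_min))

# Hypothetical category counts, used only for illustration.
counts = {"python": 40, "css": 12, "html": 3, "misc": 1}
t_min, t_max = min(counts.values()), max(counts.values())
sizes = {tag: linear_font_size(c, t_min, t_max, 5) for tag, c in counts.items()}
print(sizes)  # {'python': 5, 'css': 2, 'html': 1, 'misc': 1}
```

The resulting step is usually translated into an actual size (points, em, or a CSS class) by the page template.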
Because the number of indexed items per descriptor usually follows a power law,[24] a logarithmic scale is the better choice when the range of values is large.[25]
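The text does not give a formula for the logarithmic variant. One plausible sketch, assuming the same mapping as above is simply applied to the logarithm of each count, is shown below. The function name `log_font_size` is hypothetical, and the sketch assumes all counts are at least 1.

```python
import math

def log_font_size(count, t_min, t_max, f_max):
    """Map a tag count to a size step between 1 and f_max on a log scale."""
    # Same edge cases as the linear formula; the equality guard is an addition.
    if count <= t_min or t_max == t_min:
        return 1
    # Position of the count between t_min and t_max, measured in log space.
    scaled = (math.log(count) - math.log(t_min)) / (math.log(t_max) - math.log(t_min))
    return math.ceil(f_max * scaled)

# With counts ranging from 1 to 1000 and five size steps:
for count in (1, 10, 100, 1000):
    print(count, log_font_size(count, 1, 1000, 5))
```

With these sample values, counts of 10 and 100 land on steps 2 and 4, whereas the linear formula would place both on step 1 because the single tag with 1000 entries dominates the range.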
Implementations of tag clouds also include text parsing and filtering out unhelpful tags such as common words, numbers, and punctuation.
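A minimal sketch of such a parsing and filtering step might look like the following. The stop-word list is a tiny illustrative sample, the function name `tag_counts` is hypothetical, and the pattern keeps only ASCII letters, which is a simplification that real implementations would relax for other scripts.

```python
import re
from collections import Counter

# A deliberately short stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for"}

def tag_counts(text):
    """Count candidate tags in free text, skipping common words."""
    # Lowercase the text and keep runs of letters only,
    # which drops numbers and punctuation in one pass.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

counts = tag_counts("The cloud of tags: tags, tags and 42 more tags in the cloud!")
print(counts.most_common())  # [('tags', 4), ('cloud', 2), ('more', 1)]
```

The resulting counts can then be fed into whichever font-size mapping the tag cloud uses.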
Some websites also create artificially or randomly weighted tag clouds, either for advertising or for humorous effect.