Covering number

In mathematics, a covering number is the number of balls of a given size needed to completely cover a given space, with possible overlaps between the balls. The covering number quantifies the size of a set and can be applied to general metric spaces. Two related concepts are the packing number, the number of disjoint balls that fit in a space, and the metric entropy, the number of points that fit in a space when constrained to lie at some fixed minimum distance apart.

Definition

Let (M, d) be a metric space, let K be a subset of M, and let r be a positive real number. Let Br(x) denote the ball of radius r centered at x. A subset C of M is an r-external covering of K if:

$K \subseteq \bigcup_{x \in C} B_r(x)$.

In other words, for every $y \in K$ there exists $x \in C$ such that $d(x, y) \leq r$.

If furthermore C is a subset of K, then it is an r-internal covering.

The external covering number of K, denoted $N_r^{\text{ext}}(K)$, is the minimum cardinality of any r-external covering of K. The internal covering number, denoted $N_r^{\text{int}}(K)$, is the minimum cardinality of any r-internal covering.

A subset P of K is an r-packing if $P \subseteq K$ and the balls $\{B_r(x)\}_{x \in P}$ are pairwise disjoint. The packing number of K, denoted $N_r^{\text{pack}}(K)$, is the maximum cardinality of any r-packing of K.

A subset S of K is r-separated if every pair of distinct points x and y in S satisfies $d(x, y) \geq r$. The metric entropy of K, denoted $N_r^{\text{met}}(K)$, is the maximum cardinality of any r-separated subset of K.
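For a finite subset of the real line, these quantities can be computed directly from the definitions. The following brute-force sketch is purely illustrative (the function names are hypothetical and the algorithms are exponential, not practical):

```python
from itertools import combinations

def dist(x, y):
    return abs(x - y)  # the metric on the real line

def internal_covering_number(K, r):
    """Minimum size of C ⊆ K with every point of K within r of some x in C."""
    for size in range(1, len(K) + 1):
        for C in combinations(K, size):
            if all(any(dist(x, y) <= r for x in C) for y in K):
                return size

def packing_number(K, r):
    """Maximum size of P ⊆ K whose open balls B_r(x) are pairwise disjoint,
    i.e. dist(x, y) >= 2r for all distinct x, y in P."""
    best = 0
    for size in range(1, len(K) + 1):
        for P in combinations(K, size):
            if all(dist(x, y) >= 2 * r for x, y in combinations(P, 2)):
                best = size
    return best

def metric_entropy(K, r):
    """Maximum size of an r-separated subset of K."""
    best = 0
    for size in range(1, len(K) + 1):
        for S in combinations(K, size):
            if all(dist(x, y) >= r for x, y in combinations(S, 2)):
                best = size
    return best

K = [0, 1, 2, 3]
print(internal_covering_number(K, 1))  # 2: e.g. C = {0, 2}
print(metric_entropy(K, 1))            # 4: all four points are 1-separated
```

On this small example the internal covering number at radius 1 is 2, while the metric entropy is 4, illustrating that the quantities can differ substantially on the same set.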

Examples

  1. The metric space is the real line $\mathbb{R}$. Let $K \subset \mathbb{R}$ be the set of real numbers whose absolute value is at most $k$, i.e. the interval $[-k, k]$. This interval can be covered by $\left\lceil \frac{2k}{r} \right\rceil$ intervals of length $r$, each of which is contained in a ball of radius $r$ around its midpoint. Hence:
    $N_r^{\text{ext}}(K) \leq \left\lceil \frac{2k}{r} \right\rceil$
  2. The metric space is the Euclidean space $\mathbb{R}^m$ with the Euclidean metric. Let $K \subset \mathbb{R}^m$ be a set of vectors whose length (norm) is at most $k$. If $K$ lies in a $d$-dimensional subspace of $\mathbb{R}^m$, then:[1]: 337
    $N_r^{\text{ext}}(K) \leq \left( \frac{2k\sqrt{d}}{r} \right)^d$.
  3. The metric space is a space of real-valued functions with the supremum ($\ell_\infty$) metric. The internal covering number $N_r^{\text{int}}(K)$ is the smallest number $k$ such that there exist $h_1, \ldots, h_k \in K$ with the property that for every $h \in K$ there exists $i \in \{1, \ldots, k\}$ such that the supremum distance between $h$ and $h_i$ is at most $r$. The bound from Example 2 does not apply here, since the space is $\infty$-dimensional. However, when $K$ is a compact set, every covering of it has a finite subcovering, so $N_r^{\text{int}}(K)$ is finite.[2]: 61
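The construction in Example 1 can be checked numerically. The sketch below (variable names are hypothetical) places one center at the midpoint of each of the $\lceil 2k/r \rceil$ length-$r$ subintervals and verifies on a fine grid that every point of $[-k, k]$ lies within $r$ of some center:

```python
import math

# Numerical check of Example 1: the interval [-k, k] admits an external
# r-covering by n = ceil(2k / r) balls (centers need not lie in [-k, k]).
k, r = 3.0, 0.7
n = math.ceil(2 * k / r)                     # number of length-r subintervals
centers = [-k + (i + 0.5) * r for i in range(n)]

# every point of a fine grid on [-k, k] is within r of some center
grid = [-k + 2 * k * t / 10_000 for t in range(10_001)]
assert all(min(abs(y - c) for c in centers) <= r for y in grid)
```

With $k = 3$ and $r = 0.7$ this uses $n = 9$ centers; each grid point is in fact within $r/2$ of the midpoint of its subinterval, so the radius-$r$ balls cover with room to spare.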

Properties

  1. The internal and external covering numbers, the packing number, and the metric entropy are all closely related. The following chain of inequalities holds for any subset K of a metric space and any positive real number r.[3]
    $N_{2r}^{\text{met}}(K) \leq N_r^{\text{pack}}(K) \leq N_r^{\text{ext}}(K) \leq N_r^{\text{int}}(K) \leq N_r^{\text{met}}(K)$
  2. Each of these quantities except the internal covering number is non-increasing in r and non-decreasing in K. The internal covering number is monotone in r but not necessarily monotone in K.
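One way to see the inequality $N_r^{\text{int}}(K) \leq N_r^{\text{met}}(K)$ is that a maximal r-separated subset of K is automatically an r-internal covering: any uncovered point could be added to the subset without violating separation. A small sketch on random points of the real line (the names here are hypothetical):

```python
import random
from itertools import combinations

def greedy_separated(K, r):
    """Greedily build a maximal r-separated subset of K."""
    S = []
    for y in K:
        if all(abs(y - x) >= r for x in S):
            S.append(y)
    return S

random.seed(0)
K = sorted(random.uniform(-5, 5) for _ in range(50))
r = 0.7
S = greedy_separated(K, r)

# S is r-separated by construction ...
assert all(abs(a - b) >= r for a, b in combinations(S, 2))
# ... and every rejected point was within r of some member of S,
# so S is also an r-internal covering of K:
assert all(any(abs(y - x) <= r for x in S) for y in K)
```

The greedy pass also gives a practical heuristic: it produces an r-internal covering whose size lower-bounds nothing and upper-bounds $N_r^{\text{met}}(K)$ by maximality, without any combinatorial search.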

The following properties concern covering numbers in the standard Euclidean space $\mathbb{R}^m$:[1]: 338

  1. If all vectors in $K$ are translated by a constant vector $k_0 \in \mathbb{R}^m$, then the covering number does not change.
  2. If all vectors in $K$ are multiplied by a scalar $k \in \mathbb{R}$, then:
    for all $r$: $N_{|k| \cdot r}^{\text{ext}}(k \cdot K) = N_r^{\text{ext}}(K)$
  3. If all vectors in $K$ are mapped by a Lipschitz function $\phi$ with Lipschitz constant $k$, then:
    for all $r$: $N_{|k| \cdot r}^{\text{ext}}(\phi \circ K) \leq N_r^{\text{ext}}(K)$

Application to machine learning

Let $K$ be a space of real-valued functions with the $\ell_\infty$ metric (see Example 3 above). Suppose all functions in $K$ are bounded by a real constant $M$. Then the covering number can be used to bound the generalization error of learning functions from $K$, relative to the squared loss:[2]: 61

$\operatorname{Prob}\left[ \sup_{h \in K} \big| \text{GeneralizationError}(h) - \text{EmpiricalError}(h) \big| \geq \epsilon \right] \leq N_r^{\text{int}}(K) \cdot 2 \exp\left( \frac{-m \epsilon^2}{2 M^4} \right)$

where $r = \frac{\epsilon}{8M}$ and $m$ is the number of samples.
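Rearranging the bound gives a sample-complexity estimate: to make the right-hand side at most a failure probability $\delta$, it suffices that $m \geq \frac{2M^4}{\epsilon^2} \ln \frac{2N}{\delta}$, where $N$ stands for $N_r^{\text{int}}(K)$ and must be obtained separately for the concrete class. A sketch with hypothetical numbers:

```python
import math

def samples_needed(N, eps, M, delta):
    """Smallest m with N * 2 * exp(-m * eps**2 / (2 * M**4)) <= delta."""
    return math.ceil(2 * M**4 / eps**2 * math.log(2 * N / delta))

m = samples_needed(N=1000, eps=0.1, M=1.0, delta=0.05)

# sanity check: plugging m back into the bound keeps it below delta
assert 1000 * 2 * math.exp(-m * 0.1**2 / (2 * 1.0**4)) <= 0.05
```

Note the dependence on the covering number is only logarithmic, which is why even coarse (e.g. exponential-in-dimension) covering-number bounds like Example 2 still yield useful sample complexities.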

See also

  • Polygon covering
  • Kissing number

References

  1. ^ a b Shalev-Shwartz, Shai; Ben-David, Shai (2014). Understanding Machine Learning – from Theory to Algorithms. Cambridge University Press. ISBN 9781107057135.
  2. ^ a b Mohri, Mehryar; Rostamizadeh, Afshin; Talwalkar, Ameet (2012). Foundations of Machine Learning. US, Massachusetts: MIT Press. ISBN 9780262018258.
  3. ^ Tao, Terence. "Metric entropy analogues of sum set theory". Retrieved 2 June 2014.