It's the End of the Blog

As it turned out over the last 3 years or so, with teaching and managing the Algorithms & Theory group in Tartu, I just don't have time to post anything here.

Up-to-date news will from now on appear only on my official homepage at the University of Tartu.

Bye-bye :)

Hypergraph problems connected to Boolean rank

The Boolean rank of a 0/1-matrix M is the smallest number q of 0/1-matrices $R_1,\dots,R_q$ with rectangular support whose sum in the Boolean semiring ($1+1=1$, otherwise as usual) equals M. An equivalent graph theoretical parameter is the biclique covering number: Given a bipartite graph G, find the smallest number of bicliques in G (i.e., subgraphs which are complete bipartite graphs), such that each edge of G is contained in at least one of the selected bicliques.
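For very small matrices, the Boolean rank can be computed directly from the biclique-cover formulation; the following brute force is a minimal sketch (exponential, for illustration only; the function name is mine):

```python
from itertools import combinations

def boolean_rank(M):
    """Minimum number of all-ones submatrices (combinatorial rectangles)
    whose union is exactly the set of 1-entries of the 0/1 matrix M."""
    m, n = len(M), len(M[0])
    ones = {(i, j) for i in range(m) for j in range(n) if M[i][j]}
    # enumerate all rectangles R x C on which M is identically 1
    rects = []
    for p in range(1, m + 1):
        for R in combinations(range(m), p):
            for s in range(1, n + 1):
                for C in combinations(range(n), s):
                    if all(M[i][j] for i in R for j in C):
                        rects.append({(i, j) for i in R for j in C})
    # smallest subfamily of rectangles covering all 1-entries
    for q in range(len(rects) + 1):
        for family in combinations(rects, q):
            covered = set().union(*family) if family else set()
            if covered == ones:
                return q
```

For example, the $n\times n$ identity matrix has Boolean rank $n$, since its only all-ones rectangles are single entries.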
There is a hypergraph (or set system, if you will) parameter which is equivalent to the Boolean rank / biclique covering number. Let H be a hypergraph with vertex set V(H) and (hyper-)edge set E(H). Let us say that a generator of H is a set of vertex sets $W_1,\dots,W_q$ such that every edge of H is the union of some of the sets $W_j$. Let us further agree that the generating number of a hypergraph is the smallest number of sets in a generator.
If somebody knows the proper name for this concept, please let me know!
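For tiny hypergraphs, the generating number can be computed by brute force straight from the definition; this sketch (exponential; the function name is mine) simply tries all families of vertex subsets:

```python
from itertools import combinations

def generating_number(vertices, edges):
    """Smallest q such that some vertex sets W_1,...,W_q have every
    edge of the hypergraph equal to a union of some of them."""
    V = sorted(vertices)
    edges = [frozenset(e) for e in edges]
    candidates = [frozenset(S) for r in range(1, len(V) + 1)
                  for S in combinations(V, r)]

    def generated(e, family):
        # e is a union of members of the family iff the union of all
        # members contained in e is exactly e
        u = set()
        for W in family:
            if W <= e:
                u |= W
        return u == e

    for q in range(len(candidates) + 1):
        for family in combinations(candidates, q):
            if all(generated(e, family) for e in edges):
                return q
```

E.g., for the triangle (vertex set $\{1,2,3\}$ with all three 2-element edges) the answer is 3: the singletons generate every edge, and no two sets suffice.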
Biclique covering number and hypergraph generating number are equivalent concepts. Proving this is an easy exercise. (Hint: Hypergraphs and bipartite graphs are equivalent concepts anyway, start from there. Consider only bicliques which are maximal in an appropriate sense.)
This was really easy. But let us make a connection to a more familiar hypergraph concept, namely transversals. It is not surprising that the biclique covering number / hypergraph generating number are transversal numbers of suitably defined hypergraphs. Slightly more sophisticated, though, is the following.

Hypergraph transversals and lower bounds for hypergraph generating number
I will use notation and terminology of the hypergraph generating number, which is more natural in this context.  As usual, we are hunting for lower bounds.  Let us denote the generating number of H by $\gamma(H)$.  First, let us make the following trivial observation.
Lemma 0.  $\gamma(H) \le |H|$ --- the generating number is at most the number of vertices.
If we want to prove that $\gamma(H)$ is large, then singleton sets $W_j$ work for us: they drive the total number of sets which are needed towards the trivial upper bound $|H|$.  Now let us adapt the concept of generating number to forbid singleton sets.
Definition 1.  Denote by $\gamma_2(H)$ the smallest number q of sets $W_1,\dots,W_q$, each with at least two elements, such that every edge of H is the union of some of these sets.
Obviously, for this number to be finite, we need that H has no singleton edges.  Letting, for a vertex set S, $H\setminus S$ denote the hypergraph with vertex set $V(H)\setminus S$ and edge set $\{ e\setminus S \mid e\in E(H)\}$ (this is the hypergraph restricted to the complement of S), we trivially have the following:
Lemma 2.  $\gamma(H) = \min_{S\subset V(H)} \; \bigl( |S| + \gamma_2(H\setminus S) \bigr)$.
Now, for a vertex v, denote by H.v the hypergraph with vertex set V(H) and edge set
$E(H.v) := \bigl\{ e\setminus\{v\} \bigm| e\in E(H), v\in e \bigr\}$.
Further, if $v_1,\dots,v_n$ is an ordering of the vertices of H, we denote by $H_r$ the hypergraph induced by the vertices $v_r,\dots,v_n$ -- i.e., the hypergraph with this vertex set, and with all edges of H which are completely contained in that set.  The following is the connection between generating number and hypergraph transversals.
Lemma 3.  Let $v_1,\dots,v_n$ be any ordering of the vertices of H.  Then  $\gamma_2(H) \ge \sum_{r=1}^n \tau( H_r.v_r )$.
(As usual, $\tau$ denotes the transversal number.)  Again, I leave the proof as an exercise.  (Hint: for each r, consider those sets $W_j$ of a generator whose minimal vertex, in the given ordering, is $v_r$.)  Let's move to an application of Lemmas 2 and 3.

The generating number of complete uniform hypergraphs
We would like to determine the generating number $\gamma_{n,k}$ of the complete k-uniform hypergraph on n vertices.  Here, k may depend on n.  In this post, we prove the following.
Theorem 4. The generating number of the complete k-uniform hypergraph on n vertices is at least $\min\bigl(n, \; \; (n-k+2)(n-k+1)/2\bigr)$.
For the proof, we first note that each of the hypergraphs $H\setminus S$ occurring in Lemma 2 has as its edge set all vertex sets of cardinality between $k-|S|$ and $k$.  Hence, you may check that the transversal number of the hypergraph $H_r$ occurring in Lemma 3 equals $\max(0, n-r-k+2)$, if $k-|S| \ge 2$, and $\infty$, if $k-|S| \le 1$.  In the first case, the sum in that lemma amounts to
$\sum_{r} \max(0,\, n-r-k+2) = (n-k+1) + (n-k) + \dots + 1 = (n-k+2)(n-k+1)/2$.
The minimum in Lemma 2 thus equals
$\min\Bigl(n, \;\; \min_{s=0,\dots,k-2} \bigl( s + (n-k+2)(n-k+1)/2 \bigr) \Bigr) = \min\bigl( n, \;\; (n-k+2)(n-k+1)/2 \bigr)$,
which proves the theorem.  (The outer n comes from the choice $S = V(H)$ in Lemma 2, for which all edges become empty.)
Corollary 5.  For $k\le n - (\sqrt{8n+1}-3)/2$, the generating number of the complete k-uniform hypergraph on n vertices is equal to n.
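Corollary 5 is pure arithmetic given Theorem 4; here is a quick numerical sanity check of the threshold (the function names are mine; nothing beyond the two formulas above is assumed):

```python
import math

def gamma_lower_bound(n, k):
    """Theorem 4's lower bound on gamma_{n,k}; the product of two
    consecutive integers is even, so integer division is exact."""
    return min(n, (n - k + 2) * (n - k + 1) // 2)

def corollary_threshold(n):
    """Corollary 5's threshold: for k up to this value, the lower bound
    (and hence, with Lemma 0, the generating number) equals n."""
    return n - (math.sqrt(8 * n + 1) - 3) / 2

# e.g. n = 10: the threshold is k <= 7, and the bound drops below n at k = 8
print(corollary_threshold(10), gamma_lower_bound(10, 7), gamma_lower_bound(10, 8))
```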
For k larger than the bound in the corollary, the value of $\gamma_{n,k}$ is unknown.  The following upper bound is proved in arXiv:1111.0444.
Theorem 6.  $\gamma_{n,k} \le (n-k+1)^2 \log n$.
The proof uses a simple probabilistic argument, and we omit it here (see Lemma 3.3 in said paper).  What is clear is that for fast-growing k there is a $\log n$ factor between the bounds in Theorems 4 and 6.  For $k = n - O(1)$, this factor is needed, since (exercise!) $\gamma_{n,n-\ell} = \Omega_\ell (\log n)$.  But for $n-k \to \infty$ with $n-k = o(\sqrt n)$, the gap remains open.
Problem 7.  Determine the value of $\gamma_{n,k}$, when $n-k \to \infty$ and $n-k = o(\sqrt n)$!
The general approach using Lemmas 2 and 3 seems to lend itself to studying the generating number of random hypergraphs, too.
Problem 8.  Determine the generating number of a random k-uniform hypergraph on n vertices! (With, say, edges chosen independently with probability p = p(n).)

Best combinatorial bound for extension complexity

The following is a well-known and easy fact.

Proposition. For every 0/1-matrix M, the number

$\min \{ \mbox{rk}_+(A) \mid \mbox{supportmatrix}(A) = M \}$

is equal to the Boolean rank of M.

For a polytope P, let xc(P) denote its extension complexity, i.e., the smallest number of facets of a polytope Q which can be mapped onto P by a projective mapping. Moreover, let S(P) denote the support matrix of the vertex-facet slack matrix of P. The proposition above motivates the following question.

Question 1. Let M be the support matrix of the vertex-facet slack matrix of a polytope. Is it true that the number

$\min \{ \mbox{xc}(P) \mid S(P) = M \}$

is equal to the Boolean rank of M?

The answer is most probably ``no, at least not for all M''. However, let's look at what we get if we restrict P to a special family of polytopes. My all-time favourite: the matching polytopes. Here, Question 1 gives rise to the following.

Question 2. Is there a polytope which is combinatorially equivalent to the matching polytope and whose extension complexity is $O(n^4)$?

And we might want to relax that further to the following.

Question 3. Is there a polytope which is combinatorially equivalent to the matching polytope and whose extension complexity is polynomial in n?

We know that the question corresponding to Question 3 for Max-Cut and Traveling Salesman polytopes has a negative answer -- but only because the lower bound proof works combinatorially.

In general, it may be a good idea to start by looking at 3-polytopes, because in that case the realization space is comparatively easy. Some other time.

Nonnegative rank (and other matrix factorization ranks over cones) at Dagstuhl, II

Our proposal for a second Dagstuhl workshop on factorization ranks over cones and combinatorial optimization has been accepted. The Dagstuhl people have scheduled the meeting for

Mon Feb 16 -- Fri Feb 20, 2015

The complete title of the seminar is "Limitations of convex programming: lower bounds on extended formulations and factorization ranks", the official web page is here.  As the title suggests, we (myself, Hartmut Klauck, Troy Lee, Rekha Thomas) have conceived it to be broader than the 2013 workshop, which was mainly on nonnegative rank (and linear extended formulations of combinatorial optimization problems).  In the nonnegative rank part (even though new lower bounds on Boolean rank are still very much desired for the relevant matrix families) there will be novelties, too: we plan to invite people from real algebraic geometry, more specifically, algebraic statistics, whose interest is in describing the set of matrices with nonnegative rank at most k as a semialgebraic set.

Just as a teaser, here's my favourite open problem in that area. For positive integers k < n, let f(n,k) be the smallest number q such that the set of all matrices of nonnegative rank at most q has nonempty relative interior in the set of all rank-k $n\times n$ matrices. Clearly, $k \le f(n,k) \le n$, and Yaroslav Shitov (arXiv:1303.1960) has recently proven that $f(n,3) \le 6n/7$.

Conjecture 1. $f(n,k) = \Omega(n)$ for all k.

I also like this stronger version.

Conjecture 2. $f(n,k) \sim n$ asymptotically if both k and n tend to infinity.


PostDoc positions in Discrete Math

We are looking for PostDocs in Discrete Math or Combinatorial Optimization. The duration depends on the kind of funding. Your research would be on problems somewhere in the intersection of
  • Matrix Theory,
  • Probabilistic Combinatorics,
  • Hypergraph Theory,
  • Combinatorial Optimization.

The work environment is multi-national; interaction is in English.

The Discrete Math and Combinatorial Optimization research group is small and vibrant. Besides the above areas, there is research in Combinatorial Number Theory and Coding Theory, as well as connections with a sprawling Cryptology group.

The qualifications I am looking for are roughly the following. The ideal candidate would:
  • be fluent in English (deal breaker);
  • hold a PhD in Math (or TCS);
  • have experience and publications in at least one of the above areas;
  • have a very good record of publications (measured relative to the specifics of your area and the stage of your PostDoc life).

As for your funding, the ideal way would be to obtain that from your current or home country in the form of a fellowship. (This is a reasonable assumption if that is Germany, France, Belgium, ... For Germany, e.g., there is the DFG Research Fellowship for up to 2 years and the DAAD maintains a database of programs.) We will write the proposal together (provided your CV is sufficiently promising). For exceptional candidates, the European Union's Marie Skłodowska-Curie actions are an option, too.

If you are interested, please contact me! Informal inquiries are welcome! A very short CV and pointers to at most 3 of your favourite (own) publications would be great.

Here's some more information about Estonia & Tartu; also check out the Lonely Planet description.

  • Eurozone since 2011
  • 1.3 million inhabitants (= Cologne + Heidelberg)
  • Bustling economy
  • Language similar to Finnish
  • No need to learn it: English works everywhere. (Out of politeness, not necessity, I learned the Estonian words for: 'Hi', 'Please coffee', 'Thanks', 'Good bye'.)

  • Houses the region's most venerable university
  • Student city (= cafes, bars, restaurants, nightlife, ...)
  • Climate:
    • 4 months party in the sauna (winter)
    • 4 months party in the streets (summer)
    • (4 months inconclusive)

PhD positions in Discrete Math

I am looking for candidates for PhD-positions in Discrete Mathematics and Combinatorial Optimization here at the University of Tartu, for the academic year starting in Sep 2015.  Experience shows that some advance contact and collaboration is very helpful to get accepted into the PhD program.
The work environment is multi-national. The language of instruction and communication is English.
You should be interested in at least one of the following areas:
  1. Matrix theory;
  2. Probabilistic combinatorics and hypergraph theory, graph coloring, probabilistic methods;
  3. Computational discrete optimization
Please check out the "Problems" category here on the blog to get an idea of what kind of problems you might be working on.  (That list conveys only a limited selection, though, and is somewhat slanted towards matrix theory.)
The research group in discrete mathematics and combinatorial optimization was created in Feb 2013, and is in expansion (the web site still isn't up yet). As of September '14, it will consist of 2 PhD students, one early PostDoc, one advanced PostDoc, myself, and several half-members whose main research area has strong connections to combinatorics (Coding Theory, Cryptology). Expect a small and vibrant environment. Regarding supervising research students, I value working very closely with them. As a rule of thumb, I have 5+ hours of one-on-one face time per week with each student, usually condensed into one intensive afternoon research session.
The qualifications I am looking for are roughly the following. The ideal candidate would:
  • have an excellent background in mathematics;
  • be fluent in English;
  • be able to express their mathematical ideas and questions concisely;
  • hold a Masters degree (or equivalent), or be in the process of acquiring one, in Mathematics or Computer Science.
If you are interested, please contact me as soon as possible.  Informal inquiries are welcome: Just send a couple of lines about your education and your interests.
Below I have summarized some information about Estonia, Tartu, the University of Tartu, and the institute.

The PhD program
The 4-year PhD program (affiliated with Computer Science) requires ~50 ECTS (European credit points) of coursework. On the graduate level, we have 4 core courses covering basic research-level combinatorics, which rotate every fall: extremal combinatorics, probabilistic combinatorics, enumerative combinatorics, and (hyper-)graph theory. The courses are heavily geared towards problem solving. The salary is competitive (i.e., depends on the candidate's qualifications). The generous version: you get ~1000 Euros; the other version: you get ~590 Euros plus a half TA position worth ~410 Euros. (All these numbers are after tax.)
About half of the PhD students in the program are non-Estonian. Unfortunately, Discrete Math & Combinatorial Optimization is currently 100% expatriate.

Estonia is the Euro-zone country located directly south of Finland.
It is an ``E-Society'': Broadband Internet (e.g. LTE) is fast and inexpensive or free (public WiFi is ubiquitous). The country scores 1st in the world in Internet Freedom and 5th in Privacy. It is ranked directly behind Germany in the ``State of the World's Children's Index'' (10th). In the Human-Development index, it scores between the UK and Portugal. See Wikipedia for these and other statistical facts.
Pretty much everybody under 40 speaks English. It is not necessary (though polite) to learn Estonian. The language is similar to Finnish.

Tartu is a picturesque city in southern Estonia. Home to 17,000 students (in a population of 105,000), it has lots of inexpensive restaurants, bars, and cafés. Tartu is also the destination of large numbers of Erasmus exchange students from the rest of Europe. Consequently, it has outsize nightlife, well beyond what you'd expect in a city of that size.

Universitas Tartuensis
Founded in 1632, the University of Tartu is one of the most venerable centers of education and research in this part of the world. In rankings (e.g., Times Higher Education World University Ranking), it is ranked in the top 400 universities world wide, with a rapid ascent over the recent years.

The CS institute
A significant number of the Computer Science institute faculty are expatriates. The discrete math & combinatorial optimization group, of which you will be part, is still in expansion.  Research fields are Graph and Hypergraph Theory, (Combinatorial) Matrix Theory, Combinatorial Number Theory, and Probabilistic Combinatorics.  Within CS, there are connections and interactions with the sprawling Cryptology group here, and with a small group in Coding Theory.
The faculty comprises the CS, Math, and Statistics institutes. In Math, there are large research groups in Algebra and Functional Analysis, but the Math institute is less international (e.g., all their courses are in Estonian); the same holds for the Statistics institute.

Extension complexity and factorization rank over arbitrary convex cones

Let P be a convex polyhedral cone. It is sufficiently well known that the nonnegative rank of the slack matrix of P equals the minimum number of facets of a convex polyhedral cone Q which can be mapped onto P by a linear mapping. This number is called the linear extension complexity of P. Gouveia, Parrilo, and Thomas have generalized the connection between the minimum "size" of Q and a suitable factorization rank of the slack matrix to closed convex sets Q which need not be polyhedral. The most prominent case for applications is undoubtedly that of so-called spectrahedra, where Q is required to be the intersection of the (non-polyhedral) convex cone of (real) positive semidefinite matrices with a linear subspace. The situations where we take the doubly nonnegative matrices or the copositive matrices instead of the positive semidefinite matrices are equally important.
In this post, I will write out some details of arguments in the context of these constructions.
The setup
We will be concerned with the situation that there are two convex polyhedral cones
$P_0 \subset P_1$.
The case $P := P_0 = P_1$ corresponds to the situation above. We then ask for the minimum "complexity" of some type of convex cone Q, for which there is a linear mapping $\pi$ such that
$P_0 \subset \pi(Q) \subset P_1 \ \ \ \ \ \ \ \ \ \ \ (*)$.
The more general setup is of pertinence in applications in Combinatorial Optimization. There, one often has a convex polyhedral cone $P_0$ generated by the (finitely many) feasible solutions to the combinatorial optimization problem, and a convex polyhedral cone $P_1$ which is a relaxation of $P_0$: the cone $P_1$ only contains $P_0$, but is not equal to it. This "gap" usually is a consequence of the enormous complexity of polyhedra considered in Combinatorial Optimization. As an example, consider the famous Traveling Salesman Problem. In this case, $P_0$ is the homogenization of the Traveling Salesman Polytope (the latter is the convex hull of the edge sets of salesman tours). A large number of facet-defining inequalities is known for $P_0$, but, for more than 10 cities, describing $P_0$ in terms of linear inequalities is elusive. Some of the better known inequalities are the subtour inequalities and the 2-matching inequalities. Both are exponentially large families of inequalities. The following question is now very interesting: Is there a linear/semidefinite/etc (extended) formulation with polynomial "complexity" which is at least as good as the subtour+2-matching relaxation? Letting $P_1$ be the homogenization of the polytope defined by the subtour and 2-matching inequalities, this question is equivalent to asking for a low-complexity convex set Q as in (*). The Max-Cut problem is another example. Here, one may ask, for example, whether there exist small semidefinite (extended) formulations which improve on, say, the triangle inequalities. (For more on this question, see our abstract arXiv:1203.3961.)
The rigorous definition is the following. For the rest of this post, let $\mathcal C$ be a set of closed convex cones, each cone $C \in \mathcal C$ contained in a linear space $V_C$, and let $h\colon \mathcal C \to \mathbf N$ be a mapping ($h(C)$ will be the "complexity" of C). For simplicity, we require that the $V_C$ are finite dimensional. For convenience, we also assume that C is full-dimensional in $V_C$, i.e., the interior of C is non-empty.

Definition. A $\mathcal C$-extension of a pair of convex polyhedral cones $P_0\subset P_1$ in a linear space $W$ is a $C\in \mathcal C$ together with a linear subspace L of $V_C$ and a linear mapping $\pi\colon V_C \to W$, such that $Q := C\cap L$ satisfies (*). The $\mathcal C$-extension complexity of $P_0\subset P_1$ is the infimum of $h(C)$ over all $\mathcal C$-extensions of $P_0\subset P_1$.

Remark. The extension complexity may be infinite.

Remark. For the definition to make sense in applications in combinatorial optimization, where one is typically interested in polyhedra $P_0\subset P_1$ which are not cones, the family $\mathcal C$ will have to be closed with respect to homogenization. This is the case in the examples mentioned above.
Connection with factorization rank
We now define the factorization rank of a matrix over a system of convex cones. For a closed convex cone C contained in a linear space V, we denote its dual cone by
$C^* := \bigl\{ \alpha \bigm| \left( \alpha \mid \xi \right) \ge 0 \text{ for all } \xi \in C \bigr\} \quad \subset V^*$,
where $V^*$ stands for the dual linear space. Recall that
$(C\cap D)^* = C^*+D^* \ \ \ \ \ \ \ \ \ \ \ \ \ \ (**)$.
Definition. A $\mathcal C$-factorization of an $m\times n$-matrix S is a $C\in \mathcal C$ together with $\alpha_1,\dots,\alpha_m \in C^*$ and $\xi_1,\dots,\xi_n \in C$ such that for all $k,\ell$
$S_{k,\ell} = \left( \alpha_k \mid \xi_\ell \right)$.
The $\mathcal C$-factorization rank of S is the infimum of $h(C)$ over all $\mathcal C$-factorizations of S.

Remark. The factorization rank may be infinite.

The connection between extension complexity and factorization rank is via the slack matrix of $P_0$ "with respect to" $P_1$. For this, let A be an $m\times r$-matrix and X an $r\times n$-matrix, and suppose that $S := AX \ge 0$. Moreover, let $P_0$ be the convex cone generated by the columns of X, and $P_1 := \{x \mid Ax \ge 0\}$. We assume that the rank of S is r, i.e., both A and X are of rank r. In this case, we say that S is the slack matrix of $P_0 \subset P_1$.
Theorem. The $\mathcal C$-extension complexity of $P_0\subset P_1$ equals the $\mathcal C$-factorization rank of the slack matrix S.
Proof. For simplicity, denote by $x_1,\dots,x_n$ the columns of X, and by $a_1,\dots,a_m$ the rows of A, so that $S_{k,\ell} = \left(a_k\mid x_\ell\right)$. We first show that the $\mathcal C$-extension complexity is at least the $\mathcal C$-factorization rank, by constructing a factorization of S from an extension of $P_0\subset P_1$. Let $(C,L,\pi)$ be such an extension, and $Q:= C\cap L$. For every $\ell$, there exists $\xi_\ell \in Q$ such that $\pi(\xi_\ell) = x_\ell$. For every k, it follows from $\left(\pi^*(a_k)\mid \xi\right) = \left(a_k \mid \pi(\xi) \right)$ and from $\pi(Q) \subset P_1$ that $\pi^*(a_k) \in Q^*$. Thus, by (**), there exist $\alpha_k \in C^*$ and $\gamma_k\in V_C^*$ with $\left(\gamma_k\mid \lambda\right) = 0$ for all $\lambda\in L$, such that $\pi^*(a_k) = \alpha_k + \gamma_k$. We conclude
$S_{k,\ell} = \left( a_k \mid x_\ell \right) = \left( a_k \mid \pi(\xi_\ell) \right) = \left( \pi^*(a_k) \mid \xi_\ell \right) = \left( \alpha_k + \gamma_k \mid \xi_\ell \right) = \left( \alpha_k \mid \xi_\ell \right)$,
i.e., we have a factorization. Second, we show that the $\mathcal C$-extension complexity is at most the $\mathcal C$-factorization rank, by constructing an extension of $P_0\subset P_1$ from a factorization of S. Let $(C,\alpha_1,\dots,\alpha_m,\xi_1,\dots,\xi_n)$ be such a factorization. We define
$Q' := \Bigl\{ \left(\begin{smallmatrix}x\\\xi\end{smallmatrix}\right) \Bigm| \left(a_k\mid x\right) = \left(\alpha_k \mid \xi\right) \forall k, \ \xi \in C \Bigr\}$.
In a moment, we will show that the projection of $Q'$ onto the x-coordinate lies between $P_0$ and $P_1$. This is sufficient to prove the theorem, even though, ostensibly, $Q'$ is not the intersection of $C$ with a linear subspace. But in fact, it is: since A has full column rank, the projection of $Q'$ onto the $\xi$-coordinate is a linear bijection between $Q'$ and
$Q := \Bigl\{ \xi \Bigm| \bigl( \left(\alpha_k \mid \xi\right) \bigr)_k \in \mbox{im}\,A \Bigr\} \cap C$,
which has the desired form. Now, to show that the projection of $Q'$ onto the x-coordinate lies between $P_0$ and $P_1$, we first argue that if $\left(\begin{smallmatrix}x\\\xi\end{smallmatrix}\right) \in Q'$, then $(Ax)_k = (\alpha_k\mid \xi) \ge 0$ because $\alpha_k \in C^*$, $\xi \in C$, and hence $x \in P_1$. For the other inclusion, we only have to check whether the columns of X are contained in the image under the projection. But for the column $x_\ell$, we have $\left(\begin{smallmatrix}x_\ell\\\xi_\ell\end{smallmatrix}\right) \in Q'$ since $(a_k\mid x_\ell) = (\alpha_k\mid\xi_\ell)$ holds. This concludes the proof of the theorem.
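As a sanity check on the theorem, specialize $\mathcal C$ to the family of nonnegative orthants $\mathbf R^q_+$, $q \in \mathbf N$, with $h(\mathbf R^q_+) := q$. Since $(\mathbf R^q_+)^* = \mathbf R^q_+$, a $\mathcal C$-factorization of S is just

```latex
S_{k,\ell} \;=\; \left( \alpha_k \mid \xi_\ell \right) \;=\; \alpha_k^{\mathsf T} \xi_\ell,
\qquad \alpha_1,\dots,\alpha_m \in \mathbf R^q_+, \quad \xi_1,\dots,\xi_n \in \mathbf R^q_+,
```

i.e., a factorization of S into a nonnegative $m\times q$-matrix times a nonnegative $q\times n$-matrix. The $\mathcal C$-factorization rank is then the nonnegative rank of S, and the theorem specializes to the statement from the beginning of this post: the linear extension complexity of $P_0 \subset P_1$ equals $\mbox{rk}_+(S)$.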

Dagstuhl Workshop "Communication Complexity, Linear Optimization, and lower bounds for the nonnegative rank of matrices"

As a Valentine's Day (2012) present, our proposal -- with LeRoy Beasley, Hartmut Klauck, and Troy Lee -- for the organization of a workshop at Dagstuhl Castle has been accepted. Our vision is to bring together people from communication complexity, matrix theory, and combinatorial optimization in the hope of making progress on some of the infamous and long-standing open questions regarding lower bounding the nonnegative rank of matrices, and its applications. The workshop will take place
Feb 17 -- 22   2013.
Dagstuhl hosts a web page for the workshop.
Here's the abstract from the proposal.
A number of notorious conjectures and open problems in Matrix Theory, Combinatorial Optimization, and Communication Complexity require proving lower bounds on the nonnegative rank of matrices. In Combinatorial Optimization, one of the most infamous questions is whether the Matching Problem admits a polynomially sized Linear Programming formulation (Schrijver's Book). The minimum size of a Linear Programming formulation equals the nonnegative rank of a certain matrix (Yannakakis 1991). In Communication Complexity, one of the biggest open problems, the log-rank conjecture posed by Lovász and Saks (1988), involves the question whether the logarithms of the rank and the nonnegative rank are polynomially related for 01-matrices.
The currently known lower bounds for the nonnegative rank are either trivial (rank lower bound) or known not to work in many important cases (biclique covering lower bound). Recently, in Combinatorial Optimization, there has been a surge of interest in lower bounds of sizes of Linear Programming formulations. A number of new methods have been developed and new results have been obtained. Moreover, a new connection linking nonnegative rank directly to a variant of randomized communication complexity has been discovered.
We want to bring together researchers from the fields involved, in order to discuss and perfect the (known and) new techniques, and use (known and) new connections to transfer tools and methods from each one field to the others. We expect that in the wake of this seminar, considerable advances will be made towards resolving some of the questions regarding nonnegative rank.

Open questions about nonnegative rank and related concepts

This post summarizes some of my favorite open problems about the nonnegative rank of matrices and related concepts like the extension complexity of polytopes (see this post). Some notation & terminology:
  • $\mbox{frk}_S(A)$: factorization rank of A over the semiring S
  • $\mbox{ndCC}(A)$: nondeterministic communication complexity of the support of A
  • $\mbox{fool}(A)$: fooling set bound

Theoretical questions

  1. Cohen & Rothblum (1993) asked: If ${A}$ is rational, is its factorization rank over ${\mathbf{Q}_+}$ equal to the factorization rank over ${\mathbf{R}_+}$? More generally: If ${A}$ has entries in a sub-semiring ${S}$ of a sub-semiring ${R}$ of the nonnegative reals, how large can the gap be between ${\mbox{frk}_S(A)}$ and ${\mbox{frk}_R(A)}$? (Beasley & Laffey 2009)
  2. Given any sub-semiring ${S}$ of ${\mathbf{R}_+}$ and ${n\ge 6}$, is there a matrix ${A \in \mathbf{M}_{n\times n}(S)}$ such that ${\mbox{rk} A = 3}$ and ${\mbox{frk}_S(A) = n}$? (Beasley & Laffey 2009) This is open even for ${S=\mathbf{R}_+}$. Cf. Specific-Question #1.
  3. If ${A}$ is a symmetric ${n\times n}$ nonnegative matrix with real rank ${k}$, and if ${A}$ has exactly one negative eigenvalue, must ${A}$ have a nonnegative factorization ${A = XY}$ where ${X}$ is ${n\times q}$ with ${q}$ bounded as a function of ${k}$? (Beasley & Laffey 2009)
  4. What is the smallest $\alpha$ such that $\mbox{rk}_+(A) \le \mbox{rk}(A)^\alpha$ for all 01-matrices $A$? This corresponds to the log-rank conjecture of Lovász and Saks (with $\alpha = \infty$ if that is false). Proving lower bounds for $\alpha$ corresponds to proving lower bounds for the nonnegative rank.
  5. What is the largest separation between the logarithm of the nonnegative rank and the deterministic communication complexity for Boolean functions? The gap cannot be more than quadratic, but the best known example has a smaller gap. A related question asks for the largest separation between the nonnegative rank and the biclique covering number for 01-matrices.
  6. It is known that ${\mbox{fool}(A) \le (\mbox{rk} A)^2}$, and there are examples where ${\mbox{fool}(A) \ge (\mbox{rk} A)^{\log_3 4}}$. Which of these two bounds can be improved? (Dietzfelbinger, Hromkovič & Schnitger 1996)

Questions about specific families of matrices

  1. A ${d}$-dimensional Euclidean distance matrix of size ${n}$ is defined by points ${x_1,\dots,x_n}$ in ${d}$-dimensional Euclidean space; its entries are ${\bigl( \lVert x_k - x_\ell \rVert^2 \bigr)_{k,\ell}}$. Do ``generic''/``random'' Euclidean distance matrices have full nonnegative rank? (Gillis & Glineur 2011; Lin & Chu 2010) A proof of this for ${d=1}$ by Lin & Chu 2010 is fatally flawed. More specifically: Do some linear Euclidean distance matrices have the property asked for in Theory-Question #2? (Beasley & Laffey 2009) (Not all of them do (Gillis & Glineur 2011).) The rank bound and the ndCC bound are both trivial for these matrices. Upper bounds for special cases have been improved. (Gillis & Glineur 2011)
  2. Does a random (with a suitable distribution) ${(m\times n)}$ rank-${k}$ matrix have full nonnegative rank? A weaker question, asked by J. Garcke, M. Klimm, H.-C. Kreusler, and A. Uschmajew, is: Does the set of matrices with nonnegative rank ${k}$ have positive Lebesgue measure within the ${(m\times n)}$ rank-${k}$ matrices? If ${A}$ is randomly drawn from the ${(m\times n)}$-matrices with nonnegative rank ${k}$, is ${\mbox{rk} A = k}$ with probability one / positive probability? (Dong, Lin, Chu) For some of these questions, erroneous proofs can be found on the internet. In particular, a proof by Dong, Lin, and Chu is incorrect. The rank bound and the ndCC bound are both trivial for these matrices.
  3. Is the extension complexity of the Matching Polytope of an ${n}$-vertex graph polynomial in ${n}$? An infamous, several decades old question in Combinatorial Optimization asks whether there exists an extended formulation for the matching problem whose coding size is polynomial in the number of vertices of the graph. A slightly weaker question asks whether there exists an extended formulation whose number of inequalities is bounded by a polynomial in the number of vertices of the graph. The best known lower bound is the trivial rank bound ${\Omega(n^2)}$. The ndCC bound is ${O(n^4)}$. The known upper bounds are ${O(c^n)}$, for values of ${c\in \left]1,2\right]}$ which have been improved several times in the last three decades.
  4. Is the extension complexity of the Spanning Tree polytope ${\Omega(n^3)}$? Michel Goemans asked this question at the Cargese Workshop on Combinatorial Optimization 2010, which was dedicated to extended formulations. The best known lower bound is the trivial rank bound. It is unknown whether the ndCC bound improves on the rank bound. The known upper bound is ${O(n^3)}$, with improvements for certain types of graphs.
  5. What is the extension complexity of the Stable Set polytope of a perfect graph? The question goes back to Yannakakis (1991), and was recently repeated by Michel Goemans (MIT) at the Cargese Workshop on Combinatorial Optimization 2010, which was dedicated to extended formulations.
  6. Is the extension complexity of the Completion Time polytope ${O(n\log n)}$? (Goemans 2011 & Cargese) If true, this would extend work by Goemans (2011). The best known lower bound is the trivial rank bound. The upper bound is ${O(n^2)}$, which has been proved in a couple of ways in the last 20 years.
  7. Is the extension complexity of an ${n}$-vertex convex polygon with vertices in general position ${\Omega(n)}$? (Fiorini, Rothvoss, Tiwary) The rank bound is 2, the ndCC bound is ${\Omega(\log n)}$. Some regular convex polygons have extension complexity ${\Theta(\log n)}$. The best known lower bound for algebraically independent vertices is ${\Omega(\sqrt n)}$.
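To make the setting of question 2 concrete, here is a minimal numerical sketch (my own illustration; the sampling distribution is an arbitrary choice): a product of entrywise-nonnegative factors has nonnegative rank at most ${k}$ by construction, and its ordinary rank, which lower-bounds the nonnegative rank, generically equals ${k}$ as well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a matrix with nonnegative rank at most k, as a product of two
# entrywise-nonnegative factors (one natural, arbitrarily chosen distribution).
m, n, k = 6, 7, 3
U = rng.random((m, k))           # entries uniform in [0, 1), hence nonnegative
V = rng.random((k, n))
A = U @ V                        # rk_+(A) <= k by construction

# The ordinary rank lower-bounds the nonnegative rank; here it is k
# almost surely, so rk(A) = rk_+(A) = k for this sample.
print(np.linalg.matrix_rank(A))  # 3
```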

Random signed SAT

This paper was long in the making, but Kathrin Ballerstein and I have finally completed the final draft of our paper An algorithm for random signed 3-SAT with Intervals. Signed SAT is a "multi-valued logic" version of the classical Satisfiability (SAT) problem. In both cases, one is given as input a set of $n$ variables and a set of $m$ clauses, each consisting of $k$ "literals", and one seeks to find an assignment of values to the variables such that in each clause, at least one literal is satisfied. Whereas in classical SAT the variables are boolean and the literals have the form "$x$" or "$\lnot x$", in signed SAT the literals have the form $x \in S$, where $S$ is a member of some set $\mathcal S$ which is fixed (i.e., not part of the input). Signed SAT has been around for a while. The best starting point for references is probably the paper "The Helly property and satisfiability of Boolean formulas defined on set families" by Chepoi, Creignou, Hermann, and Salzer (it simply is the most recent I know of), but the groundwork was laid by Manyà and Hähnle in the 90s. In our paper, we study a variant where the set $\mathcal S$ comprises all intervals contained in $[0,1]$; we refer to it as iSAT. We study what a proper adaptation of the well-known Unit Clause algorithm for classical random SAT does for our interval Satisfiability problem. Since Unit Clause algorithms alone are usually not much good, we enhance it with a "repair" (or "very limited backtracking") option, and analyze the whole thing using Wormald's famous ODE method. Here are some details about the paper.
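To fix notation, here is a toy sketch of an iSAT instance in code (my own ad-hoc encoding, not the paper's): variables take values in $[0,1]$, a literal is a membership test $x_i \in [lo,hi]$, and a clause is satisfied if at least one of its literals is.

```python
# A toy, ad-hoc encoding of iSAT: variables take values in [0,1],
# a literal "(i, lo, hi)" means x_i in [lo, hi], a clause is a list of
# literals, and a formula is a list of clauses.

def satisfies(assignment, formula):
    """True iff every clause contains at least one satisfied literal."""
    return all(
        any(lo <= assignment[i] <= hi for (i, lo, hi) in clause)
        for clause in formula
    )

# Two variables, three clauses with interval literals.
formula = [
    [(0, 0.0, 0.4), (1, 0.6, 1.0)],
    [(0, 0.5, 1.0), (1, 0.0, 0.3)],
    [(0, 0.0, 1.0), (1, 0.0, 1.0)],   # trivially satisfiable clause
]
print(satisfies([0.2, 0.1], formula))  # True
```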
Algorithms which repair their mistakes
Say you have a smart algorithm which proceeds through a number, say $\Theta(n)$, of iterations. In each iteration, the probability of it succeeding is quite large. For example, let us say that, in each iteration, the failure probability is $\Theta(r(n)/n)$ for some $r(n) = o(n)$. Due to the large number of iterations, though, the expected number of failures is $\Theta(r(n))$ --- too large even for constant $r$. The remedy is, of course, to let the algorithm check, in each iteration, whether it has made a mistake, and if that is the case, repair the damage. There are certain complications with this approach, mainly because the algorithm has already "looked at" some of the random input data, so that, during the repair, we have to assume that the distribution of some of the data is changed by conditioning on the earlier choices of the algorithm. However, there's a trick in this situation: While the overall algorithm (likely) chooses its actions fairly intelligently, for the repair part it is ok to proceed in a fairly stupid way. This is so because all the repair part needs to do is push its own failure probability (conditioned on the fact that a repair was necessary and on the previous choices of the algorithm) to $o(1/r)$. For example, for constant $r$, it suffices that the repair part fails with probability tending to zero arbitrarily slowly. In our paper, we bound the failure probability of the "smart" algorithm by $O((\mbox{polylog}\,n)/n)$, and the "stupid" repair part fails with conditional probability $O((\mbox{polylog}\,n)/n)$, too.
Branching system
As is often the case with SAT, the analysis of our algorithm entails studying properties of a branching system. Of interest here is the total population size, and we need the expectation and a tail probability estimate. It is actually closer to the problem to speak of a ``discrete time queue'', but at the heart of it is (as is the case with queues) a branching system. I have written two posts (How to analyze a discrete time queue and Tail probability estimate for the busy period) on discrete-time queues in preparation for the completion of our paper. So there's not much to add here, except the one thing that needs to be dealt with almost always in the "real world" -- random variables are almost never completely independent, and also almost never identically distributed. There's generally a small amount of dependence and deviation from the ``right'' distribution, which kills the textbook- (or Wikipedia-)level theorems one would like to apply, and results in several pages of estimates, inequalities, coupling, and conditioning on the parameter regions in which it all works, leading to more estimates and inequalities. Personally, I find that these complications really are a pain: they are usually not complicated, but I never know in how much detail to write them down. The same problem also occurs at a point where we (would usually) use Wald's equation, but the summands are neither (completely) independent nor (completely) identically distributed. Fortunately, though, at this point we can revert to Optional Stopping.
Wormald's ODE-method
A black-box theorem by Wormald allows one to control parameters of a random process by writing down the conditional expectations of their changes from one iteration to the next in the form of an initial value problem. While I am generally not a big fan of black-box theorems, I quite fancy this one. The basic idea behind the whole method is the simple trick "if you know how the conditional expectations behave, try to find a martingale and use the Azuma-Hoeffding inequality", but the way it is used to prove the ODE-theorem is quite nice.
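To illustrate the flavor of the method on a standard toy example (balls into bins, not our system): the number $X(t)$ of empty bins changes in expectation by $-X/n$ per ball, suggesting the ODE $x' = -x$ in rescaled time $s = t/n$ (with $x = X/n$), and the trajectory concentrates around its solution $e^{-s}$. A quick simulation confirms this:

```python
import math
import random

def empty_bins(n, T, seed=0):
    """Throw T balls into n bins uniformly; return the number of empty bins."""
    rng = random.Random(seed)
    occupied = [False] * n
    empties = n
    for _ in range(T):
        j = rng.randrange(n)
        if not occupied[j]:
            occupied[j] = True
            empties -= 1
    return empties

n = 100_000
final = empty_bins(n, n)                     # rescaled time s runs from 0 to 1
print(abs(final / n - math.exp(-1)) < 0.01)  # True: X(n)/n concentrates at e^{-1}
```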
A little bit of ODE calculus
Even though this is tremendously simple from an analysis point of view, we do some computations with the system of ODEs which we obtain, and transform it into a single initial value problem by a change of parameters. The resulting IVP is still not solvable analytically (I guess), but it is a lot easier to give estimates on how the solution behaves, and also to solve it numerically, which we do in order to determine quantitative properties of the input data for which our algorithm succeeds. More specifically, the initial ratio of the number of clauses over the number of variables must be such that the solution to the IVP does not intersect a certain line before it falls below a certain other function. The line describes the point at which the queue length (and thus the length of the busy period) of the discrete-time queue crosses from finite to infinite.

A tail probability estimate for the busy time in a discrete time queue

In this previous post, I wrote down some analysis of a discrete time queue where one customer is serviced per time unit (i.e., "minute"). If at the beginning of a minute, the queue is empty, a random number of customers arrives during the minute, and the distribution of this number is that of a random variable $A$. If at least one customer is already waiting, the number of new customers is distributed like a random variable $B$. All arrivals are assumed to be independent. In other words, if the length of the queue at the beginning of minute $t$ (a nonnegative integer), is denoted by $Q(t)$, we have the following

$Q(t+1) = \begin{cases} Q(t) - 1 + B(t) &\text{ if } Q(t) > 0\\ A(t) &\text{ if } Q(t)=0,\end{cases}$

where the $A(t)$ are iid random variables distributed as $A$ and the $B(t)$ are iid random variables distributed as $B$. Moreover, $Q(0) = 0$.
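The recursion is easy to simulate; here is a minimal sketch (the arrival distributions below are arbitrary placeholders, to be replaced by whatever $A$ and $B$ one has in mind):

```python
import random

def simulate_queue(T, sample_A, sample_B, seed=0):
    """Simulate Q(t+1) = Q(t) - 1 + B(t) if Q(t) > 0, else A(t), with Q(0) = 0.
    sample_A / sample_B each draw one arrival count; their distributions
    are placeholders for the reader to choose."""
    rng = random.Random(seed)
    Q = [0]
    for _ in range(T):
        if Q[-1] > 0:
            Q.append(Q[-1] - 1 + sample_B(rng))
        else:
            Q.append(sample_A(rng))
    return Q

# Example run with small uniform arrival counts (an arbitrary choice):
traj = simulate_queue(10, lambda r: r.randint(0, 2), lambda r: r.randint(0, 1))
print(traj[0], len(traj))  # 0 11
```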

As in the other post, we are interested in the busy time $X$ after the start, in other words, the time between 0 and the first return of $Q(t)$ to the state 0. I already argued that, if $h(z)$ is the probability generating function of this time, and $g_A(z)$ and $g_B(z)$, are the probability generating functions of $A$ and $B$, respectively, then the following is true

$\begin{aligned} h(z) &= g_A( f(z) )\\ f(z) &= z\, g_B( f(z) )\end{aligned}$

where $f(z)$ is the probability generating function of some other random variable, which is no longer of particular interest here.

In this post, I'd like to make some statements about properties of $X$, based on the system of equations for the probability generating function.

The mean

In the other post, we obtained the formula for the mean of $X$ as a consequence of the fact that $Q(t)$ is a stationary Markov chain, for which we could write down the probability that it is in state 0. Here, I'll use the probability generating function. We have to compute $h'(1)$. By the chain rule, we obtain

$\begin{aligned}f'(z) &= g_B(f(z)) + z g_B'(f(z)) f'(z)\\h'(z) &= g_A'(f(z)) f'(z),\end{aligned}$

and, assuming everything converges, we obtain

$\begin{aligned}f'(1) &= 1 + g_B'(1) f'(1)\\h'(1) &= g_A'(1) \frac{1}{1-g_B'(1)}, \end{aligned}$

where, to obtain the second equation, we solved the first one for $f'(1)$. With $\lambda := \mathbb E A$ and $\nu := \mathbb E B$, this is just what we obtained from the Markov chain. At this point, note that the desired convergence of the power series $f$ and $h$ is equivalent to them having finite means, plus Abel's theorem, possibly.
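Taking, for concreteness, Poisson arrivals (the case studied in the next section anyway), the formula $h'(1) = \lambda/(1-\nu)$ can be sanity-checked by simulating busy periods. This is an illustrative sketch with arbitrarily chosen parameters; it counts the minutes the server is busy, which is the variable whose pgf is $h$:

```python
import numpy as np

def busy_minutes(lam, nu, rng):
    """Minutes of one busy period: A ~ Poisson(lam) customers arrive into
    the empty queue; each minute one customer is served and B ~ Poisson(nu)
    new customers arrive.  Returns 0 if nobody arrived at all."""
    queue, minutes = rng.poisson(lam), 0
    while queue > 0:
        queue += rng.poisson(nu) - 1   # serve one, B new arrivals
        minutes += 1
    return minutes

rng = np.random.default_rng(1)
lam, nu = 0.5, 0.5
mean = np.mean([busy_minutes(lam, nu, rng) for _ in range(200_000)])
print(round(mean, 1))  # 1.0, matching lam / (1 - nu)
```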

Tail inequality

Now I'd like to bound the probability that the server is busy for a very long time. For this, I will focus on specific distributions of $A$ and $B$, namely, I'll assume that both are Poisson distributions. Thus, the probability generating functions of $A$ and $B$ are the following:

$\begin{aligned} g_A(z) &= e^{\lambda(z-1)} \\ g_B(z) &= e^{\nu(z-1)}. \end{aligned}$

Hence, we obtain

$\begin{aligned} h(z) &= e^{\lambda(f(z)-1)} \\ f(z) &= z\, e^{\nu(f(z)-1)}. \end{aligned}$

The second equation can be rewritten as

$z = f(z) e^{\nu(1-f(z))}, \ \ \ \ \ (*)$

of course, so we have a nice expression for the inverse $f^{-1}$ of $f$. Given that we already ``have'' the probability generating functions, we will use the exponential moment method (= exponentiate + Markov + optimize) to bound the probability:

$\Pr[ X \ge \alpha ] = \Pr[ t^X \ge t^\alpha ] \le h(t)t^{-\alpha},$

which is valid for every $t > 1$, and $t$ is optimized at some point. The expression on the right is equal to

$h(t) t^{-\alpha} = e^{\lambda(f(t)-1) -\alpha\log t}.$

We now write $s := f(t)$ and use (*). Noting that $t > 0$ is equivalent to $s > 0$, we compute

$\begin{aligned}\log \Pr[ X \ge \alpha ] &\le \lambda(s-1) -\alpha\log(s e^{\nu(1-s)}) = \lambda(s-1) - \alpha\log s - \alpha\nu(1-s) \\&= (s-1)(\lambda + \alpha \nu) - \alpha\log s.&(**)\end{aligned}$

Now we optimize $s$ (later we'll have to make sure that for our choice of $s$, there really exists a $t \ge 1$). Since for $s\to 0$ and $s\to\infty$ the expression (**) goes to infinity, we just have to take for $s$ the unique point where the derivative in (**) vanishes, which is

$\displaystyle s = \frac{\alpha}{\alpha\nu + \lambda}$.

Summarizing, we obtain the following estimate.
Proposition (Tail estimate for our discrete queue).

$\displaystyle \Pr[ X \ge \alpha ] \le \exp\left( \left( 1-\nu - \log \frac{\alpha}{\alpha\nu + \lambda} \right)\alpha - \lambda\right).$

Now, with $\alpha\to\infty$, the term under the logarithm tends to $1/\nu$, and $\log(1/\nu) > 1-\nu$ because of $\nu < 1$, so that there exist $\epsilon > 0$ (depending on $\nu$) and $\alpha_0$ (depending on both $\lambda$ and $\nu$) such that $1 - \nu - \log \frac{\alpha}{\alpha\nu + \lambda} \le -\epsilon$ for all $\alpha \ge \alpha_0$. Consequently, the exponent in the proposition's bound is essentially linear in $\alpha$, i.e., for these $\alpha$ we have

$\Pr[ X \ge \alpha ] \le e^{-\Theta(\alpha)}$.

We still have to verify that our choice for $s$ is feasible, i.e., there must exist a $t \ge 1$ with $f(t) = s$. Let $t(s) := s e^{\nu(1-s)}$. A swift computation shows that $t'(s) > 0$ for $s < 1/\nu$. From elementary calculus we know that this implies that for these $s$, there exists a $t$ with $f(t) = s$. But our $s$ was $\frac{\alpha}{\alpha\nu + \lambda} < 1/\nu$ because $\lambda > 0$ (unless the queue were trivial). Moreover, we have

$\displaystyle t = s e^{\nu(1-s)} \ge s( 1 + \nu(1-s) ) = (1+\nu)s - \nu s^2 > 1$

where the last inequality holds whenever $1 < s < 1/\nu$ (because the two roots of the equation $(1+\nu)s -\nu s^2 =1$ are $s=1$ and $s=1/\nu$). By increasing $\alpha_0$ if necessary, we can assume that $s > 1$ holds.
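The proposition can also be sanity-checked numerically against a simulation. A sketch with arbitrarily chosen parameters, where $X$ is simulated as the number of busy minutes of one busy period with Poisson arrivals:

```python
import numpy as np

lam, nu, alpha = 1.0, 0.5, 10

# The proposition's bound on Pr[X >= alpha]:
bound = np.exp((1 - nu - np.log(alpha / (alpha * nu + lam))) * alpha - lam)

# Monte Carlo estimate of Pr[X >= alpha]: A ~ Poisson(lam) initial
# customers, one served per minute, B ~ Poisson(nu) arrivals per busy minute.
rng = np.random.default_rng(3)
def busy_minutes():
    queue, minutes = rng.poisson(lam), 0
    while queue > 0:
        queue += rng.poisson(nu) - 1
        minutes += 1
    return minutes

emp = np.mean([busy_minutes() >= alpha for _ in range(100_000)])
print(emp <= bound)  # True (the bound here is about 0.33)
```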

Nonnegative rank and nondeterministic communication complexity

In this post I would like to discuss the connection between the nonnegative rank of a nonnegative matrix and the nondeterministic communication complexity of its support matrix --- the connection being simply that the former is greater than or equal to two to the power of the latter.

Deterministic Communication Complexity

As a digression, let us first discuss the classical deterministic communication complexity of a boolean function ${f\colon \{0,1\}^d\times\{0,1\}^d \rightarrow \{0,1\}}$. The idea is that Alice and Bob want to compute a specific value ${f(x,y)}$, but Alice only knows ${x}$ whereas Bob only knows ${y}$. Now they'll have to communicate, until one of them, say Bob, knows ${f(x,y)}$. Communication is assumed to be the ``expensive'' thing, so that is what Alice and Bob try to minimize.
The maximum over all inputs ${(x,y)}$ of the minimum number of bits Alice and Bob have to exchange before Bob knows the answer, is called the communication complexity of ${f}$.
It is important to realize that Alice and Bob have an unlimited amount of storage and computing power. In particular, both of them may store the complete ${(2^d\times 2^d)}$-matrix ${M_{x,y} := f(x,y)}$ (along with all prime numbers and every book ever written), if they think it might help. (In this sense, it is like the opposite of an exam for students: Alice and Bob may use the complete solution to all exam problems, but they have to find out which problem they are supposed to solve.) The question is not how difficult it is to compute ${f}$. The question is how difficult it is to agree on which region of the matrix the parties ``are'' in.

Nondeterministic CC

Now let us discuss nondeterministic communication complexity. Here, in addition to their respective inputs ${x}$ and ${y}$, Alice and Bob are given a ``certificate'' for the answer ${f(x,y) = 1}$, which both can verify independently. If, in their verifications, they both independently conclude that the certificate proves that ${f(x,y) =1}$, then Bob replies '1'. Otherwise, he says '0'. Bob may err in his reply, but only in saying '0' when the right answer would have been '1'. If the true answer is '0', Bob may never say '1'. (This is just like in nondeterministic computational complexity classes.) Moreover, if the true answer is '1', there must be at least one certificate with which Alice and Bob arrive at the true answer.
Obviously, for Bob to learn the result of Alice's checking of the certificate, Alice has to communicate one bit to Bob; this bit is generally ignored. The ``nondeterministic communication'' is then the coding length of the certificate. Since Alice and Bob may store all possible certificates which might ever occur, this is the same as the coding length of the number of certificates needed for the function ${f}$.
The only kind of yes/no questions Alice and Bob can answer independently are of the form ${x\in A}$ (for Alice), or ${y\in B}$, respectively (for Bob). Thus, certificates amount to rectangles.
Suppose that we have a number of submatrices ${R_1,\dots,R_q}$ of ${M}$, such that every entry in every ${R_j}$ is one. These matrices are supposed to cover all the 1s in ${M}$ in the sense that for every ${(x,y)}$, if ${M_{x,y} =1}$, then at least one of the ${R_j}$ contains the ${(x,y)}$-entry. The usual word for such submatrices is 1-rectangle (or just rectangle), and ${R_1,\dots,R_q}$ with said properties is called a rectangle covering. When Alice and Bob receive their inputs ${x}$ and ${y}$, respectively, if the right answer would be '1', the certificate could be any of the rectangles containing the entry ${M_{x,y}}$. If, however, ${M_{x,y} = 0}$, it does not matter which certificate is given -- Bob will always reply '0'.
Thus, finding the nondeterministic communication complexity of ${M}$ amounts to finding a rectangle covering with a minimum number ${q}$ of rectangles. The nondeterministic communication complexity is then the coding length of ${q}$, i.e., ${\approx \log_2 q}$.
Note that ``rectangle covering'' makes sense for all 0/1-matrices --- not just square matrices with the number of rows/columns a power of two.

Rectangle covering as a coloring problem

Let us make this even a bit more combinatorial. Let ${M}$ be an ${(m\times n)}$-matrix ${M}$, and denote by ${\mbox{rc}(M)}$ its rectangle covering number, i.e., the minimum number of 1-rectangles needed to cover all 1-entries of ${M}$.
We define the rectangle graph ${G_M}$ as follows. The vertex set ${V(G_M)}$ is the set of 1-entries of ${M}$; two such entries are adjacent, if the rectangle they span contains a 0. Say, ${M_{k,\ell} = M_{k',\ell'} = 1}$, so we have the two vertices ${(k,\ell)}$ and ${(k',\ell')}$. They are adjacent, if and only if, ${M_{k,\ell'} M_{k',\ell} = 0}$.
Denoting by ${\chi}$ the usual (vertex-) chromatic number, the following is true.

Lemma 1
$\displaystyle \chi(G_M) = \mbox{rc}(M) $
Proof. For a proof, it suffices to realize that a 1-rectangle in $M$ is an independent set in $G_M$, and that, by the definition of the edges of $G_M$, for every independent set $U$ in $G_M$, the smallest rectangle which contains $U$ is a 1-rectangle.
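As a toy illustration of the lemma (my own example), the rectangle graph of the identity matrix is a complete graph --- any two distinct diagonal 1s span a rectangle containing a 0 --- so Lemma 1 recovers ${\chi(G_{I_n}) = n = \mbox{rc}(I_n)}$:

```python
import itertools
import numpy as np

def rectangle_graph(M):
    """Vertices: positions of 1-entries of M; edge between (k,l) and
    (k',l') iff M[k,l'] * M[k',l] == 0 (the spanned rectangle has a 0)."""
    ones = [(k, l) for k in range(M.shape[0])
                   for l in range(M.shape[1]) if M[k, l] == 1]
    edges = [(u, v) for u, v in itertools.combinations(ones, 2)
             if M[u[0], v[1]] * M[v[0], u[1]] == 0]
    return ones, edges

M = np.eye(4, dtype=int)
ones, edges = rectangle_graph(M)
# G_M is the complete graph K_4, so chi(G_M) = 4 = rc(I_4).
print(len(ones), len(edges))  # 4 6
```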

I have been unable to find this fact elsewhere in published literature. The proof, although quite easy, can be found in a forthcoming paper.
So nondeterministic CC amounts to coloring a certain graph defined by the boolean function.

Connection with nonnegative rank (and extension complexity)

Now that the basics are clear, let us come back to the original question. Let ${A}$ be a nonnegative matrix and ${M}$ its support matrix, i.e., ${M}$ is the 0/1-matrix with ${M_{k,\ell} = 1}$ if and only if ${A_{k,\ell} \ne 0}$. The following fact was already known to Yannakakis (1991).

Lemma 2
$\displaystyle \mbox{rc}(M) \le \mbox{rk}_+(A). $
Proof. The nonnegative rank of $A$ is the minimum number of nonnegative rank-1 matrices whose sum equals $A$. In other words, there exist nonnegative vectors $u_k, v_k$, $k=1,\dots,\mbox{rk}_+(A)$, such that
$\displaystyle A = \sum_{k=1}^{\mbox{rk}_+(A)} u_k v_k^T$.
The support of $A$ is thus the union of the supports of the $u_k v_k^T$. But clearly, the support of $u_k v_k^T$ is a rectangle: it is the product of the support of $u_k$ and the support of $v_k$. Thus, we have a rectangle covering using $\mbox{rk}_+(A)$ rectangles, which implies the inequality in the lemma.
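The proof's covering can be made concrete in a few lines (an illustrative sketch with hand-picked factors): the support of each rank-1 term $u_k v_k^T$ is a rectangle, and together the rectangles cover the support of $A$.

```python
import numpy as np

# A nonnegative factorization A = u1 v1^T + u2 v2^T gives a rectangle
# covering of the support of A (hand-picked toy factors).
u1, v1 = np.array([1.0, 2.0, 0.0]), np.array([0.0, 3.0, 1.0])
u2, v2 = np.array([0.0, 1.0, 1.0]), np.array([2.0, 0.0, 0.0])
A = np.outer(u1, v1) + np.outer(u2, v2)

R1 = np.outer(u1 > 0, v1 > 0)   # support of u1 v1^T: rows {0,1} x cols {1,2}
R2 = np.outer(u2 > 0, v2 > 0)   # support of u2 v2^T: rows {1,2} x cols {0}
covered = R1 | R2
print(np.array_equal(covered, A > 0))  # True: the rectangles cover supp(A)
```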

In the context of the extension complexity of a polytope ${P}$, the rectangle graph can be defined as follows. The vertices of ${G}$ are the pairs ${(x_k,(a_\ell,\alpha_\ell))}$, where ${x_k}$ is a point in ${P}$ and the inequality ${a_\ell^Ty \le \alpha_\ell}$ is valid and supporting for ${P}$, such that ${x_k}$ does not lie on the hyperplane defined by ${(a_\ell,\alpha_\ell)}$. Two such pairs ${(x_k,(a_\ell,\alpha_\ell))}$ and ${(x_{k'},(a_{\ell'},\alpha_{\ell'}))}$ are adjacent in ${G}$ if ${x_k}$ lies on the hyperplane defined by ${(a_{\ell'},\alpha_{\ell'})}$, or ${x_{k'}}$ lies on the hyperplane defined by ${(a_\ell,\alpha_\ell)}$. For this graph ${G}$ we have

Lemma 3
$\displaystyle \chi(G) \le \mbox{xc}(P). $

How to use LaTeX math formulas on blogspot

This is not a post, it's a sandbox. I try out math formulas here to see if they work. The reason is that the preview feature of blogger doesn't parse LaTeX.

On Blogspot, I can use MathJax, which produces beautiful math formulas. (That's why I switched from Wordpress, where the formulas are super ugly.)

Let's have a simple formula, say $x=0$, and let's also try something complicated like $\sum_{k=0}^\infty \frac{1}{k^2}$. Ok.

What about displaystyle?
$$ \int_{-\infty}^\infty e^{-x^2} \, dx. $$

What about aligned things:
$$ \begin{aligned}
x&=y & a&=b \\
u&=v & c&=d &\text{and with a text, too.}
\end{aligned} $$

Did that work, too?

For $j=1,\dots,n$ and $t\ge 0$, let $X_j(t)$ be 0 if bin $j$ is empty after the $t$th ball has been thrown, and 1 if it is non-empty. Let $$ \begin{aligned} X(t) &= \sum_{j=1}^n (1-X_j(t)) &&\text{the number of empty bins after the $t$th ball has been thrown, and}\\ Y(t) &= \sum_{j=1}^n X_j(t) &&\text{the number of non-empty bins after the $t$th ball has been thrown.}\\ \end{aligned} $$ $$ \begin{aligned} \mathbb{E}X(t) &= n(1-\tfrac{1}{n})^t \\ \mathbb{E}Y(t) &= n\Bigl( 1 - (1-\tfrac{1}{n})^t \Bigr)\\ \end{aligned} $$