Research interests of Paul Kainen

As of July 2004

My research includes nonlinear analysis and approximation, topological graph embedding and immersion, theory of commutative diagrams, and mathematical biology. A list of publications is available. I am also interested in practical applications.

Nonlinear analysis

I am interested in the problem of ``learning from data'' which has recently been called ``Mathematical Learning Theory.'' (See F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc. (N.S.) 39 (2002), 1-49. MR 2003a:68118 F. Cucker and S. Smale, Best choices for regularization parameters in learning theory, Found. Comput. Math. 2 (2002), 413-428. MR 2003k:68089.)

However, I believe the current paradigms are not yet approaching the subtlety of biological learning. In particular, taking data as a set ignores the fact that each data ``point'' is a parallel measurement of the form (x,y) where x is in some high-dimensional space of inputs and y is a real number, but the collection of data may occur over time. Not only do such data have an order but also they can be associated in varying ways. For example, given three events A,B,C in an animal's history, can anyone doubt that what is learned depends not only on the sequence - say A,B,C - but also on whether A and B occur in quick succession, while C follows much later, versus A followd much later by B and C in quick succession (or symbolically, AB.C not equal to A.BC?

A better solution to learning from data could result from using ordering and associational aspects of the data - which can be viewed as structuring the data into a binary tree. It would be very interesting to consider the application of hypercomplex number systems to this problem - notably the quaternions and octonions. This would be a modern embodiment of Felix Klein's program for the fusion of arithmetic and geometry.

Working with the standard set-oriented paradigm, my colleagues Vera Kurkova , Andrew Vogt , and I have obtained some results on neural network approximation, especially in the case that the hidden units are Heaviside perceptrons. Our study has focused on subsets of normed linear spaces determined by finite linear combinations of half-space characteristic functions.

This amounts to studying neural nets of a special kind. Suppose one is given a digraph D = (V,A) whose underlying graph consists of two bipartite graphs, K_{d,n} and K_{n,1}, where the d input vertices send real numbers to each of the n ``hidden'' units and, in turn, each of the hidden units sends a real number to the single output unit. Transmission from input to hidden and from hidden to output is multiplied by a weighting factor (that depends on the particular link - i.e., the pair of units chosen) and each hidden unit computes the value it sends to the output unit by applying a given activation function to the sequence of its inputs. The two main types are the perceptron type in which the sequence of inputs is summed, a real number weight (bias) is added and then the activation function is applied, and also the ``radial basis'' case in which the sequence of inputs is regarded as a point in $d$-space and the hidden unit calculates the distance of this point from some given point and applies the activation function to the distance.

It is known that one can get an arbitrarily good approximation to any reasonable function from $d$-space to the reals by using sufficiently many hidden units in the network. (The sense of the approximation depends on the norm chosen, but such ``universal approximation'' theorems, which really amount to the assertion that the neural network functions are dense, hold for all of the reasonable choices, such as L_p , 1 \leq p \leq \infty, i.e., 1<=p<=oo.) Our research is about resulting rates of approximation, what happens when the activation function is the Heaviside function, generalization of the paradigm to more general classes of numbers and the more general case of arbitrary digraphs with no directed cycles. We have also studied uniqueness of parameterization, the utilization of neural network functions to replace linear subspaces in approximation theory and generalizations of Kolmogorov's notion of n-width.

Topological graphs

I've worked on crossing number and embedding problems for 30 years, and I co-authored a book on the Four Color Problem with Tom Saaty. The notion of "book thickness" (aka "page number") and some related ideas received its first mathematical treatment in my papers. These concepts have turned out to be useful in both computer science and the analysis of biological polymers.

Graphs in space include knots and links, which may also be considered in terms of planar sheets. Also, as the Jones Polynomial has shown recently, there are interesting connections between knot theory and quantum theory. I have a technical report Quantum interpretations of the four color theorem which discusses this is in more detail.

As for mathematical biology, I divide my interests into four areas:

Viewing organisms from the perspective of computation, one is led to ask: How do biological organisms determine their quantitative parameters? Even a single-celled creature must make strategic decisions (swim or tumble, for example) based on complex information. Biochemical pathways are adjusted so that various constraints remain satisfied. If these situations were encountered by an artificial system, we would call the process by which it produced an answer a ``computation'' but, until rather recently, biologists seemed to strongly reject the notion of such an abstract entity occuring within natural contexts. In recent years, however, the computational paradigm has been accepted by many biologists and is now regarded as a promising frontier in mathematics.

Category-theoretic models were proposed for biology by Robert Rosen and Rene Thom, among others. A more recent and extensive program is due to Andree Ehresmann and Jean-Paul Vanbremeersch. I proposed a naive adjoint model about 15 years ago: Let Biology and Physics be viewed as categories, with functors from B to P and P to B labeled "action" and "perception" where p is left adjoint to a. Hence, perception preserves colimits (a model for Gestalt) and action preserves limits (a model for synergies). My model and the E-V model make similar claims for aspects of perception.

Specific models of visual perception with topological aspects go back at least to E. C. Zeeman (1961). I have studied the visual perception of Lissajous figures as an instance of topological coherence, and I have also studied the phenomenon of color constancy under Land-Benton conditions using graph-theoretic results of Thomassen.

Two aspects of cognition interest me. First is the issue of whether quantum computation plays a role. Second is the question of complexity and robustness. In regard to this latter topic, I would like to be able to apply results obtained on forcing commutativity of a diagram from that of a sparse subset of its faces. Back to the index page.