next up previous
Next: Related Work Up: The Typed -Calculus and Previous: Expressive Power and Type-Based

Application to Information Flow Analysis: Qualitative and Quantitative.

This project chooses information flow analyses (henceforth IFA for short) as the PA topic this project will focus, out of many possible PAs. Flow analyses are central to various PA techniques, including control flow, i.e. how a thread of activation runs through program points, or a data flow, i.e. how a datum flows through a program text. IFA integrates these two kinds of traditional flow analyses, and is a fundamental part of secrecy analysis, which intuitively ensures the lack of leakage of information from a given program, the concern which goes back to the early work by Lampson and Denning&Denning [20]. As a very simple example, consider a call-by-value $\lambda$-term

\begin{displaymath}
(\lambda x.(2 \ast x))(y2),
\end{displaymath} (1)

where $y$ is a function from integers to integers and ``$\ast$'' denotes a multiplication. The result of evaluating this term can be either $(y2)$ times $2$ (if $y2$ terminates) or divergence (if $(y2)$ diverges). Thus both control flow and data flow exist from $y$ to the result, which are together counted as information flow. But if $(y2)$ always terminates, there is only a data flow: and, in $(\lambda x.1)(y2)$, there is only a control flow from $y$ to the result.

IFA is essential for secrecy analysis because, in practice, information leakage (which secrecy analysis wishes to preclude) are generally attributable to the existence of insecure information flow in a program. Here a flow is insecure if it transmits high-level (private) information to low-level (public) channels (so if $y$ is high in (1) and the result is low there is insecure flow). As a practical example, if a user downloaded an applet for calculating tax payment, the user does not want the applet to reveal her/his income information to the general public. Thus IFA is a fundamental instance of PA for software safety, which is one of the basic reasons why we choose IFA as a central topic of the present project. Another reason is because IFA integrates two other prominent flow analyses in PA, control flow analyses and data flow analyses, so that the full understanding of IFA will directly and indirectly lead to that of other flow-based PAs.

The final reason is the experience we have accumulated so far in the fields of IFAs. For a last couple of years [45,41,46,75,68], we experimented with the $\pi $-calculus-based framework to develop secure information flow analyses for higher-order procedure, general references and concurrency, presenting a basic paradigm of the use of $\pi $-calculi in PAs. Concretely, we first develop IFA for the $\pi $-calculus extended with state and concurrency, which is then reflected it onto programming languages, resulting in a series of type-based IFAs which are more powerful than the state-of-the-art techniques [,,,,,]. The correctness of the IFAs for languages is automatically ensured through their embedding into the $\pi $-calculus. The fine-grained type structure of the linear/affine $\pi $-calculus plays an essential role in developing the IFA. From our experience, a general picture of the $\pi $-calculus-based PAs has emerged, as found in Figures 1 and 2 at the end of this proposal, The experience gives us a firm basis for further extensions.

The present project considers one basic extension to this general picture, quantitative secrecy, which is another area of IFA the authors of the present proposal have an expertise in. The traditional secrecy analysis, including our own [45,41,46,75,68], gives us a method to obtain (a safe approximation to) an answer to the following question:

Does a given program incur information leakage?
Since we expect a Yes-No answer in such a secrecy analysis, it is generally known as qualitative secrecy. Recently researchers observed that, in practical situations, we need a more fine-grained notion of secrecy, which is often called qualitative secrecy. For example, consider a common bank or credit card which allows the user to withdraw cash from an ATM (Automatic Teller Machine). Usually the secret PIN (Personal Identification Number) is stored in encrypted form on the card itself. However -- even assuming that the encryption method used for encoding the PIN code is effectively unbreakable -- it is impossible to claim that the PIN code (or an application which uses it) is perfectly secure, since each time an incorrect trial by an attacker (who got hold of the card by, say, stealing) can obtain a piece of information. For example, assume that the PIN is a four digit number, then there are less than $10000$ possible codes available (as banks usually exclude certain combinations like $1234$, etc. as valid PINs). Every time the thief tries a random number a certain amount of information about the PIN is revealed. The point is that even though the chances of success of a trial is low (in the order of $10^{-4}$ in this instance), there is a small amount of information revealed even if the attacker is ``unlucky'' and the chosen number is rejected.

So, for all practical purposes, the ATM program has only to ensure PINs to be ``approximately secret''. From the viewpoint of program analyses, we wish to have a method which can answer the following more refined question:

How much information leakage does a given program incur?
Many practical secrecy examples are best analysed using quantitative secrecy. For example, one of the significant attacks against cryptography includes timing attacks, in which the time a program spends on a certain operation, such as decryption, leads to revelation of information. As shown in [25,26], vulnerability of programs w.r.t. timing attacks can be cleanly understood using quantitative secrecy.

Quantitative secrecy is an emerging field whose leading researchers include the authors of the present proposal (cf. [55]). Successful development of this theory in the context of the typed $\pi $-calculus would mean scaling its key ideas to a broad class of language constructs, which has been difficult so far.


next up previous
Next: Related Work Up: The Typed -Calculus and Previous: Expressive Power and Type-Based
Igor Siveroni 2004-08-16