Natural Language and Formal Language

Posted on May 23, 2019

Helen Keller with Anne Sullivan, in 1887

In the picture you see Helen Keller, a deaf-blind girl of almost seven, on the day in March 1887 when her teacher Anne Sullivan arrived, bringing a doll as a gift. Helen had not mastered human language, except for a few words. Anne’s first lesson was to spell d-o-l-l in Helen’s hand. Helen could copy this, and soon could copy other words spelled in her hand. But she could not comprehend the difference between m-u-g and w-a-t-e-r. And then, a month later, a miracle happened.

We walked down the path to the well-house, attracted by the fragrance of the honeysuckle with which it was covered. Some one was drawing water and my teacher placed my hand under the spout. As the cool stream gushed over one hand she spelled into the other the word water, first slowly, then rapidly. I stood still, my whole attention fixed upon the motions of her fingers. Suddenly I felt a misty consciousness as of something forgotten—a thrill of returning thought; and somehow the mystery of language was revealed to me. I knew then that “w-a-t-e-r” meant the wonderful cool something that was flowing over my hand. That living word awakened my soul, gave it light, hope, joy, set it free! There were barriers still, it is true, but barriers that could in time be swept away.

Helen Keller, The Story of My Life, many editions.

Helen Keller would remember the day Anne Sullivan arrived at her house as her soul’s birthday. Her soul was born with the birth of language in her.

It is highly significant, in the story of Genesis, that Adam gives names to the animals, just after creation, for it suggests that human language and consciousness developed together. Language allows us to use words in the absence of their objects, in order to evoke these objects in consciousness. Language allows us to share songs about the delights of the hunt at night, after the chase is over. Thus, human language use is closely tied to human imagination, that wonderful faculty for evoking images of what is not immediately present here and now.

We can only speculate how natural languages started to develop in early hominids. Human languages are flexible, open, evolving and morphing all the time, allowing an infinite number of meanings to be expressed by combining a limited number of symbols in an infinite number of different contexts.

Later on, much, much later, humans started to reflect on the formation principles of language. People seem to have an innate sense for how to say things with words in a particular language. This suggests that languages have formation rules, and that it is possible to capture syntactic regularities by listing the grammar rules, or syntax rules, of a given language. The European Renaissance consisted of the rediscovery of the cultures of antiquity. Scholars like Erasmus of Rotterdam (1466 - 1536) eased access to Latin and Greek by making the grammar and syntax rules of these languages explicit.

Still later, mathematicians started to design languages of their own. A pioneer in this field was the German philosopher and mathematician Gottlob Frege (1848 - 1925), who proposed the first so-called formal language, the Begriffsschrift. Later on, this became known as the language of predicate logic.

The remarkable thing about the new formal language was that earlier logical and philosophical difficulties with expressing mathematical statements such as “For every natural number there is a larger natural number that is prime” vanished completely. Thus, it is customary to view Frege as the father of modern logic.
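For illustration, here is how the example statement might be written in modern predicate-logic notation, which differs from Frege's original two-dimensional notation. The predicate name Prime and the choice to let the quantifiers range over the natural numbers are assumptions made for this sketch.

```latex
\forall n \, \exists p \, \bigl( p > n \wedge \mathrm{Prime}(p) \bigr),
\qquad
\mathrm{Prime}(p) \;:\equiv\; p > 1 \wedge \forall d \, \bigl( d \mid p \rightarrow ( d = 1 \vee d = p ) \bigr)
```

The nesting of "for every" and "there is" is carried by the order of the two quantifiers, so the sentence has exactly one reading.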

In the 20th century, another giant of modern logic used the language of predicate logic to prove an important result about the notion of computability. Alan Turing (1912 – 1954) was a pioneer in the truest sense of the word. Almost single-handedly he developed the concept of a programming language long before reliable computers existed.

Turing viewed programming as interacting with a formally defined machine (later called a Turing machine) using a language of imperatives. Here are the four ingredients of imperative programming:

Variable Assignment: <var> := <expr>
Conditional Execution: if <bexpr> then <statement1> else <statement2>
Sequential Composition: <statement1> ; <statement2>
Iteration: while <bexpr> do <statement>
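As a small illustration of these four ingredients, here is a sketch in Python, a language chosen for this example only. The function name next_prime and its variable names are made up for the illustration. Apart from the function wrapper, the body uses nothing but assignment, conditionals, sequencing and while loops, and it finds the prime promised by the statement "for every natural number there is a larger natural number that is prime".

```python
def next_prime(n: int) -> int:
    """Return the smallest prime strictly larger than n."""
    candidate = n + 1                    # variable assignment
    found = False
    while not found:                     # iteration
        if candidate < 2:                # conditional execution
            is_prime = False
        else:
            is_prime = True
        divisor = 2                      # sequential composition: one statement after another
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False         # a proper divisor was found
            divisor = divisor + 1
        if is_prime:
            found = True
        else:
            candidate = candidate + 1    # try the next number
    return candidate
```

Python lets the else branch be omitted where it would do nothing. In the four-ingredient language one would write an explicit do-nothing statement there.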

The remarkable thing is that these four simple ingredients make for a Turing complete programming language. A programming language is Turing complete if it is powerful enough to simulate a single-tape Turing machine.
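To give a feeling for what such a simulation involves, here is a minimal sketch of a single-tape Turing machine simulator in Python. Everything named here is an assumption made for the example: the function run_turing_machine, the rule format (state, symbol) -> (new state, symbol to write, move), the convention that the machine halts when no rule applies, and the step limit. Python's dictionaries are used as a convenience for the tape and the rule table, but the control flow consists only of assignment, conditionals, sequencing and a while loop.

```python
def run_turing_machine(rules, tape_input, start, blank="_", max_steps=10_000):
    """Run a single-tape machine until no rule applies or max_steps is reached.

    Returns the final state and the tape contents with outer blanks removed.
    """
    tape = dict(enumerate(tape_input))   # sparse tape: position -> symbol
    state = start
    head = 0
    steps = 0
    halted = False
    while not halted and steps < max_steps:
        symbol = tape.get(head, blank)   # unwritten cells read as blank
        if (state, symbol) in rules:
            state, write, move = rules[(state, symbol)]
            tape[head] = write
            if move == "R":
                head = head + 1
            else:
                head = head - 1
            steps = steps + 1
        else:
            halted = True                # assumed convention: no rule means halt
    contents = "".join(tape[i] for i in sorted(tape)).strip(blank)
    return state, contents


# A hypothetical example machine: append one stroke to a unary number.
SUCCESSOR_RULES = {
    ("scan", "1"): ("scan", "1", "R"),   # skip over the existing strokes
    ("scan", "_"): ("done", "1", "R"),   # write one more stroke, then stop
}

final_state, final_tape = run_turing_machine(SUCCESSOR_RULES, "111", "scan")
```

The step limit is there only so that the sketch always returns. A real Turing machine has no such limit, which is exactly why the question of whether it halts is interesting.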

It is believed that Turing complete languages can express any function that can be computed by an algorithm. This article of faith is called the Church-Turing thesis. This is an article of faith because the statement can be viewed as an explanation of what we mean (and could possibly mean) by the concept of an algorithm.

With his invention of the Turing machine and the concept of imperative programming, Turing was able to prove that (a variation of) the language of Gottlob Frege is undecidable, or, in other words, that it is possible to use first-order predicate logic to pose a question that no computing machine can answer. He did this by formulating the halting problem as a formula of predicate logic. The formal proof of the undecidability of first-order predicate logic consists of:

A very general definition of computational procedures.
A demonstration of the fact that such computational procedures can be expressed in first-order logic.
A demonstration of the fact that the halting problem for computational procedures is undecidable.
A formulation of the halting problem in first-order logic.
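The third step in this list rests on a diagonal argument, which can be sketched in a few lines of Python. This is an illustration of the idea and not Turing's own formulation. The names make_contrarian and always_no are hypothetical, and the "decider" is any function that claims to predict whether a given program halts.

```python
def make_contrarian(claims_to_halt):
    """Build a program that does the opposite of what the candidate decider predicts."""
    def contrarian():
        if claims_to_halt(contrarian):
            while True:          # the decider said "halts", so loop forever
                pass
        else:
            return "halted"      # the decider said "loops", so halt at once
    return contrarian


def always_no(program):
    """A deliberately naive candidate decider: it claims that nothing halts."""
    return False


contrarian = make_contrarian(always_no)
result = contrarian()            # returns, so the candidate's prediction was wrong
```

A candidate that answered True for its contrarian program would be refuted in the opposite way, but running that case would never return, so it is left out here. Whatever the candidate answers about its own contrarian program, the answer is wrong, so no candidate can be right about all programs.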

In the course of the 20th century, formal language theory and logic became cornerstones of the new discipline of computer science. Type theory developed, first to deal with paradoxes in set theory, later as an important ingredient in a new programming paradigm called functional programming.

In 1991 a joint four-year-long project was started between ILLC at the University of Amsterdam, CWI in Amsterdam, and Uil-OTS and the department of Philosophy at the University of Utrecht to study structural and semantic parallels between natural languages and formal languages. I had the privilege of leading that project.

Are there similarities between how texts are structured and how programs are structured?
Are there connections between how natural languages and formal languages handle concepts like quantity and number?
Are there parallels between how knowledge is represented in natural language and how formal knowledge representation languages encode knowledge?

The assumption was that in these areas the methods and techniques of natural language analysis and programming language analysis are mutually applicable.

In the course of the project, linguists and philosophers learned about programming, and computer scientists learned about topics like pronominal reference and quantification in natural language. The project was truly interdisciplinary and great fun. This was a time of exciting workshops where linguists explained their theories of pronominal reference to computer scientists and philosophers, where philosophers and logicians applied the theory of generalized quantifiers to natural language, where connections between type theory and categorial grammar were gradually made clear, and where dynamic logics were developed that applied to programming as well as natural language analysis. This was the heyday of structured interdisciplinary action, and in this important period in the history of ILLC the seeds were sown for many later developments.