The Letter Spirit project, which seeks to model typeface design in the domain of a low-resolution grid, takes up a challenge - originally proposed in the early Eighties by Douglas Hofstadter - to write a program which will start with seed letters in a gridfont and design the remaining lowercase roman letters in a consistent style. The work proposed here is nominally Part Two, the first having been recorded by Gary McGraw in his dissertation work. His PhD thesis [McGraw '95] describes an implementation of the Examiner, a gridletter-recognizing module designed for use in a full gridfont designer. Since then, the McGraw Examiner has been modified by John Rehling [Rehling and Hofstadter '97] for quicker, more accurate, and more principled behavior in its recognition of gridletters. This proposal is for the development of a gridfont-designing program using the modified Examiner as a starting point and using architectural ideas found in other cognitive models developed by the Fluid Analogies Research Group. Whether or not a Part Three is ever put forth as dissertation work, Letter Spirit has boundless opportunity for elaboration and further research, and the work proposed in this document will surely not exhaust the project. Two succinct goals motivate the work proposed herein. First, the first program to create novel and consistent gridfonts using a distinctively FARG-like architecture will be developed. Second, this program's capabilities will constitute an empirical display of certain principles that we believe are integral to creativity.
The Gridfont Domain
First, it will be necessary to describe (very briefly) the way of thinking about letters and Letter Spirit that Hofstadter and McGraw have laid out. This is not intended to completely recapitulate such descriptions as found in [Hofstadter and FARG '95], but simply to re-acquaint the reader with the nomenclature.
A specific graphical representation of a letter is a letterform, which will have category membership in one (or possibly more) letter categories. Letters have abstract, Platonic letter conceptualizations, each of which consists of a set of roles. A role is not a shape, but a description which constrains the possible shapes that can graphically represent that role. A specific shape that adequately meets the conditions of an abstract role is a role filler, and the fillers of any given role may vary in many ways from each other, and have only a family resemblance to one another. A collection of role fillers, correctly assembled, are the parts in a letterform. This is partially summarized in Table 1.
| | Abstract, Platonic | Specific, Graphical |
| Whole | Letter Conceptualization = "Whole" | Letterform |
| Component | Role | Role Filler = Part |
The way in which a role filler deviates from the generic properties of its associated role is called its slippage. Most letter conceptualizations consist of multiple roles, and abstract information about how they should interact (in terms of position, crossing, touching and distance at various points) goes by the rubric of the relational-roles, or r-roles, for that letter conceptualization. Letter conceptualizations are sometimes referred to as wholes, and there may be multiple wholes representing the same letter category. The letterforms we use are rendered on a grid consisting of 56 quanta, each of which is a horizontal, vertical or diagonal line segment connecting adjacent points in the 3 by 7 grid. Some sample gridletters will be seen later in the paper in Figure 1.
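The count of 56 quanta can be checked by enumerating the segments directly. The sketch below assumes grid points indexed as (column, row) with 3 columns and 7 rows, and represents a quantum as a pair of adjacent points; the names `all_quanta` and `orientation` are illustrative and are not taken from any Letter Spirit implementation.

```python
from collections import Counter
from itertools import product

COLS, ROWS = 3, 7  # grid points: 3 across, 7 down (assumed indexing)

def all_quanta():
    """Enumerate every segment joining two adjacent grid points, once each."""
    # Forward-only offsets: horizontal, vertical, and the two diagonals.
    offsets = [(1, 0), (0, 1), (1, 1), (1, -1)]
    quanta = set()
    for (c, r), (dc, dr) in product(product(range(COLS), range(ROWS)), offsets):
        c2, r2 = c + dc, r + dr
        if 0 <= c2 < COLS and 0 <= r2 < ROWS:
            quanta.add(((c, r), (c2, r2)))
    return quanta

def orientation(quantum):
    """Classify a quantum as horizontal, vertical, or diagonal."""
    (c1, r1), (c2, r2) = quantum
    if r1 == r2:
        return "horizontal"
    if c1 == c2:
        return "vertical"
    return "diagonal"

counts = Counter(orientation(q) for q in all_quanta())
print(len(all_quanta()), dict(counts))
```

The total breaks down as 14 horizontal, 18 vertical, and 24 diagonal quanta, which sum to 56.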
The grid allows considerable variety in the style of gridletters rendered therein. While it restricts a great deal of freedom enjoyed by designers of traditional typefaces, which permit arbitrary precision and detail, it is still a rich domain. Hundreds of full gridfonts have been created by human designers and are available at the CRCC. The presence of a single quantum on the grid is always significant and noticeable. The benefit of the grid is that it de-emphasizes the low-level perception which is the domain of the retina, and focuses the project squarely on high-level perception and cognition.
Creativity and Perception
In nearly all creative acts, there are a wide range of possible directions to explore, and rarely if ever is there but one possible correct answer. Many (perhaps uncountably many) themes and issues can be at work in even the simplest works of art. Numerous elements in the work of art permit some range of variation, in value, in color, in shape, in facial expression, in cultural referent, etc. Pressures to select various aspects that could go into the work interact, conflict, and compromise. The artist has a repertoire of design features that can be put into the work, which vary from artist to artist, limited at times by the palette, more often by the imagination. In the following discussion, the term "palette" will be used in an extended sense to refer to whatever set of tools a creator has with which to render. While it may seem too obvious to merit pointing out, the importance of the repertoire of tools as an entity in any creative endeavor bears elaboration. A child frolicking in a playpen, a diplomat dealing with a tense situation, and an artist involved on a project all have their own repertoires of tools that they can apply in their own art. Each has a finite repertoire, but productive systems with finitely-many primitives (the field postulates for addition and multiplication of real numbers, for instance) can nonetheless have the potential to generate any of infinitely-many end products. The domain does not uniquely determine the palette used for creating in that domain. Artists differ in which tools they possess. Generally a larger palette will enable greater variety in the art; the history of art is in large part the story of pioneers discovering new tools and adding them to the whole world's palette.
An artistic medium can impel certain restrictions on the palette used in that medium. For instance, the use of color has not commonly been part of typeface design. (Although certain "radicals" experiment with this and push the boundaries of what the domain is.) The Letter Spirit grid will strictly enforce a number of constraints, and shrink the designer's palette considerably; however, as is the case with more general typeface design, enormous freedom remains.
So far, the discussion of palettes has focused on those used in creation, but it is also important to consider those used in perception. It has always been tempting to imagine that there is an objective way of seeing the world, and that it is how we humans see the world. However, what we see is what comes through the filter of our perceptual palette, and is limited to the image that those tools can build in the mind. Limitations exist in the low-level sensory apparatus (we cannot see ultraviolet light or sense magnetism), but also on higher levels. We do not notice a harmony between colors whose wavelengths are related by simple ratios, although we do have that ability with sound. Most of us are not adept at sensing the path through the woods that a running deer would take. We cannot understand foreign languages with which we have no experience nor decode the nonverbal signals that a close couple exchanges. Clearly, the perceptual palette is not fixed for all people. Adding tools to our high-level perceptual palette is a - if not the - major task of education. An enumeration of tools for the creative palette of developing artists can be found in [Lauer '79]. The existence of certain distinct perceptual tools can be seen in the limitations imposed upon patients suffering brain damage [Thompson '93].
Any particular implementation of the Letter Spirit program will have a certain perceptual palette available to it. With the current version of the Examiner, this is certainly a subset of the tools available to a practiced human perceiver of roman letters. For instance, parts composed of non-contiguous sets of quanta are not allowed. Parts which have three tips will be broken into smaller pieces [McGraw '95]. Parts which require the viewer to make a figure-ground inversion will not be perceived correctly, nor will parts which require the viewer to deduce a three-dimensional structure. It is not clear exactly which perceptual tools a normal person has, but the Examiner admittedly falls short of this. We can imagine a conceptual space of possible gridfonts, with plainer and more mundane gridfonts near the center, and eccentricity increasing as we work outward. We should expect to find that a perceiver will cease to recognize letters successfully beyond a certain boundary, depending upon the tools in the palette. Adding tools will push the boundary outward. Likewise, a creator of gridfonts will be limited by the available palette, and will be able to venture outward in this conceptual space only so far as the tools at hand permit. The claim for the proposed implementation of Letter Spirit is not that the tools will match those that people have, but that the way in which the tools are used is very human-like. Thus, as with the Examiner, the test of Letter Spirit will not be that the program can function fully within the boundary that human designers and perceivers can, but that it manages well within a boundary that allows for considerable variety. In order to succeed at this, Letter Spirit will have to use its perceptual and creative tools together, imitating the high-level task of a human designer.
Central feedback loop of creativity
An artist needs to be a good appreciator of art because review and revision are an essential part of the creative process. An important idea behind the organization of Letter Spirit is that there is a central feedback loop of creativity - that any candidate letter for use in the output gridfont will be evaluated after creation to make sure that it is acceptable, both with respect to the intended letter category and the style of the other letters. There is great power associated with the ability to review critically and revise, and theories of creativity often include this [Boden '90], [Schank and Childers '88]. Many types of activity benefit from Critical Revision (henceforth, "CR"). Commentaries on space probes often describe the precision in their long trajectories as being akin to a golfer sinking a putt from some vast distance. This is a faulty comparison, however, because the disposition of the space probe is frequently gauged and its course adjusted, while a golf ball can only be directed in the split second that the club strikes it. Likewise, it is not so remarkable that one can leave Indiana in a car, and park it in a pre-determined parking spot in California with excellent accuracy when the driver is constantly receiving feedback on where to steer, and may (and should be able to!) correct the wrong turns that are liable to happen on a long trip. For a blindfolded driver, however (and one without auditory, somatosensory, kinesthetic or other feedback on progress), this feat would be impossible. If a robot driver were completely unerring in ability to turn the wheel just the right amount to follow a course which had been perfectly known in advance (granting the very big concession that only static, unmoving obstacles exist in this world), then the trip may be conceivable, but without feedback, that sort of accuracy and control is unheard of, either in people or machines. 
In the real world, a cross-country trip (excepting those set on rails), the composition of a symphony, the sculpting of a statue, and the design of a gridfont all require frequent review and revision.
The problem with a long-distance car trip for our blind traveler with the good memory is that tiny errors multiply after a time. Even if the course to California were perfectly straight, a misdirection that would be invisible to the keenest eye at departure would lead to a detour bigger than a football field on (what should be) arrival. Long before reaching Iowa, our friend would likely be in a ditch, even in idealized circumstances. Of course, not all enterprises require CR. Certain (short) golf putts and billiards shots can be made with great confidence that the target will be reached successfully. Formally, there is no dichotomy between those undertakings which require CR and those that don't, for this is a matter not only of the task and the medium, but of the standards for the product as well. If one's goal were simply to reach the Pacific, rather than a particular driveway (and if the whole country were smooth and paved), then navigation would be simplified immensely. If one does not aspire to a very high degree of aesthetic quality, then blindfolded painting might be suitable. And if the standards of gridfont design were low enough, then a feedforward process of design without review (if the designer could suppress the internal visualization that would inevitably occur in any human designer's mind) might suffice. An example of a gridfont-producing program which does not have the capability to review its own work shows the importance of CR and of having alternative methods of producing a gridletter from those which preceded it. The output of a model by [Grebert et al. '91] can be seen below:

This gridfont was created by giving the unshaded gridletters as input to a three-layer connectionist network trained on the task of gridfont creation, which produced the remaining letters, those shaded in gray. The bizarre, jagged nature of many of these letters highlights several shortcomings in the approach, the most serious of which is that the program cannot try to correct substandard output. More complex feedforward networks with larger training corpora (the above network saw only five gridfonts in training) could improve the quality of the first drafts, but without CR, such a network would need to create more-or-less perfect first drafts. This is akin to trying to do away with the constant guiding of a spacecraft by making its initial launch so super-precise that no trajectory adjustments will be needed during flight.
In the more general domain of commercial typeface design, a professional human designer, Daniel Pelavin, stepped through the process he used in creating a typeface using various tools, ranging from paper and pencils to a computer. Although the account does not describe the process as completely as one might wish, fine-tuning and adjustment of drafts of letters still occur throughout. Even though his revisions do not fall into discrete phases, at least a half-dozen stages are mentioned which involve some alteration of his letterforms [Abes '94]. Doing away with CR would require either a significant lowering of standards or a drafter with more skill than a talented and experienced human designer.
Despite the fact that creativity has been explored in numerous AI models [Boden '90], and certainly many programs subject their output to tests to guide internal refinement (for example, a program that computes a square root by successively taking midpoints), we know of no program which performs a test for the aesthetic, rather than merely practical, aspects of its tentative outputs. Coupling a general gridletter designer with a sophisticated aesthetic evaluator, Letter Spirit will bring a new and powerful approach to computer creativity, and should be capable of subtle creative powers. Intuitively, this should lead to more consistency in the quality of the output, and free it from the need that many computer models of creativity have, to have a human edit the output and select only the gems for general consumption.
FARG architecture: LS modules
All processing is to be completed by four modules, which divide the entire task according to phases of the design process. The first, the Examiner, has a robust working implementation developed by Gary McGraw [McGraw '95] and honed by John Rehling [Rehling and Hofstadter '97]. The Examiner decides upon letter category for a given gridletter, and its task is similar to that of an optical character recognizer. Unlike traditional OCR programs, it also returns a parsing of the gridletter into meaningful parts; the importance of this will be made clear later. The Adjudicator also evaluates a gridletter, but handles the complementary task of resolving the style used in the gridletter's rendering. The Imaginer is the module which makes decisions about the abstract properties of a new gridletter being designed, coming up with an abstract and complete plan for its creation. The Imaginer knows nothing about the grid, but may select among possible role conceptualizations, and select some properties of the eventual role fillers. Finally, the Drafter begins its work where the Imaginer leaves off, and turns the abstract plan for a letter into an actual letterform on the grid.
The four modules are to operate on the same data structures (although not every module will read or alter the contents of every data structure). Because many independent agents are coordinated via the common data structures, one can see the inspiration of the Blackboard model of Hearsay II [Erman et al. '80]. Most basic is the Scratchpad, where 26 slots will hold the original seed letters plus all gridletters created subsequently. The Scratchpad is just a repository, a virtual piece of paper. No operations more complex than copying-to and copying-from will take place there. Most of the work the program does will happen to one gridletter at a time, and the locus of this action is the Workspace. There, gridletters are recognized and dissected by the Examiner and Adjudicator, and are built up by the Imaginer and Drafter. The Thematic Focus is where stylistic information for the gridfont is built up by the Adjudicator (and to some extent, the Examiner) and is used by the Imaginer and the Drafter in the creation of new letters. Letter Spirit's long-term memory for concepts, the Conceptual Memory, is a network of nodes that represent permanent concepts. Each node has a variable level of activation, a value on a continuous scale that increases or decreases the involvement of that concept in further processing. In addition, the nodes have hard-wired connections so that activation can spread between related concepts. Concepts in the Conceptual Memory include those for roles, role sets, properties of quanta such as horizontal or diagonal, properties of roles such as tall or wide, and potentially many others. An important part of the thinking behind Letter Spirit is that a long-term memory may be vast, with only a portion of it active at any time; this principle is axiomatic to realms of cognitive science as diverse as Turing Machines and the psychological literature on human memory.
This principle is realized as the Workspace and Thematic Focus hold relatively small amounts of information at any given point in time, and only a portion of the nodes in the Conceptual Memory will tend to be active at a time.
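A minimal sketch of spreading activation over such a network follows. Everything numeric here is an assumption made for illustration: activations are clamped to the range 0 to 1, and the `rate` and `decay` parameters, the link weight, and the node named `hypothetical-role` are invented, since the text does not specify the update rule.

```python
class ConceptNode:
    """A permanent concept with a variable activation level (assumed range 0..1)."""

    def __init__(self, name, activation=0.0):
        self.name = name
        self.activation = activation
        self.links = []  # hard-wired links: list of (neighbor, weight)

    def link_to(self, neighbor, weight):
        self.links.append((neighbor, weight))


def spread(nodes, rate=0.2, decay=0.1):
    """One synchronous step: every node decays and receives activation from its sources."""
    incoming = {node.name: 0.0 for node in nodes}
    for node in nodes:
        for neighbor, weight in node.links:
            incoming[neighbor.name] += rate * weight * node.activation
    for node in nodes:
        updated = (1.0 - decay) * node.activation + incoming[node.name]
        node.activation = min(1.0, max(0.0, updated))


# Hypothetical fragment of a network: the property "tall" feeds one role node.
tall = ConceptNode("tall", activation=1.0)
role = ConceptNode("hypothetical-role")
wide = ConceptNode("wide")
tall.link_to(role, 0.5)

network = [tall, role, wide]
spread(network)
```

After one step the linked node has picked up some activation while the unlinked node stays inactive, which is the sense in which only a portion of the memory is active at a time.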
Perhaps the most defining characteristic of the architecture is the Coderack, where many codelets wait until they are run, one by one. Codelets are relatively short computer routines which perform small operations, no one of which should do very much of the program's work. A few codelets are posted to start each module, and codelets may post new codelets so that the Coderack generally will sustain itself. Codelets are selected nondeterministically from the Coderack and are removed as they run. They can be thought of as function calls, with parameters determined at the time they are posted to the Coderack. Virtually all of the activity in Letter Spirit is carried out by the action of codelets. At any given point in time, the codelets on the Coderack may be serving but a few high-level trends or pressures, and because each of these gets (in all likelihood) only a small number of codelets run before another pressure gets some work done for it, the pressures act in implicit parallelism as each vies to have its way. In the end, whatever processing is done must be a good compromise between all the various high-level pressures invoked. Having a system's behavior emerge from the combined action of many small agents has been inspired by and likened to the work of an anthill as carried out by many ants, or the activity in a cell, with countless proteins each performing a tiny part. By using information the system has gathered midway through a run to influence the amount of computational effort spent on each option, the program carries out a parallel terraced scan, exploring many pathways to a solution with preferential consideration to those which seem most promising [Hofstadter and FARG '95].
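The Coderack mechanics described above can be sketched in a few lines. This is an illustrative toy, not the Letter Spirit implementation: the `Coderack` class, its method names, and the `explore` codelet are hypothetical, and the numeric urgency used as a selection weight anticipates the notion introduced with Temperature.

```python
import random


class Coderack:
    """Holds posted codelets and runs them one at a time, chosen at random."""

    def __init__(self, rng=None):
        self.codelets = []  # each entry: (urgency, function, args)
        self.rng = rng or random.Random()

    def post(self, urgency, fn, *args):
        """Post a codelet; its parameters are fixed at posting time."""
        self.codelets.append((urgency, fn, args))

    def run_one(self):
        """Pick one codelet with probability proportional to urgency, remove it, run it."""
        weights = [urgency for urgency, _, _ in self.codelets]
        index = self.rng.choices(range(len(self.codelets)), weights=weights)[0]
        _, fn, args = self.codelets.pop(index)
        fn(self, *args)

    def run(self, max_steps):
        """Run until the rack is empty or max_steps codelets have run."""
        steps = 0
        while self.codelets and steps < max_steps:
            self.run_one()
            steps += 1
        return steps


def explore(rack, log, pressure, depth):
    """Toy codelet: do a little work for one pressure, then post a follow-up."""
    log.append(pressure)
    if depth > 0:
        rack.post(1.0, explore, log, pressure, depth - 1)


log = []
rack = Coderack(random.Random(0))
rack.post(3.0, explore, log, "pressure-A", 3)
rack.post(1.0, explore, log, "pressure-B", 3)
steps_run = rack.run(max_steps=100)
```

Because selection is random, the entries for the two pressures may interleave in `log`, which is the implicit parallelism the text describes; only the totals (four codelets per pressure here) are fixed.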
This interaction between high-level pressures plays a vital role in the behavior of Letter Spirit. The way that many pressures act together tends to make the behavior of the system well-informed, with many perspectives having had a chance to influence the course of action. To enhance this positive quality, the proportion of codelets devoted to each high-level pressure will reflect the intrinsic goodness of pursuing that tack. Ideally, the more favorable a direction is (or seems), the higher the proportion of codelets it will have run. Often, the activations in the Conceptual Memory will be used to determine the number of codelets that are devoted to exploring a particular concept. This is rather like the use of heuristics in many Classical Artificial Intelligence approaches to search, but simply prioritizes the options rather than selecting one to the exclusion of others, and allows many to be explored in parallel.
The last element in the architecture's data structures is the Temperature, simply a real number from 0 to 100. This is a sort of inverse "goodness", with high Temperature corresponding to situations where the system has not thus far built up much useful structure. Temperature focuses the behavior of Letter Spirit by determining how much weight to give each codelet when the virtual roulette wheel is spun which picks the next codelets to be run. An urgency is assigned to each codelet when it is posted to the Coderack, and the lower the Temperature is, the more this is the only criterion used in weighting the selection process. At high Temperature, the codelets all have nearly the same probability of being chosen. The aphorism "Drastic times call for drastic measures" captures the motivation behind this, and this phenomenon can be seen in many complex systems, from models using simulated annealing to the choices made by an electorate.
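One way to realize this weighting is to raise each urgency to a power that shrinks as Temperature rises. The exponent formula below is an assumption chosen only because it reproduces the two limiting behaviors described in the text; the proposal does not give the actual formula, and `selection_probabilities` is a hypothetical name.

```python
def selection_probabilities(urgencies, temperature):
    """Return the chance of picking each codelet from the virtual roulette wheel.

    Assumed rule: weight = urgency ** ((100 - T) / 100), with positive urgencies.
    At T = 100 the exponent is 0, so all weights are equal.
    At T = 0 the exponent is 1, so weights are proportional to urgency.
    """
    exponent = (100.0 - temperature) / 100.0
    weights = [urgency ** exponent for urgency in urgencies]
    total = sum(weights)
    return [weight / total for weight in weights]


urgencies = [1.0, 4.0, 5.0]
hot = selection_probabilities(urgencies, temperature=100)
cold = selection_probabilities(urgencies, temperature=0)
```

With these inputs, `hot` gives every codelet a one-in-three chance, while `cold` gives chances of 0.1, 0.4, and 0.5, so urgency alone decides the weighting.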
Although the Examiner is but a small part of the complete model, the architecture and principles used in other modules will relate closely to those of the Examiner, which in turn was influenced by the commonalities of architectures of other projects developed at the CRCC [Mitchell '90], [French '92]. All of the aforementioned data structures except the Thematic Focus are involved in the workings of the Examiner, and the type of processing that has been so successful in the Examiner will be adapted to the other modules.
Style defined
The study of aesthetic quality has its roots in antiquity. Despite the attention the subject has received, progress has been neither sure nor swift. In the use of the scientific method here, there is difficulty in finding easily-testable hypotheses of much relevance to the topic, and the lack of successes here led to much pessimism. Despite the opposing stance of hardline behaviorism, progress in this area has come from inferring internal representations and states of affect, and testing to see if these models are consistent with a range of experimental phenomena [Crozier and Chapman '84]. For centuries, the area had remained poorly-understood, in part due to the unscientific methodology wherein a thinker would pontificate at length on their ideas regarding aesthetics, with no accountability for these ideas being shown to work in an actual behaving system. Letter Spirit is an exciting experiment in using new tools - not only the computer, but also the insights of experimental cognitive psychology - to unravel old mysteries.
A typeface (or gridfont) reflects in each letterform two forces which conflict during creation. An incentric force draws the letterform inward as a central member of its letter category. An eccentric force seeks to deviate from this, making it distinctive and giving it character. (This distinctiveness should be similar to that of the other letterforms in the typeface, thus achieving coherence.) The asymmetry between the two is profound. For Letter Spirit (and in practice, in many other domains of typeface design), the alphabet is fixed to 26 possible categories, while the number of possible styles in which that alphabet can be rendered is vast. The program looks at a novel gridletter, seeks one of 26 letter categories that fits (or the 27th option: finding no suitable answer), and then afterwards seeks to pick out the style. (A program which systematically sought a complete understanding of style first and afterwards tried to recognize the letter category would face certain difficulties; the issues are too many to delve into deeply here.) In recognizing letter category, the Examiner looks for correspondences between the shapes on the grid and the abstract roles in its Conceptual Memory. Style recognition, then, will look for the discrepancies between a gridletter and the idealized abstractions of its letter category.
These discrepancies can take two possible forms. First, the gridletters may have aspects which are explicitly in conflict with the abstract norms for those letters. Second, the gridletters may exhibit distinctive and recurring properties which are not in direct conflict with their letter identities, but nevertheless suggest a pattern that can mediate commonality between all the gridletters in the gridfont. The stylistic properties of the first kind will be known as norm violations. If a role filler lacks a particular property expected for its abstract role, that does not destroy its membership in the category, although it makes it a weaker, or less prototypical member of the category. A survey of ideas relevant to this way of thinking about categories can be found in [Lakoff '87]. The stylistic properties which do not directly conflict with the abstract role definitions are, for general typefaces, innumerable in variety, and include stylistic options and parameters such as serifs, the thickness of curves in various places on the letterform, and many more. For the purposes of Letter Spirit, there will be two ways of representing stylistic character of this kind. Abstract rules are properties that may either be present or not for a given gridletter, and may be considered in effect for a gridfont (or gridfont in progress) if they apply to all or most of the gridletters therein. Abstract rules tend overwhelmingly to be "Thou shalt not" rules when expressed in the most straightforward way, forbidding rather than requiring some property. These will include rules which forbid quanta of a certain orientation, angles of certain measure, quanta within certain zones of the grid, and continuous collinear stretches of segments of certain lengths. The second kind of property which does not necessarily cause conflict with the abstract role definitions is that of motifs. A motif is a particular shape which recurs in numerous gridletters. 
This can be thought of in a number of levels of literalness. Most literal would be the incorporation of the exact same quanta in multiple gridletters. Less literal would allow the same shape to be translated about the grid. Still less literal, but still incorporating the same motif, would be a shape allowed to undergo reflection or rotation. A motif could be as small as one quantum or of arbitrarily large size. The larger a motif is, the more noteworthy it is to see it occur in multiple gridletters. These notions of style make up Letter Spirit's perceptual palette, and will have a significant part in the program's creative palette. While it is clear that this does not match the palette that a human designer (or recognizer) is capable of, a survey of sample gridfonts shows that these properties do seem to explain the style within many of them. The ambition of this proposal is for Letter Spirit to show competence for gridfonts within these bounds.
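The levels of literalness for motifs can be illustrated with a small matcher. This sketch assumes a gridletter and a motif are each a set of quanta, with a quantum represented as an unordered pair of (column, row) points; the function names and the level names "exact", "translated", and "reflected" are hypothetical, only left-right reflection is handled, and rotation is omitted for brevity.

```python
COLS, ROWS = 3, 7  # assumed grid dimensions in points


def quantum(p1, p2):
    """An unordered segment between two adjacent grid points."""
    return frozenset({p1, p2})


def translate(shape, dc, dr):
    """Shift every point of a shape by (dc, dr)."""
    return frozenset(
        frozenset((c + dc, r + dr) for (c, r) in qu) for qu in shape
    )


def mirror(shape):
    """Reflect a shape left-to-right (columns negated; translation re-aligns it)."""
    return frozenset(frozenset((-c, r) for (c, r) in qu) for qu in shape)


def contains_motif(letter, motif, level="exact"):
    """Check whether the motif occurs in the letter at the given literalness."""
    if level == "exact":
        return motif <= letter
    shapes = [motif]
    if level == "reflected":
        shapes.append(mirror(motif))
    for shape in shapes:
        for dc in range(-2 * (COLS - 1), 2 * (COLS - 1) + 1):
            for dr in range(-(ROWS - 1), ROWS):
                if translate(shape, dc, dr) <= letter:
                    return True
    return False


# Motif: a diagonal step followed by a vertical quantum.
motif = frozenset({quantum((0, 0), (1, 1)), quantum((1, 1), (1, 2))})
# Same shape moved two rows down, plus one extra quantum.
letter_a = frozenset({
    quantum((0, 2), (1, 3)), quantum((1, 3), (1, 4)), quantum((1, 4), (2, 4)),
})
# Mirror image of the motif.
letter_b = frozenset({quantum((2, 0), (1, 1)), quantum((1, 1), (1, 2))})
```

Here `letter_a` contains the motif only once translation is allowed, and `letter_b` contains it only once reflection is allowed, matching the ordering from most to less literal given above.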
Examiner and optimizations
Because the Examiner has already been implemented and described in great detail in [McGraw '95], and its optimizations discussed in [Rehling and Hofstadter '97], there will be no attempt to duplicate or expand upon those descriptions here. It should be instructive, however, to explain this module at the same level of description as is used with the Adjudicator and Drafter below. The differences between the three should raise interesting issues, while the similarities emphasize how the FARG architecture may be applied to many different problems.
While the Examiner is running, the Workspace contains all the quanta of the gridletter it is trying to recognize. At any point in time, every quantum is a member of exactly one part, and the quanta in a part are necessarily contiguous. Thus, the gridletter is parsed; there will be many possible parsings, but not all will be equally likely or useful. Ideally, the Examiner will find a parsing such that each part is a good filler of some role, and that the filled roles correspond to some letter category's role set.
Early on, Gestalt codelets give small amounts of activation based on holistic properties of the entire gridletter. This activation spreads down to the roles, and constitutes an important preliminary to the parallel terraced scan, allowing processing to concentrate on the best possibilities, potentially saving great effort later at the cost of a small investment early on. A quick, tentative parsing is applied to the gridletter, grouping quanta together in accord with principles found by [Palmer '78]. Given a parsing, the Examiner tries to label the parts. Labels are properties that can apply to a shape. These labels (for instance, "tall") are not Boolean properties, and are attached probabilistically, but with a higher probability for more appropriate parts. When parts are labelled, sparking can occur. Sparking is a process in which parts are used to trigger activation of roles. Roles are also defined as collections of labels (in addition to relationships between the role and role sets, which are not directly involved in sparking), and the extent to which a part sparks a role is determined by how well the labels of the gridbound part and the abstract role correspond. Activation of roles spreads upwards to their associated wholes, and then back down from wholes to roles in a way that allows top-down pressure to work as in the interactive-activation model of [McClelland and Rumelhart '81]. Here, Temperature is defined, roughly, as the degree to which one and only one whole has high activation. When this occurs, the probability is high that the Examiner will halt, returning a letter category answer, a parsing of the gridletter, and a set of labels for each part. If a solution does not occur, the system will eventually get around to re-parsing the gridletter by breaking large parts or sticking small ones together, and the process of labelling, sparking and spreading activation will begin anew.
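The sparking step can be sketched as a comparison between two label sets. The measure used here (the Jaccard overlap of the part's labels and the role's labels) is an assumption standing in for whatever correspondence measure the Examiner actually uses, and the role names and label sets are hypothetical examples rather than entries from the real Conceptual Memory.

```python
def spark_strength(part_labels, role_labels):
    """Assumed measure: shared labels divided by all labels on either side."""
    union = part_labels | role_labels
    if not union:
        return 0.0
    return len(part_labels & role_labels) / len(union)


def spark(part_labels, roles, activations):
    """Add each role's spark strength for this part to that role's activation."""
    for role_name, role_labels in roles.items():
        boost = spark_strength(part_labels, role_labels)
        activations[role_name] = activations.get(role_name, 0.0) + boost
    return activations


# Hypothetical roles, each defined as a collection of labels.
roles = {
    "role-1": {"tall", "narrow"},
    "role-2": {"short", "closed"},
}

# Labels attached to one part of the current parsing.
part = {"tall", "narrow"}
activations = spark(part, roles, {})
```

In this toy case the part fully matches `role-1` and shares nothing with `role-2`, so only the first role is activated; a part with a partial match would contribute a fractional amount.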
It is important to remember that most of these activities occur through the action of many codelets, and so the nature of processing is much more parallel than serial. For example, one part may receive some labels, undergo sparking, then another part receives some labels, then the first part sparks again, then more labels are attached to each, then the second is sparked, and so on. This makes the system less brittle as a gridfont recognizer. At any given point in the run, the probabilistic nature of the processing that has led to that point allows for some variety in the labels that have been attached, and thus, over successive opportunities to parse, label and spark, the system may explore a variety of ways of considering the letter. If the Examiner were completely deterministic, then a correct recognition would occur only if the chosen method of labelling and sparking allowed that correct recognition, which would not always occur for unusual gridletters. The parallel, nondeterministic behavior allows the program to explore a range of promising attempts at a solution, but without searching the entire space of parsings, labellings and sparkings blindly.
Figure 2 shows the way in which processes and representations in the Examiner can influence one another. These boxes are relatively high-level, and do not show the full intricacy of the program. Arrows marked in gray are those added in the optimizations made by the author of this proposal to McGraw's original Examiner. These optimizations emphasize the use of top-down influence and have led to marked improvements in speed and a significant increase in accuracy.

Adjudicator
The Adjudicator is the next module of Letter Spirit in line for implementation. The Adjudicator's task can be stated simply: When a gridletter has been identified and parsed by the Examiner, the Adjudicator evaluates the stylistic properties of the gridletter and places that information in the Thematic Focus, whose many fields will include the sort of information described in the "Style defined" section above. The potential exists to add more fields to the Thematic Focus in successive implementations, so that Letter Spirit's palette could be enhanced and come closer to that of human designers.
In the case of norm violations, the Examiner will have already started the Adjudicator's work for it, because labeller codelets pick up precisely the information needed. The Adjudicator needs to take note of the conflicts between the labelled properties for a role filler and those of its corresponding role. It is expected that it will suffice to record these in terms of the expected label and which label actually occurred or which unexpected labels did occur. It may be of use to record them in terms of which slippage occurred for which roles. This underscores an open question on how to best capture the phenomenon of slippability. Height, width, weight (number of quanta), location, closure, information regarding tips and perhaps information about the curvature of the role filler are the broad categories for all labels.
Abstract rules will be tested in a virtually identical fashion, except that the significance of a gridletter obeying an abstract rule is a property of the letter category, and will not be recorded on the level of roles (although it is highly likely that unusual restrictions on roles will be noted independently as norm violations). As is the case with norm violations, it is only significant to find a property where it is not expected. Finding a 'b' with no descender should have no effect on whether or not to create one on a prospective 'p'. However, abstract rules will often not be contradictory to the nature of the letter's prototype. For example, the absence of diagonal quanta in an 'l' is neither expected nor unexpected, but something worth noting if it turns up in gridletter after gridletter. As noted above, abstract rules can forbid orientations of quanta, angles between quanta, various locations on the grid, or lengths of continuous stretches of quanta.
Motifs are also most worth noting where they are least expected, and the size of the motif is the most important parameter here. Also, the more literal the way a motif is being considered, the more likely it is to be noteworthy. To illustrate this, consider how an isolated quantum, with no particulars about location or orientation, is the most unremarkable thing one could ever find in a gridletter. A particular quantum specified by its location on the grid is certainly more significant, as is a cluster of several quanta even without a specific location, or even orientation. Motifs will be grown from the small to the large. At first, codelets may note a motif of the smallest size (one quantum, or one angle between two quanta). If this recurs, then it will be worth adding to the Thematic Focus. In successive events in which the motif is noticed, the Adjudicator will try to take the next step and see if another quantum can be stuck on. By noticing a recurrence of properties that go one step beyond an existing motif, larger motifs will grow like crystals. It is likely that most motifs will be rather small, on the order of two to four quanta. It may be expedient to represent the different versions of motifs, which vary with regard to literality, and whether orientation, location and handedness are preserved, as distinct fields in the Thematic Focus. For each evaluated letter, the entire letterform and its precise role fillers will also be stored, which does not require any sophisticated computation, but will be useful for the Drafter.
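The crystal-like growth of motifs can be sketched as follows. This is a deliberately simplified illustration, not the proposed implementation: it treats only the most literal kind of motif (exact grid location preserved), it checks candidates exhaustively rather than through probabilistic codelets, and the quantum encoding, the `min_count` recurrence threshold, and the function names are all assumptions.

```python
from itertools import chain

# Assumed encoding: a quantum is a pair of grid vertices, e.g. ((0, 0), (0, 1)),
# and a gridletter is a set of quanta.


def touches(quantum, motif):
    """True if the quantum shares a vertex with some quantum in the motif."""
    return any(set(quantum) & set(member) for member in motif)


def recurring(motif, letters, min_count=2):
    """True if the motif appears literally in at least min_count letters."""
    return sum(motif <= letter for letter in letters) >= min_count


def grow_motifs(letters, max_size=4, min_count=2):
    """Grow motifs one quantum at a time, keeping only those that recur."""
    all_quanta = set(chain.from_iterable(letters))
    # Seed with the smallest motifs: single quanta that recur.
    current = {
        frozenset([q]) for q in all_quanta
        if recurring(frozenset([q]), letters, min_count)
    }
    found = set(current)
    for _ in range(max_size - 1):
        larger = set()
        for motif in current:
            for q in all_quanta - motif:
                if touches(q, motif):
                    candidate = motif | {q}
                    if recurring(candidate, letters, min_count):
                        larger.add(candidate)
        if not larger:
            break
        found |= larger
        current = larger
    return found
```

Less literal motifs (ignoring location, orientation or handedness) would need a normalization step before the recurrence check, which this sketch omits.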
The Adjudicator will begin with a number of codelets sprinkled onto the Coderack to start the search for the stylistic properties described above. These will not only carry out the task of building up a coherent style, but also activate permanent style concepts in the Conceptual Memory, and post new style codelets, with the activations used to influence the choice of which properties are sought out. The Workspace will once again hold the parsed letterform. Temperature here will be defined by the degree to which style has been built up for the current letter in a way consistent with the style that had already been found for any preceding letters. The Adjudicator will be able to make the probabilistic decision to halt when the Temperature is sufficiently low. The flow of control in the Adjudicator, somewhat simpler than that in the Examiner, can be seen in Figure 3.

One may wonder why it would not be better just to use a conventional straight-line program to check all of the possible stylistic properties for each gridletter. The first difficulty is that these are innumerable (or very numerous), certainly in the case of all possible motifs that could be extracted from a gridletter. In fact, the same is true in principle for either of the other categories of stylistic property, but in the proposed implementation, a large but tractable number of each (in the dozens) will simply be built into the program. Putting aside this concern, one could still press with the weaker claim that the Adjudicator could at least make a systematic, serial check of all possible labels and abstract rules built into the program. The value of having the program notice only a portion of all possible norm violations and abstract rules is twofold. First, it enables variety in the results. A human designer is not forced to acknowledge all possible aspects of a template for further creation, nor to employ all sensible themes in a work of art. This may be something to strive for in certain cases, but there is certainly no such mandate. It calls to mind the aphorism that a camel is a racehorse designed by committee. It is sufficient to pick a small number of particular directions for expression and represent those well in the final product. Second, because activations of stylistic concepts direct the search for further stylistic properties, the non-exhaustiveness of the style detection emphasizes those properties which are picked out for investigation and which seem most important. Using a mechanism very similar to the one that powers the parallel terraced scan for the Examiner, and leads to speedy and accurate recognition, the Adjudicator will thus focus on the most pertinent stylistic properties.
By itself, the Adjudicator does not perform any task for which there has been great historical motivation to find a mechanism. While the Examiner performs much the same task as programs for Optical Character Recognition, there is no practical analog for the Adjudicator. For the purposes of Letter Spirit, the direct purpose of the Adjudicator is to find the style that will be used by other modules in creating new gridletters. Thus, to evaluate the quality of the Adjudicator's work prior to the development of a program that uses the Thematic Focus for gridletter creation, a task must be invented. This will be as follows: 25 letters of a human-produced gridfont will be given to the Adjudicator (after they have each been given to the Examiner). Then, a number of candidates for the final letter, including the actual one chosen by the human designer, will be presented to it, each in isolation of the others. This will constitute a multiple choice test to see which the Adjudicator prefers. As with the Examiner, it is not the goal to reproduce in full the powers of human letter recognizers, but to behave in a human-like way with a respectable range of gridfonts, one which should correspond fairly well with the range of the gridfonts for which the Examiner shows proficiency.
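The multiple-choice test described above could be scored along the following lines. This is only a sketch of the bookkeeping: `style_score` stands in for whatever preference the Adjudicator would report for a candidate, and the toy scoring function, the trial format, and the representation of letters as sets are hypothetical choices made so that the example runs.

```python
def multiple_choice_accuracy(trials, style_score):
    """Fraction of trials in which the designer's own letter scores highest.

    Each trial is (seen_letters, candidates, designer_index). Candidates
    are scored one at a time, each in isolation from the others.
    Ties go to the earliest candidate.
    """
    correct = 0
    for seen_letters, candidates, designer_index in trials:
        scores = [style_score(seen_letters, c) for c in candidates]
        if scores.index(max(scores)) == designer_index:
            correct += 1
    return correct / len(trials)


def toy_style_score(seen_letters, candidate):
    """Toy stand-in: count the candidate's quanta already seen in the font."""
    seen = set().union(*seen_letters)
    return len(seen & candidate)


trials = [
    ([{1, 2}, {2, 3}], [{9}, {1, 3}], 1),  # designer's letter is preferred
    ([{1, 2}, {2, 3}], [{1}, {8}], 1),     # a distractor is preferred
]
accuracy = multiple_choice_accuracy(trials, toy_style_score)
```

In an actual benchmark the first element of each trial would hold the 25 letters already processed, and the distractors would presumably be drawn from other gridfonts.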
Imaginer and Drafter
The original planning for Letter Spirit makes provision for a module called the Imaginer which decides high-level aspects of how to draw a gridletter without considering details about how rendering should proceed on the grid. The Imaginer will not be developed in great richness in this implementation. Certain decisions of this type will be made, but without the use of a full-blown architecture using the Coderack and the data structures used by the other modules.
In the proposed program, the Imaginer will simply decide between several methods for producing the new gridletter, and then direct the Drafter to proceed with the actual creation. One method of producing the new letter is to take the entire letterform of an existing letter and reflect or rotate it into the new letter. The most accessible example of this would be to reflect a 'b' into a 'd', or vice versa. If this method is not used, then a letter conceptualization will be chosen, and the constituent roles must be created. One way to do this is to borrow a role filler for that role from the Thematic Focus, or to reflect, rotate or translate a role filler for a different role, as appropriate. These borrowing operations are simple and raise no interesting difficulties.
Beyond straight borrowing (which will rarely be capable of finishing off a partial gridfont), the final method the Imaginer can recommend to the Drafter will be to draw a new role filler. The decisions of the Imaginer will be made in a probabilistic way, not necessarily favoring borrowing over drawing. These decisions should be influenced by a simple memory of which methods have been used on previous letters.
Thus, the interesting work of the Drafter will be in its ability to draw role fillers from scratch, respecting the role's norms and the style information of the gridfont both. This will be done by picking a starting point and proceeding through the grid, quantum-by-quantum, on a fairly direct (but by no means completely straight!) course for the rough area of a finish point. So, after finding a starting point, the Drafter will basically be iterating the same subtask over and over: deciding from one point on the grid which of eight possible choices to make, with seven choices being the directions the next quantum could go, and the final choice being whether or not to stop and call the role filler complete. (If, as will often be the case, the current vertex is on the edge of the grid, then the number of choices will be smaller, because it is not legal to draw off of the grid.)
In order to decide which of the directional choices to make, the Drafter will attend to the very local world of the current vertex on the grid to which the prospective role filler has been drawn, and the square of vertices which surround it at a one-quantum distance. For each of the three types of stylistic property, as well as a fourth source in the norms of the relevant role, codelets will note pressures coming from the style/norms and add (or subtract) bits of weight, as appropriate, to the vertices to which the next quantum can be drawn. Abstract rules, for instance, will place negative weight on those options which would violate an abstract rule. The role norm pressures will account for the size and curvature of the role, as well as its approximate location on the grid. No one codelet will be able to add very much weight to a vertex, and only through the action of many codelets will appreciable weights build up. The Temperature will lower as more pressures have had a say on more options, until the Drafter halts and makes a decision on the next step to be drawn. Then the process will begin anew, with the Drafter's "tunnel vision" focusing on the end of the new quantum, considering a fresh set of options for the next step. Thus, the Drafter should make a probabilistic walk about the grid, drawing the role filler as it goes, with each of the pressures expressing the style of the gridfont having had a say in the decisions made. None is guaranteed to get its way in any case (except to respect the edges of the grid), but each will be able to influence the decisions in relation to how strongly the pressure "feels" about it. This is less like the way democracy operates at an election, where every voter (who chooses to vote!) gets exactly one vote, and more like the way representative government operates between elections, with the more vocal constituents having more influence on the behavior of the elected representative, but with none having the final say alone.
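A single drawing step of this kind might be sketched as below. Several simplifications are assumed and should not be read as the proposed design: a fixed number of codelet runs replaces Temperature-based halting, each pressure is a plain function returning a signed preference for an option, the `nudge` size is arbitrary, and the grid dimensions are passed in as parameters. Excluding the vertex just left is one reading of why seven directions remain.

```python
import random

# The eight neighbours of a vertex, diagonals included.
DIRECTIONS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
STOP = "stop"  # the non-directional option: call the role filler complete


def legal_options(vertex, previous, width, height):
    """Neighbouring vertices that stay on the grid, plus the STOP option.

    The vertex just left (previous) is excluded, which leaves at most
    seven directions when a quantum has already been drawn.
    """
    x, y = vertex
    options = []
    for dx, dy in DIRECTIONS:
        nxt = (x + dx, y + dy)
        if nxt == previous:
            continue
        if 0 <= nxt[0] < width and 0 <= nxt[1] < height:
            options.append(nxt)
    options.append(STOP)
    return options


def choose_step(options, pressures, rng, codelet_runs=200, nudge=0.05):
    """Let many small codelet actions build up weights, then choose by weight."""
    weights = {opt: 1.0 for opt in options}
    for _ in range(codelet_runs):
        pressure = rng.choice(pressures)  # which style or norm pressure acts
        opt = rng.choice(options)         # which option it comments on
        weights[opt] = max(0.0, weights[opt] + nudge * pressure(opt))
    if sum(weights.values()) == 0:
        return rng.choice(options)
    return rng.choices(options, weights=[weights[o] for o in options], k=1)[0]
```

For example, `legal_options((1, 1), (1, 0), width=3, height=7)` returns seven vertices followed by `STOP`; the grid size in that call is illustrative. No single call to a pressure can shift a weight by more than `nudge` times its return value, so influence accumulates only through repeated action.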
A schematic of the control within the Drafter is seen in Figure 4.

The precise nature of how decisions regarding initial and final endpoints will be made has not yet been decided (for roles with closure, the two may often be the same), but this aspect of the implementation will also ensure that the same pressures involved in drawing act here. Options include, for instance, treating the decision to halt as a special case of the drawing step, in which weight is added to an option that has no relation to the grid. Alternatively, a move back to the same vertex from which the quantum began could be tantamount to calling the role filler complete. It is expected that there will be a variety of options in this aspect of the implementation, each respecting the idea that multiple pressures should contribute to any decision. When this implementation of the Drafter is made to work together with the other modules of Letter Spirit, then novel and interesting gridfonts designed by a computer should be a reality.
Putting it all together: Top-level Control
In a sense, there will be four modules in the proposed Letter Spirit implementation. The Examiner is - and the Adjudicator and Drafter will both be - examples of the FARG architecture. The Imaginer will be implemented in a simple way, intended to operate probabilistically and to exhibit some sensitivity to global pressures, but in a much simpler way than the other three modules.
A top-level program will coordinate the integration of the modules in a fairly simple way. When seed letters have been input, they will be evaluated one-by-one. Each will be given first to the Examiner, and then, when it is identified and parsed, to the Adjudicator. The seed letters will be stored in the Scratchpad along with a fitness/happiness value for each based on the Temperature and various activations at the end of processing by the Examiner and Adjudicator. When the seeds have each been evaluated in this way, the program will pick a letter category for which a new gridletter will be drawn. This choice will be made probabilistically, based on the happiness values for each category, and thus a previously filled category may be chosen. The Imaginer and Drafter will create a new gridletter, and then it will be evaluated by the Examiner and Adjudicator, and placed in the Scratchpad if it improves the happiness over the previous rendition of the category. This will continue until the sum of the 26 happiness values seems suitable.
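The control loop just described can be outlined in a short sketch. The callables `evaluate` (standing in for the Examiner and Adjudicator together) and `create` (standing in for the Imaginer and Drafter) are hypothetical placeholders, as are the assumption that happiness lies in [0, 1], the rule that less happy categories are chosen more often, and the `target` and `max_rounds` parameters.

```python
import random
import string


def design_gridfont(seeds, evaluate, create, rng, target=20.0, max_rounds=1000):
    """Fill a Scratchpad of 26 categories, redrawing until happy enough.

    seeds maps a letter category to a seed gridletter. The Scratchpad maps
    each category to (gridletter, happiness), with unfilled categories
    starting at happiness 0.
    """
    scratchpad = {cat: (letter, evaluate(cat, letter)) for cat, letter in seeds.items()}
    for cat in string.ascii_lowercase:
        scratchpad.setdefault(cat, (None, 0.0))

    for _ in range(max_rounds):
        if sum(h for _, h in scratchpad.values()) >= target:
            break
        categories = list(scratchpad)
        # Assumed weighting: the unhappier a category, the likelier its choice.
        # Filled categories keep a nonzero chance of being revisited.
        weights = [1.0 - scratchpad[c][1] + 1e-6 for c in categories]
        cat = rng.choices(categories, weights=weights, k=1)[0]
        candidate = create(cat, scratchpad)
        happiness = evaluate(cat, candidate)
        # Keep the new gridletter only if it improves on the previous rendition.
        if happiness > scratchpad[cat][1]:
            scratchpad[cat] = (candidate, happiness)
    return scratchpad
```

Because a rejected candidate leaves the Scratchpad unchanged, total happiness never decreases from one round to the next in this sketch.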
The goal for this implementation is couched in terms of the earlier discussion regarding palettes. The Examiner, Adjudicator and Drafter each has a particular palette built in, and some subset of all possible gridletters can be perceived or created by using each respective palette (category membership will not be Boolean and rigidly fixed due to the nondeterminism of the program, but approximate boundaries will exist). Letter Spirit will be able to create gridfonts with all 26 gridletters inside the intersection of those three subsets. It has already been seen that the Examiner's range covers a reasonable variety of gridfonts. If the Adjudicator and Drafter are capable over similar (or larger) ranges, then Letter Spirit will produce competent gridfonts within the same area. The chain will most certainly be only as strong as its weakest link, and it will be impossible for the program as a whole to exceed the powers of any of its modules. Thus, the multiple-choice benchmarking of the Adjudicator will be performed with an eye towards having success in exactly those gridfonts which the Examiner already recognizes. In like fashion, the Drafter will be coded to be capable of designing the role fillers of those gridfonts.
This implementation of Letter Spirit looks to bear out more than the functioning of three modules, however. Its success will rest upon the power of the role model of letter recognition; the central feedback loop of creation with its inherent Critical Revision; the sufficiency of norm violations, abstract rules and motifs, as described above, to capture a wide variety of style; and the flexibility of the FARG architecture in a number of different tasks. It is hoped that the results of this project will provide a vivid and exciting result in computer creativity, and promote ways of thinking about cognition and cognitive modelling that may be applied in ever-richer domains in the future.
Bibliography
Abes, C. (1994). Expert graphics: Graphics professionals share their secrets. Macworld, v.11, n.8, p. 122(2).
Boden, M. A. (1990). The creative mind: Myths and mechanisms. New York: Basic Books.
Crozier, W. R. and Chapman, A. J. (1984). The perception of art: The cognitive approach and its context. In Cognitive processes in the perception of art. Advances in Psychology (19). Amsterdam: North-Holland.
Erman, L. D. and Hayes-Roth, F. and Lesser, V. R. and Reddy, D. R. (1980). The Hearsay-II speech-understanding system: Integrating knowledge to resolve uncertainty. Computing Surveys, v. 12, n. 2, pp. 213-253.
French, R. M. (1992). Tabletop: An emergent stochastic computer model of analogy-making. PhD thesis, University of Michigan, Ann Arbor, Michigan.
Grebert, I. and Stork, D. and Keesing, R. and Mims, S. (1991). Connectionist generalization for production: An example from GridFont. In Proceedings of the 1991 International Joint Conference on Neural Networks.
Hofstadter, D. R. and the members of FARG. (1995). Fluid concepts and creative analogies: Computer models of the fundamental mechanisms of thought. New York: Basic Books.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago Press.
Lauer, D. A. (1979). Design basics. New York: Holt, Rinehart and Winston.
McClelland, J. L. and Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88:375-407.
McGraw, G. (1995). Letter Spirit (part one): Emergent high-level perception of letters using fluid concepts. PhD thesis, Indiana University, Department of Computer Science and the Cognitive Science Program, Bloomington, Indiana.
Mitchell, M. (1990). Copycat: A computer model of high-level perception and conceptual slippage in analogy making. PhD thesis, University of Michigan, Ann Arbor, Michigan.
Palmer, S.E. (1978). Structural aspects of visual similarity. Memory and Cognition, v.6, n.2, pp.91-97.
Rehling, J. A. and Hofstadter, D. R. (1997). The parallel terraced scan: An optimization for an agent-oriented architecture. To be published in the Proceedings of the IEEE First International Conference on Intelligent Processing Systems, Beijing.
Schank, R.C. and Childers, P. (1988). The creative attitude: Learning to ask and answer the right questions. New York: MacMillan.
Thompson, R. F. (1993). The brain: A neuroscience primer. New York: W. H. Freeman and Company.