In modern user interfaces, graphics play an important role in communication between human and computer. When a person employs text and graphic objects in communication, those objects have meaning under a system of interpretation, or "visual language." Formal visual languages are those that have been explicitly designed to be syntactically and semantically unambiguous. The research described in this paper aims at spatially parsing expressions in formal visual languages to recover their underlying syntactic structure. Such "spatial parsing" allows a general-purpose graphics editor to be used as a visual language interface, giving the user the freedom to first simply create some text and graphics and only later have the system process those objects under a particular system of interpretation. Visual grammars simplify the task of spatial parsing for the interface designer or implementer. For each of the four formal visual languages described in this paper, there is a specifiable set of spatial arrangements of elements for well-formed visual expressions in that language. Visual Grammar Notation is a way to describe those sets of spatial arrangements; the context-free grammars expressed in this notation are not only visual but also machine-readable, and are used directly to guide the parsing.
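To make the idea of grammar-guided spatial parsing concrete, the following is a minimal sketch in Python. It is an illustration only, not the paper's implementation. The toy language is hypothetical and is not one of the four languages the paper describes, and the rule is written as a code comment rather than in Visual Grammar Notation. The names `Element`, `encloses`, `direct_children`, and `parse_expr` are likewise assumptions made for this example. The sketch assumes a single context-free rule whose right-hand side is a spatial arrangement: a box that encloses a text head followed, left to right, by zero or more items, each of which is either a text atom or a nested box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A graphic object as a general-purpose editor might store it (hypothetical)."""
    kind: str          # "box" or "text"
    x: float           # left edge of the bounding box
    y: float           # top edge of the bounding box
    w: float           # width
    h: float           # height
    label: str = ""    # text content; empty for boxes


def encloses(outer, inner):
    """Spatial relation: the bounding box of `outer` contains that of `inner`."""
    return (outer is not inner
            and outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


def direct_children(parent, elements):
    """Elements directly inside `parent`, ordered left to right."""
    inside = [e for e in elements if encloses(parent, e)]
    # Drop anything that is nested inside another element within the parent.
    direct = [e for e in inside if not any(encloses(o, e) for o in inside)]
    return sorted(direct, key=lambda e: e.x)


def parse_expr(box, elements):
    """Toy rule (an assumption of this sketch):
         expr -> box enclosing ( text item* ), read left to right
         item -> text | expr
    Returns a (head, args) tuple that records the recovered structure."""
    if box.kind != "box":
        raise ValueError("an expression must be a box")
    children = direct_children(box, elements)
    if not children or children[0].kind != "text":
        raise ValueError("the leftmost element in a box must be text")
    args = [c.label if c.kind == "text" else parse_expr(c, elements)
            for c in children[1:]]
    return (children[0].label, args)


# The objects are listed in arbitrary order, as if drawn freely in an editor;
# only their positions determine the structure that is recovered.
scene = [
    Element("text", 70, 10, 5, 10, "2"),
    Element("box", 40, 5, 50, 30),
    Element("text", 5, 10, 20, 10, "plus"),
    Element("text", 80, 10, 5, 10, "3"),
    Element("box", 0, 0, 100, 40),
    Element("text", 45, 10, 20, 10, "times"),
    Element("text", 30, 10, 5, 10, "1"),
]

# The top-level expression is the box that no other element encloses.
top = next(e for e in scene
           if e.kind == "box" and not any(encloses(o, e) for o in scene))
tree = parse_expr(top, scene)
print(tree)
```

The parser never consults the order in which objects were created. Enclosure and left-to-right position alone determine the tree, which is the sense in which the user can draw first and have the system interpret the drawing later.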