This paper describes the Visual Software Agent (VSA), an Internet-based interface agent with a rocking realistic face and a speech dialog function. The VSA is connected to the WWW through Mosaic, so information for the VSA can be described in Hyper-Text Markup Language (HTML), a widely used language for hypermedia on the WWW. A sub-agent autonomously gathers information such as weather forecasts and news topics into a local database. It periodically accesses specific WWW servers, extracts the necessary information, and converts it into a format suitable for the VSA. In addition to the existing mouse interface in Mosaic, a user can navigate the Internet through speech dialog with the VSA. Speech dialog operation through the VSA and mouse operation have the same priority, so the user can choose the more suitable method at any time according to the situation. The system is useful for people unfamiliar with computers and for users with physical disabilities, and in situations where a mouse interface is not suitable, for example, in a system using a wall-type, room-wide display.
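The sub-agent's gathering step (fetch a page, pick out the needed text, store it locally) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `TextExtractor`, `gather`, and `fake_fetch`, the choice of `<p>` elements as the text to keep, the example URL, and the use of a plain dictionary as the local database are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the whitespace-normalized text of every <p> element."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buf = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Collapse runs of whitespace and skip empty paragraphs.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def gather(sources, fetch, database):
    """One gathering pass: fetch each page, extract its text, store it by topic.

    `fetch` is an injected callable (URL -> HTML string), so the sketch
    runs without network access. This is a hypothetical design choice.
    """
    for topic, url in sources.items():
        extractor = TextExtractor()
        extractor.feed(fetch(url))
        extractor.close()
        database[topic] = extractor.paragraphs
    return database


def fake_fetch(url):
    # Stand-in for a real HTTP request; returns a fixed page.
    return ("<html><body><h1>Weather</h1>"
            "<p>Sunny,  high of 20.</p><p> </p></body></html>")


db = gather({"weather": "http://example.org/weather.html"}, fake_fetch, {})
print(db)
```

A periodic sub-agent would call `gather` on a timer with a real fetch function; the abstract does not specify the schedule or the storage format, so both are left open here.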