17 August, modified the section on other behavior systems to reflect recent VRML proposals, and added sections on time-critical behaviors, why a scripting language is a good idea, and API considerations.
30 August, modified the syntax and restructured the document to clarify the ideas.
We would like to provide for objects which can change over time and in response to external stimuli. In David Zeltzer's taxonomy of graphical simulation systems (Autonomy, Interaction, and Presence), this corresponds to adding autonomy (animation, physics), and interaction (user manipulation) to the system. At the moment, presence is beyond the scope of VRML, since it refers to such things as the type and quality of the hardware used to interact with the virtual world.
Any proposed extension to VRML faces a unique set of challenges. It must be flexible enough to provide for the needs of an extremely diverse group of users, on platforms ranging from PCs to supercomputers, and must be able to work over an untrusted wide-area network. Additionally, it should fit the spirit of both VRML and the WWW, be defined through a process of group decision-making, and should use public-domain tools where possible.
Inventor, from SGI, is a 3D graphics system written in C++ and is one of the most commonly used interactive 3D systems. It provides for nodes in a scene graph, and for adding additional node types using dynamically loaded C++ code. Internally, Inventor makes use of multimethods, which are also used in Cecil and CLOS, to provide for a scene graph that can perform several different kinds of behaviors in an extensible way. For example, Inventor lets the programmer define both new nodes and new actions that nodes can perform (e.g., alternative rendering techniques).
TBAG, developed at Sun Microsystems, is a functional, stateless 3D system. Function objects in TBAG can be related to each other by multi-way constraints and then evaluated with different parameters to create time-varying or user-controlled geometries. TBAG also makes use of multimethods internally, even more pervasively than Inventor.
The UGA system, which was developed at Brown University, uses an interpreted language, Flesh. This is a prototype-delegation object-oriented language which was specifically developed to rapidly prototype complex interactive 3D scenes. It supports a variety of animation techniques, including keyframing, inverse kinematics, physically-based modeling, and the evaluation of arbitrary functions.
Alice, from the University of Virginia, is specifically designed to support rapid prototyping of 3D immersive environments, and uses Python as an interpreted extension language.
ANIM3D was developed at Digital to visualize algorithms, and uses a custom prototype-delegation language, Obliq, which provides for concurrent behaviors.
Worlds, a commercial product developed by Worlds, Inc., allows multiple participants to take part in a dynamic environment with animations, spatialized audio, and texture-mapped video. VRML+, which has been proposed by Worlds, includes a behavior extension protocol which should be language independent, and also defines an API for modifying the scene graph. Worlds also proposes a networking protocol for connecting VRML servers and clients.
The BE system, developed by the BE Software Company, provides an excellent system of Inventor-based nodetypes for describing behaviors. They define a Pascal-based syntax for embedding new behaviors inside nodes, and demonstrate some nodes for physically based modeling.
The graphics systems described above, however, tend to use different mechanisms that are more powerful in many ways than the mechanisms provided by C++. Multimethods in a graphics system allow for the extension of both the set of primitives and the set of renderers. Most graphics systems use some form of dynamic object model, such as the prototype/delegation model used in the Self language. This model is especially prevalent in MOOs.
Finally, most of the systems above make some use of concurrency. While not usually thought of as an extension mechanism, concurrency permits the flexible extension of the system while it is running: adding new behavior is simply a matter of adding a new thread. (Note that timer callbacks and similar mechanisms are essentially highly constrained forms of simulated concurrency.)
We think that, given a suitable library of existing behaviors, most new behaviors should be simple to implement as compositions of existing behaviors. Only in cases where entirely new functionality is being added (for example, a node which opens up a socket and listens to it) should it be necessary to use an external programming environment.
Additionally, a simple scripting language specifically designed for the requirements of VRML (flexible inheritance, concurrency, VRML-embedded syntax) should be easier for non-programmers to learn.
This scheme has several benefits. It allows information about the contents of a scene (e.g., what kinds of nodes it contains) to be factored out into a separate world consisting of prototypes. Prototypes also provide the ability to DEF something without actually using it, a capability the current semantics of DEF doesn't support.
We can consider an example VRML file, providing some sample geometries that can be used in a plug-and-play fashion, included via a WWWInline, and used without duplicating models.
Prototype {
    DEF Refrigerator {
        fields [tempSetting SFFloat]
        tempSetting 40
        # Children for geometry here
    }
}
Cube {
fields [ SFFloat width, SFFloat height,
SFFloat depth ]
}
However, this extension mechanism is not powerful enough to describe
the behavior of these new node types.
We propose an extension mechanism that permits new node types, browser extensions, and the interaction of new node types and new browser extensions. The intention of our extension mechanism is to provide a mechanism that is simple and easy to understand and implement. We wish to build as much as possible on VRML 1.0 for backwards compatibility, while adding just enough new syntax to get the job done.
As in VRML 1.0, new nodes have both fields and isA fields. We suggest the minor extension that the isA field be an MFString, rather than an SFString, allowing multiple inheritance while maintaining backwards compatibility with old syntax.
In VRML 1.0, the isA field is defined only to provide an alternate implementation of a new node if the VRML browser is not able to resolve the node. We would like to modify the semantics of isA to make it more like inheritance in a traditional programming language.
The first change is that any node which references another node using isA inherits all its field definitions. Also, since we may want to inherit from a specific instance of that node, we can use the following syntax:
DEF RedCube Cube {
    color 1 0 0
}
AnotherRedCube {
    isA USE RedCube
}
In this case, AnotherRedCube not only inherits the fields of
Cube, but also the color value of RedCube.
By default, this value is forwarded to/from the parent. This means
that if we change the color of AnotherRedCube, that actually
changes the color of RedCube.
However, a new node may use the new reserved word COPY before the name of a prototype, thereby creating a copy of all of the fields of the prototype. Any field values which are copied using COPY can be changed in the copy independently of the original prototype's value. Thus, a copy is ``dead,'' in that the new copy does not have any further ties to the object it was copied from. A copy, then, may be located on a different machine from its prototype, with contact occurring only during the initial copying.
For example, we could specify that AnotherRedCube is a real copy of RedCube:
AnotherRedCube {
    isA COPY USE RedCube
}
Alternatively, a new node may use COPYALL,
indicating that this should be a "deep copy", and all the referenced
nodes should be copied as well (e.g., we want to ship an object and
its references over the network to a remote machine).
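The three inheritance modes described above (shared values by default, COPY, and COPYALL) can be illustrated with a small Python sketch. This is only a model of the intended semantics, not part of the proposal: the `Node` class and its `get`, `set`, `copy`, and `copy_all` methods are hypothetical names chosen for illustration, and networking is ignored entirely.

```python
import copy


class Node:
    """Hypothetical node: local field values plus an optional isA parent."""

    def __init__(self, parent=None, **fields):
        self.parent = parent
        self.fields = dict(fields)

    def _owner(self, name):
        """Return the nearest node in the isA chain that stores `name`."""
        node = self
        while node is not None:
            if name in node.fields:
                return node
            node = node.parent
        return None

    def get(self, name):
        owner = self._owner(name)
        if owner is None:
            raise KeyError(name)
        return owner.fields[name]

    def set(self, name, value):
        # Default isA semantics: the write is forwarded to the node that
        # owns the field, so the value stays shared with the parent.
        owner = self._owner(name) or self
        owner.fields[name] = value

    def _flatten(self):
        """Collect every visible field; nearer definitions override."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        merged = {}
        for node in reversed(chain):
            merged.update(node.fields)
        return merged

    def copy(self):
        """COPY: take the current field values, then cut all ties."""
        dead = Node()
        dead.fields = self._flatten()
        return dead

    def copy_all(self):
        """COPYALL: as COPY, but referenced nodes are copied as well."""
        dead = Node()
        dead.fields = copy.deepcopy(self._flatten())
        return dead


red_cube = Node(color=(1, 0, 0))
another_red_cube = Node(parent=red_cube)    # isA USE RedCube
another_red_cube.set("color", (0, 1, 0))    # also changes red_cube
independent = red_cube.copy()               # isA COPY USE RedCube
independent.set("color", (0, 0, 1))         # red_cube is unaffected
```

In this model a shared field lives in exactly one place, which is why frequent remote sharing would be slow: every read or write on the inheriting node has to reach the owner.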
If COPY is not used, values are shared, even remotely. Thus, a simple way to provide networked shadows of remote objects is by using isA:
# Remote source
DEF Remote {
    fields [ someValue SFBool ]
    someValue TRUE
}
# Local source
DEF ShadowOfRemote {
    isA Remote
}
Combined with the use of the Prototypes nodes, this scheme
provides some support for distribution of functionality, as the
prototypes need not be in the same world as the nodes that inherit
from them or copy them. Of course, the too frequent use of this
mechanism could result in exceedingly slow worlds. We would recommend
that this only be used for cases in which two objects need to be seen
in exactly the same way, e.g., a locked door is important but fish in
a bowl may not be.
AnyMessage : Message
           | UseMessage;
Message    : Execute
           | SetValue
           | Wait;
Execute    : fieldName
           | SELF
           | fieldName . Message;
SetValue   : fieldName = Value;
Wait       : WAIT fieldName;
UseMessage : USE nodeName . Message;
Value      : literal | fieldName | UseClause;
This syntax is quite simple and can easily be added to a parser. It
supports a simple form of concurrency. As observed before, concurrency
allows extension of behavior while the system is running. In
addition, this form of concurrency is asynchronous, allowing for the
possibility that the objects receiving messages may be widely
distributed.
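To give a sense of how little parser machinery the grammar needs, here is a minimal recursive-descent sketch in Python. It rests on several assumptions that are not in the grammar itself: messages in a code string are separated by commas (as in the examples in this document), a Value is a single token (the UseClause alternative is not handled, since it is not defined above), and the tuple shapes returned (`"execute"`, `"set"`, `"wait"`, `"send"`, `"use"`) are invented for illustration.

```python
import re

# One token: an identifier or keyword, a number, or one of ". = ,"
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|-?\d+(?:\.\d+)?|[.=,])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def take(tokens, expected=None):
    """Pop the next token, optionally checking that it is `expected`."""
    if not tokens:
        raise SyntaxError("unexpected end of code string")
    token = tokens.pop(0)
    if expected is not None and token != expected:
        raise SyntaxError(f"expected {expected!r}, got {token!r}")
    return token


def parse_message(tokens):
    """Message : Execute | SetValue | Wait"""
    head = take(tokens)
    if head == "WAIT":
        return ("wait", take(tokens))
    if head == "SELF":
        return ("execute", "SELF")
    if tokens and tokens[0] == ".":          # fieldName . Message
        take(tokens, ".")
        return ("send", head, parse_message(tokens))
    if tokens and tokens[0] == "=":          # fieldName = Value
        take(tokens, "=")
        return ("set", head, take(tokens))
    return ("execute", head)                 # bare fieldName


def parse_any(tokens):
    """AnyMessage : Message | UseMessage"""
    if tokens and tokens[0] == "USE":        # USE nodeName . Message
        take(tokens, "USE")
        node = take(tokens)
        take(tokens, ".")
        return ("use", node, parse_message(tokens))
    return parse_message(tokens)


def parse_code(text):
    """Parse a comma-separated list of messages into tuples."""
    tokens = tokenize(text)
    messages = []
    while tokens:
        messages.append(parse_any(tokens))
        if tokens:
            take(tokens, ",")
    return messages
```

For instance, `parse_code("one, WAIT one")` yields an execute message followed by a wait message, which a browser could then hand to whatever scheduler it uses.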
Setting and getting values are atomic operations (i.e., nothing else can change that value while it is being used), but in general all operations should be considered to happen in parallel. It is extremely important that these operations be atomic even when used with shared distributed objects. There are several commercially available distributed databases which might be useful as a general lock server in such environments.
Note that, while conceptually every node on a single machine may be a thread, it need not be implemented in that way. A simple scheme would involve a round-robin of every node currently executing code, evaluating one message for each node in turn. A WAIT can be treated specially, taking the node waiting out of the round-robin. This provides conceptual concurrency without the use of operating-system threads.
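The round-robin scheme just described can be sketched with Python generators standing in for executing nodes. This is an illustration under assumptions, not a prescribed implementation: the `Task` class and `run` function are hypothetical, each `yield` marks the end of one message, and yielding another `Task` plays the role of WAIT.

```python
from collections import deque


class Task:
    """Hypothetical wrapper around a node's currently executing code."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = steps      # a generator; one yield per message
        self.done = False


def run(tasks):
    """Evaluate one message per task in turn until nothing is runnable."""
    ready = deque(tasks)
    waiting = {}                # awaited task -> tasks blocked on it
    while ready:
        task = ready.popleft()
        try:
            request = next(task.steps)
        except StopIteration:
            task.done = True
            # Wake everything that was waiting on this task.
            ready.extend(waiting.pop(task, []))
            continue
        if isinstance(request, Task) and not request.done:
            # WAIT: take the task out of the round-robin.
            waiting.setdefault(request, []).append(task)
        else:
            ready.append(task)


log = []


def worker(label, count):
    for i in range(count):
        log.append(f"{label}{i}")
        yield                   # one message evaluated


a = Task("a", worker("a", 2))
b = Task("b", worker("b", 1))


def main_steps():
    log.append("start")
    yield a                     # WAIT a
    yield b                     # WAIT b (returns at once if b is done)
    log.append("end")


run([Task("main", main_steps()), a, b])
```

No operating-system threads are involved; a waiting task costs nothing until the task it depends on finishes.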
# This just takes two nodes and does nothing with them
Sequencer {
    fields [ one SFNode, two SFNode ]
}
# Execute one, wait for it to finish, then execute the next
Serial {
    isA Sequencer
    code [ "one, WAIT one, two, WAIT two" ]
}
# Execute two nodes together, and wait for them both to finish
Parallel {
    isA Sequencer
    code [ "one, two, WAIT one, WAIT two" ]
}
Note that in order to specify a blocking operation, it is necessary
first to start execution of a node and then block on it.
Using the two nodes we just defined, we can create some interesting behaviors. For example, assume that we are writing a program which connects to a video server, and then plays an audio and a video stream simultaneously.
VideoInitialize {
    code "http://vrml.org/VideoInitialize.java"
}
VideoPlay {
    code "http://vrml.org/VideoPlay.java"
}
AudioPlay {
    code "http://vrml.org/AudioPlay.java"
}
TVDisplay {
    fields [ serial SFNode, parallel SFNode ]
    parallel Parallel { }
    serial Serial { }
    code [ serial.one = VideoInitialize,
           parallel.one = VideoPlay,
           parallel.two = AudioPlay,
           serial.two = parallel,
           serial, WAIT serial ]
}
Since the assignment syntax just assigns an initial value to the
fields of the sequencing nodes, the previous example is equivalent to
the following:
TVDisplay {
    fields [ serial SFNode, parallel SFNode ]
    parallel Parallel {
        one VideoPlay
        two AudioPlay
    }
    serial Serial {
        one VideoInitialize
        two parallel
    }
    code [ serial, WAIT serial ]
}
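For readers more familiar with conventional concurrency libraries, the same composition can be approximated with Python's `asyncio`. This is an analogy only: the coroutine names mirror the node names above, the bodies are placeholders that record what ran, and `asyncio.sleep(0)` merely stands in for real streaming work.

```python
import asyncio


async def serial(one, two):
    """Like Serial: run one, wait for it to finish, then run two."""
    await one()
    await two()


async def parallel(one, two):
    """Like Parallel: start both, then wait for both to finish."""
    await asyncio.gather(one(), two())


async def tv_display(log):
    async def video_initialize():
        log.append("init")

    async def video_play():
        log.append("video start")
        await asyncio.sleep(0)      # placeholder for streaming work
        log.append("video end")

    async def audio_play():
        log.append("audio start")
        await asyncio.sleep(0)      # placeholder for streaming work
        log.append("audio end")

    # serial.one = VideoInitialize, serial.two = parallel
    await serial(video_initialize,
                 lambda: parallel(video_play, audio_play))


events = []
asyncio.run(tv_display(events))
```

Initialization always completes before either stream starts, while the audio and video streams interleave.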
For example, here is a Generic representing rendering, with several specifics representing particular ways to render.
Render {
    isA [ COPY Generic ]
    # This field defines the GenericFields node containing the parameters
    genericFields DEF RenderFields GenericFields {
        fields [ viewer SFNode, node SFNode ]
    }
    # One implementation
    GLRenderCube {
        isA [
            COPY USE Params
            COPY Specific ]
        generic USE Render
        viewer USE GLViewer
        node USE Cube
        code BuiltinGLRenderCube
    }
    # Another implementation
    GLRenderSphere {
        isA [
            COPY USE Params
            COPY Specific ]
        generic USE Render
        viewer USE GLViewer
        node USE Sphere
        code BuiltinGLRenderSphere
    }
}
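The relationship between a Generic and its Specifics is essentially multimethod dispatch on the types of both parameters. A rough Python sketch follows; the `Generic` class, its `register` decorator, and the class hierarchy are all hypothetical, and the lookup order (walking the viewer's ancestors first, then the node's) is a simplification of the more careful rules used by languages such as CLOS and Cecil.

```python
class Generic:
    """Hypothetical generic operation dispatching on (viewer, node) types."""

    def __init__(self, name):
        self.name = name
        self.specifics = {}     # (viewer type, node type) -> function

    def register(self, viewer_type, node_type):
        """Decorator that installs a Specific for one pair of types."""
        def decorator(function):
            self.specifics[(viewer_type, node_type)] = function
            return function
        return decorator

    def __call__(self, viewer, node):
        # Try the most specific classes first, falling back to ancestors.
        for viewer_type in type(viewer).__mro__:
            for node_type in type(node).__mro__:
                function = self.specifics.get((viewer_type, node_type))
                if function is not None:
                    return function(viewer, node)
        raise TypeError(f"no Specific of {self.name} applies")


class Viewer: pass
class GLViewer(Viewer): pass
class Shape: pass
class Cube(Shape): pass
class Sphere(Shape): pass


render = Generic("Render")


@render.register(GLViewer, Cube)
def gl_render_cube(viewer, node):
    return "GL cube"


@render.register(GLViewer, Sphere)
def gl_render_sphere(viewer, node):
    return "GL sphere"


@render.register(Viewer, Shape)
def render_fallback(viewer, node):
    return "generic shape"
```

New viewers and new node types can both be added later by registering further Specifics, without touching the existing ones.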
As a behavioral example of such a system, consider a physically based model of a pendulum. There are several different algorithms one could use to simulate the motion of a pendulum, ranging from extremely inaccurate but fast (linear extrapolation of velocity), to accurate but slow (fourth-order Runge-Kutta with a small stepsize).
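To make the trade-off concrete, the following Python sketch implements both ends of that range for an undamped pendulum, with the equation of motion theta'' = -(g/L) sin(theta). The constant, the function names, and the use of total energy as an accuracy measure are choices made for this illustration, not part of the proposal.

```python
import math

G_OVER_L = 9.81     # g / length; assumes a 1 m pendulum on Earth


def derivative(theta, omega):
    """Time derivatives of angle and angular velocity."""
    return omega, -G_OVER_L * math.sin(theta)


def euler_step(theta, omega, dt):
    """Linear extrapolation: one derivative evaluation per step."""
    d_theta, d_omega = derivative(theta, omega)
    return theta + dt * d_theta, omega + dt * d_omega


def rk4_step(theta, omega, dt):
    """Fourth-order Runge-Kutta: four derivative evaluations per step."""
    k1t, k1o = derivative(theta, omega)
    k2t, k2o = derivative(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1o)
    k3t, k3o = derivative(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2o)
    k4t, k4o = derivative(theta + dt * k3t, omega + dt * k3o)
    return (theta + dt / 6.0 * (k1t + 2 * k2t + 2 * k3t + k4t),
            omega + dt / 6.0 * (k1o + 2 * k2o + 2 * k3o + k4o))


def energy(theta, omega):
    """Energy per unit mass and length squared; constant in exact motion."""
    return 0.5 * omega ** 2 + G_OVER_L * (1.0 - math.cos(theta))


def simulate(step, theta, omega, dt, steps):
    for _ in range(steps):
        theta, omega = step(theta, omega, dt)
    return theta, omega


start = (1.0, 0.0)              # released from rest at 1 radian
euler_end = simulate(euler_step, *start, dt=0.01, steps=1000)
rk4_end = simulate(rk4_step, *start, dt=0.01, steps=1000)
euler_drift = abs(energy(*euler_end) - energy(*start))
rk4_drift = abs(energy(*rk4_end) - energy(*start))
```

With the same step size, the Euler version steadily gains energy while the Runge-Kutta version conserves it far better, at roughly four times the cost per step.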
In previous work, we have developed scheduling algorithms which attempt to balance computational demands against rendering demands across multiple-processor systems in order to attain a constant frame rate in computationally demanding scientific-visualization environments. Such a complex system is not necessary in VRML, however, since such schedulers have high overhead, and since simple reactive scheduling can be used to good effect, as SGI's WebSpace browser demonstrates.
Since a Specific is just another nodetype, we can embed several versions of the same algorithm inside an LOD node, as follows:
ComputePendulum {
    isA [ COPY SimpleBehavior, COPY Generic ]
    LOD {
        range [1 8]
        DEF ComputePendulumRK4 Prototype {
            isA [ COPY SimpleBehavior, COPY Specific ]
            generic USE ComputePendulum
            code "http://www.physics.com/rk4.java"
        }
        DEF ComputePendulumEuler Prototype {
            isA [ COPY SimpleBehavior, COPY Specific ]
            generic USE ComputePendulum
            code "http://www.physics.com/euler.java"
        }
    }
}
Note that this assumes a slightly different interpretation of
LOD than that traditionally used, but it is one that we would
like to encourage. This interpretation is: LOD does not
imply a set of representations of an object, to be interchanged based
on distance from an object; instead, it defines a set of relative
benefit values for alternate representations of an object or
algorithm. The browser is responsible for choosing among the set of
alternate representations based on the actual cost of each representation
relative to the benefit of that representation and the current load on
the system. This interpretation allows for VRML scenes which can
contain fast geometry and behaviors, independent of the speed of the
host machine, but possibly sacrificing some accuracy.
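One very simple way a browser might act on this interpretation is sketched below in Python. The policy shown (pick the highest-benefit alternative whose cost fits the current budget, else the cheapest) is an assumption made for illustration; the text above leaves the actual selection strategy to the browser. The `Representation` class, the cost units, and the example numbers are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    name: str
    benefit: float      # relative benefit, as supplied by the world author
    cost: float         # measured cost, e.g. milliseconds per frame


def choose(representations, budget):
    """Pick the most beneficial representation that fits the budget.

    If nothing fits, fall back to the cheapest alternative so that the
    object still has some representation.
    """
    affordable = [r for r in representations if r.cost <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.benefit)
    return min(representations, key=lambda r: r.cost)


pendulum = [
    Representation("ComputePendulumRK4", benefit=8.0, cost=4.0),
    Representation("ComputePendulumEuler", benefit=1.0, cost=0.5),
]

lightly_loaded = choose(pendulum, budget=10.0)   # fits the accurate solver
heavily_loaded = choose(pendulum, budget=1.0)    # only the fast solver fits
```

The budget would be recomputed as the load on the system changes, so the same scene can degrade gracefully on a slower host.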
Any proposed API should be describable both in terms of local invocations (direct function calls) and remote invocations (RPC, OLE, HTTP, etc.). We would prefer a specification which respects the object-oriented nature of the scene graph and provides some ability to specify different levels of security (interface hiding), both on the local machine and remotely.
It is also necessary to provide an additional intermediate, device-independent API for managing such devices as the renderer, audio tools, video cameras, networking, etc. This is in itself a very difficult problem.
There are several important criteria for a 3D behavior-specification language. Such a language must have the following:
Java would seem to be a good enough choice for a common language for the World Wide Web: it is advertised as a ``safe'' language, it is well defined and documented, its source code has been released for Sun Solaris, and Netscape has recently licensed it for integration into their browser. Although we present the following examples in terms of Java, the API and protocols could be reworked in any other object-oriented language.
Mitra from Worlds describes a process by which an interpreter for a new extension language could be downloaded dynamically. If this interpreter is able to generate code in the platform's native extension language, then it is no longer necessary to choose a single language. Note that this may cause confusion and make it difficult to create new nodes based on existing code, but it may be a viable solution for the polyglot web.
Due to time limitations, we have not attempted to describe how one would implement a new VRML node type in Java. This should happen relatively infrequently, since the language we have described in this paper is quite powerful; however, adding new functionality such as the ability to open sockets, etc., will require developing new libraries, probably in Java.
These nodes use fields to describe interesting values. These values would be more useful if VRML supported the ability to refer to only a particular field in a node, as can be done in Inventor. We suggest adapting that simple syntax (the node, followed by a period, followed by the field name), a syntax we are already making use of in message sends.
Other prototypes could then represent special capabilities of other browsers. For example, this provides an alternative mechanism to WebSpace's use of DEF with Info nodes to set special parameters. Thus, a prototype of WebSpace might support fields like backgroundColor and viewer.
Since some of these filters could be purely geometric, while others will be based on the semantics of the scene, it is difficult to generate composite behaviors that are useful for all scenes. However, multimethods can be used to determine how to render particular nodes for particular kinds of lenses.
Based on a combination of the possible input devices and the desired 3D representation of the person, we would keep a live connection (possibly using mechanisms as simple as Java's existing networking code) between the local browser and the remote participant, so that users could experience real-time shared behavior.
Worlds has developed a set of documents on how one might develop shared, multiuser VRML worlds. We are also currently writing a separate white paper on some of the important issues for multi-user worlds and how one might address them.
This is an ideal situation for multimethods: browsers which are capable of computing collision detection could decide to subclass scene-navigation methods and detect collisions with standard objects, or a world creator could create a new class of collidable objects, all of which would register themselves with a collision-detection manager.
We plan to continue to develop this proposal in participation with the VRML community and create a reference implementation as a testbed.
This work was supported in part by grants from NSF, ARPA, NASA, Taco, Sun, HP, and ONR grant N00014-91-J-4052, ARPA order 8225.