Java 3D Performance
Sourced from java3d-interest list. Posted by
Doug Twilleager of Sun
The Java 3D API was designed with high performance 3D graphics
as a primary goal. Since this is a new API, many of its performance
features are not well known. This document presents the performance
features of Java 3D in a number of ways. It describes the specific
APIs that were included for performance, the optimizations that are
currently implemented in Java 3D 1.1, and a number of tips and tricks
that application writers can use to improve the performance of their
applications.
Performance in the API
There are a number of things in the API that were included specifically
to increase performance. This section examines a few of them.
- Capability bits
Capability bits are the application's way of describing its intentions
to the Java 3D implementation. The implementation examines the
capability bits to determine which objects may change at run time.
Many optimizations are possible with this feature.
- Compile
There are two compile methods in Java 3D 1.1. They are in the
BranchGroup and SharedGroup classes. Once an application calls
compile(), only those attributes of objects that have their
capability bits set may be modified. The implementation may then
use this information to "compile" the data into a more efficient
rendering format.
- Bounds
Many Java 3D objects require bounds to be associated with them. These
objects include Lights, Behaviors, Fogs, Clips, Backgrounds,
BoundingLeafs, Sounds, and Soundscapes. The purpose of these
bounds is to limit the spatial scope of the specific object.
The implementation may quickly disregard the processing
of any objects that are out of the spatial scope of a target
object.
- Unordered Rendering
All state required to render a specific object in Java 3D is
completely defined by the direct path from the root node to the
given leaf. That means that leaf nodes have no effect on other
leaf nodes, and therefore may be rendered in any order. There
are a few ordering requirements for direct descendants of
OrderedGroup nodes and for transparent objects, but most leaf nodes
may be reordered to facilitate more efficient rendering.
- Appearance Bundles
A Shape3D node has a reference to a Geometry and an Appearance.
An Appearance NodeComponent is simply a collection of other
NodeComponent references that describe the rendering characteristics
of the geometry. Because the Appearance is nothing but a
collection of references, it is much simpler and more efficient for
the implementation to check for rendering characteristic changes when
rendering. This allows the implementation to minimize state changes
in the low level rendering API.
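As a rough illustration of why a bundle of references is cheap to check, the sketch below compares bundles by reference identity and counts how often a renderer would have to switch state. The names `AppearanceCheckSketch`, `Bundle`, and `countStateChanges` are hypothetical and are not part of Java 3D. The sketch only assumes that a state switch is needed whenever two consecutive shapes refer to different bundle objects. It does not show how the real implementation works.

```java
import java.util.Arrays;
import java.util.List;

/** Conceptual sketch only: not the Java 3D implementation. */
final class AppearanceCheckSketch {

    /** Hypothetical stand-in for an Appearance: a bundle of component references. */
    static final class Bundle {
        final Object material;
        final Object texture;

        Bundle(Object material, Object texture) {
            this.material = material;
            this.texture = texture;
        }
    }

    /**
     * Counts how often the renderer would switch state while drawing the
     * shapes in the given order. Two consecutive shapes need no switch when
     * they reference the very same Bundle object (a constant-time check).
     */
    static int countStateChanges(List<Bundle> drawOrder) {
        int changes = 0;
        Bundle current = null;
        for (Bundle next : drawOrder) {
            if (next != current) { // reference comparison, no deep compare
                changes++;
                current = next;
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Object red = new Object();
        Object brick = new Object();

        Bundle shared = new Bundle(red, brick);
        // Same contents as 'shared', but a distinct object.
        Bundle copy = new Bundle(red, brick);

        System.out.println(countStateChanges(Arrays.asList(shared, shared, shared))); // 1
        System.out.println(countStateChanges(Arrays.asList(shared, copy, shared)));   // 3
    }
}
```

In this sketch, three shapes that share one bundle cost a single state switch, while equal but separately allocated bundles cost a switch each time.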
Current Optimizations in Java 3D 1.1
This section describes a number of optimizations that are currently
implemented in Java 3D 1.1. Other optimizations will be implemented as
the API matures. The purpose of this section is to help application
programmers focus their optimizations on things that will complement
the current optimizations in Java 3D.
- Hardware
Java 3D uses OpenGL and Direct3D as its low level rendering APIs.
It relies on the underlying OpenGL and Direct3D drivers for its low
level rendering acceleration. Using a graphics display adapter that
offers OpenGL or Direct3D acceleration is the best way to increase
overall rendering performance in Java 3D.
- Compile
Java 3D currently implements only one compile optimization.
When a BranchGroup is compiled, the implementation descends
the subtree looking for static Shape3D nodes. Once it finds all the
static Shape3D nodes, it will combine Shape3D nodes that have the
same rendering characteristics into a single Shape3D node.
- State Sorted Rendering
Since Java 3D allows for unordered rendering for most leaf nodes,
the implementation sorts all objects to be rendered on a number
of rendering characteristics. The sort keys are, in order, Lights,
Texture, Geometry Type, Material, and finally the localToVworld
transform. The only exception to this is any child of
an OrderedGroup node. There is no state sorting for those objects.
- View Frustum Culling
Java 3D performs view frustum culling. The cull is done when an
object is processed for a specific Canvas3D. This cuts down on the
number of objects that need to be processed by the low level
graphics API.
- Multithreading
The Java 3D API was designed with multithreaded environments in mind.
The current implementation is a fully multithreaded system. At any
point in time, there may be several threads running in parallel, performing
various tasks such as visibility detection, rendering, behavior
scheduling, sound scheduling, input processing, collision detection,
and others.
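The sort order listed under State Sorted Rendering can be sketched as a chained comparator. This is only a conceptual model, not the Java 3D renderer. The class `StateSortSketch`, the `Item` type, and the integer ids that stand in for lights, textures, and the other state are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Conceptual sketch only: not the Java 3D implementation. */
final class StateSortSketch {

    /** Hypothetical render item; each int is an id standing in for a piece of state. */
    static final class Item {
        final String name;
        final int lights, texture, geometryType, material, transform;

        Item(String name, int lights, int texture, int geometryType,
             int material, int transform) {
            this.name = name;
            this.lights = lights;
            this.texture = texture;
            this.geometryType = geometryType;
            this.material = material;
            this.transform = transform;
        }
    }

    /** Sort keys in the order given in the text: lights first, transform last. */
    static final Comparator<Item> STATE_ORDER =
        Comparator.comparingInt((Item i) -> i.lights)
                  .thenComparingInt(i -> i.texture)
                  .thenComparingInt(i -> i.geometryType)
                  .thenComparingInt(i -> i.material)
                  .thenComparingInt(i -> i.transform);

    /** Returns the item names in state-sorted order; the input list is not modified. */
    static List<String> sortedNames(List<Item> items) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(STATE_ORDER);
        List<String> names = new ArrayList<>();
        for (Item i : copy) {
            names.add(i.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Item> scene = Arrays.asList(
            new Item("crate",  1, 2, 0, 0, 0),
            new Item("lamp",   0, 1, 0, 0, 0),
            new Item("barrel", 1, 1, 0, 3, 0),
            new Item("wall",   1, 1, 0, 2, 0));
        System.out.println(sortedNames(scene)); // [lamp, wall, barrel, crate]
    }
}
```

Items that share the most significant state end up adjacent, so the most significant state is also the one switched least often.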
Tips and Tricks
This section presents a number of tips and tricks for an application
programmer to try when optimizing their application. These tips focus on
improving rendering framerates, but some may also help overall application
performance. A number of these optimizations will eventually be handled
directly by the Java 3D implementation.
- Capability bits
Only set them when needed. Many optimizations can be done when they
are not set. So, plan out application requirements and only set the
capability bits that are needed.
- Bounds and Activation Radius
Consider the spatial extent of various leaf nodes in the scene and
assign bounds accordingly. This allows the implementation to prune
processing on objects that are not in close proximity. Note that this
does not apply to geometric bounds. Automatic bounds calculation
for geometric objects is fine.
- Change Number of Shape3D Nodes
In the current implementation there is a certain amount of fixed
overhead associated with the use of the Shape3D node. In general,
the fewer Shape3D nodes that an application uses, the better.
However, combining Shape3D nodes without factoring in the spatial
locality of the nodes to be combined can adversely affect performance
by effectively disabling view frustum culling. An application
programmer will need to experiment to find the right balance of
combining Shape3D nodes while leveraging view frustum culling.
- Geometry Type and Format
Most rendering hardware reaches peak performance when rendering
long triangle strips. Unfortunately, most geometry data stored
in files is organized as independent triangles or small triangle
fans (polygons). The Java 3D utility package includes a stripifier
utility that will try to convert a given geometry type into long
triangle strips. Application programmers should experiment with
the stripifier to see if it helps with their specific data. If not,
any stripification that the application can do will help. Also note
that most rendering hardware can process a long list of independent
triangles faster than a long list of single-triangle triangle fans,
so converting such fans to independent triangles is another option.
The stripifier in the Java 3D utility package will be continually
updated to provide better stripification.
- Appearance/Texture/Material by Reference
To assist the implementation in efficient state sorting, applications
can help by sharing Appearance/Texture/Material NodeComponent objects
when possible.
- Application Threads
The built-in thread support in the Java language is very powerful,
but can be deadly to performance if it is not controlled. Applications
need to be very careful in their thread usage. There are a few
things to be careful of when using Java threads. First, try to use
them in a demand driven fashion. Only let the thread run when it has
a task to do. Free running threads can take a lot of CPU cycles from
the rest of the threads in the system, including Java 3D threads.
Next, be sure the priorities of the threads are appropriate. Most Java
Virtual Machines will enforce priorities aggressively. Too low a
priority will starve the thread and too high a priority will starve
the rest of the system. If in doubt, use the default thread priority.
Finally, see if the application thread really needs to be a thread.
Would the task that the thread performs be all right if it only ran
once per frame? If so, consider changing the task to a Behavior that
wakes up each frame.
- Java 3D Threads
Java 3D uses many threads in its implementation, so it also needs to
follow the precautions listed above. In almost all cases, Java 3D
manages its threads efficiently. They are demand driven with default
priorities. There are a few cases that don't follow these guidelines
completely.
- Behaviors
One of these cases is the Behavior scheduler when there
are pending WakeupOnElapsedTime criteria. In this case, it needs
to wake up when the shortest pending WakeupOnElapsedTime criterion
is about to expire. So, application use of WakeupOnElapsedTime can
cause the Behavior scheduler to run more often than might be necessary.
- Collision
Another case where Java 3D threads are not well controlled is when
collision detection is enabled. When collision detection is
enabled, the collision detection thread is effectively free
running. This is a shortcoming of the current implementation,
and should be fixed in a future release. An alternative to using
the collision detection system might be to use picking with a
PickSegment or PickShape once per frame. This can be used for
specific types of applications. Applications should also use
bounds based collision detection rather than geometry based
collision detection.
- Sounds
The final special case for Java 3D threads is the Sound subsystem.
Due to some limitations in the current sound rendering engine,
enabling sounds causes the sound engine to potentially run at a
higher priority than other threads. This may adversely affect
performance.
- Threads in General
There is one last comment to make on threads in general. Since
Java 3D is a fully multithreaded system, applications may see
significant performance improvements by increasing the number of
CPUs in the system. For an application that does strictly animation,
two CPUs should be sufficient. As more features are added to
the application (Sound, Collision, ...), more CPUs could be utilized.
Note: When running in the Solaris environment, be sure that native
threads are enabled. Green threads will not take advantage
of multiple CPUs.
- Switch Nodes for Occlusion Culling
If the application is a first person point of view application, and
the environment is well known, Switch nodes may be used to implement
simple occlusion culling. The children of the switch node that are
not currently visible may be turned off. If the application has
this kind of knowledge, this can be a very useful technique.
- Switch Nodes for Animation
Most animation is accomplished by changing the transformations that
affect an object. If the animation is fairly simple and repeatable,
the flipbook trick can be used to display the animation. Simply put
all the animation frames under one switch node and use a
SwitchValueInterpolator on the switch node. This increases memory
consumption in favor of smooth animations.
- OrderedGroup Nodes
OrderedGroup and its subclasses are not as high performing as the
unordered group nodes. They disable any state sorting optimizations
that are possible. If the application can find alternative solutions,
performance will improve.
- LOD Behaviors
For complex scenes, using LOD Behaviors can improve performance by
reducing the geometry rendered for objects that do not need a high
level of detail. This is another option that trades increased memory
consumption for faster render rates.
- Picking
If the application doesn't need the accuracy of geometry based
picking, use bounds based picking.
- Move Object vs. Move ViewPlatform
If the application simply needs to transform the entire scene,
transform the ViewPlatform instead. This changes the problem from
transforming every object in the scene into only transforming the
ViewPlatform.
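The Application Threads tip above recommends demand driven threads that run only when they have a task to do. A minimal sketch of that pattern in plain Java follows, assuming a single worker fed by a blocking queue. The class `DemandDrivenWorker` and its methods are hypothetical names chosen for illustration, and nothing here is Java 3D specific.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of a demand driven application thread (all names are hypothetical). */
final class DemandDrivenWorker {
    /** Sentinel task that tells the worker to exit. */
    private static final Runnable STOP = () -> { };

    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final Thread thread;

    DemandDrivenWorker() {
        thread = new Thread(this::runLoop, "demand-driven-worker");
        // Priority is left untouched, so the thread inherits its creator's priority.
        thread.setDaemon(true);
        thread.start();
    }

    private void runLoop() {
        try {
            while (true) {
                Runnable task = tasks.take(); // blocks while the queue is empty
                if (task == STOP) {
                    return;
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Queues a task; the worker wakes up only because work has arrived. */
    void submit(Runnable task) {
        tasks.add(task);
    }

    /** Lets queued tasks finish, then stops the worker and waits for it. */
    void shutdownAndWait() {
        tasks.add(STOP);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs n trivial tasks on a fresh worker and returns how many completed. */
    static int runDemo(int n) {
        AtomicInteger done = new AtomicInteger();
        DemandDrivenWorker worker = new DemandDrivenWorker();
        for (int i = 0; i < n; i++) {
            worker.submit(done::incrementAndGet);
        }
        worker.shutdownAndWait();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(5)); // 5
    }
}
```

The worker spends its idle time blocked in `take()` rather than polling, so it does not compete with other threads for CPU cycles while it has nothing to do.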