Tutorials

Sections

j3d.org

Implementing Collision Avoidance in Java 3D

Collision avoidance is the mechanism by where we do some prediction based on the user's current movement through the world and prevent them from doing silly things like walking through walls. What Java 3D provides by default is a collision detection system - notification that a collision has already occurred. In this document we are going to show you how to code a collision avoidance mechanism within Java 3D.

What we are going to present is a generalised system. There are a lot of known ways to optimise this particular process according to your application needs. During the process we'll point to some simple optimisation points and let you decide what needs to be done for your application.

     Note
    If you want to use the basic algorithms appearing here, there is a complete implementation available in the j3d.org Code Repository. This code is the generalised solution that you see here and so can be plugged into any application immediately.

 

Basic Process

The basic process of collision detection is about detecting where the user will be at some time in the future and making sure that they don't go there. Ok, so now you know everything there is to know about collision avoidance... let's move onto the serious stuff.

Where am I going to?

To do collision avoidance correctly, you have to know not only where you are now, but where you are going to. If something is in the road, you want to stop the user going there now, rather than later after you've gotten there.

What this simple statement implies is that the collision detection routine (and by association, terrain following) needs to insert itself into the same routine that does your user movement. That is, you need to know where you are now, and also have the movement routine calculate where you are going to be next and then perform your calculations and adjustments before the movement routine actually commits those changes.

Now, collision avoidance is all about predicting where you are going to be and modifying the actions so that you don't end up where predicted. The million-dollar question is just how far forward you should predict. In the model that we present here, we only ever look ahead the distance that we would travel in the next frame. This means that we get information about this frame, where we are going to be next frame and make the calculations during the current frame. While it is possible to predict further ahead, for most applications it is probably not worth it.

Picking your Collision

So now we know where we are and where we intend to go, the important question is to find out what we've hit - if anything. We do this in a multi-pass method by getting selectively more detailed.

The first step in the collision process is to use the standard Java3D picking routines. As the standard routines pick based on bounding boxes, this gives us a clear indication of whether there may be anything of interest. If the pick routine returns nothing, then we will not hit anything and so bug out now. To do the actual picking we create a line segment PickShape for the task. The start point is where we are now and the end point is where we will be in the next frame. If there is anything along that segment then we know we're about to hit something and should act.

     Note
    Note that we use a finite length line segment rather than an infinite ray. This is because we really don't care about anything beyond the next frame. If we were to worry about what happens two or three frames from now, we might end up with two or three times the amount of collidable objects to have to sort through. Besides, we have to do the same thing in the next frame only we'll be one step closer and we have better information on where the user is likely to be next. In other words, don't do today what you can put off until tomorrow.

So we now have a list of objects that we have potentially collided with we need to know in more detail if we really have collided with these objects. We need to check each piece of geometry returned to us to make sure that it really does get in our road. The best way to do this is to get all the picked items in a sorted list and check from the closest to the furthest away. This way the closest item that we actually hit will be discovered first and allows us to not have to process any others.

Getting down and dirty

Ah, the fun part. At this point we are playing with individually selected, hand crafted pieces of exquistely made polygons. We need to run our ray through them and see if they survive the picking process. What this really means is that we have to extract the geometry and go through every single polygon to see if we have a collision point. As soon as we have one piece of geometry that intersects, that all we have to know. There is no point processing the other for vanity's sake. Remember, we are trying to implement a real-time system here. The less work we can do, the better. Unless you are after very high resolution detail, you don't need to know which particular polygon had the intersection, just that one did. The mear fact that you have a collision is enough to know.

Implementing polygon intersection routines is quite a slow process. For very complex geometry, or badly structured scenegraphs, this could result in a lot of extra, unnecessary processing. In a later section we will look at how to construct optimal scenegraphs for fast collision detection.

Is there anybody out there?

If we find during the polygon search that there are no intersections, then we can just walk away and let the user carry on to where they were planning to be next frame. However, what should we do once we discover an imminent collision?

There are several different ways to respond, but they all follow the same basic idea - adjust where they are supposed to be to some other place so that next frame they have not passed through the collidable object. In the most simplistic case, this just sets all the transforms to zero so that the user does not move at all in the next frame. At the next level, we might just stop one component and leave the others. Finally, a more complex system will be to adjust the transform to leave you right on the object (adjust the transform to be the exact distance between you and the object) or you could do something like modifying the distance traversed in an inverse log type manner.

 

Implementation

Now comes the fun part, implementing it.

Design Assumptions

The design is based around an arbitrary item of scenegraph. We really don't know where we are, or really where the geometry is that has been picked. Thus we always assume the worst and make sure we transform everything correctly.

The second assumption is that we don't know how the data is structured. There are four important considerations:

  1. do we always have triangle sets or do we have to deal with arbitrary sized data.
  2. Can we guarantee that there is only one GeometryArray per Shape3D?
  3. Has the user structured the scene graph properly so that we never have intersecting geometry bounding boxes for the type of operation that we want to do (collisions or terrain following)
  4. Do either the viewpoint or the data exist under a large set of transforms or are they all at the world root?
Now remember that we are building this thing for absolute worst case situations. If you have any of the above assumptions that you can make, all the better.

Another key assumption in this design is that we have something that will do the lowest of level geometry and picking handling. We can just pass the appropriate information and get the right answer back. For the purposes of this tutorial, we are interested only in the higher level algorithms.

Representing the avatar is the next biggest decision to make. This time we've opted for the fastest system - the avatar is effectively just a point in space. For collisions, the avatar has no width and the height is exactly where the location of the transform is. For terrain following, we adjust that eye point to be a given height above the terrain.

Big Bunch o' variables

The enemy of realtime systems is slow operations. The more we can speed things up, the faster they become - simple really :). As almost any intermediate level Java programmer will tell you - the Garbage Collection system is a major PITA when dealing with systems that have to respond fast. What we really want to do is avoid having to allocate any memory at all because if we are allocating memory, that means we are also freeing memory too - a two way hit every frame.

With this in mind, our first task is to create a collection of variables that we can continue to reuse frame after frame. Everything that we know we are going to be re-using that is a class or an array should be preallocated and reused. And I mean everything. It doesn't matter whether it is a temporary variable that only gets used once per frame or not. It is zero allocation and zero garbage collection from our code.

Let's introduce a few of the common variables you will be seeing throughout the next few pages:
viewTg The TransformGroup that we are using to define our avatar's location in the virtual world.
viewTx The Transform3D that belongs to the view transform group. Used to get and set the local transformation for each frame.
worldEyeTransform The Transform3D that contains the complete transformation information about where our avatar is in world coordinates.
oneFrameTranslation The pre-calculated amount that we going to be moving this frame assuming everything else turns out right. We can adjust values in here as required because it has not yet been used to modify the view transformation.

Pickup Lines

First place to start is the picking. As discussed earlier, we need to make a segment of the correct length to use as the picking. We also, naturally need a PickShape to do the picking with, so we start with a PickRay that will do the job for us. The ray has a start and end point, so we need to set those.

Here is where we run into our first assumption - Our current location is described by some large collection of transformations above us. In order to make our picking accurate, we have to make sure the geometry being queried and the picking object are both in the coordinate system. The easiest solution to this is to make sure that they are both in the world coordinate system.

    viewTx.transform(oneFrameTranslation);

    viewTg.getLocalToVworld(worldEyeTransform);
    worldEyeTransform.mul(viewTx);
What we are doing here is making sure that the transformation information is defining exactly the transform to the centre of our position. Remember that the vworld transform of the TransformGroup does not include your local transformations. To make everything accurate, we make sure it does.

The next thing we need to do is set up the current point where we are now. To do this, we ask the eye transform for the location component.

    worldEyeTransform.get(locationVector);
    locationPoint.set(locationVector);
The problem we then have is that the PickRay will only take a Point3d instance where we can only get location information from the transform as a vector, so we have to do a bit of shuffling that you see in the second line.

In the next step, we have to determine just where the user is going to end up. That is, where are we going to end the pick ray. Here is a second assumption. When doing a transform, we have to transform a vector that already exists. The Transform3D gives us absolutely no information about what direction the user is pointing, all it gives us is rotation information - ie an angle about an arbitrary axis. To make sense of this, we have to make an assumption - what direction was the user actually facing when they first started. In every 3D graphics application I've ever seen, the user is always assumed to be facing along the -Z axis in the local coordinate system. Therefore, as a starting point to work out where we are now point, we transform this vector (which we've defined as a constant since we use it so often)

    worldEyeTransform.transform(COLLISION_DIRECTION, collisionVector);
The next item to deal with is representing the size of the avatar. Although we have the information about how far we are going to move in the next frame, we must also take into account how fat our avatar is. A big avatar will hit something before a smaller one. Now, for efficiency reasons, our original collision vector was a unit vector along the -Z axis (ie (0,0,-1)). To make that vector look like our avatar, we scale it by the avatar size.
    collisionVector.scale(avatarSize);
Now, what we have is a vector that represents our avatar. To find out where the end point is of the pick ray, we need to do a small amount of vector math. The end point is just the sum of the various vectors that describe our current position, the direction we are travelling in and the adjustment for the next frame. If that has confused you, take a look at Figure 2.

Vectors contributing to find the end of the pick segment
Figure 2: Vectors contributing to find the end of the pick segment

One little quirk in our code is that we do this all as a point rather than vector. Mathematically the representations are identical, it's just that Java 3D makes a point of the distinctions.

    locationEndPoint.add(locationVector, collisionVector);
    locationEndPoint.add(oneFrameTranslation);
There we go, we now have both the start and end point for our picking. The last thing to do is to setup the pick shape and do it
    collisionPicker.set(locationPoint, locationEndPoint);
    SceneGraphPath[] closest = collidables.pickAllSorted(collisionPicker);

    if(closest == null)
        return;
The last little section is about dealing with the case when there is nothing to collide with. If the shape didn't find something to collide with, then there is no point going any further in the calculations.

Filtering Geometry

At this stage we know that there are objects potentially in our road. What we haven't determined is if they really are in the way. The basic pick routines of Java just deal with bounding boxes. As we mentioned before - what if our geometry is an arch? The bounding box would say that we can't walk through it even though we really can. To deal with this, we have to start looking at the actual geometry of the object - is there really a polygon right in front of us?

From what I can tell, the core Java3D code does not provide us this information so we need to go to some outside sources. You can either use the picking utilities that Sun includes in the com.sun utilities package, or you have to find some other code to do it. (The Code Repository has some of this code). In either way, what we are really interested in is dealing with only knowing whether there was something in front of us or not.

In the last code snippet in the previous section, you will notice that we asked for the picking to return all the items that could be picked. This is to deal with someone who's badly constructed the scene and may have a number of objects where the bounding boxes overlap. It is possible that we may have two objects really close together where the first one ends up not being in the road, but a further away one will be. Under most circumstances this won't be a problem, but if we are flying a jet through the scene, the amount that we could move in a single frame could be quite a long way, so we want to cater for that. Naturally, we also ask for everything to be sorted so that as soon as we find the closest item, we can stop. We are most likely to hit the closest item so a sorted check is preferred.

With each item that we have picked, we want to then check it for the collision. As a Shape3D may contain multiple geometries, we also need to deal with that situation too.

    boolean real_collision = false;

    for(int i = 0; (i < closest.length) && !real_collision; i++)
    {
        Shape3D i_shape = (Shape3D)closest[i].getObject();

        Enumeration geom_list = i_shape.getAllGeometries();

        while(geom_list.hasMoreElements() && !real_collision)
        {
            GeometryArray geom = (GeometryArray)geom_list.nextElement();

            if(geom == null)
                continue;

            real_collision =     // call specialist routine here
        }
    }
Note that at the top of the loop here, we have added an extra working variable. We use this to decide when to break out of the various loops when we have detected a real collision - that is an intersection between the polygons and our pick segment.

     Note
    If you can guarantee that you never have intersecting bounding boxes for the geometry and/or only one GeometryArray per Shape3D then we can eliminate these loops. That should add a modest performance improvement.

One part that we have not put in the above code is dealing with the details of the pick. One of our earlier assumptions was that the viewpoint and the geometry were both nested under some collection of transformations. If this is the case then we need to make sure that both items are talking the same coordinate system. To do this, we need to access the transformation. Now we could ask the Shape3D for this, but there is no real need. The SceneGraphPath gives us this already. By calling the getTransform() method you will have the local vworld transform for the picked object. You can then pass that along with your individual geometry items to the low-level routines.

Handling the Results

After this stage we really know whether we have collided with anything. If we haven't detected any real collisions, then the code continues on it's merry way - possibly to do terrain following next. Otherwise, it will just set the view transform and exit
    viewTranslation.add(oneFrameTranslation);
    viewTx.setTranslation(viewTranslation);

    try
    {
        viewTg.setTransform(viewTx);
    }
    catch(Exception e)
    {
        //check for bad transform:
        System.out.println("Transformgroup set invalid");
        viewTx.rotX(0);
        viewTg.setTransform(viewTx);
    }
That's it. You now have functioning collision detection in Java 3D.

 

Optimal scene construction

Now that you understand the basics of implementing collision detection, you might want to look at how to build your scene in an optimal fashion. As you might be able to tell, the above code involves an awful lot of calculation and it has to be done every frame. The more calculation that you can get rid of by simply constructing the scene better, the faster your application will run. This next section is devoted to providing a series of tips for implementing more efficient scene graphs.

Geometry organisation

The last section highlighted a number of areas that could come in for some improvements by restructing your geometry to be more collision-friendly. The major aim of this is to reduce the amount of choices that you have to look at for in-depth intersections. If we can get this down to one choice per frame, that is wonderful and sure to save a good lot of calculations.

As the first area to look at, we want to try to elminate objects that have shared bounding boxes. Shared or overlapping bounding boxes can come in many forms. They could be two objects side by side but with overhangs or it might be one object inside another - for example a table inside a house. Ideally, we want to minimise the number of structures that you can "climb inside" the bounding box of. As soon as your avatar is allowed inside that bounding box the chances of having to do extra collision detection on multiple objects sky-rocket. So, for a room type example, don't create all the sides of the room with one piece of geometry. Instead have each wall as a separate item. Similar story with a table - don't create the table as one big set of polygons, build each leg and the top as separate entities.

Multiple geometries are the next target for optimisation. Here we find an interesting problem - if we can reduce each Shape3D to a single GeometryArray item we'll gain maximum picking performance. As a competing goal, it is recommended that for the highest framerate you try to put as many geometry items into a single shape as possible so that the Java 3D internals can optimize everything better. This really doesn't help our cause and so you will have to do a trade-off between which set of goals are more important.

Lastly, we look at shared groups. As you might notice, in the above code we don't explicitly look at shared information. However, a tip that has come from the Java 3D teams is that they recommend reducing the usage of SharedGroup because it adversely effects performance. At the same time, it doesn't really help our collision detection either as it means we also have to transform all of the coordinates for detection into whatever the local coordinate system is.

Transformation

A very costly action that we must perform each frame is the transformation of points back and forth between all of the local coordinate systems that may be in use. If we can reduce the transformation costs to zero then we get a massive gain in performance. The way to do this - make sure that all the objects are located at the root of the world coordinate system rather than under a set of nested transforms.

Next steps

That's all there is to the collision avoidance process. We now move on to the specifics of implementing terrain following.