Spatial Data Structures
To reduce the ray-object intersection tests for raytracing a spatial data structure is needed. I started with an octree first, because it seemed to be the best choice, until i read about BVH’s (Bounding Volume Hierarchy) which seem to be used more often than octrees, due to some advantages (simpler, faster, collision detection, dynamic scenes, memory usage, …).
The following shows a comparions of those two: Octree and BVH.
The bounding-box of the scene defines the size of the root node. I tested both, using the largest extend to make a cubic volume, and one that fits tight. in the end, performance wise, there wasn’t a big difference.
First pass: Triangles are stored in the last child that fully contains it.
Second pass: Now some triangles remain in nodes, that aren’t leafes, so they are index now by every leaf, that overlaps with the triangle. Doing this in a second pass was faster than including this step in the first pass.
Third pass: Some further subdivisioning is happening here to get a finer grid. The two criteria i’m using are: MAX-DEPTH and DEPTH-FILL-RATIO. This prevents from having too big nodes and too small ones. The values i’m using usually are for the depth 10-15 and for the depth-fill-ratio 0.9-1.5. (depends on the number of triangles etc…)
The octree generation is quite fast, since no expensive computations are done for some smarter storing/subdividing.
But in the end it takes quite a lot of memory since a lot of triangles are indexed in multiple leaf-nodes. (see statistics on the screenshots)
The tree ends up with only having objects in leafes, objects are indexed by multiple leafes, and some leafes contain multiple objects. So during traversal only object-intersections WITHIN the leaf (= hit happens inside the leaf bounds) are to consider, and some “intersections-test flagging” is needed.
I implemented the algorithm presented in this paper:
“An Efficient Parametric Algorithm for Octree Traversal”
BVH (Bounding Volume Hierarchy)
I use the top-down-construction method, … so the root-node is defined by the bounds of the scene and the childs (left, right) are generated recursively by separating the triangles using a cost-function (SAH). I tested several splitting operations (median, mean, …), but the best result for raytracing gave the SAH-cost function. To get the best SAH-cost the objects are sorted along the splitting axis and the test happens for every objects. The surface areas at each splitting position are saved in a prior step. The test is made for all three axis to get the best splitting position at the best axis. This is by far the slowest construction method i tested but the ray-traversing was a lot faster therefore. The BVH- building was still a lot faster than the Octree-building. To speed up the process one can optionally test at a higher intervall for the lowest SAH-cost.
During the building, the orderering of the childs (A before B, or reverse) matters for faster traversal as is explained in the next step. The options for ordering are (in my case): number of descending childs, number of objects (in the descending childs), aabb-surface area.
I started using a simple recursive approach, which gave me quite an acceptable performance. Starting at the root, each childs AABB is tested for ray-intersection. The child, that got the closest intersecting is inspected first for further intersections, …until a leaf is reached, then the actual object is tested for intersection. The process is finished as soon as an intersection with an objects is found, and no childs AABB-intersection is closer (in that case, those have to be tested too). This version attempts to always use the first intersected child to traverse further. But still, a lot of unecessary traversing steps are done. Using a stack is quite a bit faster, but also encounters the same problem.
What i do now is, i use a simple linked list as a todo list for intersection-tests. The difference here to other algorithms is, that i keep the list sorted, by inserting nodes depending on their intersection-distance, so the node with the nearest intersection distance is always at the first position. At every iteration, the first element gets removed from the list and is inspected further. This reduces the AABB intersection tests to a minimum, since the list is always up to date, no matter what got tested first. Sometimes a decision has to be made, when two (or more) list items have the same intersection distance, at which list position to save it. In that case the ordering of the childs becomes important, which is done during the building process. Also other parameters, e.g. current depth etc. can be taken into account.
A nice (side-) effect of this sorted list is, that the travering is done as soon as an object (triangle) got intersected by the ray (assuming one object per leaf) and the current stack-items AABB has no closer intersection-distance. This gave me an performance plus of 2-10% to other traversal algorithms i tested (differs also from scene to scene).
EDIT: Modified the Linked List version to a Stack-version. The stack is beeing kept sorted too, using insertion sort, to reduce computationt time. However, i didn’t notice any real performance-boost by now.
Tested both, i found the BVH (traversal and construction) to be faster then the Octree, when comparing it in the raytracing (path-tracing) setup im working on at the moment.
Not tested yet, but i hope the BVH will also be easier to use for raytracing in combination with OpenCL than the octree version.
The following images (made with processing) show a comparision of the BVH (left) and Octree (right) using the same model. The current tree-depth is highlighted by the gray semitransparent boxes.