The Naked Truth About Anisotropic Filtering

http://www.extremetech.com/computing/51994-the-naked-truth-about-anisotropic-filtering

In the seemingly never-ending quest for more perfect 3D rendering, numerous filtering techniques are used to map an apparently three-dimensional shape onto a 2D monitor. Usable anisotropic filtering (AF) is one of the more recent effects to migrate from the digital cinematographer’s workstation to the PC, joining bilinear and trilinear filtering, among others (for a complete description of AF, see the next section). Both nVidia and ATI now support this type of filtering in hardware, but each approaches the problem somewhat differently. And that has repercussions both in overall performance and in rendered scene quality.

3D graphics capabilities on the PC are coming closer to equaling or surpassing the computer graphics (CG) effects we see in today’s films. But rendering horsepower and memory bandwidth remain finite commodities that have to be spent judiciously, and that means tradeoffs between frame rate and image quality.

Real-time 3D is the black art of getting the most visual bang for the least processing horsepower/memory bandwidth buck, and as real-time 3D in games continues to approach CG effects in films, the incremental improvements have become increasingly subtle, though still important. Anisotropic filtering would seem to be the next frontier of texture filtering in 3D, but if applied to every pixel in a scene, it can induce a severe performance hit. ATI and nVidia implement AF differently, and for now, nVidia’s GeForce 4 Ti 4600 suffers a considerably higher performance hit when enabling AF than ATI’s Radeon 9700 does.

Does one company’s approach deliver obviously superior image quality to the other? Does one technique cut too many corners to keep frame rate from tanking, compromising the overall image quality improvement in the process? And how much of a palpable difference does either method really make in a game where scenery is whipping by at 60fps? We’ll show you the different ways nVidia and ATI implement AF, look at the performance hit of each approach, and illustrate image quality differences so you can judge the importance of AF for yourself. Is AF the next must-have feature? Let’s find out.

 

Anisotropic filtering (AF) is used to address a specific kind of texture artifact that occurs when a 3D surface is sloped relative to the view camera.

Before we drill too deeply, here’s a working definition of the word itself. Isotropy describes a pattern whose extent is equal along its different axes – like a square or a cube. For instance, bilinear and trilinear filtering are both isotropic filtering techniques, since their filtering pattern is square. Filtering becomes anisotropic when the pattern exhibits different extents along different axes. AF uses a non-square, or an-isotropic, filtering pattern, hence the name. The pattern used by AF is typically rectangular, though it can at times be trapezoidal or parallelogram-shaped.

A single screen pixel could encompass information from many texture elements (texels) along one axis, such as the y-axis, and fewer along the x-axis, or vice-versa. This requires a non-square texture filtering pattern in order to maintain proper perspective and clarity in the screen image. If more texture samples are not obtained in the direction or axis where an image or texture surface is sloped into the distance (like a receding perspective view), the applied texture can appear fuzzy or out of proportion. The problem worsens as the angle of the surface relative to the view camera approaches 90 degrees, or on-edge.

To correct the problem, as mentioned, AF uses a rectangular, trapezoidal, or parallelogram-shaped texture-sampling pattern whose length varies in proportion to the severity of the stretch effect. With AF, textures applied to sloped surfaces will not look as fuzzy to the viewer.
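The degree of anisotropy for a given pixel can be estimated from how quickly the texture coordinates change across the screen. As a rough illustration, here is a minimal Python sketch modeled on the approach described in the common OpenGL anisotropic filtering extension – not either vendor's actual hardware, and with function and parameter names of our own invention:

```python
import math

def anisotropy_for_pixel(du_dx, dv_dx, du_dy, dv_dy, max_aniso=8):
    """Estimate how stretched a pixel's footprint is in texel space.

    du_dx etc. are the rates of change of the texture coordinates (u, v)
    per screen-pixel step in x and y.  For a screen-aligned surface the
    two footprint lengths are equal (isotropic); for a steeply sloped
    surface one is much longer than the other (anisotropic).
    """
    # Length of the footprint along each screen axis, in texels.
    px = math.hypot(du_dx, dv_dx)
    py = math.hypot(du_dy, dv_dy)
    p_major, p_minor = max(px, py), min(px, py)
    # Number of probes to spread along the major axis, clamped to the
    # user-selected AF level (2X, 4X, 8X, ...).
    return min(math.ceil(p_major / max(p_minor, 1e-9)), max_aniso)

# A surface sloping away from the camera: texels compress 6x faster
# vertically than horizontally, so ~6 probes go along that axis.
print(anisotropy_for_pixel(du_dx=1.0, dv_dx=0.0, du_dy=0.0, dv_dy=6.0))  # -> 6
```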

A classic example is a texture containing text; recall the text crawl at the beginning of every Star Wars film that sets up the story. As the text scrolls off into the distance, its resolution and legibility both tail off. Another example is a billboard in a racing game, where the text looks fuzzy and/or disproportionate without AF, and much clearer with AF applied.

Anisotropic filtering is a difficult concept to convey in words, so we’ve included some images provided by ATI and nVidia to demonstrate the effects of AF. These images clearly show differences in render quality with AF enabled, but our own game testing told a somewhat different story.

 

 

Anisotropy Example

Isotropic vs. Anisotropic

 

Because all filtering operations are addressing essentially the same problem — not enough texture samples — two other factors come into play that will limit AF’s effectiveness: pixel resolution and monitor screen size. Dialing up pixel resolution helps address all Nyquist/sampling-related problems because you’re giving the renderer more sample points (pixels) with which to describe the 3D images. A resolution like 1600×1200 without full scene anti-aliasing (FSAA) can help improve overall image quality, and on a small monitor (17″ and under) the pixel density is such that artifacts aren’t as visible because of the “pixel compression” that occurs when you stuff that many pixels (1.92 million to be exact) onto a piece of glass that small. But AF’s main intent is to address insufficient texture resolution, and the resultant distortion that occurs when part of a texture map is projected and effectively stretched over a sloped polygon surface.

The Solution: Anisotropic Filtering

Anisotropic filtering has classically averaged sixteen texture samples, or taps, in a non-square sampling pattern to generate one texture element that is applied to a single pixel – four times as many as bilinear filtering, and twice as many as trilinear filtering. ATI’s AF implementation will actually use up to 128 texture samples per pixel when running at its 16X Quality setting, so applying this type of filtering to an entire scene is very costly, especially in the area of memory bandwidth. For its part, nVidia’s highest setting, 8X, uses up to 64 texture samples per pixel for its AF operations.

 

Here are the numbers of taps that get used at each of the two companies’ AF settings (note that ATI has two general settings, called “performance” and “quality” modes):


 
 
AF Setting    ATI Performance    ATI Quality      nVidia
1X            Not supported      Not supported    8
2X            Up to 8            Up to 16         16
4X            Up to 16           Up to 32         32
8X            Up to 32           Up to 64         64
16X           Up to 64           Up to 128        Not supported
 

 

Number of Texture Samples or “Taps” per Pixel

 

According to the canonical definition of AF, which uses 16 filter taps per pixel, ATI isn’t really doing AF until either 2X Quality or 4X Performance modes. And nVidia isn’t doing “true” AF until you dial it up to at least 2X.

The “up to” qualifier for all sample figures is due to both GPUs using an adaptive implementation of AF, where they do not apply AF to the entire scene, but instead apply it in varying degrees only to those areas selected by their algorithms.

Why are both GPUs using an adaptive implementation of AF? Three reasons: bandwidth, bandwidth, and bandwidth. Consider these numbers:

Using an example of 1024x768x32 at 60fps, here are a few bandwidth equations. Starting with the case of trilinear filtering, eight texture samples are read and averaged, with the resulting single texel applied to a single screen pixel. The 786,432 screen pixels per frame of a 10×7 display would require 6,291,456 texture reads (when not using texture compression). At four bytes per texel that’s 25,165,824 bytes per frame, times 60 frames per second = 1,509,949,440 bytes/sec, or 1.5GB/sec. As it turns out, many games are now using texture compression, which generally nets about a 4:1 compression ratio, dropping the bandwidth requirement to a mere 377,487,360 bytes/sec, or 360MB/sec.

Assuming we use nVidia’s Full Monty AF setting of 8X, which uses 64 samples per pixel – eight times the taps of trilinear – the approximate bandwidth requirement would be 8 * 360MB/sec = 2880MB/sec, or 2.8GB/sec. At a pixel resolution of 1600x1200x32, which puts 2.4X as many pixels on-screen as 1024x768x32, the bandwidth requirement increases to 6.7GB/sec.
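These figures are easy to double-check. Here’s a small Python sketch that reproduces the raw-bandwidth arithmetic above (the function name is ours; the 4-byte texels and 4:1 compression ratio are the same assumptions used in the text):

```python
def texture_bandwidth(width, height, taps_per_pixel, fps=60,
                      bytes_per_texel=4, compression=1.0):
    """Raw texture-read bandwidth in bytes/sec, ignoring texel caching."""
    pixels = width * height                # screen pixels per frame
    reads = pixels * taps_per_pixel        # texture reads per frame
    return reads * bytes_per_texel * fps / compression

# Trilinear (8 taps) at 1024x768x32, 60fps, no compression:
print(texture_bandwidth(1024, 768, 8))                   # 1,509,949,440 (~1.5GB/sec)

# Same, with 4:1 texture compression:
print(texture_bandwidth(1024, 768, 8, compression=4))    # 377,487,360 (~360MB/sec)

# 8X AF (64 taps) with compression -- eight times the trilinear figure:
print(texture_bandwidth(1024, 768, 64, compression=4))   # ~3.0e9 (~2.8GB/sec)

# And at 1600x1200x32:
print(texture_bandwidth(1600, 1200, 64, compression=4))  # ~7.4e9 (near the 6.7GB/sec cited)
```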

These are of course raw bandwidth figures, and texel cache hits obviously reduce the actual load placed on graphics memory considerably in these types of operations. This is key, because the 2.8GB/sec memory bandwidth requirement is close to half of GeForce 4 Ti 4600’s usable bandwidth of 7.5-8GB/sec. Without 4:1 texture compression, that figure skyrockets to 11.2GB/sec, which exceeds GeForce 4 Ti 4600’s theoretical graphics memory bandwidth. You can see that without high texel cache hit rates, AF – and FSAA, for that matter – becomes an unwieldy problem because of the huge amounts of memory bandwidth it can eat. And this doesn’t include other processing that uses frame buffer bandwidth, including depth buffering and the RAMDAC reading final data out for display.

3D graphics is rife with examples of renderers seeking to eliminate redundant and unnecessary work to conserve GPU cycles and memory bandwidth. Most of these optimizations involve eliminating non-visible geometry or pixels (culling, clipping, depth-buffering, backface culling, etc.), or reducing texture and geometric resolution using a distance-based algorithm (MIP-mapping, geometric LOD). But none of these techniques cut corners in filtering operations, especially on near-field objects where texture and geometric details are crucial.

Matrox’s fragment antialiasing (FAA) is one example of a selective algorithm that gets used on only parts of a 3D scene. This algorithm, which Matrox admits has some compatibility problems, operates only on scene object edges to smooth out the stair-stepping effect. But again, most 3D work-reduction algorithms focus on eliminating non-visible geometry and textures. These elements add nothing to the scene, so throwing them away doesn’t detract from a scene’s image quality.

 

ATI took some deserved flak for not supporting simultaneous use of trilinear MIP-mapping and anisotropic filtering in the Radeon 8500, and they have addressed this shortcoming in the Radeon 9700. ATI’s implementation of AF in the Radeon 9700 takes the slope information calculated during triangle setup (scan-line conversion) and retains it for use later in the pipeline for texture filtering operations. This slope information is determined on a per-polygon basis. ATI’s approach not only determines whether to use AF or not, but also how much AF to apply, depending on how steeply sloped a surface is relative to the view camera.

The level of AF is user configurable via ATI’s driver settings. Users can select either performance or quality mode, and then the level of AF filtering at 2X, 4X, 8X, or 16X, per this chart. The card then does filtering up to the level selected, as a function of the slope of the polygon surface.


 
 
AF Setting    ATI Performance    ATI Quality
2X            Up to 8            Up to 16
4X            Up to 16           Up to 32
8X            Up to 32           Up to 64
16X           Up to 64           Up to 128
 

The number of taps for each setting is the maximum that will be used to correct the most severe distortion – surfaces that are very steeply sloped (nearly on-edge) relative to the view camera. As the angle of a surface’s slope decreases, ATI’s AF algorithm uses incrementally fewer taps for its AF operations. As the Radeon 9700 uses more taps, the rectangular AF sampling pattern is lengthened to accommodate the additional taps used to correct the more severe distortion.

The principal difference between ATI’s Performance and Quality settings is that the former uses bilinear filtered AF, and the latter uses trilinear filtered AF. As with the image quality difference between standard bilinear and trilinear filtering, the incremental image quality gain is often not very discernible, though it may be visible in certain scenes.
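Neither company documents its selection logic in detail, but the adaptive scheme ATI describes can be sketched in a few lines of Python. Everything below – the names, the rounding rule, the per-probe tap counts – is our illustration of the published tap ceilings, not ATI’s actual hardware algorithm:

```python
def taps_used(aniso_ratio, af_setting, mode="quality"):
    """Illustrative tap budget in the spirit of ATI's adaptive AF.

    aniso_ratio : footprint stretch for this pixel (1.0 = isotropic)
    af_setting  : user-selected ceiling (2, 4, 8, or 16)
    mode        : 'performance' (bilinear probes, 4 taps each) or
                  'quality' (trilinear probes, 8 taps each)
    """
    taps_per_probe = 8 if mode == "quality" else 4
    # Spend probes in proportion to the distortion, never exceeding
    # the user-selected AF level.
    probes = min(round(aniso_ratio), af_setting)
    return max(probes, 1) * taps_per_probe

print(taps_used(1.0, 16))   # screen-aligned surface: 8 taps (plain trilinear)
print(taps_used(5.3, 16))   # moderately sloped: 40 taps
print(taps_used(30.0, 16))  # nearly on-edge: 128 taps (the 16X Quality maximum)
```

Note how the ceilings match the chart above: 16 probes of trilinear (8 taps each) give the 128-tap maximum of 16X Quality, while the same 16 probes of bilinear (4 taps each) give the 64-tap maximum of 16X Performance.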

The AF setting selected by the user is forced on regardless of whether an application requests it, since few games have an explicit AF switch. According to ATI, Performance mode forces on bilinear AF for both Direct3D and OpenGL, while Quality mode forces on trilinear AF for Direct3D, but not for OpenGL.

There are some cases – text overlays in HUD panels being a notable example – where AF produces adverse image quality effects, so forcing it on without the game’s knowledge can actually degrade image quality. This is the main reason why ATI does not force trilinear AF on under OpenGL.

ATI’s adaptive implementation of AF is a trade-off between not applying AF where it isn’t needed (in ATI’s opinion) and using dynamically more filter taps on affected areas, with the most severely affected areas getting the most taps. It’s a kind of budgeting of AF that ATI believes delivers the best net effect.

ATI readily admits that its technique selectively applies AF, and makes no secret about not applying AF to the entire 3D scene – unless every surface is sloped relative to the view camera, an almost unheard-of case. ATI engineers knowingly implemented AF in a sort of “triage” fashion, where the most severely affected areas get the most filtering, and unaffected areas get none (other than bilinear or trilinear).

The problem? It’s impossible to know exactly how much additional fill-rate load the 9700 takes on when AF is enabled, since it’s a function of the number of sloped surfaces in the scene and the degree of each individual slope. And because nVidia uses its own adaptive algorithm, an apples-to-apples comparison in which both implementations are doing equal filtering work is all but impossible.

 

nVidia contacted us a few weeks ago, and accused ATI of possible AF corner-cutting. At the same time, nVidia product marketing management told us that the GeForce Ti actually applied AF to every pixel on the screen. Subsequently, nVidia engineers contradicted that explanation.

After a lot of back and forth, we finally got to the bottom of the mess. It turns out that nVidia also uses an adaptive implementation, although nVidia’s AF uses trilinear filtering plus AF (equivalent to ATI’s “Quality” setting for AF). The differences between ATI’s and nVidia’s implementations turn out to be more subtle than we first suspected, or were first told by nVidia.

The main difference appears to be the shape of the sample pattern, with ATI using a rectangular pattern, and nVidia using a four-sided polygonal pattern that changes depending upon the degree of slope-related distortion along the x and y axes. nVidia can vary both the number of samples and the sample pattern “footprint,” whereas ATI can vary only the number of samples. Radeon 9700’s sample pattern is always rectangular, which in many instances poses no problem.

nVidia Chief Scientist David Kirk explains GeForce 4 Ti’s AF implementation:

 

The goal here is to correctly sample the footprint of the source texture that lies within the pixel, including orientation, and perspective. What anisotropic filtering does better than simple trilinear is those last two: orientation and perspective.

 

Anisotropic is not necessarily a rectangle, since the direction of anisotropy may not be axis aligned. Also, the actual shape may be more of a quadrilateral, due to perspective. The combination of these is very complex – the nearer edges of the quadrilateral will be larger, and thus lower LOD (level of detail) and weighted more. Also, because of the perspective effect, the entire “front half” of the sample is often weighted more than twice as much as the other half.

Consequently, the LOD may change throughout the pixel, by a lot more than 2x.

Our anisotropic is a weighted average of up to 8 trilinear samples, along the major axis of anisotropy. So, we may include up to 64 samples, but the samples may be taken from only two MIP maps, and the weighting is nonlinear. The samples are effectively closer together in the part of the pixel that is nearer to the eye.
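To make Kirk’s description concrete, here is a rough Python sketch of that kind of filter: a handful of trilinear probes spaced along the major axis of anisotropy, with heavier weights toward the near end of the footprint. This is our illustrative reading of the quote – the probe count, the weight curve, and the trilinear_sample stand-in are all invented, and nVidia’s actual hardware certainly differs:

```python
def anisotropic_filter(trilinear_sample, u, v, du, dv, num_probes):
    """Weighted average of trilinear probes along the anisotropy axis.

    (du, dv) is the span of the footprint along the major axis of
    anisotropy in texture space; probes toward the near end get heavier
    weights, echoing Kirk's point that the 'front half' of the sample
    can count more than twice as much as the back half.
    """
    total = 0.0
    total_weight = 0.0
    for i in range(num_probes):
        t = (i + 0.5) / num_probes           # 0..1 along the footprint
        weight = 1.0 / (1.0 + 2.0 * t)       # nonlinear: near end counts most
        offset = t - 0.5                      # center the probe run on (u, v)
        total += weight * trilinear_sample(u + offset * du, v + offset * dv)
        total_weight += weight
    return total / total_weight

# A toy texture: brightness ramps along u, so probes at different
# offsets return different values and the weighting is visible.
ramp = lambda u, v: u
print(anisotropic_filter(ramp, u=0.5, v=0.5, du=0.2, dv=0.0, num_probes=8))
```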

Because GeForce 4 Ti is able to dynamically change its sampling footprint to better account for the amount of slope of a given surface on both the x and y axes, nVidia believes this is a better (and more expensive) solution to the problem of slope-related distortion. When a surface is completely screen-aligned, that is, completely perpendicular to the Z-axis, neither GPU applies AF to this surface, since there’s no image quality benefit to be had. So neither side (ATI or nVidia) is really accusing the other of cheating, but rather just pointing out fairly subtle differences in their implementations.

Both implementations can dynamically change the aspect ratio of the sampling footprint as well, so in the instance of using an 8X setting, the aspect ratio could be as much as 8:1 for the sampling footprint. The difference is that ATI’s sampling pattern remains rectangular, whereas nVidia’s can be dynamically altered to address the amount of slope occurring on both the X and Y-axes.

So after all this, we end up with a simple question. For the amount of extra filtering work it entails, does AF deliver enough additional image quality to justify the extra GPU cycles and additional memory bandwidth?

The answer, at least today, is no. And this is why both companies have implemented dynamic AF: to maximize any perceivable image quality improvement while minimizing the performance hit.

So after looking at both companies’ approaches to AF, let’s move on to whether AF really makes a difference.

 

For all of the work a GPU must perform to enable AF, the image quality benefit it returns seems minimal. We took a long, hard look at IL-2 Sturmovik, NASCAR 2002 and Serious Sam SE. Running at 1024x768x32, there were obvious issues relating to line aliasing and some texture crawling, but those were addressed by FSAA. In terms of distortion on long, sloped surfaces, we just weren’t seeing enough artifacting or blurring to warrant all those extra AF samples.

In large outdoor environments, we found that by the time you’re looking far enough down a given angled scene surface to see where distortion occurs, you’re also already down several MIP levels. Thus any loss of texture resolution is more a function of a lower-res MIP map than it is due to sloping effects.

And while a sloped surface is exactly where AF is supposed to deliver great benefit, if you’ve already greatly reduced your texture fidelity, then higher texture sampling in a given direction can’t bring back much of the lost detail.

We first tested at 10x7x32, so the pixel resolution may well be gating whatever image quality improvement AF delivers. We also repeated these tests at 16×12, and there (with 2.44X as many on-screen pixels) some of the issues that AF would resolve are partially solved by the higher pixel resolution.

As you’ve probably figured out, our hands-on testing of typical game scenes did not demonstrate significant or noticeable image quality improvements when using AF. In fact, the differences were so imperceptible that we decided not to include screen shots at different AF levels across either board. However, just so you can see it for yourself, here are two screens, one with AF enabled, one with it disabled. Can you tell the difference? Answers at the end of the story.

 

Serious Sam SE With AF:

Serious Sam SE Without AF:

 

Note: Both of these images are very heavy (500k each).

 

So AF doesn’t really provide much image benefit. But how much of a performance hit does it cause? See the next sections for the comparative performance impact of running with and without the varying degrees of AF.

Update/Mea Culpa: When our story was first posted, some readers in the discussion forum noted that the AF screenshot was incorrect, because it did not show any differences compared to the non-AF screenshot. They said AF should provide some improvement. Well, the readers were right – I missed seeing the AF setting deep inside of Sam’s controls, and went with the nVidia driver’s AF setting. While the driver setting will force on AF if the application does not make an explicit AF request, apparently Sam explicitly set no AF, which took precedence over the driver setting. After directly setting Sam’s AF level to 8X, you can now see in the above two screen comparisons that AF is making a difference, mostly in the floor textures. AF also helped minimize the texture crawling that’s quite common at 1024×768. Going to 1600×1200 helps clean up texture crawling as well, but AF can be useful here too.

I was able to see a subtle improvement in NASCAR 2002, but only on the side panels of cars as I approached them from the rear. Turning on AF cleaned up the decals and the numbers on the car, but it wasn’t a plainly obvious improvement. So yes, there’s a benefit, and provided it’s coming without an overly burdensome performance penalty, it’s worth enabling.

The above shots were taken using a GeForce 4 Ti 4600 at 1024x768x32. The first shot is with 8X (up to 64 samples) AF, the second with no AF.

For this round of testing, we used an Intel test system, which has the following load-out:


 
 
  • Intel Pentium 4 2.53GHz
  • Intel 850EMV2 motherboard using Intel 850E chipset
  • 512MB PC1066 RDRAM
  • Sound Blaster Audigy Gamer
  • 3Com NIC
  • 40GB ATA-100 EIDE hard drive
  • Toshiba DVD-ROM
  • KDS Avitron 21" monitor
  • Fresh install of Windows XP Pro Service Pack 1 with latest system updates as of 9/17/02
  • DirectX 8.1
 

We tested extensively with four games from our 3D GameGauge 3.0:

  • Dungeon Siege
  • Jedi Knight II
  • Unreal Performance Test 927
  • Comanche 4

We pitted ATI’s Radeon 9700 Pro on the Catalyst 2.3 driver against MSI’s GeForce 4 Ti 4600 running nVidia’s 40.41 Detonator driver, and tested at two resolutions, 1024x768x32 and 1600x1200x32. In a departure from our usual testing regimen, we left FSAA disabled at both resolutions, so as to better isolate performance hits induced by the varying degrees of AF.

We first did baseline runs at both test resolutions. We then tested using the following AF settings:


 
 
Card                 AF Settings Tested
GeForce 4 Ti 4600    2X, 4X, 8X
Radeon 9700 Pro      4X, 8X, 16X Quality (Direct3D); 8X, 16X Performance (OpenGL)
 

We didn’t test 2X with ATI because we didn’t think it would net us anything significant in terms of improved image quality. And we were right.

As mentioned earlier, ATI does not support its “Quality” AF mode under OpenGL, so the only way we could test what is effectively 4X and 8X Quality mode under OpenGL was by using higher multipliers in Performance mode for the OpenGL tests. We could not, however, test the equivalent of 16X Quality mode under OpenGL, which would require a maximum of 128 samples per pixel. Note also that ATI’s Performance setting uses bilinear filtering instead of trilinear, and that ATI’s 8X Performance setting uses the same maximum number of texture samples (32) as nVidia’s 4X setting.

 

1024x768x32

 

Comanche 4 & UPT

 

What you first notice here at the lower of the two test resolutions is that neither of these tests puts much of a dent in the Radeon 9700, which is the result of two things: first, the pixel fill rate requirements at this resolution aren’t very high, especially considering that FSAA is disabled. Second, ATI’s AF algorithm applies AF only selectively to the scene, so even at its 16X Quality setting, ATI is able to keep its frame rates very close to its baseline performance. nVidia steadily scales down until we hit 4X, at which point it bottoms out. We suspect this is because the GeForce 4 Ti 4600 is by then bottlenecked elsewhere, likely by memory bandwidth.

Most of the frame rates seen here remained in the acceptable range for smooth game-play, although the GeForce 4 Ti 4600 dipped pretty low on UPT. In general, you really don’t want to be south of 30fps in a shooter; you want to see your frame rates up around 60fps.

Here are two tables that show the frame rates at the different AF settings as a percentage of the baseline frame rate. Both of these games use the Direct3D API.


 
 
GeForce 4 Ti    Baseline    2X       4X       8X
Comanche 4      100%        93.1%    82.7%    82.2%
UPT             100%        73.4%    49.9%    49.9%

Radeon 9700     Baseline    4X       8X       16X
Comanche 4      100%        97.5%    95.4%    88.2%
UPT             100%        94.6%    91.2%    89.5%
 

 

1600x1200x32

 

 

Comanche 4 & UPT

 

What we find here is that the higher pixel resolution finally causes the Radeon to break a sweat, and Comanche 4 cuts the 9700’s performance nearly in half when we crank it up to 16X AF at this resolution.

Much of this testing seeks to ferret out the point where we’re exceeding the texel cache and hitting graphics memory on each of these GPUs. More often than not, this requires looking at numbers from this higher 1600×1200 test resolution.

For its part, nVidia took incrementally more severe hits as we dialed up AF on the UPT test, where the higher AF settings more than halved its performance. On Comanche 4, it bottomed out at the 4X setting and held its ground when we moved to 8X, indicating that the bottleneck had moved elsewhere. We suspect saturated memory bandwidth.

Because this resolution brings a higher fill-rate requirement, frame rates here were generally sub-par for delivering smooth game-play. Frame rates seen on Comanche were borderline acceptable, but neither GPU could get UPT moving at anything over 30 fps, never mind the more desirable 60fps.

Here are two tables that show the frame rates at the different AF settings as a percentage of the baseline frame rate. Both of these games use the Direct3D API.


 
 
GeForce 4 Ti 4600    Baseline    2X       4X       8X
Comanche 4           100%        68.2%    49.1%    49.1%
UPT                  100%        58.4%    39.3%    23.4%

Radeon 9700          Baseline    4X       8X       16X
Comanche 4           100%        72.3%    63.2%    56.7%
UPT                  100%        70.0%    64.2%    61.8%
 

As we ran these tests, we looked for differences in image quality between the two GPUs, but few if any were readily apparent. This leads us to conclude that the topic is more academic than anything else, and that whatever differences the two companies’ approaches to AF yield are subtle at best.

 

1024x768x32

 

Jedi Knight 2 and Dungeon Siege

 

At 1024×768, both GPUs maintain their baseline frame rate throughout all the AF settings tested on both of these applications, which is the result of several factors. For starters, the lower pixel resolution allows both GPUs to keep many of the needed texels nearby in the texel cache. Additionally, Jedi Knight II is a Quake3-based game, and the Quake3 engine has had more optimizations made for it in GPU drivers than probably any other game or game engine in the history of gaming. Note that both of these games are CPU-bound at this resolution, which also goes toward explaining the lack of a performance hit with AF in the rendering equation.

All frame rates seen here were in the very acceptable range for smooth game-play.

Here are two tables that show the frame rates at the different AF settings as a percentage of the baseline frame rate. Jedi Knight II uses OpenGL, whereas Dungeon Siege uses Direct3D.


 
 
GeForce 4 Ti 4600    Baseline    2X        4X       8X
JK2                  100%        99.9%     99.9%    99.9%
Dungeon Siege        100%        100.0%    96.1%    98.2%

Radeon 9700          Baseline    4X        8X       16X
JK2                  100%        99.7%     99.8%    not run
Dungeon Siege        100%        98.9%     99.9%    100.2%
 
Note: The Dungeon Siege result on the 16x Radeon test is within the margin of error for the test, and because the game is bottlenecked elsewhere, the performance hit that AF would produce is masked.

 

1600x1200x32

 

Jedi Knight 2 and Dungeon Siege

 

As further evidence that these two games are CPU-bound, at our higher test resolution both GPUs held frame rates very close to baseline, the exception being nVidia’s knee in the curve between 4X and 8X AF. Of course, the additional load from adding FSAA to this cocktail would bring things to a crawl. And even though both GPUs saw their texel cache efficiencies start to tank, the loss of fill-rate performance was masked by the CPU being the bottleneck.

Here are two tables that show the frame rates at the different AF settings as a percentage of the baseline frame rate. Jedi Knight II uses OpenGL, whereas Dungeon Siege uses Direct3D.

 


 
 
GeForce 4 Ti 4600    Baseline    2X       4X       8X
JK2                  100%        99.6%    97.2%    57.1%
Dungeon Siege        100%        87.2%    83.3%    83.7%

Radeon 9700          Baseline    4X        8X        16X
JK2                  100%        100.1%    100.2%    not run
Dungeon Siege        100%        98.6%     97.0%     100.6%
 
Note: The Dungeon Siege result on the 16x Radeon test is within the margin of error for the test, and because the game is bottlenecked elsewhere, the performance hit that AF would produce is masked.

 

Is this whole affair much ado about nothing? In some ways it is, but it’s still a topic that has had techies talking, and a bit of fur flying between nVidia and ATI, so it certainly warranted a deeper look.

Updated Final Thoughts:

Our conclusion: compared to filtering techniques such as bilinear, trilinear, and FSAA, which are very important to cleaning up overall image quality, AF looks more like an expendable luxury. We didn’t notice any real image quality differences between varying AF levels in our testing. Our sample screen shots, for instance, looked exactly the same.

Update: Well, actually, they don’t. AF can make a discernible difference in certain kinds of games, with first-person shooters being a good example. And if those improvements can be had without killing off too much frame rate, then AF is something you’ll want to use. But, if you want FSAA and AF at 1600×1200, Radeon 9700 will pretty much take you there, whereas GeForce 4 Ti 4600 really starts to wheeze when these kinds of demands are put on it.

Given the seemingly expendable nature of AF, the adaptive implementation design decision made by both nVidia and ATI makes sense – even if some people might consider it “cheating.” It’s also worth noting that once you disable FSAA, Radeon 9700 and GeForce 4 Ti 4600 are much more evenly matched, though once you begin piling back on those rendering features, the R9700 has a bigger reserve tank, and will pull ahead.

Because DirectX 8 content is scarce, simultaneous execution of expensive and somewhat exotic rendering features is one of the main attractions of these newer, high-end 3D cards. And while we saw that in many cases you can run with large degrees of AF and maintain good frame rates, the combination of AF and FSAA can really tank performance.

If the choice comes down to AF or FSAA, FSAA looks better to me. But image quality is very subjective, and your mileage may vary. So play with your card’s AF settings and see if you can see a difference. If not, stick with FSAA, and get back to the real reason we buy 3D cards in the first place: to play games!

Of course, if you’re doing off-line rendering of a scene or a movie, like Monsters Inc., AF makes a lot of sense. It’s expensive, but it will make a subtle difference in image quality. And down the road, when technology advances deliver AF to us essentially for free, why not? But for today, I’d skip it.

Both as an aside and in closing: with the increasing number of user-tweakable features in current GPU control panels, I’d like to see a simple slider where a user can merely specify the target frame rate they want, and then let the GPU driver dynamically “throttle” the render state, dialing AF and FSAA up or down. This simplified control would let those uninitiated in the ways of 3D get the most out of their card with minimal futzing.
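Here’s a sketch of what one step of such a throttle might look like – a hypothetical feedback loop, written in Python for clarity, that no shipping driver actually exposes:

```python
AF_LEVELS = [1, 2, 4, 8, 16]   # 1 = AF off

def adjust_af(current_index, measured_fps, target_fps, margin=0.1):
    """One step of a hypothetical render-state governor.

    If the frame rate is comfortably above target, spend the surplus on
    more AF; if it falls below, back AF off.  The margin keeps the
    setting from oscillating around the target.
    """
    if measured_fps > target_fps * (1 + margin) and current_index < len(AF_LEVELS) - 1:
        return current_index + 1   # headroom: raise image quality
    if measured_fps < target_fps * (1 - margin) and current_index > 0:
        return current_index - 1   # missing target: reclaim fill rate
    return current_index           # close enough: hold steady

level = 2                          # start at 4X
for fps in [75, 71, 52, 48, 66]:   # simulated per-interval frame rate averages
    level = adjust_af(level, fps, target_fps=60)
    print(f"measured {fps}fps -> AF {AF_LEVELS[level]}X")
```

The same loop could just as easily govern FSAA level, or both together under a single quality budget.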

This is much easier said than done, though, and could produce a battle over who owns the render state: the game, the graphics board, or the user. One last grouse about driver panel settings: we need a desktop shortcut that leads directly into the real 3D feature configuration panels for the GPU. Currently, these panels are simply buried under too many mouse-clicks.
