Optimizing an HLSL shader

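I have the following pixel shader (HLSL), which compiles to 68 instructions (with the optimizations suggested below already applied). However, I would like to target shader model 2, which unfortunately allows at most 64 instructions. Does anyone see possible optimizations that do not change the shader's result?
The shader transforms a more or less spherical region of the screen (with a sinusoidally shaped border) from RGB to a white -> red -> black gradient, with some additional modifications to brightness and so on.
The shader code is: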
// Normalized timefactor (1 = fully enabled)
float timeFactor;

// Center of "light"
float x;
float y;

// Size of "light"
float viewsizeQ;
float fadesizeQ;

// Rotational shift
float angleShift;

// Resolution
float screenResolutionWidth;
float screenResolutionHeight;
float screenZoomQTimesX;

// Texture sampler
sampler TextureSampler : register(s0);

float4 method(float2 texCoord : TEXCOORD0) : COLOR0
{
// New color after transformation
float4 newColor;

// Look up the texture color.
float4 color = tex2D(TextureSampler, texCoord);

// Calculate distance
float2 delta = (float2(x, y) - texCoord.xy)
             * float2(screenResolutionWidth, screenResolutionHeight);

// Get angle from center
float distQ = dot(delta, delta) - sin((atan2(delta.x, delta.y) + angleShift) * 13) * screenZoomQTimesX;

// Within fadeSize
if (distQ < fadesizeQ)
{
   // Make greyscale
   float grey = dot(color.rgb, float3(0.3, 0.59, 0.11));

   // Increase contrast by applying a color transformation based on a quasi-sigmoid gamma curve
   grey = 1 / (1 + pow(1.25-grey/2, 16) );

   // Transform Black/White color range to Black/Red/White color range
   // 1 -> 0.5f ... White -> Red
   if (grey >= 0.75)
   {
       newColor.r = 0.7 + 0.3 * color.r;
       grey = (grey - 0.75) * 4;
       newColor.gb = 0.7 * grey + 0.3 * color.gb;
   }
   else // 0.5f -> 0 ... Red -> Black
   {
       newColor.r = 1.5 * 0.7 * grey + 0.3 * color.r;
       newColor.gb = 0.3 * color.gb ;
   }

   // Within viewSize (Full transformation, only blend with timefactor)
   if (distQ < viewsizeQ)
   {
      color.rgb = lerp(newColor.rgb, color.rgb, timeFactor);
   }
   // Outside viewSize but still in fadeSize (Spatial fade-out but also with timefactor)
   else
   {
      float factor = timeFactor * (1 - (distQ  - viewsizeQ) / (fadesizeQ - viewsizeQ));
      color.rgb = lerp(newColor.rgb, color.rgb, factor);
   } 
}

return color;
}
3 Answers


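A few small things first: you use separate x and y floats for the light center, and likewise separate floats for the screen width and height.

Replace them with: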

float2 light;
float2 screenResolution;

Then in your code:

float2 delta = (light - texCoord.xy) * screenResolution;

That should already remove two instructions.

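Next is the use of atan2, which is probably the most expensive part.

You can declare another float2 (float2 vecshift), where x = cos(angleShift) and y = sin(angleShift). Just precompute this on the CPU.

Then you can do the following (essentially extracting the angle with a cross product instead of using atan2):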

float2 dn = normalize(delta);
float cr = dn.x *vecshift.y -dn.y * vecshift.x;
float distQ = dot(delta, delta) - sin((asin(cr))*13) *screenZoomQTimesX;
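
One way to sanity-check this substitution off the GPU is to evaluate both expressions numerically. The sketch below is plain Python rather than shader code, and all function names in it are made up for illustration. It suggests two things. First, the trick relies on the multiplier 13 being odd, because asin only returns angles in [-pi/2, pi/2]. Second, with vecshift = (cos(angleShift), sin(angleShift)) the cross-product form evaluates to -cos(13 * phi) rather than sin(13 * phi), where phi = atan2(delta.x, delta.y) + angleShift. That is the same 13-lobed border rotated by pi/26. If the exact orientation matters, one option (an assumption, not part of the answer) is to add pi/26 to the angle when precomputing vecshift on the CPU.

```python
import math

def original_term(dx, dy, angle_shift):
    # Mirrors sin((atan2(delta.x, delta.y) + angleShift) * 13) from the question.
    return math.sin((math.atan2(dx, dy) + angle_shift) * 13)

def make_vecshift(angle_shift, phase_fix=True):
    # CPU-side precomputation of vecshift = (cos, sin).
    # phase_fix is a hypothetical tweak: rotating by pi/26 makes the
    # cross-product form reproduce the original sin(13 * phi) exactly.
    s = angle_shift + (math.pi / 26 if phase_fix else 0.0)
    return (math.cos(s), math.sin(s))

def cross_term(dx, dy, vecshift):
    # Mirrors the shader lines: normalize delta, take the 2D cross product,
    # then sin(asin(cr) * 13).
    length = math.hypot(dx, dy)
    nx, ny = dx / length, dy / length
    cr = nx * vecshift[1] - ny * vecshift[0]
    cr = max(-1.0, min(1.0, cr))  # guard asin against rounding just outside [-1, 1]
    return math.sin(math.asin(cr) * 13)

if __name__ == "__main__":
    dx, dy, shift = 0.3, -1.2, 0.4
    print("original      :", original_term(dx, dy, shift))
    print("cross, fixed  :", cross_term(dx, dy, make_vecshift(shift)))
    print("cross, unfixed:", cross_term(dx, dy, make_vecshift(shift, phase_fix=False)))
```

Note that delta must be non-zero for the normalization to be defined; the shader's normalize(delta) has the same limitation exactly at the light center.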

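Note that I'm not very fond of taking a sin of an asin, but a polynomial form would not fit your use case. I'm sure there is a cleaner way to do the modulation than sin * asin, though.
Using the ?: construct instead of if/else can also (sometimes) help your instruction count: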
color.rgb = lerp(newColor.rgb, color.rgb, distQ < viewsizeQ ? timeFactor : timeFactor * (1 - (distQ  - viewsizeQ) / (fadesizeQ - viewsizeQ)));

That removes 2 more instructions.
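
The two cases can also be folded into a single clamped expression, since 1 - (distQ - viewsizeQ) / (fadesizeQ - viewsizeQ) equals (fadesizeQ - distQ) / (fadesizeQ - viewsizeQ) and exceeds 1 exactly when distQ < viewsizeQ. The Python sketch below (illustrative names, not shader code) checks that timeFactor * saturate(...) matches the branching version for distQ < fadesizeQ, assuming viewsizeQ < fadesizeQ. Whether this form actually saves instructions under shader model 2 would have to be checked against the compiler output.

```python
def saturate(v):
    # Same meaning as the HLSL intrinsic: clamp to [0, 1].
    return max(0.0, min(1.0, v))

def factor_branching(dist_q, viewsize_q, fadesize_q, time_factor):
    # Mirrors the if/else (or ?:) version from the shader.
    if dist_q < viewsize_q:
        return time_factor
    return time_factor * (1 - (dist_q - viewsize_q) / (fadesize_q - viewsize_q))

def factor_clamped(dist_q, viewsize_q, fadesize_q, time_factor):
    # Branch-free variant: clamp the spatial fade first, then scale by time.
    fade = saturate((fadesize_q - dist_q) / (fadesize_q - viewsize_q))
    return time_factor * fade

if __name__ == "__main__":
    for d in (-5.0, 50.0, 100.0, 150.0, 199.0):
        print(d, factor_branching(d, 100.0, 200.0, 0.5),
              factor_clamped(d, 100.0, 200.0, 0.5))
```

The clamp is applied before multiplying by timeFactor. Clamping the product instead gives a different blend inside viewsizeQ whenever timeFactor < 1.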

Here is the full version, which comes to 60 instructions.

// Normalized timefactor (1 = fully enabled)
float timeFactor;

float2 light;

float viewsizeQ;
float fadesizeQ;

float2 screenResolution;
float screenZoomQTimesX;

float2 vecshift;

// Texture sampler
sampler TextureSampler : register(s0);

float4 method(float2 texCoord : TEXCOORD0) : COLOR0
{
// New color after transformation
float4 newColor;

// Look up the texture color.
float4 color = tex2D(TextureSampler, texCoord);

// Calculate distance
float2 delta = (light - texCoord.xy) * screenResolution;

float2 dn = normalize(delta);
float cr = dn.x *vecshift.y -dn.y * vecshift.x;

float distQ = dot(delta, delta) - sin((asin(cr))*13) *screenZoomQTimesX;
//float distQ = dot(delta, delta) - a13 *screenZoomQTimesX;

if (distQ < fadesizeQ)
{
   // Make greyscale
   float grey = dot(color.rgb, float3(0.3, 0.59, 0.11));

   // Increase contrast by applying a color transformation based on a quasi-sigmoid gamma curve
   grey = 1 / (1 + pow(1.25-grey/2, 16) );

   // Transform Black/White color range to Black/Red/White color range
   // 1 -> 0.5f ... White -> Red
   if (grey >= 0.75)
   {
       newColor.r = 0.7 + 0.3 * color.r;
       grey = (grey - 0.75) * 4;
       newColor.gb = 0.7 * grey + 0.3 * color.gb;
   }
   else // 0.5f -> 0 ... Red -> Black
   {
       newColor.r = 1.5 * 0.7 * grey + 0.3 * color.r;
       newColor.gb = 0.3 * color.gb ;
   }

   color.rgb = lerp(newColor.rgb, color.rgb, distQ < viewsizeQ ? timeFactor : timeFactor * (1 - (distQ  - viewsizeQ) / (fadesizeQ - viewsizeQ)));
}
return color;

}


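A few suggestions:

  • For your quasi-sigmoid function, you could use a 1D sampler (a lookup table). If the input ranges from 0 to 1, create a 1x256 texture (or whatever horizontal size best preserves your function) and simply look up the value for the current input with tex1D. You need to run the function on the CPU to fill this texture, but that only has to happen once, at load time.
  • Instead of the spelled-out color.rgb = /*0.7 */ factor * newColor.rgb + /*0.3 **/ (1 - factor) * color.rgb;, use color.rgb = lerp(color.rgb, newColor.rgb, factor); (lerp usually compiles to a single assembly instruction on most GPUs), which will save you instructions. Note the argument order: lerp(a, b, s) returns a when s is 0 and b when s is 1.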
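
As a rough illustration of the first suggestion, the CPU-side table could be filled along these lines. This is a Python sketch with made-up names (build_sigmoid_lut, sample_lut). The 256-entry size comes from the answer, while linear filtering between entries, ignoring half-texel offsets, and storing unquantized floats are assumptions. In the shader, the pow-based line would then be replaced by a tex1D fetch from a second sampler.

```python
def quasi_sigmoid(grey):
    # Same curve as the shader: 1 / (1 + pow(1.25 - grey / 2, 16)).
    return 1.0 / (1.0 + (1.25 - grey / 2.0) ** 16)

def build_sigmoid_lut(size=256):
    # Run once at load time; entry i holds the curve at grey = i / (size - 1).
    return [quasi_sigmoid(i / (size - 1)) for i in range(size)]

def sample_lut(lut, grey):
    # Emulates a clamped, linearly filtered 1D lookup on the CPU.
    pos = max(0.0, min(1.0, grey)) * (len(lut) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    t = pos - lo
    return lut[lo] + (lut[hi] - lut[lo]) * t

if __name__ == "__main__":
    lut = build_sigmoid_lut()
    worst = max(abs(sample_lut(lut, g / 1000) - quasi_sigmoid(g / 1000))
                for g in range(1001))
    print("largest interpolation error:", worst)
```

An 8-bit texture format would add quantization error of up to 1/510 on top of the interpolation error, so a higher-precision format may be worth considering if banding shows up.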

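Thanks for the suggestions! How would I implement a lookup table in HLSL - could you give me an example? And which part of the code above can be optimized with the lerp instruction? - ares_games
I added an explanation to the answer above. - Ani
Thanks, lerp (now included in the shader code above) saved one instruction, bringing us to 68 (from 69). - ares_games
Have you tried the lookup table? That should bring it down a bit more. - Ani
Not yet, but I will try it soon. - ares_games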

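By using more lerp functions, I managed to get the instruction count below 64. The lookup table did not help, because using the atan2 function takes fewer instructions than a texture lookup.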
// Normalized timefactor (1 = fully enabled)
float timeFactor;

// Center of "light"
float x;
float y;

// Size of "light"
float viewsizeQ;
float fadesizeQ;

// Rotational shift
float angleShift;

// Resolution
float screenResolutionWidth;
float screenResolutionHeight;
float screenZoomQTimesX;

// Texture sampler
sampler TextureSampler : register(s0);

float4 method(float2 texCoord : TEXCOORD0) : COLOR0
{
float4 newColor;

// Look up the texture color.
float4 color = tex2D(TextureSampler, texCoord);

// Calculate distance
float2 delta = (float2(x, y) - texCoord.xy)
             * float2(screenResolutionWidth, screenResolutionHeight);

// Get angle from center
float distQ = dot(delta, delta) - sin((atan2(delta.x, delta.y) + angleShift) * 13) * screenZoomQTimesX;

// Outside fadeSize: No color transformation
if (distQ >= fadesizeQ) return color;

// Otherwise (within color transformed region) /////////////////////////////////////////////////////////

// Make greyscale
float grey = dot(color.rgb, float3(0.3, 0.59, 0.11));

// Increase contrast by applying a color transformation based on a quasi-sigmoid gamma curve
grey = 1 / (1 + pow(1.25-grey/2, 16));

// Transform greyscale to white->red->black gradient
// 1 -> 0.5f ... White -> Red
if (grey >= 0.5)
{
newColor = lerp(float4(0.937,0.104,0.104,1), float4(1,1,1,1), 2 * (grey-0.5));
}
else // 0.5f -> 0 ... Red -> Black
{
newColor = lerp(float4(0,0,0,1), float4(0.937,0.104,0.104,1), 2 * grey);
}

float factor = saturate(timeFactor * (1 - (distQ  - viewsizeQ) / (fadesizeQ - viewsizeQ)));
color.rgb = lerp(color.rgb, newColor.rgb, factor);

return color;
 }
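
To see what the two lerp calls do to the greyscale value, the mapping can be reproduced on the CPU. This Python sketch (helper names are illustrative) uses the colors from the shader above and shows that the black -> red -> white ramp meets at the red tone when grey = 0.5.

```python
BLACK = (0.0, 0.0, 0.0)
RED = (0.937, 0.104, 0.104)
WHITE = (1.0, 1.0, 1.0)

def lerp(a, b, s):
    # Component-wise a + s * (b - a), like the HLSL intrinsic.
    return tuple(x + s * (y - x) for x, y in zip(a, b))

def gradient(grey):
    # Upper half: red -> white. Lower half: black -> red.
    if grey >= 0.5:
        return lerp(RED, WHITE, 2 * (grey - 0.5))
    return lerp(BLACK, RED, 2 * grey)

if __name__ == "__main__":
    for g in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(g, gradient(g))
```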
