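A few days ago I came across a discussion about whether, when looping over an array, storing the array length in a variable and using that in the loop condition makes the code run faster. That is, given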

for (int i = 0; i < array.Length; i++) ...

if it is changed to

int c = array.Length;
for (int i = 0; i < c; i++) ...

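will it run faster?

The argument for "faster" is that the loop condition is evaluated over and over, and reading array.Length through a property takes longer than reading a variable directly.

But there is another school of thought: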

Some programmers believe that they can get a speed boost by moving the length calculation out and saving it to a temp, as in the example on the right.

The truth is, optimizations like this haven't been helpful for nearly 10 years: modern compilers are more than capable of performing this optimization for you. In fact, sometimes things like this can actually hurt performance. In the example above, a compiler would probably check to see that the length of myArray is constant, and insert a constant in the comparison of the for loop. But the code on the right might trick the compiler into thinking that this value must be stored in a register, since l is live throughout the loop. The bottom line is: write the code that's the most readable and that makes the most sense. It's not going to help to try to outthink the compiler, and sometimes it can hurt.

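In other words, modern compilers are far smarter than they used to be. In this situation the compiler can tell that array.Length will not change, so the code the JIT generates uses a constant directly in the comparison. Deliberately switching to a variable may instead mislead the compiler into keeping the value in a register because it thinks the value might change, and the result can actually be slower.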

Both sides have a point, so let's just measure it.

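I wrote a simple test program that declares an array of 65535 ints holding the values 0 through 65534 and sums them in a loop. Because a single pass finishes too quickly to time, 5000 consecutive passes form one timed unit, which amounts to roughly 328 million loop iterations (65535 × 5000). Three approaches are timed once each: a for loop with array.Length, a for loop with a length variable, and foreach. Since a single measurement could be skewed by other factors, all three tests are run 100 times and averaged to reduce the error.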

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
 
namespace TestLoop
{
    class Program
    {
        static void Main(string[] args)
        {
            int[] ary = new int[65535];
            for (int i = 0; i < ary.Length; i++)
                ary[i] = i;
            int TIMES = 5000;
            long sum = 0;
            Stopwatch sw = new Stopwatch();
            StringBuilder sb = new StringBuilder();
            sb.AppendLine("ary.Length,count var,foreach");
 
            for (int run = 0; run < 100; run++)
            {
                sw.Reset();
                sw.Start();
                for (int t = 0; t < TIMES; t++)
                {
                    sum = 0;
                    // compare i < ary.Length in the loop condition
                    for (int i = 0; i < ary.Length; i++)
                    {
                        sum += ary[i];
                    }
                }
                sw.Stop();
                Console.WriteLine("{1} ary.Length = {0:N0}ms", 
                                    sw.ElapsedMilliseconds, sum);
                sb.Append(sw.ElapsedMilliseconds + ",");
                
                sw.Reset();
                sw.Start();
                for (int t = 0; t < TIMES; t++)
                {
                    int count = ary.Length;
                    sum = 0;
                    // compare against the count variable holding ary.Length
                    for (int i = 0; i < count; i++)
                    {
                        sum += ary[i];
                    }
                }
                sw.Stop();
                Console.WriteLine("{1} count variable = {0:N0}ms", 
                                    sw.ElapsedMilliseconds, sum);
                sb.Append(sw.ElapsedMilliseconds + ",");
 
                sw.Reset();
                sw.Start();
                for (int t = 0; t < TIMES; t++)
                {
                    sum = 0;
                    // use foreach
                    foreach (int n in ary)
                    {
                        sum += n;
                    }
                }
                sw.Stop();
                Console.WriteLine("{1} foreach = {0:N0}ms",
                                    sw.ElapsedMilliseconds, sum);
                sb.AppendLine(sw.ElapsedMilliseconds.ToString());
            }
            File.WriteAllText("d:\\result.csv", sb.ToString());
            Console.Read();
        }
    }
}

Judging from the output, each of the three approaches wins some rounds and loses others, and none dominates (in the annotations, A > B means A took longer than B):

2147385345 ary.Length = 1,274ms
2147385345 count variable = 1,260ms
2147385345 foreach = 1,701ms //foreach > Length > variable
2147385345 ary.Length = 1,805ms
2147385345 count variable = 1,472ms
2147385345 foreach = 2,151ms //foreach > Length > variable
2147385345 ary.Length = 1,266ms
2147385345 count variable = 1,238ms
2147385345 foreach = 1,621ms //foreach > Length > variable
2147385345 ary.Length = 1,623ms
2147385345 count variable = 1,658ms
2147385345 foreach = 1,298ms //variable > Length > foreach
2147385345 ary.Length = 1,214ms
2147385345 count variable = 1,271ms
2147385345 foreach = 1,310ms //foreach > variable > Length
2147385345 ary.Length = 1,358ms
2147385345 count variable = 1,125ms
2147385345 foreach = 1,404ms //foreach > Length > variable
2147385345 ary.Length = 1,421ms
2147385345 count variable = 1,471ms
2147385345 foreach = 1,432ms //variable > foreach > Length
2147385345 ary.Length = 1,801ms
2147385345 count variable = 1,475ms
2147385345 foreach = 1,848ms //foreach > Length > variable
2147385345 ary.Length = 1,608ms
2147385345 count variable = 1,527ms
2147385345 foreach = 1,587ms //Length > foreach > variable
2147385345 ary.Length = 1,554ms
2147385345 count variable = 1,392ms
2147385345 foreach = 2,057ms //foreach > Length > variable
... (omitted) ...

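Still, my rough impression was that storing array.Length in a variable came out ahead more often, and the averages over the 100 tests confirm it: for + count variable is fastest, followed by for + ary.Length, with foreach slowest! (The red box in the figure below marks the average elapsed time over the 100 tests.)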
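The per-column averages can also be computed in code from the CSV that the test program writes. The following is only a minimal sketch: the class name `ResultAverages` and the method `ColumnAverages` are hypothetical names introduced here, and it assumes the file layout produced above (one header line, then rows of three integer millisecond values). The sample data is just the first three runs from the output listed earlier, not the full 100-run result.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class ResultAverages
{
    // Parses CSV text (one header line, then rows of integer millisecond
    // values) and returns the mean of each column.
    // Assumes at least one data row is present.
    public static double[] ColumnAverages(string csv)
    {
        string[] lines = csv.Split(new[] { '\r', '\n' },
                                   StringSplitOptions.RemoveEmptyEntries);

        // Skip the header row, then parse every remaining row into numbers.
        long[][] rows = lines.Skip(1)
            .Select(line => line.Split(',')
                .Select(cell => long.Parse(cell, CultureInfo.InvariantCulture))
                .ToArray())
            .ToArray();

        int columns = rows[0].Length;
        return Enumerable.Range(0, columns)
            .Select(c => rows.Average(r => r[c]))
            .ToArray();
    }

    public static void Main()
    {
        // Sample input: the first three runs shown in the output above.
        string csv = "ary.Length,count var,foreach\n" +
                     "1274,1260,1701\n" +
                     "1805,1472,2151\n" +
                     "1266,1238,1621\n";

        double[] avg = ColumnAverages(csv);
        Console.WriteLine("ary.Length={0:F1}ms, count var={1:F1}ms, foreach={2:F1}ms",
                          avg[0], avg[1], avg[2]);
    }
}
```

To run it against the real data, pass the result of `File.ReadAllText` on the CSV path used by the test program instead of the sample string.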

【Conclusion】

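Although the averages show that storing the array length in a variable gives the shortest time, the gap is only 0.14 seconds over roughly 330 million iterations, and other factors at run time can easily reverse the outcome (the repeated tests show each of the three methods winning at times; even the more complex foreach can come out fastest).
So when writing array loops in .NET, there is no need to fret over this "nano-scale" difference; writing the code in the most intuitive, readable way matters more. I would simply use Array.Length, and even foreach is fine, as long as the code is easy to understand.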
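To put the "nano-scale" remark in numbers, the gap can be converted to a per-iteration cost. This is a rough sketch under two assumptions: that the 0.14-second gap applies to one timed unit of the test program (65535 elements × 5000 passes), and that the hypothetical helper `NanosecondsPerIteration` (a name introduced here for illustration) is an acceptable way to express it.

```csharp
using System;

public static class PerIterationCost
{
    // Converts a total time difference into nanoseconds per loop iteration.
    public static double NanosecondsPerIteration(double seconds, long iterations)
    {
        return seconds * 1e9 / iterations;
    }

    public static void Main()
    {
        long iterations = 65535L * 5000; // array length x TIMES from the test program
        double diffSeconds = 0.14;       // average gap quoted in the conclusion (assumed per timed unit)

        Console.WriteLine("{0:N0} iterations, {1:F3} ns per iteration",
                          iterations,
                          NanosecondsPerIteration(diffSeconds, iterations));
    }
}
```

Under those assumptions the difference works out to well under one nanosecond per iteration, which is smaller than the run-to-run swings visible in the output listing.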


Comments

# by Puritys

Using a variable is still a bit better; after all, different programming languages & compilers compile things differently.

# by Nye

[Raises hand] "declares an array of 65535 ints, storing 0 through 65535": wouldn't that make 63336 of them?

# by Nye

I mistyped too... 65536

# by Jeffrey

to Nye, ha! Thanks for the correction. I changed the description in the post to "storing 0 through 65534" so that it matches the code.
