A Note on Implementing Equals() and the == and != Operators
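In C#, custom classes are reference types, so by default == and Equals() return true only when both variables point to the same object (if the difference between reference types and value types feels fuzzy, a little self-inflicted quiz is a good way to clear it up), and this is often not what we expect. Take a stock ticker object as an example: suppose a Ticker class splits the ticker into two parts, Symbol (e.g. 2330) and Market (e.g. TW), and also exposes FullSymbol, which returns 2330.TW: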
public class Ticker
{
    public string Symbol { get; set; }
    public string Market { get; set; }

    public Ticker(string symbol, string market)
    {
        Symbol = symbol;
        Market = market;
    }

    public Ticker(string fullsymbol)
    {
        var p = fullsymbol.Split('.');
        if (p.Length != 2) throw new ArgumentException();
        Symbol = p[0];
        Market = p[1];
    }

    public string FullSymbol
    {
        get
        {
            return Symbol + "." + Market;
        }
    }
}
In the test program, t1 and t2 both hold 2330.TW, while t3 points to t1; we then compare them with Equals() and ==:
static void Main(string[] args)
{
    var t1 = new Ticker("2330", "TW");
    var t2 = new Ticker("2330.TW");
    var t3 = t1;
    Console.WriteLine("Equals Test: {0}", t1.Equals(t2));
    Console.WriteLine("== Test: {0}", t1 == t2);
    Console.WriteLine("== Test(Same Object): {0}", t1 == t3);
    Console.Read();
}
As a result, both t1.Equals(t2) and t1 == t2 return false; only t1 == t3 returns true:
Equals Test: False
== Test: False
== Test(Same Object): True
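The default behavior seen above can be reproduced in isolation. The following is a minimal sketch, not part of the original post; the RefBox and ValueBox types and the DefaultEqualityDemo class are hypothetical names introduced only for illustration. It contrasts a class that does not override Equals(), a struct with the same field, and string, which is a reference type that defines value-based equality.

```csharp
using System;

// Hypothetical reference type: Equals() is NOT overridden.
public class RefBox
{
    public readonly int X;
    public RefBox(int x) { X = x; }
}

// Hypothetical value type with the same single field.
public struct ValueBox
{
    public readonly int X;
    public ValueBox(int x) { X = x; }
}

public static class DefaultEqualityDemo
{
    public static void Main()
    {
        var a = new RefBox(1);
        var b = new RefBox(1);
        // Two distinct instances: the default Equals() compares references.
        Console.WriteLine(a.Equals(b));                  // False
        Console.WriteLine(a == b);                       // False

        var c = new ValueBox(1);
        var d = new ValueBox(1);
        // Structs inherit ValueType.Equals(), which compares field values.
        Console.WriteLine(c.Equals(d));                  // True

        // string is a reference type that overloads == to compare contents.
        string s1 = new string('x', 3);
        string s2 = new string('x', 3);
        Console.WriteLine(s1 == s2);                     // True
        Console.WriteLine(object.ReferenceEquals(s1, s2)); // False
    }
}
```

The string case is the behavior the Ticker class is about to imitate: two separate instances that compare equal because their contents match.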
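Following the MSDN guidance, we can override Equals() and overload the == and != operators to define our own comparison rule for Ticker, treating two instances as equal when both Symbol and Market match: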
public class Ticker
{
    public string Symbol { get; set; }
    public string Market { get; set; }

    public Ticker(string symbol, string market)
    {
        Symbol = symbol;
        Market = market;
    }

    public Ticker(string fullsymbol)
    {
        var p = fullsymbol.Split('.');
        if (p.Length != 2) throw new ArgumentException();
        Symbol = p[0];
        Market = p[1];
    }

    public string FullSymbol
    {
        get
        {
            return Symbol + "." + Market;
        }
    }

    //REF: https://msdn.microsoft.com/en-us/library/ms173147(v=vs.90).aspx
    public override bool Equals(System.Object obj)
    {
        // If parameter is null return false.
        if (obj == null) return false;
        // If parameter cannot be cast to Ticker return false.
        Ticker p = obj as Ticker;
        if ((System.Object)p == null) return false;
        // Return true if the fields match:
        return FullSymbol == p.FullSymbol;
    }

    public bool Equals(Ticker p)
    {
        // If parameter is null return false:
        if ((object)p == null) return false;
        // Return true if the fields match:
        return FullSymbol == p.FullSymbol;
    }

    public override int GetHashCode()
    {
        return FullSymbol.GetHashCode();
    }

    public static bool operator ==(Ticker a, Ticker b)
    {
        // If both are null, or both are same instance, return true.
        if (System.Object.ReferenceEquals(a, b)) return true;
        // If one is null, but not both, return false.
        if (((object)a == null) || ((object)b == null)) return false;
        // Return true if the fields match:
        return a.FullSymbol == b.FullSymbol;
    }

    public static bool operator !=(Ticker a, Ticker b)
    {
        return !(a == b);
    }
}
Testing again, the results of Equals() and == now depend on whether Symbol and Market match, which is what we want.
static void Main(string[] args)
{
    var t1 = new Ticker("2330", "TW");
    var t2 = new Ticker("2330.TW");
    var t3 = new Ticker("1234", "TW");
    Console.WriteLine("Equals Test: {0}", t1.Equals(t2));
    Console.WriteLine("== Test: {0}", t1 == t2);
    Console.WriteLine("!Equals Test: {0}", !t1.Equals(t3));
    Console.WriteLine("!= Test: {0}", t1 != t3);
    Console.Read();
}
Test results:
Equals Test: True
== Test: True
!Equals Test: True
!= Test: True
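One practical consequence of overriding Equals() together with GetHashCode() is that hash-based collections start treating equal instances as the same key. The sketch below illustrates this under stated assumptions: SimpleTicker is a hypothetical, trimmed-down stand-in for the Ticker class (not the author's code), and the stored string is arbitrary sample data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Ticker: value equality on "symbol.market".
public sealed class SimpleTicker
{
    private readonly string full;

    public SimpleTicker(string symbol, string market)
    {
        full = symbol + "." + market;
    }

    public override bool Equals(object obj)
    {
        var other = obj as SimpleTicker;
        return (object)other != null && full == other.full;
    }

    // Equal objects must report equal hash codes, so hash the same data
    // that Equals() compares.
    public override int GetHashCode()
    {
        return full.GetHashCode();
    }
}

public static class LookupDemo
{
    public static void Main()
    {
        var notes = new Dictionary<SimpleTicker, string>();
        notes[new SimpleTicker("2330", "TW")] = "first entry";

        // A different instance with the same content finds the entry.
        string note;
        bool found = notes.TryGetValue(new SimpleTicker("2330", "TW"), out note);
        Console.WriteLine("{0} {1}", found, note);   // True first entry

        // A HashSet keeps only one of two equal instances.
        var set = new HashSet<SimpleTicker>
        {
            new SimpleTicker("2330", "TW"),
            new SimpleTicker("2330", "TW")
        };
        Console.WriteLine(set.Count);                // 1
    }
}
```

If GetHashCode() were left at its default while Equals() was overridden, the second instance would usually land in a different bucket and the lookup would fail, which is why the two methods are overridden as a pair.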
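Done? Not so fast! The example above hides a mistake.
A colleague forwarded a ReSharper warning: Non-readonly fields referenced in GetHashCode(). The inputs to the GetHashCode computation must be guaranteed not to change, and readonly fields are the most direct and effective way to ensure that. Only then did I notice that in the MSDN TwoDPoint example, x and y are readonly, meaning they can be assigned only during construction and cannot be changed afterwards. My original version uses FullSymbol.GetHashCode(), so as soon as Symbol or Market changes, GetHashCode() returns a different result.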
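Eric Lippert has an article on what you need to know about GetHashCode; the relevant points are excerpted and summarized below:
Rule: equal items have equal hash codes
If two objects are equal, their hash codes must be equal; conversely, if two objects have different hash codes, Equals() must return false for them.
But logically, two objects having the same hash code does not mean they are equal. (A hash code has only about 4 billion possible values, so distinct objects can end up with the same hash code.)
Guideline: the integer returned by GetHashCode should never change
Ideally, GetHashCode is computed from fields that never change, and its value stays the same for the lifetime of the object. But that is only the ideal; the actual rule is: at the very least, the result of GetHashCode() must not change while another data structure (note: e.g. Dictionary<TKey, TValue>, Hashtable) depends on the object's hash code.
Imagine an object is placed in a hash-based data structure and its GetHashCode() result then changes: Contains() lookups obviously break. The object goes into slot #5 according to its hash code when it is added; after it is modified its hash code becomes 47, so Contains() looks for it in slot #47 and finds nothing.
Beyond that, many LINQ operations also rely on GetHashCode(); if it is allowed to keep changing, the resulting spooky behavior can have you going in circles until you consider a career change.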
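The broken Contains() scenario described above can be reproduced with a few lines. This is a minimal sketch under stated assumptions: MutableKey is a hypothetical type invented for the demonstration, and its hash code is simply its Id, so the values 5 and 47 mirror the slot numbers in the description (actual bucket placement is an internal detail of HashSet<T>).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash code depends on a mutable property.
public class MutableKey
{
    public int Id { get; set; }

    public MutableKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    // The hash code changes whenever Id changes: this is the bug.
    public override int GetHashCode() { return Id; }
}

public static class MutableKeyDemo
{
    public static void Main()
    {
        var key = new MutableKey(5);
        var set = new HashSet<MutableKey> { key };
        Console.WriteLine(set.Contains(key));                // True

        // Mutate the key while it is stored in the set.
        key.Id = 47;

        // The set recorded hash code 5 when the key was added, so a
        // lookup with hash code 47 no longer matches the stored entry.
        Console.WriteLine(set.Contains(key));                // False
        // A fresh key with the old Id hashes to 5, but Equals() now fails.
        Console.WriteLine(set.Contains(new MutableKey(5)));  // False
        // The element is still in the set; it is just unreachable.
        Console.WriteLine(set.Count);                        // 1
    }
}
```

The element is neither removed nor findable, which is the kind of silent inconsistency the guideline is meant to prevent.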
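Having seen the error of my ways, I rewrote the code: Symbol and Market become read-only properties backed by newly declared readonly fields symbol and market, which are assigned in the constructors, and GetHashCode is computed from those two readonly fields. Only this eliminates the risk of Symbol/Market being modified later and changing the GetHashCode() result: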
public class Ticker
{
    readonly string symbol;
    readonly string market;
    public string Symbol { get { return symbol; } }
    public string Market { get { return market; } }

    public Ticker(string symbol, string market)
    {
        this.symbol = symbol;
        this.market = market;
    }

    public Ticker(string fullsymbol)
    {
        var p = fullsymbol.Split('.');
        if (p.Length != 2) throw new ArgumentException();
        this.symbol = p[0];
        this.market = p[1];
    }

    // ...rest omitted...

    public override int GetHashCode()
    {
        return symbol.GetHashCode() ^ market.GetHashCode();
    }
}
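A side note on the XOR combination used in the listing above: XOR is symmetric, so swapping the two values yields the same hash code, and two identical strings always combine to 0. That does not violate any rule (unequal objects may share a hash code), but it can add avoidable collisions. The sketch below compares it with an order-sensitive combination; HashCombineDemo, XorHash and OrderedHash are hypothetical names, and the 17/31 constants are just a common convention, not something taken from the original post.

```csharp
using System;

public static class HashCombineDemo
{
    // Same combination as the listing above: symmetric in its inputs.
    public static int XorHash(string symbol, string market)
    {
        return symbol.GetHashCode() ^ market.GetHashCode();
    }

    // Hypothetical order-sensitive alternative using a multiply-and-add scheme.
    public static int OrderedHash(string symbol, string market)
    {
        unchecked // let integer overflow wrap instead of throwing
        {
            int hash = 17;
            hash = hash * 31 + symbol.GetHashCode();
            hash = hash * 31 + market.GetHashCode();
            return hash;
        }
    }

    public static void Main()
    {
        // Swapped inputs always collide under XOR.
        Console.WriteLine(XorHash("2330", "TW") == XorHash("TW", "2330")); // True
        // Identical inputs cancel out to zero.
        Console.WriteLine(XorHash("TW", "TW"));                            // 0
        // With the ordered scheme, swapped inputs are very unlikely to collide.
        Console.WriteLine(OrderedHash("2330", "TW") == OrderedHash("TW", "2330"));
    }
}
```

Both helpers throw NullReferenceException if either string is null, so a type that allows null fields would need a null check before hashing.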
Keep this principle in mind whenever you write your own GetHashCode().
Comments
# by 路人甲
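Thanks for sharing, 黑大. However, if the object needs to be serialized/deserialized, it must provide a parameterless constructor, and in that case the object's properties can no longer be given values (in this example, symbol and market would always be null).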
# by Jeffrey
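To 路人甲: agreed. If the serialization/deserialization mechanism in use requires a default (parameterless) constructor, this approach will not work. In that situation I would consider switching to a more flexible serialization library, for example Json.NET: http://blog.darkthread.net/post-2016-08-10-json-net-constructor-issue.aspx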