If GetHashCode() for string or integer is not guaranteed to be unique, why use it?

Question

As I wrote in the title.

If it's not safe to use GetHashCode() in your application, why use it (for strings and integers)? I want to use it with the Intersect and Except methods in LINQ, or to create my own IEqualityComparer class. It feels like a gamble if it's not 100% safe.

Or have I missed something?

As quoted from the String.GetHashCode Method page on https://learn.microsoft.com/:

Important

If two string objects are equal, the GetHashCode method returns identical values. However, there is not a unique hash code value for each unique string value. Different strings can return the same hash code.

The hash code itself is not guaranteed to be stable. Hash codes for identical strings can differ across .NET implementations, across .NET versions, and across .NET platforms (such as 32-bit and 64-bit) for a single version of .NET. In some cases, they can even differ by application domain. This implies that two subsequent runs of the same program may return different hash codes.

As a result, hash codes should never be used outside of the application domain in which they were created, they should never be used as key fields in a collection, and they should never be persisted.

Finally, don't use the hash code instead of a value returned by a cryptographic hashing function if you need a cryptographically strong hash. For cryptographic hashes, use a class derived from the System.Security.Cryptography.HashAlgorithm or System.Security.Cryptography.KeyedHashAlgorithm class.

For more information about hash codes, see Object.GetHashCode.

Recommended answer

I think what confuses you is that you think the hash code maps to the address of a value, but that's not exactly how it works.

Think of it like bookshelves, where the hash code maps to the address of a shelf. Two items with the same hash code are placed on the same shelf. Given the address of a shelf with three books on it, the dictionary checks only those three books rather than all the books. So the more unique the hash codes are, the faster the dictionary lookup is.
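To make the shelf analogy concrete, here is a minimal sketch. The names LengthComparer and ShelfDemo are hypothetical and chosen only for illustration. The comparer deliberately hashes by length alone, so all three keys land on the same shelf, yet the lookup still returns the right value because the dictionary falls back to Equals for the items on that shelf.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for illustration: a deliberately weak hash (string length only).
public sealed class LengthComparer : IEqualityComparer<string>
{
    // Counts how often the collection had to fall back to a full comparison.
    public int EqualsCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    // Picks the "shelf": all strings of the same length share one hash code.
    public int GetHashCode(string obj) => obj.Length;
}

public static class ShelfDemo
{
    // Builds a dictionary whose three keys all collide, then looks one of them up.
    public static int Lookup(LengthComparer comparer, string key)
    {
        var shelf = new Dictionary<string, int>(comparer)
        {
            ["cat"] = 1,
            ["dog"] = 2,
            ["owl"] = 3,
        };
        return shelf[key];
    }
}
```

Collisions cost extra Equals calls (visible in EqualsCalls), but they never produce a wrong answer. That is why a non-unique hash code is a performance concern rather than a correctness concern.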

When you create an IEqualityComparer, if you can make GetHashCode() return unique values, the Dictionary or HashSet using it will perform faster than when there are many duplicates.

Check this example:

// Hashes by length only: every string of the same length collides.
public int GetHashCode(string obj)
{
    return obj.Length;
}

This is much faster than looping through the whole string, but it is not very unique (although it is valid): every string of the same length gets the same hash code.

This example is also valid, but an even worse choice:

// Hashes by first character only; assumes obj is non-empty (obj[0] throws otherwise).
public int GetHashCode(string obj)
{
    return (int)obj[0];
}

If you can guess something about the content of the strings, you can produce much better hash codes. For example, if you know each string is a social security number in the format "XXX-XX-XXXX", where each X represents a digit, the following is a great choice, because every distinct number gets a distinct hash code:

// Nine digits always fit in an int, so each valid number maps to a unique hash code.
public int GetHashCode(string obj)
{
    return int.Parse(obj.Replace("-", ""));
}
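Since the question mentions LINQ's Intersect and Except, the following sketch shows how such a hash method might sit inside a complete comparer. SsnComparer and SsnDemo are hypothetical names, and the code assumes every input is a well-formed "XXX-XX-XXXX" string (int.Parse throws on anything else).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: assumes every string is a well-formed "XXX-XX-XXXX" number.
public sealed class SsnComparer : IEqualityComparer<string>
{
    // Equality is still decided here; the hash code only narrows the search.
    public bool Equals(string x, string y) => string.Equals(x, y, StringComparison.Ordinal);

    // Nine digits always fit in an int (999,999,999 is below int.MaxValue).
    public int GetHashCode(string obj) => int.Parse(obj.Replace("-", ""));
}

public static class SsnDemo
{
    // Returns the numbers present in both sequences, using the custom comparer.
    public static List<string> Common(IEnumerable<string> first, IEnumerable<string> second)
        => first.Intersect(second, new SsnComparer()).ToList();
}
```

Intersect uses GetHashCode to find candidate matches quickly and Equals to confirm them, so the result would stay correct even with a weaker hash; only the speed would change.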

Other recommended answer

If it's not safe to use GetHashCode() in your application, why use it?

GetHashCode has a different purpose. If you need an equality test for strings, you should probably use String.Equals or the == operator; these are guaranteed to work correctly.

A hash code isn't meant to be a way to generate a unique number for each possible string; that is impossible. Here's the definition of a hash function:

A hash function is any function that can be used to map data of arbitrary size to fixed-size values.

It just maps a nearly infinite set of strings to a (comparatively) very limited set of integers. You might want to use a hash code if you need to spread a large number of strings uniformly across a smaller number of "buckets". Hash codes are used extensively in hash-based collections, e.g. HashSet.
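As a rough illustration of the bucket idea, the sketch below spreads strings over a fixed number of lists. BucketDemo and its methods are hypothetical helpers; this is a simplified picture and not how HashSet or Dictionary are actually implemented internally.

```csharp
using System;
using System.Collections.Generic;

public static class BucketDemo
{
    // Maps a string to one of bucketCount buckets using its hash code.
    public static int BucketOf(string value, int bucketCount)
    {
        // Clear the sign bit so the remainder is never negative.
        int nonNegative = value.GetHashCode() & 0x7FFFFFFF;
        return nonNegative % bucketCount;
    }

    // Places every value into the bucket chosen by BucketOf.
    public static List<string>[] Distribute(IEnumerable<string> values, int bucketCount)
    {
        var buckets = new List<string>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
        {
            buckets[i] = new List<string>();
        }
        foreach (string value in values)
        {
            buckets[BucketOf(value, bucketCount)].Add(value);
        }
        return buckets;
    }
}
```

Within a single run, equal strings always land in the same bucket, which is all an in-memory collection needs. Which bucket that is may differ from one run to the next, so the bucket index should not be stored.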

The documentation for GetHashCode mentions different issues with this method:

  • The method can generate a different result for the same string on different domains/machines/versions of .NET. This means that it's not a good idea to store the hash externally as some sort of unique identifier for later use;
  • The result is not cryptographically strong, so you shouldn't use it where you need a cryptographically secure hash, such as when hashing passwords.
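When a hash does need to be stored or compared across runs, the quoted documentation points to the classes derived from System.Security.Cryptography.HashAlgorithm. A minimal sketch using SHA256 follows; StableHash and Sha256Hex are hypothetical names, and hashing the UTF-8 bytes is an assumption made here for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StableHash
{
    // Returns the SHA-256 digest of the UTF-8 bytes of the input as lowercase hex.
    public static string Sha256Hex(string value)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
            // BitConverter yields "BA-78-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Unlike GetHashCode, this value depends only on the input bytes, so it is the same on every machine, .NET version, and run.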

Surely, it looks scary, but still, GetHashCode is good enough for in-memory collections, such as HashSet or Dictionary.

Also, see this question: Why is it important to override GetHashCode when Equals method is overridden?