-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Closed
Closed
Copy link
Labels
area-System.GlobalizationquestionAnswer questions and provide assistance, not an issue with source code or documentation.Answer questions and provide assistance, not an issue with source code or documentation.
Milestone
Description
Description
I'm extending a package to support .NET 5.0 and ran into a breaking change. Given the console application:
using System;
namespace ConsoleApp
{
class Program
{
static void Main(string[] args)
{
var actual = "Detail of supported commands\n============\n## Documentation produced for DelegateDecompiler, version 0.28.0 on Thursday, 22 October 2020 16:03\n\r\nThis file documents what linq commands **DelegateDecompiler** supports when\r\nworking with [Entity Framework Core](https://docs.microsoft.com/en-us/ef/core/) (EF).\r\nEF has one of the best implementations for converting Linq `IQueryable<>` commands into database\r\naccess commands, in EF's case T-SQL. Therefore it is a good candidate for using in our tests.\r\n\r\nThis documentation was produced by compaired direct EF Linq queries against the same query implemented\r\nas a DelegateDecompiler's `Computed` properties. This produces a Supported/Not Supported flag\r\non each command type tested. Tests are groups and ordered to try and make finding things\r\neasier.\r\n\r\nSo, if you want to use DelegateDecompiler and are not sure whether the linq command\r\nyou want to use will work then clone this project and write your own tests.\r\n(See [How to add a test](HowToAddMoreTests.md) documentation on how to do this). \r\nIf there is a problem then please fork the repository and add your own tests. \r\nThat will make it much easier to diagnose your issue.\r\n\r\n*Note: The test suite has only recently been set up and has only a handful of tests at the moment.\r\nMore will appear as we move forward.*\r\n\r\n\r\n### Group: Unit Test Group\n#### [My Unit Test1](../TestGroup01UnitTestGroup/Test01MyUnitTest1):\n- Supported\n * Good1 (line 1)\n * Good2 (line 2)\n\r\n#### [My Unit Test2](../TestGroup01UnitTestGroup/Test01MyUnitTest2):\n- Supported\n * Good1 (line 1)\n * Good2 (line 2)\n\r\n\r\n\nThe End\n";
var expected = "\n#### [My Unit Test2](";
Console.WriteLine($"actual.Contains(expected): {actual.Contains(expected)}");
Console.WriteLine($"actual.IndexOf(expected): {actual.IndexOf(expected)}");
}
}
}
I get different results based on the runtime from .NET Core 3.0 -> .NET 5.0:
.NET Core 3.0:
actual.Contains(expected): True
actual.IndexOf(expected): 1475
.NET 5.0:
actual.Contains(expected): True
actual.IndexOf(expected): -1
Configuration
Windows 10 Pro Build 19041 x64
.NET Core 3.1.9
.NET 5.0.0-rc.2.20475.5
Regression?
Yes, it worked through .NET Core 3.1.9
gldraphael, vcsjones, jeremygauthier, markgould, andrii-litvinov and 35 moreoskardudycz, AraHaan and aeoriwndxtmq, dadhi, tomkerkhove, ForNeVeR, amoerie and 3 moreelgonzo and AraHaantheolivenbaum, Tsonov, fals, tomkerkhove, BorisWilhelms and 14 more
Metadata
Metadata
Assignees
Labels
area-System.GlobalizationquestionAnswer questions and provide assistance, not an issue with source code or documentation.Answer questions and provide assistance, not an issue with source code or documentation.
Type
Projects
Relationships
Development
Select code repository
Activity
danmoseley commentedon Oct 22, 2020
@tarekgh
tarekgh commentedon Oct 22, 2020
This is by design as in .NET 5.0 we have switched using ICU instead of NLS. You can look at https://docs.microsoft.com/en-us/dotnet/standard/globalization-localization/globalization-icu for more details.
You have the option to use the config switch
System.Globalization.UseNls
to go back to old behavior but we don't recommend doing that as ICU is more correct and moving forward using ICU will give consistency across OS's.tarekgh commentedon Oct 22, 2020
forgot to say, if you want
IndexOf
behave asContains
, you should use Ordinal comparisons at that time.safern commentedon Oct 22, 2020
Yeah, if you run this code in Unix targeting
netcoreapp3.1
you will see this same behavior:and as @tarekgh with
Ordinal
comparison it returns the expected result.ericstj commentedon Oct 22, 2020
I think this is failing because the mix of
\r\n
and\n
in the source string. If I replace all instances of\r\n
with\n
it works. Same is true if I make everything\r\n
. It's just the mix that is resulting in different results from ICU's comparison.GrabYourPitchforks commentedon Oct 22, 2020
Issue as reported on Twitter was that within the 5.0 runtime, repeated calls to string.IndexOf with the same inputs was giving different results on each call.https://twitter.com/jbogard/status/1319381273585061890?s=21Edit: Above was a misunderstanding.
pinkli commentedon Oct 29, 2020
Yes, this totally expressed my concerns. As a typical Chinese developer, we rarely put



StringComparison
orCultureInfo
explicitly when calling string related methods in our application codes, and it just works. We defintely don't expect a different behaviour betweenIndexOf
andContains
!.net 5.0
.net core 3.1
.net framework
twinter-amosfivesix commentedon Oct 29, 2020
Perhaps a key point here is that the two methods have always been inconsistent. They did not suddenly become inconsistent in .NET 5.0. If I'm following things correctly,
IndexOf
has always used current culture comparison,Contains
has always used ordinal comparison. Granted .NET 5.0 adds more inconsistency. But the mistake here was in the original API design that allowed this inconsistency.safern commentedon Oct 29, 2020
That is correct but it is the other way around,
IndexOf(string)
uses current culture,IndexOf(char)
uses Ordinal andContains
uses ordinal.GrabYourPitchforks commentedon Oct 29, 2020
I'll elaborate briefly on the
IndexOf
vs.Contains
differences that others alluded to recently.IndexOf(string)
has always assumed CurrentCulture comparison, andContains(string)
has always assumed Ordinal comparison. This discrepancy existed way back in .NET Framework. It is not a new discrepancy introduced in .NET 5. For example, on .NET Framework (which uses Windows's NLS facility), the ligature "æ" and the two-char string "ae" compare as equal under a linguistic comparer. This results in the following discrepancy:This discrepancy has existed for over a decade. It's documented and there is guidance regarding it. Is it a misdesign? Perhaps. We're discussing over at #43956 ways to make the ecosystem healthier moving forward.
For this specific issue, what we're really asking ourselves is: "Given that the discrepancy has existed forever, how far apart are these two methods allowed to deviate in practice before it harms the larger ecosystem?" I think we're still trying to define where that threshold should be. Reports like this are extremely helpful because they give us insight into people use these APIs in practice and what expectations customers have regarding their behavior. As stewards of the framework, we need to consider not just the technical documentation, but also the manner in which real-world apps consume these APIs.
This comment isn't really intended to defend any particular point of view. My intent is to clarify some misconceptions that I've seen and to help explain how we've framed the problem.
ufcpp commentedon Oct 30, 2020
Even if InvariantCulture specified, is it correct behavior that
\n
doesn't match in the ICU version?ufcpp commentedon Oct 30, 2020
Maybe, the following code displays 5 on Linux and -1 on Windows if we use git and its default settings (autocrlf=true)?
120 remaining items