A simple text clustering algorithm in C#.
It adds the ClusterBy extension method to IEnumerable<T>. You only need to specify which string property to cluster on, plus a few options.
Targets .NET Standard 2.0.
To get the latest version:
Install-Package TextClustering
Consider the following model:
public class Document
{
    public string Content { get; set; }
}
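A list of such documents can be built from any text source. The following is a minimal sketch, assuming the texts are already in memory or stored as .txt files in a folder. DocumentLoader, FromTexts and FromDirectory are hypothetical helper names used for illustration; they are not part of TextClustering.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class Document
{
    public string Content { get; set; }
}

// Hypothetical helper for illustration only; not part of TextClustering.
public static class DocumentLoader
{
    // Wraps each non-blank text in a Document and skips blank entries.
    public static List<Document> FromTexts(IEnumerable<string> texts)
    {
        return texts
            .Where(text => !string.IsNullOrWhiteSpace(text))
            .Select(text => new Document { Content = text })
            .ToList();
    }

    // Reads every *.txt file directly inside the folder as one document.
    public static List<Document> FromDirectory(string path)
    {
        var texts = Directory
            .EnumerateFiles(path, "*.txt")
            .Select(file => File.ReadAllText(file));
        return FromTexts(texts);
    }
}
```

Skipping blank texts is a design choice of this sketch, on the assumption that empty documents add nothing to a clustering run.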
How to invoke it:
using TextClustering;
// ...
var documents = new List<Document>();
// Fill list of documents.
var result = documents.ClusterBy(document => document.Content, options => options
    .WithMinClusterSize(5) // Minimum cluster size (default: 5, but you should change it)
    .WithMinWordLength(5) // Minimum word length
    .WithMaxPresencePercent(10) // Maximum overall presence, in percent, of a single word across all text
    .UseCaching(true) // Optional, true by default. Uses more RAM but avoids repeating the same calculations.
    .WithMaxDegreeOfParallelism(Environment.ProcessorCount) // Optional, uses one thread by default.
    .WithLanguages(Language.English, Language.French) // Optional, defaults to English stop words. Stop words are so common that they carry very little useful information, so they are removed.
);
// result.Unclassified // List<Document>
// result.Clusters // List<List<Document>>
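Once the call returns, the two lists can be inspected like any other collection. The sketch below shows one way to turn them into a readable summary. It takes the two lists directly rather than the result object, because the exact result type is not shown here. ClusterReport, Summarize and the previewLength parameter are hypothetical names introduced for illustration; they are not part of TextClustering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public class Document
{
    public string Content { get; set; }
}

// Hypothetical helper for illustration only; not part of TextClustering.
public static class ClusterReport
{
    // Builds a text summary: clusters ordered from largest to smallest,
    // each document shown as a short preview, then the unclassified count.
    public static string Summarize(
        List<List<Document>> clusters,
        List<Document> unclassified,
        int previewLength = 40)
    {
        var report = new StringBuilder();
        var ordered = clusters.OrderByDescending(cluster => cluster.Count).ToList();

        for (var i = 0; i < ordered.Count; i++)
        {
            report.AppendLine($"Cluster {i + 1}: {ordered[i].Count} document(s)");
            foreach (var document in ordered[i])
            {
                report.AppendLine("  " + Preview(document.Content, previewLength));
            }
        }

        report.AppendLine($"Unclassified: {unclassified.Count} document(s)");
        return report.ToString();
    }

    // Truncates long content so each document fits on one line.
    private static string Preview(string text, int maxLength)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        return text.Length <= maxLength ? text : text.Substring(0, maxLength) + "...";
    }
}
```

With the result from the call above, this could be used as Console.WriteLine(ClusterReport.Summarize(result.Clusters, result.Unclassified)).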
For a more complete example, see the TextClusteringExample project.
Code released under the MIT license.