Parser

A CsvParser is at the core of the library. It is used to parse the given CSV data into strongly-typed objects.

Constructing a Parser

Constructing a CsvParser requires two things: the CsvParserOptions and a CsvMapping.

Mapping

The parser has to know how to map between the textual CSV data and the strongly-typed .NET object. This is done with a CsvMapping, which defines the mapping between a CSV column index and a property of the .NET object. CsvMapping is an abstract base class that has to be implemented by the user of the library.

The CsvMapping exposes the method MapProperty to define the actual property mapping.

You have seen an example of a CsvMapping in the Quickstart document.

private class CsvPersonMapping : CsvMapping<Person>
{
    public CsvPersonMapping()
        : base()
    {
        MapProperty(0, x => x.FirstName);
        MapProperty(1, x => x.LastName);
        MapProperty(2, x => x.BirthDate);
    }
}

Options

By default, the CsvParser doesn’t know whether the header row of the CSV data should be skipped or how to tokenize a line (see Tokenizer). These options are set in the CsvParserOptions and passed to the CsvParser. Since input data can be processed in parallel, there are also options for the degree of parallelism.

In the simplest case it is sufficient to pass the flag for the header skip and the column delimiter.

You have seen an example of CsvParserOptions in the Quickstart document.

CsvParserOptions csvParserOptions = new CsvParserOptions(false, ';');
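With a mapping and the options in place, the two pieces can be combined into a parser. The following is a minimal sketch rather than a definitive recipe: the Person class and its property types are assumptions made for illustration, the ParserFactory helper is hypothetical, and the namespaces and the CsvParser constructor call are assumed to match the Quickstart document.

```csharp
using System;
using TinyCsvParser;
using TinyCsvParser.Mapping;

// Hypothetical entity; the property types are assumed for illustration.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime BirthDate { get; set; }
}

// Maps CSV column indices to the properties of Person.
public class CsvPersonMapping : CsvMapping<Person>
{
    public CsvPersonMapping()
    {
        MapProperty(0, x => x.FirstName);
        MapProperty(1, x => x.LastName);
        MapProperty(2, x => x.BirthDate);
    }
}

// Hypothetical helper that wires the options and the mapping together.
public static class ParserFactory
{
    public static CsvParser<Person> Create()
    {
        // Do not skip a header row; columns are separated by ';'.
        var csvParserOptions = new CsvParserOptions(false, ';');
        var csvMapper = new CsvPersonMapping();

        // Assumed constructor shape: options first, mapping second.
        return new CsvParser<Person>(csvParserOptions, csvMapper);
    }
}
```

Keeping the construction in one place makes it easy to reuse the same options and mapping for every file or string that has to be parsed.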

Parsing CSV Data

The CsvParser exposes the methods ReadFromFile and ReadFromString to read the CSV data from a given file or string.

You have seen an example of reading CSV data from a file in the Quickstart document.

var result = csvParser
    .ReadFromFile("person.csv", Encoding.UTF8)
    .ToList();

Working with the Results

The return value of the CsvParser.ReadFromFile and CsvParser.ReadFromString methods is a ParallelQuery<CsvMappingResult<TEntity>>.

A ParallelQuery? A ParallelQuery is a special IEnumerable from Parallel LINQ that behaves almost like a normal IEnumerable, with a few exceptions (for example, PLINQ does not guarantee the order of the results unless ordering is explicitly requested). To evaluate the results, you can iterate through the ParallelQuery, which is the preferred way of working with the results. If you are uncomfortable with enumerables, you can also turn the data into a simple list by calling the method ToList() on it.
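To illustrate how a ParallelQuery differs from a plain IEnumerable, here is a small sketch that uses only the .NET base class library and no CSV data at all. The ParallelQueryDemo class and its method names are hypothetical; the point is only that ordering has to be requested explicitly with AsOrdered().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelQueryDemo
{
    // Squares the inputs in parallel; AsOrdered() asks PLINQ to preserve source order.
    public static List<int> SquaresInSourceOrder(IEnumerable<int> source)
    {
        ParallelQuery<int> query = source
            .AsParallel()
            .AsOrdered()
            .Select(x => x * x);

        return query.ToList();
    }

    // Without AsOrdered(), the same elements are returned,
    // but their order is not guaranteed.
    public static List<int> SquaresInAnyOrder(IEnumerable<int> source)
    {
        return source
            .AsParallel()
            .Select(x => x * x)
            .ToList();
    }
}
```

Both methods evaluate the query with ToList(); iterating the ParallelQuery with foreach would work the same way.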

Note

The library uses Parallel LINQ (PLINQ) to support a high degree of parallelism. Building a parallel processing pipeline with PLINQ may not be intuitive, so it is worth reading up on the most important PLINQ concepts first. MSDN has good documentation on working with Parallel LINQ: Parallel LINQ (PLINQ).

The CsvMappingResult holds the parse result for a single line. You can access the parsed object through the property CsvMappingResult<TEntity>.Result, but this property is only populated when parsing was successful. You can check whether a line was parsed successfully by evaluating the property CsvMappingResult<TEntity>.IsValid.

Attention

The CsvParser doesn’t throw any exceptions during parsing: the input data is processed in parallel, and the CsvParser shouldn’t stop parsing just because a single line has an error. Instead, the CsvMappingResult contains an error if parsing a line was not successful.

If a CSV line could not be parsed, the property CsvMappingResult<TEntity>.Error is populated and contains the problematic column and error message.
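The properties described above can be combined into a simple result-handling loop. This is a sketch under several assumptions: the Person class, its property types, the sample data and the ResultHandlingExample helper are made up for illustration, and the CsvReaderOptions argument of ReadFromString (which tells the parser how to split a string into lines) is assumed to have this shape. The error is printed through its string representation, so no particular property names on the error object are assumed.

```csharp
using System;
using System.Linq;
using TinyCsvParser;
using TinyCsvParser.Mapping;

// Hypothetical entity; the property types are assumed for illustration.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime BirthDate { get; set; }
}

public class CsvPersonMapping : CsvMapping<Person>
{
    public CsvPersonMapping()
    {
        MapProperty(0, x => x.FirstName);
        MapProperty(1, x => x.LastName);
        MapProperty(2, x => x.BirthDate);
    }
}

public static class ResultHandlingExample
{
    public static void Run()
    {
        var csvParser = new CsvParser<Person>(
            new CsvParserOptions(false, ';'),
            new CsvPersonMapping());

        // Made-up sample data; the BirthDate of the second line is not a date,
        // so that line is expected to be reported as invalid.
        var csvData = string.Join(Environment.NewLine,
            "Philipp;Wagner;1986-05-12",
            "Max;Mustermann;not-a-date");

        // Assumed: ReadFromString takes reader options that define the line separators.
        var csvReaderOptions = new CsvReaderOptions(new[] { Environment.NewLine });

        foreach (var result in csvParser.ReadFromString(csvReaderOptions, csvData))
        {
            if (result.IsValid)
            {
                // Result is only populated for successfully parsed lines.
                Console.WriteLine($"{result.Result.FirstName} {result.Result.LastName}");
            }
            else
            {
                // Error is only populated for lines that failed to parse.
                Console.WriteLine($"Could not parse line: {result.Error}");
            }
        }
    }
}
```

Checking IsValid before touching Result keeps a single malformed line from breaking the processing of all the other lines.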