More on CSV parsers in C++

I was watching Herb Sutter talk about the new C++ the other day. The video was from the Windows Build conference, and the talk was really good, by the way. Anyway, Herb claims that modern C++ is clean, type-safe and efficient. This got me thinking about my CSV parsers, which I was not really happy with. My main concern is clarity, but without sacrificing too much efficiency.

I started playing around with the Boost tokenizer, trying to get more out of it. After a while I also rediscovered lexical_cast from the Boost library, which provides a short and clean way to convert strings to integers or doubles. Combining this with the stream iterators, this was my result:

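#include <iostream>
#include <string>
#include <fstream>
#include <iterator>
#include <vector>
#include <boost/tokenizer.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/algorithm/string.hpp>

template<class T>
std::vector<std::vector<T>> ParseCsv(std::istream& is)
{
    using namespace std;
    using namespace boost;
    typedef istream_iterator<char> iterator;
    typedef char_separator<char> separator;
    typedef tokenizer<separator, iterator, string> Tokenizer;

    // istream_iterator<char> skips whitespace by default, which would
    // swallow the "\n" row delimiters, so turn skipping off first.
    is >> noskipws;

    vector<vector<T>> result;
    // "," is dropped; "\n" is kept and comes back as a token of its own.
    Tokenizer tokens(iterator(is), iterator(), separator(",", "\n"));
    bool newLine = true;
    for(auto token = tokens.begin(); token != tokens.end(); ++token)
    {
        if (newLine)
        {
            result.push_back(vector<T>());   // start a new row
            newLine = false;
        }
        if (*token == "\n")
            newLine = true;                  // the next token opens a new row
        else
            // Trim stray spaces (or a trailing '\r'), then convert;
            // lexical_cast throws bad_lexical_cast on a malformed field.
            result.back().push_back(lexical_cast<T>(trim_copy(*token)));
    }
    return result;
}
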

Usage is simple:


// A named stream is needed: a temporary cannot bind to std::istream&.
ifstream file("test.csv");
vector<vector<int>> result = ParseCsv<int>(file);


I think I like this one better than my previous attempts, assuming, though, that Boost is available.
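
If Boost is not an option, roughly the same result can probably be had with only the standard library. What follows is a minimal sketch rather than a drop-in replacement: the name ParseCsvStd is made up for illustration, it silently skips fields that fail to convert (where lexical_cast would throw), and it drops empty lines instead of reporting them.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical Boost-free counterpart of ParseCsv. The name and the choice
// to skip unparsable fields and empty lines are assumptions of this sketch.
template<class T>
std::vector<std::vector<T>> ParseCsvStd(std::istream& is)
{
    std::vector<std::vector<T>> result;
    std::string line;
    while (std::getline(is, line))                 // one CSV row per line
    {
        std::vector<T> row;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ','))   // split the row on commas
        {
            std::istringstream converter(field);
            T value;
            if (converter >> value)                // operator>> skips leading whitespace
                row.push_back(value);
        }
        if (!row.empty())                          // ignore empty lines
            result.push_back(row);
    }
    return result;
}
```

It is called the same way as the Boost version, for example ParseCsvStd<double>(file) with a named ifstream. The price is a string stream per field, so I would expect it to be slower on large files.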

  1. H.
    July 23, 2012 at 4:30 pm

    Very nice and useful code, thank you.

    However, g++ 4.6.3 on Ubuntu 12.04 with -Wall will come up with the following:

    warning: ‘auto’ will change meaning in C++0x; please remove it

    I changed the line

    for(auto token = tokens.begin(); token != tokens.end(); ++token)

    to

    for(Tokenizer::iterator token = tokens.begin(); token != tokens.end(); ++token)

    to remove it.
