Reading CSV files in C++ (updated)

Reading and parsing CSV files in C++ is not as straightforward as one might think. Many different solutions exist, and I collect a few here for easy access.

This version only depends on the standard library:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Reads comma-separated rows from myfile and appends one vector<T> per row to data.
template<class T>
std::istream& ReadCsv(std::istream& myfile, std::vector<std::vector<T>>& data)
{
    using namespace std;
    string row;
    while(getline(myfile, row))
    {
        data.push_back(vector<T>());
        istringstream tokenS(row);
        string token;
        while(getline(tokenS, token, ','))
        {
            // parse each field using the locale of the source stream
            istringstream valueS(token);
            valueS.imbue(myfile.getloc());
            T value;
            // fields that do not parse as T (including empty ones) are skipped
            if (valueS >> value)
                data.back().push_back(value);
        }
    }

    return myfile;
}
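
As a usage sketch, the function above can be exercised on in-memory text through a std::istringstream. The helper names ParseInts and ExampleUsage are hypothetical and not part of the original post; ReadCsv is repeated only so the block compiles on its own. The second row shows that an empty field is skipped rather than stored, because the extraction into T fails for it.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Same technique as ReadCsv above, repeated so the sketch is self-contained.
template<class T>
std::istream& ReadCsv(std::istream& in, std::vector<std::vector<T>>& data)
{
    std::string row;
    while (std::getline(in, row))
    {
        data.push_back(std::vector<T>());
        std::istringstream tokenS(row);
        std::string token;
        while (std::getline(tokenS, token, ','))
        {
            std::istringstream valueS(token);
            valueS.imbue(in.getloc());
            T value;
            if (valueS >> value)
                data.back().push_back(value);
        }
    }
    return in;
}

// Hypothetical helper: parses CSV text held in memory into rows of ints.
std::vector<std::vector<int>> ParseInts(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::vector<int>> data;
    ReadCsv(in, data);
    return data;
}

// Usage sketch: two rows are read; the empty field in row two is dropped.
void ExampleUsage()
{
    std::vector<std::vector<int>> data = ParseInts("1,2,3\n4,,6\n");
    assert(data.size() == 2);
    assert(data[0].size() == 3);
    assert(data[1].size() == 2);   // holds 4 and 6 only
}
```

If empty fields must keep their column position, the fields would have to be stored even when the extraction fails, for example as a default value.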

 

This one also uses only standard library components, but is written in stream-operator style:

 

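#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Reads one comma-separated record; every field must parse completely as T.
template<class T>
std::istream& operator >> (std::istream& ins, std::vector<T>& record)
{
    using namespace std;

    string field;
    while (getline(ins, field, ','))
    {
        T v;
        istringstream fieldS(field);
        fieldS >> v;
        // accept the field only if the whole token was consumed
        if (fieldS && fieldS.eof())
            record.push_back(v);
        else
            // clear(failbit) replaces the whole state and so drops eofbit;
            // a bad final field then still fails the caller's eof() check
            ins.clear(ios::failbit);
    }

    return ins;
}

// Reads all rows; sets failbit on ins if a row contains an invalid field.
template<class T>
std::istream& operator >> (std::istream& ins, std::vector<std::vector<T>>& data )
{
    using namespace std;

    string row;
    while(getline(ins, row))
    {
        data.push_back(vector<T>());
        istringstream rowS(row);
        rowS >> data.back();
        // a fully parsed row leaves rowS at end-of-file; anything else is an error
        if (!rowS.eof())
            ins.setstate(ios::failbit);
    }

    return ins;
}

void sampleusage()
{
    using namespace std;
    vector<vector<int>> data;
    ifstream infile("matrix.txt");
    infile >> data;
}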
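
The main difference from the first version is the strict field check (fieldS && fieldS.eof()): a field counts only if the whole token was consumed. A minimal sketch of that check in isolation follows; ParseFieldStrict is a hypothetical name introduced here for illustration, not something from the original code.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper isolating the strict check used above:
// returns true only if the entire field parses as a T.
template<class T>
bool ParseFieldStrict(const std::string& field, T& out)
{
    std::istringstream fieldS(field);
    T v;
    fieldS >> v;
    // The extraction must succeed AND must have reached the end of the token.
    if (fieldS && fieldS.eof())
    {
        out = v;
        return true;
    }
    return false;
}
```

With T = int, "42" is accepted, while "42x", "3.5" and the empty string are rejected. The first version of ReadCsv would instead accept 42 from "42x" and 3 from "3.5".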

 

Stream iterators do not help much:

 

#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

template<class T>
std::istream& ReadCsv(std::istream& is, std::vector<std::vector<T>>& data)
{
    using namespace std;

    // istream_iterator<char> skips whitespace (including '\n') unless skipws is cleared
    is.unsetf(ios_base::skipws);
    istream_iterator<char> it(is), eof;
    bool newline = true;
    while(it != eof)
    {
        stringstream temp;

        // copy characters up to the next delimiter
        while(it != eof && *it != ',' && *it != '\n')
            temp << *it++;

        // parse and store value
        if (newline)
            data.push_back(vector<T>());
        T value;
        if (temp >> value)
            data.back().push_back(value);

        // prepare for next iteration; never dereference or advance the end iterator
        newline = it != eof && *it == '\n';
        if (it != eof)
            ++it;
    }

    return is;
}
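
The call to unsetf(ios_base::skipws) matters because istream_iterator<char> reads through operator>>, which skips whitespace by default, so the newlines that separate rows would never be seen. A small sketch of that effect, using a hypothetical helper CollectChars that is not part of the original post:

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: returns every character that istream_iterator<char>
// yields from text, with or without whitespace skipping.
std::string CollectChars(const std::string& text, bool skipWhitespace)
{
    std::istringstream in(text);
    if (!skipWhitespace)
        in.unsetf(std::ios_base::skipws);   // keep spaces and newlines
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}
```

For the input "1,2\n3", the default setting yields "1,23" (the row break is lost), while clearing skipws yields the text unchanged.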

 

If Boost is available, this one adds a bit of robustness, since its escaped_list_separator handles quoted fields and escape sequences:

 

#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>

template<class T>
std::istream& ReadCsv(std::istream& myfile, std::vector<std::vector<T> >& data)
{
    using namespace std;
    using namespace boost;
    // splits on commas, but honours quoted fields and backslash escapes
    typedef tokenizer<escaped_list_separator<char> > Tokenizer;

    string row;
    while(getline(myfile, row))
    {
        Tokenizer tokens(row);
        data.push_back(vector<T>());
        // convert every token to T and append it to the current row
        transform(
            tokens.begin(), tokens.end(), back_inserter(data.back()),
            [&myfile](const string& t) -> T {
                istringstream valueS(t);
                valueS.imbue(myfile.getloc());
                T value = T();   // value-initialized, so a failed parse yields T()
                valueS >> value;
                return value;
            });
    }

    return myfile;
}
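
To illustrate what the extra robustness buys, here is a simplified standard-library sketch of quote-aware splitting. It is not Boost's implementation and does not reproduce escaped_list_separator's escape handling; SplitQuotedRow is a hypothetical name used only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): splits one CSV row, treating commas
// inside double quotes as ordinary characters. Escape sequences and doubled
// quotes are deliberately not handled in this simplified sketch.
std::vector<std::string> SplitQuotedRow(const std::string& row)
{
    std::vector<std::string> fields(1);   // start with one empty field
    bool inQuotes = false;
    for (char c : row)
    {
        if (c == '"')
            inQuotes = !inQuotes;               // toggle quoting, drop the quote itself
        else if (c == ',' && !inQuotes)
            fields.push_back(std::string());    // unquoted comma starts the next field
        else
            fields.back() += c;
    }
    return fields;
}
```

A row such as a,"b,c",d splits into three fields (a, b,c and d), whereas a plain getline on ',' would produce four. Empty fields are kept as empty strings, so 1,,3 yields three fields.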

 

Just for comparison, here is a snippet in F#:

open System.IO

[|
    use sr = new StreamReader("matrix.txt")
    while sr.EndOfStream |> not do
        yield sr.ReadLine().Split(',') |> Array.map float
|]

  1. Mat Hunt
    April 13, 2016 at 2:48 pm

    There are no comments in the code to explain what is going on so they are all unusable

    • April 13, 2016 at 3:02 pm

      Sorry you feel that way. I am sure you can find some other CSV parser that meets your expectations.

  2. Luke
    January 4, 2017 at 12:31 am

    I’m having trouble modifying the code so that a blank section of the CSV is still stored as an empty string. For example, given the line “data0,data1,,,data5,,data7”, I’d like the third string in the vector to be an empty string.

    • Luke
      January 4, 2017 at 12:58 am

      OK, I think I figured it out. The condition if (valueS >> value) was failing, so I just added an else if (value == ""). It’s not elegant, but it fixes my issue. Apparently I just needed to ask to figure it out.

      • January 4, 2017 at 6:53 am

        Thanks for your improvement, Luke. Just expressing the problem for someone else is a magical thing, isn’t it? 😀
