Strings In C++

References

https://cal-linux.com/tutorials/strings.html
https://www.oreilly.com/library/view/c-cookbook/0596007612/ch04s08.html
https://stackoverflow.com/questions/14265581/parse-split-a-string-in-c-using-string-delimiter-standard-c
https://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c
https://tristanbrindle.com/posts/a-quicker-study-on-tokenising/
https://www.bfilipek.com/2018/07/string-view-perf.html

Anonymous Streams To Strings

// https://stackoverflow.com/questions/19665458/use-an-anonymous-stringstream-to-construct-a-string
// http://www.velocityreviews.com/forums/t543728-how-do-you-create-and-use-an-ostringstream-in-an-initialisation-list.html
#define MAKE_STRING(stream) (static_cast<std::ostringstream&>(std::ostringstream() << std::dec << stream)).str()

Parsing Strings

Use A StringStream

Very basic string parsing can be done with a std::stringstream and std::getline(), but this approach is not very flexible. Look at the example below:

// See https://ideone.com/thoHPV
#include <iostream>
#include <sstream>
#include <string>

int main(int argc, char *argv[])
{
    std::string parseMe = "This is     a test string";
    std::stringstream parser(parseMe);
    std::string token;

    while (std::getline(parser, token, ' '))
    {
        std::cout << token << "\n";
    }
    
    return 0;
}

/* Outputs:
This
is




a
test
string
*/

The string is split on spaces, but not as we might expect. std::getline() treats every single space as a separator, so the run of 5 spaces between "is" and "a" yields 4 empty tokens rather than being collapsed into one separator.

We also cannot pass a list of token separators, because the delimiter parameter of std::getline() is a single character. We can't, for example, use std::getline(parser, token, ' ,') to split tokens on both spaces and commas.
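
If only whitespace matters, one way to avoid the empty tokens is to read with operator>> instead of std::getline(), since formatted extraction skips runs of whitespace. The sketch below assumes that is acceptable; splitOnWhitespace() is a hypothetical name, and this still does not allow a custom separator list.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text on runs of whitespace.
// operator>> skips leading whitespace before each token, so
// consecutive spaces never produce empty tokens.
std::vector<std::string> splitOnWhitespace(const std::string &text)
{
    std::istringstream parser(text);
    std::vector<std::string> tokens;
    std::string token;

    while (parser >> token)
    {
        tokens.push_back(token);
    }

    return tokens;
}
```

Called on "This is     a test string", this returns the 5 tokens "This", "is", "a", "test" and "string".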

String Find Functions

If we look up the std::string docs, under "search" we can find the following functions that might be of relevance:

  • find(): Finds the first occurrence of a substring, searching from a given index in the string (0 by default).
  • find_first_of(): Finds the first character in the target string that matches any character in a given set of characters.
  • find_first_not_of(): Finds the first character in the target string that matches none of the characters in a given set of characters.

They all return the index of the first match in the string, or std::string::npos if nothing is found.

find()

Look at the following example:

// See https://ideone.com/aY8liy
#include <iostream>
#include <string>

int main(int argc, char *argv[])
{
    const std::string line = "I am a little test string...";

    std::cout << line.find("I am") << "\n";
    std::cout << (line.find("I am", 1) == std::string::npos ? "not found" : "found") << "\n";
    std::cout << line.find("a") << "\n";
    std::cout << line.find("little") << "\n";

    return 0;
}
/* Outputs:
0
not found
2
7
*/

Note how the index of the first occurrence is returned. You can also use rfind() to search backwards from the end of the string, but you still get only one match: the last occurrence.
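
To collect every match rather than just one, a common approach is to call find() repeatedly, restarting just past the previous hit. The following is a minimal sketch; findAll() is a hypothetical name, and it treats an empty search string as having no matches.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return the index of every occurrence of needle in text.
std::vector<std::string::size_type> findAll(const std::string &text,
                                            const std::string &needle)
{
    std::vector<std::string::size_type> positions;
    if (needle.empty())
    {
        return positions; // assumption: an empty needle matches nothing
    }

    std::string::size_type pos = text.find(needle);
    while (pos != std::string::npos)
    {
        positions.push_back(pos);
        // Restart one character later, so overlapping matches are also found.
        pos = text.find(needle, pos + 1);
    }

    return positions;
}
```

For the line "I am a little test string..." and the needle "a", this returns the indices 2 and 5. The second one is also what rfind("a") reports.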

find_first_of()

Look at the following example:

// See https://ideone.com/wWHV7b
#include <iostream>
#include <string>

int main(int argc, char *argv[])
{
    const std::string line("I am a test, string");
    const std::string::size_type pos = line.find_first_of(",");
    std::cout << pos << " is " << line[pos] << "\n";
    return 0;
}
/* Outputs:
11 is ,
*/
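
find_first_of() and find_first_not_of() can be combined into a small tokenizer that accepts a list of separators, which std::getline() cannot do. The sketch below is one possible approach, not a definitive implementation; tokenize() and its parameter names are hypothetical. Runs of separators are skipped, so no empty tokens are produced.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split text on any character found in delimiters.
std::vector<std::string> tokenize(const std::string &text,
                                  const std::string &delimiters)
{
    std::vector<std::string> tokens;

    // Skip any leading separators to find the start of the first token.
    std::string::size_type start = text.find_first_not_of(delimiters);
    while (start != std::string::npos)
    {
        // The token ends at the next separator, or at the end of the string.
        const std::string::size_type end = text.find_first_of(delimiters, start);
        const std::string::size_type count =
            (end == std::string::npos) ? std::string::npos : end - start;
        tokens.push_back(text.substr(start, count));

        // Skip the run of separators that follows; yields npos when none is left.
        start = text.find_first_not_of(delimiters, end);
    }

    return tokens;
}
```

Calling tokenize("I am a test, string", " ,") returns "I", "am", "a", "test" and "string".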

Write Your Own Class

https://stackoverflow.com/questions/14265581/parse-split-a-string-in-c-using-string-delimiter-standard-c

Use Boost::Tokenizer

https://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c

Avoid Copying - StringViews (C++17)
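
As an illustration of the idea, here is a minimal sketch of a tokenizer that returns std::string_view objects instead of std::string copies. It requires C++17; tokenizeViews() is a hypothetical name. Each view points into the original buffer, so the source string must outlive the returned tokens.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: split text on any character found in delimiters,
// returning views into the original buffer rather than copies.
std::vector<std::string_view> tokenizeViews(std::string_view text,
                                            std::string_view delimiters)
{
    std::vector<std::string_view> tokens;

    std::size_t start = text.find_first_not_of(delimiters);
    while (start != std::string_view::npos)
    {
        const std::size_t end = text.find_first_of(delimiters, start);
        const std::size_t count =
            (end == std::string_view::npos) ? std::string_view::npos : end - start;

        // substr() on a string_view creates another view; no characters are copied.
        tokens.push_back(text.substr(start, count));

        start = text.find_first_not_of(delimiters, end);
    }

    return tokens;
}
```

Calling tokenizeViews("I am a test, string", " ,") yields 5 views, and the fourth one ("test") starts at index 7 of the original character data.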