Spyke

My Last Resort - I Need Help Debugging my Program

I can't figure out what is going on in my program. I'm having a hard time wrapping my mind around what steps it's taking to get to the wrong result.

I have this function which should make my parser ignore C-style comments, i.e. /* */. When I type /* * */ or any number of * in the comment my program is detecting the * characters inside the comment itself, and when it detects a * it also detects the last /:

	// Handle C-style comments
	private void HandleComment() {
		// Consume until closing characters, multi-line comments are supported
		while (Peek() != '*' && PeekAhead() != '/' && !IsAtEnd()) {
			if (Peek() == '\n') line++;
			Advance();
		}

		Advance();
		Advance();
	}

I figured I would need to Advance() twice in order to consume both the characters that denote the end of the comment.

Peek(), PeekAhead(), Advance() and IsAtEnd() are functions from the book Crafting Interpreters. Here they are:

	// Peek at the next char
	private char Peek() {
		if (IsAtEnd()) return '\0';
		return source[current];
	}

	// Peek after next char
	private char PeekAhead() {
		if (current + 1 >= source.Length) return '\0';
		return source[current+1];
	}

	private char Advance() {
		return source[current++];
	}

	private bool IsAtEnd() {
		return current >= source.Length;
	}

The source variable is just a string that contains the text contents of a file or the input from Console.ReadLine().

I based the comment logic on the string logic here:

	// Handle string lexemes
	private void HandleString() {
		// Consume until closing quote, multi-line strings are supported
		while (Peek() != '"' && !IsAtEnd()) {
			if (Peek() == '\n') line++;
			Advance();
		}

		// Handle unterminated strings
		if (IsAtEnd()) {
			DotLox.Error(line, "Unterminated string.");
			return;
		}

		// Consume
		Advance();

		// Tokenize string and store value without quotes
		string value = source[(start+1)..(current-1)];
		AddToken(TokenType.STRING, value);
	}

I almost forgot to share this crucial bit of logic here:

	private void ScanToken() {
		char c = Advance();

		// Match characters to tokens
		switch(c) {

			// Division and comments
			case '/':
				if (Match('/')) {
					// Consume comment but don't turn it into a token
					while (Peek() != '\n' && !IsAtEnd()) Advance();
				} else if (Match('*')) {
					// Handle C-style comments
					HandleComment();
				} else {
					// Turn lone slash into a token
					AddToken(TokenType.SLASH);
				}
				break;
		}
	}

Comments

fruitcantfly

programming.dev

The problem is fairly simple:

		while (Peek() != '*' && PeekAhead() != '/' && !IsAtEnd()) {

You are breaking the loop if either the current character is a * or if the next character is a /. In other words, you don't require the full */ token.

Instead, you'll want something like

		while ((Peek() != '*' || PeekAhead() != '/') && !IsAtEnd()) {
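To check the corrected condition in isolation, here is a minimal sketch written as a C# top-level program. `SkipBlockComment` is a hypothetical helper, not part of the original scanner: it takes the source and the index just past the opening `/*` and returns the index just past the closing `*/`. The `-1` return for an unterminated comment is an assumption modeled on the `IsAtEnd()` check in `HandleString()`.

```csharp
using System;

// Hypothetical helper: 'current' is the index just past "/*".
// Returns the index just past "*/", or -1 if the comment never closes.
static int SkipBlockComment(string source, int current)
{
    bool IsAtEnd() => current >= source.Length;
    char Peek() => IsAtEnd() ? '\0' : source[current];
    char PeekAhead() => current + 1 >= source.Length ? '\0' : source[current + 1];

    // Keep scanning unless BOTH closing characters are present.
    while ((Peek() != '*' || PeekAhead() != '/') && !IsAtEnd())
    {
        current++;
    }

    // Assumption: report an unterminated comment instead of reading past the end.
    if (IsAtEnd()) return -1;

    return current + 2; // consume '*' and '/'
}

Console.WriteLine(SkipBlockComment("/* * */ x", 2));       // 7
Console.WriteLine(SkipBlockComment("/* never closed", 2)); // -1
```

Without a guard like this, the two trailing `Advance()` calls in `HandleComment()` would index past the end of `source` when a comment is never closed.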

dr_robotBones reply

reddthat.com

That is so weird, I thought everything in the while statement would have to be true at the same time.

fruitcantfly reply

programming.dev

That is the case. But if you are at a *, then the original while statement reads while (false && …) { and immediately breaks. The remaining checks are not evaluated, since they cannot affect the outcome when the first check fails.
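The short-circuit behavior can be seen in a small sketch. The names `OriginalKeepsGoing`, `FixedKeepsGoing` and `SecondCheck` are made up for illustration; they only mirror the two loop conditions discussed above.

```csharp
using System;

// Original condition: keeps looping only while NEITHER character matches.
static bool OriginalKeepsGoing(char next, char afterNext) =>
    next != '*' && afterNext != '/';

// Corrected condition: keeps looping unless BOTH characters match.
static bool FixedKeepsGoing(char next, char afterNext) =>
    next != '*' || afterNext != '/';

// A lone '*' inside the comment, followed by a space.
Console.WriteLine(OriginalKeepsGoing('*', ' ')); // False: the loop stops too early
Console.WriteLine(FixedKeepsGoing('*', ' '));    // True: the loop keeps scanning
// The real terminator "*/".
Console.WriteLine(FixedKeepsGoing('*', '/'));    // False: the loop stops here

// Short-circuit: the right-hand side of && never runs when the left is false.
int calls = 0;
bool SecondCheck() { calls++; return true; }

char nextChar = '*';
bool keepGoing = nextChar != '*' && SecondCheck();
Console.WriteLine($"{keepGoing} {calls}"); // False 0
```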

dr_robotBones reply

reddthat.com

Oh right, forgot && did that. Thank you, it looks like your solution worked!

xianjam reply

programming.dev

This would also work:

while (!(Peek() == '*' && PeekAhead() == '/') && !IsAtEnd())

gjoel

programming.dev

	private char Advance() {
		return source[current++];
	}

This is wrong, at least as I would expect Advance to behave. While you do advance, you are returning the old value, not the new one. ++current would return the new value.

Also, it would help if you said what the wrong behaviour was.
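For reference, the difference between the two increment forms can be checked on its own. This is a standalone sketch with a made-up two-character `source`, not the scanner's actual state.

```csharp
using System;

string source = "ab";

int current = 0;
char post = source[current++]; // reads index 0, then current becomes 1
Console.WriteLine($"{post} {current}"); // a 1

current = 0;
char pre = source[++current];  // current becomes 1 first, then reads index 1
Console.WriteLine($"{pre} {current}");  // b 1
```

If `current` starts at 0 and points at the next unread character, as `Peek()` returning `source[current]` suggests, the pre-increment form would never return the first character. The post-increment form is consistent with `char c = Advance();` at the top of `ScanToken()`.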

dr_robotBones reply

reddthat.com

Thank you, I'll see if that change improves the logic. The wrong behaviour is right here:

"When I type /* * */ or any number of * in the comment my program is detecting the * characters inside the comment itself, and when it detects a * it also detects the last /"

gjoel reply

programming.dev

while (Peek() != '*' && PeekAhead() != '/' && !IsAtEnd()) {

Probably this guy.

If you encounter an asterisk or a slash, then it stops. You want to stop when you encounter an asterisk and a slash, so:

while (!(Peek() == '*' && PeekAhead() == '/') && !IsAtEnd()) {

I also find your detection of when to remove the comment a bit scary, but that might be because you haven't shown the Match method.

dr_robotBones reply

reddthat.com

Oh yes, I forgot about that one. Match() compares source[current] to the parameter and consumes it if they are equal. So when the program encounters /*, the / is consumed by Advance() in ScanToken(), and the * is consumed by Match('*') before going into the HandleComment() logic.
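Since `Match()` is not shown in the thread, the following is only an assumed shape reconstructed from the description above: compare `source[current]` to the expected character and consume it on a match. The sample `source` string is made up.

```csharp
using System;

string source = "/* hi */";
int current = 0;

bool IsAtEnd() => current >= source.Length;
char Advance() => source[current++];

// Assumed implementation of Match(), based only on the description in the thread.
bool Match(char expected)
{
    if (IsAtEnd()) return false;
    if (source[current] != expected) return false;
    current++; // consume the character only when it matches
    return true;
}

char c = Advance();             // consumes '/'
bool lineComment = Match('/');  // next char is '*', so nothing is consumed
bool blockComment = Match('*'); // consumes '*'
Console.WriteLine($"{c} {lineComment} {blockComment} {current}"); // / False True 2
```

With this shape, `current` already points at the first character after `/*` when `HandleComment()` starts.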

fruitcantfly reply

programming.dev

It's not strictly speaking a bug, but the behavior of Advance is a bit confusing. Based on how you use it in the code snippets, I would suggest changing it to

private void Advance() {
    current++;
}

So that it only does what it says, and then changing the place where you read the next char to

char c = Peek();
Advance();

Alternatively, you could rename it to something like ReadChar, which more closely matches its current behavior, since a read is expected to return the data (character) at the current position and then advance the position.

dr_robotBones reply

reddthat.com

It's designed to consume and increment. The naming is a bit weird, but it's what the author used in the book I was following. I could also rename it to Consume().