movlw 4294967295 compiler bug
Moderators: David Barker, Jerry Messina
* I've changed the title of this thread since this has turned out to be an actual compiler problem
I've been hesitant to post this because up until now, I've just assumed I was doing something wrong in my code. But let me ask the question anyway.
Does anyone else get quirky results when compiling large projects?
Swordfish seems rock solid on most things I do. However, three times now, I've worked on large projects that seem to produce buggy code. Everything works fine until I pass some imaginary code size barrier, around 85% code space and RAM usage. (I mostly use the 18F4620 PIC, so 55KB is a big project to me at least.)
At first I thought it was a bug in the ENC28J60 library since my first two large / buggy projects used it. However, the third project didn't use that library and I still get the "weird" behavior after my project got large.
Let me describe the behavior: Everything always compiles great, with no build errors. But I'll go from perfectly executing code to craziness by simply changing DelayMS(10) to two consecutive DelayMS(5) lines, or by changing the text in a UART write statement from "Hello" to "HELLO".
The result is code that either won't run on the PIC or behavior that is completely unpredictable. Sometimes my debug statements will reply with the completely wrong info or have missing text.
I can do certain things to "fix" the code, like rearranging variable declarations or randomly inserting Nop() here and there, although that is far from ideal.
I know I can cause the code to be buggy by doing one of the following:
1) creating a function that doesn't return anything
2) overrunning a string
3) overrunning an array
4) calling a function from within an interrupt routine
I've triple-checked my code and I know these aren't the problem.
Are there any other ways I can create problems that the compiler won't detect?
Last edited by JWinters on Sat Sep 12, 2009 1:21 am, edited 2 times in total.
-
- Swordfish Developer
- Posts: 1473
- Joined: Fri Jan 30, 2009 6:27 pm
- Location: US
Jason,
You're not alone...I've seen similar behavior in some of the stuff I've done.
In addition to what you've listed, some things to watch out for:
- bits (and booleans) can be problematic, esp. if they're not statically declared (such as auto vars inside a sub/function).
I've had the compiler get confused, and start misallocating them, which results in corrupting other vars. Definitely, don't
mix them with other data types in an expression.
- sometimes "complex" expressions result in incorrect code, and have to be broken up into simpler statements.
I see this mostly with using the result of function calls and the creation of temp variables. For example
(completely oversimplified example follows),
Code: Select all
a = b + funct(c)
might have to become
Code: Select all
a = funct(c)
a = a + b
If rearranging declarations or adding some nops seems to 'fix' things, then I'd compare the asm files before and after.
Usually, you'll find something moving around in memory, and if you look at what's referencing it,
that can help narrow down where the problem lies... sometimes it's not exactly where you think.
- Jerry
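The before/after asm comparison suggested above can be scripted on the development PC. The following is only a minimal sketch using Python's standard difflib module; the helper name diff_asm and the file labels before.asm / after.asm are made up for illustration, and the sample lines are taken from the listings posted later in this thread.

```python
import difflib

def diff_asm(before_text, after_text, context=2):
    """Return unified-diff lines between two assembly listings."""
    before = before_text.splitlines()
    after = after_text.splitlines()
    # lineterm="" keeps the header lines free of trailing newlines
    return list(difflib.unified_diff(
        before, after,
        fromfile="before.asm", tofile="after.asm",  # hypothetical labels
        n=context, lineterm=""))

# Sample lines copied from the listings later in this thread
before_listing = """WHILE_71
    MOVF F27_U08,1,0
    BZ FALSE_100"""

after_listing = """WHILE_71
    MOVLW 4294967295
    SUBWF F27_U08,0,0
    BZ FALSE_100"""

changes = diff_asm(before_listing, after_listing)
for line in changes:
    print(line)
```

In practice you would read the two generated .asm files from disk instead of using inline strings. Lines starting with "-" exist only in the old build and lines starting with "+" only in the new one.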
Ok, at least I know I'm not going crazy now.
I have noticed the strange behavior when declaring bits. I avoid them at all costs now. I should look through the TCP/IP stack and make sure there are none in it as well. I've been hesitant to change anything in that library since it was ported by David himself.
I'll look for any places where the functions are complex and break them up.
Is there some way to turn off some of the compiler optimizations? Although, depending on how much optimizing it does, my code might not even fit on the PIC.
Another problem area for me seems to be large Select Case blocks. Is there a limit to how large that block can get?
Finally, some evidence!
I was cleaning up some code today (mostly cosmetic changes) and suddenly my code started behaving badly again. This time, however, I dug a little deeper into the asm, and I cannot figure out what is going on.
First the background: I'm using the Ethernet stack on an 18F4620, as I mentioned before. There are a few places in the stack with Select Case blocks. I'm not sure why, but some of the "Case" statements include a colon, like this: Case xyz: . Since the colon isn't documented in the help files, I thought I would go through and remove them all. I should mention that it worked fine with the colons in there, but I wanted to make the code standard.
I've gotten into the habit of recompiling after every minor change because of the bug mentioned above. The program size had been staying the same as I deleted colons from the Case statements. However, in one particular instance, I recompiled and the program size increased. I tried the code on my hardware and of course it failed.
I made the change to the TCP.bas module as follows:
Code: Select all
Public Sub TCPDisconnect(hTCP As TCP_SOCKET)
   TCBStubs(hTCP).rxTail = TCBStubs(hTCP).rxHead
   Select TCBStubs(hTCP).smState
      Case TCP_FIN_WAIT_1, TCP_FIN_WAIT_2, TCP_LAST_ACK
         SendTCP(hTCP, RST Or ACK, false)
         CloseSocket(hTCP)
      Case TCP_SYN_SENT
         CloseSocket(hTCP)
      Case TCP_SYN_RECEIVED, TCP_ESTABLISHED
         SendTCP(hTCP, FIN Or ACK, true)
         TCBStubs(hTCP).smState = TCP_FIN_WAIT_1
      Case TCP_LOOPBACK: // <----- THIS IS THE COLON I'M DELETING
         TCBStubs(hTCP).smState = TCP_LOOPBACK_CLOSED
   End Select
End Sub
With the colon, my code size is 53997. After deleting it, the size changes to 53999. I went through the asm files and found where the extra bytes came from. For some reason, the assembly changed in my DHCP module!
Here is the "before" (again, this is the DHCP module, not the TCP module where I made the change).
Code: Select all
?I000941_F020_000661_P000300 ; L#MK UDPGET(J) // GET OPTION LEN
    CLRF F18_U16H,0
    MOVLW 58
    MOVWF F18_U16,0
    CALL PROC_UDPGET_0
?I000942_F020_000662_P000300 ; L#MK WHILE J <> 0 // IGNORE OPTION VALUES
WHILE_71
    MOVF F27_U08,1,0 <----- KEEP YOUR EYE ON THIS LINE
    BZ FALSE_100
?I000943_F020_000663_P000300 ; L#MK UDPGET(V)
    CLRF F18_U16H,0
    MOVLW 56
    MOVWF F18_U16,0
    CALL PROC_UDPGET_0
and here's what I get after the change is made
Code: Select all
?I000941_F020_000661_P000300 ; L#MK UDPGET(J) // GET OPTION LEN
    CLRF F18_U16H,0
    MOVLW 58
    MOVWF F18_U16,0
    CALL PROC_UDPGET_0
?I000942_F020_000662_P000300 ; L#MK WHILE J <> 0 // IGNORE OPTION VALUES
WHILE_71
    MOVLW 4294967295 <-------- WHAT THE....???????
    SUBWF F27_U08,0,0
    BZ FALSE_100
?I000943_F020_000663_P000300 ; L#MK UDPGET(V)
    CLRF F18_U16H,0
    MOVLW 56
    MOVWF F18_U16,0
    CALL PROC_UDPGET_0
Now I don't know much about assembly code, but why did MOVF F27_U08,1,0 turn into MOVLW 4294967295?
The source code for the offending lines is:
Code: Select all
...
Else // select case
   UDPGet(j) // Get option len
   While j <> 0 // Ignore option values
      UDPGet(v)
      Dec(j)
   Wend
End Select
Until lbDone
...
If I change it to the following, the compiler gets it correct again.
Code: Select all
...
Else // select case
   UDPGet(j) // Get option len
   While j <> 0 // Ignore option values
      UDPGet(v)
      Dec(j)
      ASM
         Nop
      End ASM
   Wend
End Select
Until lbDone
...
However, this is hardly a workaround because with the next change I make, the error jumps somewhere else!
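For readers wondering why the operand in MOVLW 4294967295 looks so odd: MOVLW on the PIC18 takes an 8-bit literal, so only 0..255 is a valid operand. The short Python check below shows what that number is numerically. It is an observation about the value, not a diagnosis of the compiler. The last step, truncating to the low 8 bits, is purely an assumption about what an assembler might do with such an operand.

```python
LITERAL = 4294967295  # operand seen in the "after" listing

# 4294967295 is 2**32 - 1, i.e. all 32 bits set.
all_bits_set = (LITERAL == 2**32 - 1 == 0xFFFFFFFF)

# The same bit pattern is what -1 looks like as an unsigned 32-bit value.
minus_one_as_u32 = -1 & 0xFFFFFFFF

# MOVLW takes an 8-bit literal, so valid operands are 0..255.
fits_in_movlw = 0 <= LITERAL <= 0xFF

# ASSUMPTION: if an assembler silently kept only the low 8 bits,
# the loop test would compare against 255 instead of 0.
truncated = LITERAL & 0xFF

print(all_bits_set, minus_one_as_u32, fits_in_movlw, truncated)
# prints: True 4294967295 False 255
```

Under that truncation assumption, SUBWF followed by BZ would exit the While loop when j equals 255 rather than when j equals 0. That would be consistent with the erratic behavior described above, but it is not confirmed anywhere in this thread.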
Sounds like the same kind of bug as the one I reported earlier:
http://www.sfcompiler.co.uk/forum/viewtopic.php?t=1086
The PIC was also pretty full when I started to have problems.
Hmmm..
I played around with it a little more and found that it appears to be related to conditional statements.
A MOVLW 4294967295 always occurs right after an If X <> 0 Then or a While X <> 0.
If you really think you have the same issue, compile, press F2 to open the Assembly View, and search for "MOVLW 4294967295".
I've been chasing this bug for over a year now. I hope this is fixable.
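The manual search for "MOVLW 4294967295" can be generalized to flag any MOVLW whose operand does not fit in 8 bits. This is a rough sketch only: find_bad_movlw is a made-up helper name, and it assumes the literals appear as plain decimal numbers, as they do in the listings in this thread. Hex or symbolic operands are simply skipped.

```python
import re

# Matches "MOVLW <decimal literal>" as it appears in the listings in this thread.
MOVLW_RE = re.compile(r"^\s*MOVLW\s+(\d+)\b", re.IGNORECASE)

def find_bad_movlw(asm_text):
    """Return (line_number, literal) for each MOVLW whose decimal operand exceeds 255."""
    hits = []
    for number, line in enumerate(asm_text.splitlines(), start=1):
        match = MOVLW_RE.match(line)
        if match and int(match.group(1)) > 0xFF:
            hits.append((number, int(match.group(1))))
    return hits

# Sample lines copied from the "after" listing in this thread
sample = """WHILE_71
 MOVLW 4294967295
 SUBWF F27_U08,0,0
 BZ FALSE_100
 CLRF F18_U16H,0
 MOVLW 56
 MOVWF F18_U16,0"""

print(find_bad_movlw(sample))  # [(2, 4294967295)]
```

Running something like this on the generated .asm after every build would catch the bad literal without having to flash the PIC first.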
interesting
are the versions of the compiler the same ?
can you do a diff on the 2 different libraries ?
Hmmm..
All the Library, UserLibrary and Include directories are identical. Both compilers were up to date.
It's not like my dev machine was buggy or full of viruses. It's a very stable computer. As a matter of fact, I rarely ever needed to restart it. It has a 64-bit CPU, but I have 32-bit Windows XP on it. I can't imagine that makes a difference though. Very strange indeed.
Maybe a .NET framework issue? I use the same machine for Visual Basic work and have a lot of "developer pack" type things installed.
I did notice that there was one instance of SwordfishICC.exe (2136 KB) located in C:/ but a different one in C:\Program Files\Mecanique\Swordfish\Bin, which is 2013 KB. However, it didn't matter if I switched them or even copied them from another machine.
It was almost as if something external to Swordfish (like a random Windows DLL or .NET framework library) was causing the problem. I suspect this was also the cause of the IDE errors I mentioned here (http://www.sfcompiler.co.uk/forum/viewtopic.php?t=844).
I've reinstalled XP and everything seems good now.