movlw 4294967295 compiler bug

JWinters
Posts: 106
Joined: Mon Feb 04, 2008 4:56 pm
Location: North Carolina, USA

Post by JWinters » Wed Sep 02, 2009 11:14 pm

* I've changed the title of this thread since this has turned out to be an actual compiler problem

I've been hesitant to post this because up until now, I've just assumed I was doing something wrong in my code. But let me ask the question anyway.

Does anyone else get quirky results when compiling large projects?

Swordfish seems rock solid on most things I do. However, three times now, I've worked on large projects that seem to produce buggy code. Everything works fine until I pass some imaginary code size barrier, around 85% code space and RAM usage. (I mostly use the 18F4620 PIC, so 55KB is a big project to me at least.)

At first I thought it was a bug in the ENC28J60 library since my first two large / buggy projects used it. However, the third project didn't use that library and I still get the "weird" behavior after my project got large.

Let me describe the behavior: Everything always compiles great, no build errors. But I'll go from perfectly executing code to craziness by simply changing DelayMS(10) to two consecutive DelayMS(5) lines, or maybe by changing the text in a UART write statement from "Hello" to "HELLO".

The result is code that either won't run on the PIC or behavior that is completely unpredictable. Sometimes my debug statements will reply with the completely wrong info or have missing text.

I can do certain things to "fix" the code, like rearranging variable declarations or randomly inserting Nop() here and there, although that is far from ideal.

I know I can cause the code to be buggy by one of the following:

1) creating a function that doesn't return anything
2) overrunning a string
3) overrunning an array
4) calling a function from within an interrupt routine

I've triple-checked my code and I know these aren't the problem.
Are there any other ways I can create problems that the compiler won't detect?

Last edited by JWinters on Sat Sep 12, 2009 1:21 am, edited 2 times in total.

Jerry Messina
Swordfish Developer
Posts: 1473
Joined: Fri Jan 30, 2009 6:27 pm
Location: US

Post by Jerry Messina » Thu Sep 03, 2009 11:01 am

Jason,

You're not alone...I've seen similar behavior in some of the stuff I've done.

In addition to what you've listed, some things to watch out for:

- bits (and booleans) can be problematic, esp. if they're not statically declared (such as auto vars inside a sub/function).
I've had the compiler get confused, and start misallocating them, which results in corrupting other vars. Definitely, don't
mix them with other data types in an expression.

- sometimes "complex" expressions result in incorrect code, and have to be broken up into simpler statements.
I see this mostly with using the result of function calls and the creation of temp variables. For example
(completely oversimplified example follows),

Code: Select all

       a = b + funct(c)

might have to become

Code: Select all

       a = funct(c)
       a = a + b

If rearranging declarations or adding some nops seems to 'fix' things, then I'd compare the asm files before and after.
Usually, you'll find something moving around in memory, and if you look at who's referencing them,
that can help narrow down where the problem lies... sometimes it's not exactly where you think.

- Jerry
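
The before/after asm comparison suggested above can be done with any diff tool. As a minimal sketch, assuming the two listings have been saved from separate builds, a short host-side Python script might look like this. It is not part of Swordfish, and the function names and the `before.asm` / `after.asm` labels are made up for illustration.

```python
import difflib


def diff_listings(before_lines, after_lines, context=2):
    """Return a unified diff of two assembly listings as a list of lines."""
    return list(difflib.unified_diff(
        before_lines, after_lines,
        fromfile="before.asm", tofile="after.asm",  # labels only, hypothetical names
        n=context, lineterm=""))


def changed_instructions(before_lines, after_lines):
    """Return only the removed ('-') and added ('+') lines of the diff."""
    diff = diff_listings(before_lines, after_lines, context=0)
    body = diff[2:]  # skip the two '---' / '+++' file header lines
    return [line for line in body if line.startswith(("+", "-"))]
```

To use it, read each listing with something like `open(path).read().splitlines()` and print the result of `diff_listings`. Any hunk that shows up in a module you did not touch is a good place to start looking.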

JWinters

Post by JWinters » Thu Sep 03, 2009 3:10 pm

Ok, at least I know I'm not going crazy now.

I have noticed the strange behavior when declaring bits. I avoid them at all costs now. I should look through the TCP/IP stack and make sure there are none in it as well. I've been hesitant to change anything in that library since it was ported by David himself.

I'll look for any places where the functions are complex and break them up.

Is there some way to turn off some of the compiler optimizations? Although, depending on how much optimizing it does, my code might not even fit on the PIC.

Another problem area for me seems to be large Select Case blocks. Is there a limit to how large that block can get?

Jerry Messina

Post by Jerry Messina » Thu Sep 03, 2009 4:48 pm

I've seen the select case issue as well, and have had to resort to rewriting it as a series of if... elseif's.

I haven't figured out what sets this one off...I've had some simple ones screw up, and some pretty large ones that seem to work fine.

JWinters

Post by JWinters » Thu Sep 03, 2009 5:02 pm

"have had to resort to rewriting it as a series of if... elseif's"

Exactly! I've been doing the same thing. However, I've never been able to figure out if the Select Case issue is the symptom or the cause.

JWinters

Post by JWinters » Thu Sep 10, 2009 11:13 pm

Finally, some evidence! :shock:

I was cleaning up some code today (mostly cosmetic changes) and suddenly my code started behaving badly again. However this time, I dug a little deeper into the asm and I cannot figure out what is going on.

First the background: I'm using the ethernet stack on an 18F4620 as I mentioned before. There are a few places in the stack with Select Case blocks. I'm not sure why, but some of the "case" statements include a colon, like this: Case xyz: Since the colon isn't documented in the help files, I thought I would go through and remove them all. I should mention that it worked fine with the colons in there, but I wanted to make the code standard.

I've gotten into the habit of recompiling after every minor change because of the bug mentioned above. The program size had been staying the same as I deleted colons from the Case keywords. However in one particular instance, I recompiled and the program size increased. I tried the code on my hardware and of course it failed.

I made the change to the TCP.bas module as follows:

Code: Select all

Public Sub TCPDisconnect(hTCP As TCP_SOCKET)
   TCBStubs(hTCP).rxTail = TCBStubs(hTCP).rxHead

   Select TCBStubs(hTCP).smState

      Case TCP_FIN_WAIT_1, TCP_FIN_WAIT_2, TCP_LAST_ACK
         SendTCP(hTCP, RST Or ACK, false)
         CloseSocket(hTCP)

      Case TCP_SYN_SENT
         CloseSocket(hTCP)

      Case TCP_SYN_RECEIVED, TCP_ESTABLISHED
         SendTCP(hTCP, FIN Or ACK, true)
         TCBStubs(hTCP).smState = TCP_FIN_WAIT_1

      Case TCP_LOOPBACK:   //             <-----  THIS IS THE COLON I'M DELETING
         TCBStubs(hTCP).smState = TCP_LOOPBACK_CLOSED

   End Select
End Sub

With the colon, my code size is 53997. After deleting it, the size changes to 53999. I went through the asm files and found where the extra bytes came from. For some reason, the assembly changed in my DHCP module!

Here is the "before" (again, this is the DHCP module, not the TCP module where I made the change).

Code: Select all

?I000941_F020_000661_P000300 ; L#MK UDPGET(J)                     // GET OPTION LEN
        CLRF F18_U16H,0
        MOVLW 58
        MOVWF F18_U16,0
        CALL PROC_UDPGET_0
?I000942_F020_000662_P000300 ; L#MK WHILE J <> 0                     // IGNORE OPTION VALUES
WHILE_71
        MOVF F27_U08,1,0            ; <-----  KEEP YOUR EYE ON THIS LINE
        BZ FALSE_100
?I000943_F020_000663_P000300 ; L#MK UDPGET(V)
        CLRF F18_U16H,0
        MOVLW 56
        MOVWF F18_U16,0
        CALL PROC_UDPGET_0

And here's what I get after the change is made:

Code: Select all

?I000941_F020_000661_P000300 ; L#MK UDPGET(J)                     // GET OPTION LEN
        CLRF F18_U16H,0
        MOVLW 58
        MOVWF F18_U16,0
        CALL PROC_UDPGET_0
?I000942_F020_000662_P000300 ; L#MK WHILE J <> 0                     // IGNORE OPTION VALUES
WHILE_71
        MOVLW 4294967295                  ; <--------   WHAT THE....???????
        SUBWF F27_U08,0,0
        BZ FALSE_100
?I000943_F020_000663_P000300 ; L#MK UDPGET(V)
        CLRF F18_U16H,0
        MOVLW 56
        MOVWF F18_U16,0
        CALL PROC_UDPGET_0

Now I don't know much about assembly code, but why did MOVF F27_U08,1,0 turn into MOVLW 4294967295?
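
For reference, 4294967295 is 2^32 - 1, which is the bit pattern of -1 when stored in an unsigned 32-bit value, while MOVLW on the PIC18 only takes an 8-bit literal (0 to 255). The following is a small Python check of that arithmetic. It does not explain why the compiler emitted the value; it only shows that the operand cannot be a legitimate literal. The helper names are made up for illustration.

```python
import struct

MOVLW_MAX = 0xFF  # MOVLW encodes an 8-bit literal, so valid operands are 0..255


def fits_movlw(literal):
    """True if the literal can be encoded in a MOVLW instruction."""
    return 0 <= literal <= MOVLW_MAX


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


suspect = 4294967295
print(hex(suspect))          # all 32 bits set
print(as_signed32(suspect))  # the same bits read as a signed number
print(fits_movlw(suspect))   # whether MOVLW could ever hold it
```

An all-ones 32-bit value often points to an uninitialised or "not found" marker leaking into the output, but that is only a guess from the number itself.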

The source code for the offending lines is:

Code: Select all

...
         Else // select case
            UDPGet(j)                     // Get option len
            While j <> 0                     // Ignore option values
               UDPGet(v)
               Dec(j)
            Wend
         End Select
      Until lbDone
...

If I change it to the following, the compiler gets it correct again.

Code: Select all

...
         Else // select case
            UDPGet(j)                     // Get option len
            While j <> 0                     // Ignore option values
               UDPGet(v)
               Dec(j)
               ASM
               Nop
               End ASM
            Wend
         End Select
      Until lbDone
...

However, this is hardly a workaround, because with the next change I make, the error jumps somewhere else!

richardb
Posts: 310
Joined: Tue Oct 03, 2006 8:54 pm

Post by richardb » Fri Sep 11, 2009 10:32 am

Sounds like the same kind of bug as this one I reported earlier:

http://www.sfcompiler.co.uk/forum/viewtopic.php?t=1086

The PIC was also pretty full when I started to have problems.
Hmmm..

JWinters

Post by JWinters » Fri Sep 11, 2009 3:02 pm

I played around with it a little more and found that it appears to be related to conditional statements.

A MOVLW 4294967295 always occurs right after an If X <> 0 Then or a While X <> 0.

"Sounds like the same kind of bug as this one I reported earlier"

If you really think you have the same issue, compile, press F2 to open up the Assembly View and do a search for "MOVLW 4294967295".
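
The same search can also be run over a saved .asm file outside the IDE. Below is a minimal Python sketch that flags any MOVLW whose operand is too large for an 8-bit literal, not just the exact value 4294967295. It assumes the operands are written in decimal, as they are in the listings in this thread; the function name is hypothetical.

```python
import re

# Matches lines such as "        MOVLW 58" and captures the decimal operand.
MOVLW_RE = re.compile(r"^\s*MOVLW\s+(\d+)\b", re.IGNORECASE)


def find_bad_movlw(lines, max_literal=255):
    """Return (line_number, text) for every MOVLW with an operand above max_literal."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = MOVLW_RE.match(line)
        if match and int(match.group(1)) > max_literal:
            hits.append((number, line.strip()))
    return hits
```

Feed it the listing with `open(path).read().splitlines()`. An empty result means no oversized literal was found, which does not by itself prove the build is good.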

I've been chasing this bug for over a year now. I hope this is fixable.

JWinters

Post by JWinters » Mon Sep 14, 2009 4:39 am

I decided to try a different computer. No problems while compiling on my little netbook. The compile size was 51279 on my development machine, yet it was 51397 on the netbook. I wonder what the 118-byte difference is?

Off to re-install an operating system...

richardb

interesting

Post by richardb » Mon Sep 14, 2009 8:13 am

Are the versions of the compiler the same?

Can you do a diff on the two different libraries?

JWinters

Post by JWinters » Mon Sep 14, 2009 8:41 am

All the Library, UserLibrary and Include directories are identical. Both compilers were up to date.

It's not like my dev machine was buggy or full of viruses. It's a very stable computer. As a matter of fact, I rarely ever had a need to restart it. It has a 64-bit CPU, but I have 32-bit Windows XP on it. I can't imagine that makes a difference though. Very strange indeed.

Maybe a .NET framework issue? I use the same machine for Visual Basic work and have a lot of "developer pack" type things installed.

richardb

Post by richardb » Mon Sep 14, 2009 9:06 am

Just to get my head around what you're saying, have you done a byte-for-byte check to see that the compiler and libs are all the same?

JWinters

Post by JWinters » Mon Sep 14, 2009 9:22 am

I copied all of the library files from one computer to the other to be sure (overwriting the ones put there by the installer). So although I didn't do a byte by byte check, I am certain the files are the same.
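
A byte-for-byte check of two install trees can be scripted by hashing every file on each side and comparing the results. The following is a rough Python sketch of that idea, assuming both trees are reachable from one machine (for example over a network share or a USB copy). The function names are hypothetical and nothing here is specific to Swordfish.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in chunks so large executables are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {path.relative_to(root).as_posix(): file_digest(path)
            for path in sorted(root.rglob("*")) if path.is_file()}


def compare_trees(root_a, root_b):
    """Return (only_in_a, only_in_b, differing) lists of relative paths."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, differing
```

Three empty lists mean the two trees hold identical files under identical names. It covers executables as well as libraries, as long as they sit under the folders being compared.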

Jerry Messina

Post by Jerry Messina » Tue Sep 15, 2009 9:03 am

did you compare the executables as well?

I've had occasion where the online updater didn't place files in the right spot, but this may be due to
the fact that I use a different drive/dir structure than the default install.

JWinters

Post by JWinters » Tue Sep 15, 2009 11:42 am

I did notice that there was one instance of SwordfishICC.exe (2136 KB) located in C:/ but a different one in C:\Program Files\Mecanique\Swordfish\Bin which is 2013 KB. However, it didn't matter if I switched them or even copied them from another machine.

It was almost as if something external to Swordfish (like a random windows DLL or .NET framework library) was causing the problem. I suspect this was also the cause of the IDE errors I mentioned here (http://www.sfcompiler.co.uk/forum/viewtopic.php?t=844).

I've reinstalled XP and everything seems good now.