movlw 4294967295 compiler bug

JWinters
Posts: 106
Joined: Mon Feb 04, 2008 4:56 pm
Location: North Carolina, USA

Post by JWinters » Wed Nov 04, 2009 11:00 pm

I'm not sure whether to be happy or sad. On one hand I now know that I'm not crazy, but on the other hand, it appears to be an actual compiler bug that needs to be addressed. And with the disappearance of developer support, I'm not holding my breath.

Besides compiling on my "dinosaur" of a laptop, I've found no sure-fire work-around for this bug. Changing code might suppress it for that immediate build; however, the next change would make it come back again.

If you're desperate and need something compiled, I can sign a non-disclosure agreement and compile it on my "miracle machine" that seems impervious to this compiler bug.

normnet
Posts: 55
Joined: Mon Jan 01, 2007 6:32 pm

Post by normnet » Thu Nov 05, 2009 2:13 am

I know exactly what this is like.
The main program I use (another board) has a hardware compatibility
problem (blue screen crashes) on its latest version with my new PC.
Since I am the only one that has this issue there is no support.
Too bad I can't trade PCs or software licenses.

Is there anyone else experiencing this Swordfish bug?

Swordfish is touted for its organization for longer programs because of its
use of local variables, structures, procedures, functions etc....

Norm

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Thu Nov 05, 2009 11:27 am

In case anyone didn't notice, 4294967295 is FFFFFFFF in hex. It almost seems like something overflowed and a full-range number gets stuck in there.
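The arithmetic behind that observation can be checked with a few lines of Python. This is only an illustrative sketch and is not taken from the compiler or its output. It shows that 4294967295 is the largest unsigned 32-bit value, which is also what -1 looks like when wrapped to 32 bits, and that it is far outside the 0 to 255 range an 8-bit MOVLW literal can hold.

```python
# Illustrative check only; the variable names are made up for this sketch.
value = 4294967295

as_hex = format(value, "X")            # hexadecimal digits, upper case
is_u32_max = value == 2**32 - 1        # largest unsigned 32-bit value
minus_one_u32 = (-1) & 0xFFFFFFFF      # -1 wrapped to 32 bits
fits_in_movlw = 0 <= value <= 0xFF     # MOVLW takes an 8-bit literal

print(as_hex, is_u32_max, minus_one_u32, fits_in_movlw)
```

A value like this is consistent with an uninitialised or "invalid" marker (-1) leaking into the listing, though the thread does not confirm that this is the cause.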

After working with SF for so long, I'm thinking it's probably not an issue of a bug in the compiler as much as it may be that I did something wrong in my code but the compiler doesn't give me an error for it. Or the error is from MPASM and doesn't translate back to the SF error display. (maybe there is a way to find the suppressed MPASM warnings?) I'm going through my code to make sure all arrays and strings are properly defined and that I use BOUND and code inspection to make sure I haven't exceeded any, or missed any casts to a different type that may have been needed. Code reading is a necessary part of programming anyway and I need to review the work once in a while to ensure it is working as efficiently as possible.

I will also work on minimizing the number of comparisons. As I built my code over months and the number of if...then, case and while statements grew, that's when I first saw problems with SF.

SF is a great compiler but at times like this I see it needs more than one person working on it. This is a risk investors also run into, where they invest in a company that has one genius running it but if the genius can't work for some reason (health, whatever), the investors are stuck. An alternative positive strategy is to sell the asset to a new owner who can take it further (and it's not just the "thing" that is sold but also the potential for future revenues that come with a user base), and the genius and original investors make a tidy profit for their risk and efforts.

I would say that if very simple tutorials and instructions were set up to take new users through how SF works in small steps, especially its unique features, more users would adopt it and provide the profit justification to the owners to keep investing in it and supporting it. But people also have to know SF exists. Generating publicity, adding more resellers and even advertising at key places might also help get the word out. Getting on to the Microchip developer and tool list (SF is not there, if you can believe it) would give the product fantastic exposure right on the chip manufacturer's site. Or if a third party published a book that expanded on how to use it, it would also build the user base as more people figure out how to use it, as we see with many popular software packages. But if a programming tool looks cryptic to most new users then they won't adopt it. An aftermarket book is also another way to generate revenues from an additional value-added item. Some program authors even use a ghostwriter to publish the book if they are too busy to write.

73's,

de Tom

Doj
Posts: 362
Joined: Wed Apr 11, 2007 10:18 pm
Location: East Sussex

Post by Doj » Sat Nov 07, 2009 12:08 am

Can only offer my experience here, I use SF on very large commercial projects daily and have done for more than 2 years now I think!

Such issues as this, when experienced, were eventually found to be simple overflows in my code where I was not taking care to keep arrays or strings in proper order, or not terminating a string with a zero, etc...

Not wishing to say the compiler is not at fault, but it's really easy to get snow blinded into thinking it is the cause.

In all the time I have used the compiler I have thankfully not seen this issue. Whatever the problem, it could well be the result of an issue that is not being reported correctly by the compiler: the fault may well have occurred earlier, and the compiler is missing it and reporting whatever comes next in its view.

As an exercise, if it were to happen here, I would try to think that it must be my fault, so what is going on that might possibly be an issue of my own making especially if it has nothing to do with what is seen as the result.

My favourite **** ups are:-
1, Creating a string from received data and writing too many values to it.
2, Forgetting a string must have an extra byte at the end and it must be a zero.
3, Creating a string in a loop but not explicitly putting a zero at its end, regardless of the number of characters in the string.
4, The compiler default is 24 bytes for a string, 23 characters plus the terminator.
5, Expecting the compiler to know under every single circumstance how many bytes will be stored in a string without me telling it (it's usually very good but how the heck can we expect it to be perfect?)
6, Leaving the computer on when the twins come home and they think they are still at school.
7, Writing code while drinking far too much beer and debugging the twins' code.

The first five are honest mistakes, the last 2 are because it seems you are more likely to have twins after 40 years old, bugger.
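The string mistakes in that list (items 1 to 5) come down to one rule: a buffer of N bytes holds at most N-1 characters plus a zero terminator. The following is a minimal Python model of that rule, not Swordfish code. The function name `store_string` is hypothetical, and the 24-byte size is taken from item 4 above. Python itself cannot overrun a buffer the way a PIC program can, so the bounds check here stands in for the check you would have to do by hand.

```python
BUFFER_SIZE = 24  # default string size quoted in the post: 23 characters + terminator


def store_string(text: str, size: int = BUFFER_SIZE) -> bytearray:
    """Copy text into a fixed-size buffer, always leaving room for a 0 terminator."""
    data = text.encode("ascii")
    if len(data) > size - 1:
        # On the real device this is where neighbouring variables would be overwritten.
        raise ValueError(f"{len(data)} characters do not fit in {size} bytes")
    buf = bytearray(size)        # zero-filled buffer
    buf[:len(data)] = data       # copy the characters
    buf[len(data)] = 0           # explicit terminator, even though the buffer was zeroed
    return buf


print(store_string("hello"))
```

Writing the terminator explicitly mirrors item 3: when a string is built character by character in a loop, nothing else will put the zero there.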

liak
Registered User
Posts: 195
Joined: Fri Oct 05, 2007 12:26 am

Post by liak » Sat Nov 14, 2009 1:23 pm

Dear all,

I have been following this thread, though I haven't encountered the problem, likely because my projects are not so huge. So what I conclude is that:

1. this is a compiler bug
2. support is not available on this issue.

Am I right?
If so, then this will definitely limit the credibility of SF. I have loved SF all this while; it would be such a waste if these are true.

Anyone got news from David?
Is he alright?

Regards,
Loh

Raistlin
Registered User
Posts: 69
Joined: Tue Apr 01, 2008 1:13 pm

Post by Raistlin » Sat Nov 14, 2009 2:06 pm

David is alive and well just very busy at the moment trying to bring in the pennies :)

I have spoken to him recently and he sounded very much alive ; lol
If you can read this you are too close

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Sun Nov 15, 2009 2:40 pm

One other thing I found...one of the subs had defined a variable that was the same name as a global ("x" and "y", the old standbys in engineering...). I would think this would only confuse the compiler....

Jerry Messina
Swordfish Developer
Posts: 1473
Joined: Fri Jan 30, 2009 6:27 pm
Location: US

Post by Jerry Messina » Sun Nov 15, 2009 4:13 pm

That's not a problem at all...
the scope of a variable definition is limited to the module/sub/function that it's declared in (unless you declare it 'public').

It wouldn't be a very useful compiler otherwise.

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Fri Nov 20, 2009 3:38 pm

Maybe I found something here. This is the sub that writes to external EEPROM on the I2C bus. I write one byte at a time. The line calling the write sub has the EEPROM address (pControl, called PARAM_EEPROM, which is a constant alias for the location in hex), the address in that EEPROM (pAddress), and the byte data to be written (pData).

When I look at the assembler that comes from a call to the EEPROM writing routine, I see that the hex of address Byte0 is converted to decimal. But the addresses are supposed to be in hex.

Here is the sub:

Code: Select all

Sub WriteByte(pControl As Byte, pAddress As Word, pData As Byte)
 
      I2C.Start               
      I2C.WriteByte(pControl) 
      I2C.WriteByte(pAddress.Byte1) 
      I2C.WriteByte(pAddress.Byte0) 
      I2C.WriteByte(pData) 
      I2C.Stop     
     // WaitForWrite(pControl) // found it didn't work....put in the wait below instead
      
   //USART2.Write("Write_word address is: ",DecToStr(paddress))  //Debug message
   I2C.Stop            
   DelayMS(10) //put this in to make sure it's really done
End Sub

Here is the line that calls it:

WriteByte(PARAM_EEPROM,TXING_MSG_STATE,$02)

So I am trying to write a 2 to the EEPROM location TXING_MSG_STATE.

Here is the constant that gives the address in the EEPROM:

const TXING_MSG_STATE = $80F


And here is the resulting assembler generated in SF:

Code: Select all

?I002916_F000_004359_P000360 ; L#MK WRITEBYTE(PARAM_EEPROM,TXING_MSG_STATE,TRANSMITTING_MESSAGE_...
        MOVLW 162
        MOVLB 2
        MOVWF PCONTROL_F575_U08
        MOVLW 8
        MOVWF PADDRESS_F576_U16H
        MOVLW 15  ;<<<-- LOOKIE HERE - 15 not $0F!
        MOVWF PADDRESS_F576_U16
        MOVFF TRANSMITTING_MES_F1088_U08,PDATA_F578_U08
        MOVLB 0
        CALL PROC_WRITEBYTE_2
ENDIF_228

I put a note at byte 0 of the address in the code above to point out that somehow, it was converted from hex $0F to its decimal equivalent (15)!

To see what would happen, I changed the location to $816:

Code: Select all

const TXING_MSG_STATE = $816

And the resulting assembler was as follows:

Code: Select all

?I002916_F000_004359_P000360 ; L#MK WRITEBYTE(PARAM_EEPROM,TXING_MSG_STATE,TRANSMITTING_MESSAGE_...
        MOVLW 162
        MOVLB 2
        MOVWF PCONTROL_F575_U08
        MOVLW 8
        MOVWF PADDRESS_F576_U16H
        MOVLW 22  ;<<<-- LOOKIE HERE - 22 not $16
        MOVWF PADDRESS_F576_U16
        MOVFF TRANSMITTING_MES_F1088_U08,PDATA_F578_U08
        MOVLB 0
        CALL PROC_WRITEBYTE_2
ENDIF_228

Now location $16 was converted to its decimal equivalent, 22!

When messing with EEPROM locations like this and making calls to key parameters, this would obviously cause problems. I only have a few locations higher than $09 as the LSB so maybe that's why the program basically runs fine when parameters are rearranged and it's recompiled after a problem is found.

What am I missing here?? What would cause the value of the address byte to be converted to decimal? How could I prevent that?

-Tom

Jerry Messina
Swordfish Developer
Posts: 1473
Joined: Fri Jan 30, 2009 6:27 pm
Location: US

Post by Jerry Messina » Fri Nov 20, 2009 6:28 pm

RadioT wrote:
When I look at the assembler that comes from a call to the EEPROM writing routine, I see that the hex of address Byte0 is converted to decimal. But the addresses are supposed to be in hex.
By default, the assembler works in decimal. There's nothing wrong with that code at all (at least from the standpoint of $0f showing up as 15).
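
To make that point concrete, here is a small Python sketch (not Swordfish or MPASM code) that splits the two constants from the post into high and low bytes the way `pAddress.Byte1` and `pAddress.Byte0` are used in the sub. The helper name `split_word` is made up for this illustration. The values 8 and 15 are the same two bytes as $08 and $0F, only printed in a different radix.

```python
def split_word(address: int) -> tuple[int, int]:
    """Return (high byte, low byte) of a 16-bit address."""
    high_byte = (address >> 8) & 0xFF   # corresponds to pAddress.Byte1
    low_byte = address & 0xFF           # corresponds to pAddress.Byte0
    return high_byte, low_byte


for address in (0x80F, 0x816):          # $80F and $816 in Swordfish notation
    high, low = split_word(address)
    rebuilt = (high << 8) | low         # the bytes recombine to the original address
    print(f"{address:#06x}: high={high} low={low} rebuilt={rebuilt:#06x}")
```

The EEPROM receives the byte values themselves, not their printed digits, so writing 8 followed by 15 addresses location $080F, not "815".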

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Fri Nov 20, 2009 6:35 pm

Jerry Messina wrote:
When I look at the assembler that comes from a call to the EEPROM writing routine, I see that the hex of address Byte0 is converted to decimal. But the addresses are supposed to be in hex.
by default, the assembler works in decimal. there's nothing wrong with that code at all (at least from the standpoint of $0f showing up as 15)
Weeell....that would be OK but the upper byte is an 8. Put it together and you get an 8 as the upper byte to the EEPROM and then a 15. So the EEPROM gets "815" as the location, not "80F". So maybe the routine needs to be modified to get the correct number. The EEPROM expects the location in hex, so this breaks if it is changed to decimal along the way, unless the object code puts it in hex (there's another half hour to check that out).

-Tom

Jerry Messina
Swordfish Developer
Posts: 1473
Joined: Fri Jan 30, 2009 6:27 pm
Location: US

Post by Jerry Messina » Fri Nov 20, 2009 6:59 pm

maybe it would be clearer if you worked in binary, as do all the chips.

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Fri Nov 20, 2009 7:07 pm

Jerry Messina wrote:
maybe it would be clearer if you worked in binary, as do all the chips.
It would be a much simpler world.

BTW, I see your point, the machine code is in hex!! And I thought I found something to explain these random errors...I'll keep checking. Actually, not exactly random. They occur in functions that are nested in CASE and IF...THEN statements. Always.

-Tom

richardb
Posts: 310
Joined: Tue Oct 03, 2006 8:54 pm

Post by richardb » Sat Nov 21, 2009 10:10 am

just a thought, is your problem just a simple hardware stack overflow problem? or does swordfish use a software stack?

the hardware stack is 31...
Hmmm..

RadioT
Registered User
Posts: 157
Joined: Tue Nov 27, 2007 12:50 pm
Location: Winnipeg, Canada

Post by RadioT » Tue Nov 24, 2009 4:24 am

Stack overflow...good one. I looked over the code and eliminated a layer by moving a case statement into the main loop rather than having it in a sub. Then all the called subs and their nested loops are all one level higher.

But things really started working better after I corrected code where the I2C bus was being accessed by a function without first disabling interrupts. The code has been working well since making that change, even after many recompiles, except for one situation where some assembler prior to a branch statement was missed. This has happened before, and looking at the code, it's happening where the code is nested in an "if" statement, then jumps to a sub, then a "Case" statement, then another "if" statement, and another "if" statement in that. So basically nested 5 deep. And I've seen this exact error before in the same piece of code, I change a statement or the order it's in, and it compiles fine.

So here's one for the "watch out" list: think about whether your I2C bus reads/writes need to be preceded by disabling interrupts and followed by re-enabling. In my case, I put in the following statement at the start of the function:

Code: Select all


Dim Restore_interrupt As Boolean
Restore_interrupt = false
If INTCON.7 = 1 Then
   Disable(interrupt)
   Restore_interrupt = true
EndIf

Then at the end of the function, put in the line:

Code: Select all

If Restore_interrupt Then Enable(interrupt) EndIf

This seems to work well, as far as I can tell. Anytime the global interrupt was enabled on going into the function, it's disabled and re-enabled, and anytime it was not enabled going into the function, it's left as-is.
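
The save, disable and conditionally restore pattern described above can be modelled in a few lines of Python. This is only a sketch of the logic, not PIC or Swordfish code: `FakeCpu` and its `gie` attribute are hypothetical stand-ins for the chip and the global interrupt enable bit (INTCON.7), and `interrupts_masked` is a made-up name.

```python
from contextlib import contextmanager


class FakeCpu:
    """Stand-in for the PIC; gie models the global interrupt enable bit (INTCON.7)."""

    def __init__(self, gie: bool):
        self.gie = gie


@contextmanager
def interrupts_masked(cpu: FakeCpu):
    was_enabled = cpu.gie          # remember the state on entry
    cpu.gie = False                # keep interrupts off during the bus access
    try:
        yield
    finally:
        if was_enabled:            # restore only if it was on before
            cpu.gie = True


cpu = FakeCpu(gie=True)
with interrupts_masked(cpu):
    print("inside:", cpu.gie)      # the I2C read/write would happen here
print("after:", cpu.gie)
```

Restoring the saved state, rather than unconditionally enabling at the end, is what lets the function be called safely from code that already has interrupts turned off.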

By the way, to see all the MPASM messages, I went to the Microchip directory and ran MPASMWIN.exe in C:\Program Files\Microchip\MPASM Suite\ . I was able to load the latest .asm file and choose the error and warning messages to display by checking the respective boxes. There wasn't anything in the messages that provided any clues, but it was good to know what the "suppressed" messages and warnings were.

-Tom
