#emc-devel | Logs for 2008-03-29

[00:11:32] <alex_joni> * alex_joni is off to bed
[00:11:39] <alex_joni> g'night again ;)
[02:10:00] <Guest252> Guest252 is now known as skunkworks
[02:10:13] <skunkworks> jepler: I have an odd issue..
[02:10:56] <skunkworks> I thought that 0xfff would set all the pins to inputs. Which it seems to do - but 5 times now - the computer hard locsk
[02:11:00] <skunkworks> locks
[02:11:11] <jepler> that's no good
[02:12:14] <skunkworks> Initially - I would set the hal file to 0x000. This works - I can toggle outputs. Then I change the hal file to 0xfff - the computer will lockup.
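A minimal sketch of how a direction mask like this might be decoded, assuming one bit per pin group and that a set bit means "input" (consistent with 0xfff being all inputs and 0x000 all outputs, as described above). The group count of 12 and the names NUM_GROUPS and group_is_input are illustrative assumptions, not taken from the driver.

```c
/* Hypothetical decoding of a 12-bit direction mask (not the driver's code). */
enum { NUM_GROUPS = 12 };  /* assumption: 0xfff has 12 set bits, one per group */

/* Returns 1 if the group is an input, 0 if an output, -1 if out of range. */
static int group_is_input(unsigned dir, int group)
{
    if (group < 0 || group >= NUM_GROUPS)
        return -1;
    return (int)((dir >> group) & 1u);
}
```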
[02:12:51] <jepler> increase your period
[02:13:20] <jepler> hm, I used a 1ms period and it seems to have killed my machine too
[02:13:31] <jepler> must be a bug :-P
[02:13:36] <skunkworks> If I reboot - It seems to run with 0xfff - but when I try to set an input to low or high - I don't see a change on halmeter
[02:13:37] <skunkworks> heh
[02:14:01] <jepler> it locked up after I issued "halcmd start"
[02:14:12] <jepler> having added 'write-all' to the thread
[02:14:16] <skunkworks> that is what I am using I think.. I am running the 8255 in the servo period
[02:14:23] <skunkworks> heh
[02:14:38] <skunkworks> *servo thread
[02:15:49] <jepler> er, maybe 'read-all' -- I already forgot what I did
[02:15:56] <jepler> and then I closed the terminal on the crashed machine
[02:16:33] <skunkworks> I could see if I can figure out the exact steps.. It will be running thru emc-axis though
[02:17:12] <skunkworks> I don't do the fancy hal only stuff ;)
[02:18:26] <skunkworks> sounds like a race condition.. (j/k I thought I would throw buzz phrase out there.)
[02:24:10] <jepler> it must be some bonehead error on my part
[02:24:24] <jepler> I'll try to look at it sunday -- I have no excuse, since I have the board in a machine
[02:24:39] <skunkworks> Cool - thanks jepler.
[02:24:45] <skunkworks> :)
[02:27:21] <skunkworks> * skunkworks is glad it may not be all him
[02:30:19] <skunkworks> jepler: also - do you think the pins should have an -in or -out like the printer port pins do?
[02:30:40] <skunkworks> like pci8255.0.0.b7-out
[02:34:02] <jepler> skunkworks: oh probably
[03:00:31] <jmkasunich> skunkworks: what driver? pci_8255?
[03:01:57] <skunkworks> yes
[03:03:36] <cradek> seems like it would be easy to spot, but I don't spot it
[03:04:13] <jmkasunich> the macros don't help
[03:05:15] <cradek> I wonder if you can set them to all inputs if you don't call the write functions. that would narrow it down.
[03:05:46] <skunkworks> I can try that. hold on
[03:10:41] <skunkworks> still locks up - removing the addf pci8255.write-all servo-thread -1
[03:11:36] <cradek> did you still have write-relay?
[03:11:45] <skunkworks> I took them both out.
[03:11:46] <jmkasunich> what exactly are you doing? setting dir = 0xfff?
[03:11:49] <skunkworks> yes
[03:12:17] <cradek> huh
[03:12:28] <cradek> that narrows it down to export()?
[03:12:40] <cradek> (not what I expected)
[03:15:31] <jmkasunich> but it doesn't crash until you do halcmd start?
[03:15:57] <cradek> jepler said that
[03:16:10] <jmkasunich> then the crash isn't export
[03:16:19] <cradek> oh that's true, hmm
[03:16:32] <jepler> read is in a thread too?
[03:17:33] <skunkworks> asking me? yes. http://pastebin.ca/961064
[03:18:08] <skunkworks> (I had just commented out the two writes.)
[03:18:21] <skunkworks> I rebooted and ran again - same result.
[03:18:47] <jmkasunich> so its read that must be dying
[03:18:48] <cradek> can you also remove read from the thread and see if still hangs?
[03:18:57] <skunkworks> sure
[03:19:49] <skunkworks> (booting)
[03:22:43] <jmkasunich> I see something suspicious in export
[03:22:53] <jmkasunich> inst is passed in as "ptr to port"
[03:23:13] <jmkasunich> but then it does memset(inst,0,sizeof(struc state))
[03:23:30] <cradek> aha
[03:23:31] <jmkasunich> struct state is 3x plus bigger than struct port
[03:24:24] <skunkworks> It didn't lock up instantly.. Gave it some time and then added the function in halcmd. Didn't lock up instantly but did after about 20 seconds.
[03:24:29] <cradek> ding ding ding
[03:25:01] <jmkasunich> the size thing?
[03:25:09] <cradek> yeah that's gotta be bad
[03:25:32] <jmkasunich> might not be the particular bad that we're chasing, but worth a test
[03:25:41] <jmkasunich> skunkworks: you are working from a CVS checkout, right?
[03:26:20] <jmkasunich> once you reboot, open src/hal/drivers/pci_8255.c in your favorite editor
[03:26:59] <jmkasunich> line 63, change from "sizeof(struct state)" to "sizeof(struct port)"
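A sketch of the class of bug being discussed: zeroing through a pointer to a small struct while passing the size of a larger one. The struct layouts below are hypothetical stand-ins, not copied from pci_8255.c; only the names struct port, struct state and the ioaddr field come from the conversation. Writing the size as sizeof *p ties it to the pointer's own type, so the two cannot drift apart.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts for illustration only. */
struct port {
    unsigned long ioaddr;
    unsigned dir_bits;
};

struct state {
    unsigned long ioaddr;
    struct port ports[3];   /* assumption: one state owns several ports */
};

/* Zero exactly one port; the size follows the type of p. */
static void clear_port(struct port *p)
{
    memset(p, 0, sizeof *p);
}

/* Bytes that memset(p, 0, sizeof(struct state)) would write past one port. */
static size_t overrun_bytes(void)
{
    return sizeof(struct state) - sizeof(struct port);
}
```

With these made-up layouts, clear_port() leaves the neighbouring ports untouched, while the oversized call would have stomped overrun_bytes() bytes of whatever follows the port in memory.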
[03:27:35] <skunkworks> yes - ok
[03:27:40] <skunkworks> then make?
[03:27:43] <jmkasunich> wait a sec
[03:28:10] <jmkasunich> interesting - struct port and struct state both have ioaddr fields
[03:28:34] <jmkasunich> ok, export is clearly dealing with a port
[03:28:44] <jmkasunich> skunkworks: yes, make that change, then run make
[03:28:52] <skunkworks> ok
[03:30:27] <jmkasunich> if it is the memset thing, I wonder why it only breaks with fff
[03:31:04] <jmkasunich> oh - for outputs, there is a param invert, for inputs its a pin -invert
[03:31:19] <jmkasunich> if that data is getting stomped, stomping a param doesn't hurt
[03:31:32] <jmkasunich> stomping a pin means there is a pointer off into the wild somewhere
[03:31:50] <jmkasunich> and read does a write using that pointer
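The reasoning above can be sketched as follows: a parameter is a value stored in place, so zeroing it only changes the value, while a pin is a pointer to storage elsewhere, so zeroing it leaves a pointer that no longer refers to valid memory. The types bit_t, out_line and in_line are simplified stand-ins invented for this sketch, not the HAL types. The NULL check exists only so the sketch is safe to run; per the discussion, the read function simply writes through the pointer.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char bit_t;            /* stand-in for a HAL bit */

struct out_line { bit_t invert; };      /* parameter: value held in place */
struct in_line  { bit_t *value; };      /* pin: pointer to storage elsewhere */

/* Store a raw input through the pin pointer.
 * Returns 0 on success, -1 if the pointer has been wiped. */
static int read_line(struct in_line *l, bit_t raw)
{
    if (l->value == NULL)
        return -1;      /* without this check, the write hits a wild address */
    *l->value = raw;
    return 0;
}
```

On common platforms an all-zero pointer is NULL, which is why a stray memset over a pin turns a later write in read into a write to an invalid address.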
[03:31:58] <skunkworks> still locke
[03:32:01] <skunkworks> locked
[03:32:04] <jmkasunich> well poo
[03:32:12] <skunkworks> let me reboot
[03:34:17] <cradek> what a pain :-)
[03:35:21] <jmkasunich> I'm a little unnerved by the fact that there is a global inst = ptr to struct state, but inst is also used in several functions and macros as a local ptr to port
[03:35:34] <jmkasunich> but I don't think that's actually an issue
[03:38:32] <jmkasunich> where is the data for this board?
[03:38:58] <jmkasunich> (I'm curious about SHIFT in the inline functs READ and WRITE
[03:39:01] <cradek> I think they mailed a cd with the card, but I'm not sure
[03:39:32] <jmkasunich> this board looks like it only uses the low byte of every 32-bit word
[03:39:57] <jmkasunich> (or at least thats what the code says - it does outb and inb to offset*4
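A small sketch of the addressing scheme described here, assuming SHIFT is 2 so that each 8-bit register sits at a 4-byte stride (offset*4) from the base. The value of SHIFT, the function name reg_addr and the example base address are assumptions for illustration; the register order (port A, B, C, control at 0..3) is the standard 8255 layout.

```c
enum { SHIFT = 2 };   /* assumed: 1 << 2 == 4 bytes between registers */

/* Standard 8255 register numbers. */
enum { REG_PORT_A = 0, REG_PORT_B = 1, REG_PORT_C = 2, REG_CONTROL = 3 };

/* I/O address of a register when only the low byte of each 32-bit word is used. */
static unsigned long reg_addr(unsigned long base, unsigned reg)
{
    return base + ((unsigned long)reg << SHIFT);
}
```

If the stride were wrong, the HAL pins would still toggle in software while the wrong hardware registers were written, which is why checking the physical outputs with a meter is a useful test.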
[03:40:33] <jmkasunich> skunkworks: when you use a dir value that doesn't make it crash, have you tested the actual physical outputs?
[03:40:49] <jmkasunich> turn them on/off and check with a meter?
[03:41:18] <cradek> the fff is all inputs
[03:41:26] <jmkasunich> right
[03:41:29] <cradek> with all outputs, it seems to run, and at least halmeter shows the pins working
[03:41:48] <jmkasunich> but are the physical outputs working?
[03:41:52] <cradek> good question
[03:42:01] <jmkasunich> the card addressing could be totally wrong, and the hal pins wouldn't show that
[03:42:08] <cradek> if skunkworks sends me one, I'll test it
[03:42:19] <jmkasunich> OTOH, if the correct outputs do change state, then the addressing is probably right
[03:42:32] <skunkworks> Outputs seem to work ( I have only tested a few pins on the first plug)
[03:42:45] <jmkasunich> ok - the fact that some work is useful info
[03:42:47] <cradek> good sign
[03:43:19] <skunkworks> On a side note with 0xff now the first 8 bits are outputs now after the reboot.. and no lockup - of course.
[03:43:27] <skunkworks> I mean 0xfff
[03:43:34] <jmkasunich> huh?
[03:43:41] <skunkworks> (that is what I thought)
[03:43:55] <cradek> skunkworks: talk slower and try again :-)
[03:43:55] <jmkasunich> I thought fff meant "all inputs" and that caused lockups?
[03:44:20] <skunkworks> wait. let me paste it.
[03:45:21] <skunkworks> sorry - I was wrong. but it is running right now - no lockup and they all seem to be inputs.
[03:45:37] <skunkworks> Let me see if I can change the state
[03:45:49] <cradek> after that code change it's working with all inputs 0xfff?
[03:45:50] <jmkasunich> hold on
[03:46:07] <skunkworks> after it locked up initially.. And I rebooted
[03:46:12] <jmkasunich> don't change anything - we need to understand exactly what is going on here
[03:46:31] <skunkworks> :) what do you need?
[03:46:40] <jmkasunich> a while ago, it locked up, then I asked you to make an edit after rebooting
[03:46:56] <jmkasunich> you made the edit, did the compile, then tried the exact same thing
[03:46:59] <jmkasunich> it locked up again
[03:47:12] <jmkasunich> then you rebooted again, and tried the exact same thing, and it didn't lock up?
[03:47:40] <skunkworks> yes - exactly
[03:48:14] <jmkasunich> so there is no difference between the last two runs, but one worked and one didn't
[03:48:18] <jmkasunich> oh joy
[03:48:30] <skunkworks> I may have tried a 0x000 before the change.. during the same session - just to make sure I could flip an output.
[03:48:51] <jmkasunich> before the change - what does that mean?
[03:48:55] <skunkworks> then changed it to 0xfff and it locked - reboot - now working.
[03:48:55] <jmkasunich> before you edited?
[03:49:03] <skunkworks> sorry - don't remember.
[03:49:23] <jmkasunich> ok, from now on, please tell us exactly what you are doing before you do it
[03:49:38] <skunkworks> I know - I know - You have scolded me before about that.
[03:49:42] <skunkworks> I will learn
[03:49:46] <jmkasunich> that memset bug modifies memory that it shouldn't regardless of the value of dir
[03:50:28] <jmkasunich> so if you rebooted, ran the old code (with dir=000), it may not have locked, but it might have corrupted something - then you did the edit, compiled, and ran again, but maybe it was already corrupted
[03:50:38] <jmkasunich> (that is just speculation....)
[03:50:47] <skunkworks> I could believe that..
[03:51:09] <jmkasunich> right now, with the change, it is running with dir=fff?
[03:52:49] <skunkworks> yes - and I just tested the input - thru the voltmeter - when I change the probe to ground - the pin is false - when I put the probe to +5 it is true. First time I have gotten any input response.
[03:53:05] <cradek> whee
[03:53:31] <jmkasunich> it looks like the memset bug was it, but I'm disturbed that you didn't see it fixed right after making the change
[03:54:07] <skunkworks> I am chalking it up to running it before the source edit.. (if that is what I did)
[03:54:28] <skunkworks> well - let me switch between 0x000 and 0xfff a few times
[03:54:41] <jmkasunich> even if you did, I'd be astonished if the corruption persisted after you shut down HAL and restarted it again after the edit and make
[03:55:34] <jmkasunich> anyway, I'm convinced that the memset is _a_ bug, even if I'm not sure it was _the_ bug
[03:55:40] <jmkasunich> do you have commit access?
[03:55:56] <skunkworks> me? heck no.
[03:56:03] <jmkasunich> ok, I'll commit it
[03:56:11] <jmkasunich> this driver is only in trunk, right?
[03:56:36] <cradek> it's also in 2.2
[03:57:41] <skunkworks> hmm - when I set an output to true (+5) and exit emc - it is still +5... that doesn't seem right
[03:57:51] <jmkasunich> why not?
[03:57:55] <jmkasunich> nothing cleared it
[03:58:45] <skunkworks> ok - I for some reason thought emc would leave it as it had started..
[03:58:56] <skunkworks> I guess that is what posthalmumble file is for
[03:59:06] <jmkasunich> cradek: I'm confused by this: http://cvs.linuxcnc.org/cvs/emc2/src/hal/drivers/pci_8255.c?graph=1
[03:59:07] <cradek> or a charge pump
[03:59:24] <cradek> jmkasunich: howso
[03:59:45] <jmkasunich> oh, never mind
[04:00:03] <skunkworks> shit - I locked up
[04:00:07] <jmkasunich> shit
[04:00:13] <cradek> well shit
[04:00:39] <cradek> might be informative to watch dmesg in the time before the lockup
[04:01:03] <jmkasunich> this driver isn't very verbose
[04:01:19] <cradek> sometimes rtai tells you things...
[04:01:32] <jmkasunich> only after the lockup I bet
[04:01:39] <jmkasunich> at which point its too late
[04:01:51] <cradek> yeah, long shot.
[04:01:52] <skunkworks> I exited emc - changed the hal file to 0x000 - ran emc - set the first output to 5v (true) (worked) then exited emc - changed the hal file to 0xfff started emc - went to halmeter to pick an input - locked up
[04:02:17] <jmkasunich> have you been using emc all along, or just halcmd?
[04:02:22] <skunkworks> emc
[04:02:30] <cradek> you did compile it right? and you're sure you're running the one you compiled?
[04:02:58] <cradek> not calling you stupid - but we all screw up those steps sometimes
[04:03:00] <skunkworks> I did a make - I saw the 8255 file fly by
[04:03:08] <cradek> ok
[04:03:49] <skunkworks> I am going to have to call it a night.
[04:04:01] <cradek> goodnight
[04:04:12] <skunkworks> Thanks for trying.
[04:04:13] <jmkasunich> jepler said something about looking at it on sunday - he's gonna be away till then?
[04:04:15] <cradek> jepler will be able to spot it, he even has the card too
[04:04:19] <jmkasunich> goodnighty skunkworks
[04:04:22] <cradek> yes
[04:04:23] <jmkasunich> -y
[04:04:28] <skunkworks> Yes - he said maybe sunday
[04:04:39] <skunkworks> thanks again
[04:04:45] <jmkasunich> cradek: you do agree that the memset is a bug, right?
[04:04:50] <cradek> yes
[04:04:56] <jmkasunich> I'll commit that fix now, and leave the rest for jeff
[04:05:00] <cradek> ok
[04:05:50] <CIA-22> EMC: jmkasunich TRUNK * emc2/src/hal/drivers/pci_8255.c: zeroing out the wrong size structure
[04:06:20] <cradek> jmkasunich: I got half of my frame done tonight. It would have been better if I had done the brazing instead of letting my friend the raccoon (who was apparently drunk) do it.
[04:06:41] <jmkasunich> raccoon?
[04:06:45] <cradek> (I'm not quite an expert at that)
[04:07:01] <cradek> it's sloppy, so I couldn't have done it
[04:07:11] <cradek> but maybe I'm misremembering
[04:08:15] <CIA-22> EMC: jmkasunich v2_2_branch * emc2/src/hal/drivers/pci_8255.c: zeroing out the wrong size structure
[04:08:17] <jmkasunich> I've always been better at metal removal technologies than metal addition ones
[04:08:29] <cradek> yep
[04:09:07] <jmkasunich> midnight again
[04:09:11] <jmkasunich> dammit
[04:09:23] <cradek> but friday!
[04:09:40] <jmkasunich> true
[04:09:47] <jmkasunich> but I have to take the dog to the vet tomorrow at 9
[04:09:59] <jmkasunich> which means the alarm is going to go off pretty much normal workday time
[04:10:26] <cradek> uh-oh, is he sick?
[04:10:30] <jmkasunich> the good news is that the first operation is done on all 20 of my parts
[04:10:43] <jmkasunich> no, routine maintenance - change the oil, rotate the tires, etc ;-)
[04:10:59] <cradek> ah, good
[04:11:03] <jmkasunich> shots and such
[04:11:53] <cradek> what's the second op?
[04:12:03] <jmkasunich> face the other end to length
[04:12:08] <jmkasunich> easy by comparison to the first
[04:12:33] <cradek> what are these?
[04:12:38] <jmkasunich> I'll install the studs in the parts (in the hole I made first op), chuck up a nut in the three jaw, screw part in, start program, unscrew, repeat
[04:12:57] <jmkasunich> 2x2x1 with a M12 stud sticking out about 5/8"
[04:13:00] <cradek> cool, sounds really easy.
[04:13:32] <jmkasunich> they glue the bottom to a tile, attach a bayonet type widget to the stud, and stick it in a tensile tester to pull the tile off of a concrete substrate, to check bond strength
[04:13:53] <jmkasunich> the widget mates it with the tester
[04:13:56] <cradek> ah
[04:14:19] <jmkasunich> they need a bunch of them because they test large batches
[04:14:30] <jmkasunich> then they bake them to break down the glue and clean for reuse
[04:15:08] <jmkasunich> the 2nd op is relatively easy, but will still be slow
[04:15:22] <jmkasunich> I'm still facing off a square, with all the pounding that implies
[04:15:31] <cradek> and with no css
[04:15:32] <jmkasunich> still running a large diameter range, so slow speed
[04:15:46] <jmkasunich> yep
[04:16:12] <jmkasunich> I have to take off about 0.100
[04:16:37] <jmkasunich> I wonder if it would actually be more secure to put them in the 4-jaw like I did for the first op
[04:16:51] <cradek> I guess it doesn't matter if they're centered
[04:16:55] <jmkasunich> well, I know it would be more secure, I guess I wonder if I need that security
[04:16:56] <cradek> not sure how you'd get them flat though
[04:17:18] <jmkasunich> seat the already machined face against the front of the chuck
[04:17:32] <cradek> oh right, they're bigger than the hole
[04:18:47] <jmkasunich> oh well, thats tomorrows problem
[04:18:49] <jmkasunich> goodnight
[04:18:51] <cradek> goodnight
[10:06:37] <CIA-22> EMC: alex_joni TRUNK * emc2/src/hal/drivers/mesa7i43-firmware/.cvsignore: a bit more CVS silencing
[18:30:30] <CIA-22> EMC: tissf TRUNK * emc2/docs/src/config/ini_homing_fr.lyx: French translation update
[18:30:41] <CIA-22> EMC: tissf TRUNK * emc2/src/po/ (fr_axis.po fr_rs274_err.po): French translation update
[19:03:30] <CIA-22> EMC: seb TRUNK * emc2/src/hal/utils/bfload.c: Enable READY in PLX 9030 LASxBRD registers if the EEPROM forgot to.
[20:52:13] <skunkworks> sometimes when I run 0xfff it will run for a few minutes - up to maybe 20 on a fresh boot. and the inputs work. but after that - hardlock
[20:52:24] <skunkworks> bbs
[20:52:27] <skunkworks> bbl
[21:06:25] <jmkasunich> arg - worst possible scenario for debugging
[21:17:17] <LawrenceG> how about running top to see if memory is being eaten up?
[21:18:25] <LawrenceG> I am not sure how kernel memory shows up on top... less user space ram?
[21:19:04] <alex_joni> LawrenceG: I bet it doesn't eat that much memory up
[21:19:23] <alex_joni> it's rather that it writes outside of its bounds and does some data corruption
[21:20:06] <LawrenceG> death by pointers... can be hard to spot
[21:23:26] <jmkasunich> yep
[21:23:46] <jmkasunich> LawrenceG: on another topic - I finally got in touch with CentryCo (those screw covers)
[21:24:12] <LawrenceG> $$$?
[21:24:44] <jmkasunich> a cover that is 15.75" long extended, 1.18" long compressed, 1" ID, and 1.7" OD, is $72.30 and two weeks
[21:24:51] <jmkasunich> minimum order is $150
[21:25:48] <jmkasunich> the travel range might be marginal for a shoptask X, especially if you ever remove the tailstock and run the saddle way down there
[21:26:18] <jmkasunich> the next size up is 19.69" extended, 1.18" compressed, 1" ID, 1.93" OD
[21:26:23] <jmkasunich> and I didn't think to ask the price
[21:26:33] <jmkasunich> probably not a lot more, maybe $80ish
[21:28:40] <jmkasunich> it's more than I had hoped, but not enough to make me want to spend hours improvising something that probably won't work as well
[21:29:56] <LawrenceG> hmmm... kind of pricey.... I will live with what I have for now
[21:30:10] <jmkasunich> I can understand that ;-/
[21:30:40] <jmkasunich> since they are so pricey, the minimum order really isn't an issue, I'll need two, and that will be enough
[21:31:01] <jmkasunich> I think I'm gonna do the Y ballscrew first though
[21:31:41] <jmkasunich> credit card is kinda smoking this month, let it cool off, then do Y, then a month later do X ;-)
[21:32:20] <LawrenceG> I am building another small table for doing pcb's.... it should be better backlash wise as I can use small antibacklash nuts
[21:32:46] <jmkasunich> gonna plant it on top of the regular table? or is this an entirely separate machine for PCBs?
[21:33:42] <LawrenceG> I am happy you are getting some lathe experience on the shoptask... new machine is a totally new table and spindle.. about 6x8x3" work area
[21:33:59] <LawrenceG> not that much different from the shoptask actually
[21:34:19] <LawrenceG> but finer screws and slower
[21:34:31] <jmkasunich> faster spindle I hope?
[21:34:55] <jmkasunich> I have a plan for doing PCBs, but it requires getting/making a faster spindle first
[21:35:15] <LawrenceG> trim router... still need to build speed controller... should probably do 10k-30k
[21:36:12] <jmkasunich> I have a (probably unwarranted) bias against motors with brushes in them
[21:36:18] <jmkasunich> but maybe I should get over it ;-)
[21:36:19] <LawrenceG> pcb's are working on the shoptask, but because of spindle speed, I can only cut at about 4ipm
[21:37:10] <LawrenceG> and backlash limits my pcb accuracy
[21:38:48] <jmkasunich> I just had a vision of a rube goldberg contraption
[21:39:37] <jmkasunich> a small steel cable or rope, attached to the table and heading off at a 45 degree angle back and toward the tailstock, to a pulley that aims it toward the floor, and 30-50 lbs of barbell weights
[21:39:54] <jmkasunich> the angle means it preloads both X and Y
[21:40:11] <jmkasunich> the pulley would have to be a few feet away, hooked to the wall or something
[21:44:06] <LawrenceG> I like the idea of an air cylinder used as a spring.... regulator sets force... but it needs a relief valve when compressing