#emc-devel | Logs for 2010-06-25

[00:08:07] <dgarr> cradek: is this right? 318fbd9a line 179 x += currentToolOffset.tran.z;
[00:08:30] <andypugh> Just quickly, do you reckon that this: http://www.pastebin.org/357227 will be effective at spotting if my kinematics module is sending back NaN?
[00:08:39] <cradek> dgarr: where?
[00:08:53] <dgarr> line 179 emc/task/emccanon.cc
[00:09:36] <cradek> certainly not, thank you
[00:10:20] <dgarr> it caused a gcode program of mine to hang i think
[00:12:13] <andypugh> mozmck: Oh :-( I think Visteurs is using your packages and he gets the problems too.
[00:12:28] <SWPadnos> andypugh, how about isnan()?
[00:12:36] <andypugh> Not there
[00:12:42] <SWPadnos> oh right, RT
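
A minimal sketch of a NaN test that needs no library call, for contexts like the realtime one above where isnan() is not available. It relies on the IEEE 754 rule that a NaN compares unequal to itself. The helper names is_nan_double and any_nan are hypothetical and are not part of EMC, and the comparison can be optimised away if the code is built with options such as -ffast-math.

```c
#include <assert.h>

/* Hypothetical helper, not an EMC function: under IEEE 754 a NaN is the
 * only double value that compares unequal to itself. */
static int is_nan_double(double x)
{
    volatile double v = x;  /* volatile keeps the compiler from folding the comparison */
    return v != v;
}

/* Hypothetical check over an array of joint or axis values.
 * Returns 1 if any element is NaN, 0 otherwise. */
static int any_nan(const double *values, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        if (is_nan_double(values[i]))
            return 1;
    }
    return 0;
}
```

A kinematics function could call any_nan() on its output values and print a message only when it returns 1, which avoids printing on every cycle.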
[00:12:57] <CIA-2> EMC: 03cradek 07master * rab53d7de05de 10/src/emc/task/emccanon.cc: fix stupid cut-n-paste error with tool offset
[00:13:37] <cradek> thanks for finding that!
[00:15:05] <dgarr> for consideration: http://www.panix.com/~dgarrett/stuff/0001-halshow-add-menu-options-for-load-save-exit.patch
[00:21:40] <andypugh> I am printing out the joint positions and (I think) checking the axis positions for NaN when they come back to control.c from my kinematics module. Here is a sample of output. http://www.pastebin.org/357234
[00:22:31] <andypugh> (I only get to run inverse kins as a one-shot, but forward kins runs all the time, so actually printing that is a non-starter.)
[02:23:03] <jepler> cdecl> declare p as pointer to array 16 of struct hal_data_u
[02:23:03] <jepler> struct hal_data_u (*p)[16]
[02:50:17] <CIA-2> EMC: 03seb 07master * r1f39204a0722 10/src/hal/drivers/mesa-hostmot2/stepgen.c: fix a stepgen bug
[02:52:03] <CIA-2> EMC: 03seb 07v2.4_branch * r02c2e884b5ea 10/src/hal/drivers/mesa-hostmot2/stepgen.c: fix a stepgen bug
[14:28:03] <Dave911> seb_kuzminsky: Did the mesa stepgen bug fix have anything to do with a following error fault while jogging?
[14:34:57] <seb_kuzminsky> Dave911: no i dont think it would cause that problem
[14:35:11] <seb_kuzminsky> i made 2 hm2 stepgen bugfixes yesterday
[14:35:32] <seb_kuzminsky> one would prevent the stepgen from moving at all, in some very rare configurations
[14:36:05] <seb_kuzminsky> the other sometimes caused *very* slow creep when it should have been stopped after a move in the negative direction
[14:36:24] <seb_kuzminsky> neither one should have caused occasional following errors on an otherwise working machine
[14:43:19] <pcw_home> Hi seb
[14:43:34] <cradek> he stepped away
[14:46:32] <pcw_home> ha ha
[14:46:34] <pcw_home> I just wondered if anyone looked into that weird HM2 stepgen bug with base thread that Chris Morley reported
[14:49:01] <seb_kuzminsky> hi peter
[14:49:45] <seb_kuzminsky> is that bugreport recorded anywhere? was it on irc or on the mailing list?
[14:50:15] <pcw_home> Just on developers list
[14:50:21] <seb_kuzminsky> ok i
[14:50:25] <seb_kuzminsky> i'll look
[14:52:48] <jepler> seb_kuzminsky: http://mid.gmane.org/86425787-A31E-44E7-8E21-F51045AB882B@bgp.nu
[14:53:56] <pcw_home> I think pncconf leaves an "empty" basethread on step motor configs, the sample HM2 step configs run fine
[14:53:57] <pcw_home> but the base thread causes trouble somehow
[14:54:56] <seb_kuzminsky> hmm, cmorley's description sounds a bit like the casting/underflow bug i fixed last night
[14:57:05] <pcw_home> The other manifestation is random following errors in the middle of a move (looks like feedback is lost momentarily)
[14:58:25] <seb_kuzminsky> pcw_home: and this bug is only with a base thread, right? never without a base thread?
[14:58:59] <cradek> do we have any halscope plots that show this happening?
[14:59:13] <seb_kuzminsky> do we have a config that has the problem?
[15:05:10] <pcw_home> I think any pncconf generated step config will exhibit the bug,
[15:05:12] <pcw_home> I'll ask the customer with troubles to send the failing config
[15:07:32] <seb_kuzminsky> there's config files & halscope screenshots in the thread jeff linked: http://thread.gmane.org/gmane.linux.distributions.emc.devel/2607/focus=3233
[15:12:33] <seb_kuzminsky> bummer, all the pastebin files have expired and are gone... :-(
[15:12:41] <cradek> argh
[15:19:51] <pcw_home> dave911: you might try removing the basethread, seems to fix the problem for everyone so far...
[17:17:42] <andypugh> Fascinating.
[17:19:09] <andypugh> As stepgen.0 was drifting (-fb increasing steadily for no command) for no reason I have been able to work out, I moved that axis to stepgen.6. Now stepgen.1 is drifting....
[17:19:39] <andypugh> Is that diagnostic in any way?
[17:36:34] <chester88> Hey Seb it seems if a Base thread is loaded ( even if not used ) it will affect the step generators.
[17:37:33] <chester88> Pncconf always adds a base thread so any config made with pncconf will show this error.
[17:38:04] <chester88> though I'm sure if you add a base thread to any hostmot2 stepper config you will get the same problem.
[17:38:36] <chester88> it gets following errors very quickly, mostly in one direction.
[17:38:45] <andypugh> Which stepgens? Hostmot2 or HAL?
[17:38:58] <chester88> hostmot2
[17:40:29] <andypugh> In case there is any confusion, the stepgens I was referring to above are HAL ones.
[17:42:25] <chester88> When you say HAL you actually mean the software steppers. Both steppers use HAL of course :)
[17:42:40] <andypugh> Yeah, well, yeah.
[17:43:53] <JT-Work> hi chester88
[17:44:15] <chester88> Hey John How are you?
[17:44:34] <JT-Work> good, nice and hot here in the 90's
[17:44:54] <chester88> wow! sunny here but not that hot!
[17:44:58] <JT-Work> I have almost everything done on the Hardinge :)
[17:45:07] <chester88> sounds like beer drink weather
[17:45:19] <JT-Work> too hot for beer
[17:45:21] <chester88> do you have more video?
[17:45:41] <JT-Work> did you see the one cutting the black top hat looking washer?
[17:45:47] <chester88> no
[17:46:16] <JT-Work> http://www.youtube.com/watch?v=isTD6bDF_LI
[17:47:28] <JT-Work> that one is cutting some VHMW seal washers for the Hardinge screws that hold the cover plates on
[17:48:34] <chester88> very cool!
[17:49:12] <JT-Work> yea, I got the coolant tank and screens cleaned out finally and put fresh coolant in last night
[17:49:28] <chester88> hey did you ever try the mux16 without using the suppress-no-input ?
[17:49:44] <chester88> i'm curious how jumpy the feed changes would have been
[17:50:47] <JT-Work> yea, it kinda bounces around a bit between positions
[17:50:51] <chester88> yes coolant can be really smelly. Do you use soluble oil or synthetic ?
[17:51:20] <JT-Work> my last thought is we need to make a gray code comp and wire it up for gray code output instead of binary
[17:51:24] <chester88> ok good then suppress-no-input was worthwhile
[17:51:37] <chester88> ya why gray code?
[17:51:44] <JT-Work> semi-synthetic soluble ValCool VPTech
[17:52:08] <JT-Work> well there would be no in between areas with gray code
[17:52:17] <JT-Work> only one bit at a time changes
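
A minimal sketch of the Gray code property described above: consecutive values differ in exactly one bit, so a switch moving between adjacent positions never passes through an unrelated code. This is plain C for illustration and is not an existing HAL component. The function names are hypothetical.

```c
#include <assert.h>

/* Binary to reflected Gray code. */
static unsigned bin_to_gray(unsigned n)
{
    return n ^ (n >> 1);
}

/* Gray code back to binary: XOR the input with each of its right shifts. */
static unsigned gray_to_bin(unsigned g)
{
    unsigned n = g;
    while (g >>= 1)
        n ^= g;
    return n;
}

/* Count set bits; used to confirm the one-bit-per-step property. */
static int count_bits(unsigned v)
{
    int n = 0;
    while (v) {
        n += (int)(v & 1u);
        v >>= 1;
    }
    return n;
}

/* Check all 16 positions of a 4-bit selector. */
static void gray_self_check(void)
{
    unsigned i;
    for (i = 0; i < 16; i++) {
        assert(gray_to_bin(bin_to_gray(i)) == i);
        if (i > 0)
            assert(count_bits(bin_to_gray(i) ^ bin_to_gray(i - 1)) == 1);
    }
}
```

A component built on this idea would decode the switch inputs with gray_to_bin() and feed the result to the selector, instead of reading the inputs as plain binary.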
[17:53:01] <andypugh> Argh! I don't want to jump to conclusions, but I think I might have found my Kinematics problems.
[17:53:06] <chester88> well the inbetween areas would come from the switch ? or would the comp ignore changes till they are valid?
[17:53:30] <JT-Work> the comp would only act on valid inputs
[17:53:50] <JT-Work> scroll down a bit on this page http://en.wikipedia.org/wiki/Gray_code
[17:56:15] <chester88> Ahh i see. so the feed rates bounce a bit even with suppress-no-input
[17:56:50] <JT-Work> I think the switch was originally wired up for gray code and that is why it didn't make sense to me at the time
[17:56:53] <JT-Work> yea
[17:58:07] <chester88> well that should be doable. surprised some one hasn't made a comp yet
[17:58:26] <JT-Work> I thought about it for a while does that count?
[17:58:59] <JT-Work> I would even wire that monster switch again to try one out
[18:00:20] <andypugh> WooHoo! Fixed!
[18:00:29] <JT-Work> what was it?
[18:00:47] <andypugh> Here's a clue: pos->c = (180/M_PI) * atan2(sin(joints[4]), sin(joints[5])/cos(joints[5]));
[18:01:08] <andypugh> That works :-)
[18:01:33] <andypugh> There is no "tan" in rtai_math....
[18:01:53] <chester88> You should write a wiki page about writing kinematics !
[18:02:08] <andypugh> The compiler was pulling one out of math.h which isn't RT-safe (and was probably expecting a different number format)
[18:02:12] <JT-Work> http://www.plcdev.com/using_ladder_logic_for_gray_code_conversion
[18:02:45] <JT-Work> that must have been tough to sort out :/
[18:03:31] <andypugh> The kins? Up to 2am every night for a week staring at 100 lines of code type of tough.
[18:04:01] <andypugh> And it isn't even for a machine I will ever see, or anyone I even know.
[18:04:03] <JT-Work> I can barely stay up to 11pm LOL
[18:04:05] <micges> andypugh: seriously you should write a wiki page
[18:04:15] <andypugh> Yeah, I might.
[18:04:48] <chester88> would help the next poor sucker ! feels good to win though doesn't it.
[18:06:21] <chester88> PCW : any theory why base thread would affect stepgens in hostmot2?
[18:09:14] <PCW> No other than some bug in the driver that causes basethread interrupting servo thread to break some variable in HM2 stepgen
[18:10:37] <PCW> Sebastian could not duplicate it
[18:12:11] <PCW> Did you have the error on more than one card or only the 5I20?
[18:12:18] <micges> PCW: between calculations and sending result to mesa?
[18:12:20] <JT-Work> chester88: I found this http://pastebin.ca/1889339 here http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f20-2.pdf
[18:14:05] <PCW> micges yes something like that but its all random speculation, very odd that a basethread that does nothing should cause breakage
[18:14:18] <chester88> I have only tested the 5i20 for this bug. The fact the base period had an effect was pointed out to me by a user.
[18:15:00] <chester88> but it definitely did make a difference. I will have to do more testing.
[18:15:00] <PCW> micges: do you use a basethread in your stepper configs?
[18:15:22] <micges> PCW: luckily (sadly) no
[18:15:57] <PCW> Yes, Ricks problem went away when he deleted the basethread
[18:16:11] <chester88> micges: can you add it and try ?
[18:16:27] <micges> can you give me a link to bug report? I
[18:16:39] <micges> 'm not fully follow
[18:16:51] <chester88> there is no bug report, just the mailing list ..hold on
[18:19:20] <chester88> starts here: http://news.gmane.org/gmane.linux.distributions.emc.devel
[18:20:39] <micges> thanks, I can test it on monday if it won't be solved until then
[18:22:41] <PCW> andypugh: do you have a 400K 7I43? (finally getting around to making the SVSS4_8 bitfile)
[18:23:12] <andypugh> I do, you were out of stock of what I ordered :-)
[18:24:13] <PCW> Probably a good thing, not sure the config will fit in 200K
[18:24:52] <chester88> JT-Work: I will check those gray-code links out .
[18:25:20] <chester88> gotta go . take it easy guys.
[19:36:23] <seb_kuzminsky> i just spoke with tom, he said removing the base thread did not fix the occasional hm2 stepgen ferror
[19:41:00] <andypugh> Maybe he has a real f-error problem?
[19:41:23] <andypugh> Though that seems unlikely with a Mesa card stepgen
[19:42:25] <andypugh> I have been seeing this with a p-port stepper system, where the axis max speed was way in advance of stepgen-scale x base-thread-period
[19:49:43] <seb_kuzminsky> tom's using steppers in position control mode, with encoders for position feedback to emc
[19:50:03] <seb_kuzminsky> so the hm2 stepgen creep bug i fixed last night could explain his problem
[19:50:28] <seb_kuzminsky> he's going to test 2.4.1-53 from the buildbot, which has the fix, and report back to us
[20:01:32] <andypugh> SWPadnos: Did you say that hal_float_t is a different level of precision to double? ie it is encoded in a different way in memory?
[20:04:53] <dgarr> ./hal/hal.h:#define hal_float_t volatile real_t
[20:04:53] <dgarr> ./hal/hal.h:typedef double real_t __attribute__((aligned(8)));
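
A minimal sketch based on the two hal.h lines quoted above: with those definitions, hal_float_t is a double with a volatile qualifier and 8-byte alignment, so it has the same size and value encoding as a plain double. The two definitions are copied from the chat. The helper round_trips_as_double is hypothetical, and the aligned attribute is a GCC/Clang extension.

```c
#include <assert.h>

/* Definitions as quoted from hal.h in the chat above. */
typedef double real_t __attribute__((aligned(8)));
#define hal_float_t volatile real_t

/* Hypothetical helper: store a double in a hal_float_t and read it back.
 * Returns 1 if the size matches double and the value is unchanged. */
static int round_trips_as_double(double value)
{
    hal_float_t h = value;
    double back = h;
    return sizeof(hal_float_t) == sizeof(double) && back == value;
}
```

The volatile qualifier changes how the compiler treats reads and writes, not how the number is laid out in memory.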
[20:32:07] <andypugh> I am curious now why a) The compiler was finding tan from math.h but gave an error when I tried to use arctan (math.h was _not_ in the includes) and b) Why using a non-rtapi trig function caused such odd bugs. It wasn't wrong answers, it was out-of-function memory writes and that sort of thing.
[21:05:15] <seb_kuzminsky> matt and i just moved his 6130 lathe with a 3x20 :-)
[21:05:55] <cradek> neato
[21:06:28] <andypugh> Hi cradek. It was my kins code after all.
[21:06:40] <cradek> yay!
[21:06:48] <andypugh> I was using "tan"
[21:07:36] <cradek> why is that bad?
[21:08:14] <andypugh> It isn't in rtapi_math.h
[21:08:22] <andypugh> So it uses the one from math.h
[21:08:31] <cradek> wow yuck
[21:08:34] <andypugh> (Not sure how it finds that, it wasn't an include)
[21:08:56] <cradek> that would be ok in sim... explains it.
[21:09:34] <andypugh> It doesn't quite explain why it is quite so very horribly and randomly bad.
[21:10:58] <andypugh> Is there a way to prevent people using "tan" in future?
[21:11:10] <cradek> I don't know
[21:11:41] <andypugh> I got a warning about atan not being in the library, but it silently found tan from math.h without a #include
[21:12:23] <andypugh> And writing to the wrong bits of memory is a lot worse than just getting the wrong answer back.
[21:12:31] <cradek> you should have had a warn,ing if no prototype
[21:12:43] <cradek> sorry, phone kb
[21:12:44] <andypugh> Try it :-)