#emc-devel | Logs for 2006-11-29

[03:06:19] <jmk-away> jmk-away is now known as jmkasunich
[04:37:59] <SWPadnos_> SWPadnos_ is now known as SWPadnos
[06:06:25] <jmk-vm04> jmkasunich@localhost:~/emc2head/src$ make
[06:06:26] <jmk-vm04> Depending libnml/rcs/rcs_print.cc
[06:06:26] <jmk-vm04> cc1plus: error: unrecognized option `-funit-at-a-time'
[06:06:26] <jmk-vm04> make: *** [depends/libnml/rcs/rcs_print.d] Error 1
[06:07:17] <jmk-vm04> this is on BDI-4.51, after an apparently successful run of ./configure --enable-run-in-place --with-realtime=/usr/realtime-2.6.16.20-rtai --with-pytho=/usr/bin/python2.4
[06:07:38] <jmk-vm04> it's late, will investigate later
[13:20:11] <jepler> ./src/Makefile:OPT := -O2 $(call cc-option, -funit-at-a-time) $(call cc-option,-fno-strict-aliasing)
[13:20:40] <jepler> jmkasunich: this code is supposed to find out whether the compiler supports -funit-at-a-time, and enable it if it is supported
[13:21:23] <jepler> what values did you get for CC and CXX? Are they different versions of the C compiler?
[13:25:05] <jepler> er, "different versions of GCC" I suppose I should say
[14:55:36] <jepler> Ed Nisley has posted a halscope screenshot
[15:00:46] <cradek> interesting
[15:12:55] <jepler> here's mine: http://emergent.unpy.net/files/sandbox/halscope-stepgen-bug.png
[15:18:27] <skunkworks> jepler: (stating the obvious) you were able to repeat it?
[15:21:48] <jepler> skunkworks: yes
[15:22:49] <jepler> I used the 2.0.4 stepper/inch.ini and got that trace from my first jog towards -X
[20:32:27] <skunkworks> jepler: in your picture - are the top two traces step gen dir and step? or is it at the printer port pins?
[20:33:34] <jepler> skunkworks: it's the signals connected to the stepgen component
[20:38:49] <skunkworks> I looked at the stepgen.0.dir and could not get it to do it. stepperinch.ini using axis as the front end - in both 2.0.4 and head as of yesterday (or the day before)
[20:39:17] <skunkworks> (jogging in the -x direction.)
[20:54:37] <jepler> for me it only happened on the very first jog
[20:57:10] <jepler> tkemc with the jog speed set to 1 inch per minute
[20:59:44] <cradek> jmk is very gracious
[21:08:46] <skunkworks> I got it to do it on 2.0.4 - but so far only at 1 ipm - odd
[21:10:11] <jepler> what behavior did you see? I saw a single step in the wrong direction.
[21:11:34] <skunkworks> http://www.electronicsam.com/images/KandT/oops.png
[21:11:47] <skunkworks> direction drops for a second and it seems like there is an extra step
[21:12:07] <skunkworks> 'second' = really short time period
[21:14:34] <skunkworks> I also cannot see it happen except after the first start up. Although I didn't jog too long
[21:15:06] <jepler> that looks like my graph -- but in my png, look at the bottom two traces
[21:15:14] <jepler> the red is the commanded position and the green is feedback
[21:15:27] <jepler> they do stay "together", there's just that weird 1-step reversal
[21:15:40] <skunkworks> ok - I was just graphing the output of stepgen and the printer port pin.
[21:15:45] <skunkworks> pins
[21:15:46] <jepler> those will always be the same
[21:15:51] <jepler> because they are connected to the same signal
[21:16:13] <skunkworks> I know - I was just making sure and playing with halscope. which I will say again - damn cool
[21:16:39] <jepler> you were trying HEAD at first and never saw it happen there?
[21:17:29] <skunkworks> I can try it again - like I said.. I only see it with the jog set at 1ipm. 10 doesn't do it, neither do lower speeds.. so far anyways
[21:17:42] <skunkworks> let me see
[21:19:00] <skunkworks> is there a way to kill whatever is running? I can't seem to go between head and 2.0.4 without a reboot
[21:20:07] <jepler> what error do you get?
[21:20:24] <rayh> One of em isn't dragging the kernel modules out.
[21:20:31] <skunkworks> n/m - my stupidity
[21:20:41] <jepler> what error did you get?
[21:21:12] <skunkworks> I had picked the 2.0.4 config by mistake.
[21:21:21] <jepler> I ask because we recently had a user reporting that he couldn't compile HEAD on his system with 2.0.4 -- he got some error every time he started HEAD. "shared memory mismatch"
[21:21:29] <jepler> er, he could compile it, but couldn't use it
[21:21:32] <jepler> if you got that message I wanted to know how you fixed it
[21:21:41] <cradek> rtapi version mismatch iirc
[21:27:23] <skunkworks> it does it in head also
[21:28:08] <jepler> I still haven't seen it in HEAD
[21:28:14] <jepler> but I believe you
[21:32:55] <skunkworks> I did a g1x-.01f1 and got it also - in a different spot
[21:34:17] <cradek> it's good to know you get it with coordinated motion too
[21:34:25] <cradek> sure points to stepgen
[21:35:39] <skunkworks> http://www.electronicsam.com/images/KandT/headjog.png
[21:36:06] <skunkworks> http://www.electronicsam.com/images/KandT/headcom.png
[21:36:08] <jepler> yep, I got it that way in HEAD too
[21:36:29] <skunkworks> top one is head jogged and the bottom one is the mdi commanded motion
[21:36:42] <jepler> single shot mode, with g1x-.01f1 as my MDI command
[21:37:02] <jepler> only the first time
[21:37:16] <skunkworks> right. so far.
[21:37:35] <cradek> seems really low speed makes it worse
[21:38:23] <skunkworks> like I said before - I have only seen it at 1ipm so far.
[21:38:39] <skunkworks> have not seen it slower or faster yet
[21:39:26] <jepler> this glitch, combined with a drive that needs setup or hold time on direction, *could* lead to loss of position
[21:39:54] <skunkworks> that is what I was thinking - the change in dir isn't when the step is - normally
[21:39:55] <cradek> could cause a stall?
[21:40:19] <cradek> usually when you reverse you have lots of time, but not with this bug
[21:41:07] <skunkworks> I don't remember what version I had when I cut that circuit board. probably 2.0.3. no issues on my machine. that was a pretty intense file.
[21:44:16] <cradek> I have run a lot of long files too without problem, but I agree with jepler that this could be a serious bug on some setups
[21:48:12] <jepler> the "dirsetup" and other parameters -- those are in periods, not nS or uS, right?
[21:48:46] <jepler> the documentation has them as floats and doesn't specifically say what units they are .. the source code shows them as u32s
[21:50:01] <cradek> I'd have to look at the source too
[21:50:16] <cradek> they were periods in emc1, bet they still are
[21:51:06] <jepler> (FLOAT) stepgen.<chan>.steplen - Length of a step pulse (step type 0 only).
[21:51:12] <jepler> I should ought to fix the docs then
[21:51:46] <cradek> UTSL first
[21:52:01] <cradek> don't just believe me
[21:52:17] <jepler> no, I read the source as "periods"
[21:52:28] <cradek> ah
[21:53:28] <jepler> which, let me just say, sucks
[21:55:22] <cradek> the alternative is to not give the user what he asks for unless he's very lucky
[21:55:49] <cradek> I guess you would ceil to periods
[21:55:56] <jepler> yes
[21:58:24] <cradek> fix for 2.2 I guess
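A minimal sketch of the "ceil to periods" idea discussed above, assuming the user supplies a time in nanoseconds and the thread period is also known in nanoseconds. The helper name ns_to_periods and its signature are hypothetical and are not taken from the stepgen source; the point is only that rounding up guarantees the user gets at least the time requested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the stepgen source): round a time in
 * nanoseconds UP to a whole number of thread periods, so the resulting
 * delay is never shorter than what the user asked for. */
static uint32_t ns_to_periods(uint32_t ns, uint32_t period_ns)
{
    assert(period_ns > 0);
    /* Ceiling division, written so it cannot overflow for large ns. */
    return ns / period_ns + (ns % period_ns != 0);
}
```

With a 50000 ns period, a request of 1 ns or 50000 ns becomes 1 period and 50001 ns becomes 2 periods, which is why the delivered time only matches the request exactly when the user is "very lucky".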
[21:59:59] <jepler> when I use setup and hold times, they seem to be respected .. as nearly as I can tell with halscope, anyway
[22:04:11] <skunkworks> for what it is worth - increasing the stepgens vel and acc headroom doesn't fix the problem :)
[22:08:36] <skunkworks> although - lowering the period to 20us makes it go away for 1ipm
[22:09:35] <skunkworks> so I thought going from 50 to 20 is 2.5 increase (don't know if that is right) so I tried a jog at 2.5ipm - still no reversal
[22:12:06] <skunkworks> could there be a problem with resolution of 50us and the default acceleration / vel of the stepperinch.ini? So that the step gen gets ahead of itself and says 'crap - I need to slow down' and it does so in one period.
[22:12:12] <skunkworks> * skunkworks is talking out of his ass again
[22:22:15] <skunkworks> ew - never mind - I got it to do it.
[22:26:27] <skunkworks> http://www.electronicsam.com/images/KandT/20usx-01f10.png
[22:34:54] <skunkworks> bbl
[23:38:09] <jmkasunich> hi guys
[23:38:16] <jmkasunich> only here for a few minutes
[23:41:04] <SWPadnos> hi
[23:41:16] <jmkasunich> looks like we have two interesting issues going on
[23:41:30] <SWPadnos> at least one, for sure
[23:41:32] <jmkasunich> one that has been around forever I think, and causes a maximum error of 1 step
[23:41:50] <jmkasunich> and another that may be unique to one user and might not even be software
[23:42:18] <SWPadnos> I'd be surprised if it isn't software, considering that the direction issues have shown up in halscope traces for skunkworks and jepler
[23:42:29] <jmkasunich> that's the first one
[23:42:42] <jmkasunich> been around forever, and causes only 1 count of error
[23:42:40] <SWPadnos> oh - you're talking about the overall drift
[23:42:59] <jmkasunich> the "goes a half a turn in the wrong direction" problem is the one that's gonna be hard to reproduce
[23:43:24] <SWPadnos> yeah, unless he's got an FError of +/- 1" or something (which would explain that reasonably well)
[23:44:00] <jmkasunich> even a wide FE wouldn't explain it
[23:44:17] <jmkasunich> if the error starts to get large, the control loop is gonna be trying to correct it
[23:44:23] <SWPadnos> the TP wouldn't try to correct if FE limit is high
[23:44:29] <jmkasunich> it won't just keep getting bigger at the same rate
[23:44:31] <SWPadnos> err - DEADBAND
[23:44:32] <jmkasunich> not the TP
[23:44:46] <jmkasunich> there is a position loop inside stepgen
[23:44:57] <SWPadnos> yes - that's in the update_freqs function, right?
[23:45:02] <jmkasunich> yes
[23:46:12] <jmkasunich> if the fb accurately represents the steps out of the stepgen module, then you'd get a runaway, or a recovery, but _not_ a uniform move for half a turn
[23:46:51] <jmkasunich> the only way to get uniform behavior is if the fb says "we're doing exactly what we're supposed to be doing" and the direction inversion is happening after that
[23:46:53] <SWPadnos> but that loop is in the slower thread, so you'd possibly get a full SERVO_PERIOD of wrong output
[23:47:11] <jmkasunich> a half turn is _many_ servo periods of wrong output
[23:47:34] <SWPadnos> true - should be
[23:47:39] <jmkasunich> in fact, it's impossible for direction to toggle more than once per servo period
[23:47:50] <jmkasunich> update-freq generates a signed frequency command
[23:47:58] <SWPadnos> actually, the make_pulses code may be able to toggle more than once
[23:48:06] <jmkasunich> make-pulses obeys that command, and uses its sign for the dir bit
[23:48:15] <SWPadnos> I'm still looking at it critically, so I'm not sure of that yet
[23:48:18] <jmkasunich> how?
[23:49:05] <jmkasunich> freq is used to compute addval, (which is proportional to frequency, no reciprocals or anything)
[23:49:12] <SWPadnos> I haven't looked at all the code that controls the dir pin, so I'm not sure yet - I've probably just missed the mechanism by which it would be prevented
[23:49:17] <jmkasunich> and addval is ramped if needed in makepulses
[23:49:17] <jmkasunich> bit 31 of the ramped addval is used to drive dir
[23:49:45] <SWPadnos> ramping addval could cause a problem in some cases (possibly)
[23:50:06] <jmkasunich> since addval can't cross zero twice in a servo period, you can't have two dir changes in a servo period
[23:50:10] <jmkasunich> how would ramping hurt?
[23:50:30] <SWPadnos> dunno yet - in this conversation, you're the expert and I'm just figuring it out :)
[23:50:35] <jmkasunich> heh
[23:50:42] <SWPadnos> I've donned my stupid hat, and am taking nothing for granted so far
[23:50:55] <jmkasunich> sometimes the guy who isn't the "expert" is the one who sees the problem the expert has overlooked
[23:51:10] <SWPadnos> yep - hence the stupid hat I keep in my wardrobe ;)
[23:51:35] <jmkasunich> my plan of attack for the first problem (the one step wrong problem) will be to add a couple params so I can see key internal vars
[23:51:39] <jmkasunich> like the ramped addval
[23:52:13] <SWPadnos> actually, addval is computed by update_freq, then may be changed by makepulses based on a ramp, no?
[23:52:21] <jmkasunich> the other problem I doubt I'll be able to duplicate - so I'll have to work with the guy who's having it happen
[23:52:36] <jmkasunich> there are two I think
[23:52:41] <jmkasunich> lemme open the source file
[23:53:16] <SWPadnos> don't worry about it if you're about to head out the door - I'm sure I'll have more intelligent (or at least less stupid) comments later
[23:53:49] <jmkasunich> struct field newaddval is the one that update-freq should be setting, and addval is the ramped one
[23:53:59] <SWPadnos> ok
[23:54:16] <SWPadnos> and the ramp value is calculated in update_freq?
[23:54:36] <jmkasunich> deltalim you mean?
[23:54:40] <jmkasunich> yes
[23:54:49] <SWPadnos> err - could be ;)
[23:55:01] <jmkasunich> note - the ramping in make-pulses is very rudimentary
[23:55:06] <SWPadnos> I'm not looking at it right now - IRC and EMC are on separate computers at the moment :)
[23:55:35] <jmkasunich> and should be irrelevant to the situation jeff saw
[23:56:16] <jmkasunich> newaddval steps from one servo period to the next, but those steps should never exceed the max accel rate
[23:56:30] <jmkasunich> the ramping in make pulses just interpolates the servo rate stesls
[23:56:31] <jmkasunich> steps
[23:56:33] <jmkasunich> gotta go
[23:56:41] <SWPadnos> see you
[23:56:44] <SWPadnos> thanks
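
The argument jmkasunich makes above (addval is ramped toward newaddval, bit 31 of the ramped value drives dir, so dir cannot change twice within one servo period) can be sketched as follows. This is an illustration under stated assumptions, not the actual stepgen code: the field names addval, newaddval and deltalim come from the log, but the struct layout, the function names and the exact ramp rule are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: field names follow the log, the layout is assumed. */
typedef struct {
    int32_t addval;    /* ramped value used by the fast thread */
    int32_t newaddval; /* target, set once per servo period */
    int32_t deltalim;  /* max change of addval per fast period */
} stepgen_sketch_t;

/* One fast-thread period: move addval toward newaddval by at most deltalim. */
static void ramp_addval(stepgen_sketch_t *s)
{
    assert(s->deltalim >= 0);
    int64_t diff = (int64_t)s->newaddval - s->addval;
    if (diff > s->deltalim)
        s->addval += s->deltalim;
    else if (diff < -(int64_t)s->deltalim)
        s->addval -= s->deltalim;
    else
        s->addval = s->newaddval;
}

/* Direction is taken from the sign bit (bit 31) of the ramped value. */
static int dir_bit(const stepgen_sketch_t *s)
{
    return (int)(((uint32_t)s->addval >> 31) & 1u);
}

/* Count dir changes over n fast periods while newaddval is held fixed,
 * i.e. within a single servo period. */
static int count_dir_changes(stepgen_sketch_t *s, int n)
{
    int changes = 0;
    int prev = dir_bit(s);
    for (int i = 0; i < n; i++) {
        ramp_addval(s);
        int d = dir_bit(s);
        if (d != prev)
            changes++;
        prev = d;
    }
    return changes;
}
```

Because the ramp moves monotonically toward a target that stays fixed for the whole servo period, the sign of addval can flip at most once in that period. In this simplified model a one-step reversal therefore cannot come from the ramp alone, which is consistent with the plan to expose the ramped addval as a parameter and watch it in halscope.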