#emc-devel | Logs for 2007-05-17

[00:54:39] <jepler> can someone with a real realtime system tell me whether this "continuous capture" mode works for you? The patch enables it all the time ..
[00:54:42] <jepler> static void force_button_clicked(GtkWidget * widget, gpointer * gdata)
[00:54:46] <jepler> {
[00:54:49] <jepler> here's the patch: http://emergent.unpy.net/index.cgi-files/sandbox/scope-cont.patch
[00:54:51] <jepler> ctrl_shm->force_trig = 1;
[00:54:54] <jepler> }
[00:54:56] <jepler> er, oops
[00:55:43] <jepler> it seems to me there's no guarantee that it won't show some "odd" data around the spot where data is actively being captured
[00:56:07] <jepler> (no way to atomically copy that much data without radically changing scope_rt)
[00:56:37] <jepler> HAL file I was using for testing, with halrun: http://emergent.unpy.net/index.cgi-files/sandbox/scope-cont.patch
[01:06:36] <jepler> the way it interacts with a trigger is interesting
[01:18:25] <cradek> does trunk sim/tkemc work for anyone?
[01:19:59] <cradek> or sim/xemc
[01:20:12] <cradek> keystick and axis work
[01:22:02] <cradek> libnml/os_intf/_shm.c 257: No shared memory buffer exists for this key and the IPC_CREAT was not given.
[01:22:48] <jepler> hum I hope that's not my fault
[01:22:55] <jepler> let me test it
[01:23:15] <cradek> thanks
[01:23:28] <jepler> I bet I broke it
[01:23:33] <jepler> it's broken here too
[01:23:55] <jepler> beats me why some UIs work and others don't, though
[01:24:33] <cradek> the nml files should be the same...
[01:27:08] <jepler> indeed
[01:31:51] <jepler> I know what change broke it, but it doesn't make much sense to me
[01:32:20] <cradek> yuck
[01:32:22] <jepler> revert src/libnml/os_intf/_shm.c to rev 1.4
[01:32:30] <jepler> but all that changes is the behavior when shared memory segments are removed!
[01:34:05] <cradek> that does fix it
[01:36:10] <jepler> OK I reverted that change...
[01:37:59] <jepler> grumble, when tkemc exits (with that change reverted), the shared memory segments get cleaned up
[01:38:14] <jepler> when axis exits, segments 3e9, 3ea, 3eb stick around
[01:38:25] <cradek> bizarre!
[01:38:41] <jepler> that's command, status, and error
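The error cradek quoted at 01:22:02 is what a System V shared memory lookup reports when no segment exists for the key and the caller did not ask to create one. The following is a minimal sketch of that behaviour using `shmget` and `shmctl` directly. It is not libnml's `_shm.c`; the helper names `shm_create`, `shm_lookup`, and `shm_remove` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Hypothetical helper: create a new segment for key, failing if one
 * already exists (IPC_EXCL). Returns the segment id, or -1 with errno set. */
static int shm_create(key_t key, size_t size)
{
    assert(size > 0);
    return shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
}

/* Hypothetical helper: look up an existing segment WITHOUT IPC_CREAT.
 * If no segment exists for key, shmget returns -1 and sets errno to
 * ENOENT, which is the condition behind the message in the log. */
static int shm_lookup(key_t key, size_t size)
{
    return shmget(key, size, 0);
}

/* Hypothetical helper: mark the segment for removal. With no process
 * attached, the kernel destroys it immediately. */
static int shm_remove(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);
}
```

A UI that only looks up the segments depends on some other process having created them first, and on nothing having removed them in between. That is consistent with a change to removal behaviour breaking some front ends but not others, although the log does not pin down the exact mechanism.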
[02:13:54] <jepler> aha -- the problem was actually halui, not axis
[02:14:01] <cradek> whee
[02:14:05] <jepler> not sure why halui is turned on for configs/sim/axis but there must be a reason
[02:14:09] <cradek> good find
[02:14:25] <jepler> you should see the wide variety of debugging statements I added before I figured it out :-P
[02:14:30] <cradek> I'm not sure why you think that...
[02:17:37] <jmkasunich> hi guys
[02:18:09] <cradek> hi
[14:41:38] <jepler> ouch -- halcmd crashed when I resized it
[14:42:37] <SWPadnos> ouch
[14:46:18] <jepler> er, halscope I mean
[14:46:29] <jepler> it seems to happen only in roll mode
[14:46:39] <SWPadnos> oh good. I was wondering why a terminal resize would affect halcmd :)
[14:49:59] <jepler> ==31184== More than 100000 total errors detected. I'm not reporting any more.
[14:50:03] <jepler> wow valgrind doesn't seem to be happy with this program
[14:50:09] <jepler> ==31184== Final error counts will be inaccurate. Go fix your program!
[14:51:28] <SWPadnos> hah
[14:51:33] <SWPadnos> that's halscope?
[14:53:56] <jepler> yes
[14:54:07] <SWPadnos> isn't that like 10 errors per line or something?
[14:54:26] <SWPadnos> maybe 20
[14:54:28] <cradek> a loop can cause that
[14:54:50] <SWPadnos> right - it's a dynamic checker, not static code checking - got it
[15:16:31] <cradek> here's an idea, if you're running software 5 releases old, try updating before you report bugs
[15:18:21] <SWPadnos> oh - did Alfred respond to your response?
[15:19:35] <cradek> yes and I still don't understand his other problem
[15:19:42] <cradek> I'll let someone else handle it.
[15:20:26] <jepler> here's the gdb traceback I got on the halcmd crash .. doesn't enlighten me much .. http://pastebin.ca/493002
[15:20:40] <jepler> I just got another one
[15:20:59] <cradek> scope! scope!
[15:21:25] <jepler> argh scope
[15:21:27] <jepler> halscope
[15:21:28] <jepler> halscope
[15:21:33] <jepler> I must type halcmd a heck of a lot more often
[15:22:28] <jepler> second one is also in g_param_spec_override / g_signal_stop_emission / g_signal_emit_valist / g_signal_emit but below that the traces are different
[15:23:41] <cradek> yuck
[15:24:56] <cradek> google knows about this kind of segfault
[15:25:20] <jepler> oh does it
[15:26:00] <jepler> what's it say?
[15:26:19] <cradek> yours are all triggered by resize?
[15:26:26] <cradek> I'm looking for one that has resize involved
[15:26:41] <jepler> no; I think the second one I got was triggered by switching back to the desktop where halcmd was
[15:27:12] <cradek> hm
[15:27:42] <cradek> halscope!
[15:27:46] <SWPadnos> heh
[15:28:14] <cradek> "I can't say the letter B"
[15:28:43] <SWPadnos> perhaps a Sesame Street infusion is necessary?
[15:29:03] <cradek> HAL ....... SCOPE!
[15:29:07] <cradek> HAL ..... SCOPE!
[15:29:16] <cradek> HAL...SCOPE!
[15:29:17] <cradek> HALSCOPE!
[15:29:21] <SWPadnos> halscope - cope!
[15:29:36] <cradek> what was the two-headed monster's name?
[15:29:49] <SWPadnos> I have no idea - I've necer watched sesame street
[15:29:51] <SWPadnos> never
[15:30:41] <SWPadnos> hmmm. have you ever seen an LCD with 1220x800 resolution?
[15:31:47] <jepler> 1280x800, yes
[15:31:49] <cradek> http://en.wikipedia.org/wiki/Image:Sesametwoheadedmonster.jpg
[15:32:13] <cradek> "The Two-Headed Monster sounded out words coming together"
[15:33:32] <jepler> I think someone just typoed 1220 instead of 1280
[15:33:42] <SWPadnos> that could be
[15:34:06] <SWPadnos> ah yes - it's 1280 elsewhere on the page
[15:36:19] <cradek> it appears that his name is "the two-headed monster" (and yes I know nobody cares)
[15:36:51] <SWPadnos> I saw that on the "sesame street characters" page linked from the inage page
[15:36:57] <SWPadnos> err - image
[15:53:46] <jepler> found one bug in halscope
[15:54:35] <cradek> oh, he answered me privately, hmm. guess waiting for someone else to help him isn't going to work
[15:55:37] <jepler> + if(midx != x2)
[15:55:38] <jepler> + points[pn].x = midx; points[pn].y = y2; pn++;
[15:55:41] <jepler> spot the mistake...
[15:56:22] <SWPadnos> {}
[15:56:34] <cradek> ouch
[15:56:51] <SWPadnos> if only those had been commas instead of semicolons
[15:57:15] <jepler> s/commas.*/python/
[15:57:21] <SWPadnos> heh
[15:57:28] <cradek> yeah that's gotta be the right fix
[15:57:31] <cradek> commas, I mean
[15:57:38] <jepler> just like I can't reliably type "halscope" I can't reliably type "C"
[15:57:46] <SWPadnos> heh - not right, but it would have worked
[15:57:46] <cradek> jmk would have a fit
[15:58:39] <SWPadnos> yeah - I wonder what the reaction would be if that second line were changed to:
[15:58:41] <SWPadnos> + points[pn].x = midx, points[pn].y = y2, pn++;
[15:59:15] <SWPadnos> hmmm. I wonder if that's legal. it may need to be pn2= (yada yadda);
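The brace bug in the two quoted patch lines can be reproduced in isolation. The sketch below is not halscope's code: `struct point`, the three `add_step_*` functions, and the final "always add the point at x2" step are assumptions made for illustration, based only on the patch lines and on the later remark that the code adds two points in the != case.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x, y; };

/* As written in the patch: only the .x assignment is guarded by the if.
 * The .y assignment and pn++ run every time, so when midx == x2 a point
 * with a stale .x is emitted. */
static size_t add_step_buggy(struct point *points, size_t pn,
                             int x1, int x2, int y2)
{
    int midx = (x1 + x2) / 2;
    assert(points != NULL);
    if (midx != x2)
        points[pn].x = midx;
    points[pn].y = y2; pn++;
    points[pn].x = x2; points[pn].y = y2; pn++;   /* assumed end point */
    return pn;
}

/* The "add brackets" fix: the midpoint is added only for wide samples. */
static size_t add_step_fixed(struct point *points, size_t pn,
                             int x1, int x2, int y2)
{
    int midx = (x1 + x2) / 2;
    assert(points != NULL);
    if (midx != x2) {
        points[pn].x = midx; points[pn].y = y2; pn++;
    }
    points[pn].x = x2; points[pn].y = y2; pn++;   /* assumed end point */
    return pn;
}

/* The comma variant from the log. It is legal C: the three expressions
 * form one expression statement, so the if guards all of them. */
static size_t add_step_comma(struct point *points, size_t pn,
                             int x1, int x2, int y2)
{
    int midx = (x1 + x2) / 2;
    assert(points != NULL);
    if (midx != x2)
        points[pn].x = midx, points[pn].y = y2, pn++;
    points[pn].x = x2; points[pn].y = y2; pn++;   /* assumed end point */
    return pn;
}
```

For a sample one pixel wide (x1 == x2), the buggy version emits an extra point whose x is whatever was already in the array, while the braced and comma versions emit only the end point. The comma form needs no extra parentheses or assignment because the comma operator binds more loosely than `=`.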
[15:59:27] <jepler> a fit about the commas, or about me replacing halscope with Python code
[15:59:33] <SWPadnos> either
[15:59:37] <cradek> well yeah
[15:59:41] <SWPadnos> commas would be "WTF???"
[15:59:52] <SWPadnos> python would be "now I can't do anything with it"
[16:00:38] <jepler> can someone check in the "add brackets" fix for that file? I have a bunch of other debugging stuff in my local copy that I'm not ready to throw away yet.
[16:00:40] <cradek> he religiously adds {} for single line bodies, which I think is silly, and I decline to say whether he would have avoided this bug due to his religion
[16:00:46] <jepler> I guess I can commit from another copy of the CO, nevermind
[16:01:02] <jepler> I had the same thought
[16:02:42] <cradek> "What should a recent computer science graduate, who is currently job hunting, do to gain the experience almost every potential employer is seeking?"
[16:02:47] <cradek> I know, I know!
[16:03:06] <SWPadnos> learn to kiss ass
[16:03:27] <cradek> no...
[16:03:35] <cradek> learn to program
[16:03:48] <SWPadnos> hmmm. that code change has no effect
[16:04:07] <SWPadnos> err - well, there is one
[16:04:31] <SWPadnos> that could have been an else, since X is the only thing that depends on midx ~= x2
[16:04:40] <SWPadnos> !=, I mean
[16:29:10] <SWPadnos> it looks like this would have the intended effect: points[pn].x = (x1 == x2) ? x1 : (x1+x2)/2;
[16:29:42] <SWPadnos> and leave "points[pn].y = y2; pn++;" out of the decision altogether
[16:30:01] <cradek> I have no idea what the intended effect is, but the current code adds two points for the != case, and you're proposing that it should add one
[16:30:13] <SWPadnos> ah - that would be a difference ;)
[16:30:23] <cradek> so do you think it's still buggy or did you miss that?
[16:30:26] <SWPadnos> silly me - I was thinking that it should add only one
[16:30:29] <SWPadnos> I missed that
[16:30:30] <cradek> (I don't know what the right behavior is)
[16:30:32] <SWPadnos> time for more coffee
[16:31:10] <cradek> I wish it was time for lunch here...
[16:31:19] <cradek> we could go outside and thaw for one thing
[16:31:19] <SWPadnos> soon ...
[16:31:22] <SWPadnos> heh
[16:31:25] <jepler> if the sample is more than one pixel wide, it's supposed to look like _/ (rising) or ~\ (falling). that entails adding two points, one at the midpoint pixel
[16:31:41] <jepler> if the sample is not more than one pixel wide, that's silly
[16:31:42] <SWPadnos> ok - I just missed that
[16:32:11] <SWPadnos> midx can still be eliminated, since x1==x2 should give the same results
[16:32:42] <SWPadnos> midx would tend to round down to x1 (assuming this is a left->right ordering of samples)
[16:33:22] <jepler> I think this is what causes the occasional "horizontal line through the whole sample" artifact
[16:33:44] <cradek> yay
[16:33:48] <SWPadnos> whole sample meaning whole screen?
[16:34:04] <SWPadnos> (scope screen, that is)
[16:34:34] <cradek> yes
[16:35:46] <jepler> I just got another crash in g_param_spec_override though. this time, I was trying to drag the trigger line around
[16:37:05] <jepler> it seems like "roll" mode has really destabilized halscope though I can't understand why
[16:41:51] <lerman_> lerman_ is now known as lerman
[20:10:08] <petev> guys, I managed to capture some startup issues. Some diagnostic info is here:
[20:10:12] <petev> http://www.pastebin.ca/493468
[20:10:12] <petev> http://www.pastebin.ca/493469
[20:10:21] <petev> I'm pretty sure I can re-create it
[20:10:58] <petev> it seems to be caused by homing the A axis, which is probably not configured correctly as it has min/max limits both set to 0
[20:11:36] <cradek> did you update since last night?
[20:11:39] <petev> in the first case, the "exceed software limits: bug" error dialog happened on homing A
[20:11:50] <jepler> please make sure you're completely up to date. chris reported a similar problem recently and I reverted a change that seemed to have caused it
[20:11:53] <petev> I think the last update was some time yesterday
[20:12:22] <jepler> 20:22:02 <cradek> libnml/os_intf/_shm.c 257: No shared memory buffer exists for this key and the IPC_CREAT was not given.
[20:12:36] <petev> in the first case, no moves in MDI mode would be accepted, and complaints about SW limits were issued
[20:12:38] <jepler> luckily it doesn't cost you anything to type "cvs up"
[20:12:47] <petev> I think there are 2 things here
[20:12:54] <cradek> could be
[20:12:56] <petev> something with the SW limits and the NML buffer
[20:13:04] <petev> the second case had no errors
[20:13:11] <cradek> the nml thing is probably fixed
[20:13:16] <petev> but a re-start of emc resulted in the NML problem
[20:13:21] <cradek> I don't see anything else in pastebin though
[20:13:23] <petev> both times the A axis was homed
[20:13:33] <petev> when I don't home A, I never see this
[20:13:44] <petev> I will do an update today and test again
[20:13:46] <cradek> fix this, then retest, then we'll check into the other thing
[20:13:52] <petev> ok
[20:15:17] <cradek> if min = max limit, I don't see how you can expect to home (or move afterward)
[20:16:52] <petev> the A has no home sequence, it's just a GUI thing
[20:16:59] <petev> don't expect it to move at all
[20:17:06] <petev> HOME_SEARCH_VEL is 0
[20:17:28] <petev> the move afterwards was for other axis, not the A axis
[20:17:38] <cradek> why do you have an axis defined that doesn't ever move?
[20:17:40] <petev> but the error on A seemed to cause problems for the other axis
[20:17:49] <petev> I have not configured A yet
[20:18:09] <petev> I'm not sure of the proper config yet, so it just has zeros
[20:18:28] <cradek> don't home it then - that turns on the soft limits, which are set wrong
[20:18:43] <petev> it will never have a home operation, as it has no position switches
[20:18:44] <cradek> or just disable the axis, it's not like it's hard to do
[20:18:59] <petev> I agree, but I think it is showing a bug
[20:19:15] <cradek> can you say again what you think the bug is
[20:19:23] <petev> if the soft limits are 0, how are they getting exceeded if the axis is not moved?
[20:19:33] <cradek> the soft limits are *equal*
[20:19:39] <petev> first I don't think an error should happen on home
[20:19:45] <petev> and it doesn't always
[20:19:46] <cradek> what position of the axis do you think should be accepted if the limits are equal?
[20:19:53] <petev> only 0
[20:20:04] <petev> or the values of the limits
[20:20:13] <petev> 0 in this case
[20:20:23] <petev> and like I said, it doesn't always error on home
[20:20:25] <petev> just sometimes
[20:20:30] <cradek> sorry, I guess I don't care about this
[20:20:34] <petev> and if it's commanded pos for soft limits
[20:20:43] <petev> then I don't understand why it errors sometimes
[20:20:45] <cradek> it's just misconfigured IMO
[20:20:58] <petev> yes, but it's showing a bug
[20:21:32] <petev> misconfigured or not, if the axis is in a valid pos, why do I sometimes see the error?
[20:21:45] <petev> the behavior is not even consistent
[20:22:36] <cradek> if you set the limits to -.001 and +.001, can you home it?
[20:23:13] <petev> it homes sometimes the way it is, in fact, I think it always homes as the axis turns green even when the soft limit error pops up
[20:23:28] <petev> but something in the home seems to trigger the soft limit error sometimes
[20:23:56] <SWPadnos> one encoder count of noise could cause the axis to exceed the soft limits
[20:25:04] <petev> it should be using cmd pos after jmk's changes from what I understand
[20:26:11] <SWPadnos> this may be a bug, but I'm not sure it makes sense to require EMC to do anything with misconfigured axes
[20:26:26] <petev> I never said it should
[20:26:29] <SWPadnos> I suspect the easiesf fix is to set AXES=3 in the ini :)
[20:26:33] <SWPadnos> easiest
[20:26:34] <cradek> the test for a position being allowed is > the min limit and < the max limit
[20:26:43] <petev> I'm just pointing out that there seems to be a bug
[20:26:45] <cradek> that is the right test IMO
[20:27:05] <petev> whatever, if you don't think there is a bug I don't care
[20:27:07] <SWPadnos> well, <= and >= may be acceptable as well
[20:27:14] <petev> the dialog that pops up even says "
[20:27:19] <petev> bug?
[20:27:46] <SWPadnos> yeah "exceed software limits: bug" seems to indicate a bug ;)
[20:28:54] <petev> and the behavior is not consistent either
[20:29:22] <SWPadnos> if the feedback is connected to hardware, then I'm not sure I'd expect consistent behavior
[20:29:40] <SWPadnos> since I'm not sure you can guarantee that there is no change to the inputs to the software (noise issues ...)
[20:29:42] <petev> but feedback should not play any part here
[20:29:50] <petev> it should all be based on cmd pos
[20:29:51] <cradek> if (joint->pos_cmd > joint->max_pos_limit) {
[20:29:51] <cradek> onlimit = 1;
[20:29:51] <cradek> }
[20:29:51] <cradek> if (joint->pos_cmd < joint->min_pos_limit) {
[20:29:51] <cradek> onlimit = 1;
[20:29:53] <cradek> }
[20:30:28] <cradek> if those are equal, and you home, which enables these checks, you shouldn't be surprised that they trigger
[20:30:49] <petev> why is on limit an error?
[20:30:57] <petev> you should be able to move to on limit
[20:31:05] <petev> and why is it not always displayed?
[20:31:18] <cradek> brb
[20:31:19] <petev> those checks should probably be <= >=
[20:31:41] <SWPadnos> I think you have to use the feedback position, since the axis could move due to outside influence
[20:31:52] <petev> no, that's what FE is for
[20:32:10] <petev> using cmd-pos is much cleaner and the right way IMO
[20:32:33] <SWPadnos> when you're checking to see if a move will be acceptable, that's true
[20:32:47] <SWPadnos> if you're checking to see that the machine is within limits, then it needs to be the FB
[20:32:53] <petev> no
[20:33:00] <petev> that causes problems
[20:33:16] <petev> and is totally dependent on what error the machine can hole
[20:33:19] <petev> hold
[20:33:25] <petev> and we already specify that with FE
[20:33:54] <jepler> maybe FE should be subtracted from the limits, so that you can't hit the limit with max position + max error .. and backlash has to play a role here too
[20:34:34] <SWPadnos> I suspect Pete's solution is cleaner - if you're not asking for anything outside the limits, it's not an error
[20:34:34] <petev> I think SW limits should be treated like any other motion, you can go to the cmd pos and if you are within FE, it is not an error
[20:34:51] <petev> otherwise you have all kinds of issues and you don't know where the limits really are
[20:36:33] <petev> the old code used to use pos-fb and it had problems with not being able to move to the limits (some fudge factor) to prevent getting stuck on them
[20:36:53] <petev> and the fudge factor was hard coded and should really be FE based
[20:37:02] <petev> so cmd-pos really seems like the clean solution
[20:37:46] <petev> jepler, most commercial machines have the soft limits set an inch or so from the hard limits
[20:38:04] <petev> so things like backlash and FE shouldn't be an issue
[20:38:27] <petev> it's expected that the machine can be commanded to the soft limit without error, but not beyond
[20:42:07] <jepler> petev: so 1 inch is acceptable, but 1 inch + 1 ulp is not?
[20:42:38] <petev> no, the point is that it's expected that the machine can go exactly to the soft limit
[20:42:50] <petev> and the operator needs to know the exact value of it
[20:43:00] <petev> so it should be the number from the INI file
[20:43:16] <petev> how much dist between the soft/hard limits is up to the builder
[20:43:22] <petev> faster machines would likely have more
[20:45:05] <jepler> fwiw I wasn't able to get this behavior until I set [AXIS_1]MIN_LIMIT=0 MAX_LIMIT=0 HOME_OFFSET=0.1 .. then I got it whenever I tried to home Y
[20:45:31] <jepler> before I set HOME_OFFSET=0 I did not get the "Exceeded soft limit (bug?)" pop up
[20:45:32] <petev> I have HOME_SEARCH_VEL=0 on the A axis
[20:45:40] <petev> and the error is intermittent
[20:46:07] <petev> it was my understanding that HOME_SEARCH_VEL=0 meant the axis does not home
[20:46:12] <petev> is this incorrect?
[20:46:27] <petev> I don't think I have any offset set, but let me check to be sure
[20:47:03] <petev> no, offset is 0.0
[20:47:18] <petev> and HOME is 0.0
[20:48:04] <petev> maybe this is a +/- 0.0 floating point thing?
[20:48:39] <petev> I should try it with the non-zero min/max limits
[20:48:48] <petev> and the same for HOME
[20:49:51] <SWPadnos> I wonder if the soft limits exceeded condition is only checked when the machine is "on"
[20:51:33] <petev> ok, I just got the soft limit bug with the latest code, let's see if it has the restart problem
[20:51:57] <petev> and I do see the DRO bouncing from -0 to +0, but that should be the pos-fb, not pos-cmd
[20:52:27] <jepler> hmm -- if you are in ESTOP RESET, pos-fb is copied to pos-cmd
[20:52:45] <SWPadnos> hmmm. according to the code cradek posted, pos_cmd is used
[20:52:46] <petev> ahh, that may be it with +/- 0 then
[20:52:48] <jepler> then when you go to MACHINE ON again you would exceed the soft limit
[20:52:59] <jepler> since you're not actually at 0, you're at 0 + 1 count or 0 - 1 count
[20:53:04] <SWPadnos> and that's why I was wondering if limits are checked in "machine on" only ...
[20:53:15] <petev> looks like the NML bug is fixed, or at least it didn't happen this time
[20:53:22] <petev> it was pretty consistent before
[20:54:04] <petev> but you have to be "on" to home and I didn't go to machine off after homing
[20:54:26] <petev> it seems the limits are not checked when the axis is not homed, as it should be
[20:55:04] <jepler> disconnect the feedback from axis.3.motor-pos-fb and see if it ever happens then
[20:55:14] <petev> ok
[20:56:47] <jepler> I am *guessing* here, but if the feedback motor position is used to establish the offset when homing, then you would end up with the motor offset being +- 1 count, and thus the new commanded position would also be +-1 count (with opposite sign)
[20:57:25] <petev> could be, so maybe that's where the randomness is coming from
[20:58:19] <jepler> and why I can't manage to see it with a simulator configuration
[20:58:26] <jepler> ooh I get to go home now
[20:58:29] <petev> I think you got it
[20:58:42] <petev> I just homed A about 20 times without pos-fb and no issues
[20:58:48] <SWPadnos> heh
[20:59:03] <SWPadnos> I could swear someone mentioned feedback before ;)
[21:00:42] <petev> and looking at the code snippet cradek posted again, the checks look fine as it's checking for exceeded, not within limits
[21:03:18] <lerman_> lerman_ is now known as lerman
[21:04:24] <petev> so the question here is, how do we handle the case when the axis is at 0.0 and the min limit is 0.0?
[21:04:41] <petev> sounds like getting a -0.0 on homing will cause a problem
[21:05:27] <petev> it doesn't seem right that you can't home to the soft limit value if it happens to be 0.0
[21:09:02] <petev> I guess the same might happen with any small value where there is enough precision in the float to capture one encoder count
[21:14:36] <petev> I just tested again with min=0.0, max=360 and the problem is still there, so I'm pretty sure jepler got it
[22:07:30] <jepler> petev: it's not -0, it's -1 count
[22:07:55] <jepler> so emc sees that the home position is -1 counts of motor position
[22:08:16] <jepler> but it's commanding 0 counts of motor position, so it must be commanding 1 count of joint position
[22:08:38] <jepler> but 1 count of joint position is strictly greater than zero, so it's beyond the soft limit
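The arithmetic jepler describes can be written out as a small sketch. It follows his stated guess, not EMC's actual homing code: the `struct joint` here carries only the fields needed for the illustration, `home_here` is a hypothetical helper, and the relation "joint position = motor position minus motor offset" is an assumption taken from the discussion. Only the two strict comparisons in `on_soft_limit` mirror the snippet cradek quoted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the joint data; not EMC's real structure. */
struct joint {
    double pos_cmd;        /* commanded joint position */
    double motor_offset;   /* assumed: motor position minus joint position */
    double min_pos_limit;
    double max_pos_limit;
};

/* Same strict comparisons as the quoted check: a commanded position
 * exactly equal to a limit does not trip it. */
static bool on_soft_limit(const struct joint *j)
{
    return j->pos_cmd > j->max_pos_limit || j->pos_cmd < j->min_pos_limit;
}

/* Hypothetical homing step: declare the current motor feedback to be the
 * home position, then express the current motor command in joint units. */
static void home_here(struct joint *j, double motor_pos_fb,
                      double motor_pos_cmd, double home)
{
    assert(j != NULL);
    j->motor_offset = motor_pos_fb - home;
    j->pos_cmd = motor_pos_cmd - j->motor_offset;
}
```

With both limits at 0 and the feedback reading one count below the commanded motor position at the moment of homing, the commanded joint position comes out one count above 0 and the check trips. With the feedback exactly equal to the command it does not, which would account for the intermittent error on real hardware and its absence in a simulator configuration.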