#emc-devel | Logs for 2006-10-06

[00:02:06] <jmkasunich> I think that means I got lucky
[00:02:33] <jmkasunich> does "64 bit machine" means sizeof(int) = 8?
[00:03:01] <jmkasunich> or sizeof(long) = 8?
[00:33:24] <jmkasunich> jepler: ray and I are both having problems running stock emc configs that worked fine recently
[00:33:35] <jmkasunich> john@ke-main-ubuntu:~/emcdev/emc2head$ scripts/emc
[00:33:35] <jmkasunich> EMC2 - pre-2.1 CVS HEAD
[00:33:35] <jmkasunich> Machine configuration directory is '/home/john/emcdev/emc2head/configs/stepper/'
[00:33:35] <jmkasunich> Machine configuration file is 'stepper_inch.ini'
[00:33:35] <jmkasunich> Starting EMC2...
[00:33:35] <jmkasunich> emc/usr_intf/emcsh.cc 5152: can't connect to emc
[00:33:37] <jmkasunich> Shutting down and cleaning up EMC2...
[00:33:39] <jmkasunich> emc/task/emctaskmain.cc 2492: can't initialize motion
[00:33:41] <jmkasunich> Cleanup done
[00:33:43] <jmkasunich> he gets different errors
[00:34:04] <jmkasunich> I manually reverted the 4096->8192 emc.nml change, no effect
[00:40:03] <jmkasunich> stepper_inch doesn't work, stepper_mm does
[00:44:34] <jmkasunich> found it - stepper_inch has "degrees" instead of "degree" for units
[00:44:43] <jmkasunich> no error message, it just fails
[00:56:11] <jepler> oops!
[00:56:23] <jepler> I'm glad you found the problem before I got back, I'd have spent ages puzzled.
[00:56:39] <jepler> jmkasunich: sizeof(int) == 4, sizeof(long) == sizeof(void*) == 8
[00:56:57] <jmkasunich> hmm
[00:57:15] <jmkasunich> I think there are places where I assume sizeof(void*) = sizeof(int)
[00:57:23] <jmkasunich> not sure - maybe I used long instead of int
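A minimal sketch of the portability concern raised above, assuming a platform like the one jepler describes (4-byte int, 8-byte long and pointers; the actual sizes depend on the compiler and target). Code that stores a pointer in an int loses bits on such a machine, while uintptr_t from <stdint.h> is defined to be wide enough. The function names are hypothetical and are not taken from the emc2 sources.

```c
#include <stdint.h>
#include <stdio.h>

/* Round-trip a pointer through uintptr_t, an integer type that is
 * guaranteed to be able to hold a pointer value. Returns 1 on success. */
int roundtrip_ok(void *p)
{
    uintptr_t as_integer = (uintptr_t)p;
    return (void *)as_integer == p;
}

/* Returns 1 if a pointer fits in a plain int on this platform.
 * On the 64-bit machine described in the log this would return 0. */
int pointer_fits_in_int(void)
{
    return sizeof(void *) <= sizeof(int);
}

/* Print the sizes being discussed so they can be checked per machine. */
void print_sizes(void)
{
    printf("sizeof(int)=%zu sizeof(long)=%zu sizeof(void*)=%zu\n",
           sizeof(int), sizeof(long), sizeof(void *));
}
```

Calling print_sizes() on each build machine is a quick way to answer the sizeof question for that machine; any code that assumes sizeof(void*) == sizeof(int) is suspect wherever pointer_fits_in_int() returns 0.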
[00:58:03] <jmkasunich> I'm trying to find the actual code changes alex made for units - if it reads something it doesn't understand it should print an error message, not silently fail
[00:58:42] <jepler> ./emc/nml_intf/emcglb.h: { "degree", 1.0 },
[00:59:18] <jepler> ./ini/initraj.cc: if (strcmp(angularUnitsName, angular_nv_pairs[i].name) == 0) {
[01:00:59] <jmkasunich> ah, he does a for loop looking for a match, and doesn't deal with the "no match found" case
[01:01:02] <jepler> yeah
[01:01:06] <jepler> if you want I'll fix it
[01:01:15] <jepler> I want sampler! I want sampler!
[01:01:25] <jmkasunich> ok, you fix it
[01:03:04] <rayh> I'm starting with a new checkout.
[01:09:05] <jmkasunich> jepler: for sampler... I'm trying to decide what is appropriate when the buffer is full
[01:09:37] <jmkasunich> option 1: (easier) - discard the new data - traditional "fifo full" behavior
[01:09:49] <jepler> jmkasunich: for testing, I would want to distinguish between "the test failed because there was an overrun" and "the test failed because the component(s) under test failed"
[01:09:54] <jmkasunich> option 2: (a little tricky to make thread safe) - override the oldest data
[01:10:09] <jmkasunich> s/override/overwrite/
[01:10:18] <jepler> so I don't care about discard vs overwrite as much as the ability to detect
[01:10:31] <jmkasunich> well, ability to detect is a given
[01:10:53] <jmkasunich> I'll either print a blank line whenever there is an overrun, or prefix the next good line with an asterisk, or something
[01:11:07] <jmkasunich> plus, the optional sample numbers would be out of sequence
[01:12:33] <jepler> do the thing that is easiest to code then
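The "option 1" behavior discussed above (discard new data when full, but make the overrun detectable) can be sketched roughly as follows. This is an illustration only, not the sampler code: the struct, the function names and the depth of 4 are assumptions, and the sketch ignores the memory-ordering details a real realtime/user split would need.

```c
#include <stddef.h>

#define FIFO_DEPTH 4            /* hypothetical depth, kept small for illustration */

struct sample_fifo {
    double data[FIFO_DEPTH];
    size_t in;                  /* next slot to write (producer side only) */
    size_t out;                 /* next slot to read (consumer side only) */
    int overrun;                /* set when a sample had to be discarded */
};

/* Producer side: store a sample, or discard it and flag an overrun if
 * the fifo is full. Returns 1 if stored, 0 if discarded. */
int fifo_put(struct sample_fifo *f, double value)
{
    size_t next = (f->in + 1) % FIFO_DEPTH;
    if (next == f->out) {       /* full: one slot is always left empty */
        f->overrun = 1;
        return 0;
    }
    f->data[f->in] = value;
    f->in = next;
    return 1;
}

/* Consumer side: fetch the oldest sample. *overrun_seen reports whether
 * any sample was discarded since the previous successful read, so the
 * caller can mark the output line. Returns 1 if a sample was read. */
int fifo_get(struct sample_fifo *f, double *value, int *overrun_seen)
{
    if (f->out == f->in)
        return 0;               /* empty */
    *value = f->data[f->out];
    f->out = (f->out + 1) % FIFO_DEPTH;
    *overrun_seen = f->overrun;
    f->overrun = 0;
    return 1;
}
```

A reader loop would print each value and prefix the line with an asterisk (or similar) whenever overrun_seen is nonzero, which gives a test harness a way to tell "the buffer overran" apart from "the component under test misbehaved".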
[01:12:52] <jepler> invalid inifile value for [TRAJ] ANGULAR_UNITS: degrees
[01:13:01] <jmkasunich> yay
[01:13:03] <jepler> it will print this, and make the unit default to 1, if the table entry is not found
[01:13:15] <jmkasunich> I think it should fail
[01:13:30] <jmkasunich> that will force them to read the error message
[01:13:59] <jmkasunich> people who run from an icon would be completely without clue that they are using a default unit instead of the one they specified in their ini file
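A rough sketch of the table lookup being discussed, with the "no match found" case handled by reporting an error and failing rather than defaulting. Only the "degree" entry and the error text come from the log; the table name, the extra "grad" entry and the function signature are assumptions made for illustration and are not the actual initraj.cc code.

```c
#include <stdio.h>
#include <string.h>

struct nv_pair {
    const char *name;
    double value;
};

/* Hypothetical unit table. "degree" is quoted in the log; "grad" is an
 * illustrative extra entry (one grad is 0.9 degree). */
static const struct nv_pair angular_units[] = {
    { "degree", 1.0 },
    { "grad",   0.9 },
};

/* Look up a unit name. On a match, store its value and return 0.
 * On no match, print an error and return -1 so the caller can abort
 * instead of silently continuing with a default. */
int lookup_angular_units(const char *name, double *value)
{
    size_t count = sizeof(angular_units) / sizeof(angular_units[0]);
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(name, angular_units[i].name) == 0) {
            *value = angular_units[i].value;
            return 0;
        }
    }
    fprintf(stderr, "invalid inifile value for [TRAJ] ANGULAR_UNITS: %s\n", name);
    return -1;
}
```

With this shape, "degrees" produces the message and a -1 return; whether the caller then aborts startup or falls back to a default is the policy question argued over above.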
[01:14:58] <jepler> hum, apparently returning -1 from loadTraj doesn't make emc abort
[01:15:24] <jmkasunich> whatever it was doing before seemed to abort it just fine ;-)
[01:15:34] <jmkasunich> only with no message to say way
[01:15:34] <jepler> it just tries a bunch of times to call it, with AXIS running but unresponsive
[01:15:35] <jmkasunich> why
[01:16:24] <jepler> non-standard length units, setting interpreter to mm
[01:16:33] <jepler> hm, I changed it a bit and now AXIS displays this
[01:16:46] <jepler> and my X, Y and Z are 'nan'
[01:16:52] <jmkasunich> oops
[01:17:01] <jmkasunich> that _is_ non-standard
[01:18:32] <jmkasunich> hmm
[01:19:00] <jepler> emc/usr_intf/emcsh.cc 5152: can't connect to emc
[01:19:05] <jmkasunich> does the userspace sim mean that now I need to allow for the case where the user part of sampler interrupts the "realtime" part?
[01:19:06] <jepler> I get this and it exits, if I run tkemc
[01:19:18] <jepler> axis opens its GUI and just sits there
[01:19:27] <jmkasunich> ah
[01:19:37] <jepler> yes
[01:19:41] <jmkasunich> drat
[01:22:50] <jmkasunich> fsckit - the critical region is only a few lines long, and the worst case behavior is just a false overrun indication, not a crash or deadlock or anything
[01:23:14] <jepler> when the userspace removed the oldest value "just in time"?
[01:23:19] <jepler> yeah, who cares
[01:23:22] <jmkasunich> yeah
[01:23:32] <jepler> "may have been an overrun"
[01:42:24] <cradek> step sure is busted
[01:42:40] <cradek> in a bunch of G1s, sometimes it goes 2-3 lines before stopping
[01:43:01] <jmkasunich> it works in 2.0.3?
[01:43:23] <cradek> that's the million dollar question
[01:43:32] <cradek> unfortunately I can't run 2.0 on this machine
[01:43:44] <cradek> can you try it?
[01:43:47] <jmkasunich> no RT anymore?
[01:43:50] <cradek> nope
[01:44:13] <jmkasunich> ok, first lemme see if I can replicate the problem on head
[01:44:18] <jmkasunich> what config did you try?
[01:44:24] <cradek> sim/axis
[01:44:33] <cradek> just running the splash screen gcode it's easy to see
[01:44:57] <jmkasunich> hmm
[01:45:20] <jmkasunich> sim/axis won't run
[01:45:46] <jmkasunich> stand by, might be something I did
[01:48:27] <jmkasunich> jepler: is this a side effect of the change to emc.nml?
[01:48:28] <jmkasunich> Machine configuration file is 'axis.ini'
[01:48:28] <jmkasunich> Starting EMC2...
[01:48:28] <jmkasunich> libnml/os_intf/_shm.c 238: shmget(1005(0x3ED),8192,0) failed: (errno = 22): Invalid argument
[01:48:28] <jmkasunich> libnml/os_intf/_shm.c 247: Either the size is too big or the shared memory buffer already exists but is of the wrong size.
[01:48:29] <jmkasunich> libnml/cms/cms_cfg.cc 908: cms_config: -1(CMS_MISC_ERROR: A miscellaneous error occured.) Error occured during SHMEM create.
[01:48:32] <jmkasunich> libnml/nml/nml.cc 369: NML: cms_config returned -1.
[01:49:10] <cradek> is that head?
[01:49:19] <jmkasunich> yeah
[01:49:20] <cradek> I bet that is the nml change
[01:49:37] <SWPadnos> is everything using the same emc.nml file?
[01:49:41] <cradek> yes
[01:49:43] <jepler> jmkasunich: I haven't seen that error
[01:49:58] <jepler> "or the shared memory buffer already exists but is of the wrong size."
[01:49:58] <SWPadnos> rt or sim (or both)?
[01:50:03] <jmkasunich> rt
[01:50:03] <jepler> is it possible some old buffer is lingering on?
[01:50:31] <SWPadnos> jepler: are you running RT or sim?
[01:50:35] <jepler> SWPadnos: me? sim.
[01:50:39] <SWPadnos> hmmm
[01:50:54] <SWPadnos> jmkasunich, try sim instead of RT
[01:51:02] <cradek> jepler: did the stepping problem happen just with enable-simulator?
[01:51:09] <SWPadnos> I wonder if there's a 1-page allocation issue for the RT code
[01:51:44] <jmkasunich> building for sim
[01:53:41] <jepler> cradek: no, I think I saw and refused to pay attention to it before yesterday
[01:53:49] <cradek> ok
[01:54:07] <jmkasunich> same problem using non-rt
[01:54:29] <jepler> so revert that nmlfile change
[01:54:35] <jepler> it's not needed for real people
[01:55:01] <jmkasunich> thats the toolCmd entry?
[01:55:02] <cradek> are you sure you have the new nml file in your config?
[01:55:02] <jepler> yes
[01:55:10] <jmkasunich> I thought you changed it from 4096 to 8192
[01:55:13] <jepler> yes
[01:55:19] <jmkasunich> its 1024 here
[01:55:26] <SWPadnos> hmmm
[01:55:48] <jmkasunich> you changed common/emc.nml, and I think the build is supposed to copy that to the individual config dirs...
[01:55:52] <jepler> B toolCmd SHMEM localhost 1024 0 0 4 16 1004 TCP=5005 xdr
[01:55:52] <jmkasunich> I guess it didn't
[01:55:55] <jepler> -B toolSts SHMEM localhost 4096 0 0 5 16 1005 TCP=5005 xdr
[01:55:58] <jepler> +B toolSts SHMEM localhost 8192 0 0 5 16 1005 TCP=5005 xdr
[01:56:01] <jepler> I changed toolSts
[01:56:08] <jmkasunich> duh
[01:56:38] <jmkasunich> changed back to 4096, works in sim
[01:57:09] <jmkasunich> building rt
[01:57:50] <SWPadnos> so 8192 works on at least one 64-bit system, and doesn't on at least one 32-bit system. odd
[01:58:31] <jmkasunich> whats really odd is that I ran other configs since jepler's commit
[01:58:41] <jmkasunich> and I think the 8192 worked fine
[01:58:43] <jepler> if you didn't 'make' it wouldn't have copied the nml file
[01:59:53] <jmkasunich> fsck - now I'm seeing some of the same strangeness that ray reported
[02:00:18] <SWPadnos> that's good. wasn't the idea to try 2.0.x on that machine? ;)
[02:01:49] <jepler> this is with 2.0 or with 2.1 CVS?
[02:01:55] <jmkasunich> somethings fscked with trivkins now
[02:02:11] <jmkasunich> john@ke-main-ubuntu:~/emcdev/emc2head$ scripts/emc
[02:02:11] <jmkasunich> EMC2 - pre-2.1 CVS HEAD
[02:02:11] <jmkasunich> Machine configuration directory is '/home/john/emcdev/emc2head/configs/sim/'
[02:02:11] <jmkasunich> Machine configuration file is 'axis.ini'
[02:02:11] <jmkasunich> Starting EMC2...
[02:02:12] <jmkasunich> HAL:5: ERROR: module 'trivkins' not loaded
[02:02:14] <jmkasunich> HAL config file /home/john/emcdev/emc2head/configs/sim//core_sim.hal failed.
[02:02:16] <jmkasunich> S
[02:02:18] <jmkasunich> but...
[02:02:25] <jmkasunich> john@ke-main-ubuntu:~/emcdev/emc2head$ lsmod | head
[02:02:25] <jmkasunich> Module Size Used by
[02:02:25] <jmkasunich> trivkins 1796 0
[02:02:26] <jmkasunich> hal_lib 29976 1 trivkins
[02:02:28] <jmkasunich> rtapi 26048 1 hal_lib
[02:02:30] <jmkasunich> r
[02:02:42] <jmkasunich> funny thing is, halcmd unloadrt won't remove it
[02:02:44] <jepler> did you 'make clean' after switching from realtime?
[02:02:47] <jmkasunich> and halcmd show comp doesn't show it
[02:03:04] <SWPadnos> halcmd unloadrt is basically rmmod, right?
[02:03:04] <jmkasunich> no, is a clean needed when switching?
[02:03:05] <jepler> unfortunately, I don't think make catches that it needs to rebuild some of that stuff. like halcmd
[02:03:13] <jmkasunich> ok
[02:03:48] <jmkasunich> SWPadnos: unloadrt invokes module_helper which invokes rmmod
[02:03:58] <jmkasunich> I manually rmmod'ed it just fine
[02:04:03] <SWPadnos> interesting
[02:04:14] <jmkasunich> doing a clean build now, we'll see what happens
[02:04:55] <jepler> but if you're using --enable-simulator, it runs rtapi_app
[02:05:20] <jmkasunich> I'm done with non-rt (it worked)
[02:05:31] <jmkasunich> I rebuilt for RT and thats when things went to hell
[02:05:43] <jmkasunich> recap:
[02:05:49] <jmkasunich> sim/axis didn't work in RT
[02:05:57] <jmkasunich> rebuild (not from clean) for sim
[02:06:06] <jmkasunich> sim/axis didn't work in nonRT
[02:06:11] <jmkasunich> changed 8192 to 4096
[02:06:18] <jmkasunich> sim/axis did work in non-rt
[02:06:26] <jmkasunich> rebuild (not from clean) for RT
[02:06:37] <jmkasunich> sim/axis didn't work with trivkins problem
[02:06:47] <jmkasunich> rebuilding from clean for RT now
[02:10:45] <jepler> I ran through about 200 lines on 2.0.3 and didn't see a problem
[02:10:51] <jepler> 2.0.3, real machine, real rt
[02:11:08] <jmkasunich> sim/axis with 8192 didn't work for RT
[02:11:20] <jmkasunich> sim/axis with 4096 did work for RT
[02:11:29] <SWPadnos> and you do see the problem on the same machne with a build from HEAD?
[02:12:02] <jepler> HEAD from at least 1 day ago had the problem; I ran into it on line 23
[02:15:09] <jmkasunich> do you still want me to try steping on an RT system, or should I go back to sampler?
[02:17:52] <jepler> I may be guessing wrong here, but does it happen when there is no cruise phase for a segment?
[02:18:12] <jepler> I am surprised that it doesn't happen predictably!
[02:18:26] <jmkasunich> turn off blending (G61?) and see if it happens?
[02:20:39] <jepler> I didn't get it to happen in several runs of the first 50 lines with G61, but did get it to happen at line 23 in the next run with G64
[02:29:07] <jepler> goodnight guys
[02:29:15] <jmkasunich> goodnight
[02:29:19] <jepler> good luck too
[02:32:49] <jmkasunich> heh, seems I never committed the makefile changes to make the user part of streamer
[02:43:30] <cradek> jepler: I also think that's the key
[02:43:34] <cradek> I'm looking into it now
[04:44:33] <cradek> I understand the remaining step bug, it's the segment-combining code that does it
[04:58:14] <jmkasunich> tough to fix?
[04:58:51] <cradek> well it's c++
[04:58:56] <cradek> might have it now, testing
[05:00:01] <cradek> yep looks like I got it
[05:01:03] <jmkasunich> yay!
[05:02:24] <cradek> no, I broke it worse
[05:02:26] <cradek> arg
[05:02:34] <jmkasunich> !yay
[05:04:14] <cradek> well heck I can't tell
[05:04:24] <cradek> step works right now, but if you step too fast it freaks out
[05:04:27] <jmkasunich> can't tell if its better or worse?
[05:04:29] <cradek> I think that's a separate bug
[05:04:34] <cradek> I think it's better
[05:04:41] <jmkasunich> fixabug findabug
[05:07:44] <cradek> committed
[05:08:06] <jmkasunich> CIA needs another boot in the pants
[05:08:39] <jmkasunich> still another bug lurking tho?
[05:09:05] <cradek> yes I think so, seems like stepping while a step is already running causes a problem
[05:09:39] <cradek> alex_joni reported earlier today that doing that causes it to take off running, but for me it stops
[05:09:46] <cradek> I may have changed that behavior unwittingly
[06:19:02] <jmkasunich> yay, sampler committed
[06:19:05] <jmkasunich> bedtime
[12:44:29] <jepler> cradek: thanks for working on that step thing. I suspected that segment merging would play a role in it, but I was afraid to look.
[12:52:46] <alex_joni> jepler: is it fully solved now? or does the motion_id issues still exist (e.g. missing motion_id's)
[12:54:24] <jepler> alex_joni: not sure, I only read the logs
[12:54:36] <alex_joni> same here..
[13:40:13] <skunkworks> logger_devel: bookmark
[13:40:13] <skunkworks> I'm feeling lazy .. but here's the log anyways: http://81.196.65.201/irc/irc.freenode.net:6667/emcdevel/2006-10-06#T13-40-13
[13:40:26] <skunkworks> ohh - alex is getting cute :)
[14:51:33] <cradek> I think step is fully fixed, but I'd appreciate you guys testing it anyway
[15:14:19] <skunkworks> cradek: what did it end up being?
[15:16:35] <cradek> there were two problems, some adjacent segments thought they came from the same line number (so you couldn't pause between them) and the way we pause was incompatible with blending because while blending it's not very clear which gcode you're executing (and where to stop)
[15:17:17] <cradek> so now while stepping, blending is disabled, because you want to stop right at the programmed corners anyway.
[15:18:35] <skunkworks> make sense. Nice
[15:25:38] <cradek> I haven't tested it but I know step is going to work in a surprising way when in tolerance mode
[15:26:07] <cradek> and there's really no fix as far as I can see
[15:27:17] <skunkworks> is this an issue in head?
[15:28:00] <cradek> yes
[15:28:21] <cradek> although I haven't tested stepping in 2.0, it might have the other problem
[15:36:21] <skunkworks> is step going to stop short in tolerance mode?
[15:36:38] <cradek> no it'll skip over some tiny segments
[15:37:00] <skunkworks> interesting. Do you see an issue with it?
[15:37:15] <cradek> I haven't tried it...
[15:37:32] <skunkworks> I ment mentally.
[15:38:20] <cradek> not sure what you mean by issue, I know it won't stop on every line of gcode like you might expect/want, so I guess that's an issue
[15:39:00] <skunkworks> ahhh - I see. thanks
[15:39:20] <skunkworks> * skunkworks wasn't getting it.
[15:40:37] <cradek> 2.0 won't have it, though, so we'll see how it goes for 2.1
[15:58:36] <skunkworks> could it be as easy as changing to exact stop mode also when you go to single step?
[15:58:47] <skunkworks> transparent to the user.
[15:59:15] <cradek> not really, since you can pause and start stepping anywhere
[16:00:10] <cradek> this is one of those I shouldn't have mentioned - should have waited to see if anyone reports it.
[16:00:20] <skunkworks> :)
[16:00:34] <skunkworks> I have already told every one I know :)
[16:18:21] <jepler> ugh. stepgen gives different results on sim+gcc3.2 and rtai+gcc3.4
[16:18:37] <jepler> so much for comparing sampler outputs as a way to do regression testing
[16:21:11] <SWPadnos> what results are different, and how do they differ?
[16:21:51] <jepler> the step pulses don't happen on the same iteration of the realtime threads
[16:21:53] <SWPadnos> oh wait, of course things like feedback positions will be different, sim isn't realtime
[16:22:40] <SWPadnos> with sim, I'd bet you would get jitter if you move the mouse around a lot (dragging windows especially)
[16:22:41] <jepler> when nothing external is involved, and everything is running in the same thread, the behavior should be identical
[16:23:02] <jepler> the results I get on sim do not change from run to run
[16:23:07] <SWPadnos> interesting
[16:23:25] <SWPadnos> does sampler run in the base thread or in the servo thread?
[16:24:02] <jepler> there is only one thread in use
[16:24:25] <SWPadnos> ok, and in sim, it's explicitly a one-of-N divide to get the slower threads?
[16:24:29] <jepler> this is an example hal file used for a test: http://emergent.unpy.net/files/sandbox/test.hal
[16:25:29] <jepler> here's the whole thing, including results generated on 'sim': http://emergent.unpy.net/files/sandbox/emc2-regression-test.patch
[16:25:46] <SWPadnos> hmmm. in sim, does the thread get passed the actual lelapsed time, or the set thread period?
[16:25:59] <jepler> the set thread period
[16:26:21] <SWPadnos> ok, I think in RT, it gets the actual elapsed time
[16:26:22] <jepler> the problem is not that it's unpredictable on sim. It's predictable on sim and rtai, but different from the one to the other
[16:27:17] <jepler> take a look at hal_lib.c:thread_task, which is the same code on sim or rtapi
[16:27:19] <jepler> /* call the function */
[16:27:19] <jepler> funct_entry->funct(funct_entry->arg, thread->period);
[16:27:22] <SWPadnos> I'd expect identical runs on sim (since the programmed time interval is passed), does RT also give identical runs (but different from sim)?
[16:27:30] <SWPadnos> hmm
[16:27:33] <jepler> the function is called with the given callback argument and period
[16:27:55] <SWPadnos> interesting. that should be the actual elapsed time, I think
[16:28:21] <jepler> RT gives identical results from run to run
[16:28:56] <SWPadnos> hmmm. take a look at the actual thread period - it will be dependent on the hardware clock (whereas sim will be an exact value)
[16:29:21] <jepler> oh -- I hadn't realized that, but you're right.
[16:29:24] <SWPadnos> then try sim with the actual value you see in RT
[16:30:03] <SWPadnos> not that that will help with a regression check, since the actual RT value may differ from machine to machine
[16:31:01] <jepler> yeah, hm
[16:47:57] <jepler> yeah, if I use that value on sim they give the same results for these tests
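The effect described above can be illustrated with a toy accumulator-style step generator. This is not the stepgen implementation; it only shows why a deterministic function can still put its pulses on different iterations when the period value passed to it differs slightly. The function name and the two period values used below (50000 ns requested, 49729 ns "actual") are made-up assumptions.

```c
#include <stdint.h>

/* Return the 1-based iteration on which the first step pulse occurs
 * for a toy generator that adds (rate * period) to an accumulator on
 * every call and emits a step once a whole step has accumulated.
 * Integer math is used so the result is exactly repeatable.
 * Returns -1 for non-positive arguments. */
long first_step_iteration(int64_t steps_per_second, int64_t period_ns)
{
    const int64_t one_step = 1000000000LL;  /* one step, in step*ns/s units */
    int64_t accum = 0;
    long iteration = 0;

    if (steps_per_second <= 0 || period_ns <= 0)
        return -1;

    while (accum < one_step) {
        accum += steps_per_second * period_ns;
        iteration++;
    }
    return iteration;
}
```

At 1000 steps per second, a period of 50000 ns accumulates exactly one step after 20 calls, while a period of 49729 ns needs 21 calls. Both runs are perfectly repeatable, yet they disagree with each other, which matches the sim-versus-rtai observation.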
[16:48:27] <SWPadnos> cool
[16:49:24] <jepler> it sucks
[17:16:57] <SWPadnos> well, that too. at least it's consistent though
[17:21:17] <SWPadnos> well, CIA is certainly active now ;)
[17:33:12] <Lerneaen_Hydra> indeed it was
[17:34:00] <alex_joni> hi
[17:34:06] <alex_joni> cradek: stepping sure seems right now
[17:34:18] <alex_joni> don't see any off-the path problems
[17:51:12] <alex_joni> the only minor thing is the point where it resumes running if you push step too often
[18:20:11] <cradek> alex_joni: I think that's in the motion controller, I think the fix is simple but I have not tested: ignore step messages if you're already in a step
[18:20:32] <cradek> alex_joni: thanks for testing
[18:22:38] <alex_joni> no sweat, thanks for fixing this
[19:10:43] <jepler> I've added a flag in the HAL structure to have hal report to the realtime functions that the exact requested period is actually being used. I set this flag with a 'halcmd' command. What should I call this command?
[19:10:52] <alex_joni> cradek: I added a small thing to your fix
[19:11:00] <alex_joni> yet I'm puzzled by the outcome
[19:11:04] <alex_joni> http://pastebin.ca/193294
[19:11:57] <cradek> what's the outcome?
[19:12:01] <alex_joni> nothing :(
[19:12:02] <jepler> now the tests of stepgen work the same on sim and rtai
[19:12:09] <alex_joni> jepler: cool
[19:12:22] <alex_joni> cradek: I would have expected a dialog with the error message
[19:12:27] <cradek> that's because the message isn't sent; it's blocked in task
[19:12:43] <cradek> look at the other file's diff
[19:13:26] <alex_joni> ah.. so it doesn't ever get there
[19:13:44] <alex_joni> cool.. guess then this won't hurt.. right?
[19:13:53] <cradek> nope
[19:15:08] <alex_joni> cradek: got a minute to talk about something?
[19:15:15] <cradek> sure
[19:15:28] <alex_joni> I asked around (german users mainly) what they still feel missing from emc2
[19:15:53] <alex_joni> and amongst the "plausible implementable" features there was a request for jogging while paused
[19:16:21] <alex_joni> how dangerous does that sound?
[19:16:24] <cradek> jogging and offsetting/zeroing I suppose
[19:16:29] <alex_joni> yes
[19:16:39] <cradek> I wanted that before I had tool holders
[19:16:51] <alex_joni> the exact reason tehy asked for :)
[19:16:53] <cradek> someone reported that you CAN do it already in tkemc
[19:17:00] <cradek> sure I understand
[19:17:15] <cradek> I have not tried this but it seems a little unlikely to me
[19:17:46] <alex_joni> * alex_joni tries now
[19:18:08] <alex_joni> can't do that (EMC_AXIS_JOG) in auto mode with the interpreter paused
[19:18:08] <alex_joni> can't do that (EMC_AXIS_ABORT) in auto mode with the interpreter paused
[19:18:09] <alex_joni> can't do that (EMC_AXIS_ABORT) in auto mode with the interpreter paused
[19:19:14] <cradek> switch to manual
[19:19:19] <alex_joni> I did
[19:19:30] <cradek> oh it doesn't switch?
[19:19:32] <alex_joni> but when switching back to auto it starts from the beginning of the program
[19:20:02] <cradek> ok I didn't think it was likely to work
[19:20:09] <alex_joni> same here
[19:20:14] <cradek> I really wonder what this guy was talking about - wish I could remember who it was
[19:20:30] <alex_joni> I vaguely remember something about that too
[19:21:32] <alex_joni> hmm.. I think it involved a start-line-number
[19:21:49] <cradek> you could sure do it that way today
[19:21:58] <cradek> but one gcode file per tool is more foolproof
[19:23:17] <SWPadnos> hmmm - there's code in the interpreter for skipping lines that are before the current line - I thought that was there for resume functionality ..
[19:24:17] <cradek> yes "run from line" definitely works
[19:24:25] <alex_joni> any idea how to do that?
[19:24:50] <cradek> click the line you want in axis (gcode OR preview), pick "set next line" on the menu, hit run
[19:25:44] <alex_joni> pick?
[19:26:13] <cradek> click? poke? invoke?
[19:26:52] <alex_joni> no, I mean from where should I get the "set next line" ?
[19:26:59] <alex_joni> I don't seem to find it on the menu
[19:27:05] <cradek> one of the menus
[19:27:06] <alex_joni> nor does rightclick work..
[19:27:19] <cradek> machine?
[19:27:21] <alex_joni> oh.. it's grayed out
[19:27:31] <cradek> you have to be stopped
[19:27:51] <alex_joni> right.. figured that out now
[19:29:55] <alex_joni> hmm.. seems a bit counterintuitive to me :)
[19:30:15] <cradek> try to do it in xemc if you want to see counterintuitive
[19:30:30] <alex_joni> how does this sound? if there's a line number, resume should work when switching to auto
[19:30:42] <alex_joni> if you still push start it starts from the beginning of the file
[19:31:05] <cradek> overloading resume to do that is an interesting idea
[19:31:07] <alex_joni> so only resume starts from the "set next line" number
[19:31:12] <cradek> so as soon as you highlight a program line, resume ungrays?
[19:31:29] <alex_joni> yeah :)
[19:31:45] <alex_joni> that way we could set the line number by default on stopping
[19:31:55] <alex_joni> switch to manual, jog, whatever, come back and resume
[19:32:10] <cradek> how would you stop?
[19:32:13] <alex_joni> would seem like a "natural" way to work with it
[19:32:18] <alex_joni> push stop
[19:32:25] <alex_joni> or pause, then stop
[19:33:16] <alex_joni> hmm.. or maybe stop by pushing pause, then if the user switches to manual teh line gets remembered, and the interpreter stopped
[19:33:29] <alex_joni> also when switching to MDI
[19:34:27] <alex_joni> bet ray would really kick me for such a change :D
[19:34:40] <cradek> actually you're starting to make me nervous too
[19:35:08] <cradek> I still think the foolproof way is to have one gcode file per tool
[19:35:11] <SWPadnos> isn't there a "reset" button in tkemc?
[19:35:29] <cradek> it's an Fkey I think, no button
[19:35:54] <SWPadnos> actually, there's run and resume, which should respectively start over or continue from the last executed line,
[19:36:33] <cradek> it's not as easy as you guys are making it
[19:36:40] <cradek> imagine this program
[19:36:44] <cradek> g0x2y0
[19:36:53] <cradek> g3x0y0i-1j0
[19:36:53] <cradek> m0
[19:37:23] <cradek> x1y0i.5j0
[19:37:41] <cradek> ok now on the m0, I abort (remembering the line number) then jog the machine 99" to the left
[19:37:43] <alex_joni> m0 stop?
[19:37:51] <cradek> m0 is pause
[19:37:53] <alex_joni> ok
[19:37:56] <cradek> you'd have to hit esc there, then jog
[19:38:07] <cradek> now when I resume I'll get an arc of 99" diameter
[19:38:34] <cradek> experiment with run-from-line and see what I mean
[19:38:39] <jepler> won't you get an error since the arc is too small to reach the endpoint?
[19:38:56] <cradek> hmm yeah you're right
[19:39:29] <cradek> ok use R format or something
[19:39:30] <alex_joni> jepler: right, but it's still a plausible cause for problems
[19:39:42] <cradek> I'm sure it can be screwy if you aren't really careful
[19:39:51] <alex_joni> * alex_joni wonders what happens if the line is inside an O-loop
[19:39:52] <jepler> iirc you could get some pretty funky arcs by using "start from line" with arcspiral.ngc
[19:40:03] <jepler> alex_joni: ow my head
[19:40:11] <cradek> yes you can
[19:40:19] <alex_joni> jepler: :P
[19:40:29] <cradek> ouch, good question
[19:40:35] <alex_joni> * alex_joni runs to try it :D
[19:40:38] <cradek> ok, this is a bad idea
[19:40:50] <jmkasunich> hi guys
[19:40:56] <SWPadnos> hi jmk
[19:40:56] <cradek> hi jmk
[19:40:59] <jmkasunich> * jmkasunich is playing hooky
[19:41:05] <cradek> me too!
[19:41:15] <alex_joni> hooky?
[19:41:27] <cradek> jmkasunich: you explain it
[19:41:27] <jmkasunich> slang for skipping school as a kid
[19:41:35] <jepler> so what are you playing hooky from?
[19:41:37] <cradek> I have no idea why it's called "hooky"
[19:41:47] <jmkasunich> or work in this case - I had a doctor appointment at 2pm, and when it was over I didn't go back to work
[19:41:58] <jepler> *gsps*
[19:42:01] <jepler> *gasps* too
[19:42:08] <jepler> jmkasunich:
[19:42:09] <alex_joni> jepler: it seems to be somehow correct :D
[19:42:09] <jepler> $ ./scripts/runtest tests
[19:42:09] <jepler> Running test: tests/stepgen.0
[19:42:09] <jepler> Running test: tests/stepgen.1
[19:42:09] <jepler> Runtest: 2 tests run, 2 successful, 0 failed
[19:42:25] <jmkasunich> slick
[19:42:39] <jmkasunich> I saw the talk about "period"
[19:43:07] <alex_joni> jepler: disregard that..
[19:43:27] <jepler> jmkasunich: yeah -- I'm creating a halcmd command called "setexact_for_test_suite_only" which will lie to the realtime functions about the actual period
[19:43:29] <alex_joni> jepler: it's not something you'd want to do on a real machine :)
[19:44:02] <jmkasunich> you gonna have tab completion on that one? ;-)
[19:44:17] <alex_joni> jmkasunich: it's not for humans :D
[19:44:28] <SWPadnos> if so, I'd call it something like Xact_mode_for_test_suite, so you only need the X ;)
[19:44:52] <jepler> nope, no tab completion
[19:44:57] <alex_joni> I's use an ascii char that's not on the keyboards :D
[19:45:00] <jepler> alex_joni: that's what the ugly name is all about
[19:45:19] <alex_joni> s/I's/I'd/
[19:46:00] <jmkasunich> regression testing will be a nice thing to have
[19:46:00] <alex_joni> jepler: surprisingly the starting from a line in an o-word loop is almost correct
[19:46:16] <alex_joni> the initial vars are assigned correct
[19:46:56] <SWPadnos> I wonder if there may be a way to get the actual period to use in sim, rather than changing RT to fake the timing
[19:47:13] <SWPadnos> something like halcmd how_long_is 100000
[19:47:13] <jmkasunich> that defeats jeff's goal
[19:47:43] <jmkasunich> which is to have repeatable tests that give the same results regardless of the "actual" timing
[19:47:44] <SWPadnos> hmmm - true, you'd only be able to check sim vs RT on a single machine
[19:48:36] <jmkasunich> for 99% of applications, we could pass the requested period instead of the actual one with no apparent difference
[19:48:51] <jmkasunich> the error is rarely more than a couple percent
[19:49:02] <SWPadnos> true. positions would be correct, but feeds/accels would be off by a bit
[19:49:03] <jmkasunich> so you ask for 60 ipm, and you get 60.8...
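As a rough worked example of the "60 ipm becomes 60.8" remark: assuming the only error is that a function integrates motion over the requested period while the calls actually arrive at a slightly different interval, the resulting feed scales by requested/actual. The function name and the period values in the note below are hypothetical.

```c
/* Feed actually produced when motion is computed per call using
 * requested_period_ns but the calls really occur every
 * actual_period_ns. Positions still come out right; only the rate
 * at which they are reached changes. */
double effective_feed(double commanded_feed,
                      double requested_period_ns,
                      double actual_period_ns)
{
    return commanded_feed * (requested_period_ns / actual_period_ns);
}
```

For instance, a requested period of 1000000 ns against an actual period of 986842 ns is an error of about 1.3 percent, and a commanded 60 ipm comes out at roughly 60.8 ipm.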
[19:51:07] <jmkasunich> jepler: I gather sampler is working ok for you?
[19:51:16] <jepler> jmkasunich: yes
[19:51:18] <jepler> bbl
[19:52:10] <alex_joni> jmkasunich: got 5 mins?
[19:52:14] <jmkasunich> ok
[19:52:25] <alex_joni> wanna talk about comp?
[19:52:29] <jmkasunich> ok
[19:52:47] <alex_joni> I got the comp tables to RT
[19:52:56] <alex_joni> you said it's easy once the data is there :)
[19:53:15] <jmkasunich> is it "put up or shut up" time for me?
[19:53:21] <alex_joni> no..
[19:53:25] <alex_joni> pointers?
[19:53:44] <alex_joni> also .. wanted to ask you about alter..
[19:53:51] <alex_joni> did you figure that out?
[19:53:58] <jmkasunich> I don't know anything about alter
[19:54:03] <alex_joni> I mean.. what it's supposed to be
[19:54:05] <jmkasunich> I bet we'd have to ask fred what that is
[19:54:19] <alex_joni> the motion.h description is quite informative
[19:54:44] <alex_joni> it says /* additive dynamic compensation */
[19:55:00] <jmkasunich> yeah, and I haven't a clue what that means
[19:55:12] <alex_joni> the most info I got from this:
[19:55:45] <alex_joni> http://cvs.linuxcnc.org/cvs/emc/src/emctask/alter.c?rev=1.3;content-type=text%2Fplain
[19:55:52] <alex_joni> it sounds like a simple motor offset
[19:56:18] <jmkasunich> yeah
[19:56:26] <jmkasunich> and sounds like something we'd do in hal today
[19:56:29] <alex_joni> like a poor-man's HAL :)
[19:56:35] <alex_joni> exactly
[19:56:42] <jmkasunich> so it can go away
[19:57:06] <alex_joni> that's what I was thinking too
[19:57:19] <alex_joni> but wanted your oppinion before I started to rip out code again :)
[19:57:31] <jmkasunich> the comp array is fixed size, right?
[19:57:37] <alex_joni> for now, yes
[19:57:50] <SWPadnos> fixed at 256 entries
[19:58:09] <jmkasunich> the way I see it, backlash is a simplified case of screw comp
[19:58:18] <jmkasunich> and they should be handled by the same code
[19:58:26] <alex_joni> although I start to wonder if comp is the way to do it
[19:58:31] <SWPadnos> that's what the comments in the code point at
[19:58:35] <alex_joni> I like that kinematics approach better :D
[19:58:56] <alex_joni> using a 10th degree polynomial & all :)
[19:59:09] <alex_joni> probably a bit more precise than linear interp. between comp points
[19:59:11] <jmkasunich> I disagree
[19:59:54] <jmkasunich> the difference between screw comp and kins based comp isn't the nature of the curve (linear interp, poly, etc), it's the fact that kins allows for cross-axis coupling, and screw comp is one axis at a time
[19:59:56] <SWPadnos> it's especially hard to map backlash compensation onto non-tricial kinematics
[20:00:07] <SWPadnos> err - non-trivial
[20:00:11] <jmkasunich> I see a place for both actually
[20:00:22] <jmkasunich> screw comp (the traditional way) is easier
[20:00:30] <jmkasunich> kins is more complex, but more powerful
[20:00:46] <jmkasunich> (can compensate for bent ways, out-of-square machines, etc)
[20:01:27] <SWPadnos> it seems a good division of labor to have kins tell the positioning system where it wants the joints to be, and for each joint to apply compensation to make sure it's there
[20:02:00] <SWPadnos> but you're right - multi-axis compensation (like a wobbly Y due to the X nut being ellipsoid) would need kins compensation
[20:02:49] <jmkasunich> or a lathe that has worn ways and turns 0.002 bigger in diameter near the headstock
[20:03:07] <jmkasunich> you could have a function that adjusts X (cross-slide) as a function of Z
[20:03:24] <jmkasunich> but we digress - lets talk about screw comp
[20:03:40] <alex_joni> right
[20:03:46] <alex_joni> * alex_joni gets busy removing alter
[20:03:47] <SWPadnos> you know - backlash comp may be simplified by treating one direction as "correct", and just applying an offset to any position when traveling in the other direction
[20:03:55] <jmkasunich> nope
[20:04:08] <SWPadnos> hmmm - why not?
[20:04:12] <alex_joni> SWPadnos: you already have 3 values for comp
[20:04:16] <jmkasunich> backlash comp should go away when screw comp comes in
[20:04:17] <alex_joni> nominal, forward, backward
[20:04:44] <alex_joni> so you can say forward = backlash/2, backward=-backlash/2 :)
[20:04:53] <jmkasunich> right
[20:04:55] <SWPadnos> yes - screw comp needs both fwd and reverse deviations, since the two sides of the screw thread may have different wear
[20:04:56] <alex_joni> make that nominal + ..
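A minimal sketch of the idea above, that plain backlash is just a degenerate screw comp table. The struct and function names are hypothetical and not taken from the EMC2 source. It stores corrections relative to nominal (so forward = +backlash/2, reverse = -backlash/2), which is an assumed sign convention.

```c
#include <stddef.h>

/* Hypothetical table entry: names follow the chat, not the EMC2 source. */
typedef struct {
    double nominal;   /* commanded position */
    double fwd_corr;  /* correction while moving in the positive direction */
    double rev_corr;  /* correction while moving in the negative direction */
} comp_entry_t;

/*
 * Express a single backlash value as a two-entry comp table covering
 * [min_pos, max_pos]. Returns the number of entries written (2),
 * or 0 if the arguments are unusable.
 */
size_t backlash_to_comp(double backlash, double min_pos, double max_pos,
                        comp_entry_t *table, size_t capacity)
{
    size_t i;
    const double ends[2] = { min_pos, max_pos };

    if (table == NULL || capacity < 2 || max_pos <= min_pos)
        return 0;

    for (i = 0; i < 2; i++) {
        table[i].nominal  = ends[i];
        table[i].fwd_corr =  backlash / 2.0;  /* half the lash each way */
        table[i].rev_corr = -backlash / 2.0;
    }
    return 2;
}
```

With a table like this, the same lookup code could serve both plain backlash and a fully measured screw map.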
[20:05:13] <jmkasunich> actually alex_joni thats one change I want to make
[20:05:33] <jmkasunich> right now, its nominal, actual_fwd, actual_rev
[20:05:55] <jmkasunich> I'd like to change it to nominal, fwd_correction, rev_correction
[20:06:13] <jmkasunich> the correction numbers are smaller and less subject to loss of precision
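As a rough illustration of a table in the proposed nominal / fwd_correction / rev_correction form, the sketch below linearly interpolates a correction between neighbouring comp points. All names are hypothetical. Clamping outside the table and the meaning of `dir` are assumptions, not behaviour confirmed by the source.

```c
#include <stddef.h>

/* Hypothetical entry layout, matching the proposed correction format. */
typedef struct {
    double nominal;   /* commanded position */
    double fwd_corr;  /* correction when moving forward */
    double rev_corr;  /* correction when moving in reverse */
} comp_entry_t;

/*
 * Return the interpolated correction at pos.
 * t must be sorted by strictly ascending nominal; dir > 0 selects the
 * forward column, anything else the reverse column.
 * Positions outside the table clamp to the nearest end entry.
 */
double comp_lookup(const comp_entry_t *t, size_t n, double pos, int dir)
{
    size_t i;
    double c0, c1, frac;

    if (t == NULL || n == 0)
        return 0.0;
    if (pos <= t[0].nominal)
        return dir > 0 ? t[0].fwd_corr : t[0].rev_corr;
    if (pos >= t[n - 1].nominal)
        return dir > 0 ? t[n - 1].fwd_corr : t[n - 1].rev_corr;

    /* find the first entry at or above pos; pos is inside the table here */
    for (i = 1; t[i].nominal < pos; i++)
        ;

    c0 = dir > 0 ? t[i - 1].fwd_corr : t[i - 1].rev_corr;
    c1 = dir > 0 ? t[i].fwd_corr : t[i].rev_corr;
    frac = (pos - t[i - 1].nominal) / (t[i].nominal - t[i - 1].nominal);
    return c0 + frac * (c1 - c0);
}
```

Because only small corrections are stored, the interpolation works on values near zero rather than on large absolute positions.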
[20:06:25] <SWPadnos> I think it's actuals because they're easier to measure (in theory)
[20:06:42] <jmkasunich> even that depends on the situation
[20:06:48] <SWPadnos> though the average home shop may use an indicator and gage blocks instead of an absolute scale ;)
[20:06:51] <jmkasunich> if you are using a laser interferometer, probably
[20:06:57] <jmkasunich> right - thats where I was going
[20:07:17] <alex_joni> jmkasunich: that's easy
[20:07:24] <jmkasunich> anyway, we probably want to support both formats in the file (with something to say which one it is)
[20:07:25] <alex_joni> I can do the userspace part
[20:07:25] <SWPadnos> it may be possible to use them either way, and have the code figure out which one was used
[20:07:36] <SWPadnos> the only difficulty would be near zero
[20:07:37] <jmkasunich> but the internal data should be corrections
[20:07:58] <jmkasunich> SWPadnos: probably, but I hate it when computers try to read peoples minds
[20:08:12] <jmkasunich> just let the person say what he has in mind
[20:08:14] <SWPadnos> when it's obvious, it should be automatic
[20:08:17] <SWPadnos> in all other cases, it should barf ;)
[20:08:24] <SWPadnos> unless told otherwise
[20:08:57] <SWPadnos> sure - the algo I thought of basically looks at the magnitude of a few points, and if they're different by more than a few percent, assumes the data is offsets
[20:09:21] <jmkasunich> another approach is to look at the two endpoints
[20:09:45] <jmkasunich> if their slope is approximately 1, they're actuals, if the slope is approximately zero, they're corrections
[20:09:55] <SWPadnos> right
[20:10:16] <jmkasunich> say -0.1 to +0.1 = correction, +0.9 to +1.1 = actual, anything else yell for help
[20:10:27] <SWPadnos> sure - that would work
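The endpoint-slope heuristic described above could be sketched roughly as follows. The thresholds are the ones quoted in the chat. The enum and function names are hypothetical, and only one data column is examined at a time.

```c
/* Hypothetical result codes for the guess. */
typedef enum {
    COMP_CORRECTION,  /* slope near 0: values are offsets from nominal */
    COMP_ACTUAL,      /* slope near 1: values are measured positions */
    COMP_UNKNOWN      /* anything else: ask the user */
} comp_format_t;

/*
 * Classify a comp file column from its first and last rows.
 * nom_* are the nominal positions, val_* the forward (or reverse) values.
 */
comp_format_t comp_guess_format(double nom_first, double val_first,
                                double nom_last, double val_last)
{
    double span = nom_last - nom_first;
    double slope;

    if (span == 0.0)
        return COMP_UNKNOWN;  /* cannot compute a slope from one point */

    slope = (val_last - val_first) / span;
    if (slope >= -0.1 && slope <= 0.1)
        return COMP_CORRECTION;
    if (slope >= 0.9 && slope <= 1.1)
        return COMP_ACTUAL;
    return COMP_UNKNOWN;
}
```

Returning an explicit "unknown" keeps the "anything else yell for help" behaviour, so the caller can report an error instead of guessing.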
[20:10:42] <jmkasunich> in any case, that is in user space
[20:10:46] <SWPadnos> anyway - just a thought. we can go back to compensation if you like
[20:10:47] <SWPadnos> yep
[20:11:01] <jmkasunich> the shmem data can be floats instead of doubles, if we use corrections
[20:11:19] <alex_joni> jmkasunich: it's not in shmem I thought :)
[20:11:23] <SWPadnos> using offsets in RT may make sense, since it may save a subtraction
[20:11:33] <alex_joni> and those 3 doubles to pass the data .. is not that much memory
[20:12:13] <jmkasunich> how do you do the user->rt thing?
[20:12:21] <alex_joni> pass one pair at a time
[20:12:26] <jmkasunich> ah
[20:18:16] <Lerneaen_Hydra> silly musing: instead of having it guess the format, why not have a row in the start called "correction type" with either "offset" or "absolute"?
[20:18:33] <alex_joni> Lerneaen_Hydra: that's one way
[20:18:59] <jmkasunich> that was my original thought
[20:19:32] <alex_joni> but it only adds to the configuration extent we already have
[20:19:49] <Lerneaen_Hydra> seems better than some type of second guessing (maybe some strange slide has values that would be regarded as the opposite type)
[20:19:57] <alex_joni> jmkasunich: I'd rather have another ini setting for this
[20:20:37] <SWPadnos> lh: the compensation values we're talking about are for each individual joint, so they should be very close to equal to the nominal values
[20:22:22] <Lerneaen_Hydra> yeah, but in my mind it seems like you'd get functionality that is of little to no actual use, but could cause issues in some case that I can't imagine currently, but *may* happen
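For comparison with guessing, an explicit "correction type" keyword could be parsed along these lines. The two keywords are the ones suggested in the chat. The enum, the function name, and the decision to reject everything else are assumptions for illustration.

```c
#include <string.h>

/* Hypothetical codes for the explicit format keyword. */
typedef enum {
    COMP_TYPE_INVALID = -1,  /* unrecognized keyword: caller should report it */
    COMP_TYPE_OFFSET,        /* values are corrections relative to nominal */
    COMP_TYPE_ABSOLUTE       /* values are measured actual positions */
} comp_type_t;

/* Map the keyword to a format code; never falls back to a default. */
comp_type_t parse_comp_type(const char *word)
{
    if (word == NULL)
        return COMP_TYPE_INVALID;
    if (strcmp(word, "offset") == 0)
        return COMP_TYPE_OFFSET;
    if (strcmp(word, "absolute") == 0)
        return COMP_TYPE_ABSOLUTE;
    return COMP_TYPE_INVALID;
}
```

The same function would work whether the keyword comes from the first row of the comp file or from an ini setting.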
[21:16:49] <skunkworks> jmkasunich: thanks again.. You're very generous with your knowledge.
[21:17:13] <alex_joni> skunkworks: that's only a small tip of his knowledge :D
[21:17:33] <skunkworks> I know. I know.
[21:18:32] <jmkasunich> skunkworks: no prob
[21:38:53] <alex_joni> yay jepler & emc : http://cia.navi.cx/
[22:47:50] <jmkasunich> jepler: when you do a "basic" test, you don't kid around ;-)
[22:49:13] <jepler> SAMPLER: ERROR: depth too large, max is 2666
[22:49:37] <jmkasunich> I somewhat arbitrarily limited the shmem to 64K
[22:49:55] <jmkasunich> streamer.h contains the #define, I really think there's no reason you can't make it bigger
[22:50:38] <jmkasunich> the max depth is mem_size / ((numchan+1)*4)
[22:50:43] <jmkasunich> the +1 is for the sample number
[22:50:57] <jepler> and 4 is sizeof(void*)?
[22:51:43] <jmkasunich> sizeof a sample (which is a union of float, unsigned, signed, etc)
[22:51:49] <jepler> hm
[22:52:08] <jepler> it turns out to be different on 64- and 32-bit machines
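A small sketch of the depth calculation, reading the formula as mem_size divided by the size of one record of (numchan + 1) samples. The function name is hypothetical, and the calculation ignores any bookkeeping the real shared-memory block may carry, so it is not expected to reproduce the exact limit in the error message.

```c
#include <stddef.h>

/*
 * Maximum number of records that fit in mem_size bytes.
 * Each record holds numchan samples plus one slot for the sample number.
 * sample_size is passed in rather than hard-coded as 4, because a sample
 * that contains an unsigned long is 8 bytes on typical 64-bit Linux.
 */
size_t max_depth(size_t mem_size, size_t numchan, size_t sample_size)
{
    if (sample_size == 0)
        return 0;  /* avoid dividing by zero on bad input */
    return mem_size / ((numchan + 1) * sample_size);
}
```

Calling it with `sizeof` of the actual sample type, instead of the literal 4, would keep the limit correct on both 32- and 64-bit builds.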
[22:52:30] <jmkasunich> I'm not surprised
[22:52:57] <jmkasunich> typedef volatile unsigned long hal_u32_t;
[22:53:02] <jmkasunich> thats in hal.h
[22:53:04] <jepler> yeah those should not be longs on my platform...
[22:53:42] <jmkasunich> if those were fixed, some of the problems would go away
[22:53:44] <jmkasunich> but not all
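One way to pin the widths down is to build the types on `<stdint.h>`, as in this hedged sketch. The `_sketch` names are hypothetical so they cannot be confused with the real hal.h, and whether `<stdint.h>` is usable in the realtime/kernel build is an open assumption. `_Static_assert` needs a C11 compiler.

```c
#include <stdint.h>

/* Fixed-width stand-ins for the HAL integer types (hypothetical names). */
typedef volatile uint32_t hal_u32_sketch_t;
typedef volatile int32_t  hal_s32_sketch_t;

/* A sample built only from 4-byte members (float is 4 bytes on the
 * platforms discussed here) has the same size on 32- and 64-bit machines. */
typedef union {
    float    f;
    uint32_t u;
    int32_t  s;
} sample_sketch_t;

/* Fail the build, rather than misbehave at run time, if a size is wrong. */
_Static_assert(sizeof(hal_u32_sketch_t) == 4, "u32 must be 4 bytes");
_Static_assert(sizeof(hal_s32_sketch_t) == 4, "s32 must be 4 bytes");
_Static_assert(sizeof(sample_sketch_t) == 4, "sample must be 4 bytes");
```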
[22:54:06] <jmkasunich> I'm afraid there are a lot of places where I assume sizeof(int) = sizeof(pointer)
[22:54:14] <jepler> I haven't found them yet
[22:54:44] <jmkasunich> or maybe its sizeof long = sizeof ptr
[22:54:57] <jmkasunich> which is still true
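Where code really does need to hold a pointer in an integer, `uintptr_t` from `<stdint.h>` avoids depending on either assumption. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Store a pointer in an integer wide enough to hold it. */
uintptr_t ptr_to_int(const void *p)
{
    return (uintptr_t)p;
}

/* Recover the pointer; round-tripping through uintptr_t preserves it. */
void *int_to_ptr(uintptr_t v)
{
    return (void *)v;
}
```

This works the same way whether `int`, `long`, or neither matches the pointer size on the target.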
[22:56:43] <jmkasunich> I wonder if declaring bit as a char is actually saving much of anything?
[22:56:44] <jepler> yeah, that's true on this system and on x86
[22:56:58] <jmkasunich> bit pins are still sizeof *
[22:57:14] <jmkasunich> bit params are char, and pack nicely, but thats about it
[22:57:36] <jmkasunich> all signals use 4 bytes regardless of type
[22:58:16] <jmkasunich> I really regret putting all those integer types into hal
[22:58:24] <jmkasunich> I think it should have bits, integers, and floats
[22:58:33] <alex_joni> jmkasunich: anything uses them?
[22:59:07] <jmkasunich> I'd have to grep to be sure
[22:59:12] <jmkasunich> I doubt it tho
[22:59:20] <jmkasunich> back later
[22:59:20] <alex_joni> so why not remove them?
[22:59:37] <alex_joni> same here.. 10h later :)
[23:00:33] <jepler> jmkasunich: why did you use the basic types (like 'unsigned long') instead of the hal types (like 'hal_u32_t') in shmem_data_t?
[23:00:57] <jepler> I guess I'll have to ponder the reason for that
[23:42:06] <jmkasunich> probably because I wasn't thinking
[23:42:25] <jmkasunich> although the shmem_data_t has nothing really to do with HAL
[23:42:36] <jmkasunich> at that point, its just a sample
[23:42:55] <jmkasunich> for instance, u8, u16, and u32 are all stored as unsigned long