#emc-devel | Logs for 2007-06-01

[00:08:07] <jepler> I'll let the farm chew on this change and if all is OK I'll do some other changes to finish getting rid of traces of the test and its define...
[00:08:29] <cradek> ok, and I'll retry
[00:11:23] <jepler> thanks
[00:12:24] <cradek> builds fine
[00:20:25] <jepler> darn, it couldn't be so easy :(
[00:32:22] <jepler> make: *** No rule to make target `emc/kinematics/mmxavg.h', needed by `depends/emc/usr_intf/usrmot.d'. Stop.
[00:32:25] <jepler> wow this is an old checkout
[00:37:30] <jepler> argh I hate trading one problem for another
[00:37:53] <jepler> but I think I'm going to take the coward's way and just skip hal_joystick on a platform where I can't get <linux/joystick.h> to work
[01:21:17] <jepler> hmph -- scripts/runtests has locked up my machine twice
[01:21:26] <jepler> once I thought maybe it was because I hit ctrl-c
[01:21:34] <cradek> your new kernel?
[01:21:45] <jepler> yes, it's the x86-64 machine
[01:21:59] <cradek> welcome to the problems I had with > 2.6.17
[01:22:18] <jepler> I guess I had the impression your problems were fairly repeatable
[01:22:27] <cradek> they were - every time I tried to run emc
[01:22:27] <jepler> this is "after running a few tests" kind of thing
[01:22:35] <cradek> well yuck
[01:23:00] <cradek> I could load and run some simple hal stuff, but if I tried to load emcmot, it hung
[01:28:10] <jepler> now that I'm running from the kernel with more kernel logging turned on, it is fine
[01:28:40] <jepler> except 2 other tests fail -- one of the interpreter tests (!) and threads.0
[01:30:01] <jepler> threads.0 should "count up" from 1 to 10 repeatedly (using one thread to increment and capture, and one thread to reset), but this machine manages to count up to 11 sometimes
[01:30:27] <SWPadnos> that's better - it goes to 11!
[01:30:33] <jepler> indeed
[01:33:44] <jepler> cradek: more ominously, tests/ccomp/mill-g90g91g92/ also fails with "Cutter gouging with cutter radius comp" just after COMMENT("interpreter: cutter radius compensation on left")
[01:34:00] <cradek> argh
[01:34:23] <cradek> wonder if the units in their tool files are wrong
[01:34:38] <jepler> I wonder what units a sai is
[01:34:50] <cradek> that's an interesting question
[01:35:33] <jepler> static CANON_UNITS _length_unit_type = CANON_UNITS_MM;
[01:36:16] <cradek> ok, I bet they have inch tool files and inch gcode, which would have worked previously, but not now
[01:36:20] <jepler> there's no G20 or G21 in ccomp/mill-g90g91g92/test.ngc
[01:37:18] <cradek> the gcode appears to be inch, and the tool table definitely is
[01:38:25] <jepler> I did finally get an oops on the console, but most of it scrolled away
[01:38:40] <jepler> I think that two traces scrolled by, actually
[01:38:44] <cradek> those line numbers in the sai output really pork up diff
[01:39:13] <jepler> yeah I was just thinking that...
[01:41:06] <cradek> putting G20 in mill-g9xxxxxx's ngc fixes it
[01:43:49] <cradek> (which is extremely comforting)
[01:54:56] <jepler> I wonder if the kernel tracebacks are right
[01:54:59] <jepler> this one is rather fishy
[01:55:59] <jepler> the outermost frames look plausible for the test: ... / rtai_lxrt:kthread_fun / rtapi:wrapper / hal_lib:thread / sampler:sample
[01:57:04] <jepler> then it's: error_sti / ipipe_handle_exception / handle_mm_fault / ...
[01:57:19] <jepler> but in the innermost frames it gets stuff like vt_console_print
[01:57:34] <jepler> several times
[01:57:39] <jepler> (as many as I can see lines)
[02:10:55] <dmwaters> {global notice} Good day all. In about an hour I'm going to begin maintenance on 3 main rotation servers that have been out of rotation for a while, but the impact will still be a bit rough. About 2200 users will be affected.
[03:13:15] <jepler> well I feel much better now -- the "bug" only happens when you "halcmd unload all" or maybe "realtime stop" without "halcmd stop" first
[03:13:51] <jepler> .. I think
[03:14:33] <cradek> bizarre
[03:18:07] <jepler> [ 239.516904] In recent history there were
[03:18:07] <jepler> [ 239.516905] 2012601, 2193, 2009124, 2009970, and 2008257
[03:18:13] <jepler> I've gotten more of these, though
[03:18:38] <jepler> [ 273.957592] 2009144, 2009104, 2038, 2010115, and 2008117
[03:18:58] <jepler> [ 298.764215] 2248, 2006094, 2007888, 2009094, and 2009124
[03:19:33] <cradek> I wonder if something is wrong with the code
[03:19:49] <cradek> I don't see why they are detected when they are
[03:20:02] <SWPadnos> those are latencies, right (or are they actual TSC differences from thread run to thread run)
[03:20:04] <jepler> [ 298.764218] This time, there were 2010503 which is so anomalously
[03:20:13] <cradek> did you know you can scope those motion periods
[03:20:33] <cradek> and even though only the first glitch generates a message, the glitches are counted
[03:20:54] <jepler> the low number drags the average down far enough that a "fairly normal" interval is over the threshold
[03:21:28] <cradek> it's not an average
[03:21:31] <cradek> (iirc)
[03:21:55] <SWPadnos> even if it is, that should happen as soon as the smaller interval is averaged in (but not the most recent sample)
[03:22:13] <cradek> this_run > 1.2 * cycles[i]
[03:22:40] <cradek> yeah I can't explain those results considering the test
[03:23:23] <SWPadnos> unless there's some int vs long vs timespec thing going on
[03:23:42] <cradek> no timespecs - this is clocks
[03:23:48] <SWPadnos> unless there's some int vs long thing going on
[03:24:14] <cradek> long long int now = rtapi_get_clocks();
[03:24:14] <cradek> long int this_run = (long int)(now - last);
[03:24:41] <SWPadnos> and rtapi_get_clocks() is basically just rdtsc, I think
[03:24:44] <jmkasunich> 'last' is also long long, right?
[03:24:49] <cradek> jmkasunich: yes
[03:25:58] <cradek> jepler: is it possible these are the first 5 samples? that would explain it if so
[03:27:23] <jepler> cradek: I don't think so; it might be a second or so until the window shows
[03:27:26] <jepler> I also see the number climbing in halscope
[03:27:32] <jepler> it's 50 now
[03:27:36] <jepler> it seems to jump by about 5 each time it increases
[03:28:04] <jmkasunich> does that correlate with anything?
[03:28:07] <jmkasunich> mouse movement
[03:28:14] <jmkasunich> the passing of 64 seconds
[03:28:17] <jmkasunich> etc
[03:28:53] <jmkasunich> I think one bad reading would inc the count by 5
[03:29:06] <jmkasunich> because it will remain in the buffer for 5 runs
[03:29:07] <cradek> I agree
[03:29:25] <cradek> last_period or last_period_ns would be nice to see
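A minimal sketch of the overrun check being discussed, assuming a five-entry history of per-cycle clock deltas and the "this_run > 1.2 * cycles[i]" test cradek quoted; the names and structure are illustrative, not the actual motion-controller source:

    #define NHISTORY 5
    static long long last = 0;
    static long int cycles[NHISTORY];   /* recent per-cycle clock deltas */
    static int overruns = 0;

    void check_period(void) {
        long long now = rtapi_get_clocks();
        long int this_run = (long int)(now - last);
        int i;
        if (last != 0) {
            /* compare against each recent sample; one anomalously short
               entry stays in the buffer for five runs, so a single glitch
               can bump the count by about 5, as jmkasunich suggests */
            for (i = 0; i < NHISTORY; i++) {
                if (this_run > 1.2 * cycles[i]) { overruns++; break; }
            }
            /* shift the new sample into the history */
            for (i = NHISTORY - 1; i > 0; i--) cycles[i] = cycles[i - 1];
            cycles[0] = this_run;
        }
        last = now;
    }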
[03:29:31] <jepler> http://emergent.unpy.net/index.cgi-files/sandbox/overruns.png
[03:30:08] <cradek> huh
[03:30:16] <cradek> except for when it's not, that's a nice flat line
[03:30:28] <jmkasunich> do you have a base thread?
[03:30:30] <cradek> wonder if it's your kernel or your hardware
[03:32:43] <cradek> jmkasunich: are you using the smp rtai kernel I built?
[03:32:50] <jmkasunich> no
[03:32:58] <jmkasunich> I used it very briefly on the new machine
[03:33:00] <cradek> ok
[03:35:21] <jepler> http://emergent.unpy.net/index.cgi-files/sandbox/overruns2.png
[03:35:30] <jepler> I'm going to go for it being a regular effect
[03:35:37] <jepler> and then I'm going to go to bed .. bye guys
[03:36:05] <jmkasunich> something that happens every 12.5 seconds or so
[03:36:27] <cradek> goodnight
[03:36:30] <jmkasunich> night
[03:36:37] <SWPadnos> see you
[03:36:58] <cradek> The fastest 100 m hurdlers run the distance in a time of around 12.5 seconds.
[03:37:06] <cradek> ok that's probably not it
[03:37:42] <jmkasunich> the machine executes about 25 billion clocks in that time - not an "interesting" number
[03:37:48] <jmkasunich> (like 32 bit overflow or such)
[03:38:13] <SWPadnos> assuming that CPU throttling is disabled
[03:38:35] <SWPadnos> I didn't look at the kernel config he used
[03:38:47] <jmkasunich> that can be checked by doing "while true ; do echo "foo" >/dev/null ; done"
[03:39:00] <jmkasunich> keeps the cpu busy
[03:39:04] <SWPadnos> yep
[03:40:08] <SWPadnos> incidentally, is that a local halscope upgrade, or does everyone have the scale along with each trace now?
[03:40:37] <cradek> that must be just him
[03:40:51] <SWPadnos> ah - nope, it's in TRUNK
[03:40:52] <cradek> I don't think I've seen it (can't check now)
[03:40:55] <cradek> ah
[03:41:04] <SWPadnos> as of 5/17/2007
[03:42:02] <cradek> goodnight here too
[03:42:08] <SWPadnos> see you
[03:42:46] <SWPadnos> in fact, I think I'll make it three. good night
[11:38:56] <jepler> cpu MHz: 2009.169
[11:39:28] <jepler> [ 691.300915] RTAI[sched]: Linux timer freq = 250 (Hz), CPU freq = 2009169000 hz.
[11:39:37] <jepler> [ 691.554684] HAL: creating thread servo-thread, 1000000 nsec
[11:39:49] <jepler> [ 691.300697] RTAI[sched]: hard timer type/freq = APIC/12557250(Hz); default timing mode is periodic; linear ordering of timed lists.
[11:40:33] <jepler> how is it able to give me the exact period I requested?
[11:42:09] <cradek> [ 541.462612] HAL: creating thread servo-thread, 500000 nsec
[11:42:26] <cradek> mine says that too, but I don't think I get that value exactly
[11:42:36] <jepler> oh -- I thought the value it reported was the value obtained, not the one requested
[11:42:49] <jepler> Realtime Threads:
[11:42:50] <jepler> Period FP Name ( Time, Max-Time )
[11:42:50] <jepler> 999980 YES servo-thread ( 8479, 29915 )
[11:43:02] <cradek> aha
[11:43:57] <cradek> 999849 YES servo-thread ( 18519, 31108 )
[11:48:54] <jepler> 6 u32 RW 00002EEA motion.servo.overruns
[11:48:57] <jepler> (after running all night)
[12:00:44] <jepler> http://emergent.unpy.net/index.cgi-files/sandbox/overruns3.png
[12:01:00] <jepler> (ignore the last sample, that's on roll mode)
[12:01:15] <jepler> with two threads, the fast one is disturbed at regular intervals and the slow one is "just fine"
[12:02:42] <jepler> http://emergent.unpy.net/index.cgi-files/sandbox/overruns4.png
[12:08:41] <jepler> 1ms -> glitch approx every 12500 periods. 50uS -> glitch approx every 1250 periods. 20uS -> glitch approx every 125 periods
[12:21:35] <jepler> oh well
[12:21:39] <jepler> * jepler goes to work
[12:35:20] <jepler> * jepler arrives at work
[13:23:31] <SWPadnos> * SWPadnos lives at work
[13:23:48] <skunkworks> * skunkworks doesn't want to talk about it.
[13:46:33] <jepler> I can't get the weird TSC differences in an rtai-only program (hacked version of 'kern/latency')
[13:47:49] <cradek> jepler: is there any other timing information available, even if it's slow to get, that you could record in that check inside motion?
[13:48:19] <SWPadnos> there's a get time function, which takes microseconds
[13:48:27] <cradek> you could just log a message, and use the kernel timestamps
[13:48:42] <jepler> rtapi_get_time returns ns
[13:49:02] <SWPadnos> there's a kernel time routine, but it takes a long time to execute
[13:49:28] <SWPadnos> I think there's a comment in the rtapi header/code that has the kernel function name
[13:49:54] <jepler> yeah
[13:52:35] <jepler> [38231.201033] In recent history there were
[13:52:35] <jepler> [38231.201034] 999391, 999963, 933, 1000092, and 1000452
[13:52:35] <jepler> [38231.201035] elapsed clocks between calls to the motion controller.
[13:52:36] <jepler> [38231.201036] This time, there were 999056
[13:53:06] <jepler> this is using rtapi_get_time to get ns -- but it may just boil down to rdtsc as well
[13:53:10] <jepler> (I mean, inside rtai)
[13:53:16] <SWPadnos> does the RTAI latency test use rdtsc?
[13:53:25] <jepler> the original doesn't, but my modified version does
[13:56:24] <jepler> http://emergent.unpy.net/index.cgi-files/sandbox/latency-module.c
[13:57:59] <SWPadnos> rtapi_get_time should be returning clocks, not ns
[13:58:56] <jepler> DESCRIPTION
[13:58:58] <jepler> rtapi_get_time returns the current time in nanoseconds.
[13:59:06] <jepler> rtapi_get_clocks returns the current time in CPU clocks.
[13:59:12] <SWPadnos> ah - ok
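A minimal sketch of the distinction, based on the documentation just quoted; the expected values assume the 2009.169 MHz CPU clock jepler pasted earlier:

    /* rtapi_get_time() counts nanoseconds; rtapi_get_clocks() counts
       CPU clocks (essentially the TSC on x86).  Both return long long. */
    long long t0 = rtapi_get_time();
    long long c0 = rtapi_get_clocks();
    /* ... one nominal 1 ms servo period later ... */
    long long dt_ns  = rtapi_get_time()   - t0;  /* ~1000000 ns     */
    long long dt_clk = rtapi_get_clocks() - c0;  /* ~2009169 clocks */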
[14:02:26] <rayh> I was trying some stuff with a week old trunk and couldn't do arcs. "isnan error..."
[14:02:38] <cradek> eek
[14:02:46] <rayh> I'll cvs up and try again.
[14:03:06] <rayh> Morning Chris
[14:03:08] <cradek> are you using cutter comp?
[14:03:12] <cradek> hi ray
[14:03:17] <rayh> no
[14:03:38] <cradek> wow, that's really strange then
[14:03:51] <SWPadnos> hi Ray
[14:04:01] <SWPadnos> do you have the gcode that NAN'ed you?
[14:04:05] <rayh> steven.
[14:04:35] <rayh> g2 x2 i1 j0
[14:04:47] <rayh> from x0
[14:05:07] <cradek> are all arcs broken for you?
[14:05:17] <jepler> do you have the full error? A grep of the source code suggests it probably says "isnan error in (function name here)" which would sure narrow the search.
[14:06:25] <cradek> that arc works for me
[14:07:00] <rayh> They were all broken. It just ignored the arc commands
[14:07:55] <rayh> It may have been a fluke here. I think I lost the feedback since I'm compiling in the same terminal.
[14:08:03] <cradek> something's sure funny then.
[14:08:35] <rayh> seem to remember arctan or some such.
[14:10:50] <rayh> works fine with the compile from today's source.
[14:11:15] <rayh> I ran a make clean and see that it goes through configure.
[14:11:33] <rayh> Does it preserve the --enable stuff from previous?
[14:11:39] <SWPadnos> I had noticed that before
[14:11:51] <cradek> rayh: yes
[14:12:04] <jepler> when make notices that 'configure' is newer than 'Makefile.inc' it re-runs configure in a way that preserves the flags you used before.
[14:12:22] <jepler> (or something like that)
[14:12:24] <SWPadnos> it seems counterintuitive that the re-configure is done on a make clean though
[14:12:34] <SWPadnos> not a big enough problem to complain about separately though :)
[14:13:22] <rayh> I was a bit surprised is all.
[14:13:35] <jepler> if [ -f config.status ]; then ./config.status --recheck; else ./configure --enable-run-in-place; fi
[14:13:41] <rayh> I think "old dog" comes into play here.
[14:13:47] <jepler> this is what gets run when make decides configure information is out of date ^^^
[14:13:50] <jepler> "--recheck" preserves your old choices
[14:14:15] <jepler> config.status should always exist after you've run configure
[14:14:40] <rayh> And that holds the old flags?
[14:15:14] <rayh> Arcs seem to work as promised. Sorry to trouble you guys.
[14:15:14] <jepler> right
[14:15:31] <jepler> let us know if you see any more problems
[14:16:07] <rayh> what's all the clarkexx stuff?
[14:16:28] <SWPadnos> jmk added clarke transform HAL components
[14:16:35] <SWPadnos> if that's what you're asking about
[14:16:41] <cradek> what's a clarke transform?
[14:16:48] <jepler> something to do with 3-phase motors
[14:16:51] <jepler> I think
[14:16:59] <SWPadnos> that's what I gathered from the comments ;)
[14:17:28] <cradek> ok then
[14:17:49] <cradek> if clarke sends me any motors I'll be prepared
[14:17:53] <jepler> rayh: swpadnos: I fixed the "may re-run configure during clean" behavior, let me know if you see that again
[14:17:59] <SWPadnos> hey - I can download IEEE papers now - I joined a couple of months ago
[14:18:05] <SWPadnos> w00t!
[14:18:13] <rayh> Thanks jepler
[14:18:14] <SWPadnos> ok, thanks
[14:18:41] <jepler> configure may still be run at the subsequent "make"
[14:18:57] <rayh> Wow. Guess I'll have to talk with jmk about how to use em.
[14:19:26] <SWPadnos> http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V30-416C0S9-P&_user=10&_coverDate=11%2F01%2F2000&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=cc99866c171eb0fdadd5fddad6a5c059
[14:19:35] <rayh> Automating config is a good thing. I was just curious how it did it.
[14:19:36] <SWPadnos> I don't know if you want to ask ;)
[14:20:58] <SWPadnos> aha: http://www.electricmotors.machinedesign.com/ArticleDraw.aspx?artid=60817
[14:21:02] <SWPadnos> that may be applicable to EMC
[14:21:12] <SWPadnos> with clarke transforms
[14:22:26] <jepler> rayh: just like everything else, 'make' uses file timestamps
[14:22:52] <jepler> rayh: in this case, make has rules which state that Makefile.inc is based on Makefile.inc.in and config.status. config.status is based on configure
[14:23:26] <jepler> so when the timestamp on configure is updated (usually from a "cvs up" or when you are hacking on configure yourself), Make looks at the timestamps and sees that config.status must be rebuilt.
[14:23:41] <jepler> the rule to do that is "... ./config.status --recheck ..." which re-runs configure with the same flags you gave the last time
[14:23:46] <jepler> then it rebuilds Makefile.inc from Makefile.inc.in
[14:24:15] <jepler> so there should rarely be a need to re-run configure, except for changing configure flags
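A minimal sketch of the dependency chain jepler describes, written as GNU make rules (abbreviated; not the literal emc2 Makefile):

    # configure newer than config.status => re-run it with the prior flags
    config.status: configure
            ./config.status --recheck

    # Makefile.inc is regenerated from its template whenever the template
    # or the configure results change
    Makefile.inc: Makefile.inc.in config.status
            ./config.status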
[14:25:16] <jepler> as an aside, I find that "./configure --enable-run-in-place --with-realtime=/usr/realtime-2.6.15-magma; make; ./configure --enable-run-in-place --enable-simulator; make" gives me a working 'sim' system even though I never manually cleaned
[14:26:27] <rayh> Okay.
[14:26:36] <rayh> fascinating
[14:28:41] <SWPadnos> clarke?
[14:29:40] <rayh> Oh him.
[14:29:46] <SWPadnos> heh
[14:30:13] <rayh> Don't think I can afford the pdf. Thanks for the link to the abstract.
[14:30:27] <SWPadnos> try the second one - there's actually info there
[14:30:34] <rayh> k
[14:32:12] <rayh> You must be a member?
[14:35:22] <rayh> Nice descriptions in the clarke src files.
[16:16:54] <skunkworks> well crap - jmk probably hasn't been reading.
[16:17:00] <jepler> no probably not
[16:19:41] <skunkworks> Those latency numbers seem really high?
[16:19:59] <jepler> that line I pasted is not the normal latency test
[16:20:05] <jepler> it's # CPU cycles, not #ns
[16:20:05] <skunkworks> ah
[16:20:07] <skunkworks> ol
[16:20:08] <skunkworks> ok
[16:20:24] <jepler> and it's time between runs, not time compared to expected time
[16:20:43] <cradek> jepler: taking that machine to fest?
[16:21:03] <jepler> I sure could
[16:21:55] <cradek> jepler: I had the keyboard glitch when typing in emacs - it entered about 4000 of the character I typed
[16:22:12] <cradek> I rebooted to 15-magma, we'll see if I get it again
[16:24:06] <jepler> besides the glitch, this machine has pretty inspiring RT performance. Admittedly it's only running one very simple item on the fast thread, but it's working with period=3981ns (requested 4000) and, excluding the glitches down to near 0 cycles, very little variation in time from run to run
[16:24:48] <skunkworks> Nice - so newer computers will still be good candidates. So far :)
[16:24:51] <jepler> well I take it back -- looking at more samples, I am seeing greater variability
[16:27:40] <skunkworks> my workstation here does em64t - I wonder if it would work with that kernel.
[16:27:46] <jepler> http://emergent.unpy.net/files/sandbox/3981ns.png
[16:29:13] <cradek> ouch, those look pretty awful
[16:29:45] <jepler> the weird glitches are perfectly regular
[16:29:48] <cradek> +- 30-40%
[16:29:54] <cradek> yep
[16:30:13] <cradek> I bet you will figure those out, but the others look more random
[16:31:20] <jepler> the scale on the red graph is 5K cycles per div, or 2500ns
[16:35:45] <jepler> http://emergent.unpy.net/files/sandbox/19988ns.png
[16:36:39] <SWPadnos> do they all seem to drop to the same delay? (ie, around 2000 counts)
[16:37:09] <SWPadnos> hmmm - no. one was 933
[16:37:26] <jepler> "933" was from emc showing ns
[16:37:41] <jepler> the others show cycles
[16:37:58] <jepler> my impression is that the minimum number is related to the typical execution time of the thread
[16:38:25] <jepler> e.g., it drops to about 300 cycles when I am only running timedelta
[16:39:20] <SWPadnos> this is a single-core CPU? (I know the kernel is SMP)
[16:39:21] <jepler> but maybe that's wrong -- 2000 cycles is a lot fewer than the number I had from 'show thread' for the servo thread
[16:39:27] <jepler> no no
[16:39:30] <SWPadnos> err - s/SMP/UP/
[16:39:35] <jepler> then yes
[16:39:37] <SWPadnos> or s/is/sin't/ :)
[16:39:45] <jepler> it's an AMD X2, but with a UP kernel
[16:39:53] <SWPadnos> ok
[16:40:00] <SWPadnos> I wonder how that may affect the TSC
[16:40:52] <SWPadnos> hmmm. maybe I should set up my server for EMC development and try building this kernel
[16:40:53] <jepler> funky (e.g., non-monotonic) TSC is not a top suspect
[16:41:10] <SWPadnos> the TSC on AMD X2 isn't monotonic when you can ping-pong between cores (IIRC)
[16:41:13] <jepler> 1. I still get the effect in emc when using the rtai "get time in ns" function call
[16:41:28] <jepler> 2. I do not get the effect in rtai alone, whether using rdtscll or "get time in ns" rtai function
[16:41:42] <SWPadnos> well, that would point me toward emc/rtapi then
[16:43:16] <jepler> me too, but what?
[16:44:36] <SWPadnos> is the timing setup code for the RTAI tests the same as in RTAPI?
[16:45:36] <SWPadnos> hmmm. the error is faster when the thread has a shorter period, right?
[16:45:59] <jepler> it seems to make similar rtai calls but I haven't compared them extensively
[16:46:07] <jepler> 07:08:40 <jepler> 1ms -> glitch approx every 12500 periods. 50uS -> glitch approx every 1250 periods. 20uS -> glitch approx every 125 periods
[16:46:09] <SWPadnos> but it's more or less always the same amount of time - like 2 uS (or dependent on the thread execution time, as you said)
[16:46:54] <SWPadnos> can you set up a very fast thread with a single function that just counts elapsed time?
[16:47:24] <SWPadnos> I'm thinking that the glitch is caused by the interrupt not being correctly cleared, so it gets retriggered as soon as the thread finishes
[16:47:33] <jepler> that's what timedelta does, except its output is the difference and not the accumulated time
[16:48:01] <jepler> http://pastebin.ca/528642
[16:48:19] <SWPadnos> actually, I want two outputs from this simple component: 1) the accumulated time, and 2) a count of the number of executions of the thread
[16:48:37] <SWPadnos> I'm betting that you have a higher count than expected for the elapsed time
[16:50:30] <jepler> so you want me to find out if over a long term, my 10us thread really gets called 100000 times per second
[16:50:36] <SWPadnos> right
[16:52:24] <SWPadnos> the error in timing due to roundoff would only give a few counts of error, but the 20 uS thread getting a problem every 125 executions would give an extra 400 or so over one second
[16:58:02] <jepler> http://emergent.unpy.net/files/sandbox/timeerr.png
[16:58:22] <SWPadnos> 5G/div O_O
[16:58:42] <jepler> if(last != 0) {
[16:58:42] <jepler> err = err + (now - last) - period;
[16:58:42] <jepler> }
[16:58:45] <SWPadnos> that's the actual count now?
[16:59:08] <jepler> er maybe that plot doesn't show what I think it does
[16:59:07] <jepler> let me try again
[16:59:48] <cradek> s
[16:59:48] <cradek> s
[17:00:00] <SWPadnos> does comp do the right thing with something like this: variable __s64 first=0ll;
[17:00:30] <SWPadnos> or, is there a way to initialize vars / are they always set to 0?
[17:00:49] <jepler> I think that comp memset()s the whole HAL shared memory area which would include the variables
[17:00:54] <SWPadnos> ok
[17:09:03] <SWPadnos> this may do what I want: http://pastebin.ca/528672
[17:09:28] <SWPadnos> if I did it right, that should give a "thread executions per second" output
[17:09:41] <SWPadnos> updated once per second
[17:10:25] <SWPadnos> of course, it'll need to catch up to "now". hmmm
[17:11:49] <SWPadnos> I guess at ~100 KHz, it'll catch up pretty quickly
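A minimal sketch of the counter SWPadnos describes, in comp syntax; the component and pin names here are made up, and his pastebin above is the authoritative version. Per jepler's note, comp variables live in HAL shared memory that is memset() to zero, so no explicit initialization is needed:

    component execrate "count thread executions per second of wall-clock time";
    pin out s32 rate "executions counted during the previous second";
    variable __s64 start_ns;
    variable int execs;
    function _ nofp;
    ;;
    FUNCTION(_) {
        long long now = rtapi_get_time();   /* wall-clock ns, per the docs above */
        if (start_ns == 0) start_ns = now;  /* first invocation */
        execs++;
        if (now - start_ns >= 1000000000LL) {
            rate = execs;   /* a healthy 10us thread should read ~100000;
                               extra glitch runs would push it higher */
            execs = 0;
            start_ns = now;
        }
    }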
[18:29:07] <skunkworks> woo hoo. My goto mount for my telescope is on its way.
[18:30:29] <skunkworks> that was the wrong window
[18:54:18] <jepler> SWPadnos: I didn't try your code yet
[18:54:44] <jepler> I fixed up my code which was along the lines of err = err + (now - last) - period;
[18:55:18] <jepler> on a believed-to-be-good 32-bit machine, there is a slope to 'err', but it is small: it represents 2us of error over 6 seconds, or .3ppm
[18:56:39] <jepler> on the 64-bit machine, it is 25000us per 6 seconds
[18:56:39] <SWPadnos> ok, that's a lot
[18:57:02] <SWPadnos> I wonder if I can apt-get build-dep emc2 on a feisty system
[18:57:32] <SWPadnos> nope
[18:58:35] <jepler> the slope on both machines is negative (err gets smaller over time)
[18:58:56] <jepler> which means that more calls are taking place than is desired
[18:59:00] <jepler> (right?)
[18:59:04] <SWPadnos> unfortunately, build-dep needs linux-image-2.6.15-magma
[18:59:27] <SWPadnos> god question
[18:59:29] <SWPadnos> good
[18:59:50] <jepler> try starting with this package list: http://wiki.linuxcnc.org/cgi-bin/emcinfo.pl?Installing_EMC2#Simulator_on_Ubuntu_6_10_Edgy_Eft
[19:04:22] <jepler> on this plot, the too-quick periods come about once every 4.6ms, so in 6 seconds there are about 1300 of them. The fast thread is "52.8kHz" according to halscope. 1300 / 52.8kHz = 24621us
[19:05:15] <jepler> so it seems like the high slope is related to the too-quick periods
[19:51:28] <SWPadnos> jepler, are python-numarray and/or python-imaging needed with python 2.5?
[19:52:29] <SWPadnos> ok, looks like they aren't
[19:52:43] <jepler> image-to-gcode needs those
[19:53:08] <SWPadnos> there aren't corresponding python2.5-* packages
[19:53:29] <jepler> that's too bad
[19:53:39] <jepler> I wonder if they renamed them or dropped them?
[19:53:45] <SWPadnos> at least, the imaging one is version-specific, and I didn't see a python2.5-imaging package
[19:53:55] <jepler> bastards
[19:53:55] <SWPadnos> let me look again
[19:55:17] <SWPadnos> ok, there's a non-versioned version :)
[19:55:22] <SWPadnos> just python-imaging
[20:03:06] <jepler> SWPadnos: you added 'offset' to pwmgen, right?
[20:03:10] <jepler> I wonder if you have it backwards
[20:03:29] <jepler> it looks like you subtract offset rather than adding it
[20:03:34] <jepler> the hal general reference says: If enable is true, read scale, offset and value and output to the adc (scale * value) + offset. If enable is false, then output 0.
[20:05:06] <jepler> of course pwmgen is also using scale_recip instead of scale -- but it sure feels more natural to say "what's the full scale value" than the inverse
[20:06:14] <SWPadnos> it probably is backwards then
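For reference, a minimal sketch of the convention from the HAL general reference jepler quoted (variable names illustrative, not pwmgen's actual source):

    /* per the HAL reference: output (scale * value) + offset when
       enabled, 0 when disabled -- i.e. offset is added, not subtracted */
    double dc;
    if (enable)
        dc = scale * value + offset;
    else
        dc = 0.0;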
[20:39:32] <SWPLinux> so, I get the following error during build:
[20:39:34] <SWPLinux> msgfmt -o po/de_rs274_err.mo po/de_rs274_err.po
[20:39:35] <SWPLinux> make: msgfmt: Command not found
[20:39:35] <SWPLinux> make: *** [po/de_rs274_err.mo] Error 127
[20:40:53] <SWPLinux> I'm not sure if configure should have noticed that I don't have msgfmt, or if I should install <some package>
[20:42:51] <cradek> gettext: /usr/bin/msgfmt
[20:43:12] <jepler> the answer is probably "both"
[20:44:21] <SWPLinux> ok. I can install gettext
[20:45:55] <jepler> sounds like the easy route to me
[20:46:46] <SWPLinux> I just installed gettext (already had gettext-base), and this is what configure says (using configure ... | grep no):
[20:46:58] <SWPLinux> Readline does not require -ltermcap
[20:47:00] <SWPLinux> checking for gettext in -lintl... no
[20:47:56] <jepler> what I meant was, in an ideal world there would be a configure check for "is the msgfmt command available"
[20:48:00] <SWPLinux> it did seem to find msgfmt though, so that's not a problem, I guess
[20:48:04] <SWPLinux> I understand
[20:48:07] <jepler> there's not currently
[20:48:20] <SWPLinux> the check is there already, but it doesn't seem to do the right thing
[20:48:40] <SWPLinux> checking for libintl.h... yes
[20:48:41] <SWPLinux> checking for gettext in -lintl... no
[20:48:43] <SWPLinux> checking for xgettext... /usr/bin/xgettext
[20:48:44] <SWPLinux> checking for xgettext tcl bug... xgettext version 0.16.1 OK
[20:48:45] <jepler> "checking for gettext in -lintl... no" is checking for the function gettext(), not the msgfmt command
[20:48:46] <SWPLinux> checking for msgfmt... /usr/bin/msgfmt
[20:48:47] <SWPLinux> checking for --tcl support in msgfmt... yes
[20:49:01] <SWPLinux> wow - that's an awfully slow pacing
[20:49:32] <jepler> er, OK
[20:49:46] <jepler> well I leave it to the person who cares to make the build system do the right thing when msgfmt is not found
[20:49:52] <SWPLinux> sorry - the multiline paste is very slow
[20:51:38] <SWPLinux> ok, new error: :)
[20:51:56] <SWPLinux> Compiling emc/usr_intf/axis/extensions/_toglmodule.c
[20:51:59] <SWPLinux> hal/halmodule.cc:28: error: conflicting declaration ‘typedef long unsigned int Py_ssize_t’
[20:52:01] <SWPLinux> /usr/include/python2.5/pyport.h:97: error: ‘Py_ssize_t’ has a previous declaration as ‘typedef ssize_t Py_ssize_t’
[20:52:02] <SWPLinux> hal/halmodule.cc:472: error: invalid conversion from ‘int (*)(PyObject*)’ to ‘Py_ssize_t (*)(PyObject*)’
[20:52:02] <SWPLinux> hal/halmodule.cc:608: error: invalid conversion from ‘int (*)(PyObject*, int, void**)’ to ‘Py_ssize_t (*)(PyObject*, Py_ssize_t, void**)’
[20:52:05] <SWPLinux> hal/halmodule.cc:608: error: invalid conversion from ‘int (*)(PyObject*, int, void**)’ to ‘Py_ssize_t (*)(PyObject*, Py_ssize_t, void**)’
[20:52:05] <SWPLinux> hal/halmodule.cc:608: error: invalid conversion from ‘int (*)(PyObject*, int*)’ to ‘Py_ssize_t (*)(PyObject*, Py_ssize_t*)’
[20:52:07] <SWPLinux> hal/halmodule.cc:608: error: invalid conversion from ‘int (*)(PyObject*, int, char**)’ to ‘Py_ssize_t (*)(PyObject*, Py_ssize_t, char**)’
[20:52:21] <cradek> SWPLinux: paste that a few more times please
[20:52:29] <SWPLinux> heh - sorry
[20:52:38] <SWPLinux> I suppose I could use pastebin or something
[20:52:42] <cradek> this is python2.5 incompatibility iirc
[20:53:05] <SWPLinux> ok, so I should configure with-python=...2.4...
[20:53:15] <cradek> or try to figure out how to fix it...
[20:53:35] <SWPLinux> well, not being a python guru (or even an acolyte), I'm not sure I can do that at the moment
[20:55:16] <jepler> SWPadnos: try giving me a ssh login
[20:55:20] <jepler> that's nearly sure to fix it
[20:55:26] <SWPadnos> heh
[20:55:44] <cradek> the right preprocessor turds around line 28 will surely fix it
[20:55:50] <jepler> Compatibility with previous Python versions can be achieved with the test:
[20:55:50] <jepler> #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
[20:55:50] <jepler> typedef int Py_ssize_t;
[20:55:50] <jepler> #define PY_SSIZE_T_MAX INT_MAX
[20:55:50] <jepler> #define PY_SSIZE_T_MIN INT_MIN
[20:55:53] <jepler> #endif
[20:56:04] <jepler> try this ^^^, it's suggested by the python developers
[20:56:07] <cradek> ^^ turds
[20:57:22] <SWPLinux> ok, I'll see if I can insert the turds before leaving at 5:00
[20:57:35] <jepler> (with that change, halmodule still compiles on ubuntu 6.06 / i386)
[20:58:35] <jepler> if it works on your system, either let me know to check it in, or check it in yourself
[21:01:47] <SWPLinux> nope. the conflicting declaration error is gone, but the invalid conversions are still there
[21:02:09] <SWPLinux> though they're in emcmodule now ...
[21:02:16] <SWPLinux> one sec
[21:03:25] <SWPLinux> oops - the dangers of "make -jn" - the problem is still in halmodule
[21:03:34] <SWPLinux> phew
[21:03:44] <jepler> this is a 64-bit feisty system?
[21:03:48] <SWPLinux> yep
[21:06:03] <jepler> -static int pyhal_len(halobject *self) {
[21:06:03] <jepler> +static Py_ssize_t pyhal_len(halobject *self) {
[21:06:05] <jepler> -static int shm_buffer(shmobject *self, Py_ssize_t segment, void **ptrptr){
[21:06:05] <jepler> +static Py_ssize_t shm_buffer(shmobject *self, Py_ssize_t segment, void **ptrptr){
[21:06:07] <jepler> -static int shm_segcount(shmobject *self, Py_ssize_t *lenp) {
[21:06:09] <jepler> +static Py_ssize_t shm_segcount(shmobject *self, Py_ssize_t *lenp) {
[21:06:12] <jepler> try making these changes too ^^
[21:06:36] <jepler> or leave if you need to -- there is the whole time from now until the end of the universe to fix this :-P
[21:06:40] <SWPLinux> heh
[21:06:46] <SWPLinux> oops - 5:06 already
[21:10:25] <SWPLinux> it's in the PyBufferProcs and PyMappingMethods structs now
[21:10:31] <SWPLinux> gotta run. see you later
[21:11:15] <jepler> [64545.860106] RTAPI: ERROR: rt_task_wait_period() returned 65532.
[21:11:17] <jepler> [64545.860108] RTAPI: (this message is printed only once per session)
[21:12:05] <cradek> what's that?
[21:12:18] <jepler> a message I just added
[21:12:33] <jepler> the latency test checks the return value of rt_task_wait_period to detect overruns; emc doesn't test it
[21:12:35] <jepler> or didn't
[21:12:40] <jepler> now I'm trying to figure out what 65532 means
[21:13:46] <cradek> the manpage I found shows it returns void...
[21:15:35] <jepler> int rt_task_wait_period(void)
[21:15:39] <jepler> *
[21:15:40] <jepler> * @return 0 if the period expires as expected. An abnormal termination
[21:15:40] <jepler> * returns as described below:
[21:15:40] <jepler> * - @b RTE_UNBLKD: the task was unblocked while sleeping;
[21:15:40] <jepler> * - @b RTE_TMROVRN: an immediate return was taken because the next period
[21:15:43] <jepler> * has already expired.
[21:16:38] <jepler> #define RT_TMROVRN (1) // keep this the lowest, must be 1
[21:16:47] <jepler> #define RTE_BASE (0xFFFB)
[21:16:57] <jepler> #define RTE_UNBLKD (RTE_BASE + RT_UNBLKD)
[21:16:58] <jepler> >>> hex(65532)
[21:16:58] <jepler> '0xfffc'
[21:17:42] <cradek> that's a nice find
[21:19:51] <jepler> er
[21:20:07] <jepler> so I pasted the wrong line, the error is RTE_TMROVRN
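(Decoding with the defines pasted above: RTE_TMROVRN = RTE_BASE + RT_TMROVRN = 0xFFFB + 1 = 0xFFFC = 65532, which matches the printed return value.)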
[21:20:39] <cradek> yeah I figured
[21:20:58] <cradek> so it thinks there was an overrun, so it runs the thread again right away
[21:21:13] <cradek> but it seems there wasn't really an overrun
[21:33:19] <jepler> [65543.685220] RTAPI: clock_set_period requested: 19000 actual: 18953 counts: 238
[21:33:20] <jepler> [65543.685354] RTAPI: start_task id: 01
[21:33:25] <jepler> [65543.685355] RTAPI: period_nsec: 18953
[21:33:28] <jepler> [65543.685356] RTAPI: count: 237
[21:41:55] <jepler> stupid rtapi
[21:42:16] <jepler> [66347.068529] RTAPI: clock_set_period requested: 19000 actual: 18953 counts requested: 238 actual: 238
[21:42:17] <jepler> [66347.068531] RTAPI: round-trip counts: 237
[21:42:28] <jepler> you can't round-trip nano2count and count2nano
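A minimal sketch of the round-trip problem, using the RTAI conversion functions named in the log; the numbers are the ones from jepler's printk lines, and the interpretation of the glitch is one plausible reading:

    /* nano2count() and count2nano() both round, so the conversion
       does not survive a round trip: */
    RTIME counts  = nano2count(19000);   /* 238 counts (per the log) */
    RTIME ns      = count2nano(counts);  /* 18953 ns                 */
    RTIME counts2 = nano2count(ns);      /* 237 counts -- not 238    */
    /* if the task is programmed from the round-tripped value, its
       period is one APIC count (~80 ns at 12.557 MHz) shorter than
       intended, so rt_task_wait_period() periodically returns
       RTE_TMROVRN and the thread runs again immediately -- the
       regular glitches seen in the plots above */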
[21:47:50] <jepler> and now the weird behavior is fixed! http://emergent.unpy.net/files/sandbox/fixed.png
[21:56:50] <jepler> * jepler does a little dance
[22:26:38] <jepler> and the weird "tok tok tok" I heard while running the nist-lathe configuration is gone
[22:28:08] <skunkworksemc> oh
[22:28:21] <skunkworksemc> logger_dev: bookmark
[22:28:21] <skunkworksemc> Just this once .. here's the log: http://www.linuxcnc.org/irc/irc.freenode.net:6667/emcdevel/2007-06-01.txt
[22:30:35] <skunkworksemc> jepler: nice
[22:35:09] <skunkworksemc> tok tok tok reminds me of trying to run turbocnc on a portable ;)
[22:44:29] <jepler> I am divided on whether to call the problem I ran into (and now solved) an emc2 (rtapi) bug or an rtai bug