#emc-devel | Logs for 2007-07-23

[00:01:04] <cradek> jepler: I don't see how you tracked http://sourceforge.net/tracker/index.php?func=detail&aid=1736182&group_id=6744&atid=106744 to the bug you reported to me
[00:05:25] <cradek> but it seems to run ok now...
[00:06:08] <cradek> well except it errors at line 65 (would exceed limits)
[00:10:17] <cradek> if you reproduced it earlier, would you retest please
[01:05:56] <SWPadnos> well, I've had my test running (with the PC crashed) for nearly 2 hours, and the total deviation recorded by the scope has been 1.2000 uS
[01:07:09] <SWPadnos> this is using a Mesa card, so there's a nice pattern to the traces - showing the 33 MHz PCI bus clock quite nicely :)
[01:07:32] <jmkasunich> sw stepping using the mesa as the I/O
[01:07:33] <jmkasunich> ?
[01:07:40] <SWPadnos> no, simple HAL-only test setup
[01:07:49] <SWPadnos> m5i20 driver and chargepump
[01:07:58] <jmkasunich> oh, ok
[01:08:10] <SWPadnos> set up with a fast base period - 10 or 20 uS
[01:08:14] <jmkasunich> still, the 5i20 is strictly general purpose I/O, nothing HW generated
[01:08:18] <SWPadnos> right
[01:08:47] <SWPadnos> I'll save the scope image to disk and upload that, along with the test files once I reboot the machine
[01:09:06] <SWPadnos> it is very strange that I can't ping the machine, but the RTAI/HAL tasks are still running
[01:09:25] <jmkasunich> not really
[01:09:35] <jmkasunich> the RT tasks run under linux
[01:09:41] <SWPadnos> I'd expect other things that are interrupt-driven to also work
[01:10:01] <jmkasunich> most things aren't interrupt driven
[01:10:53] <jmkasunich> sure, the network might use an interrupt to receive a packet, but the rest of the processing (such as generating a ping response packet) isn't done in the ISR
[01:11:10] <SWPadnos> true
[01:11:39] <SWPadnos> well, this tells me that if we can get the rest of the kernel out of the way, we'll approach the interrupt performance of an AVR with a dual-core 2GHz CPU ;)
[01:15:52] <jmkasunich> the difference is what you can get done during each interrupt
[01:16:05] <SWPadnos> heh
[01:16:06] <SWPadnos> well, there is that
[01:16:20] <SWPadnos> I'm sure the 64-bit math is faster on the A64
[01:25:45] <jepler> cradek: great, I'll retest too
[01:30:37] <SWPadnos> jepler, did you notice any latency difference between the 32-bit and 64-bit kernels?
[01:31:20] <jepler> SWPadnos: I only ran the 64 bit kernel on that machine
[01:31:46] <jepler> and I forget what latency I measured (but maybe I blogged it?)
[01:32:23] <SWPadnos> ok - just wondering if you saw any major differences
[01:32:25] <cradek> have you seen anything close to my smp machine with isolation (~ 8000 iirc)
[01:32:48] <SWPadnos> this A64 gets ~13000 with the 32-bit UP kernel
[01:32:53] <cradek> maybe new machines would be even better
[01:33:08] <SWPadnos> I haven't downloaded all the dev stuff (like source :) ) yet
[01:33:38] <jepler> I think I saw overruns (error returns from rt_task_wait_period) somewhere between 15000 and 17500
[01:33:51] <SWPadnos> I haven't tried twiddling BIOS settings yet - I wanted to see if it makes sense to consider this PC for industrial RT
[01:33:57] <jepler> with BASE_PERIOD that low
[01:34:16] <jepler> cradek: is isolate a boot-time parameter? if so, I can try it ..
[01:34:19] <SWPadnos> I was actually able to run with 10000 period, even with 13000 max latency
[01:34:35] <cradek> jepler: yes
[01:34:39] <cradek> jepler: isolcpus=1 iirc
[01:35:08] <cradek> I don't remember whether I had to do anything special...
[01:35:46] <jepler> let me go try it
[01:41:26] <jepler> cradek: did you have to modify the latency test to make it choose the right CPU?
[01:41:52] <cradek> I don't think so
[01:42:54] <jepler> I'm already up to ovl max 9586 with a glxgears and a find running
[01:43:31] <cradek> that's still a good number
[01:43:43] <cradek> has anyone tried the new nvidia drivers that are supposed to be safer for realtime?
[01:43:58] <jepler> eek, got 118900 on switch from X to terminal
[01:44:14] <cradek> ah, don't do that
[01:45:04] <jepler> cradek: I used isolcpus=1 but both still show up in top. does that match your experience?
[01:45:37] <cradek> yes, but the only thing running on 1 was the kernel processes ending in /1
[01:47:26] <jepler> huh -- on this run I'm doing the same things (except switching to console and back) and I've got 910 (yes, 3 digits) ovl max
[01:47:31] <jepler> I wonder what's different now
[01:47:45] <jepler> oops, went up to 952
[01:47:54] <cradek> that doesn't seem like a real number
[01:47:54] <jepler> RTH| lat min| ovl min| lat avg| lat max| ovl max| overruns
[01:47:57] <jepler> RTD| -1549| -1614| -1250| -345| 952| 0
[01:48:07] <cradek> wow
[01:48:50] <jepler> RTD| -1611| -1615| -966| -44| 1865| 0
[01:48:51] <jepler> RTD| -1502| -1615| -960| 21825| 21825| 0
[01:48:56] <jepler> well something finally happened to ruin it all
[01:48:59] <cradek> bonk
[01:49:46] <jepler> too bad you can't know when the latency test is done
[01:50:54] <SWPLinux> heh
[01:51:04] <SWPLinux> it's finished when you see a number that's too high ;)
[01:51:22] <SWPLinux> here's the scope trace: http://imagebin.org/9448
[01:51:35] <SWPLinux> the gray lines are persistence from previous traces
[01:52:16] <jmkasunich> what are you triggering on?
[01:53:16] <SWPLinux> either edge
[01:53:37] <SWPLinux> that's why there's a gray horizontal in the top right/bottom left
[01:53:58] <jmkasunich> where is the trigger point? offscreen left?
[01:54:14] <SWPLinux> script here: http://pastebin.com/m7556189a
[01:54:25] <SWPLinux> .hal file here: http://pastebin.com/m1dfdbde
[01:54:48] <SWPLinux> yes, it's 18.42 uS to the left (shown at top middle)
[01:55:15] <jepler> [ 1087.874864] CPU USE SUMMARY
[01:55:15] <jepler> [ 1087.875292] # 0 -> 0
[01:55:15] <jepler> [ 1087.875417] # 1 -> 3486021
[01:56:12] <SWPLinux> if I apt-get source yadadyadda-magma, will I get full RTAI source as well?
[01:56:13] <jepler> * jepler wanders off again
[02:09:05] <SWPLinux> hmmm. is this still the place to get RTAI? http://download.gna.org/rtai/testing/v3/
[02:17:57] <cradek> no, those are old
[02:18:07] <cradek> I got it last from some cvs somewhere
[02:18:23] <cradek> they don't release very often, so you generally have to use cvs for modern kernels
[02:18:53] <cradek> https://gna.org/cvs/?group=rtai
[02:18:56] <cradek> maybe this
[02:21:36] <SWPLinux> I just did a checkout of the magma module
[02:21:45] <cradek> yeah that's the ticket
[02:21:46] <SWPLinux> that link was from the RTAISteps wiki page
[02:22:05] <SWPLinux> (last edited in Dec 2005 by Paul C)
[02:22:53] <SWPLinux> any suggestions on which kernel to try?
[02:23:09] <SWPLinux> on 6.06, possibly trying 64-bit and/or SMP
[02:23:33] <cradek> maybe try the latest 2.6.20.x
[02:23:54] <SWPLinux> hmm. there's a patch for 2.6.22
[02:23:55] <jmkasunich> the RTAI guys make the patches for a specific version
[02:24:05] <jmkasunich> and not many others
[02:24:16] <cradek> maybe try 22 then
[02:24:19] <SWPLinux> right. there are about 50 patches in CVS
[02:24:26] <cradek> it's a total crap shoot
[02:24:34] <SWPLinux> do you know of any troubles using later kernels on 6.06?
[02:24:46] <cradek> no
[02:24:47] <SWPLinux> heh - I've played that game before (with RTAI)
[02:24:55] <cradek> some later ones have problems in general, but not specific to 6.06
[02:25:02] <SWPLinux> ok. I'll try .22, then fall back to something else if necessary
[02:25:05] <cradek> well bootsplash won't work, but that's minor
[02:25:15] <SWPLinux> did you say at some point that 2.6.17 was bad?
[02:25:21] <SWPLinux> or was that the good one
[02:25:30] <cradek> yes it has a bad keyboard bug
[02:25:46] <cradek> I'm not exactly sure when that's fixed either, it bit me on several versions I think
[02:25:49] <SWPLinux> ok, I'll skip that one
[02:25:51] <SWPLinux> heh
[02:26:04] <cradek> iirc, 18 and 19 are totally busted, which is why I was using 17
[02:26:20] <SWPLinux> .19, .20, and .22 are all supported for both i386 and x64
[02:26:23] <SWPLinux> ah
[02:26:26] <cradek> jumping to 20.x seemed to fix the keyboard
[02:26:36] <cradek> but maybe I haven't used it enough to tell for sure yet
[02:27:05] <cradek> I managed to patch bootsplash into .17 but not .20.x, it's pretty different
[02:27:29] <cradek> so I just turned it off in grub, didn't care if the kernel is just for me
[02:27:42] <cradek> fwiw my deb packages are in /experimental on the website
[02:27:59] <cradek> you could easily just try them on your hardware if you want
[02:29:30] <SWPLinux> true enough
[02:30:37] <SWPLinux> are any of those SMP?
[02:30:42] <cradek> all are
[02:30:46] <SWPLinux> ah
[02:30:55] <jmkasunich> I have one question
[02:31:06] <cradek> like I said though, don't bother with the .17
[02:31:08] <jmkasunich> why does anyone give a flying fsck about bootsplash?
[02:31:28] <SWPLinux> because it's pretty
[02:31:31] <SWPLinux> or so some people think
[02:31:36] <jmkasunich> it's entirely 100% cosmetic, does nothing at all for the computer
[02:31:49] <jmkasunich> and hides useful information
[02:31:49] <cradek> I can't seem to get grub to stop turning the damn thing back on
[02:31:52] <SWPLinux> the same is true of cosmetics ...
[02:31:57] <cradek> so I'd rather it works
[02:32:14] <jmkasunich> you have to do menu.lst foo to make it stop
[02:32:52] <jmkasunich> # defoptions=nosplash
[02:33:18] <cradek> thanks I'll try that next time
[02:33:20] <jmkasunich> it's not sufficient to do "# defoptions= <nothing>"
[02:33:25] <jmkasunich> you have to explicitly say no
[02:33:26] <cradek> yep that's what I tried
[02:33:34] <jmkasunich> I found that out the hard way
[02:33:36] <cradek> it fills some defoptions back in next time then
[02:33:44] <jmkasunich> right
[02:33:56] <cradek> that file's an abomination
[02:34:02] <cradek> defaults in magic comments ... ugh
[02:34:06] <jmkasunich> yeah
[02:34:30] <SWPLinux> it could be better than defaults compiled in though
[02:34:58] <SWPLinux> either someone is brewing some strong coffee, or a skunk did something outside my window
[02:35:04] <cradek> haha
[02:35:05] <jmkasunich> heh
[02:35:11] <cradek> yum coffee
[02:36:03] <SWPLinux> hmmm. cradek, did you ever post your SMP config online?
[02:36:07] <SWPLinux> I thought you or jepler had
[02:36:18] <cradek> umm not sure
[02:36:21] <cradek> it's in those debs of course
[02:36:34] <SWPLinux> right. another 40M :)
[02:36:55] <cradek> sorry, can't get to that machine right now
[02:37:23] <SWPLinux> I really wish nv could tell that I have a 1680x1050 widescreen LCD - this 1024x768 is killing me
[02:37:40] <SWPLinux> all the letters are so wide
[02:37:51] <cradek> fix your screen to not zoom it
[02:37:59] <cradek> it's on the menu somewhere
[02:38:15] <SWPLinux> not on this screen
[02:38:22] <cradek> yuck
[02:38:26] <jmkasunich> it doesn't have a 1:1 mode?
[02:38:31] <jmkasunich> that's bogus
[02:38:33] <cradek> my dell 2001s have it
[02:38:37] <SWPLinux> the monitor thinks it's in full res mode - the card may be doing the scaling for me
[02:38:53] <SWPLinux> it's a $300 22" LCD - it's bogus all right
[02:38:59] <SWPLinux> Acer, baby!
[02:39:10] <cradek> * cradek loans swp a crt
[02:39:27] <SWPLinux> got one on the floor - 21" Nokia
[02:39:49] <SWPLinux> scaling really loses the advantages of DVI
[02:40:10] <cradek> CRTs are so much better at displaying various scan rates and resolutions - which is amazing since they switch coils and stuff in and out to do it
[02:40:27] <SWPLinux> variable spot size is very nice
[02:40:35] <SWPLinux> that's why I have a 3-CRT TV
[02:41:15] <SWPLinux> interestingly, that's how they do film scanning at various resolutions
[02:41:17] <cradek> TV - is that the device that's kind of like a monitor but has a large crappy picture?
[02:41:30] <cradek> and shows only advertising?
[02:41:45] <SWPLinux> they illuminate the film with a CRT, and can vary the beam width for different spot sizes
[02:41:53] <SWPLinux> I think so, which is why we watch PBS and DVDs on it
[02:42:23] <cradek> I read today (on some bogus internet thing) that an average american spends one year of his life watching advertising on TV
[02:42:39] <SWPLinux> I suspect that's higher now
[02:42:49] <jmkasunich> gotta be higher
[02:43:08] <SWPLinux> not counting "product placement"
[02:43:21] <cradek> I can see where it could be easily an hour a day for many people
[02:43:42] <cradek> considering watching 3-4 hours of TV a day, which might even be low
[02:44:05] <cradek> I guess I have no idea how much TV "average" people watch
[02:44:09] <SWPLinux> luckily, DVDs have fewer commercials
[02:44:22] <jmkasunich> I love my new PC.... just remembered to restart the farm. All 3 VMs started doing complete builds, all finished in less than 3 mins, and the machine didn't bog down at all
[02:44:26] <jmkasunich> that used to take 45 mins
[02:44:29] <SWPLinux> heh
[02:44:38] <cradek> yay
[02:44:52] <cradek> we won't have to hear you complain either anymore :-)
[02:44:54] <SWPLinux> is this since Fest, or just set up well now?
[02:45:05] <jmkasunich> since the fest
[02:45:22] <SWPLinux> cool. what did you end up getting?
[02:45:23] <jmkasunich> I moved the VMs a week or so after I got home, they
[02:45:32] <jmkasunich> oh, the machine isn't that new
[02:45:42] <jmkasunich> it's the core 2 E6600, I had that before fest
[02:45:45] <SWPLinux> ok - it's the dual-core with the fanless nvidia card
[02:45:52] <jmkasunich> right
[02:46:04] <jepler> cradek: yep, the pcCase-2.ngc now runs for me
[02:46:13] <cradek> good
[02:46:25] <cradek> I still don't see how it would have caused the problem you saw
[02:47:12] <jepler> well -- it did
[02:47:28] <jmkasunich> and it doesn't now. so there!
[02:47:55] <cradek> how did you make the connection to your test case though?
[02:48:59] <jepler> once I saw that it did continue after that move, I figured it was a mistake in the constraints sent to the planner
[02:49:16] <jepler> based on the debugging information it printed, I found a small program to cause the problem
[02:49:27] <jepler> and luckily it seems to have been the same problem
[02:49:35] <cradek> yay
[02:49:42] <cradek> thanks for finding the easy test case
[02:49:56] <jepler> thanks for fixing it
[02:50:03] <cradek> did you get a chance to check the skipped drilling fix?
[02:50:03] <jepler> will you take care of the report on sf?
[02:50:07] <cradek> sure
[02:50:17] <jepler> no, I only duplicated the original behavior you saw
[02:50:31] <cradek> before my change I hope?
[02:50:39] <jepler> yes
[02:51:20] <cradek> going to work tomorrow?
[02:51:38] <jepler> yeah
[02:57:37] <jepler> 'night
[02:57:54] <cradek> night
[03:07:28] <SWPLinux> hmmm. that's odd. I patched the kernel with arch/x86_64/...x86_64...patch, and it patched files in the kernel arch/i386 directory
[03:34:41] <SWPadnos> well, I guess I'll find out tomorrow just how bad it is to have a computer crash while compiling the kernel
[03:34:47] <SWPadnos> night folks
[12:03:58] <Guest657> Guest657 is now known as skunkworks_
[14:21:58] <jepler> static int emc_pendant(ClientData clientdata,
[14:22:04] <jepler> Pendant read routine from /dev/psaux, /dev/ttyS0, or /dev/ttyS1
[14:22:14] <jepler> huh I have never noticed this code before
[14:22:18] <jepler> I wonder what it's for / can do
[14:22:33] <SWPadnos> where is it?
[14:22:57] <jepler> emcsh.cc
[14:23:02] <SWPadnos> huh
[14:23:08] <jepler> looks like it's unused
[14:23:16] <jepler> it's exposed as a tcl command "emc_pendant" but that's not used in any tcl program
[14:26:05] <SWPadnos> interesting
[17:22:27] <skunkworks__> skunkworks__ is now known as skunkworks
[18:12:17] <jepler> here's my idea for doubling the effective step rate for a given BASE_PERIOD: 1. add a new stepgen mode which is like step+direction, except that step is allowed to be high every cycle
[18:13:08] <jepler> 2. add a new feature to hal_parport: pin reset. This has a new mask of pins to reset, a minimum time between write and reset (e.g., 5uS), and a new parport.X.reset function
[18:13:31] <jepler> 3. rearrange the parport functions so that they are in the order write, read, other activity, reset.
[18:13:41] <jepler> 4. make the "mask of pins to reset" be the same as the "step" pins
[18:14:03] <jepler> setp parport.0.reset_mask 0x55; setp parport.0.reset-min-time 2000
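A rough sketch of the arithmetic behind the "doubling" claim above. It assumes that a conventional software step pulse needs one base period with the step pin high and one with it low, whereas the proposed reset function lets a step be issued every period. The function name and the 20000 ns example period are illustrative, not taken from the emc2 source.

```python
def max_step_rate_hz(base_period_ns, periods_per_step):
    """Upper bound on step frequency for a given BASE_PERIOD in nanoseconds."""
    return 1e9 / (base_period_ns * periods_per_step)

BASE_PERIOD_NS = 20000  # assumed example value (20 uS)

# Conventional step+direction: one period high, one period low (assumption).
conventional = max_step_rate_hz(BASE_PERIOD_NS, 2)

# Proposed mode: step may go high every period, because the pin is
# reset low again later within the same period.
with_reset = max_step_rate_hz(BASE_PERIOD_NS, 1)

print(conventional, with_reset, with_reset / conventional)
```

Under these assumptions the ceiling moves from 25 kHz to 50 kHz at a 20 uS base period, before accounting for any extra time the reset itself costs.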
[18:16:22] <LawrenceG> jepler: the 7486 was not a bad approach, a 5us(adjustable length) one shot on each step pin would allow the pin to be toggled very fast ie.. write then reset
[18:17:13] <jepler> to me a solution that doesn't require extra hardware is neater
[18:18:17] <LawrenceG> jepler: true, but built into a breakout board there is little cost.... the software technique will increase the time it takes to execute the fast ~20us code
[18:19:07] <jepler> that's true
[18:19:24] <LawrenceG> if step requires 10us hold time, one ends up using 50% cpu time to generate pulses... aka spinlocks
[18:19:27] <cradek> but if it adds 10% that's still 190% the step rate
[18:21:33] <LawrenceG> now if we could just convince people to abandon step/dir for quadrature drive ( cradek and LawrenceG are convinced)
[18:21:57] <jepler> the fast thread I propose would look like: Write (data port); Write (control port); Read (control port); other computations; Reset (data port). If the time for 2 I/O plus other computations is larger than the required step pulse time, then the additional time cost is a single I/O. If the pulse time is long, it might add more time (gecko seems to be among the longest at 4uS)
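The cost rule jepler states here can be written out as a small sketch. It assumes every port access takes the same time and that the reset function busy-waits only for whatever part of the step pulse has not already elapsed. The function name and all timing values are hypothetical examples, not measurements.

```python
def reset_extra_time_us(io_us, other_us, pulse_us):
    """Extra time per fast-thread cycle added by the Reset (data port) step.

    Between the data-port write and the reset, the thread performs two
    I/O operations (control write, control read) plus other computations.
    """
    elapsed_us = 2 * io_us + other_us
    # If the pulse has not been high long enough yet, the reset must wait.
    wait_us = max(0, pulse_us - elapsed_us)
    # The reset itself is one more I/O operation.
    return io_us + wait_us

# Other work already covers a 4 uS pulse: the cost is a single I/O.
print(reset_extra_time_us(io_us=1, other_us=3, pulse_us=4))
# Little other work: the reset also waits out the rest of the pulse.
print(reset_extra_time_us(io_us=1, other_us=1, pulse_us=4))
```

With a 1 uS port access, the first case adds 1 uS and the second adds 2 uS, which matches the "single I/O, or more if the pulse time is long" description.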
[18:23:01] <jepler> the problem isn't whether people prefer quadrature or step+direction, but whether the driver/translators accept it
[18:24:10] <jepler> now an 8-pin micro with quadrature input and (flash-time programmable) step+direction output timings and maybe a multiplier would be cool (except you'd need one per axis)
[18:24:15] <LawrenceG> I know... I had a long discussion with Mariss about supporting quadrature drive and he couldn't see the advantage.. maybe with his new cpld drives, there may be a chance
[18:25:36] <cradek> getting quadrature in geckos would be very nice
[18:25:41] <LawrenceG> :}
[18:26:16] <cradek> that would put them very far above all the other options
[18:26:33] <jepler> I hear even mach can do quadrature
[18:27:05] <cradek> wonder whether it gets an increased max step rate with quadrature
[18:32:42] <jepler> direction + either-edge-of-step is probably easier to do on a micro, compared to quadrature
[18:33:18] <jepler> attiny13 has "either logical edge of pin" as an interrupt condition
[18:33:38] <LawrenceG> quadrature is just a 2 bit counter that gets inc or dec
[18:44:14] <cradek> LawrenceG: no it's not! that's a common but wrong way to decode quadrature
[18:44:53] <cradek> imagine the sequence 00 01 00 01 00 01 00 01 00. with real quadrature that means no change in position.
[18:45:02] <LawrenceG> talking of encode
[18:45:29] <cradek> oh ok
[18:47:11] <LawrenceG> +1 -1 +1 -1 +1 -1 agreed... no position change... or pesky noise pulses that have no effect on quad but would trash step/dir
[18:50:03] <SWPadnos> if you use one bit as a clock, and the other as direction, then a counter also works for input, with one addition: you must count on either clock edge, and you must count up on one edge and down on the other
[18:50:26] <SWPadnos> so the 00 01 ... sequence would be +1, -1, +1, -1 ...
[18:50:53] <SWPadnos> and 10 11 10 11 ... would be inverted, so the 0-1 clock transition would count down, and the 1-0 transition would count up
[18:51:14] <SWPadnos> but it's much easier with a lookup table or a few logic gates :)
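A minimal sketch of the clock-plus-direction counting rule SWPadnos describes. It assumes each sample is a 2-bit value with the direction in the high bit and the clock in the low bit, which is how the "00 01" and "10 11" examples read. The function name is illustrative. Transitions where only the direction bit changes are ignored, so this counts two of the four edges per quadrature cycle.

```python
def count_clock_edges(states):
    """Count position from 2-bit samples: high bit = direction, low bit = clock."""
    position = 0
    for prev, new in zip(states, states[1:]):
        prev_clock, new_clock = prev & 1, new & 1
        if prev_clock == new_clock:
            continue  # no clock edge; direction-only changes are not counted
        direction = (new >> 1) & 1
        # direction 0: rising edge counts up, falling edge counts down
        # direction 1: the sense is inverted
        position += 1 if new_clock != direction else -1
    return position

# The 00 01 00 01 00 sequence is +1, -1, +1, -1: no net movement.
print(count_clock_edges([0b00, 0b01, 0b00, 0b01, 0b00]))
# 10 11 10 is -1 then +1: again no net movement.
print(count_clock_edges([0b10, 0b11, 0b10]))
```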
[18:53:12] <jepler> yeah -- jmk guided me to the lookup table approach when I was playing with quadrature (http://emergent.unpy.net/projects/01149271333 "400kHz Triple quadrature divider for atmega8 and quadrature state table generator")
[18:54:11] <LawrenceG> yup... condition jump into a state transition table is very quick
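The lookup-table approach mentioned above can be sketched as follows. This is not the code from jepler's atmega8 project. It is a generic illustration that assumes the sequence 00, 01, 11, 10 is the forward direction and treats a transition where both bits change as invalid and worth zero counts. All names are made up for the example.

```python
# Assumed forward order of the 2-bit quadrature states (a Gray code).
FORWARD = [0b00, 0b01, 0b11, 0b10]

def build_table():
    """Build a 16-entry table indexed by (previous_state << 2) | new_state."""
    table = [0] * 16  # same state or invalid double-bit change: 0
    for i, state in enumerate(FORWARD):
        nxt = FORWARD[(i + 1) % 4]
        table[(state << 2) | nxt] = +1   # one step forward
        table[(nxt << 2) | state] = -1   # one step backward
    return table

QUAD_TABLE = build_table()

def decode(states):
    """Sum table entries over consecutive samples to get the net position."""
    position = 0
    for prev, new in zip(states, states[1:]):
        position += QUAD_TABLE[(prev << 2) | new]
    return position

# cradek's example: jitter between 00 and 01 nets no position change.
print(decode([0b00, 0b01, 0b00, 0b01, 0b00, 0b01, 0b00, 0b01, 0b00]))
# One full forward cycle is four counts.
print(decode([0b00, 0b01, 0b11, 0b10, 0b00]))
```

A table like this makes the decode step a single indexed read per sample, which is why it suits a small microcontroller or a fast thread.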
[18:58:29] <alex_joni> yay ;)
[18:58:44] <alex_joni> anyone got emc2 in runnable conditions around?
[18:58:50] <SWPadnos> not me, not me!
[18:59:09] <jepler> alex_joni: sim only
[18:59:55] <alex_joni> jepler: can you check that I didn't break anything obvious?
[19:00:19] <alex_joni> * alex_joni did test some cases
[19:01:11] <jepler> alex_joni: what are you breaking?
[19:01:27] <alex_joni> I'm breaking bugs :P
[19:02:05] <alex_joni> actually I don't expect the change I just committed to have any impact besides the case petev reported
[19:02:15] <alex_joni> http://sourceforge.net/tracker/index.php?func=detail&aid=1734309&group_id=6744&atid=106744
[19:02:35] <jepler> alex_joni: I was just looking at that
[19:02:51] <jepler> alex_joni: you say it only happens when the program has a %, but petev says it happens when no program is loaded
[19:03:07] <alex_joni> he did say loaded
[19:03:21] <jepler> "The error seems to
[19:03:21] <jepler> happen when the M2 or M30 attempts to reset the program and no program file
[19:03:22] <jepler> is open."
[19:03:36] <alex_joni> read on
[19:04:20] <alex_joni> jepler: it happens like this: you open a file with % at the beginning and end
[19:04:30] <alex_joni> if you abort the file pointer gets reset
[19:04:35] <alex_joni> aka no file loaded
[19:04:48] <alex_joni> but the % flag was still active
[19:04:51] <jepler> OK -- yes, I was able to get the error
[19:04:55] <jepler> (before updating to your version)
[19:05:08] <jepler> hmmm looks like it'll take me a minute to get my tree buildable again
[19:06:50] <alex_joni> I know how that goes, that's why I asked earlier :P
[19:07:48] <jepler> almost got it..
[19:11:28] <jepler> alex_joni: yes it's also fixed for me after that change
[19:12:36] <alex_joni> good, I can't think of anything that it might break..
[19:12:56] <alex_joni> I'll backport it too