#emc-devel | Logs for 2010-05-24

[00:00:04] <mozmck> I think the one I had to add the soft memlock has more memory...
[02:10:30] <jepler> so it should be evident at this point that I did not do 2.4.1 this week-end.
[02:21:12] <cradek> I was wondering about that...
[02:54:55] <cradek> *** buffer overflow detected ***: milltask terminated
[02:55:22] <SWPadnos> urk
[02:55:28] <SWPadnos> that can't be good
[02:57:24] <cradek> ok now how the hell do I get it to segfault so I can trap it in gdb
[02:58:09] <cradek> aha, it just works
[02:58:11] <cradek> wheeee
[03:01:11] <cradek> except it's bogus...
[03:06:56] <cradek> I hate rabbit holes
[03:07:16] <cradek> at 10pm
[03:07:23] <cradek> with a busted air conditioner
[03:58:13] <cradek> emccanon.c:arc() is so wrong...
[03:58:43] <cradek> goodnight
[12:52:35] <jepler> cradek: so today NURBS is broken again?
[12:53:22] <SWPadnos> NURBS = "Not Usually Running Broken Splines"
[12:53:34] <jepler> good morning SWPadnos
[12:53:45] <SWPadnos> morning jepler
[12:53:49] <SWPadnos> ^good
[12:54:29] <jepler> * jepler wonders what is wrong in arc() that could cause a buffer overflow
[12:54:43] <jepler> I see how the fabs(den) test fails to guard against division by zero; that much is obvious
[13:03:21] <cradek> jepler: units are wrong for uvw, and it doesn't handle offsets or rotation
[13:03:29] <jepler> oh is that all
[13:03:39] <cradek> the overflow is elsewhere, maybe rcs_print
[13:03:47] <cradek> that's why I was whining about a rabbit hole
[13:07:07] <jepler> http://emergent.unpy.net/files/sandbox/0001-sprintf-snprintf-to-avoid-buffer-overflows.patch
[13:07:15] <jepler> looks like a good idea in any case
[13:07:34] <jepler> .. but not tested, of course
[13:07:52] <jepler> .. not even compile tested, of course
[13:08:02] <cradek> I'll try to reproduce mine and then test your patch
[13:08:11] <cradek> ... later
[13:08:47] <jepler> patch updated to at least compile
[13:08:49] <jepler> coffee?
[14:14:34] <cradek> jepler: http://pastebin.com/Ab7TmfFk
[14:16:06] <cradek> and your patch does fix it!
[14:16:32] <jepler> well that's good news anyway
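For readers following along: the patch name says it swaps sprintf for snprintf. A minimal sketch of that idea follows. It is not the actual rcs_print code, and the helper name bounded_format is hypothetical. It only shows how vsnprintf keeps a long message from running past the end of a fixed-size buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not taken from libnml): format into a fixed-size
 * buffer without ever writing past its end.
 * Returns the number of characters stored, excluding the terminating NUL,
 * or -1 on an empty buffer or a formatting error. */
static int bounded_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int needed;

    assert(fmt != NULL);
    if (buf == NULL || size == 0)
        return -1;

    va_start(ap, fmt);
    /* vsnprintf writes at most size bytes, including the NUL, and
     * returns the length the full output would have had. */
    needed = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (needed < 0) {
        buf[0] = '\0';
        return -1;
    }
    if ((size_t)needed >= size)
        return (int)(size - 1);   /* output was truncated to fit */
    return needed;
}
```

With plain sprintf the same call would keep writing past the buffer, which is the sort of thing glibc reports as "buffer overflow detected". Here an over-long message is truncated instead.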
[14:20:09] <cradek> if (strlen(_fmt) > 250) {
[14:20:09] <cradek> return EOF;
[14:20:09] <cradek> }
[14:20:24] <cradek> this is so bogus
[14:21:09] <jepler> I tried not to look too close at the code
[14:21:38] <cradek> it's like a dog threw up in there
[14:21:48] <cradek> (my thanks to seb for that expression)
[14:21:55] <jepler> he doesn't even own a dog!
[14:22:12] <jepler> $ git grep sprintf src | wc -l
[14:22:12] <jepler> 725
[14:22:19] <jepler> there are only a few more of these..
[14:23:33] <alex_mobile> What's that from?
[14:23:54] <jepler> alex_mobile: uses of v?sprintf in emc2
[14:24:06] <jepler> alex_mobile: the code chris pasted is from somewhere in libnml
[14:24:24] <alex_mobile> I see. Thanks
[14:32:13] <alex_mobile> Wonder if we should look at a newer rcslib
[14:32:37] <alex_mobile> The last one has some interesting features
[14:33:13] <alex_mobile> Like support for multi readers on a channel
[14:33:56] <alex_mobile> Would take care of the multiple gui racing issue
[14:34:08] <jepler> you mean multiple writers?
[14:35:01] <alex_mobile> And readers for a queue channel
[14:40:33] <SWPadnos> multi readers, with the guarantee that all readers will get to see a message (so there's no race to see who gets it first)
[14:40:35] <SWPadnos> IIRC
[14:40:44] <jepler> oh, you're thinking of the operator message problem
[14:41:16] <SWPadnos> yep, and any others like it :)
[14:41:31] <jepler> I'm thinking of the UIs sending commands at the same time, which is one reader (task) multiple writers (UIs)
[14:42:08] <jepler> ignoring for a moment the problem with having a single sequence number for "wait received"/"wait complete", I've never understood whether a second UI can overwrite a first UI's command before task sees it at all.
[14:42:23] <SWPadnos> good question
[14:42:32] <jepler> but hey, we all agree that we should just drop nml and flail around for a year while emc can't even have its UI and task communicate anymore
[14:42:42] <SWPadnos> agreed
[14:42:54] <SWPadnos> let's assign someone the task of making that happen
[14:43:06] <jepler> right after we have someone replace every "vsprintf" with "vsnprintf"
[14:43:46] <SWPadnos> nah. those would be magically fixed when all the code gets replaced
[14:43:46] <SWPadnos> no sense doing it twice
[14:44:25] <alex_mobile> Will that allow jogging while paused?
[14:44:37] <SWPadnos> yes, and while off (added bonus)
[14:45:00] <jepler> that's what the handwheels are for
[14:45:19] <SWPadnos> you mean the knobby thing on my USB keyboard?
[14:46:08] <alex_mobile> MPG jogging in carthesian space for nontrivkins
[14:47:06] <jepler> carthesian delenda est
[14:48:01] <alex_mobile> Sure?
[14:50:44] <alex_mobile> Quidquid latine dictum sit altum videtur
[14:51:48] <alex_mobile> Can't tell you how hard that is to type on my cellphone without latin t9
[14:52:55] <jepler> hah
[15:39:41] <CIA-38> EMC: cradek v2.4_branch * r388833c06625 /src/libnml/rcs/rcs_print.cc: enlarge this for long arc statements, and remove a stupid check
[15:39:42] <CIA-38> EMC: cradek v2.4_branch * r176c04753845 /src/emc/task/emccanon.cc: fix splines when offset and/or rotated
[15:39:46] <CIA-38> EMC: cradek v2.4_branch * r1bf4b2fd4337 /src/libnml/rcs/rcs_print.cc: sprintf->snprintf to avoid buffer overflows
[15:52:21] <SWPadnos> how can someone be stupid enough to make this monitor have only 900 vertical pixels?? http://www.ostendotech.com/crvd/specs.php
[15:53:10] <jepler> 640 pixels is enough for anybody!
[15:53:17] <SWPadnos> oh right, I forgot
[15:53:32] <SWPadnos> in a 32x20 arrangement
[16:31:30] <skunkworks_> I have to say - I like my 1080 screen
[16:32:45] <cradek> evil!
[16:33:03] <skunkworks_> heh -
[16:33:28] <skunkworks_> that is almost 1200 vertical ;)
[16:34:09] <skunkworks_> 1920X1080
[16:34:26] <jepler> it means the screen finishes redrawing 10% faster!
[16:42:25] <skunkworks_> dell I think was selling some laptops that were 720.. which seems pretty bad
[16:43:14] <mozmck_work> I like my 1920x1280 screens
[18:59:34] <mozmck-6core> emc is still no go on this 6-core machine. If I run latency-test it shows only 0s everywhere. lsmod | grep rtai shows the rtai modules loaded...
[19:01:26] <cradek> are you using isolcpus?
[19:04:36] <mozmck-6core> no.
[19:05:34] <jepler> what's dmesg say?
[19:06:11] <mozmck-6core> I found out about the 'ulimit' program just recently. ulimit -Sl will show your memlock soft limit. If this is 0 emc will not start.
[19:07:40] <jepler> yes, that is known and why we say to configure limits.conf (and why the package does it, or is supposed to do it, when it's installed)
[19:07:48] <mozmck-6core> http://pastebin.com/aKqCXhaY
[19:08:50] <mozmck-6core> jepler: if hard memlock limit is 20480 and soft memlock limit is 0 EMC2 won't start...
[19:09:05] <jepler> er, ok
[19:09:27] <jepler> then you get the fun of tracking down what sets the soft limit lower than the hard limit
[19:09:58] <jepler> or change "hard memlock" to "- memlock" to set both hard and soft in limits.conf.
[19:10:37] <mozmck-6core> :) I wonder if there are some packages that get installed that lower the soft limit. Some computers I have only needed to set the hard limit and others I set both.
[19:11:13] <mozmck-6core> I normally just put another line that sets soft memlock the same as hard.
[19:11:37] <jepler> "-" in that column is the same as setting hard and soft
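The soft/hard distinction discussed above can also be checked from a program. The sketch below uses the standard getrlimit call with RLIMIT_MEMLOCK. The function names and the idea of passing in a minimum are illustrative assumptions, not something emc does. Note that getrlimit reports bytes, while ulimit -l and limits.conf use kilobytes.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical check: is the SOFT limit (rlim_cur) large enough?
 * The hard limit (rlim_max) is only a ceiling the soft limit may be
 * raised to, so a hard limit of 20480 KB with a soft limit of 0 still
 * fails this test. */
static int soft_limit_sufficient(const struct rlimit *lim, rlim_t min_bytes)
{
    assert(lim != NULL);
    if (lim->rlim_cur == RLIM_INFINITY)
        return 1;
    return lim->rlim_cur >= min_bytes ? 1 : 0;
}

/* Query the current process's memlock limits.
 * Returns 1 if the soft limit is sufficient, 0 if not, -1 on error. */
static int memlock_soft_limit_ok(rlim_t min_bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    return soft_limit_sufficient(&lim, min_bytes);
}
```

A process is allowed to raise its own soft limit up to the hard limit with setrlimit, but setting both in limits.conf, as suggested above, avoids depending on that.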
[19:11:55] <mozmck-6core> anything look abnormal in my dmesg output?
[19:12:11] <mozmck-6core> ok
[19:12:28] <jepler> no, it looks normal
[19:12:38] <jepler> I assume "halcmd show thread" indicates thread time is 0 too
[19:13:00] <mozmck-6core> Should I run that while latency-test is running?
[19:13:17] <jepler> right
[19:13:46] <mozmck-6core> time and maxtime are 0
[19:15:10] <jepler> pastebin the contents of /proc/rtai/scheduler also while latency-test is running
[19:15:50] <jepler> here's one from a working hardy system: http://pastebin.com/6MptV9bz
[19:16:24] <mozmck-6core> http://pastebin.com/FY56fjfy
[19:18:21] <jepler> that looks fine too
[19:19:29] <mozmck-6core> weird. I tried to run the rtai kern latency test and it just sits there. I was able to run it once the other day.
[19:19:49] <mozmck-6core> maybe twice, but no other tests work
[19:20:39] <mozmck-6core> rtai user latency tells me CANNOT INIT MASTER TASK
[19:20:50] <jepler> make sure you stop hal before running rtai latency tests
[19:21:35] <mozmck-6core> I don't think it's running, how do I make sure again?
[19:24:10] <mozmck-6core> hmm, ps -A shows 'latency' twice, and I can't kill it.
[19:25:42] <jepler> if you can't kill it, you may have to reboot. if realtime isn't running properly, it may not unload properly
[19:26:59] <jepler> I dunno if anybody's run emc on a >2 cores/CPUs system before; maybe there's something wrong about what CPU we schedule to run on
[19:27:15] <jepler> you can try this and see -- it will let rtai assign the cpu to run realtime stuff on instead of having emc do it. http://emergent.unpy.net/files/sandbox/0001-don-t-try-to-run-on-specific-CPU.patch
[19:57:41] <SWPadnos> note that isolcpus is a list of CPU numbers, like "3,4,5"
[19:58:35] <SWPadnos> so you need to say "isolcpus=5" to reserve the one I expect our tasks to bind to (I believe it uses the highest CPU number available)
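A rough sketch of how that boot parameter could be derived on a given machine. It assumes the realtime task really does land on the highest-numbered CPU (stated above only as a belief) and that online CPUs are numbered contiguously from 0. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Guess the highest CPU number as (online CPU count - 1).
 * Assumption: CPUs are numbered 0..n-1 with no gaps.
 * Returns -1 if the count cannot be determined. */
static long highest_cpu_guess(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n - 1 : -1;
}

/* Build the kernel command-line argument for a single CPU,
 * e.g. "isolcpus=5" for cpu == 5.
 * Returns the string length, or -1 if cpu is negative or the
 * buffer is too small. */
static int format_isolcpus(char *buf, size_t size, long cpu)
{
    int n;

    assert(buf != NULL && size > 0);
    if (cpu < 0)
        return -1;
    n = snprintf(buf, size, "isolcpus=%ld", cpu);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

On a 6-core machine with CPUs 0 through 5 this would yield "isolcpus=5". The string still has to be added to the kernel boot line by hand.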
[19:58:48] <jepler> is isolcpus even in current kernels? at some point they were going to take it away.
[19:59:06] <SWPadnos> I don't know if CPUsets have replaced it yet
[20:01:16] <SWPadnos> looks like they haven't, google turns up fixes to isolcpus in 2.6.34
[20:01:32] <SWPadnos> (from months ago, but still)
[20:11:13] <mozmck> Thanks jepler, I'll try your patch later. bbl
[20:12:16] <jepler> http://mid.gmane.org/20080918081328.nxu81kudh4fkosgg@webmail.chapter7.ch
[20:12:16] <jepler> Isolcpus was rumoured to be deprecated but afaik
[20:12:17] <jepler> these plans were dropped for now.