#emc | Logs for 2011-02-01

[13:20:35] <micges_work> hi guys
[13:21:13] <micges_work> what could cause this error?
[13:21:15] <micges_work> [ 188.141036] Default Trap Handler: vector 14: Suspend RT task efe73000
[13:21:44] <micges_work> iirc 14 seems to be page fault?
[13:56:00] <jepler> if so, that basically means the same as segmentation fault in userspace
[13:56:19] <jepler> dereference NULL pointer or pointer to freed memory or pointer before/after allocated memory, that sort of thing
[14:00:06] <SWPadnos> so, would it make sense to change the 8192 to 20480 in the wiki page that talks about memlock limits?
[14:01:01] <SWPadnos> and the other few places memlock is mentioned
[14:01:12] <jepler> SWPadnos: I don't know what number is required
[14:01:36] <SWPadnos> me either, but I think the current recommendation is 20480
[14:02:20] <jepler> I made debian/emc2.postinst write 20480 in that file the first time I added it to CVS, and haven't changed it since
[14:02:28] <jepler> so I doubt I've ever used a different (smaller) value
[14:02:43] <jepler> I don't know who came up with 8192 or what system(s) tested OK with that value
[14:02:49] <jepler> is there a report that 8192 is not enough?
[14:03:03] <SWPadnos> apparently salvarane had a problem with 8192
[14:03:08] <SWPadnos> or it was something else
[14:03:23] <cradek> is there any harm in having it slightly too big?
[14:03:25] <jepler> that's in kB, right?
[14:03:35] <SWPadnos> yes, I think it's in kB
[14:03:47] <SWPadnos> I guess the only harm should be on low memory systems
[14:04:10] <SWPadnos> 20M is a lot, but not on 256-512 and higher memory systems
[14:04:50] <jepler> hal's main shared memory area is just over 256kB and extra shared memory areas are typically on the order of a few hundred kB or less. I don't see how you'd get to 8192, frankly.
[14:05:25] <SWPadnos> yeah, I don't understand it either
[14:05:33] <SWPadnos> (that's not saying much, but still)
[14:05:40] <jepler> as far as I know that limit is per process not per user or per system.. if it was per user, then multiply the number of userspace HAL components by the size of the hal shared memory area...
[14:06:37] <jepler> hm, maybe I'm mistaken. strlimit RLIMIT_MEMLOCK ... sets a maximum on the total bytes in shared memory segments that may be locked by the real user ID of the calling process
[14:06:47] <jepler> setrlimit, not strlimit
[14:07:36] <jepler> If you're changing anything on the wiki, change it to say '* - memlock' instead of '* hard memlock' so that it changes the default soft limit too
[14:07:45] <SWPadnos> yep
[14:07:51] <jepler> one system recently tested on 10.04 (dewey?) had to explicitly lift the soft limit up to the hard limit
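
The "lift the soft limit up to the hard limit" step mentioned above can also be done from inside a program. A minimal sketch, assuming a POSIX system; the function name `raise_memlock_soft_limit` is hypothetical and is not taken from the EMC sources:

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_MEMLOCK limit to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit/setrlimit). */
int raise_memlock_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == rl.rlim_max)
        return 0;                  /* soft limit already equals hard limit */

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_MEMLOCK, &rl);
}
```

This only helps when the hard limit is already large enough; raising the hard limit itself still needs the limits.conf entry (or privileges).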
[14:08:03] <SWPadnos> mozmck, I think
[14:08:19] <SWPadnos> maybe both
[14:09:16] <SWPadnos> it's amazing how much slower the mouse cursor seems to move on my 1920x1200 laptop vs. the 1366x768 one
[14:11:42] <jepler> OK, the only RT system I have handy is 5.10 (!), but I did a little test there
[14:12:23] <jepler> with ulimit -l 300 halrun will start and load 10 instances of userspace components; ulimit -l 100 will not start at all
[14:12:25] <SWPadnos> src/rtapi/rtapi_ulapi.c has RECOMMENDED at 20480*1024
[14:12:45] <jepler> so either the accounting is per process OR the same locked memory in multiple processes isn't counted multiple times towards the limit
[14:12:53] <SWPadnos> it's supposed to print a warning if the limit is below that
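
A check of the kind described here could look roughly like the following. This is a sketch, not the code in the EMC tree: only the 20480*1024 value comes from the log, while the names `RECOMMENDED_MEMLOCK`, `memlock_limit_too_low` and `warn_if_memlock_low` and the message text are made up for illustration. Note that `ulimit -l` and limits.conf use kB, while `getrlimit()` reports bytes.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Recommended limit in bytes; the 20480*1024 figure is the one quoted in the log. */
#define RECOMMENDED_MEMLOCK (20480UL * 1024UL)

/* Hypothetical helper: returns 1 if a soft limit (in bytes) is below the recommendation. */
int memlock_limit_too_low(rlim_t soft_limit)
{
    if (soft_limit == RLIM_INFINITY)
        return 0;
    return soft_limit < (rlim_t)RECOMMENDED_MEMLOCK;
}

/* Hypothetical helper: prints a warning if the current soft limit is too low.
 * Returns 1 if a warning was printed, 0 if the limit is fine, -1 on error. */
int warn_if_memlock_low(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (memlock_limit_too_low(rl.rlim_cur)) {
        fprintf(stderr,
                "locked memory limit is %lu kB, recommended at least %lu kB\n",
                (unsigned long)(rl.rlim_cur / 1024),
                RECOMMENDED_MEMLOCK / 1024);
        return 1;
    }
    return 0;
}
```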
[14:14:45] <jepler> hmmm
[14:15:13] <jepler> if the limit is very low (ulimit -l 10) it dies earlier. If the limit is higher (-l 100) it reaches the point that gives that notice before dying
[14:16:05] <mozmck-6core> jepler, so you had to set soft limit too. Should we change the install script to use "* - memlock"?
[14:16:31] <jepler> mozmck-6core: YES we should change it (and I'm about to push that change). NO, I haven't had a system where I had to set the soft limit, BUT I haven't tested any 10.04
[14:17:09] <mozmck-6core> ah, ok. Thought you just said you tested 10.04
[14:17:31] <mozmck-6core> oh, dewey tested it I see.
[14:18:12] <mozmck-6core> bbl
[14:18:36] <SWPadnos> gah. why doesn't git-config core.pager seem to work
[14:18:57] <SWPadnos> git-config --global core.pager less <-- shouldn't that change the pager to less?
[14:20:55] <SWPadnos> oh, maybe it does work, and I had less than 1 screen of output :)
[14:26:24] <jepler> hm, is there a way to find the current amount of locked memory? getrusage() doesn't have it, and getrlimit() doesn't point at any other promising-sounding manpages.
[14:27:13] <SWPadnos> vmstat or similar?
[14:27:34] <SWPadnos> programmatically I'm at a loss though
[14:34:44] <jepler> Yeah, I was wondering if I can say in the message how much is already locked
[14:34:51] <CIA-1> EMC: jepler v2.4_branch * rbd58de1ce960 /debian/emc2.postinst: Set both the soft and hard memory limits
[14:34:51] <jepler> no big deal
[14:34:52] <CIA-1> EMC: jepler v2.4_branch * r576ad114d3f3 /src/rtapi/rtai_ulapi.c: Warn about locked memory in both locations that can fail
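
On the question of finding the amount of memory currently locked: on Linux, one place that reports it is the `VmLck` field of `/proc/<pid>/status`. A minimal sketch, assuming a Linux /proc filesystem; the helper names `parse_vmlck_line` and `current_locked_kb` are hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: if `line` is the VmLck line of /proc/<pid>/status,
 * store its value (in kB) in *kb_out and return 1; otherwise return 0. */
int parse_vmlck_line(const char *line, long *kb_out)
{
    return sscanf(line, "VmLck: %ld kB", kb_out) == 1;
}

/* Hypothetical helper: locked memory of the calling process in kB,
 * or -1 if /proc/self/status is unavailable or has no VmLck line. */
long current_locked_kb(void)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (parse_vmlck_line(line, &kb))
            break;
    }
    fclose(f);
    return kb;
}
```

The value is per process, so it does not by itself answer how the kernel accounts shared locked segments across several processes of one user.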
[15:37:43] <SWPadnos> so, I don't actually see where any memory is explicitly locked at all
[15:38:10] <SWPadnos> kernel memory is locked by default AFAIK, but userspace apps don't allocate kernel memory
[15:45:50] <jepler> I think the deal is that rtai memory is accounted as locked memory
[15:46:52] <jepler> ./base/include/rtai_shm.h: if ((adr = mmap(start, size, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_LOCKED, hook, 0)) == MAP_FAILED) {;
[15:47:12] <jepler> looks like rtai specifies the MAP_LOCKED flag for memory it maps
[15:52:14] <SWPadnos> ok, I was searching around in src/, so I guess I missed the include stuff
[17:14:06] <micges> Dave_911: I think I'll use rtai serial api for my needs, my driver is hanging out often with no reason
[17:14:22] <micges> maybe rtai will work better
[18:43:33] <mozmck_work> looks like the booting problem with grub2 may be fixed by some recent ubuntu updates.
[18:45:16] <mozmck_work> two machines now seem to be booting reliably with grub2
[18:45:34] <cradek> yay!
[18:46:32] <mozmck_work> :) needs more testing but it's looking hopeful.
[18:48:07] <jepler> which package and version specifically?
[18:48:15] <skunkworks> Nice work! I am liking lucid
[18:54:14] <mozmck_work> jepler, not sure, but I know it updated mountall yesterday.
[18:55:17] <mozmck_work> I noticed my 6core was booting every time last night (it hadn't even been booting reliably with the ubuntu kernel) so I did the updates here and rebooted a couple of times and it came right up.
[19:29:58] <morficmobile> SWPadnos: thanks, lathe sim now runs much quicker in 10.4
[20:33:15] <Dave911> micges: hanging out ... that stinks..
[20:34:30] <Dave911> The standard termios is pretty quick... and of course that doesn't run in realtime
[20:34:57] <Dave911> Seems to be much more reliable than anything I did in Windows before.
[20:39:25] <jepler> Dave911: looking at your change, you did two things. First, you changed the packet size calculation. Second, you changed SerialReceive to continue reading until it gets that many, as opposed to using VMIN/VTIME to read the desired byte count with timeout. Is that accurate? Are there other substantive changes in there?
[20:43:24] <Dave911> Dropped the 300 baud setting ... I hope that is insignificant also ... but that is pretty much it.
[20:44:10] <jepler> are you using on-board RS232 serial, or a USB-serial module of some sort?
[20:44:39] <Dave911> On board RS-232...
[20:45:44] <Dave911> However when testing with a Modbus simulator program, the Modbus RTU link was made through a USB to RS232 converter on the other PCs end ... Didn't seem to have an effect.
[20:46:41] <Dave911> I didn't try running USB from EMC2 to a slave, however I didn't remove the ability to put a pause after a transmit.. if that is required by someone..
[20:46:53] <jepler> fixing the size calculation on its own didn't fix the problem?
[20:49:52] <Dave911> No, there were multiple errors.. the size calculations were wrong for just about every function code.. He was using the CRC check to determine if the message was any good. That routine works ..
[20:50:38] <Dave911> He would read the message ... if it was complete it passed the CRC... the message length was never checked.. since it was wrong it would not have worked anyway..
[20:51:13] <Dave911> I should say, the calculated message lengths were wrong.... hence he couldn't use them to check for the proper number of chars returned ...
[20:51:22] <Dave911> Sort of crazy stuff...
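
The two checks discussed here, the CRC and the expected message length, can be sketched as follows. This is not the Classic Ladder code: the function names are hypothetical, and the length table covers only the common function codes and normal (non-exception) responses as laid out in the Modbus specification.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001.
 * On the wire the low byte of the result is sent first. */
uint16_t modbus_crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Hypothetical helper: expected length in bytes of a normal RTU response,
 * including slave address and CRC. `quantity` is the number of coils or
 * registers requested. Returns 0 for function codes not handled here. */
size_t modbus_rtu_response_len(uint8_t function, uint16_t quantity)
{
    switch (function) {
    case 0x01: case 0x02:               /* read coils / discrete inputs */
        return 5 + (quantity + 7u) / 8u;
    case 0x03: case 0x04:               /* read holding / input registers */
        return 5 + 2u * quantity;
    case 0x05: case 0x06:               /* write single coil / register */
    case 0x0F: case 0x10:               /* write multiple coils / registers */
        return 8;
    default:
        return 0;
    }
}
```

An exception response is always 5 bytes (address, function code with the high bit set, exception code, CRC), so a receiver that waits for a fixed count has to allow for that shorter frame as well.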
[20:52:08] <Dave911> I went to port these changes back to the latest Classic Ladder code, and I just realized that he broke the modbus config on the newest version!?
[20:52:43] <jepler> it seems like if the calculated size was too small it would make the termios VMIN not work (say it needs to read 7 characters but VMIN is 6, you'll generally fail to read unless you delay so long that 7 bytes are all ready before the read is started)
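
For reference, the VMIN/VTIME setup being described looks roughly like this. A sketch only: the helper name `set_vmin_vtime` is hypothetical, and it just fills in a `struct termios` that the caller would then apply with `tcsetattr()`.

```c
#include <termios.h>

/* Hypothetical helper: configure non-canonical reads on a termios struct.
 * With vmin > 0 and vtime > 0, read() blocks until the first byte arrives,
 * then returns once vmin bytes have been received or once more than
 * vtime tenths of a second pass between two bytes. */
void set_vmin_vtime(struct termios *t, cc_t vmin, cc_t vtime)
{
    t->c_lflag &= ~(tcflag_t)ICANON;  /* VMIN/VTIME only apply in non-canonical mode */
    t->c_cc[VMIN] = vmin;
    t->c_cc[VTIME] = vtime;
}
```

With VMIN set to 6 for a 7-byte reply, a single read() can legitimately return after 6 bytes, which is the failure mode described above.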
[20:53:10] <Dave911> I'd like to move the Latest CL code to the EMC implementation..... but now I am not so sure ..
[20:53:29] <jepler> in emc we have stopped following upstream classicladder development due to its GPL3 license change -- we can't incorporate their new versions.
[20:53:45] <Dave911> I think you might remember me complaining about some of the termios docs
[20:54:22] <Dave911> CL didn't change to GPL3. The code I have is all LGPL2 still.. Unless he hasn't updated something else!
[20:54:34] <jepler> oh?
[20:55:47] <Dave911> Getting back to termios ... the read Vmin, Vtime .. simply doesn't work in some cases.. I found test sample code ... put it in and I could not get a read to block properly.. looked at the other Modbus implementation in EMC and they never tried to use the blocking function.. So I went their way ..
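
The alternative approach, reading in a loop until the expected byte count has arrived instead of relying on VMIN/VTIME, might be sketched like this. It is not Dave's actual change; the name `read_exact_timeout` is hypothetical and `select()` is just one way to get the timeout.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until `want` bytes have arrived.
 * `timeout_ms` is the longest wait allowed for each new chunk of data.
 * Returns the number of bytes read (less than `want` on timeout or EOF),
 * or -1 on error. */
ssize_t read_exact_timeout(int fd, unsigned char *buf, size_t want, int timeout_ms)
{
    size_t got = 0;

    while (got < want) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            break;                      /* timed out waiting for more data */

        n = read(fd, buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* end of file */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The caller compares the return value with the expected frame length and treats a short count as a timeout, then runs the CRC check on a complete frame.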
[20:57:03] <jepler> well, I looked at classicladder and I agree 0.8.008 is LGPL2.1 which is compatible with emc's license.
[20:57:08] <Dave911> Yep, just checked again.. classicladder.c says it is LGPL 2.1
[20:57:16] <Dave911> You beat me ..
[20:57:35] <Dave911> :-)
[20:57:41] <jepler> I wonder what I am thinking of, then...
[20:59:12] <Dave911> I'd like to get the newer CL graphics into the version in EMC if possible. It is much nicer..
[21:00:56] <Dave911> But his code sort of scares me... I think that Chris Morley (and you?) found a lot of bugs when it was integrated with emc a few years ago
[21:01:03] <SWPadnos> it's the other modbus library (used in gs2_vfd) that has gone to GPLv3
[21:01:07] <SWPadnos> or lgplv3
[21:01:32] <Dave911> Oh .. that is right .... you told me that Steve....
[21:01:55] <jepler> yes I bet you're right.
[21:02:01] <jepler> I had my wires crossed bad on that one
[21:02:31] <Dave911> I've gotta go but I will BBL ... let me know if you see any gaffs ;-) I'll check back here in a few hours ...

#emc-devel | Logs for 2010-05-26