#uml IRC Logs for 2007-10-01

--- Log opened Mon Oct 01 00:00:50 2007
00:17|-|balbir [~balbir@] has joined #uml
00:26|-|flamingo [] has quit [Quit: Leaving]
00:27|-|rasix [~jeruk@] has joined #uml
02:30|-|magotari_school [~d4b631fa@] has joined #uml
03:03|-|Baltam [] has quit [Remote host closed the connection]
03:08|-|Baltam [] has joined #uml
03:39|-|Urgleflogue [~plamen@] has quit [Quit: 01001110 01100101 01110010 01100100 00100001]
03:40|-|Urgleflogue [~plamen@] has joined #uml
04:03|-|magotari_school [~d4b631fa@] has quit [Quit: CGI:IRC]
04:10|-|magotari_school [~d4b631fa@] has joined #uml
04:10|-|tyler29 [] has joined #uml
04:21|-|magotari_school [~d4b631fa@] has quit [Quit: CGI:IRC (EOF)]
05:04|-|rasix [~jeruk@] has quit [Quit: Leaving]
05:55|-|tyler29 [] has quit [Ping timeout: 480 seconds]
06:05|-|tyler29 [] has joined #uml
06:14|-|tyler29 [] has quit [Read error: Connection reset by peer]
06:16|-|tyler29 [] has joined #uml
06:36|-|kokoko1 [~Slacker@] has quit [Read error: Connection reset by peer]
06:47|-|krau [~cktakahas@] has joined #uml
07:10|-|tyler29 [] has quit [Ping timeout: 480 seconds]
07:16|-|tyler29 [] has joined #uml
07:29|-|dang [] has quit [Quit: Leaving.]
07:30|-|krau [~cktakahas@] has quit [Ping timeout: 480 seconds]
07:35|-|krau [~cktakahas@] has joined #uml
08:01|-|dang [] has joined #uml
08:07|-|tyler29 [] has quit [Ping timeout: 480 seconds]
08:24|-|tyler29 [] has joined #uml
08:51|-|tyler29 [] has quit [Ping timeout: 480 seconds]
09:01|-|tyler29 [] has joined #uml
09:59|-|flamingo [] has joined #uml
10:21|-|Magotari_ changed nick to Magotari
10:24|-|jdike [] has joined #uml
10:24<jdike>Hi guys
10:26<flamingo>hi jdike, trying to use os_create_unix_socket() in my driver doesn't seem to work, though I followed the same sample as in mconsole_kern.c. Are there any restrictions on using that function in a driver?
10:29<flamingo>hm, I'm not sure what's going wrong, will try placing the call in the driver's init and check
10:31<Magotari>Mr. Jeff Dike, I presume? My name is Karol Swietlicki, and I came here because I have a problem with User Mode Linux.
10:31<Magotari>Memsplit related.
10:32<jdike>yup, that be me
10:32<Magotari>I changed my host kernel to a different one.
10:32<Magotari>After that all UML guests stopped working.
10:32<Magotari>I tracked it back to memsplit settings.
10:33<Magotari>I did set the right thing in the guest, under host memory split.
10:33<jdike>the old split was 3/1?
10:33<jdike>what was the new?
10:33<Magotari>And the new one was 3/1 opt.
10:33<Magotari>Kernel 2.6.23-rc7
10:33<Magotari>Both guest and host.
10:34<jdike>what's the difference?
10:34<Magotari>It would not work. the_hydra told me to compile my guests 2/2 and it worked again.
10:34<Magotari>100 megabytes of ram accessible to me when I set 3/1opt on the host.
10:35<Magotari>Otherwise I get just ~900M, and I have 1G on my computer.
10:35<Magotari>And I want that extra ram. :)
10:35<jdike>so, 3/1opt makes the split (3G-100M)/(1G+100M)?
10:35<Magotari>Erm, not quite.
10:36<Magotari>I think it is 3G/1G instead of 3G+100/1G-100
10:36<Magotari>I am a newbie.
10:36<Magotari>Anyway, I think that is a bug, because even though I set the right thing UML would not run.
10:36<jdike>I'm unclear as to how this would affect UML
10:37<jdike>it cares about where its address space ends
10:37<jdike>for 3/1, that's 0xc0000000
10:37<jdike>is it different for 3/1opt?
10:37<Magotari>No idea. I am sorry, I have no kernel experience.
10:38<Magotari>In fact this is the first time I am dealing with a kernel hacker, so I am quite nervous.
10:38<jdike>There is this option in the UML config
10:38<jdike>3G/1G user/kernel host split (for full 1G low memory)
10:38<jdike>which I guess is what you set
10:38<Magotari>Yes. I am aware of it.
10:38<Magotari>Yes, that is correct.
10:38<Magotari>I can supply any .configs that you might want.
10:39|-|hfb [] has joined #uml
10:40<jdike> default 0xB0000000 if HOST_VMSPLIT_3G_OPT
10:40<jdike>that's where UML considers the top of the address space to be in this case
10:40<jdike>which seems reasonable - it's 256M lower
10:41<Magotari>Yeah, it does seem reasonable, considering we are moving only about 100M
10:42<jdike>what happens with UML in this case?
10:42<Magotari>Kernel panic. Shall I get the exact message?
10:43<Magotari>Ok. Let me compile an opt kernel again, I got rid of all of them.
10:43<flamingo>jdike, I'm getting an error (errno = 13, EACCES) when I do an os_create_unix_socket() in a directory where I have write permissions. In mconsole, there is a call to umid_file_name which creates a unique directory with the mkstemp trick. Is this the only directory where I can write?
10:45<flamingo>it's mconsole_kern.c instead of mconsole
10:48<Magotari>jdike: Kernel panic - not syncing: start_userspace : expected SIGSTOP, got status = 256
10:48<jdike>flamingo, no, it's not
10:50<jdike>man 7 unix doesn't mention EACCES
10:50<flamingo>I was checking /usr/include/asm/errno.h
10:51<flamingo>the call to os_create_unix_socket returned this number. I looked at the code and I saw it is returning errno
10:52<jdike>which system call failed?
10:53<jdike>that's not a system call
10:55<flamingo>um, I can't point to the exact sys call which failed coz I'm just checking the return value. hm, there are 3 calls - socket(), bind() and fcntl() (called by os_set_exec_close() )
10:55<Magotari>jdike: I have a backtrace too, if you are interested.
10:56<jdike>can you strace UML or gdb it through that call?
10:56<Magotari>But I think you should just help flamingo first, I am in no rush. Sorry if I am too pushy.
10:56<jdike>probably not the fcntl because that results in a printk
10:56<Magotari>I only used gdb a few times, but I can strace it, yeah.
10:56<jdike>Magotari, the backtrace isn't too interesting in this case
10:57<jdike>it created a new process, which failed
10:57<jdike>the reason it failed is interesting, but that doesn't show up in the backtrace
10:57<Magotari>If you tell me how to do the gdb thing, then I can. Otherwise I can send the strace output somewhere.
10:57<Magotari>You know...
10:57<Magotari>I am a newbie, but I want to learn.
10:58<Magotari>So if you could tell me what needs to be done, I can read and learn how to do it.
10:58|-|mgross [] has joined #uml
10:59<jdike>Magotari, when you do 'cat /proc/$$/maps', what's the last line of output?
10:59<Magotari>ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
11:02<jdike>OK, how about the line before that
11:02<jdike>something that doesn't say vdso
11:02|-|hfb [] has quit [Ping timeout: 480 seconds]
11:03<Magotari>afb0d000-afb22000 rw-p affeb000 00:00 0 [stack]
11:03<Magotari>That is the second last line.
11:03<Magotari>The next one is the vdso one.
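A minimal sketch of the check jdike asks for here: read a process's maps listing and report where the highest mapping other than the vdso ends. The function name `top_of_user_mappings` is hypothetical and the sample text is just the two lines Magotari pasted. On a real system the input would come from `/proc/self/maps`.

```python
def top_of_user_mappings(maps_text):
    """Return the end address of the highest mapping that is not the vdso."""
    top = 0
    for line in maps_text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with "start-end".
        if not fields or "-" not in fields[0]:
            continue
        # The vdso (and vsyscall) page sits above the usable address space.
        if fields[-1] in ("[vdso]", "[vsyscall]"):
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        top = max(top, end)
    return top


# The two lines pasted in the channel.
sample = """\
afb0d000-afb22000 rw-p affeb000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
"""

if __name__ == "__main__":
    print(hex(top_of_user_mappings(sample)))
```

With these two lines the stack ends at 0xafb22000, which is below the 0xB0000000 top that UML's Kconfig assumes for HOST_VMSPLIT_3G_OPT.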
11:07<jdike>there were no errors before the panic?
11:08<Magotari>Let me see...
11:09<Magotari>I missed that one: mapping mmap stub failed, errno = 12
11:09<Magotari>RIGHT before the panic.
11:10<jdike>better information
11:11<Magotari>Sorry, I missed it the first twenty times I had a panic.
11:18|-|hfb [] has joined #uml
11:22<jdike>interested in testing a patch?
11:22<jdike>this is just better diagnostics
11:23<Magotari>i am on the phone
11:26<Magotari>need 30 minutes, phone
11:33<Magotari>Ok. Ready to test the patch.
11:35<Magotari>The patch did not apply. (I am using 2.6.23-rc7) I will apply it by hand.
11:38<jdike>I'm using rc8-mm2
11:38<jdike>but it's a simple patch
11:42<Magotari>Compmapping mmap stub at 0xbfffe000 failed, errno = 12
11:42<Magotari>mapping mmap stub at 0xbfffe000 failed, errno = 12
11:42<Magotari>Sorry about that.
11:48<jdike>this is with VMSPLIT_3G_OPT?
11:48<Magotari>Affirmative. Both host and guest are set to 3G_OPT
11:49<Magotari>Also fails (obviously) with guest set to 3G. Works for 2/2 guests.
11:53<jdike>grep STUB .config
11:53<Magotari>Host or guest?
11:54<Magotari>That was guest.
11:55<Magotari>Host returned nothing.
11:57<jdike>grep VMSPLIT .config
12:00<jdike>the Kconfig logic for selecting the top of the address space isn't working
12:01<jdike>works here
12:01<jdike># CONFIG_HOST_VMSPLIT_3G is not set
12:01<jdike># CONFIG_HOST_VMSPLIT_2G is not set
12:01<jdike># CONFIG_HOST_VMSPLIT_1G is not set
12:03<Magotari>So this is not a problem with UML, but with my toolchain?
12:03<jdike>Nope, it's a UML problem
12:03<jdike>config STUB_CODE
12:03<jdike> hex
12:03<jdike> default 0xbfffe000 if !HOST_VMSPLIT_2G
12:03<jdike> default 0x7fffe000 if HOST_VMSPLIT_2G
12:03<jdike>We went to the trouble of calculating TOP_ADDR correctly
12:03<jdike>then completely ignore it when calculating the stub addresses
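The arithmetic behind this diagnosis can be sketched as follows. Both hardcoded STUB_CODE defaults sit 0x2000 below a split boundary (0xc0000000 and 0x80000000), so the sketch assumes the stub lives two 4 KiB pages below the top of the address space. The names `PAGE_SIZE`, `STUB_PAGES` and `stub_code` are hypothetical, and the 3G_OPT result is only what that assumption implies, not necessarily what the eventual patch computes.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages on i386
STUB_PAGES = 2      # assumption: inferred from 0xc0000000 - 0xbfffe000 == 0x2000


def stub_code(top_addr):
    """Place the stub a fixed number of pages below the top of the address space."""
    return top_addr - STUB_PAGES * PAGE_SIZE


if __name__ == "__main__":
    # Top addresses: 3G and 3G_OPT are quoted in the log, 2G is inferred
    # from the 0x7fffe000 default.
    for name, top in [("3G", 0xC0000000), ("3G_OPT", 0xB0000000), ("2G", 0x80000000)]:
        print(name, hex(stub_code(top)))
```

Under that assumption a 3G_OPT guest would need its stub at 0xafffe000, yet the Kconfig default hands it 0xbfffe000, which lies above the 0xB0000000 top and matches the address in the failing "mapping mmap stub at 0xbfffe000" message.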
12:08|-|flamingo [] has quit [Ping timeout: 480 seconds]
12:10|-|krau [~cktakahas@] has quit [Quit: Varei!!!]
12:13<Magotari>I have to go off for a while again, phone stuff. Be back soon.
12:14|-|tyler29 [] has quit [Ping timeout: 480 seconds]
12:37<Magotari>And back.
12:42<jdike>you can't do math in a Kconfig file
12:43<Magotari>Don't you think hardcoding would do the trick?
12:43<Magotari>Not like there are many options, we can just put them all in, till we exhaust the possibilities.
12:43|-|flamingo [] has joined #uml
12:46<Magotari>The only other way I see would be to make UML check at runtime, and adjust itself accordingly. Then again, I am just a newbie, so I will shut up.
12:54<jdike>We've thought about that
12:54<jdike>there's some trickiness to figuring out where the top of your address space is
12:54<jdike>but once you have that, the rest is easy
13:00<jdike>Want to try a patch?
13:01<Magotari>I don't know C, I am new to Linux... All I can do to help is write stuff, test patches and report bugs.
13:01<Magotari>And I do want to help.
13:01<jdike>it includes the previous patch, so back that out
13:01<jdike>also against rc8-mm2 so there might be conflicts, but it's very simple
13:02<Magotari>I will get rc8-mm2.
13:04<Magotari>Eh, big download and slow link. On second thought, I will try my luck with the patch.
13:05|-|flamingo [] has quit [Ping timeout: 480 seconds]
13:12|-|krau [~cktakahas@] has joined #uml
13:13<Magotari>No good. Failures abound. Gotta get -mm after all. *sigh*
13:14<jdike>it's a s/UML_CONFIG// mostly
13:14<jdike>other than that, it's removing stuff from Kconfig.i386 and rearranging as-layout a little
13:29|-|tyler29 [] has joined #uml
13:33<Magotari>Ok, it patched fine now.
13:33<Magotari>Gonna compile and test in a moment.
13:34<Magotari>Hey, UML got tickless in -mm? Nice.
13:37<Magotari>I was about to explode in joy, but then I saw that SMP is for TT mode only.
13:38<jdike>yeah, that's still a work in progress
13:39<Magotari>With TT?
13:39<Magotari>I thought that TT was removed from one of the recent kernel releases...
13:39<jdike>it's gone from -mm
13:39<jdike>no, with skas
13:40<Magotari>Good luck with it. If you ever want someone to act as a guinea pig, I volunteer. :) Anyway, the mm kernel compiled. Trying it now.
13:42<Magotari>It worked.
13:42<Magotari>I have booted my other linux partition with it. Seems to be working fine.
13:42<jdike>stupid bug
13:43<Magotari>Thank you for your work. That was really nice of you to fix this.
13:44<Magotari>I have some more strange things about UML to talk about, but that is when you have time. I really don't want to steal your day on fixing my bugs.
13:44<Magotari>Hmm... Actually, the -mm UML seems to be using 100% CPU.
13:46<jdike>strace it and see what's going on
13:46<jdike>or gdb it and get a backtrace
13:46<Magotari>First I want to try the Fedora image from your website.
13:47<Magotari>Booting a real partition is my favourite use for UML, but it might cause problems.
13:47<Magotari>So best to confirm on the baseline thing.
13:49|-|tyler29 [] has quit [Ping timeout: 480 seconds]
13:50<Magotari>top is going crazy inside the Fedora image. Refreshing like mad.
13:50<jdike>timer problems
13:50<jdike>kill top and do time sleep 1
13:51<jdike> time sleep 1
13:51<jdike>real 0m1.032s
13:51<jdike>seems OK here
13:51<Magotari>real 0m1.016s
13:51<Magotari>Seems fine too.
13:52<Magotari>But yet UML eats 100% on the host.
13:52<Magotari>I did NOT enable tickless.
13:52<jdike>it's automatic
13:52<Magotari>I just copied over my config from rc7. Could this be a problem?
13:54<jdike>run oldconfig to make sure
13:57<jdike>top uses select for timing
13:57<jdike>14:55:01.752727 select(1, [0], NULL, NULL, {3, 0}) = 0 (Timeout)
13:57<jdike>14:55:04.722689 fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE) = 0
13:57<Magotari>Done, but the oldconfig kernel does not do any better. Still 100%.
13:59<jdike>OK, strace -tt -o /tmp/x top
13:59<jdike>wait a bit, then look at /tmp/x
13:59<Magotari>Do I do this on host or guest?
14:00<Magotari>Yeah, guest does not have strace. Any strace enabled images on nagafix?
14:00<jdike>yum install it
14:00<jdike>those images are fairly minimal
14:01<Magotari>Oh, I know how to install things. The problem is setting up a network. I will use my real partition again, I can emerge it there with no problem.
14:01<Magotari>And the network is all ready.
14:02|-|tyler29 [] has joined #uml
14:04<Magotari>Yeah... But the network is not working.
14:04<Magotari>And I am quite sure it worked with rc7.
14:04<jdike>UML network?
14:06<Magotari>I have just one guest using a TUN/TAP network with my real network.
14:06<Magotari>And this guest now eats 100% CPU and won't network.
14:07<Magotari>I will compile mm from scratch now. Just in case.
14:07<Magotari>Too much breakage to just happen like that.
14:07<Magotari>Probably my bad config.
14:07<jdike>but I have someone else complaining about the network not working
14:07<jdike>so it might be my bug
14:07|-|flamingo [] has joined #uml
14:08<jdike>if you're seeing strange UML things, stay here and talk to me
14:09<jdike>I need to get things fixed, and there are enough changes going into 2.6.24 that I'm worried about what I broke
14:12<Magotari>I will gladly help with whatever I can.
14:13<Magotari>Compiling mm from scratch and defconfig
14:13<Magotari>I left tickless on off.
14:13<Magotari>Erm. I did not enable tickless. That sounds clearer.
14:16<Magotari> * WARNING: net.eth0 has started but is inactive
14:16<Magotari>This is Gentoo's init. No idea what it means, but this message was not there before.
14:17<Magotari>Still 100% cpu.
14:17<Magotari>I am going to compile an rc7 kernel to get the network back to get strace back.
14:17<Magotari>Er no. To get strace. It was never there.
14:22<jdike>that patch breaks x86_64
14:25<Magotari>Yeah. That would hurt.
14:28<jdike>Forgot to fix Kconfig.x86_64
14:31<Magotari>Using a 2/2 kernel, booting sda3. top is sane again. Network is up.
14:38<Magotari>Ok, got strace working. Now I am back in -mm and doing the strace thing.
14:43<Magotari>And here is the select bit: 19:37:57.904289 select(1, [0], NULL, NULL, {3, 0}) = 0 (Timeout)
14:43<Magotari>Same as yours, really.
14:46<Magotari>Host top:
14:46<Magotari>21:44:29.737513 select(1, [0], NULL, NULL, {3, 0}) = 0 (Timeout)
14:46<Magotari>21:44:32.736762 fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK) = 0
14:46<Magotari>Guest top:
14:46<Magotari>19:37:57.904289 select(1, [0], NULL, NULL, {3, 0}) = 0 (Timeout)
14:46<Magotari>19:37:57.906011 fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE) = 0
14:47<Magotari>And yes, the network seems busted.
14:49[~]fo0bar is starting to doubt his resolve in staying with debian etch's 2.6.18 base for UML kernels
14:49<fo0bar>nonetheless, looks like my backport of 2.6.22's arch/um/kernel/time.c to 2.6.18 was successful :)
15:00<jdike>except that a 3 second timeout finishes in .002 sec
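A small sketch of how to read the two strace excerpts above: subtract the timestamp on the select() line from the timestamp on the following line to see how long the 3 second timeout really took. The helper names `to_micros` and `elapsed_micros` are hypothetical, and the timestamps are the ones pasted at 14:46. Integer microseconds are used so the subtraction is exact.

```python
def to_micros(stamp):
    """Convert an strace -tt timestamp (HH:MM:SS.ffffff) to microseconds."""
    hms, frac = stamp.split(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    whole = (hours * 60 + minutes) * 60 + seconds
    return whole * 1_000_000 + int(frac.ljust(6, "0"))


def elapsed_micros(before, after):
    """Microseconds between two timestamps taken on the same day."""
    return to_micros(after) - to_micros(before)


if __name__ == "__main__":
    host = elapsed_micros("21:44:29.737513", "21:44:32.736762")
    guest = elapsed_micros("19:37:57.904289", "19:37:57.906011")
    print("host  select() took", host, "us")
    print("guest select() took", guest, "us")
```

The host's select() takes 2999249 us, essentially the full three seconds, while the guest's returns after 1722 us, which is the ".002 sec" jdike refers to.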
15:01<jdike>Magotari, can you send me your .config?
15:01<Magotari>host, guest?
15:02<Magotari>Sorry I keep asking. Is a pastebin ok, or do you want a file? I am using irssi and I don't know how to dcc here.
15:02<jdike>pastebin is OK
15:04<Magotari>Here you go.
15:04<Magotari>That is the -mm kernel.
15:07|-|tyler29 [] has quit [Ping timeout: 480 seconds]
15:08[~]jdike builds it
15:10<jdike>x86_64 works better
15:16<Magotari>Got something interesting?
15:16<jdike>I wouldn't say it's going crazy, but top is very very alert
15:16<Magotari>Yeah, too much coffee. Way too much.
15:17<Magotari>I had a hard time killing it, it was that alert.
15:18<jdike>16:17:38.945428 select(1, [0], NULL, NULL, {3, 0}) = 0 (Timeout)
15:18<jdike>16:17:39.590150 fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE) = 0
15:18<jdike>so, 3 sec == .45 sec
15:18<jdike>not as bad as you're seeing
15:19<Magotari>About half.
15:19<Magotari>Probably this is significant somehow.
15:19<Magotari>My user mode linux is pretty close to doing an infinite loop in five seconds now. ;)
15:20<Magotari>Really, pretty horrible. I bet it has to do with me not enabling real time clock. Let me test this idea.
15:22<jdike>the real-time clock option went away with tickless
15:22<Magotari>No real time clock option in -mm?
15:22<jdike>maybe that was a mistake
15:22<Magotari>But I did not enable tickless.
15:22<jdike>I forgot that it was still an option
15:23<Magotari>I *NEVER* had real time clock enabled anywhere anyway.
15:23<jdike>that probably explains it
15:23<jdike>I went through and ripped out stuff that depended on ticking clocks
15:23<Magotari>I enabled tickless. Gonna try now.
15:23<jdike>and now I have to put it back
15:24<jdike>whoops, thought you said you never enabled tickless
15:24|-|tyler29 [] has joined #uml
15:25<Magotari>jdike: No, I enabled tickless just now.
15:25<jdike>OK, so you didn't have it before?
15:25<Magotari>To test if enabling it will fix the problem. It seems to have done it.
15:25<Magotari>No, never.
15:25<Magotari>And I think my network is back too.
15:31<Magotari>UML with tickless feels strange.
15:34<Magotari>First, the performance feels sluggish. Second, the host feels sluggish too.
15:34<Magotari>I will get a benchmarking tool later, and try to measure it.
15:34<Magotari>Might be just some -mm thing, unrelated to UML.
15:35<Magotari>top shows my X working a lot.
15:36<Magotari>7% of work with top running on UML. UML gets no % at all, as seen from the host.
15:38<Magotari>Anyway, just feels different.
15:43<jdike>OK, builds and runs on both i386 and x86_64
15:53<Magotari>It might be another timer issue, but a context switch benchmark gave me only half the performance of 2.6.23-rc3. The arithmetic benchmark gave me double the performance of rc3. *sigh*
15:53<Magotari>It sure did not feel like the programs are working for 5 seconds, as they were supposed to do.
15:58<Magotari>jdike: Sorry for the monologue there. I got carried away. In other news, I must go to sleep soon. It is one hour before midnight, and I must get up soon. I can come tomorrow, if that would help you.
15:58<jdike>I might have more patches for you
15:58<Magotari>If you have some in the next half an hour, I would be able to test today.
16:00<jdike>putting the real-time clock stuff back will take longer than that
16:01<Magotari>I see.
16:02<Magotari>I really cannot stay up today, sadly.
16:09|-|tyler29 [] has quit [Remote host closed the connection]
16:12|-|flamingo [] has quit [Ping timeout: 480 seconds]
16:17<jdike>Magotari, still there?
16:17|-|dang [] has quit [Quit: Leaving.]
16:19<jdike>want to be CC-d on the patch?
16:20<jdike>your email address, then (pm if you want)
16:20<jdike>I'm about to send it to Andrew
16:21<Magotari>$Mynick at gmail dot com
16:21<Magotari>Please expand the variable.
16:22|-|Magotari [] has quit [Quit: leaving]
16:25|-|tyler29 [] has joined #uml
16:45|-|kos_tom [] has joined #uml
17:05|-|tyler29 [] has quit [Read error: Connection reset by peer]
17:06|-|tyler29 [] has joined #uml
17:19|-|flamingo [] has joined #uml
17:28|-|kos_tom [] has quit [Ping timeout: 480 seconds]
17:44<flamingo>jdike, we had a conversation earlier about the os_create_unix_socket() call which was failing. You had asked me to find which system call was failing. I found that the call is failing in bind() with errno EACCES. I'm trying to open a socket in the /tmp/ area, which is world writable; I'm wondering which I'm getting an EACCES error
17:45<flamingo>s/wondering which/wondering why/
18:07|-|tyler29 [] has quit [Ping timeout: 480 seconds]
18:35<flamingo>jdike, I take back the question, it was a pathetically stupid error on my part, despite the comment in the os_create_unix_socket() implementation about the overflow. Sigh!
18:43[~]jdike reads the comments
18:48|-|dang [] has joined #uml
18:57|-|jdike [] has quit [Quit: Leaving]
18:57|-|mgross [] has quit [Ping timeout: 480 seconds]
18:58|-|hfb [] has quit [Quit: Leaving]
19:02|-|flamingo [] has quit [Ping timeout: 480 seconds]
19:35|-|_Hunger [] has joined #uml
22:58|-|VS_ChanLog [] has left #uml [Rotating Logs]
22:58|-|VS_ChanLog [] has joined #uml
--- Log closed Tue Oct 02 00:00:06 2007