#uml IRC Logs for 2007-01-05

---Logopened Fri Jan 05 00:00:57 2007
00:38|-|MrX [~chaos@] has quit [Ping timeout: 480 seconds]
00:41|-|MrX [~chaos@] has joined #uml
00:53|-|albertito [] has quit [Ping timeout: 480 seconds]
01:48|-|tchan [] has joined #uml
04:24|-|Netsplit <-> quits: ElectricElf, linbot, MrX, bartman, Urgleflogue
04:24|-|da-x [] has joined #uml
04:25|-|Netsplit over, joins: MrX, Urgleflogue, linbot, bartman
04:25|-|Netsplit over, joins: ElectricElf
05:15|-|tyler [] has joined #uml
05:35|-|richardw [] has joined #uml
05:49|-|tyler [] has quit [Ping timeout: 480 seconds]
06:05|-|tyler [] has joined #uml
07:17|-|mkhl [] has joined #uml
07:45|-|MrX [~chaos@] has quit [Ping timeout: 480 seconds]
07:46|-|MrX [~chaos@] has joined #uml
07:51|-|tyler [] has quit [Ping timeout: 480 seconds]
09:18|-|tyler [] has joined #uml
09:20|-|ram [] has joined #uml
09:25|-|albertito [] has joined #uml
09:28|-|richardw_ [] has joined #uml
09:31|-|Newsome [] has joined #uml
09:35|-|richardw [] has quit [Ping timeout: 480 seconds]
10:30|-|silug [~steve@] has joined #uml
11:00|-|ram [] has quit [Ping timeout: 480 seconds]
11:02|-|tyler [] has quit [Ping timeout: 480 seconds]
11:17|-|mkhl [] has quit []
11:19|-|tyler [] has joined #uml
11:21|-|kos_tom [] has joined #uml
11:21|-|hfb [] has joined #uml
11:41|-|albertito [] has quit [Read error: Connection reset by peer]
11:42|-|jdike [~jdike@] has joined #uml
11:43<jdike>Hi guys
11:43<richardw_>hi jdike
11:44|-|albertito [] has joined #uml
11:48|-|hfb [] has left #uml [Leaving]
11:53|-|richardw_ [] has quit [Quit: Leaving]
12:09|-|the_hydra [~mulyadi@] has joined #uml
12:09<the_hydra>hi everybody
12:19<albertito>the_hydra: hi!
12:44|-|ram [] has joined #uml
12:44|-|tyler [] has quit [Ping timeout: 480 seconds]
13:01|-|tyler [] has joined #uml
13:29<the_hydra>bonjour monseiour kokoko1
13:30[~]the_hydra first attempt using French
13:30<kokoko1>jdike, the UML still crashing :(
13:30<kokoko1>dammit last time before minutes ago it was down for almost 16 minutes
13:31<kokoko1>paul: ok it has stopped for 16 mins <--- boss :(
13:32<kokoko1>the_hydra, howdy
13:32<the_hydra>oh ur boss is there when it stops booting?
13:32<the_hydra>kokoko1: howdy :)
13:33<kokoko1>yes he always there to kick sysadmins asses :S
13:35<the_hydra>consult dgraves, he has many ideas how to keep bosses away :)
13:35<the_hydra>oh wait, that's because he is the boss :)
13:36<jdike>kokoko1, I'd like to figure out how to get a core dump from it
13:36<kokoko1>jdike, i moved all hosts to tmpfs but it doesn't help in case of that particular uml (mail server)
13:36<the_hydra>jdike: make it SIGSEGV-ed?
13:37<kokoko1>how to take core dump :S
13:37<jdike>except that there's apparently no warning
13:37<kokoko1>yes no warning no high memory usage and it crashes
13:38|-|ElectricElf [] has quit [Ping timeout: 480 seconds]
13:38<jdike>refresh my memory - you have no /proc/sys/kernel/print-fatal-signals on this host, correct?
13:38<the_hydra>jdike: how about using some forensics guys' tool? procdump ?
13:39<the_hydra>jdike: you need core dump aka process address space dump,right?
13:39<kokoko1>jdike, yep no /proc/sys/kernel/print-fatal-signals
13:39<jdike>if I had a core dump, that could be most informative
13:40<jdike>is abort() supposed to cause a core dump?
13:41<the_hydra>not sure
13:41<jdike>it does
13:41<jdike>you just need ulimit -c unlimited
13:43<jdike>kokoko1, can you check that the UML's owner has write permission to the UML pwd and that its ulimits permit core dumps
13:43|-|ElectricElf [] has joined #uml
13:45<kokoko1>second one is not permitted
13:45<kokoko1>imean ulimit for uml owner
13:46<jdike>OK, it needs to be unlimited
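The two conditions jdike asks about (a core limit that permits dumps, and a writable working directory) can be checked from the shell that will start UML. A minimal sketch, assuming a POSIX shell on the host; the function name `core_dump_ready` is made up for illustration and is not part of UML or umlrun:

```shell
# Hypothetical helper: report whether a process started from this shell,
# with working directory "$1", would be allowed to write a core file there.
core_dump_ready() {
    dir=${1:-.}
    limit=$(ulimit -S -c)            # soft core-size limit of this shell
    if [ "$limit" = "0" ]; then
        echo "core limit is 0: run 'ulimit -c unlimited' first"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable by $(id -un)"
        return 1
    fi
    echo "ok: core limit=$limit, $dir is writable"
}
```

It would need to be run as the UML owner (vm1 in this log) from the directory UML is started in, since both the limit and the permission check depend on who runs it and where.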
13:47<the_hydra>jdike: or panic()
13:47<jdike>there's no warning
13:47<jdike>nothing inside UML has any clue that it is being killed
13:48<kokoko1>the_hydra, nothing in uml or host logs why UML killed
13:48<kokoko1>and its happening from last one month it crashes after day or two
13:48<kokoko1>jdike, looking for unlimited
13:49<the_hydra>kokoko1: personally I am really sorry to hear that... no solutions so far?
13:50<kokoko1>in vain
13:50<kokoko1>its very important uml our mail server and if it not stop crashing then boss got no excuse to ask us to move it to xen guest
13:51<jdike>kokoko1, I just got a good-looking core dump from a UML here
13:51<jdike>so just make sure of those two things and see what happens on the next crash
13:51<the_hydra>kokoko1: are there any special libraries or tools installed in your machine?
13:52<kokoko1>jdike, how you get it ? :)
13:52<jdike>kill -ABRT uml-pid
13:52<kokoko1>the_hydra, nothing only postfix , dovecot, SA, ldap
13:52<kokoko1>and yes mysql
13:53<kokoko1>jdike, this operation will take minutes ..?
13:53<jdike>which operation?
13:53<jdike>the kill?
13:53<kokoko1>yep it will kill the uml ?
13:53<the_hydra>core dumping you mean kokoko1 ?
13:54<the_hydra>no, it just sends signal
13:54<kokoko1>ah right
13:54<the_hydra>don't get confused with the name ;)
13:54<the_hydra>jdike: u pick ABORT because it isn't caught by UML sighandler?
13:54<the_hydra>if so then you have plenty of choice
13:54<jdike>because the default action for SIGABRT is a core dump
13:55<the_hydra>i see here SIGFPE, SIGILL, and so on
13:55<jdike>maybe others as well, but ABRT is the classical cause-a-core-dump signal
13:55<the_hydra>i see
13:55<the_hydra>ok just pick abort kokoko1
13:56<jdike>well, no
13:57<jdike>kokoko1 just needs to make sure the thing can dump core
13:57<jdike>then we wait for it to die
13:57<jdike>the SIGABRT was just a test here
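The SIGABRT test jdike describes can be tried on a throwaway process before relying on it for a real guest. A rough sketch, assuming a POSIX shell; `sleep` stands in for the UML binary here, and whether and where a core file actually appears depends on the limit the host grants and on its core naming settings, so the sketch only looks at the exit status:

```shell
# Start a disposable process with core dumps requested, then abort it.
# 'sleep' is only a stand-in; nothing here touches a real UML instance.
(
    ulimit -c unlimited 2>/dev/null || true   # may be refused if the hard limit is lower
    exec sleep 60
) &
victim=$!

sleep 1                      # give the subshell time to exec
kill -ABRT "$victim"         # default action for SIGABRT: terminate and dump core

status=0
wait "$victim" || status=$?
echo "exit status: $status"  # 128 + signal number when killed by a signal
```

An exit status of 134 (128 + 6, SIGABRT being signal 6 on Linux) indicates the process died from the signal rather than exiting on its own.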
13:57|-|tyler [] has quit [Read error: Operation timed out]
13:58<the_hydra>yes, I mean to dump core :)
13:59<kokoko1>what proc id should be ?
13:59<kokoko1>as UML is running under dtach
13:59<jdike>proc id?
14:00<jdike>what does the script that runs UML look like?
14:02<kokoko1>ps -U vm1 -oppid,pid,user,cmd
14:02<kokoko1> PPID PID USER CMD
14:02<kokoko1> 5035 5036 vm1 /linux mem=192M ubd0=root ubd1=var ubd2=swap eth0=tuntap,tap1,, con=pty con0=fd:0,fd:1 umid=vm1
14:02<kokoko1> 5036 5042 vm1 [linux]
14:02<kokoko1> 5036 5046 vm1 /linux mem=192M ubd0=root ubd1=var ubd2=swap eth0=tuntap,tap1,, con=pty con0=fd:0,fd:1 umid=vm1
14:02<kokoko1> 5036 5047 vm1 /linux mem=192M ubd0=root ubd1=var ubd2=swap eth0=tuntap,tap1,, con=pty con0=fd:0,fd:1 umid=vm1
14:02<kokoko1> 5036 5048 vm1 /linux mem=192M ubd0=root ubd1=var ubd2=swap eth0=tuntap,tap1,, con=pty con0=fd:0,fd:1 umid=vm1
14:02<kokoko1>[root@k2 ~]# ps -p 5035
14:02<kokoko1> PID TTY TIME CMD
14:02<kokoko1> 5035 ? 00:00:00 dtach
14:02<kokoko1>[root@k2 ~]# ps -p 5035 -oppid,pid,user,cmd
14:02<kokoko1> PPID PID USER CMD
14:02<kokoko1> 1 5035 root dtach -n console-socket compartment --user vm1 --group vm1 --chroot . /linux mem=192M ubd0=root ubd1=var ubd2=s
14:03<kokoko1>umlrun script
14:04<jdike>that's not complete
14:05<kokoko1>yes :S
14:06<the_hydra>kokoko1: just curious, are u using grsecurity patchset?
14:07<kokoko1>jdike, some more part of that script :S
14:07<kokoko1>the_hydra, nope
14:07<the_hydra>kokoko1: ok
14:08<kokoko1>damn its a long script
14:08<the_hydra>gee, what's that? perl wrapper?
14:08<jdike>how is the script run?
14:08<jdike>by hand?
14:08<the_hydra>sorry, I am not so familiar with it
14:09<kokoko1>you mean how we fire uml using this script
14:09<kokoko1>umlrun vm1 (inside /var/uml/vm1/ directory)
14:09<kokoko1>and there is a uml service running when host boot
14:10<kokoko1>to fire all uml on boot
14:10<jdike>can I see umlrun?
14:10<kokoko1>the three part script which i pasted is umlrun
14:11<jdike>what runs umlrun?
14:11<kokoko1>you remember you asked me to add these lines to this script..
14:11<kokoko1>&dosystem("mount -o bind /proc/mm proc/mm");
14:11<kokoko1> &dosystem("mount -o bind /dev/shm tmp");
14:11<kokoko1> $cmd = "compartment --user $myuser --group $myuser"
14:11<kokoko1> &dosystem("mount -o bind /dev/shm
14:12|-|tyler [] has joined #uml
14:12<jdike>can I see that again?
14:16<kokoko1>umlrun script
14:16<jdike>that's what you already showed me
14:16<kokoko1>when host reboot umlrun fired by /etc/init.d/uml
14:16<jdike>what runs that?
14:17<jdike>can I see /etc/init.d/uml
14:18<kokoko1>when uml crashes we do it by hand 'umlrun vm1'
14:18<kokoko1>or umlrun vmX (x for uml )
14:19<jdike>put 'ulimit -c unlimited' near the top of that script
14:20<jdike>like after the 'test -f /etc/uml.conf || exit 0'
14:20<jdike>and when you run it by hand, make sure you have done ulimit -c unlimited
14:31<kokoko1>anything else to do ?
14:32<jdike>yes, one sec
14:32<jdike>I have a patch for you to apply
14:37<kokoko1>what you thinks may i give a reboot to uml , by setting ulimit to unlimited now?
14:39<kokoko1>can't we add this ulimit to 'umlrun' ?
14:39<jdike>umlrun is a perl script
14:39<kokoko1>it will not affect if guest started using umlrun
14:40<jdike>if you system("ulimit...") that will affect the short-lived shell run by system, and nothing else
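The point about system("ulimit...") can be seen directly from a shell: a limit changed in a child shell disappears when that child exits, while a limit set in the launching shell is inherited by whatever it starts. A small sketch, assuming a POSIX shell; it lowers the soft limit to 0 only because that is always permitted, the same reasoning applies to raising it:

```shell
# 1. A child shell changes its own limit and exits: the parent is unaffected.
#    This is roughly what system("ulimit ...") from a Perl script amounts to.
before=$(ulimit -S -c)
sh -c 'ulimit -S -c 0'
after=$(ulimit -S -c)
echo "parent before=$before after=$after"

# 2. A limit set in the launching (sub)shell is inherited by its children.
inherited=$( ulimit -S -c 0; sh -c 'ulimit -S -c' )
echo "child started after ulimit sees: $inherited"
```

This is why the limit has to be set in the init script or in the interactive shell that runs umlrun, so that it is already in place in an ancestor of the UML process.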
14:40<kokoko1>hmm umm
14:41<jdike>that will tell you what the core dump limit is, so you know you got it right
14:42<jdike>BTW, wget -O - -q | patch -p1 should do the trick
14:42<kokoko1>you mean apply this patch against kernel source?
14:42<jdike>against UML source
14:43<kokoko1>sorry uml kernel source right?
14:43<jdike>the first thing UML will print is the core dump limits
14:43<jdike>you want to see
14:44<jdike>soft - NONE
14:44<jdike>hard - NONE
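The same two values can be read from the shell that is about to start UML, without booting anything. A sketch, assuming a POSIX shell; `show_core_limits` is a hypothetical name, and printing `unlimited` as `NONE` is only an imitation of the output jdike describes, not a statement about how the patch formats it:

```shell
# Hypothetical helper: print this shell's soft and hard core-size limits
# in a layout similar to the boot message described above.
show_core_limits() {
    echo "Core dump limits :"
    for kind in S H; do
        value=$(ulimit -"$kind" -c)
        if [ "$value" = "unlimited" ]; then
            value=NONE               # assumed to correspond to "no limit"
        fi
        if [ "$kind" = "S" ]; then
            label=soft
        else
            label=hard
        fi
        echo "	$label - $value"
    done
}

show_core_limits
```

Anything other than NONE on the soft line means a core file may be truncated or suppressed for processes started from that shell.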
14:49<kokoko1>linux-]$ wget -O - -q | patch -p1
14:49<kokoko1>(Stripping trailing CRs from patch.)
14:49<kokoko1>patching file arch/um/os-Linux/main.c
14:49<kokoko1>Hunk #1 succeeded at 54 (offset -1 lines).
14:49<kokoko1>patch unexpectedly ends in middle of line
14:49<kokoko1>Hunk #2 succeeded at 164 with fuzz 1 (offset -19 lines).
14:49<kokoko1>its okay?
14:53<kokoko1>we are using , you want me to get latest vanilla kernel ?
14:53<kokoko1>or its okay to go with this one
14:55<kokoko1>jdike, its okay if i start the compilation with 'make linux ARCH=um'
14:55<kokoko1>no need to do make mrproper ARCH=um ; make defconfig ARCH=um
14:56<jdike>yeah, that's OK
14:57<kokoko1>running 'make linux ARCH=um' atm
14:57<jdike>to make sure it's OK, just boot it and make sure you see good core dump info at the top
14:58<kokoko1>right, and before rebooting i also give ulimit -c unlimited on host shell right?
15:05|-|the_hydra [~mulyadi@] has quit [Quit: using sirc version 2.211+KSIRC/1.3.12]
15:25|-|Newsome [] has quit [Quit: Linux: Now with employee pricing!]
15:26|-|yakker [~yakker@aegis.CS.Princeton.EDU] has joined #uml
15:28<yakker>how does one attach gdb to a UML kernel thread?
15:31<jdike>gdb linux <pid>
15:32<jdike>what do you mean by a "kernel thread"?
15:33<yakker>eg. init in init/main.c
15:33<jdike>that's not a kernel thread
15:33<jdike>that becomes init
15:33<yakker>which is launched using kernel_thread
15:33<yakker>eventually, yes, but I'm facing a situation in which it crashes before that
15:34<jdike>just run UML under gdb then
15:34<jdike>with a breakpoint in init
15:45<kokoko1>jdike, just rebooted uml into patched kernel and before rebooting i did ulimit -c unlimited
15:46<kokoko1>jdike, here is the console output when uml starting...
15:46<kokoko1>I can't see anything related to core dump :(
15:47<jdike>where's the boot output?
15:48<kokoko1> <-- this is the boot output
15:49<jdike>no it's not
15:49<kokoko1>i have this when we fire uml using 'umlrun' script :(
15:50<jdike>where do you put the boot output?
15:52<jdike>boot output looks like
15:52<jdike>afk for a bit
15:52<kokoko1>I duno where umlrun put boot output :S
15:53<kokoko1>I thinks i should run it from console
15:53<kokoko1> ./linux mem=128M ubd0=root ubd1=var ubd2=swap
15:53<kokoko1>this way i will get the boot output
15:56|-|tyler [] has quit [Ping timeout: 480 seconds]
16:02<kokoko1>jdike, see this output
16:02<kokoko1>we have ..
16:02<kokoko1>Core dump limits :
16:02<kokoko1> soft - NONE
16:02<kokoko1> hard - NONE
16:13|-|tyler [] has joined #uml
16:21<kokoko1>anything else you want me to do :S
16:22<kokoko1>what this Core dump limits will do ?
16:23<jdike>now boot your mail server, make sure it says the same thing, and wait for it to die
16:23<kokoko1>yes i have already booted the mail server
16:23<jdike>and double-check that it can write into its pwd
16:23<kokoko1>okay tell me if it dies where to look for core dump ?
16:24<jdike>in its pwd
16:24<kokoko1>I'll let you know
16:24<jdike>that's why it needs to be able to write there
16:25<kokoko1>drwxr-xr-x 6 vm1 vm1 4096 Jan 5 22:05 vm1
16:25<kokoko1>vm1 is mail server pwd,
16:25<jdike>and the user is vm1?
16:25<kokoko1>it should be able to write into vm1 directory
16:25<jdike>now just wait
16:25<kokoko1>day or two :)
16:26<jdike>and you checked that it said it had no limits?
16:26<jdike>the mail server, not the test UML you just booted?
16:26<jdike>or was that it?
16:28<kokoko1>it was mail server boot output
16:28<kokoko1>i show you
16:28<kokoko1>its now off peak hours, i can reboot it anytime safely :)
16:28<kokoko1>its 3:28 am here in .pk :)
16:29<kokoko1>heh, hard earn money
16:30[~]kokoko1 light b&h
16:31<kokoko1>waiting when the uml dies
16:35|-|tchan [] has quit [Ping timeout: 480 seconds]
16:36<kokoko1>jdike, shoot me with your email address in case i have to mail you core dump during weekends
16:36<kokoko1>if possible for you to share email
16:40|-|tyler [] has quit [Ping timeout: 480 seconds]
16:43<kokoko1>when patching host kernel with skas do i have to enable ..
16:43<kokoko1> Make UML childs /proc/<pid> completely browsable (PROC_MM_DUMPABLE) [N/y/?] (NEW ?
16:49<kokoko1>I am doing linux- + skas3 on host
16:49<yakker>has anyone else faced problems with booting UML on FC5/2.6.18?
16:50<yakker>I'm seeing a freeze-up just after the line VFS: Mounted root
16:50<kokoko1>yakker, i'm booting lot of uml on fc5 but kernel is
16:50<yakker>I know - 2.6.17.x works for me too
16:51<kokoko1>yakker, you are running skas on host?
16:51<yakker>looks like that the actual freeze happens when init is launched
16:51<kokoko1>hmm i was just wondering how good skas3 patch for
16:51<kokoko1>i'm compiling it right now
16:51<yakker>kokoko1, i use it, works fine
16:52<kokoko1>that's good
16:52<kokoko1>yakker, why not for your uml ?
16:52<kokoko1>why you are trying 2.6.18 any good reason
16:57|-|tyler [] has joined #uml
17:14|-|tchan [] has joined #uml
17:16|-|tyler [] has quit [Ping timeout: 480 seconds]
17:26|-|level2 [~none@] has joined #uml
17:33|-|tyler [] has joined #uml
17:44|-|kos_tom [] has quit [Quit: I like core dumps]
17:51|-|kos_tom [] has joined #uml
17:51|-|kos_tom [] has quit []
17:55<yakker>jdike, are you around?
18:00|-|tyler [] has quit [Ping timeout: 480 seconds]
18:20|-|level2 [~none@] has quit [Quit: Leaving]
18:38|-|yakker [~yakker@aegis.CS.Princeton.EDU] has quit [Quit: Leaving]
19:36|-|yakker [~sapan@] has joined #uml
19:36<yakker>hi all
19:47[~]yakker tries to chase down a UML hang on Fedora
19:49|-|ram [] has quit [Ping timeout: 480 seconds]
19:56|-|silug [~steve@] has quit [Ping timeout: 480 seconds]
20:53|-|yakker [~sapan@] has quit [Quit: yakker]
21:50|-|tchan [] has quit [Quit: WeeChat 0.2.2-cvs]
21:51|-|tchan [] has joined #uml
21:59|-|silug [~steve@] has joined #uml
22:18|-|Newsome [] has joined #uml
22:34|-|Nem^1 [] has joined #uml
22:34|-|MrX [~chaos@] has quit [Quit: X]
22:41|-|Nem^ [] has quit [Ping timeout: 480 seconds]
22:41|-|Nem^1 changed nick to Nem^
22:59|-|VS_ChanLog [] has left #uml [Rotating Logs]
22:59|-|VS_ChanLog [] has joined #uml
23:46|-|jdike [~jdike@] has quit [Quit: Leaving]
---Logclosed Sat Jan 06 00:00:58 2007