#uml IRC Logs for 2007-11-27

---Logopened Tue Nov 27 00:00:43 2007
00:22|-|balbir [~balbir@122.167.180.57] has joined #uml
02:38|-|ram [~ram@pool-72-90-125-50.ptldor.fios.verizon.net] has quit [Quit: Leaving]
03:04|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has quit [Remote host closed the connection]
03:07|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has joined #uml
03:10|-|mgross [~mgross@pool-71-117-236-31.ptldor.fios.verizon.net] has joined #uml
03:46|-|mgross [~mgross@pool-71-117-236-31.ptldor.fios.verizon.net] has quit [Quit: Leaving]
04:48|-|karol [~karol@89.66.106.109] has joined #uml
04:48|-|karol changed nick to Magotari
05:10|-|mjf [~mjf@r5bb59.net.upc.cz] has joined #uml
05:13|-|mjf [~mjf@r5bb59.net.upc.cz] has quit []
06:14|-|newbie [~jeff@ita4fw1.itasoftware.com] has quit [Server closed connection]
06:15|-|newbie [~jeff@ita4fw1.itasoftware.com] has joined #uml
07:06|-|rjbell4 [bbell@c-75-67-251-249.hsd1.nh.comcast.net] has quit [Server closed connection]
07:06|-|rjbell4 [bbell@c-75-67-251-249.hsd1.nh.comcast.net] has joined #uml
07:42|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has quit [Remote host closed the connection]
07:46|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has joined #uml
07:52|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has quit [Remote host closed the connection]
08:07|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has joined #uml
08:26|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has quit [Remote host closed the connection]
08:32|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has joined #uml
09:08|-|dang [~dang@aa-redwall.nexthop.com] has joined #uml
09:09|-|ds2 [noinf@netblock-66-245-251-24.dslextreme.com] has quit [Server closed connection]
09:09|-|ds2 [noinf@netblock-66-245-251-24.dslextreme.com] has joined #uml
09:54|-|jjkola [~jjkola@dsl-olubrasgw1-fec8de00-184.dhcp.inet.fi] has joined #uml
09:55<jjkola-#uml->>hi
10:13|-|hfb [~hfb@pool-71-106-219-180.lsanca.dsl-w.verizon.net] has joined #uml
10:26|-|jdike [~jdike@pool-71-248-190-161.bstnma.fios.verizon.net] has joined #uml
10:26<jdike-#uml->>Hi guys
10:27<caker-#uml->>good morning
10:35<dang-#uml->>Morning, jdike
10:44<jjkola-#uml->>jdike: I got a different backtrace this time
10:44<jdike-#uml->>OK, let's see it
10:45<jjkola-#uml->>http://rafb.net/p/xlFdmH84.html
10:45<jjkola-#uml->>this time there wasn't any mention of a corrupt stack
10:46<jdike-#uml->>that's gdb being confused anyway
10:46<jdike-#uml->>that's the idle thread
10:46<jdike-#uml->>completely normal
10:46<jdike-#uml->>that's from a hung UML?
10:46<jjkola-#uml->>yes
10:47<jdike-#uml->>can you get a sysrq t from it?
10:47<jjkola-#uml->>through uml console?
10:47<jdike-#uml->>yup
10:48<jjkola-#uml->>OK [43023060.710000] SysRq : Show State
10:48<jjkola-#uml->>and empty line after that
10:49<jdike-#uml->>hmm
10:49<jdike-#uml->>can you strace it?
10:50<jjkola-#uml->>oops, it wasn't hung
10:51<jjkola-#uml->>I assumed that it was hung as it was taking quite a lot of processing time : /
10:51<jjkola-#uml->>sorry about that
10:52<jdike-#uml->>Ha
10:53<jjkola-#uml->>here is the backtrace from the core dump: http://rafb.net/p/PAVlTv82.html
10:53<jdike-#uml->>when can you say whether it's behaving better than before?
10:53<jdike-#uml->>when is that dump from?
10:54<jjkola-#uml->>this morning
10:54<jjkola-#uml->>about six hours ago
10:54<jdike-#uml->>with my last patch?
10:54<jjkola-#uml->>so I'm not showing old one to you : )
10:54<jjkola-#uml->>yes
10:55<jjkola-#uml->>and with stack debugging enabled (+ basic mutex debugging)
10:55<jdike-#uml->>can you send the core and binary to me?
10:55<jdike-#uml->>and the console output if you have it
10:57<jjkola-#uml->>unfortunately I don't have the console output, but I can try to dig logs if needed
10:59<jdike-#uml->>there are messages of the form "%s used greatest stack depth: %lu bytes left\n"
11:09<jjkola-#uml->>I don't seem to be able to find that message from any of the crashed umls
11:10<jjkola-#uml->>or even from that uml which didn't crash/hang
11:14<jdike-#uml->>there should be some
11:14<jdike-#uml->>it'll report the greatest stack usage seen to date
11:15<jdike-#uml->>so there will be at least a few during boot
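[Editor's note] The reporting behaviour jdike describes here (a message only when a task sets a new record for deepest stack use) can be sketched as below. This is an illustration of the idea, not the kernel's code: the class, the task names and the 8192-byte stack size are assumptions made for the example. Only the message format and the name lowest_to_date come from the conversation.

```python
class StackDepthTracker:
    """Remembers the smallest amount of free stack seen so far."""

    def __init__(self, stack_size):
        # Start at the full stack size so the first smaller report sets a record.
        self.lowest_to_date = stack_size

    def check(self, task_name, free_bytes):
        """Return a log line if this task set a new low-water mark, else None."""
        if free_bytes >= self.lowest_to_date:
            return None
        self.lowest_to_date = free_bytes
        return "%s used greatest stack depth: %d bytes left" % (task_name, free_bytes)


# Hypothetical exiting tasks and their free stack; the values are made up.
tracker = StackDepthTracker(stack_size=8192)
messages = [
    tracker.check(name, free)
    for name, free in [("init", 6000), ("khelper", 6500), ("modprobe", 5200)]
]
for line in messages:
    if line is not None:
        print(line)
```

Because every early task is likely to beat the initial value, a scheme like this would normally produce a few lines during boot, which is why their complete absence points at the configuration rather than at the workload.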
11:15<jjkola-#uml->>if there are any, they aren't in any of the log files
11:15<jdike-#uml->>can you double-check the config
11:17<jjkola-#uml->>order 1, instrumentation y in .config
11:19<jdike-#uml->>can you check your kernel/exit.c and see if there's a check_stack_usage in it?
11:21<jjkola-#uml->>yes, it's called one time in do_exit
11:23<jdike-#uml->>can you gdb a UML and see what the value of lowest_to_date is?
11:26<jjkola-#uml->>can you tell me the command to do that? I tried 'p lowest_to_date' after attaching to process
11:27<jdike-#uml->>that's right
11:32<jjkola-#uml->>I tried it in every frame and it always complained that it couldn't find lowest_to_date in current context
11:33<jdike-#uml->>hmm
11:33<jjkola-#uml->>sorry, I'm not used to using gdb : /
11:33<jdike-#uml->>just try 'p check_stack_usage'
11:34<jjkola-#uml->>same happened
11:36<jdike-#uml->>then you don't have CONFIG_DEBUG_STACK_USAGE enabled
11:38<jjkola-#uml->>hmm, but I have it enabled in .config file and I did change the value before compiling :/
11:38<jdike-#uml->>does ./linux --showconfig | grep DEBUG_STACK_USAGE agree?
11:39<jjkola-#uml->>interesting, it says that it's not enabled!?
11:40<jjkola-#uml->>I don't understand
11:41<jjkola-#uml->>I think I did copy the file to the right place after compiling :/
11:50<jjkola-#uml->>I'm really sorry that I have wasted your precious time because of my incompetence
11:53<jjkola-#uml->>as I will be moving to another apartment I will not be able to help in this matter before thursday evening and it's quite probable that it will be friday before I get things in working order
11:54<jdike-#uml->>OK
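[Editor's note] The mismatch found above (the option set in .config but reported as unset by ./linux --showconfig) is the kind of thing a small comparison script can catch before a long test run. The sketch below is one possible way to do it, assuming both configurations are available as text in the usual kernel config format. The function names and the sample text are hypothetical.

```python
import re


def option_state(config_text, option):
    """Return the value of CONFIG_<option>, or None if it is absent or "is not set"."""
    pattern = r"^CONFIG_%s=(.*)$" % re.escape(option)
    match = re.search(pattern, config_text, re.MULTILINE)
    return match.group(1) if match else None


def configs_agree(build_config, binary_config, option):
    """True when the build tree and the running binary report the same value."""
    return option_state(build_config, option) == option_state(binary_config, option)


# Made-up sample text standing in for .config and for the --showconfig output.
dot_config = "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_STACK_USAGE=y\n"
showconfig = "CONFIG_DEBUG_KERNEL=y\n# CONFIG_DEBUG_STACK_USAGE is not set\n"

if not configs_agree(dot_config, showconfig, "DEBUG_STACK_USAGE"):
    print("binary was not built from this .config")
```

In practice the second string would be the captured output of ./linux --showconfig for the binary that is actually being booted.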
12:07|-|jjkola [~jjkola@dsl-olubrasgw1-fec8de00-184.dhcp.inet.fi] has quit [Quit: *pop*]
12:36|-|ram [~ram@pool-72-90-125-50.ptldor.fios.verizon.net] has joined #uml
12:41|-|tyler29 [~tyler@ARennes-257-1-19-164.w81-53.abo.wanadoo.fr] has joined #uml
12:49|-|dang [~dang@aa-redwall.nexthop.com] has quit [Quit: Leaving.]
12:49|-|dang [~dang@aa-redwall.nexthop.com] has joined #uml
12:52|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has quit [Remote host closed the connection]
12:57|-|Baltam [~WIKIMOKI@tor-irc.dnsbl.oftc.net] has joined #uml
13:56[~]jdike-#uml-> whines to LKML about pts device brokenness
13:56|-|kos_tom [~thomas@col31-3-82-247-183-72.fbx.proxad.net] has joined #uml
14:28|-|tyler29 [~tyler@ARennes-257-1-19-164.w81-53.abo.wanadoo.fr] has quit [Ping timeout: 480 seconds]
14:30<Magotari-#uml->>jdike: Something I would like to discuss. I have been thinking about alternatives to a random umid. My idea was to make it a hash function of the commandline used to start uml. This should be pretty unique, with adding -1 -2 -3... to the directory name should it already exist. I have about 30 dead umids in my .uml now, and I hate it. I am positive I can do everything needed to make it work, but I wanted to know your thoughts.
14:30<Magotari-#uml->>Of course we would need something to take over a dead umid, but I think I saw something like that in the code.
14:31<jdike-#uml->>but that won't happen with a random umid
14:31<Magotari-#uml->>Not with a random, but with a hash based one...
14:31<jdike-#uml->>why not get into the habit of putting a umid on the command line?
14:31<Magotari-#uml->>It is an idea, yes. But that way the user has it a bit easier.
14:32<Magotari-#uml->>I guess you have a point there though.
14:32<Magotari-#uml->>I think I understand.
14:32<jdike-#uml->>the hash will screw you when you run two UMLs with the same command line, BTW
14:32<Magotari-#uml->>I am aware. Thus if a directory is used but not dead we add a -1 to it. Or -2.
14:32<jdike-#uml->>Hah
14:33<Magotari-#uml->>Or make a hash out of the hash, which again makes it all change in a predictable fashion.
14:33<jdike-#uml->>but then the hashed hashes will pollute your .uml
14:33<Magotari-#uml->>Yes, but you rarely run the same program twice with the same command line.
14:33<jdike-#uml->>as will the UMLs which die right before you change your customary command line
14:34<Magotari-#uml->>And that is also true.
14:34<Magotari-#uml->>However, it will be a reduction.
14:34<jdike-#uml->>yeah, that will reduce the pollution, but I'm not convinced it's worth it
14:34<Magotari-#uml->>Yes, I agree with you. It is a good point and you are right.
14:34<dang-#uml->>Magotari: I run tons of UMLs with the same command line...
14:35<Magotari-#uml->>dang: How do you handle disk access then?
14:35<dang-#uml->>Of course, the command line differs by UML id, which I specify. :)
14:35<dang-#uml->>Different directories.
14:35<Magotari-#uml->>Great. A hash could then include a directory you start uml from.
14:35<jdike-#uml->>ubda=cow,/foo/bar/baz
14:35<Magotari-#uml->>But never mind. I am convinced it is not a worthy project.
14:36<dang-#uml->>jdike: Yeah, but that's a different command line... I actually run from a different working dir.
14:36<jdike-#uml->>dang, no, it's not
14:36<Magotari-#uml->>Including both the cmdline and dir would solve most of the collisions. Unless you use disks read-only or don't mind data corruption.
14:36<jdike-#uml->>you specify a local COW file and an absolute backing file name, right?
14:37<jdike-#uml->>or just the local COW file, I guess
14:37<dang-#uml->>Oh, right.
14:37<dang-#uml->>Misread...
14:37<jdike-#uml->>including the local directory in the hash would actually work
14:38<Magotari-#uml->>That was my original design, actually.
14:38<jdike-#uml->>unless you're booting a cluster
14:38<Magotari-#uml->>I just kinda chickened out with saying it.
14:38<jdike-#uml->>then you can have identical command lines running from the same directory
14:38<dang-#uml->>And you have nfs mount homedir, or something?
14:39<jdike-#uml->>that, too
14:39<dang-#uml->>Because otherwise, the .uml dir would be on different boxes. ;)
14:40<jdike-#uml->>Oh
14:40<jdike-#uml->>I meant a UML cluster
14:40<jdike-#uml->>where the nodes use the same disk rw with ocfs2 or something
14:41<dang-#uml->>Ah. That could do it, I guess.
14:41<jdike-#uml->>I thought you were talking about UMLs on different hosts, same command line, same directory on each host
14:42<jdike-#uml->>but the umids collide because you have the same $HOME, nfs-mounted
14:42<jdike-#uml->>on all of them
14:44<dang-#uml->>That's what I was talking about, yes.
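[Editor's note] Magotari's proposal (a umid derived from a hash of the command line plus the working directory, with -1, -2, ... appended when the directory already exists) can be sketched as follows. This is only an illustration of the scheme as discussed, not UML code. The function names, the choice of SHA-1, the 12-character truncation and the example paths are assumptions. Taking over a dead umid is not handled.

```python
import hashlib
import os
import tempfile


def hashed_umid(argv, cwd):
    """Derive a short, repeatable id from the working directory and command line."""
    digest = hashlib.sha1()
    digest.update(cwd.encode())
    for arg in argv:
        digest.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
        digest.update(arg.encode())
    return digest.hexdigest()[:12]


def claim_umid_dir(uml_dir, argv, cwd):
    """Create and return the first free directory: <hash>, <hash>-1, <hash>-2, ..."""
    base = hashed_umid(argv, cwd)
    suffix = 0
    while True:
        name = base if suffix == 0 else "%s-%d" % (base, suffix)
        path = os.path.join(uml_dir, name)
        try:
            os.mkdir(path)  # raises FileExistsError if another instance owns the name
            return path
        except FileExistsError:
            suffix += 1


# A temporary directory stands in for ~/.uml; the working directories are made up.
with tempfile.TemporaryDirectory() as uml_dir:
    argv = ["./linux", "ubda=cow,/foo/bar/baz"]
    first = claim_umid_dir(uml_dir, argv, "/home/user/a")
    second = claim_umid_dir(uml_dir, argv, "/home/user/a")
    other = claim_umid_dir(uml_dir, argv, "/home/user/b")
```

The sketch shows both objections raised in the channel: two instances with the same command line and directory still need a suffix, and with an NFS-mounted $HOME the same hash would be computed on every host.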
14:44<Magotari-#uml->>I recounted the umids. I have 70. Wow. That's a lot of crashing. rm -fr time.
14:45|-|tyler29 [~tyler@ARennes-257-1-101-166.w81-48.abo.wanadoo.fr] has joined #uml
14:45<jdike-#uml->>what are you doing to the poor things?
14:46<Magotari-#uml->>Crashme. Hacking attempts. Trying to fix things. Memory stuff. Bugfinding.
14:47<jdike-#uml->>hehe
14:47<Magotari-#uml->>All the things that I have been doing for the last two weeks hoping to be useful and failing.
14:47<Magotari-#uml->>But hey, gotta keep on trying.
14:59[~]jdike-#uml-> sends in the !NOHZ busy-loop fix
15:06|-|tyler29 [~tyler@ARennes-257-1-101-166.w81-48.abo.wanadoo.fr] has quit [Ping timeout: 480 seconds]
15:10|-|Magotari [~karol@89.66.106.109] has quit [Quit: Reconnecting]
15:10|-|Magotari [~karol@89.66.106.109] has joined #uml
15:12|-|dang [~dang@aa-redwall.nexthop.com] has quit [Quit: Leaving.]
15:23|-|tyler29 [~tyler@ARennes-257-1-108-202.w86-210.abo.wanadoo.fr] has joined #uml
17:56|-|anderiv [~anderiv@66-162-60-4.static.twtelecom.net] has quit [Server closed connection]
17:56|-|anderiv [~anderiv@66-162-60-4.static.twtelecom.net] has joined #uml
18:02|-|dang [~dang@nemesis.fprintf.net] has joined #uml
18:04|-|tyler29 [~tyler@ARennes-257-1-108-202.w86-210.abo.wanadoo.fr] has quit [Ping timeout: 480 seconds]
18:17<Magotari-#uml->>jdike: I think I just fixed the mconsole stop bug.
18:17<Magotari-#uml->>It did stop when I told it to.
18:18<Magotari-#uml->>It even resumed fine.
18:19<Magotari-#uml->>This will need a few more hours of work, to make it look good, but I can send it in tomorrow.
18:22<jdike-#uml->>cool
18:31|-|kos_tom [~thomas@col31-3-82-247-183-72.fbx.proxad.net] has quit [Quit: I like core dumps]
18:40<Magotari-#uml->>Ouch. The whole bit of code which did the stopping seems kinda wrong now.
18:41<Magotari-#uml->>I changed it a bit, but that exposed some more things, as far as my inexperienced eye sees.
18:41<Magotari-#uml->>The loop was never entered, the function inside while() returned 0.
18:41<Magotari-#uml->>I think, anyway.
18:42<Magotari-#uml->>When I changed it, I discovered that even had it not returned 0, it would be all no good anyway, as it is non-blocking.
18:42<Magotari-#uml->>My change made things stop, yeah. With a busy loop.
18:43<Magotari-#uml->>No damn good. Needs more work than a two-liner.
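[Editor's note] The problem Magotari describes is a general one: looping on a non-blocking read either falls straight through or spins the CPU. A common alternative is to sleep until the descriptor is readable. The sketch below shows that pattern in user space with a socket pair. It is not the mconsole code, and the function name, buffer size and payload are placeholders.

```python
import select
import socket


def wait_for_message(sock, timeout=None):
    """Sleep until sock is readable, then return the data; None if the timeout expires."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)


# A connected pair of sockets stands in for the control channel.
sender, receiver = socket.socketpair()
receiver.setblocking(False)  # the socket itself stays non-blocking

nothing_yet = wait_for_message(receiver, timeout=0.05)  # no data: returns None
sender.send(b"go")
reply = wait_for_message(receiver)  # wakes up once data is available

sender.close()
receiver.close()
```

With timeout=None the caller consumes no CPU while waiting, which is the property a "stop until told to continue" loop needs.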
18:53<Magotari-#uml->>Oh dear.
18:54<Magotari-#uml->>There is something mighty wrong with mconsole, and I don't think I caused it.
18:55<Magotari-#uml->>Bah, tomorrow is another day...
18:55<Magotari-#uml->>Good night.
18:57|-|jdike [~jdike@pool-71-248-190-161.bstnma.fios.verizon.net] has quit [Quit: Leaving]
19:07|-|hfb [~hfb@pool-71-106-219-180.lsanca.dsl-w.verizon.net] has quit [Quit: Leaving]
22:30|-|besonen_mobile__ [~besonen_m@71-220-198-145.eugn.qwest.net] has joined #uml
22:36|-|besonen_mobile_ [~besonen_m@71-220-228-70.eugn.qwest.net] has quit [Read error: Operation timed out]
23:59|-|VS_ChanLog [~stats@ns.theshore.net] has left #uml [Rotating Logs]
23:59|-|VS_ChanLog [~stats@ns.theshore.net] has joined #uml
---Logclosed Wed Nov 28 00:00:05 2007