#uml IRC Logs for 2007-08-09

---Logopened Thu Aug 09 00:00:59 2007
00:29|-|IRCFrEAK [~jircii@c-71-230-160-185.hsd1.pa.comcast.net] has quit [Quit: IRCFrEAK has no reason]
00:45|-|silug [~steve@ppp-70-225-91-130.dsl.covlil.ameritech.net] has quit [Ping timeout: 480 seconds]
02:37|-|silug [~steve@ppp-70-225-35-206.dsl.covlil.ameritech.net] has joined #uml
02:45|-|horst [~horst@a89-182-144-130.net-htp.de] has joined #uml
03:12|-|horst [~horst@a89-182-144-130.net-htp.de] has quit [Remote host closed the connection]
05:12|-|tyler [~tyler@adsl196-65-226-206-196.adsl196-8.iam.net.ma] has joined #uml
05:44|-|silug [~steve@ppp-70-225-35-206.dsl.covlil.ameritech.net] has quit [Ping timeout: 480 seconds]
05:44|-|silug [~steve@ppp-70-225-35-206.dsl.covlil.ameritech.net] has joined #uml
06:37|-|balbir [~balbir@122.167.72.90] has quit [Read error: Connection reset by peer]
07:12|-|balbir [~balbir@122.167.74.45] has joined #uml
09:15|-|dang [~dang@aa-redwall.nexthop.com] has joined #uml
09:28|-|kokoko1 [~Slacker@203.148.65.8] has quit [Ping timeout: 480 seconds]
09:35|-|jdike [~jdike@pool-72-70-38-233.bstnma.fios.verizon.net] has joined #uml
09:35<jdike>Hi guys
09:41|-|jjkola [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
09:42<jdike>jjkola, what were you saying about 64K vs 16K limits last night?
09:43<jdike>there's a 16K limit on COW I/O, where does 64K come from?
09:52|-|ram [~ram@bi01p1.co.us.ibm.com] has joined #uml
09:55|-|kokoko1 [~Slacker@203.148.65.8] has joined #uml
09:56<jdike>hey slacker
10:04<kokoko1>Hi Jeff.
10:04<kokoko1>:)
10:05<kokoko1>jdike, wondering how much performance we will get if we mount a partition inside UML with noatime, nodiratime
10:05<kokoko1>I am sure we can do all these linux things inside uml
10:10<jdike>sure
10:10<jdike>and I imagine the benefits will be similar
10:13<kokoko1>I am thinking of implementing noatime and nodiratime on all UMLs
10:16|-|hfb [~hfb@pool-72-87-254-188.lsanca.dsl-w.verizon.net] has joined #uml
10:17<jjkola>jdike: lines 714-716 in arch/um/drivers/ubd_kern.c
10:17<jjkola>MAX_SG is 64k
10:18<jdike>No
10:18<jjkola>no?
10:18<jdike>it's 64, and that's entirely separate from the number of sectors involved
10:19<jjkola>ah, ok
10:19<jdike>you can have 1 sg segment with 64K of I/O in it
10:19<jjkola>I thought they changed the same thing
10:19<jdike>or you can have 64 sg segments each with 1K of I/O
10:19<jjkola>I see
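The distinction jdike draws here, a cap on the number of scatter-gather segments versus the amount of I/O a request carries, can be sketched in plain userspace C. This is only an illustration: `struct sg_segment`, `request_bytes` and `uniform_request_bytes` are hypothetical names, not the kernel's scatterlist API; only the value 64 for MAX_SG comes from the log.

```c
#include <assert.h>
#include <stddef.h>

/* Limit on scatter-gather segments per request; the log gives MAX_SG as 64.
 * It counts segments, not bytes or sectors. */
#define MAX_SG 64

/* Hypothetical stand-in for one scatter-gather segment. */
struct sg_segment {
    size_t length; /* bytes of I/O carried by this segment */
};

/* Total bytes carried by a request made of `count` segments. */
size_t request_bytes(const struct sg_segment *segs, size_t count)
{
    size_t total = 0;

    assert(count <= MAX_SG); /* the only thing MAX_SG constrains */
    for (size_t i = 0; i < count; i++)
        total += segs[i].length;
    return total;
}

/* Build a request of `count` equal-sized segments and return its size. */
size_t uniform_request_bytes(size_t count, size_t seg_len)
{
    struct sg_segment segs[MAX_SG];

    assert(count <= MAX_SG);
    for (size_t i = 0; i < count; i++)
        segs[i].length = seg_len;
    return request_bytes(segs, count);
}
```

Calling `uniform_request_bytes(1, 64 * 1024)` and `uniform_request_bytes(64, 1024)` both describe 64K of I/O while staying within MAX_SG, which is why the segment limit alone says nothing about request length.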
10:21<jjkola>but still, that line 1038 restricts block size to 16384
10:22<jjkola>at least in my computer
10:22<jjkola>or should I say request length
10:22<jdike>right
10:23<jdike>which matches this
10:23<jdike>blk_queue_max_sectors(ubd_dev->queue, 8 * sizeof(long));
10:23<jdike>right?
10:23<jjkola>yes
10:23<jjkola>but I had a request whose length was 64k
10:24<jdike>even after the patch?
10:24<jjkola>so it seems like not all code paths are handled
10:24<jjkola>yes
10:24<jdike>can you make it happen again?
10:24<jjkola>the patch was already applied to rc2
10:24<jjkola>ok, one moment
10:30<jjkola>request length 32k
10:30<jdike>is this 32-bit or 64-bit
10:31<jjkola>32 bit
10:31<jjkola>if I remember right
10:31<jdike>uhh, you should know what architecture you're using
10:32<jdike>but the thing actually died?
10:32<jjkola>segfault
10:33<jjkola>processor Athlon XP +2500
10:33<jdike>but before that, panic("request too long"), right?
10:33<jjkola>*2500+
10:33<jjkola>panic("Operation too long")
10:34<jdike>can you disassemble ubd_add and paste it somewhere?
10:34<jjkola>ok
10:37<jjkola>how can I disassemble ubd_add when it's not in the backtrace?
10:38<jdike>disas ubd_add
10:38<jdike>in gdb
10:38<jjkola>ah, ok
10:40<jjkola>here it is: http://rafb.net/p/cNG1vh78.html
10:42<jdike>in gdb, print sizeof(long)
10:42<jjkola>4
10:43<jdike>this UML was built on this host?
10:43<jjkola>yes
10:43<jdike>well
10:43<jdike>0x08065e43 <ubd_add+147>: movl $0x40,0x4(%esp)
10:43<jdike>0x08065e4b <ubd_add+155>: mov 0x170(%ebx),%eax
10:43<jdike>0x08065e51 <ubd_add+161>: mov %eax,(%esp)
10:43<jdike>0x08065e54 <ubd_add+164>: call 0x8186c40 <blk_queue_max_hw_segments>
10:44<jdike>that 0x40 is the limit, and that's 64
10:44<jdike>the code says 8 * sizeof(long)
10:44<jdike>and that's 32
10:45<jdike>so, we have a conundrum
10:45<jjkola>yes
10:46<jdike>can you disassemble cowify_req?
10:46<jjkola>ok
10:47<jjkola>http://rafb.net/p/HAt1nT97.html
10:50<jdike>0x080664d9 <cowify_req+41>: cmp $0x4000,%edx
10:50<jdike>that's 16K
10:50<jdike>or 32 sectors
10:51<jjkola>yes
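The numbers in this exchange (8 * sizeof(long), 32 sectors, 16K, 0x4000) fit together if the request carries a one-bit-per-sector mask the size of a long and sectors are 512 bytes. Both points are assumptions here, suggested by the sizeof(sector_mask) query that follows rather than stated outright. A small sketch of the arithmetic, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512u /* bytes per sector; assumed, not stated in the log */

/* Sectors that a one-bit-per-sector mask of `mask_bytes` bytes can track.
 * With the mask held in a long this is 8 * sizeof(long). */
size_t cow_max_sectors(size_t mask_bytes)
{
    return 8 * mask_bytes;
}

/* Largest request, in bytes, that such a mask can describe. */
size_t cow_max_request_bytes(size_t mask_bytes)
{
    return cow_max_sectors(mask_bytes) * SECTOR_SIZE;
}

/* The values quoted in the log for a 32-bit build (sizeof(long) == 4). */
void check_32bit_numbers(void)
{
    assert(cow_max_sectors(4) == 32);           /* movl $0x20 */
    assert(cow_max_request_bytes(4) == 0x4000); /* cmp $0x4000, i.e. 16K */
}
```

The mask size is passed as a parameter rather than taken from `sizeof(long)` so that the 32-bit figures can be reproduced on any host; with an 8-byte long the same formula gives 64 sectors.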
10:52<jdike>p sizeof(((struct io_thread_req *) 0)->sector_mask)
10:52<jjkola>4
10:53<jdike>I'd say your compiler is screwing up
10:54<jjkola>ok
10:55<jjkola>gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
10:55<jdike>is it at all experimental or bleeding-edge?
10:55<jdike>same as what I have here
10:55<jjkola>ubuntu 7.04 server version
11:00<jjkola>[18:43:43] <jdike> 0x08065e43 <ubd_add+147>: movl $0x40,0x4(%esp) -> that is actually referring to line 714, where MAX_SG is used
11:00<jjkola>and MAX_SG is 64
11:00<jdike>MAX_SG has nothing to do with any of this
11:01<jdike>um, hold on
11:01<jdike>are you sure?
11:01<jdike>crap, you're right
11:01<jdike>hold on
11:02<jdike>I misread the assembly
11:02<jdike>0x08065e60 <ubd_add+176>: movl $0x20,0x4(%esp)
11:02<jdike>0x08065e68 <ubd_add+184>: mov 0x170(%ebx),%eax
11:02<jdike>0x08065e6e <ubd_add+190>: mov %eax,(%esp)
11:02<jdike>0x08065e71 <ubd_add+193>: call 0x8186b80 <blk_queue_max_sectors>
11:02<jdike>so, there's 0x20, which is right
11:02<jjkola>yes
11:02<jdike>can you get a core dump from it?
11:10<jjkola>core: http://www.savefile.com/files/957057
11:11<jdike>actually, I'd like you to gdb it for me
11:11<jjkola>ok
11:11<jdike>actually
11:11<jdike>savefile.com is a PITA
11:12<jjkola>should I then send it by dcc?
11:12<jdike>no, let's just gdb the thing
11:12<jjkola>ok
11:12<jdike>go up the stack until you're at cowify_req
11:13<jjkola>ok, I'm there
11:13<jdike>OK
11:14<jdike>now, go up one more time :-)
11:14<jdike>p *req and paste that
11:15<jjkola>here or pastebin?
11:16<jjkola>http://rafb.net/p/JaD6Co67.html
11:17<jdike>whoops
11:17<jdike>the limits aren't there
11:18<jjkola>they should be?
11:18<jdike>just me being stupid
11:18<jdike>p *req.q and paste that
11:20<jjkola>http://rafb.net/p/zadRke43.html
11:21<jdike>max_hw_segments = 64
11:21<jdike>oops
11:21<jdike>I keep getting them confused
11:22<jdike>max_sectors = 255, max_hw_sectors = 255
11:23<jjkola>max_segment_size = 65536?
11:23<jdike>dunno about that
11:23<jjkola>wouldn't that affect request length?
11:24<jjkola>or do I misunderstand it?
11:26<jdike>it should, but whatever limit is most constraining should win
11:26<jjkola>ok
11:26<jdike>and here, max_hw_sectors isn't what we want it to be
11:26<jjkola>I see
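The "most constraining limit wins" remark can be illustrated with the values printed from the queue: max_sectors = 255, max_hw_segments = 64, max_segment_size = 65536. The function below is a simplified userspace model with hypothetical names and an assumed 512-byte sector; it is not how the block layer actually merges requests.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512u /* bytes per sector; assumed */

/* Smaller of two sizes. */
size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Largest request, in bytes, that satisfies both the sector limit and the
 * segment limits at once. Simplified model only. */
size_t effective_max_bytes(size_t max_sectors,
                           size_t max_hw_segments,
                           size_t max_segment_size)
{
    size_t by_sectors = max_sectors * SECTOR_SIZE;
    size_t by_segments = max_hw_segments * max_segment_size;

    return min_size(by_sectors, by_segments);
}

/* Values from the log: the queue as dumped, and the queue as intended. */
void check_log_values(void)
{
    assert(effective_max_bytes(255, 64, 65536) == 130560); /* limit not applied */
    assert(effective_max_bytes(32, 64, 65536) == 16384);   /* 16K, as intended */
}
```

Under this model a queue left at 255 sectors admits requests up to 130560 bytes, so the 32K and 64K requests seen earlier would pass, while a 32-sector limit would cap them at 16K regardless of the 64K segment size.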
11:27<jdike>You OK with stuffing some printks in there?
11:27<jjkola>no problem
11:28<jjkola>just say what and where and I'll add them
11:29<jdike>OK, after the blk_queue_max_sectors call, print out ubd_dev->cow.file (with %s) and ubd_dev->queue->max_hw_sectors (%d)
11:29<jdike>and ubd_dev->queue (%p)
11:29<jdike>then in cowify_req, before the panic
11:29<jdike>not there
11:30<jdike>in prepare_request, before the cowify_req call
11:30<jdike>print req (%p), req->max_hw_sectors (%d again)
11:35<jjkola>building now
11:36<jjkola>error: 'struct request' has no member named 'max_hw_sectors'
11:37<jjkola>that was in prepare_request function
11:41<jjkola>printk("%p", req);
11:41<jjkola> printk("%d", req->max_hw_sectors);
11:42<jdike>sorry
11:42<jdike>req->q->max_hw_sectors
11:43<jdike>put some newlines in there while you're at it
11:43<jjkola>ok
11:43<jdike>otherwise everything just gets mashed together
11:54<jjkola>http://rafb.net/p/yb1WPa25.html
11:57<jdike>the ubd_add printk never happened?
11:58<jjkola>[42949373.570000] cow.file: <NULL>
11:58<jjkola>[42949373.570000] max_hw_sectors: 255
11:58<jjkola>[42949373.570000] ubda:&req: 084912e0 max_hw_sectors: 255
11:58<jjkola>[42949373.570000] unknown partition table
11:58<jjkola>[42949373.570000] cow.file: <NULL>
11:58<jjkola>[42949373.570000] max_hw_sectors: 255
11:58<jjkola>[42949373.570000] ubdb:&req: 084915c0 max_hw_sectors: 255
11:58<jjkola>[42949373.570000] unknown partition table
11:59<jdike>OK, those weren't from prepare_request?
11:59<jjkola>no
12:00<jdike>well
12:00<jdike>ubda:&req: 084912e0 max_hw_sectors: 255
12:00<jjkola>that is
12:00<jdike>right after the blk_queue_max_sectors call
12:00<jjkola>but the two lines before that are from ubd_add
12:01<jdike>OK, but there's no COW file, so that's OK
12:01<jjkola>well, I'm using cow files ...
12:01<jjkola>only swapfs isn't cow file
12:02<jdike>cow.file: <NULL>
12:02<jdike>that's ubda
12:02<jjkola>ubd0=/home/virtual/uml/www.cow
12:03<jjkola>so I think it should be : )
12:04|-|horst [~horst@a89-182-144-130.net-htp.de] has joined #uml
12:04<jdike>hmm
12:05<jdike>I think I see
12:06<jdike>when the COW file isn't on the command line, cow.file is filled in when it's opened
12:06<jdike>because until then, it doesn't know
12:07<jjkola>ok
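The ordering problem jdike describes, a limit that depends on cow.file being checked before cow.file is filled in, can be shown with a toy model. Everything here is hypothetical (`struct toy_dev`, `toy_add`, `toy_open`, the `example.cow` name); it is not the UML driver code, and the patch linked next in the log is not reproduced. Only the default of 255 sectors and the 8 * sizeof(long) limit come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_MAX_SECTORS 255            /* value seen when no limit applied */
#define COW_MAX_SECTORS (8 * sizeof(long)) /* limit quoted in the log */

/* Hypothetical stand-in for a device; not the real driver structure. */
struct toy_dev {
    const char *cow_file; /* NULL until the COW file is known */
    size_t max_sectors;
};

/* Apply the COW limit only if a COW file is already known. */
void toy_apply_limit(struct toy_dev *dev)
{
    assert(dev != NULL);
    if (dev->cow_file != NULL)
        dev->max_sectors = COW_MAX_SECTORS;
}

/* Registration step: runs before the COW file has been discovered. */
void toy_add(struct toy_dev *dev)
{
    dev->cow_file = NULL;
    dev->max_sectors = DEFAULT_MAX_SECTORS;
    toy_apply_limit(dev); /* no effect: cow_file is still NULL */
}

/* Open step: the COW file becomes known here. `reapply` selects whether
 * the limit is evaluated again once that is the case. */
void toy_open(struct toy_dev *dev, const char *cow_file, int reapply)
{
    dev->cow_file = cow_file;
    if (reapply)
        toy_apply_limit(dev);
}

/* Run add followed by open and report the resulting sector limit. */
size_t toy_limit_after_open(int reapply)
{
    struct toy_dev dev;

    toy_add(&dev);
    toy_open(&dev, "example.cow", reapply);
    return dev.max_sectors;
}
```

In this model `toy_limit_after_open(0)` leaves the device at 255 sectors, matching the printk output above, while `toy_limit_after_open(1)` ends up at 8 * sizeof(long).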
12:09<jdike>try this
12:09<jdike>http://rafb.net/p/C3VlOm89.txt
12:12<jjkola>ok, now it did start
12:15<jjkola>so that was the cause
12:19<jdike>OK, one more for -stable
12:24|-|mgross [~mgross@jffwpr02.jf.intel.com] has joined #uml
12:24<jdike>Hi mark
12:25<mgross>hi
12:38|-|jjkola [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
12:40|-|jjkola [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
12:43<jjkola>yesterday I talked about 2.6.23-rc2 host kernel segfaulting all uml kernels and here is one backtrace from the situation: http://rafb.net/p/tn80PQ92.html
12:43<jjkola>host is stock + exec patch + humfs patches
12:45|-|tyler_ [~tyler@adsl196-180-193-217-196.adsl196-15.iam.net.ma] has joined #uml
12:45<jjkola>ah, yes, I applied 2.6.22-skas3-v9-pre9 unofficial patch
12:47<jjkola>I had to do some manual applying for the patches but those parts shouldn't contain anything which would cause problems
12:48|-|tyler [~tyler@adsl196-65-226-206-196.adsl196-8.iam.net.ma] has quit [Ping timeout: 480 seconds]
12:48<jjkola>I can try to run some older uml kernel if that helps to pinpoint the problem
12:48<jdike>does it happen when you back out the skas patch?
12:49<jjkola>I'll try that
13:09|-|tyler_ [~tyler@adsl196-180-193-217-196.adsl196-15.iam.net.ma] has quit [Remote host closed the connection]
13:26|-|baroni [~baroni@tera.lsi.usp.br] has quit [Ping timeout: 480 seconds]
13:31|-|tchan [~tchan@c-24-13-84-219.hsd1.il.comcast.net] has quit [Ping timeout: 480 seconds]
13:45|-|tchan [~tchan@c-24-13-84-219.hsd1.il.comcast.net] has joined #uml
13:59|-|kokoko1 [~Slacker@203.148.65.8] has quit [Ping timeout: 480 seconds]
14:07|-|jjkola1 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
14:09|-|jjkola2 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
14:11|-|jjkola23 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
14:13|-|jjkola [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
14:13|-|jjkola23 changed nick to jjkola
14:15|-|jjkola1 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
14:15<jjkola>uml kernel did start when I used host kernel without skas patch
14:17<jjkola>but I would like to get it to work with skas patch as I have several umls running at the same time so they clutter the process list if run without skas patch
14:17|-|jjkola2 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
14:18<jjkola>besides this way takes much more memory and is definitely slower
14:31<jdike>Can you back the host out to whatever the skas patch matches?
14:35|-|da-x [karrde@bzq-88-155-223-135.red.bezeqint.net] has quit [Read error: Connection reset by peer]
14:36|-|da-x [karrde@bzq-88-155-223-135.red.bezeqint.net] has joined #uml
14:39<jjkola>sorry but I didn't fully understand what you said
14:43<jjkola>you want me to boot to the kernel with skas patches?
14:45<jdike>the skas patch you have didn't apply cleanly to your current kernel
14:45<jdike>what I'm suggesting is using a kernel that that patch was intended for
14:45<jjkola>ah
15:00|-|kokoko1 [~Slacker@203.148.65.8] has joined #uml
15:01[~]kokoko1 hates power outages
15:22<jjkola>was it a long one?
15:34<kokoko1>yeah more than 30m
16:14|-|jjkola1 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has joined #uml
16:17|-|jjkola changed nick to Guest263
16:17|-|jjkola1 changed nick to jjkola
16:18<jjkola>now let's see if the umls can run
16:19|-|nessie [~nessie@lucifer.nerdfest.org] has quit [Ping timeout: 480 seconds]
16:19<jjkola>hmm, it seems to work
16:19|-|Guest263 [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
16:20<jjkola>so it's a change in 2.6.23 series which makes the skas patch not work
16:24<jjkola>by the way, I used same config settings for both 2.6.22.1-skas3 and 2.6.23-rc2-skas3 builds
16:25<jjkola>so it's either some new feature or change of behaviour which is causing this
16:36|-|Blissex [~Blissex@82-69-39-138.dsl.in-addr.zen.co.uk] has joined #uml
17:03|-|horst [~horst@a89-182-144-130.net-htp.de] has quit [Remote host closed the connection]
17:07|-|dang [~dang@aa-redwall.nexthop.com] has quit [Quit: Leaving.]
17:12<jjkola>I'm building that rc2 kernel with "downgraded" config file from 2.6.22.1 to see if it works
17:23|-|Blissex [~Blissex@82-69-39-138.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection]
18:17|-|mgross [~mgross@jffwpr02.jf.intel.com] has quit [Quit: Leaving]
18:17<jjkola>hmm, the uml kernel which was using the "downgraded" config file segfaulted
18:17|-|jdike [~jdike@pool-72-70-38-233.bstnma.fios.verizon.net] has quit [Quit: Leaving]
18:18|-|hfb [~hfb@pool-72-87-254-188.lsanca.dsl-w.verizon.net] has quit [Quit: Leaving]
18:19<jjkola>so either there is some behaviour change or the code around the parts which were meant to be replaced has changed in some way
18:58<jjkola>hmm, the code path seems to be always same for the segfaulting uml kernels regardless of the version, give or take a few functions: http://rafb.net/p/NoJh3M21.html
18:59<jjkola>and the panic seems to always happen in function switch_mm_skas
19:22|-|dang [~dang@nemesis.fprintf.net] has joined #uml
19:48|-|ram [~ram@bi01p1.co.us.ibm.com] has quit [Ping timeout: 480 seconds]
19:59|-|jjkola [~jjkola@dsl-olubrasgw1-fe56fb00-241.dhcp.inet.fi] has quit [Ping timeout: 480 seconds]
21:09|-|tchan [~tchan@c-24-13-84-219.hsd1.il.comcast.net] has quit [Quit: WeeChat 0.2.6-cvs]
21:12|-|baroni [~baroni@bd21254a.virtua.com.br] has joined #uml
21:13|-|Urgleflogue [~plamen@83.228.65.158] has joined #uml
21:50|-|tchan [~tchan@c-24-13-84-219.hsd1.il.comcast.net] has joined #uml
22:45|-|ram [~ram@pool-71-245-108-202.ptldor.fios.verizon.net] has joined #uml
22:59|-|VS_ChanLog [~stats@ns.theshore.net] has left #uml [Rotating Logs]
22:59|-|VS_ChanLog [~stats@ns.theshore.net] has joined #uml
23:57|-|ram_ [~ram@pool-71-245-108-202.ptldor.fios.verizon.net] has joined #uml
23:57|-|ram [~ram@pool-71-245-108-202.ptldor.fios.verizon.net] has quit [Ping timeout: 480 seconds]
---Logclosed Fri Aug 10 00:00:02 2007