#uml IRC Logs for 2007-04-23

--- Log opened Mon Apr 23 00:00:43 2007
00:12|-|caker [] has quit [Server closed connection]
00:12|-|caker [] has joined #uml
01:12|-|wyvern [] has quit [Ping timeout: 480 seconds]
01:22|-|wyvern [] has joined #uml
02:40|-|motp [~motp@] has joined #uml
02:47|-|moalla [] has joined #uml
02:48|-|aroscha [] has quit [Quit: aroscha]
02:48|-|moalla [] has left #uml []
04:04|-|kybe__ [] has quit [Ping timeout: 480 seconds]
04:07|-|polyonymous [] has quit [Ping timeout: 480 seconds]
04:07|-|kybe [] has joined #uml
04:22|-|polyonymous [] has joined #uml
05:27|-|aroscha [] has joined #uml
06:14|-|AllenJB [] has quit [Server closed connection]
06:15|-|AllenJB [] has joined #uml
06:50|-|motp [~motp@] has quit [Quit: Leaving]
07:04|-|krau [~cktakahas@] has joined #uml
08:23|-|wyvern [] has quit [Ping timeout: 480 seconds]
08:28|-|baroni [] has joined #uml
08:34|-|wyvern [] has joined #uml
09:04|-|karrde_ [] has joined #uml
09:04|-|karrde_ [] has quit [Remote host closed the connection]
09:26|-|hfb [] has quit [Quit: Leaving]
09:33|-|da-x [] has quit [Remote host closed the connection]
09:34|-|da-x [] has joined #uml
09:34|-|da-x [] has quit [Remote host closed the connection]
09:35|-|da-x [] has joined #uml
09:39|-|da-x [] has quit [Remote host closed the connection]
09:43|-|da-x [] has joined #uml
09:45|-|shze [] has quit [Quit: Konversation terminated!]
10:36|-|hfb [] has joined #uml
10:41|-|hfb [] has left #uml []
10:47|-|jdike [] has joined #uml
10:47<jdike>Hi guys
11:11<albertito>jdike: Hi
11:31<albertito>I've got a problem with UML, the following testcase runs fine on the host but fails on the guest:
11:31<albertito>It fails because fchmod() returns ENOENT
11:33<albertito>It's using hostfs, and happens both with x86-64 and i386 (btw, I was amazed that I could do ARCH=um SUBARCH=i386), kernel is 2.6.21-r7
11:34<albertito>I noticed the problem because locale-gen (which is run in Debian when you do dpkg-reconfigure locales) fails because of that
11:35<albertito>Maybe it shouldn't, neither SUSv3 nor the Linux manpages specify the case where you fchmod() a deleted but existent file (although SUSv3 doesn't say it can return ENOENT and Linux does)
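The testcase itself did not survive in the log. A minimal sketch of the scenario albertito describes, assuming it amounts to calling fchmod() on a descriptor whose file has already been unlinked, might look like the following. The function name fchmod_after_unlink and the use of a temporary file are illustrative choices, not taken from the original testcase.

```python
import errno
import os
import tempfile


def fchmod_after_unlink(directory=None):
    """Create a file, unlink it while it is still open, then fchmod() the fd.

    Returns 0 on success, or the errno value if fchmod() fails.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.unlink(path)  # the name is gone, but the open fd keeps the file alive
        try:
            os.fchmod(fd, 0o644)
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        os.close(fd)


if __name__ == "__main__":
    result = fchmod_after_unlink()
    if result == 0:
        print("fchmod on unlinked file: ok")
    else:
        print("fchmod on unlinked file failed:",
              errno.errorcode.get(result, result))
```

Passing a directory that lives on a hostfs mount inside the guest (the directory argument) would be the way to compare host and guest behaviour with this sketch.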
11:36|-|krau [~cktakahas@] has quit [Quit: Leaving]
11:39|-|aroscha [] has quit [Quit: aroscha]
11:40|-|linbot [] has quit [Server closed connection]
11:40|-|linbot [] has joined #uml
11:40<jdike>albertito, hostfs has problems with deleted files
11:40<albertito>jdike: :S
11:40<jdike>the rewrite is better, but still not completely right
11:41<albertito>jdike: if there is a rewrite I assume there is little hope of some simple fix, is there?
11:41<jdike>well, the rewrite was done for a number of reasons
11:44<albertito>jdike: so, should I try the rewrite, or see if I can fix this issue?
11:45<jdike>try the externfs patches from patches.html
11:45<albertito>jdike: I will, thanks a lot!
11:45<jdike>it is a lot better in this regard
12:05<albertito>jdike: does it have any dependencies on other patches?
12:25|-|aroscha [] has joined #uml
12:41<jdike>you want every patch that says externfs
12:53|-|krau [~cktakahas@] has joined #uml
12:58|-|ram [] has joined #uml
13:46<albertito>jdike: it compiled but it hangs after mounting root (using hostfs as root), saying "Warning: unable to open an initial console." right after mounting root
13:48[~]amitg re-types his question
13:49<amitg>how can I get more info when the kernel panics within UML?
13:49<amitg>It says it receives a SIGSEGV in sig_handler_common_skas, which I don't think is the real cause
13:50<amitg>no backtrace, and gdb backtrace is not helpful..
14:36<jdike>can you paste the whole output somewhere?
14:45<aroscha>jdike: I have some questions later about scheduling and context switches in UML
14:46<aroscha>seems like this is the limiting factor for me now
14:50|-|krau [~cktakahas@] has quit [Quit: Leaving]
14:50|-|aroscha_ [~aroscha@] has joined #uml
14:57|-|aroscha [] has quit [Ping timeout: 480 seconds]
15:30|-|arun [~arun@chobie.cs.Virginia.EDU] has quit [Server closed connection]
15:30|-|arun [~arun@chobie.cs.Virginia.EDU] has joined #uml
15:43<aroscha_>what is the exact behaviour if I specify parameters as eth0=tuntap,$TAPDEV,,$IP ? If I leave $IP empty then ... the setup between host and guest is done automatically and if I specify it then I have to set it up manually ?
15:44<jdike>yes, except that if you want it done automatically, you also leave out $TAPDEV
15:45<aroscha_>so only if I leave out the $TAPDEV then the automatic mechanism works
15:45<aroscha_>that is what I wanted to know thx
15:49<aroscha_>very strange, I can tcpdump on the tapX device and see all the traffic on the inside but i can not ping inside
15:50<aroscha_>i mean ping to the inside
15:50<jdike>ping from the host?
15:50<jdike>you have a route from the inside to the host?
15:51<aroscha_>i want that they see each other on the arp level . Since I am testing routing daemons the daemon should set up the route by itself
15:52<aroscha_>so basically I need arp connectivity
15:52<aroscha_>I thought I do it like that: tap1 for the UML instance, tap99 for the tap device on the host
15:52<aroscha_>both are connected via a bridge
15:53<jdike>do you see traffic on the bridge?
15:53<aroscha_>i had it running already, but seems like I simply can not reproduce it anymore and was wondering why
15:53<aroscha_>tcpdump -ni brdevice shows me the packets
15:53<aroscha_>also on the tapX devices
15:53<jdike>do you see outside traffic on a UML eth?
15:53<aroscha_>outside (= the real LAN? eth0 on the host ?)
15:54<jdike>yes, except outside == anything besides the UML
15:54<jdike>another UML, the host, etc
15:55<aroscha_>yes, I can see that two UML instances actually communicate over the bridge
15:56<aroscha_>just the connection from the host to the UML instances is not possible
15:57<aroscha_>all devices are in the 10.x.x.x/8 IP range
15:57<jdike>so, if you see packets from another UML and host on the bridge at the same time, the UML sees only the other UML packets?
15:58<aroscha_>i will check that. have to add tcpdump to the instance
16:11<aroscha_>jdike: I can perfectly see the other UML instances from within one
16:12<jdike>just not the host?
16:12<jdike>that smacks of some filtering somewhere
16:13<aroscha_>i have no ebtables
16:13<aroscha_>default allow
16:15<aroscha_>default accept
16:15<aroscha_>I am guessing it is arp
16:16<jdike>but tcpdump inside UML should at least see host packets
16:16<jdike>even if it doesn't think they belong to it
16:17<aroscha_>no, it does not
16:17<aroscha_>ok, i will repeat it and go into details
16:17<jdike>that's why it looks like filtering
16:23<aroscha_>no, but this is quite interesting:
16:23<aroscha_># iptables -L
16:23<aroscha_>-sh: iptables: Input/output error
16:23<aroscha_>inside the UML instance
16:23<jdike>strace it
16:24<aroscha_>i am adding strace to my root_fs ;-)
16:27<aroscha_>no there is something bigger wrong: inside: # ls *iptables*
16:27<aroscha_>ls: iptables: Input/output error
16:27<aroscha_>ls: iptables-restore: Input/output error
16:28<jdike>maybe a filesystem problem
16:28<aroscha_>looks like it
16:30<aroscha_>no error message in dmesg
16:30<aroscha_># mount
16:30<aroscha_>proc on /proc type proc (defaults)
16:30<aroscha_># iptables -L
16:30<aroscha_>-sh: iptables: Input/output error
16:30<jdike>mount it ro and fsck it
16:34<aroscha_>from the outside fsck.ext2 says it is clean
16:34<aroscha_>for the inside fsck I still need to recompile busybox heeh
16:35<jdike>fsck -f
16:35<jdike>and don't fsck from the outside when a UML is running on it
16:35<aroscha_>there was none
16:35<jdike>but just add -f and see what it says
16:36<aroscha_>root@texas:/home/aaron# fsck.ext2 -f ff-8-root_fsBUSY2
16:36<aroscha_>e2fsck 1.40-WIP (14-Nov-2006)
16:36<aroscha_>Pass 1: Checking inodes, blocks, and sizes
16:36<aroscha_>Pass 2: Checking directory structure
16:36<aroscha_>Pass 3: Checking directory connectivity
16:36<aroscha_>Pass 4: Checking reference counts
16:36<aroscha_>Pass 5: Checking group summary information
16:36<aroscha_>ff-8-root_fsBUSY2: 433/2048 files (1.2% non-contiguous), 6907/8192 blocks
16:36<aroscha_>it is fine
16:36<jdike>but ls iptables still gives you -EIO?
16:45|-|aroscha [~aroscha@] has joined #uml
16:45<aroscha>ok, ... this was some really weird FS stuff.
16:45<aroscha>i restarted the instance now and it is fine
16:45<aroscha>ls works
16:45<aroscha>good, I found out that the module for iptables was missing doh
16:46<aroscha>but still. AFAIK iptables rules in linux should be default allow in case you do not have the modules loaded, right?
16:46<aroscha>I mean of course the equivalent of default allow
16:46|-|aroscha_ [~aroscha@] has quit [Read error: No route to host]
16:46<jdike>I guess
16:47<aroscha>anything else would make it cease to be a server ;-)
17:10|-|ElectricElf [] has quit [Server closed connection]
17:11|-|ElectricElf [] has joined #uml
17:36<albertito>jdike: I wrote a patch to make hostfs work when doing set_attr() on deleted files, by checking if the inode has an open fd and using the functions that take an fd instead of a path
17:37<albertito>(because I couldn't find how to use externfs as root)
17:37<albertito>jdike: should I send it to you, send it to a mailing list, or just sit on it because externfs is the way to go?
18:09<jdike>to the list
18:17<albertito>jdike: I'll send it later, thanks a lot!
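The idea behind albertito's patch (use the fd-based call when an open descriptor is available, otherwise fall back to the path) can be illustrated in user space. This is only an analogy in Python, not the hostfs kernel code; set_mode and its open_fd parameter are hypothetical names introduced for the sketch.

```python
import os
import tempfile


def set_mode(path, mode, open_fd=None):
    """Change file permissions, preferring an open descriptor over the path.

    A path-based chmod() fails with ENOENT once the name has been unlinked,
    while fchmod() on a still-open descriptor keeps working.
    """
    if open_fd is not None:
        os.fchmod(open_fd, mode)
    else:
        os.chmod(path, mode)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.unlink(path)                      # deleted, but still open
    set_mode(path, 0o600, open_fd=fd)    # succeeds through the descriptor
    print(oct(os.fstat(fd).st_mode & 0o777))
    os.close(fd)
```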
18:21<aroscha>jdike: hmmm. I rebooted everything and still have the same behaviour. Maybe it is just a simple stupid mistake but... can you cross-check please? 1) ./linux ... eth0=tuntap,$TAP,,10.0.$x.$y 2) ifconfig $TAP 10.0.$x.$y up 3) brctl addif mybridge $TAP 4) tunctl -t hosttap 5) ifconfig hosttap up 6) brctl addif mybridge hosttap
18:21<aroscha>I don't see why the UMLs can talk via the bridge but not the hosttap device
18:23<jdike>looks OK offhand
18:23<aroscha>think so too
18:23<aroscha>and indeed they do talk, just not the host's tap
18:23<jdike>I'm wondering about the use of 10.0.0.x and 10.99.0.x, but I don't see that breaking ethernet connectivity
18:24<aroscha>but the host's tap device actually can see the traffic (checked via tcpdump)
18:24<aroscha>well it is all in a /8 net
18:24<jdike>actually, devices in a bridge aren't supposed to have IP addresses assigned
18:24<jdike>dunno if that would affect anything here
18:24<aroscha>ok, can try without
18:25<aroscha>it is quite strange because I had it working already
18:25<jdike>what did you change?
18:27<aroscha>i had it working at 4:00 a.m. my time yesterday hehe
18:27<aroscha>and today I tried to find out why hahaha
18:28<jdike>that'll teach you
18:28<aroscha>yes, as my mommy said "always document" hehe
18:29<jdike>or, if it's late and you have it working, don't turn it off
18:29<jdike>and have everything in your shell history
18:35|-|jdike [] has quit [Quit: Leaving]
18:42<aroscha>did it!
18:42<aroscha>and it was not documented
19:51|-|shze [] has joined #uml
20:00|-|ram [] has quit [Ping timeout: 480 seconds]
20:20|-|sbw [] has joined #uml
20:26<aroscha>caker: what kind of HW do you use for hosting?
20:26<aroscha>for supporting 40 instances?
20:27<caker>aroscha: <-- scroll to Server Hardware. Most of our machines are now the top most entry
20:28<aroscha>hmm... very similar
20:28<aroscha>ram is really expensive
20:29<caker>It's like 4/5ths the cost of the entire rig
20:29<aroscha>yeah... the hidden costs of virtualization hehe
20:30<aroscha>so, disappointment over here... my bridge only stays free of packet loss when I connect no more than 10-20 instances to it.
20:30<aroscha>Everything else means some kind of packetloss on the bridge
20:30<aroscha>but! the bridge has to copy over every packet to every other instance
20:30<aroscha>so that is O(N^2) and when n -> 30 then we have to send 900 pkts/sec over a brctl bridge
20:31<aroscha>is that a lot or not?
20:31<aroscha>I mean should I start to worry that I did something wrong?
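The O(N^2) estimate above can be checked with a few lines of arithmetic. This is a rough sketch that assumes every instance sends one broadcast (or otherwise flooded) packet per second and that the bridge copies each packet to every other port; neither the per-instance rate nor the function name bridge_copies_per_sec comes from the log.

```python
def bridge_copies_per_sec(instances, pkts_per_instance_per_sec=1.0):
    """Packet copies per second a bridge makes when every packet is flooded.

    Each packet from one instance is copied to the other (instances - 1)
    ports, so the total grows roughly with the square of the instance count.
    """
    return instances * (instances - 1) * pkts_per_instance_per_sec


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "instances:", bridge_copies_per_sec(n), "copies/sec")
```

With 30 instances this gives 870 copies per second, which is where the rounded figure of about 900 comes from. Unicast traffic to an address the bridge has already learned goes to a single port only, so the flooded case is the upper end.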
20:40|-|baroni [] has quit [Ping timeout: 480 seconds]
20:53|-|baroni [] has joined #uml
21:04<fo0bar>aroscha: at work we have a few UML machines, splitting a PD 3.4GHz 4GB machine into 8x 384MB. Nothing fancy, but my point is 1GB sticks are a lot cheaper than 2GB/4GB :)
21:04<aroscha>yes, agreed
21:04<aroscha>except for when you take into account the electricity
21:05<aroscha>for more servers
21:05<aroscha>would be interesting to compare
21:05<aroscha>ah! in the States you still pay 4 cents (US) per kWh right?
21:05<fo0bar>well, in my specific case each one of those instances were (or were going to be before we started with UML) individual servers, so we're definitely saving money over just not using UML
21:06<fo0bar>but yeah, there might be a sweet spot for server cost vs electricity cost
21:06<aroscha>over here we pay something like 7 cents (EU) per kWh
21:06<aroscha>so, almost double!
21:07[~]caker ponders converting amp/month into kWh
21:07<aroscha>caker: 200 Watt server =~ 1Amp at 220 V
21:07<caker><-- $13 to about low $20s an Amp
21:07<caker>per month.
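The conversion caker is pondering can be sketched as follows. The supply voltage is not stated for caker's hosting, so it is left as a parameter and any value passed in is an assumption; the 30-day month and the function names are likewise illustrative.

```python
def kwh_per_month(amps, volts, hours=24 * 30):
    """Energy drawn in one month by a constant load, in kWh."""
    return amps * volts * hours / 1000.0


def cost_per_kwh(price_per_amp_month, volts, hours=24 * 30):
    """Convert a price quoted per amp per month into a price per kWh."""
    return price_per_amp_month / kwh_per_month(1.0, volts, hours)


if __name__ == "__main__":
    # aroscha's example: roughly 1 A at 220 V, running all month.
    print(kwh_per_month(1.0, 220.0), "kWh per month")
    # caker's quoted range per amp per month; the 120 V figure is an assumption.
    for price in (13.0, 20.0):
        print(price, "per amp-month ->", cost_per_kwh(price, 120.0), "per kWh")
```

A per-amp price from a hosting provider usually covers more than raw electricity, so the result is not directly comparable with a household kWh tariff.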
21:08<caker>aroscha: I know what our servers use, just trying to get it into your unit
21:08<fo0bar>I haven't checked in awhile, but I seem to remember there are so many fees and regulatory items and whatnot (with different rules for when they get applied) that it was impossible to say "electricity costs me X per kwh"
21:08<caker>that didn't sound right.
21:09<fo0bar>and I know costs vary wildly depending on where in the US you are
21:10<aroscha>10 euros to 20 euros per server
21:10<aroscha>fo0bar: ah
21:11<aroscha>fo0bar: but what I am finding out is that memory is not really the issue so much
21:11<aroscha>at least for my experiments with many very similar root_fses
21:12<aroscha>but the context switch rate is really high and same for the IRQ rate (as caker already noted)
21:12<aroscha>70k CS / sec is quite a lot!
21:13<aroscha>so i guess the best is to have quadcore ?
21:13<aroscha>or i take my linux book and try to help jdike to get down the CS rate hehe
21:18|-|apic [] has quit [Server closed connection]
21:18|-|apic [] has joined #uml
21:35<sbw>Hi, how is jiffies incremented in uml? I can't seem to find where it is in the code..
21:37|-|fghj [~tkxue@agp.Stanford.EDU] has joined #uml
22:14|-|aroscha [~aroscha@] has quit [Quit: aroscha]
22:28|-|fghj [~tkxue@agp.Stanford.EDU] has quit [Quit: Leaving]
22:50|-|aroscha [] has joined #uml
22:53|-|fghj [~tkxue@agp.Stanford.EDU] has joined #uml
22:59|-|VS_ChanLog [] has left #uml [Rotating Logs]
22:59|-|VS_ChanLog [] has joined #uml
23:09|-|Hunger [] has quit [Server closed connection]
23:10|-|Hunger [] has joined #uml
23:13|-|fghj [~tkxue@agp.Stanford.EDU] has quit [Quit: Leaving]
23:16|-|tasaro [] has quit [Server closed connection]
23:16|-|tasaro [] has joined #uml
23:22|-|VS_ChanLog [] has quit [Server closed connection]
--- Log closed Tue Apr 24 00:00:41 2007