-
gitomat
[illumos-gate] 13356 syslog(3c) should not open syslog_door without LOG_CONS -- David Hanisch <titanic⊙dd>
-
sjorge
jbk: have you had any more time to look at the viona <-> sockfs not-returning-the-buffer issue?
-
sjorge
I ran into it again today when doing an scp
-
jbk
no i haven't
-
jimklimov
associative line: can it be similar to MTU mismatch problems?
-
jimklimov
we had some back in the day, with modems in the data path of cross-country links: maybe windowing didn't work, maybe some VPN was unaccounted for, but packets some 4 bytes under the ethernet maximum worked and full-size ones did not, or only intermittently
-
jimklimov
like you can spend half an hour in an SSH session and then a big directory listing kills it
-
jimklimov
hm, or VLAN
-
jimklimov
scp often stalled at 128KB
-
nbjoerg
jimklimov: complete or for a moment?
-
jbk
this problem is something with either tcp or sockfs + mac loopback path
-
sjorge
jbk ack
-
jbk
basically packets from viona->non-viona zone on the same box get queued up in the netstack of the non-viona zone
-
jbk
since they're loaned from viona
-
jbk
once viona runs out of available descriptors, tx from the bhyve instance stops
-
jimklimov
IIRC when scp stalled like that, it was done for
-
jimklimov
was long time ago, memory is faded :)
-
jimklimov
maybe it somehow came around after some minutes, unless a tcp timeout kicked in
-
jimklimov
I think the generic solution was, if VPN was involved, to just make the VPN interface MTU small, like 1-1.2kb to surely pass, and it became stable
-
jbk
yeah, i could kind of reproduce it with scp.. and after a while some timer appears to kick in which unclogs things.. but for non-scp stuff, that doesn't seem to happen
-
LeftWing
Don't you just need a simple TCP server in the zone which maxes out the socket buffer size and doesn't read anything that it gets sent?
-
LeftWing
If it's a lending issue
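
A minimal sketch of the kind of test described here, assuming plain Python sockets: a TCP receiver that asks for a large receive buffer and then never reads, plus a sender that writes until the kernel stops accepting data. The function names `open_stalled_pair` and `send_until_blocked` are hypothetical, and both ends run over loopback only so the sketch is self-contained. For the case discussed here, the receiver would presumably sit in the non-viona zone and the sender in the bhyve guest.

```python
import socket


def open_stalled_pair(rcvbuf=1 << 20):
    """Return (sender, receiver) TCP sockets; the receiver is never read from."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request a large receive buffer before listening; accepted sockets
    # typically inherit it, and the kernel may clamp the value.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    listener.close()
    return sender, receiver


def send_until_blocked(sender, chunk=64 * 1024, limit=256 << 20):
    """Send until the socket would block; return (bytes_sent, blocked)."""
    sender.setblocking(False)
    payload = b"x" * chunk
    total = 0
    while total < limit:
        try:
            total += sender.send(payload)
        except BlockingIOError:
            # Both the peer's receive buffer and our send buffer are full.
            return total, True
    return total, False


if __name__ == "__main__":
    sender, receiver = open_stalled_pair()
    total, blocked = send_until_blocked(sender)
    print(f"queued {total} bytes, blocked={blocked}")
    sender.close()
    receiver.close()
```

With a receiver that never reads, everything sent stays queued on the receiving side, which is the condition under which loaned buffers would be held for a long time.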
-
jbk
well the problem is there's supposed to be a 'please stop sending me stuff' signal that appears to get ignored
-
jbk
which if that was working, i think would at least stop things before the sender's descriptors were exhausted, allowing other TX traffic to proceed
-
jbk
there's also probably an argument to be made that when stuff is lent, there should be a way for the receiver to know and decide 'hey it's going to be a while, let me copy it and release the loaned resource'.. though how to determine when to do that gets tricky
-
jbk
we (amusingly) have some old code (I think related to zero copy) to deal with that on the outbound side, but not for receive
-
pmooney
it's a pretty sticky problem, IMO
-
pmooney
since normally there's no impact for those long lending times
-
pmooney
I'm not sure anything besides viona would benefit from a system for copying those buffers after a spell
-
LeftWing
Could we just stop lending once you've lent half of the available resources?
-
pmooney
we could
-
pmooney
but it's still visible from the guest
-
pmooney
if I ask a NIC to send a packet, and it doesn't for 30 seconds or more...
-
LeftWing
Yeah but at least it wouldn't entirely stall
-
rmustacc
With most NIC drivers we reserve an upfront amount of descriptors for loaning, fwiw.
-
rmustacc
And fall back to alloc/bcopy when we run out.
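
A toy model of the reserve-then-copy policy mentioned here, which also covers the "stop lending at half" idea above. This is only an illustration: `DescriptorRing`, `transmit`, and `release` are hypothetical names, not illumos MAC or viona interfaces, and the memoryview/bytearray pair merely stands in for loaned versus copied buffers.

```python
class DescriptorRing:
    """Toy ring that loans at most `loan_limit` buffers, then copies."""

    def __init__(self, total, loan_limit):
        self.total = total            # descriptors in the ring
        self.loan_limit = loan_limit  # cap on buffers out on loan
        self.loaned = 0               # loans not yet returned
        self.copies = 0               # packets delivered by copy

    @property
    def available(self):
        # Only outstanding loans pin descriptors; copies free them at once.
        return self.total - self.loaned

    def transmit(self, payload):
        """Hand a packet to the receiver; return (buffer, was_loaned)."""
        if self.loaned < self.loan_limit:
            self.loaned += 1
            # Zero-copy view: the descriptor stays busy until release().
            return memoryview(payload), True
        self.copies += 1
        # Copy path: the receiver owns its own bytes, descriptor is reusable.
        return bytearray(payload), False

    def release(self):
        """Receiver returns one loaned buffer."""
        if self.loaned == 0:
            raise ValueError("no outstanding loans")
        self.loaned -= 1


if __name__ == "__main__":
    ring = DescriptorRing(total=8, loan_limit=4)
    for _ in range(10):  # the receiver never releases anything
        ring.transmit(b"packet")
    print(ring.loaned, ring.copies, ring.available)
```

In this model a receiver that never returns buffers can pin at most `loan_limit` descriptors, so the sender keeps making progress through the copy path instead of stalling outright.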
-
pmooney
LeftWing: And there's still the issue of VM shutdown
-
LeftWing
I guess you can't just leave the relevant pages around with the current architecture
-
pmooney
definitely not today
-
pmooney
even if you could, teardown of viona gets a little weird
-
pmooney
closing the rings down with outstand mblks which expect to make transmit-complete notification callbacks would be a little grimy
-
pmooney
*outstanding
-
pmooney
one would certainly need to take exquisite care to not leak anything (or use-after-free) if mblks were allowed to be left hanging like that
-
rmustacc
grimy subsystems is my second middle name. I suspect we can figure out a way to make it work.
-
pmooney
well, I think a lot of that grime would fall to viona
-
pmooney
wouldn't shock me if the ultimate fix was to just bcopy in the case of loopback :-/
-
pmooney
we already had to undo the checksum offload for loopback
-
pmooney
*checksum elision
-
rmustacc
Yes, but I think that'll change over time.
-
rmustacc
There's a big difference between opt-in versus opt-out.
-
pmooney
perhaps!
-
rmustacc
I mean, there is a long history of IP opt-in already for this stuff.
-
sjorge
So far the pkgsrc cache thing in a pkgbuild zone triggers it and from what I can tell in today's case... a cherrypy webserver after sending it a few requests.
-
sjorge
Killing the zone fixed the issue, killing just the bhyve vm... made it unable to boot again, probably as hinted above because the viona thingy is not torn down yet
-
sjorge
I'd definitely personally
-
sjorge
Opt for less performance over having to deal with this.
-
sjorge
Max I waited was about an hour and it didn't continue... so I assume some magic timeout does not fix it in all cases
-
LeftWing
I wonder if we could first add a tunable that forces copies across the board here
-
LeftWing
And you could tune it on
-
LeftWing
Without hopefully a great deal of upheaval
-
LeftWing
We should be correct before we are fast
-
LeftWing
Is there a bug filed for the hang/stall
-
jbk
i thought so.. let me see if i can find it