-
tsoome
nice, buggy tests in zfs-tests:)
-
jbk
yeah let me look at mine..
-
jbk
hmm.. a lot more failures than normal..
-
jbk
it looks like almost all of the 'new' ones are a variant of: set_partition: line 769: 928881: Abort(coredump)
-
jbk
which appears to be the format command tripping over the stack protection
-
andyf
I expect there will be a few of those still to find :)
-
jbk
-
jbk
i think it's in there
-
jbk
(though still digging)
-
tsoome
about 30 fails?
-
jbk
-
tsoome
my vm was still struggling, low swap space, / full etc.
-
jbk
though the 'setup' ones means none of the tests in that group ran
-
jbk
but it looks like this format issue might be the cause for a lot/maybe all of those
-
tsoome
aye
-
andyf
does your stack trace show fdisk_init_master_part_table twice and fdisk_read_master_part_table not at all?
-
jbk
yes
-
jbk
but i think might be due to 'syscall' not setting up a stack frame
-
jbk
of course looking at it
-
jbk
that function
-
jbk
if that is the call
-
jbk
it'd suggest it's either DKIOCGMEDIAINFOEXT or DKIOCGMEDIAINFO that's clobbering things
-
tsoome
ou, that would be nasty
-
tsoome
-
tsoome
altho, its looking familiar:)
-
andyf
Now that /is/ shiny
-
andyf
yes, 32-bit sizeof (struct dk_minfo_ext) = 0x14, 64-bit = 0x18
-
jbk
aha!
-
jbk
good catch
-
yuripv
certainly better than your previous screenshot (with broken menu)
-
andyf
that's ::sizeof in mdb
-
andyf
looks similar to the lx one that was fixed in Joyent recently..
-
tsoome
yuripv yea, i had some garbage left on stack:)
-
yuripv
:)
-
jbk
yeah though that's public(ish)
-
jbk
so i'm not sure if we can fix it's definitions
-
jbk
-s
-
andyf
so the kernel needs to not copyout the 4 bytes of padding at the end for 32-bit processes
-
andyf
or something..
-
danmcd
Wasn't an LX fix, it was a varpd (our overlay driver virtual arp daemon) fix. Same problem though with padding where it was not expected.
-
andyf
danmcd: right, sorry, misremembered
-
jbk
i wonder if it'd be enough to add __pack for the definition (at least on the kernel side)
-
danmcd
commit 6590e107290fe4cfe0c5189a9a28be863b9cf577 in illumos-joyent
-
jbk
.. on the other hand.. adding explicit padding bytes..
-
danmcd
6 one half-dozen the other.
-
jbk
well the struct is arguably public
-
jbk
so we'd be changing its size for 32-bit apps
-
jbk
but..
-
danmcd
Does it have distinct 32 & 64 bit?
-
jbk
no
-
danmcd
If it doesn't, you might be better off calling out the pad bytes explicitly. I see what you mean.
-
andyf
it's probably easier than checking the datamodel in the four places that do copyout
-
jbk
well it's in every driver that supports the ioctl
-
jbk
but the ioctl was added in 2009
-
jbk
so this has been happening already since then
-
jbk
seemingly w/o problems
-
jbk
so i'm not sure it'd be that horrible to just be explicit with the padding
-
jbk
(i.e. we've been copying out the extra 4 bytes for > 10 years)
-
andyf
Just swap the first two local variables in the function then, and run away.
-
andyf
On that note, I'm off.. bbl
-
jbk
i'm happy to throw up a change for this, though it'd be nice to know it won't be instantly shot down for technically ABI breakage (though effectively not)
-
tsoome
build 64-bit format:)
-
rmustacc
I can be a third-party opinion on ABI breakage, if that's useful.
-
jbk
if you want to avoid the scroll, the issues is the 64-bit struct dk_minfo_ext has 4 bytes of padding not present in the 32-bit version, so every kernel driver that implements the DKIOCGMEDIAINFOEXT ioctl has been copying out an extra 4 bytes of data
-
jbk
err issue
-
jbk
which now is getting caught by ssp
-
tsoome
so we need to fix drivers
-
jbk
we could create struct dk_minfo_ext32, and fix all the in-gate drivers that support the ioctl, though that won't help any unbundled drivers (at least w/o code changes on their part)
-
jbk
or we could just explicitly add the padding
-
jbk
to the definition
-
jbk
the latter is obviously simpler, but technically abi-breaking.. but the extra 4 bytes copied out for 32-bit callers has been there since the ioctl was added > 10 years ago.. seemingly w/o harm
-
jbk
(obviously no harm due to luck)
-
jbk
(this is causing libfdisk to trigger the ssp protections, crashing format)
-
yuripv
good excuse to burn format?
-
jbk
that doesn't fix the underlying problem
-
rmustacc
So the problem here is that the 32-bit size and 64-bit size mismatch.
-
jbk
yeah
-
rmustacc
All we can do is fix the kernel and do proper ILP32/LP64 work.
-
rmustacc
And be sad that we missed it originally and work on building better tooling for the future.
-
rmustacc
Which means, yes, unbundled drivers will need to be fixed, but I don't think there's another solution here.
-
jbk
if we just make the padding explicit so the 32 and 64-bit versinos are the same size, nothing else really needs to be fixed -- but it does mean existing stuff will get the extra trailing 4 bytes copied after the struct, but that's already been happening for the past 10 years (and if they recompile, it goes away)
-
rmustacc
jbk: I don't think we should do that. Because it means we're getting lucky.
-
rmustacc
We should just deal with this in the kernel.
-
jbk
though the status of the ioctl isn't clear either i suppose -- it's never actually been explicitly marked public, though the header file has been shipped (though that could be a very legacy bit)
-
jbk
either way, i've filed #13324 for the bug itself
-
richlowe
The old theory was that anything public would also be documented
-
richlowe
but that theory is also pretty useless here in the present.
-
richlowe
the best thing to do is look for actual consumers, github searches an such are good proxies for that.
-
jbk
tsoome: after working around #13324, everything looks good for the 64-bit zfs/zpool
-
jbk
the only zfs test failures are ones known to fail
-
richlowe
andyf: could you look at the updated uts/lint review?
-
jbk
if you need me to send an email or such, let me know
-
tsoome
thanks!
-
tsoome
once I get more disk space, I should be able to get zfs-tests running myself as well:D
-
yuripv
how much does it require nowadays anyway?
-
tsoome
3 x 10GB for tests, but I do not know about rpool yet (some test was creating ufs image on /var/tmp and it was filling up my /)
-
_mjg
i hear openzfs takes 80GB
-
_mjg
you may as well future-proof yourself
-
tsoome
yes.. well, I am using VM, and there is only so much of the disk space available on my laptop.
-
richlowe
tsoome: I'd hope that location was tweakable? if not, maybe file a bug.
-
tsoome
richlowe no idea, I haven't had time yet to dig into... but yes, we should just dig through those tests and make them to work.
-
tsoome
also would be really nice to have ability to resume from some specific test (set)
-
andyf
richlowe - done
-
jbk
heh
-
jbk
i'm pretty sure i got an email intended for someone else
-
jbk
but it's also close enough that i'm not entirely sure
-
jbk
(someone talking about loader configurations but for something i've never heard of)
-
jbk
richlowe: rmustacc: searching for DKIOCGMEDIAINFO on github as well as just a general google search only turned up what appears to be an 'example' program to get the block sizes
-
jbk
err DKIOCGMEDIAINFOEXT
-
jbk
in terms of code
-
jbk
(outside of copies of illumos-gate).. the rest apepar to be copy of man pages
-
gitomat
[illumos-gate] 13323 CTF forward test needs adjusting after 13278 -- Andy Fiddaman <omnios⊙ccu>
-
richlowe
jbk: "SSP protection"
-
richlowe
grrrr
-
jbk
just call me jbk two times
-
jbk
:P
-
jbk
. and fixed