Created May 2, 2011 09:28
<[email protected]> [03:10] i think we should commit an optimization for that phoronix threaded i/o tester
<[email protected]> [03:10] that just ignores about 80% of the i/o's
<[email protected]> [03:10] :-)
<[email protected]> [03:11] we could certainly optimize buffers containing all-zeros (if we don't already). heh
<[email protected]> [03:11] that might be useful
<[email protected]> [03:11] they are probably all zeros
<[email protected]> [03:11] i looked at what it did
<[email protected]> [03:11] it just allocated a chunk ala malloc() and then wrote that allocated buffer's contents to a file
<[email protected]> [03:12] it never touched the buffer
<[email protected]> [03:13] how do you know if it's all zeros, in an efficient fashion?
<[email protected]> [03:13] i guess you compute a crc for the data, you are touching it all anyway, you could just not actually store it
<[email protected]> [03:14] freebsd's pagezero() on i686 actually finds the first nonzero index in a page and starts zeroing from there :d
<[email protected]> [03:16] it would be easy for the crc32 code to have another argument which returns the zero/non-zero state of the buffer
<[email protected]> [03:16] well, I guess it would be faster just to scan the buffer twice anyway
<[email protected]> [03:17] well, what's the crc32 of a zero-filled buffer?
<[email protected]> [03:17] you could just conditionalize on it
<[email protected]> [03:18] you'd still have to check that the buffer contains all-zeros, but yes that could be a first-order approximation to avoid the zero-check if it doesn't match
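The zero-detection idea discussed up to this point can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not HAMMER code: buf_is_zero(), crc32_bitwise() and buf_is_zero_crc_hint() are hypothetical names, and the bitwise CRC-32 (reflected polynomial 0xEDB88320) is a stand-in for whatever crc32 routine the filesystem already runs over the data. The CRC comparison only rules buffers out; a match still has to be confirmed by a full scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[0..len) holds only zero bytes. */
static int
buf_is_zero(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i = 0;
	uint64_t w;

	assert(buf != NULL || len == 0);
	/* Scan a word at a time; memcpy sidesteps alignment concerns. */
	for (; i + sizeof(w) <= len; i += sizeof(w)) {
		memcpy(&w, p + i, sizeof(w));
		if (w != 0)
			return 0;
	}
	/* Remaining tail bytes. */
	for (; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Stand-in CRC-32 (assumption: not the kernel's own crc32 code). */
static uint32_t
crc32_bitwise(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/*
 * crc is the checksum already computed for buf; zero_crc is the
 * precomputed checksum of a zero-filled buffer of the same length.
 * A mismatch proves the buffer is not all zeros; a match is only a
 * hint and is verified with a full scan.
 */
static int
buf_is_zero_crc_hint(const void *buf, size_t len, uint32_t crc,
    uint32_t zero_crc)
{
	if (crc != zero_crc)
		return 0;
	return buf_is_zero(buf, len);
}
```

With this shape, non-zero buffers (the common case) cost one integer comparison on top of the CRC that is computed anyway, and only likely-zero buffers pay for the second scan.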
<[email protected]> [03:18] would it be hard to make hammer smart enough to not lay the zeros down on disk?
<[email protected]> [03:18] no, it would be trivial.
<[email protected]> [03:18] yeah that's what i figured
<[email protected]> [03:18] stop encouraging each other ! :)
<[email protected]> [03:19] you'd just have a data record with a data_offset of 0 as a special case meaning 'buffer full of zeros'
<[email protected]> [03:19] it might already be coded or partially coded. it would be very easy to implement
<[email protected]> [03:23] dillon: hammer_object.c somewhere?
<[email protected]> [03:24] mmmm. I'd have to look. hold on a sec
<[email protected]> [03:24] for writes, hammer_ip_add_bulk()
<[email protected]> [03:25] check for all zeros, and don't allocate a data offset (set record->leaf.data_offset to 0). well, there might be a learning experience there, those code paths are complex but the actual mod isn't going to be very complex
<[email protected]> [03:26] ok that's what i thought, i was looking at the dedup hooks
<[email protected]> [03:26] then on the read-back side checking for a data_offset of 0 and zero-filling the read buffer
<[email protected]> [03:26] instead of doing a direct data read
<[email protected]> [03:26] read side just allocates and return()'s a buffer?
<[email protected]> [03:26] or similar?
<[email protected]> [03:27] read side is accessing hammer via its b-tree and filling in buffer cache buffers for the related file
<[email protected]> [03:27] so in the all-zeros case it would check that the b-tree record has no data offset (implied all-zeros) and bzero()'s the buffer cache buffer
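The write-side and read-side behaviour described above can be modelled in a few lines of userland C. This is a toy sketch, not the HAMMER implementation: struct toy_disk, struct toy_record, toy_write() and toy_read() are hypothetical names, the "disk" is an in-memory array, and the only detail taken from the discussion is that a data_offset of 0 means "buffer full of zeros". The toy allocator starts at a non-zero offset so that 0 can never refer to real data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_DISK_SIZE	65536
#define TOY_FIRST_OFF	16	/* offset 0 is reserved: "all zeros" */

struct toy_disk {
	unsigned char bytes[TOY_DISK_SIZE];
	uint64_t next_off;	/* next free byte, starts at TOY_FIRST_OFF */
};

struct toy_record {
	uint64_t data_offset;	/* 0 => implied zero-filled data */
	uint32_t data_len;
};

static int
toy_is_zero(const unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Write side. Returns 0 on success, -1 if the toy disk is full. */
static int
toy_write(struct toy_disk *d, struct toy_record *r, const void *buf,
    uint32_t len)
{
	assert(d->next_off >= TOY_FIRST_OFF && d->next_off <= TOY_DISK_SIZE);
	r->data_len = len;
	if (toy_is_zero(buf, len)) {
		/* All zeros: allocate nothing, mark the record instead. */
		r->data_offset = 0;
		return 0;
	}
	if (len > TOY_DISK_SIZE - d->next_off)
		return -1;
	memcpy(d->bytes + d->next_off, buf, len);
	r->data_offset = d->next_off;
	d->next_off += len;
	return 0;
}

/* Read side. dst must have room for r->data_len bytes. */
static void
toy_read(const struct toy_disk *d, const struct toy_record *r, void *dst)
{
	if (r->data_offset == 0)
		memset(dst, 0, r->data_len);	/* zero-fill, no data read */
	else
		memcpy(dst, d->bytes + r->data_offset, r->data_len);
}
```

In the real filesystem the read side fills buffer cache buffers found through the b-tree rather than copying from an array, but the branch on a zero data offset would have the same shape.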
<[email protected]> [03:28] I think you could do it fairly easily inside a vkernel for testing
<[email protected]> [03:28] dillon: yeah i'll try that
<[email protected]> [03:28] conditionalize the code on hammer filesystem version 6 (the current WIP version) so it doesn't execute on your root filesystem
<[email protected]> [03:28] then use a small hammer partition formatted w/ hammer and upgraded to version 6 for testing. or inside a vkernel
<[email protected]> [03:31] thesjg: another thing that can be done is to check for the append-all-zeros case. In that case no new record needs to be written at all, the file size is simply adjusted and it is a hole
<[email protected]> [03:31] dillon: ahh, yeah, good call, i'll note that too
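The append-all-zeros case can be sketched the same way. Again this is a hypothetical model rather than HAMMER code: struct toy_file and toy_file_write() are made-up names, and records are only counted, not stored. A write that starts exactly at end-of-file and contains only zeros grows the file size and writes no record, leaving a hole; anything else is treated as a normal record write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_file {
	uint64_t size;		/* logical file size in bytes */
	unsigned nrecords;	/* data records notionally written */
};

static int
toy_all_zero(const unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/*
 * Returns 1 if a data record was written, 0 if the write was
 * absorbed as a hole by extending the file size.
 */
static int
toy_file_write(struct toy_file *f, uint64_t off, const void *buf, size_t len)
{
	assert(f != NULL);
	if (off == f->size && toy_all_zero(buf, len)) {
		/* Appending zeros: just move end-of-file. */
		f->size += len;
		return 0;
	}
	/* Overwrites and non-zero data still need a record. */
	f->nrecords++;
	if (off + len > f->size)
		f->size = off + len;
	return 1;
}
```

Note that only the append case can skip the record entirely: a zero-filled write over existing data still has to record that the old contents were replaced.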