Thursday, March 20, 2008

The contract of IO#read

It's interesting. After Charlie made an immense effort and rewrote our IO system, basically from scratch, I have started to find bugs. These are generally not bugs in the IO code, though, but bugs in Ruby libraries that depend on the way MRI usually works. One of the more annoying ones involves IO#read(n), where n is the length you want to read.

This method is not guaranteed to return a string of length n, even if we haven't hit EOF yet. You can NEVER be sure that what you get back is the length you requested. Ever. If you have code that doesn't check the length of the returned string from read, you are almost guaranteed to have a bug just waiting to happen.

Of course, it might work perfectly on your machine and every other machine you test it on. The reason is that read(n) will usually return n bytes, but whether it does depends on the operating system's socket or file reading implementation, on the size of the cache in the network interface, on network latency, and on many other things. Please, just make sure to check the return value's length before going ahead and using it.
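Here is a minimal sketch of what such a check could look like when you really need exactly n bytes. Both `read_exactly` and `TrickleIO` are hypothetical names made up for this illustration; they are not part of Ruby or of any library mentioned here. `TrickleIO` exists only to simulate a source that hands back fewer bytes than requested.

```ruby
require 'stringio'

# Hypothetical helper: keep calling read until exactly n bytes have
# been collected. Raises EOFError if the stream ends before that.
def read_exactly(io, n)
  buffer = String.new # empty, mutable string
  while buffer.bytesize < n
    chunk = io.read(n - buffer.bytesize)
    # nil (or an empty string) means there is nothing left to read
    if chunk.nil? || chunk.empty?
      raise EOFError, "wanted #{n} bytes, got #{buffer.bytesize}"
    end
    buffer << chunk
  end
  buffer
end

# Hypothetical wrapper that simulates short reads by never returning
# more than max_per_read bytes from a single call.
class TrickleIO
  def initialize(data, max_per_read)
    @io = StringIO.new(data)
    @max = max_per_read
  end

  def read(n)
    @io.read([n, @max].min)
  end
end

io = TrickleIO.new("0123456789", 3)
first = io.read(8)           # a single read comes back short
rest  = read_exactly(io, 7)  # the helper loops until it has all 7 bytes
```

Raising on a premature EOF is one possible design choice; a caller that can live with a truncated result could return the partial buffer instead.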

Case in point: net/ldap has this exact problem. Look in net/ber.rb and you will see. There are also two - possibly three - bug reports (a couple of years old, too) that describe different failures caused by this.

One thing that makes this problem hard to find is that inserting debug statements affects the timing, so the code might actually work with the debug statements but not without them.

Oh, did I mention that StringIO#read works the same way? It has exactly the same guarantees as IO#read.
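A small illustration with StringIO, where a short read is easy to reproduce because the data simply runs out. This only shows the end-of-data case, not a short read in the middle of a stream, but the check is the same. Note that the result can also be nil, so calling a length method on it unchecked is a second bug waiting to happen. The variable names are just for illustration.

```ruby
require 'stringio'

io = StringIO.new("abc")

chunk = io.read(5)  # asks for 5 bytes, but only 3 are available
short = chunk.nil? || chunk.bytesize < 5

at_eof = io.read(5) # nothing left: read(n) returns nil here
```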

7 comments:

Barry Kelly said...

There's no good reason for expecting a full read anyhow, so IMHO those guys were always broken. With blocking reads across sockets, the most you can ever get with a single read is as much as the other guy sent - and it would be a far, far worse idea for the IO layer to block, waiting to fill the entire size that you requested.

If you must read exactly x bytes, you need to loop and progressively fill a buffer. This should be the same in all languages that access socket I/O, as it's just how the underlying transport works.

ich said...

Years ago I got bitten by a documented behaviour in IO#skip(n):
"The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0" - See http://java.sun.com/j2se/1.4.2/docs/api/java/io/FilterInputStream.html#skip(long)

which I did not expect;-)

Joshua Graham said...

Very well pointed out, and I'm sure we'll find lots of oddness like that.

Some of those Ruby libraries, particularly the net ones I've used, are pretty poor.

Now that they're getting industrial-strength use, there'll be a need to re-implement them, but I suppose that can wait for Ruby 2.

PS: I hope when I'm doing a skicka uppföljningskommentarer, it's not going to hurt me ;-)

Daniel Berger said...

It gets better. IO.read leaks handles if the read fails:

http://rubyforge.org/tracker/index.php?func=detail&aid=15065&group_id=426&atid=1698

Anonymous said...

quantum programming!

JM said...

I'm with Barry on this one - the behavior you're talking about is the generic behavior of a read() call. You're asking for up to N, not precisely N. This isn't a Ruby thing.

Sometimes you'll see another call provided that's something like keep_reading_until_eof_or_N_bytes(), but that's not the expected behavior of a read().

Ola Bini said...

Barry, James, I know you're right on this one. That MRI's IO#read(n) always maps to the underlying C library read(fd, n) is not a part of the contract of the method, but yeah, if you have done ANY IO in any language you shouldn't expect a full read anywhere. I have no problems with this behavior; my problem is that several library writers apparently have no idea about this, and as a result provide buggy libraries that "seem" to work on their machines.