Thursday, March 20, 2008

The contract of IO#read

It's interesting. After Charlie made an immense effort and rewrote our IO system, basically from scratch, I have started to find bugs. But these are generally not bugs in the IO code; they are bugs in Ruby libraries that depend on the way MRI usually works. One of the more annoying ones is IO#read(n), where n is the number of bytes you want to read.

This method is not guaranteed to return a string of length n, even if we haven't hit EOF yet. You can NEVER be sure that what you get back is the length you requested. Ever. If you have code that doesn't check the length of the returned string from read, you are almost guaranteed to have a bug just waiting to happen.

Of course, it might work perfectly on your machine and every other machine you test it on. The reason is that read(n) will usually return n bytes, but whether it does depends on the operating system's socket or file reading implementation, on the size of the cache in the network interface, on network latency, and on many other things. Please, just make sure to check the length of the return value before going ahead and using it.
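As a minimal sketch of what checking the length can look like, the helper below keeps calling read until it has collected exactly n bytes or the stream ends. The names read_exactly and TrickleIO are made up for this illustration and are not part of Ruby or of any library mentioned here. TrickleIO just wraps a StringIO and deliberately hands back at most a few bytes per call, to simulate a source that returns short reads.

```ruby
require 'stringio'

# Hypothetical helper (not a standard Ruby method): loop until exactly
# n bytes have been collected, or raise EOFError if the stream ends first.
def read_exactly(io, n)
  buffer = String.new
  while buffer.bytesize < n
    # Ask only for what is still missing; the call may return less.
    chunk = io.read(n - buffer.bytesize)
    if chunk.nil? || chunk.empty?
      raise EOFError, "wanted #{n} bytes, got only #{buffer.bytesize}"
    end
    buffer << chunk
  end
  buffer
end

# Hypothetical test double: never returns more than max_chunk bytes per
# read, mimicking a slow socket that delivers data in small pieces.
class TrickleIO
  def initialize(data, max_chunk)
    @io = StringIO.new(data)
    @max_chunk = max_chunk
  end

  def read(n)
    @io.read([n, @max_chunk].min)
  end
end

slow = TrickleIO.new("hello world", 3)
first = slow.read(8)                                       # a single call comes back short
whole = read_exactly(TrickleIO.new("hello world", 3), 8)   # the loop fills the request
```

Whether to raise, return a short string, or retry with a timeout when the stream ends early is a design choice that depends on the protocol; raising is just the simplest option for a sketch.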

Case in point: net/ldap has this exact problem. Look in net/ber.rb and you will see. There are also two, possibly three, bug reports (a couple of years old, too) that describe different failures caused by this.

One thing that makes this problem hard to find is that inserting debug statements affects the timing in such a way that the code might actually work with the debug statements but not without them.

Oh, did I mention that StringIO#read works the same way? It has exactly the same guarantees as IO#read.
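A small sketch of the easiest case to reproduce with StringIO: asking for more bytes than remain gives back a shorter string, and asking again at the end gives back nil. The variable names are only for illustration.

```ruby
require 'stringio'

io = StringIO.new("abc")

chunk = io.read(10)    # asks for 10 bytes, but only 3 are available
at_eof = io.read(10)   # nothing left to read

# Code that assumes chunk is 10 bytes long would misbehave here,
# so compare the sizes before using the data.
got_everything = !chunk.nil? && chunk.bytesize == 10
```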

8 comments:

Barry Kelly said...

There's no good reason for expecting a full read anyhow, so IMHO those guys were always broken. With blocking reads across sockets, the most you can ever get with a single read is as much as the other guy sent - and it would be a far, far worse idea for the IO layer to block, waiting to fill the entire size that you requested.

If you must read exactly x bytes, you need to loop and progressively fill a buffer. This should be the same in all languages that access socket I/O, as it's just how the underlying transport works.

thies said...

Years ago I got bitten by a documented behaviour in IO#skip(n):
"The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0" - See

which I did not expect;-)

Joshua Graham said...

Very well pointed out, and I'm sure we'll find lots of oddness like that.

Some of those Ruby libraries, particularly the net ones I've used, are pretty poor.

Now that they're getting industrial-strength use, there'll be a need to re-implement them, but I suppose that can wait for Ruby 2.

PS: I hope when I'm doing a skicka uppföljningskommentarer, it's not going to hurt me ;-)

Daniel Berger said...

It gets better. leaks handles if the read fails:

Anonymous said...

quantum programming!

James Moore said...

I'm with Barry on this one - the behavior you're talking about is the generic behavior of a read() call. You're asking for up to N, not precisely N. This isn't a Ruby thing.

Sometimes you'll see another call provided that's something like keep_reading_until_eof_or_N_bytes(), but that's not the expected behavior of a read().

Ola Bini said...

Barry, James, I know you're right on this one. That MRI's IO#read(n) always maps to the underlying C library read(fd, n) is not part of the method's contract, but yeah, if you have done ANY IO in any language you shouldn't expect a full read anywhere. I have no problem with this behavior. My problem is that several library writers apparently have no idea about it, and as a result provide buggy libraries that "seem" to work on their machines.
