翻訳:Masanori ITOH (marchan@computer.org) >\input texinfo @c -*-texinfo-*- >@setfilename hurd.texi > >@ifinfo >@format >START-INFO-DIR-ENTRY >* Hurd: (hurd). The interfaces of the GNU Hurd. >END-INFO-DIR-ENTRY >@end format >@end ifinfo > >@ifinfo >Copyright @copyright{} 1994 Free Software Foundation, Inc. > >Permission is granted to make and distribute verbatim copies of >this manual provided the copyright notice and this permission notice >are preserved on all copies. > >@ignore >Permission is granted to process this file through TeX and print the >results, provided the printed document carries a copying permission >notice identical to this one except for the removal of this paragraph >(this paragraph not being relevant to the printed manual). > >@end ignore > >Permission is granted to copy and distribute modified versions of this >manual under the conditions for verbatim copying, provided also that >the entire resulting derived work is distributed under the terms of a >permission notice identical to this one. > >Permission is granted to copy and distribute translations of this manual >into another language, under the above conditions for modified versions. >@end ifinfo > >@setchapternewpage odd >@settitle Hurd Interface Manual @settitle Hurd インタフェースマニュアル >@titlepage >@finalout >@title The GNU Hurd Interface Manual @title GNU Hurd インタフェースマニュアル >@author Michael I. Bushnell >@page > >@vskip 0pt plus 1filll >Copyright @copyright{} 1994 Free Software Foundation, Inc. > >Permission is granted to make and distribute verbatim copies of >this manual provided the copyright notice and this permission notice >are preserved on all copies. > >Permission is granted to copy and distribute modified versions of this >manual under the conditions for verbatim copying, provided also that >the entire resulting derived work is distributed under the terms of a >permission notice identical to this one. 
> >Permission is granted to copy and distribute translations of this manual >into another language, under the above conditions for modified versions. >@end titlepage > >@node Top @node Top >@top Introduction @top イントロダクション >This manual describes the interfaces that make up the GNU Hurd. It is >assumed that the reader is familiar with the features of the Mach >kernel, and with using the Hurd interfaces as a user, and all of the >associated C library calls. It concentrates on requirements and advice >for the writing of Hurd servers, as well as describing the libraries >that come with the GNU Hurd. このマニュアルは,GNU Hurd を構成するインタフェースについて説明する. 読者は Mach kernel と,ユーザとしてHurd インタフェースを使うこと, およびすべての関連する C 言語のライブラリ呼び出しに親しんでいることを 仮定している. このマニュアルは,GNU Hurd に提供されるライブラリについて記述すると ともに,Hurd のサーバを書く場合における要求と助言にテーマを絞る. >It is assumed that the reader of the manual is perusing the referenced >MiG interface definitions and library header files for the section being >examined. このマニュアルの読者は,参照された MiG インタフェースの定義および, 吟味されている部分に関するライブラリのヘッダファイルを 熟読していることを仮定している. >@menu @menu >* I/O interface:: The interface for reading and writing > I/O channels * I/O インタフェース:: I/O チャネルの読み書きについてのインタフェース > >@ignore @ignore >* Shared I/O:: The interface for doing input and output > using shared memory *共有 I/O:: 共有メモリを使う入出力についての インタフェース >@end ignore @end ignore > >* File interface:: The interface for modifying file-specific > characteristics * ファイルインタフェース:: ファイル特有の特徴を操作するインタフェース >* Filesystem interface:: Interfaces supported to control file-servers * ファイルシステムインタフェース:: ファイルサーバの制御のためにサポートされた インタフェース >* Socket interface:: Interfaces used for manipulating sockets * ソケットインタフェース:: ソケットを扱うためのインタフェース >* Ports library:: A library to manage port rights for servers * Port ライブラリ:: サーバ用?port の使用権 >* Iohelp library:: A library to implement some common parts > of the I/O and shared I/O interfaces. 
* Iohelp ライブラリ:: I/O インタフェースと共有I/Oインタフェースの 共通部分の一部を実装するライブラリ >* Fshelp library:: A library to implement some common parts > of the file interface. * Fshelp ライブラリ:: ファイルインタフェースの共通部分の一部を 実装するライブラリ >* Pager library:: A library to implement complex > multi-threaded pagers. * ページャライブラリ:: 複雑なマルチスレッドページャを実装する のライブラリ >* Diskfs library:: A library to do almost all the work of > implementing a disk-based filesystem. * Diskfs ライブラリ:: ディスク上のファイルシステムを実装する のに用られる,ほぼすべてのライブラリ >* Trivfs library:: A library to do the work of handling the > file protocol for directory-less > filesystems. * Trivfs ライブラリ:: ディレクトリなしのファイルシステム用の ファイルプロトコルを扱うためのライブラリ >* Mapped data:: Getting memory objects referring to the > data of an I/O object. * マップされたデータ:: I/Oオブジェクトのデータを参照するメモリ オブジェクトを得る >@node I/O interface @node I/O インタフェース >@chapter I/O interface @chapter I/O インタフェース >The I/O interface is used to interact with almost all servers in the GNU >Hurd. It provides facilities for reading and writing I/O streams. The >I/O interface facilities are described in and > The latter portion of and all of > describe how to implement shared-memory I/O operations, >and are described later. The present chapter discusses RPC-based I/O >operations. I/Oインタフェースは,GNU Hurd のほとんどすべてのサーバと相互作用するのに 用いられる.これは I/O ストリームを容易に読み書きするための機構を提供する. I/O インタフェース機構は に 記述されている. の後半部分と 全体は, 共有メモリ I/O 操作をいかに実装するかについて記述しており, これについては後述する. この章では RPCに基づいた I/O 操作について議論する. >(訳注) >facility は「機構」と訳出した. 
>@menu @menu >* I/O object ports:: How ports to I/O objects work * I/O オブジェクト ポート:: I/O オブジェクトに対するポートの動き >* Simple operations:: Read, write, and seek * 単純な操作:: Read, write, seek >* Open modes:: State bits that affect pieces of > operation *オープンモード:: ここの操作に影響を及ぼす状態ビット >* Asynchronous I/O:: How to get notified when I/O is possible * 非同期I/O:: I/O が可能になった時に通知を受ける方法 >* Information queries:: How to implement io_stat and > io_server_version * 情報の要求:: io_stat と io_server_version の実装法 >@end menu @end menu >@node I/O object ports @node I/O オブジェクトポート >@section I/O object ports @section I/O オブジェクトポート >The I/O server must associate each I/O port with a particular set of >uids and gids, identifying the user who is responsible for operations on >the port. Every port to an I/O server should also support either the >file protocol or the socket protocol; naked I/O ports are not allowed. I/O サーバは,各I/Oポート上の操作に責任を負うべきユーザを特定し, そのI/Oポートと特定の uid と gid の組を関連付けなければならない. すべてのポートは,I/O サーバに対してファイルプロトコルとソケットプロトコル 双方をサポートするべきであり,裸の(naked) I/O ポートは許されない. >(訳注) > a particular set of uids and gids の set は組と訳したが, >集合の方がいいかもしれない. >In addition, the server associates with each port a default file >pointer, a set of open mode bits, a pid (called the ``owner''), and some >underlying object which can absorb data (for write) or provide data (for >read). さらに,I/O サーバは各ポートにデフォルトファイルポインタと, オープンモードビットの集合と,pid ("owner" と呼ばれる)と,データを吸い込 むこと(write の場合)や,データを供給すること(read の場合)が可能な 何らかの基本的なオブジェクトを関連づける. >The uid and gid sets associated with a port may not be visibly shared >with other ports, nor may they ever change. The server must fix the >identification of a set of uids and gids with a particular port at the >moment of the port's creation. The other characteristics of an I/O port >may be shared with other users. 
The I/O server interface does not
>generally specify how servers may share these other
>characteristics (with the exception of the deprecated O_ASYNC
>interface); however, the file and socket interfaces make further
>requirements about which sharing is expected and which is prohibited.

あるポートに関連づけられた uid と gid の組は,明に他のポートと共有されては
ならないし,変わってもならない.サーバは,特定のポートの生成の瞬間に,
そのポートと uid と gid の集合の関連づけを確定しなければならない.
そのほかの I/O ポートの特徴は,他のユーザと共有されてもよい.
I/O サーバインタフェースは,各サーバの間で,そのほかの I/O ポートの
特徴が共有される手段を,一般に規定しない.
(deprecatedな O_ASYNC インタフェースを除く)
ただし,ファイルインタフェースとソケットインタフェースでは,さらに
何に対する共有が期待され,何が起こるのが禁止されるのかについて要求事項が
ある.

>In general, users get send-rights to I/O ports by some mechanism that is
>external to the I/O protocol.  (For example, file servers give out I/O
>ports in response to the dir_pathtrans and fsys_getroot calls.  Socket
>servers give out ports in response to the socket_create and
>socket_accept calls.)  However, the I/O protocol provides methods of
>obtaining new ports that refer to the same underlying object as another
>port.  In response to all of these calls, all underlying state
>(including, but not limited to, the default file pointer, open mode
>bits, and underlying object) must be shared between the old and new
>ports.  In the following descriptions of these calls, the term
>"identical" means this kind of sharing.  All these calls must return
>send-rights to a newly-constructed Mach port.

一般に,ユーザは I/O プロトコル以外のしくみから I/O ポートに対する send 権
を得る.(例えば,ファイルサーバは dir_pathtrans と fsys_getroot 呼び出し
に応じてI/O ポートを割り当てる.ソケットサーバはsocket_create と
socket_accept 呼び出しに応じてポートを割り当てる.)
しかしながら,I/O プロトコルは,他の I/O ポートと同じ基本的オブジェクト
を参照するような,新しいポートを獲得する手段を提供する.
これらすべての呼び出しに応じて,すべての基本的状態は新旧双方のポートで
共有されなければならない.(これらには,デフォルトファイルポインタ,
オープンモードビット,基本的オブジェクトを含むが,これらだけに
限られるわけではない)
以下の,これらの呼び出しに関する記述では,"同一" ("identical")という
言葉はこの種の共有を示す.
これらすべての呼び出しは,新しく生成された Mach ポートに対するsend 権を
返さなければならない.
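
The notion of "identical" ports defined above can be pictured with a small data-structure sketch in C. It is only an illustration under stated assumptions: the type names, the field layout, the reference count, and the function port_make_identical are all hypothetical and are not taken from the Hurd sources. Gid sets are omitted for brevity. The point shown is that the default file pointer, open mode bits, and underlying object live in one shared record, while each port keeps its own private, fixed id set.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 8

/* Hypothetical record holding the state that "identical" ports share. */
struct shared_state {
    long  file_pointer;   /* default file pointer */
    int   openmodes;      /* open mode bits */
    void *object;         /* underlying object */
    int   refs;           /* number of ports sharing this record (assumption) */
};

/* Hypothetical per-port record: the ids belong to this port alone and
   are fixed when the port is created. */
struct io_port {
    struct shared_state *state;
    int    uids[MAX_IDS];
    size_t nuids;
};

/* Create a new port "identical" to OLD: it points at the same shared
   state and gets a private copy of the uid set.
   Returns NULL if allocation fails. */
struct io_port *port_make_identical(const struct io_port *old)
{
    struct io_port *new_port = malloc(sizeof *new_port);
    if (new_port == NULL)
        return NULL;
    new_port->state = old->state;      /* shared, not copied */
    new_port->state->refs++;
    memcpy(new_port->uids, old->uids, sizeof old->uids);
    new_port->nuids = old->nuids;
    return new_port;
}
```

With this layout, a change to the file pointer made through one port is seen through every identical port, whereas the id sets can never be changed through another port.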
>The io_duplicate call simply returns another port which is identical >to an existing port and has the same uid and gid set. io_duplicate 呼び出しは,単純に既に存在するポートと"同一"で,同じ uid と gid の組を持つ,別のポートを返す. >The io_restrict_auth call returns another port, identical to the >provided port, but which has a smaller associated uid and gid set. The >uid and gid sets of the new port are the intersection of the set on the >existing port and the lists of uids and gids provided in the call. io_restruct_auth 呼び出しは,与えられたポートと"同一"な, ただし(与えられたポートのものよりも)小さな uid と gid の集合に関連づけ られた,別のポートを返す. この新しいポートの uid と gid の集合は,既に存在したポートの uid と gid の集合と,呼び出しに対して与えられた uid と gid の集合の共通部分に なっている. >(訳注) > 共通部分のところは,set に対して intersection というような,数学的な > 用語が使われているが,読みやすさを重視して普通の言葉遣いで訳出した. >Users use the io_reauthenticate call when they wish to have an entirely >new set of uids or gids associated with a port. In response to the >io_reauthenticate call, the server must create a new port, and then make >the call auth_server_authenticate to the auth server. The rendezvous >port for the auth_server_authenticate call is the I/O port to which was >made the io_reauthenticate call. The server provides rend_int parameter >to the auth server as a copy from the corresponding parameter in the >io_reauthenticate call. The I/O server also gives the auth server a new >port; this must be a newly created port identical to the old port. The >auth server will return the set of uids and gids associated with the >user, and guarantees that the new port will go directly to the user that >possessed the associated authentication port. The server then >identifies the new port given out with the specified id's. ユーザが,ポートに関連づけられたまったく新しい uid または gid の集合 を得たい場合には,io_reauthenticate 呼び出しを用いる. io_reauthenticate 呼び出しに応じて,I/Oサーバは新しいポートを生成し, 認証サーバに対して auth_server_authenticate 呼び出しを行わなければならない. この auth_server_authenticate 呼び出しに対して生成されたポートに対応する ポートは,io_reauthenticate 呼び出しされた I/O ポートである. 
>(訳注)
> rendezvous port のくだりは専門用語があるのかも知れないが,
> よく知らないので意訳してしまった.専門の学生さんの意見を聞きたい.
>
I/O サーバは,io_reauthenticate 呼び出しでの対応するパラメータのコピーとして
認証サーバに対して rend_int パラメータを渡す.
また,I/O サーバは新しいポートを認証サーバに渡す.これは古いポート
と"同一"な,新しく生成されたポートでなければならない.
認証サーバはユーザに関連づけられた uid と gid の組を返し,新しいポートが,
認証されたポートを所有するユーザに直接渡ることを保証する.
そして,(I/O)サーバは新しく割り当てられたポートを,指定された id に
関連づける.

>@node Simple operations
@node 単純な操作
>@section Simple operations
@section 単純な操作
>Users write to I/O ports by calling the io_write RPC.  They specify an
>offset parameter; if the object supports writing at arbitrary offsets,
>the server should honor this parameter.  If -1 is passed as the offset,
>then the server should use the default file pointer.  The server should
>return the amount of data which was successfully written.  If the
>operation was interrupted after some but not all of the data was
>written, then it is considered to have succeeded and the server should
>return the amount written.  If the port is not an I/O port at all, the
>server should reply with the error EOPNOTSUPP.  If the port is an I/O
>port, but does not happen to support writing, then the correct error is
>EBADF.

ユーザは io_write RPC を呼び出すことによって I/O ポートに対して書き込み
を行う.
そのオブジェクトが任意のオフセットへの書き込みをサポートしていれば,
ユーザはパラメータにオフセットを指定する.オフセットとして -1 が渡された
場合,サーバはデフォルトファイルポインタを用いなければならない.
サーバは書き込みに成功したデータの量を返す.
すべてのデータが書き込まれる前に操作が中断された場合,操作は成功した
と見なされ,サーバは書き込まれただけの量を返すべきである.
ポートが(いかなる意味でも) I/O ポートでない場合には,
サーバはエラーとして EOPNOTSUPP を返すべきである.
ポートが I/O ポートの一種ではあるが,書き込みをサポートしてない
場合には,正確なエラーは EBADF である.

>Users read from I/O ports by calling the io_read RPC.  They specify the
>amount of data they wish to read and the offset.  The offset has the
>same meaning as for io_write above.  The server should return the data
>read.  If the call is interrupted after same data has been read (and the
>operation is not idempotent) then the server should return the amount
>read, even if less than the amount requested.
The server should return >as much data as possible, but never more than requested by the user. If >there is no data, but there might be later, the call should block until >data becomes available. Indicate end-of-file conditions by returning >zero bytes. If the call is interrupted after some data has been read, >but the call is idempotent, then the server may return EINTR rather than >actually filling the buffer (taking care that any modifications of the >default file pointer have been reversed). Preferably, however, servers >should return data if possible. ユーザは io_read RPC を呼び出すことによって I/O ポートから読みだし を行う.ユーザは読みだしたいデータの量とオフセットを指定する. ここで,オフセットとは上の io_write と同じ意味である. サーバは読み出されたデータを返さなければならない. もし,呼び出しがいくらかのデータが読み出された後に中断された場合, サーバは読み出したデータの量を返さなければならない. >(訳注) > このパラグラフ4行目の same は some の typo だと信じる. > また,idempotent のところは意味がよくわからない. サーバは,可能なかぎりの量のデータを返さなければならないが, ユーザが要求した量を越えることはない. データが存在しなかったが,後に存在するであろう場合, その呼び出しはデータが読みだし可能になるまでブロックされるべきである. ファイルの終りは0バイトと返すことによって示される. もし,いくらかのデータが読み出されてはいるが呼び出しが中断された場合(?????), (デフォルトファイルポインタへのいかなる変更も元に戻されることを考慮すると) サーバは実際にバッファを満たすよりは,EINTR を返してもよい. >(訳注) > 上と同様に,idempotent のところは意味がよくわからない. > 文脈的には手続き呼び出しの中断と読めるので,そう訳してある. しかし,むしろサーバは可能であるかぎりデータを返すべきである. >There are two categories of objects: seekable and non-seekable. >Seekable objects must accept arbitrary offset parameters in the io_read >and io_write calls, and to implement the io_seek call. Nonseekable >objects must ignore the offset parameters to io_read and io_write, and >should return ESPIPE to the io_seek call. オブジェクトには2つの分類がある.seek 可能なものと seek 不可能な ものである. seek 可能なオブジェクトは,io_read と io_write 呼び出しのパラメタにおいて, 任意のオフセットを許可し,io_seek 呼び出しを実装しなければなければならない. seek 不可能なオブジェクトは io_read と io_write に対するオフセット パラメータを無視しなければならず,io_seek 呼び出しに対して ESPIPE を 返さなければならない. >(訳注) >訳者が仕事で少しかかわっているOSでは seek 相当のことを「位置づけ」と >言うことが多いのだが,UNIX流に seek とそのまま訳出した. >On seekable objects, io_seek changes the default file pointer for reads >and writes. 
(See the C library manual for the interpretation of the
>WHENCE and OFFSET arguments.)  It returns the new offset as modified by
>io_seek.

seek 可能なオブジェクトでは,io_seek は読み出しと書き込みのための
デフォルトファイルポインタを変更する.
(C 言語ライブラリのマニュアルの,WHENCE と OFFSETの解釈について
参照すること)

>The io_readable interface returns the amount of data which can be
>immediately read.  For the special technical meaning of "immediately",
>see the description of asynchronous I/O.  (*Note: Asynchronous I/O.)

io_readable インタフェースは,即時(immediately)読み出し可能なデータの
量を返す.
即時(immediately)という言葉の特殊な技術的意味については,非同期 I/O の
記述を参照すること.
(*Note: 非同期 I/O)

>@node Open modes
@node オープンモード
>@section Open modes
@section オープンモード
>The server associates each port with a set of bits that affect its
>operation.  The io_set_all_openmodes call modifies these bits and the
>io_get_openmodes call returns them.  In addition, the
>io_set_some_openmodes and io_clear_some_openmodes calls do an atomic
>read/modify/write of the openmodes.

サーバは各ポートを,それぞれに対する操作に影響を及ぼすビットの集合と
関連づける.io_set_all_openmodes 呼び出しはこれらのビットを変更し,
io_get_openmodes 呼び出しはこれらを返す.加えて,io_set_some_openmodes
呼び出しと io_clear_some_openmodes 呼び出しは,オープンモードに対する
read/modify/write をアトミックに行う.

>The O_APPEND bit, when set, changes the behavior of io_write when it
>uses the default file pointer on seekable objects.  When io_write is
>done on a port with the O_APPEND bit set, it must set the file pointer to
>one more than the "maximum correct value" (described below) before doing
>the write (which would then increment the file pointer as usual).  The
>server must atomically bind this update to the actual data write with
>respect to other users of io_read, io_write, and io_seek.

O_APPEND ビットがセットされた時,seek 可能なオブジェクトのデフォルト
ファイルポインタを使う場合に io_write の振るまいが変わる.
O_APPEND ビットがセットされたポートに対して io_write が行われる時,
書き込み(通常どおり,これによりファイルポインタが増加する)が実行される前に,
ファイルポインタが(以下に記述する) "maximum correct value" より 1 大きな
値に設定される.
>A "correct value" for the file pointer is one which, when provided to
>io_read, will successfully return at least one byte of data and not
>end-of-file.
>The "maximum correct value" referred to in the description of O_APPEND
>is the maximum such correct value.  (For ordinary files [see the
>description of the file protocol for more information] this is the same
>as the current file size.)

ファイルポインタに対する "correct value" とは,io_read に対して与えられ
た場合には,EOF ではなく少なくとも1バイトを返すのに成功するような
値である.
O_APPEND ビットの記述で触れた "maximum correct value" とは,
上の意味を満たすような最大の値である.
(通常のファイルに対しては現在のファイルサイズと同じである.[さらに詳細
についてはファイルプロトコルの記述を参照のこと])

>The O_FSYNC bit, when set, causes io_write not to delay writing data to
>underlying media in any fashion.

O_FSYNC ビットがセットされた時,io_write の振るまいに影響し,
いかなる意味でも下位の媒体に対するデータの書き込みが遅延しない.

>The O_NONBLOCK bit, when set, prevents read and write from blocking.
>They should copy such data as is immediately available.  If no data is
>immediately available they should return EWOULDBLOCK.

O_NONBLOCK ビットがセットされた時,読み込みと書き込みはブロックされない.
読み込み操作と書き込み操作は,即時操作可能なデータをコピーする.
即時操作可能なデータが無い場合には EWOULDBLOCK を返さなければならない.

>The definition of "immediate" is more or less server dependent.  Some
>servers (disk-based file servers, most notably) regard all data as
>immediately available.  The one criterion is that something which must
>happen immediately may not wait for any user-synchronizable event.

"即時" という言葉の定義は,多かれ少なかれサーバに依存する.
あるサーバ(最も典型的には,ディスク上のファイルのサーバ)は,
全てのデータを即時操作可能なものとみなす.
ひとつの基準は,即時実行されなければならないものは,ユーザが
同期制御可能なイベントを待つべきではないということである.

>(訳注)
> ちょっと自信なし...

>The O_ASYNC bit is deprecated; its use is documented in the following
>section.  This bit must be shared between all users of the same
>underlying object.

O_ASYNC ビットは推奨されない.
このビットの用途は後節に記述されている.このビットは,同一の下位オブジェ
クトのすべてのユーザの間で共有されなければならない.

>(訳注)
> deprecated ってこれでいいんでしたっけ?
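
The interaction between the offset parameter of io_write, the default file pointer, and the O_APPEND bit can be sketched with a toy in-memory object in C. This is a minimal sketch and not the Hurd implementation: the names sk_object, sk_port, sk_write, and the bit value SK_O_APPEND are hypothetical. The sketch assumes the append position is the current size of the object, which is what the text above gives for ordinary files. It runs single-threaded, so the atomicity a real server must provide is only noted in a comment.

```c
#include <stddef.h>
#include <string.h>

#define OBJ_CAPACITY 64
#define SK_O_APPEND  0x1   /* hypothetical bit value, not the real O_APPEND */

struct sk_object { char data[OBJ_CAPACITY]; size_t size; };
struct sk_port   { struct sk_object *obj; long file_pointer; int openmodes; };

/* Write LEN bytes at OFFSET, or at the default file pointer when
   OFFSET is -1.  Returns the number of bytes actually written. */
size_t sk_write(struct sk_port *p, const char *buf, size_t len, long offset)
{
    int use_default = (offset == -1);
    size_t pos;

    if (use_default) {
        /* Append mode: move the file pointer to the end first.  A real
           server must make this step and the write one atomic unit. */
        if (p->openmodes & SK_O_APPEND)
            p->file_pointer = (long) p->obj->size;
        pos = (size_t) p->file_pointer;
    } else {
        pos = (size_t) offset;
    }

    if (pos >= OBJ_CAPACITY)
        return 0;
    if (len > OBJ_CAPACITY - pos)
        len = OBJ_CAPACITY - pos;      /* short write: report what fit */

    memcpy(p->obj->data + pos, buf, len);
    if (pos + len > p->obj->size)
        p->obj->size = pos + len;
    if (use_default)
        p->file_pointer = (long) (pos + len);  /* explicit offsets leave it alone */
    return len;
}
```

Note that a write at an explicit offset does not consult O_APPEND and does not move the default file pointer; only writes that pass -1 do.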
>@node Asynchronous I/O
@node 非同期 I/O
>@section Asynchronous I/O
@section 非同期 I/O
>Users may wish to be notified when I/O can be done without blocking;
>they use the io_async call to indicate this to the server.  In the
>io_async call the user provides a port on which the server should
>send sig_post messages as I/O becomes possible.  The server must return
>a port which will be the reference port in the sig_post messages.  Each
>io_async call should generate a new reference port.  (See the C library
>manual for information on how to send sig_post messages.)
>
>The server then sends one SIGIO signal to each registered async user
>every time I/O becomes possible.  I/O is possible if at least one byte
>can be read or written immediately.  (The definition of ``immediately''
>must be the same as for the implementation of the O_NONBLOCK flag.)  In
>addition, every time a user calls io_read or io_write on a non-seekable
>object, or at the default file pointer on a seekable object, another
>signal should be sent to each user if I/O is still possible.
>
>Some objects may also define "urgent" conditions.  Such servers should
>send the SIGURG signal to each registered async user anytime an urgent
>condition appears.  After any RPC that has the possibility of clearing
>the urgent condition, the server should again send the signal to all
>registered users if the urgent condition is still present.
>
>A more fine-grained mechanism for doing async I/O is the io_select call.
>The user specifies the kind of access desired, and a send-once right.
>If I/O of the kind the user desires is immediately possible, then the
>server should return an indication of this and destroy the send-once
>right.  If I/O is not immediately possible, the server should save the
>send-once right, and send a select_done message as soon as I/O becomes
>immediately possible.  (Again, the definition of ``immediate'' must be
>the same for io_select, io_async, and O_NONBLOCK.)
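
The io_select behaviour described above (answer at once when I/O is possible, otherwise keep the reply right and use it exactly once later) can be sketched in C as follows. This is an illustration only. A plain callback stands in for the Mach send-once right and for the select_done message, and every name in the block (sel_object, sel_select_read, sel_data_arrived, sel_count_reply) is hypothetical. Only read access is modelled.

```c
#include <stddef.h>

#define MAX_WAITERS 4

/* Hypothetical stand-in for a send-once right: a callback that the
   server invokes at most once. */
typedef void (*select_reply_fn)(void *cookie);

struct waiter { select_reply_fn reply; void *cookie; };

struct sel_object {
    size_t        readable;               /* bytes readable immediately */
    struct waiter waiters[MAX_WAITERS];   /* saved reply rights */
    size_t        nwaiters;
};

/* Handle a select-for-read request.  Returns 1 if reading is possible
   now (the reply right is dropped unused), 0 if the request was saved
   for later, and -1 if this toy queue is full. */
int sel_select_read(struct sel_object *o, select_reply_fn reply, void *cookie)
{
    if (o->readable > 0)
        return 1;
    if (o->nwaiters == MAX_WAITERS)
        return -1;
    o->waiters[o->nwaiters].reply = reply;
    o->waiters[o->nwaiters].cookie = cookie;
    o->nwaiters++;
    return 0;
}

/* Called when N bytes arrive: notify every saved waiter once, then
   forget them all. */
void sel_data_arrived(struct sel_object *o, size_t n)
{
    if (n == 0)
        return;
    o->readable += n;
    for (size_t i = 0; i < o->nwaiters; i++)
        o->waiters[i].reply(o->waiters[i].cookie);
    o->nwaiters = 0;
}

/* Example reply handler: counts how many notifications were delivered. */
void sel_count_reply(void *cookie)
{
    ++*(int *) cookie;
}
```

A caller that gets 0 back simply waits for its callback; a caller that gets 1 back may read immediately and will never be called back for that request.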
>
>For compatibility, the I/O interface provides a deprecated feature
>(known as icky async I/O).  The calls io_mod_owner and io_get_owner set
>and retrieve the ``owner'' of the object, which is either a pid or a
>pgrp (if the value is negative).  Whenever the I/O server is sending
>sig_post messages to all the io_async users, if the O_ASYNC bit is set,
>the server should also send a signal to the owning pid/pgrp.  The ID
>port for this call should be different from all the io_async id ports
>given to users.
>Users may find out what ID port the server uses for this by calling
>io_get_icky_async_id.
>
>@node Information queries
>@section Information queries
>
>Users may call io_stat to find out information about the I/O object.
>Most of the fields of a struct stat are meaningful only for files.  All
>objects, however, must support the fields st_fstype, st_fsid, st_ino,
>st_atime, st_atime_usec, st_mtime, st_mtime_usec, st_ctime,
>st_ctime_usec, and st_blksize.
>
>st_fstype, st_fsid, and st_ino must be unique for the underlying object
>across the entire system.
>
>st_atime and st_atime_usec hold the seconds and microseconds,
>respectively, of the system clock at the last time the object was
>read with io_read.
>
>st_mtime and st_mtime_usec hold the seconds and microseconds,
>respectively, of the system clock at the last time the object was
>written with io_write.
>
>Other appropriate operations may update the atime and the mtime as well;
>both the file and socket interfaces specify such operations.
>
>st_ctime and st_ctime_usec hold the seconds and microseconds,
>respectively, of the system clock at the last time permanent meta-data
>associated with the object was changed.  The exact operations which
>cause such an update are server-dependent, but must include the creation
>of the object.
>
>The server is permitted to delay the actual update of these times until
>stat is called; before the server stores the times on permanent media
>(if it ever does so) it should update them if necessary.
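
One way to use the permission to delay timestamp updates is to record only that a time is pending and to settle it when io_stat is answered or before the times are stored on permanent media. The C sketch below illustrates that idea under stated assumptions: the names ts_node, ts_note_read, ts_note_write, and ts_settle are hypothetical, microseconds and the ctime are left out, and the current time is passed in as a parameter so the example stays deterministic.

```c
#include <time.h>

/* Hypothetical per-object timestamp record. */
struct ts_node {
    time_t atime, mtime;
    int    atime_pending, mtime_pending;  /* set when an update is owed */
};

/* Called from the read path: remember that the atime needs updating. */
void ts_note_read(struct ts_node *n)  { n->atime_pending = 1; }

/* Called from the write path: remember that the mtime needs updating. */
void ts_note_write(struct ts_node *n) { n->mtime_pending = 1; }

/* Called before answering io_stat or before writing the times to
   permanent media: apply every pending update using NOW. */
void ts_settle(struct ts_node *n, time_t now)
{
    if (n->atime_pending) { n->atime = now; n->atime_pending = 0; }
    if (n->mtime_pending) { n->mtime = now; n->mtime_pending = 0; }
}
```

The trade-off is that the reported time is the moment of settling rather than the moment of the read or write, which is the latitude the paragraph above grants.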
> >st_blksize gives the optimal I/O size in bytes for io_read and io_write; >users should endeavor to read and write amounts which are multiples of >the optimal size, and to use offsets which are multiples of the optimal >size > >In addition, objects which are seekable should set st_size to the >"maximum correct value" described above in the description of the >O_APPEND flag. > >The st_uid and st_gid fields are unrelated to the ``owner'' as described >above for icky async I/O. > >Users may find out the version of the server they are talking to by >calling io_server_version; this should return strings and integers >describing the version number of the server, as well as its name. > >@node Mapped data >@section Mapped data > >Servers may optionally implement the io_map call. The ports returned by >io_map must implement the XP kernel interface and be suitable as >arguments to vm_map. > >Seekable objects must allow access from 0 to the "maximum correct value" >described for O_APPEND. Whether they provide access beyond such a point >is server dependent; in addition, the meaning of such an object for a >non-seekable object is server dependent. > >@ignore >However, servers which >implement the facilities of the next section must obey to certain >requirements about which addresses in the memory objects provided by >io_map must be valid. Simply put, any user following the rules >described in the next chapter should not get any memory faults except as >explicitly permitted by the next chapter. >@end ignore > >@ignore >@node Shared I/O >@chapter Shared I/O > >I/O servers may, optionally, provide the services described in this >chapter in addition to the generic services described in the previous >chapter. These facilities allow users to read and write I/O objects >without making RPC's to the server in most circumstances. > >@menu >* Rules:: The rules users must obey in using > shared I/O. 
>* Examples:: Examples of the way different types > of servers could implement shared I/O. >@end menu > >@node Rules >@section Rules > >Any server implementing the facilities of this chapter must also support >the io_map call as described in the previous chapter. > >Users of the shared I/O facilities must call io_map_cntl; this will >return a memory object, called the shared page object. One page of this >object should be mapped from offset zero into the user's address space. >At the front of this page is a struct shared_io as described in >. Frequent reference will be made to the members of this >structure in this chapter, without further qualification. The shared >page past the struct shared_io may be used by users as they wish. > >Users should examine the shared_page_magic field; from it they can >discover the byte ordering used by the server. Users should not blindly >assume that the server uses the same byte ordering as they. > >Only one shared user can be active on a given port at a time. If a user >calls io_map_cntl on a port which already has an active shared user, the >server should return EBUSY, at which point the user should call >io_duplicate to obtain a new port, and call io_map_cntl there. > >@menu >* Conch:: How access to the shared page is mediated >* Access rules:: Where in the io_map memory objects users > may peek and poke >* Behavior modification:: Modifications of behavior >* Status notifications:: Calls users should make at certain > times to keep the server abreast of the > current state of the object >* Violations:: When the rules are broken > >@end menu > >@node Conch >@subsection Conch > >Access to the shared page is mediated through a facility known as the >``conch''. The ``lock'' field of the shared page protects the >conch_status field; users and the server must acquire this lock with >spin_lock before they may modify or examine conch_status. 
> >If the conch_status field is USER_HAS_CONCH or USER_RELEASE_CONCH, then >the user has the conch, and may access the shared page after releasing >the spin lock. If the conch_status field is USER_COULD_HAVE_CONCH, then >the user may immediately set conch_status to USER_HAS_CONCH, and proceed >to access the shared page after releasing the spin lock. If the conch >status is USER_HAS_NOT_CONCH, then the user should release the spin >lock, and call io_get_conch. Upon return from io_get_conch, the user >should reacquire the spin lock and check conch_status again. > >When the user is through accessing the shared page, the user should >acquire the spin lock and examine the conch_status field. If it has >been set to USER_RELEASE_CONCH, then the user should release the spin >lock and call io_release_conch. Otherwise, the user should change >conch_status from USER_HAS_CONCH to USER_COULD_HAVE_CONCH and then >release the spin lock. > >The implementation of io_read and io_write must not modify the object >data or the default file pointer except when the server is holding the >conch; users who wish to be atomic with respect to those functions >should be similarly reticent. > >The server must guarantee that at most one user of an underlying object >has the conch at a time; the server may only have the conch if no user >does. The server may not modify conch_status or the shared page if the >status is USER_HAS_CONCH except to set it to USER_RELEASE_CONCH, thus >requesting a call to io_release_conch. > >The server is permitted to modify any characteristics of the shared page >anytime the conch_status is not USER_HAS_CONCH or USER_RELEASE_CONCH; >users may not assume that the shared page has not changed even when only >upgrading USER_COULD_HAVE_CONCH to USER_HAS_CONCH. > >@node Access rules >@subsection Access rules > >The conch fields file_size, read_size, and prenotify_size affect which >areas of the data objects may be accessed. 
In addition, for >non-seekable objects, the file pointers rd_file_pointer, >wr_file_pointer, and xx_file_pointer affect which areas may be accessed. > >For seekable objects, the user may read the read object from offset 0 >through the minimum of file_size and read_size. > >For seekable objects, the user may write the write object from offset 0 >through the prenotify_size. > >For nonseekable objects, the user may read the read object from >rd_file_pointer through the minimum of file_size and read_size. > >For nonseekable objects, the user may write the write object from >wr_file_pointer through prenotify_size. > >The server may permit access outside these regions, but need not >preserve data for any length of time if so written. If the server >wishes to deny such access, it issue faults with EIO. Servers may also >issue faults on modifications of the write object for reasons such as >EDQUOT and ENOSPC, as well as reporting hardware errors with EIO. >Servers may only fault valid addresses in the read object in the event >of hardware failure, giving EIO. > >Users should ignore the foo field if the value use_foo is clear in the >shared page; this may result in there being no maximum valid address for >a particular access. In that case, the user may access the object to >the end of its virtual address space. > >If use_file_size is set, the user may increase the file_size, but may >not decrease it, to indicate the new "maximum correct value" as >described for O_APPEND. Normally when users write beyond the current >file_size they should extend it at least to the end of the write. > >The xx_file_pointer for seekable objects must be the same as the default >file pointer used by io_read and io_write. > >If use_read_size is set and the user wishes to read past read_size, she >may call io_readsleep, which must return as soon as read_size is >increased. 
The server should set read_block_reason anytime
>use_read_size is set; if read_block_reason is RBR_BUFFER_FULL, then the
>server is indicating that the read_size might never be increased until
>the rd_file_pointer is sufficiently increased.
>
>If the server has set use_prenotify_size and the user wishes to write
>past prenotify_size, she may call io_prenotify, specifying the maximum
>offset the user intends to write.  The server should return after
>increasing prenotify_size, but is not obligated to extend it as far as
>the user wishes.  In addition, io_prenotify may return errors such as
>ENOSPC, indicating that the prenotify_size cannot be increased.
>
>Users of seekable objects may modify the xx_file_pointer at will
>(including pointing past read_size, file_size, or prenotify_size).
>Users of non-seekable objects, however, may only increase the
>rd_file_pointer and wr_file_pointer.  In addition, they may not modify
>them to point past the valid data as described above.  Failing to
>advance them at all may prevent the read_size or prenotify_size from
>being increased.
>
>If the server sets eof_notify, then the user may attempt to have the
>file_size increased by calling io_eofnotify after "noticing" the
>current file size limit.  io_eofnotify must return immediately, but need
>not actually increase the file_size or clear user_file_size.  (However,
>if it is impossible for io_eofnotify to ever do anything, then the
>server should not bother setting eof_notify.)
>
>@node Status notifications
>@subsection Status notifications
>
>The flag do_sigio requests the user to call io_sigio every time she
>changes the file pointers or the file_size.
>
>If the server sets use_postnotify_size, then the user should call
>io_postnotify after writing data that extends past postnotify_size.  The
>server may buffer writes internally beyond postnotify_size for
>arbitrarily long periods until io_postnotify is called, regardless of
>the setting of the O_FSYNC bit.
> >After modifying or reading the object contents, the user should set the >written or accessed fields respectively. (Users who fail to set these >fields will not thereby defeat the mtime/atime mechanism.) > >If the flag use_eof is set, then users should call io_eofnotify after >reading up to the file_size and noticing it. > >@node Behavior modification >@subsection Behavior modification > >The server flag append_mode is a copy of the O_APPEND open mode bit; if >it is set, then the user should do writes at file_size and set the file >pointer appropriately (this applies only if the user would be writing at >the file pointer in the first place). > >Servers should implement the flag O_FSYNC by using the postnotify_size >field. > >Servers should implement the io_async and O_ASYNC notifications by using >the do_sigio field. > >@node Violations >@subsection Violations > >Users who hold the conch for too long while conch_status is set to >USER_RELEASE_CONCH may have the conch stolen from them and their >conch_status unilaterally downgraded to USER_HAS_NOT_CONCH by the >server. Users who hold the spin lock for too long (where this ``too >long'' is much much shorter than the previous one) may have the spin >lock stolen from them by the server. > >Users who read or write outside the valid regions described above may >get memory faults and may not expect data written to be saved in any >fashion. > >Users who write the read object (when it is different from the write >object) may or may not get faults; they may not expect such data to be >saved in any fashion. > >Users who fail to call io_postnotify may cause data to be buffered for >arbitrarily long periods. > >Users who reduce rd_file_pointer, wr_file_pointer, or file_size will >have such modifications ignored. > >Users may not call any server functions (whether in the I/O protocol or >another) while holding the conch except for those specified in this >chapter. Such calls may block indefinitely or fail silently. 
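
The user's side of the conch rules given earlier in this chapter amounts to a small state machine, sketched below in C. The four status names come from the text; their numeric values, the struct conch_page type, and the function names are assumptions made for the example. The real shared page also carries a spin lock that must be held around each of these steps, which the single-threaded sketch only marks with comments.

```c
/* Status names as used in the text; the numeric values are arbitrary. */
enum conch_status {
    USER_HAS_NOT_CONCH,
    USER_COULD_HAVE_CONCH,
    USER_HAS_CONCH,
    USER_RELEASE_CONCH
};

/* Hypothetical fragment of the shared page (the lock is omitted). */
struct conch_page { enum conch_status conch_status; };

/* Try to take the conch.  Returns 1 if the user now holds it, or 0 if
   the user must call io_get_conch and then try again. */
int conch_try_acquire(struct conch_page *pg)
{
    /* spin lock would be acquired here */
    switch (pg->conch_status) {
    case USER_HAS_CONCH:
    case USER_RELEASE_CONCH:
        return 1;                               /* already held */
    case USER_COULD_HAVE_CONCH:
        pg->conch_status = USER_HAS_CONCH;      /* take it locally */
        return 1;
    default:
        return 0;                               /* USER_HAS_NOT_CONCH */
    }
    /* spin lock would be released before returning */
}

/* Finish with the shared page.  Returns 1 if the user must call
   io_release_conch, or 0 if the conch was handed back locally. */
int conch_release(struct conch_page *pg)
{
    if (pg->conch_status == USER_RELEASE_CONCH)
        return 1;                               /* server asked for it */
    if (pg->conch_status == USER_HAS_CONCH)
        pg->conch_status = USER_COULD_HAVE_CONCH;
    return 0;
}
```

In the common case neither function needs an RPC, which is the purpose of the USER_COULD_HAVE_CONCH state.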
>@end ignore
>
>@node File interface
>@chapter File interface
>
>This chapter documents the interface for operating on files.
>
>@menu
>* File Overview::            Basic concepts for the file interface
>* Changing Status::          Changing the owner (etc.) of a file
>* Program execution::        Executing files
>* File locking::             Implementing the flock call
>* File frobbing::            Other active calls on files
>* Opening files::            Looking up files in directories
>* Modifying directories::    Creating and deleting nodes
>* Notifications::            File and directory change callbacks
>* Translators::              How to set and get translators
>@end menu
>
>@node File Overview
>@section File Overview
>
>The file interface is a superset of the I/O interface (@pxref{I/O
>interface}). Servers which provide the file interface are required to
>support the I/O interface as well. All objects reachable in the
>filesystem are expected to provide the file interface, even if they do
>not contain data. (The trivfs library makes it easy to do so for
>ordinary sorts of cases. @xref{Trivfs library}.)
>
>The interface definitions for the file interface are found in
>.
>
>Files have various pieces of status information which are returned by
>io_stat (@pxref{Information queries}). Most of this status information
>can be directly changed by various calls in the file interface; some of
>it should vary implicitly as the contents of the file change.
>
>Many of these calls have general rules associated with them describing
>how security and privilege should operate. The diskfs library
>(@pxref{Diskfs library}) implements these rules for disk-based
>filesystems. (Trivfs-based servers generally have no need to implement
>these rules at all.) We hope to move the implementation of these rules
>to the fshelp library.
>
>In special cases, there may be a reason to implement a different
>security check from that specified here, or to implement a call to do
>something slightly different.
But such cases must be carefully
>considered; make sure that you will not confuse blameless user programs
>through excessive cleverness.
>
>If some operation cannot be implemented (for example, chauthor over
>ftp), then the call should return EOPNOTSUPP. If it is merely difficult
>to implement a call, it is much better to figure out a way to implement
>it as a series of operations rather than returning errors to the user.
>
>@node Changing Status
>@section Changing Status
>
>There are several RPCs available for users to change much of the status
>information associated with a file. (The information is returned by the
>io_stat RPC; @ref{Information queries}.)
>
>All these operations are restricted to root and the owner of the file.
>When attempted by another user, they should return EPERM.
>
>The file_chown RPC changes the owner and group of the file. Only root
>should be able to change the owner, and changing the group to a group
>the caller is not in should also be prohibited. Violating either of
>these conditions should return EPERM.
>
>The file_chauthor RPC changes the author of the file. It should be
>legitimate to change the author to any value without restriction.
>
>The file_chflags RPC changes the flags of the file. It should be
>legitimate to change the flags to any value without restriction. No
>standard meanings have been assigned to the flags yet, but we intend to
>do so. Do not assume that the flags format we choose will map
>identically to that for some existing filesystem format.
>
>The file_utimes RPC changes the atime and mtime of the file. Making
>this call must cause the ctime to be updated as well, even if no actual
>change to either the mtime or the atime occurs.
>
>The file_set_size RPC is special; not only does it change the status
>word specifying the size of the file, but it also changes the actual
>contents of the file.
If the file size is being reduced, the server should
>release secondary storage associated with the previous contents of the
>file. If the file is being extended, the new region added to the file
>must be zero-filled. Unlike the other RPCs in this section,
>file_set_size should be permitted to any user who is allowed to write
>the file.
>
>@node Program execution
>@section Program execution
>
>Execution of programs on the Hurd is done through file servers with the
>file_exec RPC. The file server is expected to verify that the user is
>allowed to execute the file, make whatever modifications to the ports
>are necessary for setuid execution, and then invoke the standard
>execserver found on /servers/exec.
>
>This section specifically addresses what file servers are expected to
>do, with minimal attention to the other parts of the process.
>
>The file must be opened for execution; if it is not, EBADF should be
>returned. In addition, at least one of the execute bits must be on. A
>failure of this check should result in EACCES, not ENOEXEC. It is never
>proper for the file server to respond ENOEXEC in response to the
>file_exec RPC.
>
>If either the setuid or setgid bits are set, the server needs to
>construct a new authentication handle with the additional new IDs.
>Then all the ports passed to file_exec need to be reauthenticated with
>the new handle. If the file server is unable to make the new
>authentication handle (for example, because it is not running as root)
>it is not acceptable to return an error; in such a case the server
>should simply silently fail to implement the setuid/setgid semantics.
>
>If the setuid/setgid transformation adds a new uid or gid to the user's
>authentication handle that was not previously present (as opposed to
>merely reordering them) then the EXEC_SECURE and EXEC_NEWTASK flags
>should both be added in the call to exec_exec.
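
The checks described above might be sketched in plain C as follows. This
is an illustration only, not code from the Hurd: the struct, the function
names and the SKETCH_ constants are hypothetical stand-ins (the real flag
values and server data structures are not shown here), and the sketch
assumes that a setuid or setgid file contributes its owner or group ID.

```c
#define _POSIX_C_SOURCE 200809L  /* for mode_t, uid_t, S_ISUID, S_IXUSR */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-ins, defined here only for this sketch.  */
#define SKETCH_O_EXEC       0x4  /* "opened for execution" bit */
#define SKETCH_EXEC_NEWTASK 0x1  /* stands in for EXEC_NEWTASK */
#define SKETCH_EXEC_SECURE  0x2  /* stands in for EXEC_SECURE */

/* Hypothetical view of the open file a user passed to file_exec.  */
struct sketch_file
{
  int open_flags;  /* how the user's port was opened */
  mode_t mode;     /* permission bits plus setuid/setgid bits */
  uid_t owner;     /* assumed to be the ID a setuid bit adds */
  gid_t group;     /* assumed to be the ID a setgid bit adds */
};

/* Return 0 if execution may proceed, EBADF if the file was not opened
   for execution, or EACCES if no execute bit is on.  ENOEXEC is never
   returned.  */
int
sketch_exec_check (const struct sketch_file *f)
{
  assert (f != NULL);
  if (!(f->open_flags & SKETCH_O_EXEC))
    return EBADF;
  if (!(f->mode & (S_IXUSR | S_IXGRP | S_IXOTH)))
    return EACCES;
  return 0;
}

/* Return the extra flags for the exec_exec call: both flags if a setuid
   or setgid bit would add an ID the user does not already hold, and 0
   if no ID is added or the IDs are merely already present.  */
int
sketch_exec_flags (const struct sketch_file *f,
                   const uid_t *uids, size_t nuids,
                   const gid_t *gids, size_t ngids)
{
  int adds_new_id = 0;
  size_t i;

  assert (f != NULL);
  if (f->mode & S_ISUID)
    {
      int held = 0;
      for (i = 0; i < nuids; i++)
        if (uids[i] == f->owner)
          held = 1;
      if (!held)
        adds_new_id = 1;
    }
  if (f->mode & S_ISGID)
    {
      int held = 0;
      for (i = 0; i < ngids; i++)
        if (gids[i] == f->group)
          held = 1;
      if (!held)
        adds_new_id = 1;
    }
  return adds_new_id ? (SKETCH_EXEC_SECURE | SKETCH_EXEC_NEWTASK) : 0;
}
```

A server following this sketch would call sketch_exec_check first and
return its result on failure, then OR the result of sketch_exec_flags
into the flags it passes on. The sketch deliberately leaves out building
the new authentication handle and reauthenticating the ports.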
>
>The server then needs to open a new port onto the executed file which
>will not share any file pointers with the port the user passed in, opened
>with O_READ. Finally, all the information (mutated appropriately for
>setuid/setgid) should be sent to the execserver with exec_exec.
>Whatever error code exec_exec returns should be returned to the caller
>of file_exec.
>
>
>
>
>@bye