uda - UDA50 disk controller interface
uda0 at uba? csr 0172150
uda1 at uba? csr 0160334
mscpbus* at uda?
This is a driver for the DEC UDA50 disk controller and other compatible
controllers. The UDA50 communicates with the host through a packet
protocol known as the Mass Storage Control Protocol (MSCP). Consult the
file <vax/mscp.h> for a detailed description of this protocol.
The uda driver is a typical block-device disk driver; see physio(9) for a
description of block I/O. The script MAKEDEV(8) should be used to create
the uda special files; should a special file need to be created by hand,
consult mknod(8).
The MSCP_PARANOIA option enables runtime checking on all transfer
completion responses from the controller. This increases disk I/O
overhead and may be undesirable on slow machines, but is otherwise
recommended.
The first sector of each disk contains both a first-stage bootstrap
program and a disk label containing geometry information and partition
layouts (see disklabel(5)). This sector is normally write-protected, and
disk-to-disk copies should avoid copying this sector. The label may be
updated with disklabel(8), which can also be used to write-enable and
write-disable the sector. The next 15 sectors contain a second-stage
bootstrap program.
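
As a rough illustration of the layout just described, the sketch below
computes byte offsets for the label sector and the second-stage bootstrap.
The 512-byte sector size and every name in it are assumptions made for the
example; none of them are taken from the driver.

```python
SECTOR_SIZE = 512    # assumed sector size in bytes
LABEL_SECTORS = 1    # first sector: first-stage bootstrap plus disk label
BOOT2_SECTORS = 15   # next 15 sectors: second-stage bootstrap


def boot_area_layout(sector_size=SECTOR_SIZE):
    """Return (name, start_byte, end_byte) tuples; end_byte is exclusive."""
    label_end = LABEL_SECTORS * sector_size
    boot2_end = label_end + BOOT2_SECTORS * sector_size
    return [
        ("label+boot1", 0, label_end),
        ("boot2", label_end, boot2_end),
    ]


def copy_skip_offset(sector_size=SECTOR_SIZE):
    """Byte offset at which a disk-to-disk copy could start so that the
    destination's label sector is left untouched."""
    return LABEL_SECTORS * sector_size
```

Under these assumptions the label occupies the first 512 bytes and the two
bootstrap stages together cover the first 16 sectors of the disk.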
During autoconfiguration, as well as when a drive is opened after all
partitions are closed, the first sector of the drive is examined for a
disk label. If a label is found, the geometry of the drive and the
partition tables are taken from it. If no label is found, the driver
configures the type of each drive when it is first encountered. A default
partition table in the driver is used for each type of disk when a pack
is not labelled. The origin and size (in sectors) of the default
pseudo-disks on each drive are shown below. Not all partitions begin on
cylinder boundaries, as on other drives, because previous drivers used
one partition table for all drive types. Variants of the partition tables
are common; check the driver and the file /etc/disktab (disktab(5)) for
other possibilities.
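
The label-or-default decision can be pictured with the following sketch.
It is only a model of the behaviour described above, not the driver's
code: the function name, the dictionary shapes, and the idea of passing
the label as None when none was found are all hypothetical. The two table
excerpts reuse the `a', `b' and `c' entries listed in this page for the
RA60 and RA80.

```python
# Hypothetical excerpt of built-in default tables, keyed by drive type.
# Each entry is (start, length) in sectors.
DEFAULT_TABLES = {
    "RA60": {"a": (0, 15884), "b": (15884, 33440), "c": (0, 400176)},
    "RA80": {"a": (0, 15884), "b": (15884, 33440), "c": (0, 242606)},
}


def choose_partitions(label, drive_type):
    """Prefer the partition table from an on-disk label; fall back to the
    default table for the drive type when the pack is unlabelled."""
    if label is not None:
        return label["partitions"]
    return DEFAULT_TABLES[drive_type]
```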
Special file names begin with `ra' and `rra' for the block and character
files respectively. The second component of the name, a drive unit number
in the range of zero to seven, is represented by a `?' in the disk
layouts below. The last component of the name is the file system
partition, designated by a letter from `a' to `h', which corresponds to a
minor device number set: zero to seven, eight to 15, 16 to 23 and so
forth for drive zero, drive one and drive two respectively (see
physio(9)). The location and size (in sectors) of the partitions:
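
The naming scheme implies that each drive owns eight consecutive minor
numbers, one per partition letter. A minimal sketch of that arithmetic
follows; the function names are invented for the example, and the
/dev prefix is taken from the special file names listed in this page.

```python
PARTITIONS_PER_DRIVE = 8  # partitions 'a' through 'h'


def minor_number(unit, partition):
    """Map a drive unit (0-7) and partition letter ('a'-'h') to the minor
    device number implied by the text: unit * 8 + partition index."""
    if not 0 <= unit <= 7:
        raise ValueError("unit number must be in the range 0..7")
    index = ord(partition) - ord("a")
    if not 0 <= index < PARTITIONS_PER_DRIVE:
        raise ValueError("partition must be a letter from 'a' to 'h'")
    return unit * PARTITIONS_PER_DRIVE + index


def special_file_names(unit, partition):
    """Return the (block, character) special file names for a partition."""
    return (f"/dev/ra{unit}{partition}", f"/dev/rra{unit}{partition}")
```

For instance, partition `a' of drive one gets minor number eight, the
first number of the second set.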
RA60 partitions
     disk      start     length
     ra?a      0         15884
     ra?b      15884     33440
     ra?c      0         400176
     ra?d      49324     82080      same as 4.2BSD ra?g
     ra?e      131404    268772     same as 4.2BSD ra?h
     ra?f      49324     350852
     ra?g      242606    157570
     ra?h      49324     193282
RA70 partitions
     disk      start     length
     ra?a      0         15884
     ra?b      15972     33440
     ra?c      0         547041
     ra?d      34122     15884
     ra?e      357192    55936
     ra?f      413457    133584
     ra?g      341220    205821
     ra?h      49731     29136
RA80 partitions
     disk      start     length
     ra?a      0         15884
     ra?b      15884     33440
     ra?c      0         242606
     ra?e      49324     193282     same as old Berkeley ra?g
     ra?f      49324     82080      same as 4.2BSD ra?g
     ra?g      49910     192696
     ra?h      131404    111202     same as 4.2BSD
RA81 partitions
     disk      start     length
     ra?a      0         15884
     ra?b      16422     66880
     ra?c      0         891072
     ra?d      375564    15884
     ra?e      391986    307200
     ra?f      699720    191352
     ra?g      375564    515508
     ra?h      83538     291346
RA81 partitions with 4.2BSD-compatible partitions
     disk      start     length
     ra?a      0         15884
     ra?b      16422     66880
     ra?c      0         891072
     ra?d      49324     82080      same as 4.2BSD ra?g
     ra?e      131404    759668     same as 4.2BSD ra?h
     ra?f      412490    478582     same as 4.2BSD ra?f
     ra?g      375564    515508
     ra?h      83538     291346
RA82 partitions
     disk      start     length
     ra?a      0         15884
     ra?b      16245     66880
     ra?c      0         1135554
     ra?d      375345    15884
     ra?e      391590    307200
     ra?f      669390    466164
     ra?g      375345    760209
     ra?h      83790     291346
The ra?a partition is normally used for the root file system, the ra?b
partition as a paging area, and the ra?c partition for pack-to-pack
copying (it maps the entire disk).
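
Since ra?c maps the entire disk, every other partition in a table should
fall inside it. The sketch below checks that property for the RA60 table
above and converts a length in sectors to megabytes. The helper names are
made up for the example, and the 512-byte sector size is an assumption.

```python
# RA60 default table from this page: partition letter -> (start, length),
# both in sectors.
RA60 = {
    "a": (0, 15884),
    "b": (15884, 33440),
    "c": (0, 400176),
    "d": (49324, 82080),
    "e": (131404, 268772),
    "f": (49324, 350852),
    "g": (242606, 157570),
    "h": (49324, 193282),
}


def partitions_outside_disk(table):
    """Return the letters of partitions that extend beyond partition 'c'."""
    whole_start, whole_length = table["c"]
    disk_end = whole_start + whole_length
    return sorted(
        name
        for name, (start, length) in table.items()
        if start < whole_start or start + length > disk_end
    )


def size_in_mb(length, sector_size=512):
    """Convert a length in sectors to megabytes (assumed sector size)."""
    return length * sector_size / (1024 * 1024)
```

Calling partitions_outside_disk(RA60) returns an empty list, because the
partitions that reach the end of the pack (ra?e, ra?f and ra?g) all stop
exactly at sector 400176.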
/dev/ra[0-9][a-p]
/dev/rra[0-9][a-p]
panic: udaslave No command packets were available while the
driver was
looking for disk drives. The controller is not extending
enough credits
to use the drives.
uda%d: no response to Get Unit Status request A disk drive
was found,
but did not respond to a status request. This is either a
hardware problem
or someone pulling unit number plugs very fast.
uda%d: unit %d off line While searching for drives, the
controller found
one that seems to be manually disabled. It is ignored.
uda%d: unable to get unit status Something went wrong while
trying to
determine the status of a disk drive. This is followed by
an error detail.
uda%d: unit %d, next %d This probably never happens, but I
wanted to
know if it did. I have no idea what one should do about it.
uda%d: cannot handle unit number %d (max is %d) The controller found a
drive whose unit number is too large. Valid unit numbers
are those in
the range [0..7].
uda%d: uballoc map failed UNIBUS resource map allocation
failed during
initialization. This can only happen if you have 496 devices on a
UNIBUS.
uda%d: timeout during init The controller did not initialize within ten
seconds. A hardware problem, but it sometimes goes away if
you try
again.
uda%d: init failed, sa=%b The controller refused to initialize.
uda%d: controller hung The controller never finished initialization.
Retrying may sometimes fix it.
uda%d: still hung When the controller hangs, the driver occasionally
tries to reinitialize it. This means it just tried, without
success.
panic: udastart: bp==NULL A bug in the driver has put an
empty drive
queue on a controller queue.
uda%d: command ring too small If you increase NCMDL2, you
may see a performance
improvement. (See /sys/arch/vax/uba/uda.c.)
panic: udastart A drive was found marked for status or online functions
while performing status or on-line functions. This indicates a bug in
the driver.
uda%d: controller error, sa=0%o (%s) The controller reported an error.
The error code is printed in octal, along with a short description if the
code is known (see the UDA50 Maintenance Guide, DEC part number
AA-M185B-TC, pp. 18-22). If this occurs during normal operation, the
driver will reset the controller and retry pending I/O. If it occurs
during configuration, the controller may be ignored.
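
When scanning system logs for this message, the octal status value can be
recovered as in the sketch below. The regular expression is derived only
from the format string shown above; the function name is hypothetical, and
any sample line fed to it is illustrative rather than real driver output.

```python
import re

# Matches "uda%d: controller error, sa=0%o (%s)".
_CONTROLLER_ERROR = re.compile(
    r"^uda(\d+): controller error, sa=(0[0-7]*) \((.*)\)$"
)


def parse_controller_error(line):
    """Return the controller number, the sa value (parsed as octal) and
    the description, or None if the line is a different message."""
    match = _CONTROLLER_ERROR.match(line.strip())
    if match is None:
        return None
    return {
        "controller": int(match.group(1)),
        "sa": int(match.group(2), 8),
        "description": match.group(3),
    }
```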
uda%d: stray intr The controller interrupted when it should
have stayed
quiet. The interrupt has been ignored.
uda%d: init step %d failed, sa=%b The controller reported
an error during
the named initialization step. The driver will retry
initialization
later.
uda%d: version %d model %d An informational message giving
the revision
level of the controller.
uda%d: DMA burst size set to %d An informational message
showing the DMA
burst size, in words.
panic: udaintr Indicates a bug in the generic MSCP code.
uda%d: driver bug, state %d The driver has a bogus value
for the controller
state. Something is quite wrong. This is immediately followed
by a `panic: udastate'.
uda%d: purge bdp %d A benign message tracing BDP purges. I
have been
trying to figure out what BDP purges are for. You might
want to comment
out this call to log() in /sys/arch/vax/uba/uda.c.
uda%d: SETCTLRC failed: `detail' The Set Controller Characteristics
command (the last part of the controller initialization sequence) failed.
The detail message tells why.
uda%d: attempt to bring ra%d on line failed: `detail' The
drive could
not be brought on line. The detail message tells why.
uda%d: ra%d: unknown type %d The type index of the named
drive is not
known to the driver, so the drive will be ignored.
uda%d: attempt to get status for ra%d failed: `detail' A
status request
failed. The detail message should tell why.
panic: udareplace The controller reported completion of a
REPLACE operation.
The driver never issues any REPLACEs, so something is
wrong.
panic: udabb The controller reported completion of bad
block related
I/O. The driver never issues any such, so something is
wrong.
uda%d: lost interrupt The controller has gone out to lunch,
and is being
reset to try to bring it back.
panic: mscp_go: AEB_MAX_BP too small You defined
AVOID_EMULEX_BUG and
increased NCMDL2 and Emulex has new firmware. Raise
AEB_MAX_BP or turn
off AVOID_EMULEX_BUG.
uda%d: unit %d: unknown message type 0x%x ignored The controller responded
with a mysterious message type. See /sys/vax/mscp.h
for a list
of known message types. This is probably a controller hardware problem.
uda%d: unit %d out of range The disk drive unit number (the
unit plug)
is higher than the maximum number the driver allows (currently 7).
uda%d: unit %d not configured, message ignored The named
disk drive has
announced its presence to the controller, but was not, or
cannot now be,
configured into the running system. Message is one of
`available attention'
(an `I am here' message) or `stray response op 0x%x
status 0x%x'
(anything else).
Emulex SC41/MS screwup: uda%d, got %d correct, then changed
0x%x to
0x%x You turned on AVOID_EMULEX_BUG, and the driver successfully avoided
the bug. The number of correctly handled requests is reported, along
with the expected and actual values relating to the bug being avoided.
panic: unrecoverable Emulex screwup You turned on
AVOID_EMULEX_BUG, but
Emulex was too clever and avoided the avoidance. Try turning on
MSCP_PARANOIA instead.
uda%d: bad response packet ignored You turned on MSCP_PARANOIA, and the
driver caught the controller in a lie. The lie has been ignored, and the
controller will soon be reset (after a `lost' interrupt).
This is followed
by a hex dump of the offending packet.
uda%d: %s error datagram The controller has reported some kind of error,
either `hard' (unrecoverable) or `soft' (recoverable). If the controller
is continuing (attempting to fix the problem), this message includes the
remark `(continuing)'. Emulex controllers wrongly claim that all soft
errors are hard errors. This message may be followed by one of the
following five messages, depending on its type, and will always be
followed by a failure detail message (also listed below).
memory addr 0x%x A host memory access error; this is
the address
that could not be read.
unit %d: level %d retry %d, %s %d A typical disk error; the retry
count and error recovery levels are printed, along
with the block
type (`lbn', or logical block; or `rbn', or replacement block) and
number. If the string is something else, DEC has been
clever, or
your hardware has gone to Australia for vacation (unless you live
there; then it might be in New Zealand, or Brazil).
unit %d: %s %d Also a disk error, but an `SDI' error,
whatever
that is. (I doubt it has anything to do with Ronald
Reagan.) This
lists the block type (`lbn' or `rbn') and number.
This is followed
by a second message indicating a microprocessor error
code and a
front panel code. These latter codes are drive-specific, and are
intended to be used by field service as an aid in locating failing
hardware. The codes for RA81s can be found in the
RA81 Maintenance
Guide, DEC order number AA-M879A-TC, in appendices E
and F.
unit %d: small disk error, cyl %d Yet another kind of
disk error,
but for small disks. (``That's what it says, guv'nor.
Dunnask me
what it means.'')
unit %d: unknown error, format 0x%x A mysterious error: the given
format code is not known.
The detail messages are as follows:
success (%s) (code 0, subcode %d) Everything worked,
but the controller
thought it would let you know that something
went wrong.
No matter what subcode, this can probably be ignored.
invalid command (%s) (code 1, subcode %d) This probably cannot occur
unless the hardware is out; %s should be `invalid
msg length',
meaning some command was too short or too long.
command aborted (unknown subcode) (code 2, subcode %d)
This should
never occur, as the driver never aborts commands.
unit offline (%s) (code 3, subcode %d) The drive is
offline, either
because it is not around (`unknown drive'),
stopped (`not
mounted'), out of order (`inoperative'), has the same
unit number
as some other drive (`duplicate'), or has been disabled for diagnostics
(`in diagnosis').
unit available (unknown subcode) (code 4, subcode %d)
The controller
has decided to report a perfectly normal event
as an error.
(Why?)
media format error (%s) (code 5, subcode %d) The
drive cannot be
used without reformatting. The Format Control Table
cannot be read
(`fct unread - edc'), there is a bad sector header
(`invalid sector
header'), the drive is not set for 512-byte sectors
(`not 512 sectors'),
the drive is not formatted (`not formatted'),
or the FCT
has an uncorrectable ECC error (`fct ecc').
write protected (%s) (code 6, subcode %d) The drive
is write protected,
either by the front panel switch (`hardware')
or via the
driver (`software'). The driver never sets software
write protect.
compare error (unknown subcode) (code 7, subcode %d)
A compare operation
showed some sort of difference. The driver
never uses compare
operations.
data error (%s) (code 8, subcode %d) Something went wrong reading
or writing a data sector. A `forced error' is a software-asserted
error used to mark a sector that contains suspect data. Rewriting
the sector will clear the forced error. This is normally set only
during bad block replacement, and the driver does no bad block
replacement, so these should not occur. A `header compare' error
probably means the block is shot. A `sync timeout' presumably has
something to do with sector synchronisation. An `uncorrectable
ecc' error is an ordinary data error that cannot be fixed via ECC
logic. A `%d symbol ecc' error is a data error that can be (and
presumably has been) corrected by the ECC logic. It might indicate
a sector that is imperfect but usable, or that is starting to go
bad. If any of these errors recur, the sector may need to be replaced.
host buffer access error (%s) (code %d, subcode %d)
Something went
wrong while trying to copy data to or from the host
(Vax). The
subcode is one of `odd xfer addr', `odd xfer count',
`non-exist.
memory', or `memory parity'. The first two could be a
software
glitch; the last two indicate hardware problems.
controller error (%s) (code %d, subcode %d) The controller has detected
a hardware error in itself. A `serdes overrun'
is a serialiser
/ deserialiser overrun; `edc' probably stands
for `error detection
code'; and `inconsistent internal data struct'
is obvious.
drive error (%s) (code %d, subcode %d) Either the
controller or
the drive has detected a hardware error in the drive.
I am not
sure what an `sdi command timeout' is, but these seem
to occur benignly
on occasion. A `ctlr detected protocol' error
means that
the controller and drive do not agree on a protocol;
this could be
a cabling problem, or a version mismatch. A `positioner' error
means the drive seek hardware is ailing; `lost rd/wr
ready' means
the drive read/write logic is sick; and `drive clock
dropout' means
that the drive clock logic is bad, or the media is
hopelessly
scrambled. I have no idea what `lost recvr ready'
means. A `drive
detected error' is a catch-all for drive hardware
trouble; `ctlr
detected pulse or parity' errors are often caused by
cabling problems.
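
All of the detail messages above share one shape: a status name, a
parenthesised subcode description, then the numeric code and subcode. A
log-processing script could split them apart as sketched below. The
pattern is inferred from the formats listed in this page, the function
name is invented, and the numeric subcodes in any sample line are made up
for illustration.

```python
import re

# Matches "<status> (<subcode text>) (code N, subcode M)".
_DETAIL = re.compile(
    r"^(?P<status>.+?) \((?P<subtext>.*)\) "
    r"\(code (?P<code>\d+), subcode (?P<subcode>\d+)\)$"
)


def parse_detail(line):
    """Split a failure detail message into its four fields, or return
    None if the line does not have the expected shape."""
    match = _DETAIL.match(line.strip())
    if match is None:
        return None
    return {
        "status": match.group("status"),
        "subtext": match.group("subtext"),
        "code": int(match.group("code")),
        "subcode": int(match.group("subcode")),
    }
```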
intro(4), mscpbus(4), uba(4), disklabel(5), disklabel(8)
The uda driver appeared in 4.2BSD.
OpenBSD 3.6 March 27, 1991