blas - IRIX/libblas/


BLAS(3F)							      BLAS(3F)

NAME

     BLAS - Basic Linear Algebra Subprograms

DESCRIPTION

     BLAS is a library of routines that perform basic operations involving
     matrices and vectors. They were designed as a way of achieving efficiency
     in the solution of linear algebra problems. The BLAS, as they are now
     commonly called, have been very successful and have been used in a wide
     range of software, including LINPACK, LAPACK and many of the algorithms
     published in the ACM Transactions on Mathematical Software. They are an
     aid to clarity, portability, modularity and maintenance of software, and
     have become the de facto standard for elementary vector and matrix
     operations.

     The BLAS promote modularity by identifying	frequently occurring
     operations	of linear algebra and by specifying a standard interface to
     these operations.	Efficiency is achieved through optimization within the
     BLAS without altering the higher-level code that has referenced them.

     There are three levels of BLAS. The original set of BLAS, commonly
     referred to as the Level 1 BLAS, perform low-level operations such as
     the dot product and the addition of a multiple of one vector to another.
     Typically these operations involve O(N) floating-point operations and
     O(N) data items moved (loaded or stored), where N is the length of the
     vectors. The Level 1 BLAS permit efficient implementation on scalar
     machines, but the ratio of floating-point operations to data movement is
     too low to achieve effective use of most vector or parallel hardware.
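
     As a minimal sketch of the two Level 1 operations named above, the
     following C loops show why the ratio of arithmetic to data movement is
     low: each pass does about 2*N floating-point operations while touching
     2*N or 3*N data items. The names ref_daxpy and ref_ddot are hypothetical
     and are not routines of this library; unit stride is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop: y := alpha*x + y for unit-stride vectors.
 * About 2*n flops for 3*n data items (load x, load y, store y). */
void ref_daxpy(size_t n, double alpha, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Hypothetical reference loop: returns the dot product of x and y.
 * About 2*n flops for 2*n data items loaded. */
double ref_ddot(size_t n, const double *x, const double *y)
{
    assert(x != NULL && y != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

     An optimized library would replace these loops with tuned code behind
     the same interface, which is the modularity argument made above.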

     The Level 2 BLAS perform matrix-vector operations that occur frequently
     in the implementation of many of the most common linear algebra
     algorithms. They involve O(N^2) floating-point operations. Algorithms
     that use Level 2 BLAS can be very efficient on vector computers, but are
     not well suited to computers with a hierarchy of memory (such as cache
     memory).
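
     The matrix-vector product can be sketched as below, assuming column-major
     storage with leading dimension lda. Every matrix element is read exactly
     once, so there is little data to reuse from cache. The name ref_dgemv_n
     is hypothetical, and the sketch omits the stride and transpose options
     of the real DGEMV.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop: y := alpha*A*x + beta*y, where A is m-by-n,
 * stored column major with leading dimension lda (lda >= m). */
void ref_dgemv_n(size_t m, size_t n, double alpha, const double *a, size_t lda,
                 const double *x, double beta, double *y)
{
    assert(lda >= m);
    for (size_t i = 0; i < m; ++i)
        y[i] *= beta;
    /* Walk A one column at a time so that memory is accessed contiguously. */
    for (size_t j = 0; j < n; ++j) {
        const double t = alpha * x[j];
        const double *col = a + j * lda;
        for (size_t i = 0; i < m; ++i)
            y[i] += t * col[i];
    }
}
```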

     The Level 3 BLAS are targeted at matrix-matrix operations. These
     operations generally involve O(N^3) floating-point operations while
     moving only O(N^2) data items. They permit efficient reuse of data that
     resides in cache and create what is often called the surface-to-volume
     effect for the ratio of computations to data movement. In addition,
     matrices can be partitioned into blocks, operations on distinct blocks
     can be performed in parallel, and within the operations on each block,
     scalar or vector operations may be performed in parallel.
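
     The partitioning into blocks can be sketched as follows for square
     column-major matrices. This is an illustrative loop nest only, not the
     code used in this library; the name blocked_dgemm and the block size
     parameter nb are assumptions. Each nb-by-nb block of A, B and C is
     reused many times while it is likely to be in cache.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Hypothetical sketch: C := C + A*B for n-by-n column-major matrices
 * (leading dimension n), processed in blocks of at most nb-by-nb. */
void blocked_dgemm(size_t n, size_t nb, const double *a, const double *b,
                   double *c)
{
    assert(nb > 0);
    for (size_t jj = 0; jj < n; jj += nb) {
        const size_t jend = min_sz(jj + nb, n);
        for (size_t kk = 0; kk < n; kk += nb) {
            const size_t kend = min_sz(kk + nb, n);
            for (size_t ii = 0; ii < n; ii += nb) {
                const size_t iend = min_sz(ii + nb, n);
                /* Multiply one block of A by one block of B. */
                for (size_t j = jj; j < jend; ++j)
                    for (size_t k = kk; k < kend; ++k) {
                        const double bkj = b[j * n + k];
                        for (size_t i = ii; i < iend; ++i)
                            c[j * n + i] += a[k * n + i] * bkj;
                    }
            }
        }
    }
}
```

     The updates for different (ii, jj) block pairs write to disjoint parts
     of C, which is what makes block-level parallelism possible.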

     BLAS2 and BLAS3 modules have been optimized and parallelized to take
     advantage of SGI's RISC parallel architecture. The best performance is
     achieved for BLAS3 routines (e.g. DGEMM), where "outer-loop" unrolling
     and "blocking" techniques were applied to take advantage of the memory
     cache. The performance of BLAS2 routines (e.g. DGEMV) is sensitive to
     the size of the problem: for large sizes, the high rate of cache misses
     slows down the algorithms. LAPACK algorithms preferentially use BLAS3
     modules and are the most efficient. LINPACK uses only BLAS1 modules and
     is therefore less efficient than LAPACK.

     To link with "libblas", it is advised to use "f77" so that all the
     required Fortran libraries are loaded; otherwise, include -lftn in your
     link line. For R8000 and R10000 based machines, you should use the
     mips4 version. This is accomplished by using -mips4 when linking:
          f77 -mips4 -o foobar.out foo.o bar.o -lblas
     To use the parallelized version, use
          f77 -mips4 -mp -o foobar.out foo.o bar.o -lblas_mp

SUMMARY

     BLAS Level	1:
         .....function......      ....prefix,suffix.....  rootname
         dot product              s-  d- c-u c-c z-u z-c  -dot-
         y = a*x + y              s-  d-     c-      z-   -axpy
         setup Givens rotation    s-  d-                  -rotg
         apply Givens rotation    s-  d-     cs-     zd-  -rot
         copy x into y            s-  d-     c-      z-   -copy
         swap x and y             s-  d-     c-      z-   -swap
         Euclidean norm           s-  d-     sc-     dz-  -nrm2
         sum of absolute values   s-  d-     sc-     dz-  -asum
         x = a*x                  s-  d- cs- c-  zd- z-   -scal
         index of max abs value  is- id-     ic-     iz-  -amax
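
     The table is read by joining a prefix, the rootname and an optional
     suffix, so that "c-u" with the dot product root gives CDOTU and "dz-"
     with the Euclidean norm root gives DZNRM2. The helper below is a
     hypothetical illustration of that rule and is not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes prefix + root + suffix into out.
 * Returns 0 on success, or -1 if out (of size outsz) is too small. */
int blas_name(char *out, size_t outsz, const char *prefix, const char *root,
              const char *suffix)
{
    assert(out != NULL && prefix != NULL && root != NULL && suffix != NULL);
    const size_t np = strlen(prefix), nr = strlen(root), ns = strlen(suffix);
    if (np + nr + ns + 1 > outsz)
        return -1;
    memcpy(out, prefix, np);
    memcpy(out + np, root, nr);
    memcpy(out + np + nr, suffix, ns + 1); /* also copies the final NUL */
    return 0;
}
```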


     BLAS Level	2:
	MV Matrix vector multiply
	R  Rank	one update to a	matrix
	R2 Rank	two update to a	matrix
	SV Solving certain triangular matrix problems.

     single precision Level 2 BLAS     |     Double precision Level 2 BLAS
     -----------------------------------------------------------------------
	     MV	  R    R2  SV	       |	     MV	  R    R2  SV
     SGE     x	  x		       |     DGE     x	  x
     SGB     x			       |     DGB     x
     SSP     x	  x    x	       |     DSP     x	  x    x
     SSY     x	  x    x	       |     DSY     x	  x    x
     SSB     x			       |     DSB     x
     STR     x		    x	       |     DTR     x		    x
     STB     x		    x	       |     DTB     x		    x
     STP     x		    x	       |     DTP     x		    x

     complex Level 2 BLAS              | Double precision complex Level 2 BLAS
     -----------------------------------------------------------------------
             MV   R    RC   RU   R2  SV|          MV   R    RC   RU   R2  SV
     CGE     x         x    x          |  ZGE     x         x    x
     CGB     x                         |  ZGB     x
     CHE     x    x              x     |  ZHE     x    x              x
     CHP     x    x              x     |  ZHP     x    x              x
     CHB     x                         |  ZHB     x
     CTR     x                       x |  ZTR     x                       x
     CTB     x                       x |  ZTB     x                       x
     CTP     x                       x |  ZTP     x                       x

     BLAS Level	3:
	MM  Matrix matrix multiply
	RK  Rank-k update to a matrix
	R2K Rank-2k update to a	matrix
	SM Solving triangular matrix with many right-hand-sides.

     single precision Level 3 BLAS     |     Double precision Level 3 BLAS
     -----------------------------------------------------------------------
	     MM	  RK   R2K SM	       |	     MM	  RK   R2K SM
     SGE     x			       |     DGE     x
     SSY     x	  x    x	       |     DSY     x	  x    x
     STR     x		    x	       |     DTR     x		    x

     complex  Level 3 BLAS	       | Double	precision complex Level	3 BLAS
     -----------------------------------------------------------------------
	     MM	  RK   R2K SM	       |	     MM	  RK   R2K SM
     CGE     x			       |     ZGE     x
     CSY     x	  x    x	       |     ZSY     x	  x    x
     CHE     x	  x    x	       |     ZHE     x	  x    x
     CTR     x		    x	       |     ZTR     x		    x

C INTERFACE

     There is a C interface for the BLAS library. The implementation is based
     on the proposed specification for BLAS routines in C [1].

     The argument lists closely follow the equivalent Fortran ones. The main
     changes are that enumeration types are used instead of character types
     for option specification, and that two-dimensional arrays are stored in
     one-dimensional C arrays in the same fashion as a Fortran array (column
     major). Therefore, a matrix A would be stored as:

           double (*a)[lda*n];
        /*                                                         */
        /*          a is a pointer to an array of size lda*n       */
        /*                                                         */

     where element A(i+1,j) of matrix A is stored immediately after the
     element A(i,j), while A(i,j+1) is lda elements apart from A(i,j). The
     element A(i,j) of the matrix can be accessed directly by reference to
     a[(j-1)*lda + (i-1)].
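
     The indexing rule above can be wrapped in a small helper. The function
     name colmajor_index is hypothetical; i and j are the 1-based Fortran
     indices and the result is a 0-based offset into the C array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 0-based offset of the 1-based element A(i,j) in a
 * column-major array with leading dimension lda. */
size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    assert(i >= 1 && j >= 1 && i <= lda);
    return (j - 1) * lda + (i - 1);
}
```

     With this helper, A(i,j) is a[colmajor_index(i, j, lda)]. Moving down a
     column changes the offset by 1 and moving along a row changes it by lda.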

     The names of the C	versions of the	BLAS are the same as the Fortran
     versions since the	compiler puts the Fortran names	in upper case and adds
     an	underscore after the name.

     The argument lists	use the	following data types:

             Integer:        an integer data type of 32 bits.
               float:        the regular single precision floating-point type.
              double:        the regular double precision floating-point type.
             Complex:        a single precision complex type.
             Zomplex:        a double precision complex type.

     plus the enumeration types	given by

       typedef enum { NoTranspose, Transpose, ConjugateTranspose }
		    MatrixTranspose;

       typedef enum { UpperTriangle, LowerTriangle }
		    MatrixTriangle;

       typedef enum { UnitTriangular, NotUnitTriangular	}
		    MatrixUnitTriangular;

       typedef enum { LeftSide,	RightSide }
		    OperationSide;

     The complex data types are	stored in cartesian form, i.e.,	as real	and
     imaginary parts. For example

       typedef struct {	 float real;
			 float imag;
				     } Complex;

       typedef struct {	double real;
			double imag;
				     } Zomplex;
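
     As an illustration of arithmetic on the cartesian form, the sketch below
     computes the unconjugated and conjugated dot products that correspond to
     the "c-u" and "c-c" entries of the Level 1 table. The Complex typedef is
     repeated locally so that the example is self-contained, and the names
     ref_cdotu and ref_cdotc are hypothetical, not the library's entry points.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float real; float imag; } Complex;

/* Hypothetical reference loop: sum of x[i]*y[i] (unconjugated). */
Complex ref_cdotu(size_t n, const Complex *x, const Complex *y)
{
    Complex s = { 0.0f, 0.0f };
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        s.real += x[i].real * y[i].real - x[i].imag * y[i].imag;
        s.imag += x[i].real * y[i].imag + x[i].imag * y[i].real;
    }
    return s;
}

/* Hypothetical reference loop: sum of conj(x[i])*y[i] (conjugated). */
Complex ref_cdotc(size_t n, const Complex *x, const Complex *y)
{
    Complex s = { 0.0f, 0.0f };
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        s.real += x[i].real * y[i].real + x[i].imag * y[i].imag;
        s.imag += x[i].real * y[i].imag - x[i].imag * y[i].real;
    }
    return s;
}
```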

     The operations performed by the C BLAS are	identical to those performed
     by	the corresponding Fortran BLAS,	as specified in	[2], [3] and [4].

     To	use the	C BLAS,	link with "libblas". It	is advised to use "f77"	to
     load all the Fortran Libraries required:
	  f77 -o foobar.out foo.o bar.o	-lblas

FILES

     /usr/lib/libblas.a
     /usr/lib/libblas_mp.a
     /usr/include/cblas.h

ORIGIN

     The original Fortran source code comes from netlib.

REFERENCES

     [1] S.P. Datardina, J.J. Du Croz, S.J. Hammarling and M.W. Pont, "A
         Proposed Specification of BLAS Routines in C", NAG Technical Report
         TR6/90.

     [2] C. Lawson, R. Hanson, D. Kincaid and F. Krogh, "Basic Linear Algebra
         Subprograms for Fortran Usage", ACM Trans. on Math. Soft. 5 (1979)
         308-325.

     [3] J. Dongarra, J. Du Croz, S. Hammarling and R. Hanson, "An Extended
         Set of Fortran Basic Linear Algebra Subprograms", ACM Trans. on
         Math. Soft. 14, 1 (1988) 1-32.

     [4] J. Dongarra, J. Du Croz, I. Duff and S. Hammarling, "A Set of Level 3
         Basic Linear Algebra Subprograms", ACM Trans. on Math. Soft.
         (Dec 1989).